firewing1's blog

Testing Flask applications (code, database, views, flask config, and app context) with pytest

I love writing tests for my code, but whenever I start with a new language or framework it's a pain: getting the mocks and fixtures just right tends to be very language- and framework-specific. I searched and searched for a good pytest + Flask configuration that would let me do unit and integration tests for the app, and found some good pieces but nothing holistic.

Thorough testing of Flask apps

I wanted a pytest configuration that would leave no side-effects between tests and provide a solid foundation for writing everything from unit to integration to database tests:

  • Mocked (monkey-patched) methods and classes for unit testing
  • Per-test Flask application context, letting you test things like oauth-filter-for-python-flask
  • Fast, in-memory SQLite database that is torn down between each test
  • REST client to test Flask APIs or views

pytest configuration

The code samples below are pretty generic but may require minor customization for your Flask application. I highly recommend taking a look at the flask-bones sample, which demonstrates many best practices; the samples below work with it out of the box.

It assumes the use of the following modules available via pip: pytest, pytest-flask and pytest-mock
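If you don't already have them, all three can be installed with pip:

pip install pytest pytest-flask pytest-mock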

pytest's conftest.py:

import pytest

from yourflaskmodule import create_app
from yourflaskmodule.config import test_config
from yourflaskmodule import db as _db


@pytest.fixture(scope="session")
def app(request):
    """Test session-wide test `Flask` application."""
    app = create_app(test_config)
    return app


@pytest.fixture(autouse=True)
def _setup_app_context_for_test(request, app):
    """
    Given app is session-wide, sets up an app context per test to ensure
    that the app and request stacks are not shared between tests.
    """
    ctx = app.app_context()
    ctx.push()
    yield  # tests will run here
    ctx.pop()


@pytest.fixture(scope="session")
def db(app, request):
    """Returns session-wide initialized database"""
    with app.app_context():
        _db.create_all()
        yield _db
        _db.drop_all()


@pytest.fixture(scope="function")
def session(app, db, request):
    """Creates a new database session for each test, rolling back changes afterwards"""
    connection = db.engine.connect()
    transaction = connection.begin()

    options = dict(bind=connection, binds={})
    session = db.create_scoped_session(options=options)

    db.session = session

    yield session

    transaction.rollback()
    connection.close()
    session.remove()

Here's an example of a test config class extending your base config with the SQLite in-memory override:

class test_config(base_config):
    """Testing configuration options."""
    ENV_PREFIX = 'APP_'

    TESTING = True
    SQLALCHEMY_DATABASE_URI = 'sqlite:///:memory:'

Here's an example of a test making use of all the different features:

import json

import pytest

try:
    from flask import _app_ctx_stack as ctx_stack
except ImportError:
    from flask import _request_ctx_stack as ctx_stack

class TestCase:
    # set a Flask app config value using a pytest mark
    @pytest.mark.options(VERIFY_IDENTITY=True)
    def test_foo(self, client, session, mocker):
        # set user identity in app context
        ctx_stack.top.claims = {'sub': 'user1', 'tid': 'expected-audience'}

        # mock a class using the mocker fixture from pytest-mock
        mocked_batch_client = mocker.patch('backend_class.BackendClient')
        mocked_batch_client.return_value.list.return_value = ['a', 'b']

        # test a view - it uses BackendClient (mocked now)
        resp = client.get('/api/items')
        data = json.loads(resp.data)
        assert(len(data['results']) > 0)

        # insert data into the database - will get rolled back after test completion
        item = YourModel()
        item.foo = "bar"
        session.add(item)
        session.commit()
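With the fixtures in place, the suite runs like any other pytest project. Assuming your tests live under a tests/ directory:

python -m pytest tests/ -v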

Routing packets from a specific Docker container through a specific outgoing interface

My home server is multi-homed (multiple outgoing network interfaces), which a lot of the time is more trouble than it's worth... This time around I had a need to route a specific Docker container's traffic through a non-default outgoing interface (i.e. an OpenVPN interface to access secured resources). Below I'll show you how I made that happen.

Primer on multi-homed networking

Controlling incoming connections is generally easier than outgoing. Listening sockets can be set up on all interfaces or on a specific IP, which binds them to a specific incoming interface.

Outgoing connections, on the other hand, are a routing decision performed by the kernel. So regardless of the incoming connection, data flows out (generally) through the default route.

Policy-based routing customizes these routing decisions so that the route -- that is, the outgoing interface -- can be determined by a set of rules, like source IP or packets marked by iptables.
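For illustration, here's what such rules look like with iproute2 (the subnet, the mark, and the "vpn" table are placeholders; we'll set up real ones below):

# route all traffic from a given source subnet using the "vpn" table
ip rule add from 192.168.2.0/24 lookup vpn

# or route packets that iptables has marked with 0x25 the same way
ip rule add fwmark 0x25 lookup vpn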

A bit about Docker networking

Docker is a marvel of technology but at times feels very user-hostile due to its rigidity - it makes a lot of assumptions and doesn't often communicate them well in documentation.

So, to the point: Docker supports adding multiple network interfaces to containers, great! I can have my specific container continue to join the default Docker network and talk to my other containers, and create a new network specifically for this container that maps to my target outgoing interface on the host.

However, the user-hostility: Docker doesn't let you customize which network is the container's default route. Normally I wouldn't care and we'd just use policy-based routing to fix that, but remember how it's the kernel's routing decision? Well, containers don't have their own kernel. Docker is all NAT magic under the hood, and the actual routing decision is made on the host.

Turns out, you can influence the default route in a container... It's just that Docker uses the first network added to a container as the default, and from testing it appears to also add networks to containers alphabetically. OK then.

Putting it all together

Our recipe will leverage three key components:
1. A custom Docker network named such that Docker adds it to the container first, making it the default route
2. An IP tables rule to mark packets coming out of that Docker network
3. Policy-based routing on the host to route marked packets through the non-default interface

Here we go:

# create a new Docker-managed, bridged connection
# 'avpn' because docker chooses the default route alphabetically
DOCKER_SUBNET="172.57.0.0/16"
docker network create --subnet=$DOCKER_SUBNET -d bridge -o com.docker.network.bridge.name=docker_vpn avpn

# mark packets from the docker_vpn interface during prerouting, to destine them for non-default routing decisions
# 0x25 is arbitrary, any hex (int mask) should work
firewall-cmd --permanent --direct --add-rule ipv4 mangle PREROUTING 0  -i docker_vpn ! -d $DOCKER_SUBNET -j MARK --set-mark 0x25

# alternatively, for regular iptables:
#iptables -t mangle -I PREROUTING -i docker_vpn ! -d $DOCKER_SUBNET -j MARK --set-mark 0x25

# create new routing table - 100 is arbitrary, any integer 1-252
echo "100 vpn" >> /etc/iproute2/rt_tables

# configure rules for when to route packets using the new table
ip rule add from all fwmark 0x25 lookup vpn

# setup a different default route on the new routing table
# this route can differ from the normal routing table's default route
ip route add default via 10.17.0.1 dev tun0 

# connect the docker_vpn
docker network connect docker_vpn mycontainer

That's it! You should now see outgoing traffic from mycontainer going through tun0 with gateway 10.17.0.1. To get this all set up automatically on boot, I recommend looking into docker-compose to attach the docker_vpn network, and creating files route-docker_vpn and rule-docker_vpn in /etc/sysconfig/network-scripts to configure the routing rules.
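Before persisting anything, you can double-check the routing decision (mycontainer and tun0 are from the example above):

# the container's default route should be via the avpn bridge gateway (172.57.0.1 here)
docker exec mycontainer ip route | head -n1

# and its outgoing traffic should appear on the VPN interface
tcpdump -ni tun0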

Note that you may need an add-on package for NetworkManager to trigger the network-scripts - on Fedora, it's called NetworkManager-dispatcher-routing-rules.
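For reference, here's roughly what those two files might contain, assuming Fedora/RHEL network-scripts syntax and the addresses from the example above:

# /etc/sysconfig/network-scripts/rule-docker_vpn
from all fwmark 0x25 lookup vpn

# /etc/sysconfig/network-scripts/route-docker_vpn
default via 10.17.0.1 dev tun0 table vpn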

Using Homebridge with cmdswitch2 to control your Linux machine over HomeKit

A few of my tech projects experience occasional hiccups and need to be soft-reset from my Linux host (e.g. Wi-Fi SSID routed through VPN, Windows gaming VM with hardware passthrough). This was annoying as it meant having a machine nearby to SSH in and execute a few simple commands -- often just a systemctl restart foo. Fortunately, homebridge-cmdswitch2 can easily expose arbitrary commands as lights, so I can easily bounce the services from my phone.

First, since Homebridge should be running as its own system user, we need to give it permission to restart services (as root). We don't want to grant access to all of /bin/systemctl, so a wrapper script will be placed at /usr/local/bin/serviceswitch to encapsulate the desired behavior. Grant the homebridge user permission to run it with sudo:

cat << EOF > /etc/sudoers.d/homebridge-cmdswitch
homebridge ALL = (root) NOPASSWD: /usr/local/bin/serviceswitch
EOF

Next, let's create that /usr/local/bin/serviceswitch script with service status, start and stop commands - using such a wrapper also has the benefit that complex checks consisting of several commands can be performed. Keep in mind these are now being run as root from Homebridge!

#!/bin/sh

if [ "$(id -u)" -ne 0 ];then
  echo "You must run this script as root."
  exit 1
fi

usage() {
  error="$1"
  if [ ! -z "$error" ];then
    echo "Error: $error"
  fi
  echo "Usage: $0 [action] [service]"
}

action="$1"
service="$2"
if [ -z "$action" ] || [ -z "$service" ];then
  usage
  exit 1
fi

case $action in
  start|stop|status) ;;
  *) usage "invalid action, must be one of [start, stop, status]"; exit 1;;
esac

case $service in
  vm-guests)
    [ "$action" == "start" ] && (systemctl start libvirt-guests)
    [ "$action" == "stop" ] && (systemctl stop libvirt-guests)
    [ "$action" == "status" ] && { systemctl -q is-active libvirt-guests; exit $?; }
    ;;
  fileserver)
    [ "$action" == "start" ] && (systemctl start smb;systemctl start nmb;systemctl start netatalk)
    [ "$action" == "stop" ] && (systemctl stop smb;systemctl stop nmb;systemctl stop netatalk)
    [ "$action" == "status" ] && { (systemctl -q is-active smb && systemctl -q is-active nmb && systemctl -q is-active netatalk); exit $?; }
    ;;
  web)
    [ "$action" == "start" ] && (systemctl start httpd)
    [ "$action" == "stop" ] && (systemctl stop httpd)
    [ "$action" == "status" ] && { systemctl -q is-active httpd; exit $?; }
    ;;
  *) usage "invalid service"; exit 1;;
esac
exit 0
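Before wiring it into Homebridge, it's worth sanity-checking the wrapper and the sudoers entry from a shell (web is one of the services defined above):

# exit code 0 means httpd is active; non-zero means it's stopped
sudo -u homebridge sudo /usr/local/bin/serviceswitch status web; echo "exit code: $?"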

Finally, here is the relevant platform section from the homebridge config:

{
  "platforms": [{
    "platform": "cmdSwitch2",
    "name": "Command Switch",
    "switches": [{
       "name" : "vm-guests",
        "on_cmd": "sudo /usr/local/bin/serviceswitch start vm-guests",
        "off_cmd": "sudo /usr/local/bin/serviceswitch stop vm-guests",
        "state_cmd": "sudo /usr/local/bin/serviceswitch status vm-guests",
        "polling": false,
        "interval": 5,
        "timeout": 10000
    },
    {
       "name" : "fileserver",
        "on_cmd": "sudo /usr/local/bin/serviceswitch start fileserver",
        "off_cmd": "sudo /usr/local/bin/serviceswitch stop fileserver",
        "state_cmd": "sudo /usr/local/bin/serviceswitch status fileserver",
        "polling": false,
        "interval": 5,
        "timeout": 10000
    },
    {
       "name" : "web",
        "on_cmd": "sudo /usr/local/bin/serviceswitch start web",
        "off_cmd": "sudo /usr/local/bin/serviceswitch stop web",
        "state_cmd": "sudo /usr/local/bin/serviceswitch status web",
        "polling": false,
        "interval": 5,
        "timeout": 10000
    }]
  }]
}

Using Monit to restart a running service, without automatically starting it

I recently ran into an issue where a bug in one of my Docker containers would intermittently chew through CPU until restarted. I wanted Monit to automatically restart the service when it was eating CPU (which ordinarily is trivial to do), but due to the mapped volume, I only wanted it to stop & start the service if it was already running. Otherwise, Monit would proceed to start the container on boot prior to the mounted drive being present, resulting in a bunch of headaches.

"if already running" turned out to be a little more complicated than I expected. Monit doesn't have a good answer for this built-in, so the key is to override the start action by executing a no-op when the service isn't running:

check process home-assistant MATCHING ^python.+homeassistant
   start program = "/usr/bin/docker start home-assistant"
   stop  program = "/usr/bin/docker stop home-assistant"
   if cpu usage > 13% for 3 cycles then restart
   if does not exist then exec /bin/true

Monit considers 100% CPU usage to be full utilization across all cores, which is why you see 13% (you can verify a service's current CPU usage by checking the output of monit status). In my case, 13% works out to about 65% CPU on a single core, which (sustained over 3 minutes) I deemed enough to recognize that the bug had occurred.

Here you see I'm also using the MATCHING syntax because the entrypoint for Docker containers may change in the future (I don't maintain this one myself).

The only downside to this method is that Monit will repeatedly log that the service is not running until it is started. In my case, because I start my Docker services on boot, it wasn't an issue.
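After adding the snippet to your Monit configuration, reload Monit and confirm the check was picked up:

# re-read the configuration without restarting Monit
monit reload

# show the status of all checks, including current CPU usage
monit status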

Automatically splitting, cropping and rotating multiple photos from a combined scan

I recently offered to digitize all of the family & childhood 4x6 inch photo prints, which ended up being a harder task than I expected due to some newbie mistakes.

Originally, I had thought it would be a piece of cake: simply scan multiple photos at a time with a flatbed scanner, which I could trigger from my computer and save images directly to it via the macOS Image Capture app. I'd then write a quick script to detect the whitespace between photos and crop them out.

To maximize the number of print photos per scan, I arranged my scans like this:
Example of bad photo positioning on flatbed scanner

This turned out to be a terrible idea.

Scanners are not perfect, and both scanners I used in the process of digitization captured black bars near the edge of the scan, particularly on the side where the lid closes:

Example 1 of edge artifacts in scan

Example 2 of edge artifacts in scan

Example 3 of edge artifacts in scan

This is a classic example of something really easy for the human eye to spot, but difficult for a computer to detect. Auto-split features in Photoshop didn't work, nor did open-source tooling like ImageMagick + Fred's multicrop.

Doing it properly

So, did you volunteer to digitize a bunch of photos as well? Don't be like me; follow these steps and you'll have no problems at all:

  1. Use Image Capture (or equivalent for your OS) to begin scanning photos
  2. Set format to PNG and DPI to 300 (you can do 600 if you'd like but it will be considerably slower and isn't useful unless you intend to make larger prints than the originals)
  3. Position photos non-overlapping in the center of the flatbed so that whitespace exists on all sides, like this:
    Example of good photo positioning on flatbed scanner
  4. After you've completed all your scans, install ImageMagick. It's typically available via Homebrew or your OS' package manager.
  5. Download Fred's multicrop and run it:

    cd /path/to/scans
    mkdir split
    for photo in *.png;do
      /path/to/multicrop "$photo" "split/${photo// /_}"
    done
    

    I noticed that the multicrop script has issues if you specify spaces in the output filename, so this invocation automatically replaces them with underscores.

I doubt this will be relevant for much longer since I'm likely the last generation that will need to do this, but hopefully this helps!

But wait, how did I fix it?

After learning the above, you might ask: surely I didn't have to re-scan all the photos?

I was not thrilled about the prospect of cropping manually, but I was also not about to rescan some 2000 photos. With a bit of help from ImageMagick, I was able to get most of the pictures auto-cropped and rotated thanks to Fred's great script.

The photos that were joined by a black bar would still need to be split manually, but most of the scanned photos could still benefit from being auto-cropped and rotated.

I wrote a quick script to address the issue:

  1. Chop 6px off the left of the combined scans, which was roughly the width of the black artifacting
  2. Take each combined scan and add a 50px margin to the left and top, to ensure each individual photo has whitespace on all sides
  3. Run Fred's multicrop script as usual

Here's the script:

#!/bin/bash

getFilename() {
    filename=$(basename "$1")
    filename="${filename%.*}"
    echo "$filename"
}

getExtension() {
    filename=$(basename "$1")
    extension=$([[ "$filename" = *.* ]] && echo ".${filename##*.}" || echo '')
    echo "$extension"
}

pad() {
    in="$(getFilename "$1")"
    ext="$(getExtension "$1")"

    # crop 6px from the left to remove the black edge artifacts
    convert "${in}${ext}" -gravity West -chop 6x0 "tmp/${in}-cropped${ext}"

    # add 50px whitespace on the top and left; gravity southeast anchors the
    # image at the bottom-right so the extra canvas lands on the top and left
    convert "tmp/${in}-cropped${ext}" -gravity southeast -background white -extent "$(identify -format '%[fx:w+50]x%[fx:h+50]' "${in}${ext}")" "tmp/${in}-extended${ext}"
}

split() {
    in="$(getFilename "$1")"
    ext="$(getExtension "$1")"
    ~/bin/multicrop "tmp/${in}-extended${ext}" "output/${in// /_}-split${ext}"
}

# 'done' will hold the original combined scans once they have been processed
mkdir -p tmp output done
for combined in *.png;do
    pad "$combined"
    split "$combined"
    mv "$combined" done
done