Continuous Integration testing with GitHub Actions using tox and hypothesis

I recently published a major update for the Python tmtoolkit package for text mining and topic modeling. Since it is a fairly large research software package, I’m using a Continuous Integration (CI) system for automated testing on different platforms. This system makes sure that every code update pushed to the software repository is automatically checked by running the test suite on all three major operating systems (Linux, macOS, Windows). For the recent update of tmtoolkit, I decided to move the CI system from Travis CI to GitHub Actions (GHA), since GHA is directly integrated into GitHub and easy to set up. Still, there are some obstacles to overcome, so this short post shows how to set up GHA for a Python project with a few extra requirements, such as installing system packages on the test runner machine or running tests with tox and hypothesis.

Creating a new workflow

You should start by creating a YAML file in the folder .github/workflows of your source code repository and give the GHA workflow a name:

# GHA workflow for running tests

name: run tests

# [...]

Specifying branches

Next, you should specify when the workflow is executed. For automated testing, this is usually whenever you push commits to the GitHub repository. It often makes sense to specify certain branches for which the workflow should be executed when updates are pushed and ignore pushes to all other branches. You can also specify patterns. All this can be done in a section that is introduced with the “on:” keyword. The next example sets up the “run tests” workflow to be activated for all pushes to “master”, “develop” and anything that starts with “release”:

# [...]

on:
  push:
    branches:
      - master
      - develop
      - 'release*'

# [...]
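
To get a feeling for which branch names such patterns cover, the matching can be approximated in a few lines of Python. This is only a sketch. It assumes that, as described in the GitHub documentation on filter patterns, “*” matches any characters except “/” while “**” also matches “/”. The function names branch_matches and triggers_workflow are made up for this illustration, and GitHub’s real matcher supports more syntax (such as “?”, “+”, “[]” and “!”).

```python
import re


def branch_matches(pattern: str, branch: str) -> bool:
    """Approximate a GHA branch filter (assumption: '*' stops at '/', '**' does not)."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            parts.append(".*")       # '**' may span slashes
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")    # '*' stays within one path segment
            i += 1
        else:
            parts.append(re.escape(pattern[i]))  # literal character
            i += 1
    return re.fullmatch("".join(parts), branch) is not None


# the patterns from the "on: push: branches:" section above
BRANCH_PATTERNS = ["master", "develop", "release*"]


def triggers_workflow(branch: str) -> bool:
    """True if a push to `branch` matches any of the configured patterns."""
    return any(branch_matches(p, branch) for p in BRANCH_PATTERNS)


for name in ["master", "release-0.11", "release/0.11", "feature-x"]:
    print(name, triggers_workflow(name))
```

Under that assumption, a branch called release/0.11 would not be covered by 'release*'. A pattern such as 'release**' would be needed for branch names that contain a slash.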

Setting up jobs with a build matrix

The next step is to set up the jobs that carry out your workflow instructions. Jobs are executed by runners, which are basically virtual machines. GHA lets you set up runners with different operating systems and Python versions, which is very helpful for testing your code on a range of system configurations. A convenient way to set up these job configurations is to use a build matrix. It specifies all combinations of configuration options that should spawn a job, e.g. Ubuntu and macOS with Python versions 3.8 to 3.10. This would spawn six jobs (the Cartesian product of two operating systems and three Python versions, i.e. Ubuntu w/ Python 3.8, Ubuntu w/ Python 3.9, …, macOS w/ Python 3.10). Jobs can run in parallel if they’re independent of each other, which is the case for our tests. All this must be specified in the “jobs” section as shown in this example, which adds Windows as a third operating system and therefore spawns nine jobs:

# [...]

jobs:
  build:
    runs-on: ${{ matrix.os }}
    strategy:
      matrix:
        os: [ubuntu-latest, macos-latest, windows-latest]
        python-version: ["3.8", "3.9", "3.10"]

# [...]
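
The way the matrix expands into jobs can be reproduced with a Cartesian product in Python. This is only an illustration of the combinatorics, not of how GHA schedules the jobs. The variable names are chosen for this sketch.

```python
from itertools import product

operating_systems = ["ubuntu-latest", "macos-latest", "windows-latest"]
python_versions = ["3.8", "3.9", "3.10"]

# one job per (os, python-version) combination
jobs = [
    {"os": os_name, "python-version": version}
    for os_name, version in product(operating_systems, python_versions)
]

print(len(jobs))
for job in jobs:
    print(job["os"], job["python-version"])

# why the versions are quoted in the YAML file:
# read as a number, 3.10 collapses to 3.1
print(str(float("3.10")))
```

The last line shows why the Python versions are written as strings in the workflow file: an unquoted 3.10 would be read as the number 3.1.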

Job instructions

Finally, it’s necessary to define the instructions that each job should execute. This is done in the “steps” section inside “jobs / build”. For running automated tests, we need to do the following things:

1. check out the GitHub repository
2. set up a Python environment
3. optionally install system dependencies
4. install Python dependencies
5. run the tests

The tmtoolkit package uses pytest for automated testing. Furthermore, it also uses tox for installing all required dependencies before the testing process and checking that the package can be installed correctly. So tox is basically a tool that automatically installs your package into a temporary virtual environment and then runs all tests inside of that environment. This comes in very handy when using CI tools such as GHA for testing, since once you’ve set up tox to run on your local machine, you can easily reproduce it on any other machine. So tox completely handles steps four and five from the above list.

Let’s start with steps one and two: we can use pre-defined actions for these, namely the checkout and the setup-python actions. For the latter, we also specify the Python version to use and enable caching for packages that are installed via pip. This saves some time as pip can reuse the cached packages instead of downloading them again every time the job is run.

    # [...]

    steps:
      - uses: actions/checkout@v2
      - name: set up python ${{ matrix.python-version }}
        uses: actions/setup-python@v2
        with:
          python-version: ${{ matrix.python-version }}
          cache: 'pip'

      # [...]

Some Python packages that should be installed in step four may require packages to be installed at the OS level. For example, some topic modeling evaluation metrics in tmtoolkit require multiple-precision arithmetic provided by the gmpy2 package. This package in turn requires some system packages (namely libgmp-dev, libmpfr-dev and libmpc-dev on Ubuntu). I couldn’t install these dependencies on the macOS and Windows runners, but at least on Ubuntu it works fine. We can add a conditional rule to run step three only on Ubuntu Linux:

      # [...]

      - name: install system dependencies (linux)
        if: runner.os == 'Linux'
        run: |
          sudo apt update
          sudo apt install -y libgmp-dev libmpfr-dev libmpc-dev

      # [...]
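
Since the system libraries are only installed on the Linux runners, anything that depends on gmpy2 has to cope with the package being absent on the other platforms. How tmtoolkit handles this isn’t covered here. One common approach, sketched below as an assumption, is to detect whether the optional package is available and skip the affected tests otherwise. The helper name module_available is hypothetical.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if the top-level module `name` can be imported in this environment."""
    return importlib.util.find_spec(name) is not None


# optional dependency: only present where the system libraries could be installed
HAS_GMPY2 = module_available("gmpy2")

print("gmpy2 available:", HAS_GMPY2)

# In a test module this flag could guard the affected tests, e.g. with
# pytest.mark.skipif(not HAS_GMPY2, reason="gmpy2 not installed")
```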

As explained before, steps four and five are quite easy to set up since we’re using tox to manage installing the Python packages required for testing tmtoolkit. So we actually only need to do two things in step four: update pip itself and install tox.

      # [...]

      - name: install python dependencies
        run: |
          pip install -U pip
          pip install tox

      # [...]

In step five we finally run tox, which in turn installs the Python dependencies and runs the tests:

      # [...]

      - name: run tox
        run: tox -- --hypothesis-profile=ci

      # [...]

That’s it! One drawback of using tox is that so far GHA doesn’t support caching tox environments. This means that on each job run, the Python packages will be installed anew by tox which may take quite some time.
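
A note on the double dash in the tox command from step five: it separates tox’s own options from the positional arguments that tox substitutes for {posargs} in the test command. The sketch below mimics that mechanism in plain Python. It is a simplification and not tox’s actual implementation, and the command template “pytest {posargs}” is an assumed tox.ini entry that isn’t shown in this post.

```python
import shlex


def split_posargs(argv):
    """Split a command line at the first '--' into (own options, positional args)."""
    if "--" in argv:
        idx = argv.index("--")
        return argv[:idx], argv[idx + 1:]
    return list(argv), []


def render_command(template, posargs):
    """Fill the {posargs} placeholder of a command template (assumed tox.ini entry)."""
    return shlex.split(template.replace("{posargs}", shlex.join(posargs)))


own_opts, posargs = split_posargs(["tox", "--", "--hypothesis-profile=ci"])
print(own_opts)   # what tox keeps for itself
print(posargs)    # what is handed on to the test command
print(render_command("pytest {posargs}", posargs))
```

If the test command in the tox configuration lacks the {posargs} placeholder, the extra option never reaches pytest and the ci profile is not activated.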

Failed tests because of slow runners

What about the --hypothesis-profile=ci parameter that is passed to tox in the last step? It is actually quite important. Everything after the double dash is handed on by tox to pytest (provided the test command in the tox configuration includes {posargs}), and the option enables a settings profile for the property-based testing framework hypothesis. The reason is the following: hypothesis generates random inputs for testing your functions, and by default it treats slow input generation and slow test runs as failures. A test fails if a single generated example takes longer than the deadline (200 ms by default), and the “too slow” health check additionally fails a test when generating the inputs takes too long. This is fine when running tests on a local machine, but generating values and testing a function can be temporarily slow on virtual machines like CI runners, which would cause tests to fail although they actually work fine. To circumvent this, you can create a file called conftest.py in your project root, which pytest loads automatically, and set up a special ci profile with a very long deadline (deadlines are given in milliseconds):

from hypothesis import settings, HealthCheck

# set default timeout deadline
settings.register_profile('default', deadline=5000)

# profile for CI runs on GitHub machines, which may be slow from time to time so we disable
# the "too slow" HealthCheck and set the timeout deadline very high (60 sec.)
settings.register_profile('ci',
                          suppress_health_check=(HealthCheck.too_slow, ),
                          deadline=60000)

# load default settings profile
settings.load_profile('default')
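
As a possible variation on passing the profile name on the command line, the profile could also be chosen from the environment. The sketch below relies on GitHub’s runners setting the environment variable GITHUB_ACTIONS to “true”. The function name pick_profile is hypothetical, and this is not how the workflow in this post selects the profile.

```python
import os


def pick_profile(environ) -> str:
    """Return 'ci' on GitHub Actions runners and 'default' everywhere else."""
    # assumption: GitHub Actions sets GITHUB_ACTIONS=true on its runners
    if environ.get("GITHUB_ACTIONS") == "true":
        return "ci"
    return "default"


print(pick_profile(os.environ))

# In conftest.py the result could replace the hard-coded profile name:
# settings.load_profile(pick_profile(os.environ))
```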

The full GHA workflow file from which I took the above examples is available on GitHub.
