Database tests in Python
This post describes a setup for automated tests with a real database. These are the steps we’ll cover:
- Database setup with Docker
- Collecting test data with
mysqldump
- Implementing useful
pytest
fixtures - Writing a simple test case
I’m using a specific tech stack (GitLab, MySQL/MariaDB, pytest), however the overall strategy doesn’t depend on these details. It should be relatively easy to swap out, for example, GitLab with Bitbucket or MySQL with Postgres.
Database setup with Docker
In GitLab CI/CD we can run a database in a service container next to our test container. We only need to specify an image that comes with a database under services
in .gitlab-ci.yml
.
dbtests:
services:
- name: mariadb:11.1.2-jammy
alias: mysql
variables:
MYSQL_ROOT_PASSWORD: $MYSQL_ROOT_PASSWORD
MYSQL_PORT: 3306
MYSQL_HOST: "mysql"
DB_TESTS_ENABLED: 1
FF_NETWORK_PER_BUILD: 1
We define an alias that we’ll use as host name in the test image. It’s important to enable FF_NETWORK_PER_BUILD
, to create a Docker network such that the database container can be resolved by name. DB_TESTS_ENABLED
is a custom environment variable that we’ll use as a feature flag to enable the database tests.
Now this will work on the server, but how do we run the tests locally during development? We’ll use docker compose up -d
to spin up the same database image as in the pipeline with this docker-compose.yml
.
version: '3.8'
services:
mysql:
image: mariadb:11.1.2-jammy
environment:
MYSQL_ROOT_PASSWORD: testpassword
ports:
- '3306:3306'
volumes:
mysql:
driver: local
We’ll still be able to run the fast unit tests without database dependency by running pytest
without DB_TESTS_ENABLED=1
.
An alternative to this solution is using the testcontainers
library for Python. With that library you can spin up Docker containers from your Python code, so you can use the same setup for local development and the pipeline on the server. The downside of that approach is that your test container needs to be able to launch other Docker containers1, which might be a security issue or not even possible, depending on your setup.
Collecting test data with mysqldump
If you’re not dealing with sensitive production data, one simple way of obtaining test data is sampling data from your production database, for example, with mysqldump
.
mysqldump --host "my.database.com" \
--port 3306 \
--user myuser \
--single-transaction \
"mydb" "mytable" \
--where "myid IN ('1', '2', '3')" \
> mydb/mytable.sql
In this case, we just select rows from mytable
in the database mydb
with three constant myid
s, but the where
condition can be as complex as necessary. If you have to join several tables to be able to select the data you need, go ahead.
Implementing useful pytest
fixtures
We implement a fixture that gives us a connection to our database. This will be useful both for preparing the database and for testing.
@pytest.fixture(scope="function")
def mysql_connection():
with MySQLdb.connect(
="root",
user=os.environ["MYSQL_ROOT_PASSWORD"],
password=os.environ["MYSQL_HOST"],
host=int(os.environ["MYSQL_PORT"]),
port=MySQLdb.constants.CLIENT.MULTI_STATEMENTS,
client_flagas connection:
) yield connection
Here I use the mysqlclient
lib. It’s a C wrapper, which means it’s fast, but depends on a few system libs. If you can’t or don’t want to install these dependencies, use a pure-Python client like PyMySQL. The connection is configured with values from environment variables.2 To set them locally, you can use .env
.
Now we use the connection from the fixture to write our sampled test data to the database.
@pytest.fixture(scope="function")
def prepare_db(mysql_connection):
with mysql_connection.cursor() as cursor:
# Create fresh DB
"DROP DATABASE IF EXISTS mydb;")
cursor.execute("CREATE DATABASE mydb;")
cursor.execute(
# Insert data from mysql dump
"USE mydb;")
cursor.execute("mydb/mytable.sql").read_text()) cursor.execute(Path(
The fixture is function
scoped to isolate the test cases from one another. If you use the same data in each test case and only read from the database, you could make it module
or session
scoped to speed up your tests.
For our convenience, we add a little helper function that allows us to hide tests behind a feature flag.
def requires_flag(flag_name: str):
return pytest.mark.skipif(
not os.environ.get(flag_name, False),
=f"Environment variable {flag_name} needs to be set to run this test",
reason )
The helper is using pytest
’s skipif
.
Writing a simple test case
Finally, we use our fixtures to run queries against a real database in our tests.
@requires_flag("DB_TESTS_ENABLED")
@pytest.mark.usefixtures("prepare_db")
def test_sql_query(mysql_connection):
with mysql_connection.cursor() as cursor:
# Whatever you want to test
"SELECT * FROM mytable"); cursor.execute(
We hide the test behind the DB_TESTS_ENABLED
flag, so it will only run if enabled explicitly.
DB_TESTS_ENABLED=1 pytest
That’s it. Go forth and write database tests.