Database backups via mysqldump: from MariaDB container to S3
For a little side project I wanted an easy way to perform regular backups of a MariaDB database and upload the resulting dump, gzipped, to S3.
Here are the steps to make this happen.
The setup
- Docker container running MariaDB
- Docker engine running on an AWS EC2 instance
- An S3 bucket as the destination for the dumps
Writing the backup script
We begin by writing our shell script, backup.sh, which we will later execute at regular intervals from our host:
#!/bin/bash
set -e
temp_file=$(/usr/bin/mktemp)
trap '/usr/bin/rm -f "$temp_file" "$temp_file.gz"' EXIT
# create db dump
/usr/bin/docker exec <my container> sh -c 'exec mysqldump --hex-blob -uroot -p"$MYSQL_PASSWORD" <my database>' > "$temp_file"
# gzip
/usr/bin/gzip "$temp_file"
# upload to S3
/usr/bin/aws s3api put-object --bucket <my bucket> --key "<some directory/prefix>/$(date -u '+%Y-%m-%d_%H-%M-%S').dump.sql.gz" --body "$temp_file.gz"
exit 0
What are we doing here?
- First, we make sure to have a temp file for our operation (mktemp) and register a trap for cleaning up this temp file (with or without the .gz extension) on exit.
- Next, we perform the dump via mysqldump running inside the mariadb container. The result is written to our temp file.
- Note that we’re using --hex-blob to get the data from binary columns hex-encoded and not garbled up.
- After writing our dump file, we gzip it.
- Finally, we upload the gzipped file via the aws CLI tool to a bucket of our choice and give it a timestamp.
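Before wiring the script up to cron, it is worth running it once by hand and checking that a dump actually lands in the bucket. A quick sketch, reusing the placeholders from above; the listing assumes you run it with credentials that are allowed to list the bucket:
# run the backup once manually; set -e makes the script exit non-zero on any failure
/path/to/script/backup.sh && echo "backup ok"
# check that a timestamped dump showed up
/usr/bin/aws s3 ls "s3://<my bucket>/<some directory/prefix>/"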
Setting permissions for S3 upload
To allow our EC2 instance to put objects into our bucket, we’ll adjust the policy like so:
{
  "Effect": "Allow",
  "Action": [
    "s3:ListBucket",
    "s3:PutObject"
  ],
  "Resource": [
    "arn:aws:s3:::<my bucket>/<some directory/prefix>/*"
  ]
}
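To confirm the instance profile really grants the upload before the first scheduled run, you can try a throwaway upload from the EC2 instance. A minimal sketch using the same call the backup script makes; permission-test.txt is just a hypothetical key name:
# upload an empty test object to the backup prefix
/usr/bin/aws s3api put-object --bucket <my bucket> --key "<some directory/prefix>/permission-test.txt" --body /dev/null
If this succeeds, the PutObject permission is in place. Note that removing the test object again requires s3:DeleteObject, which the statement above does not grant, so you may have to delete it via the console or with broader credentials.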
Installing the cron job
To let our backup script run every nth hour, we set up a cronjob via crontab -e
:
0 */6 * * * /path/to/script/backup.sh && /usr/bin/curl -s https://ping.allgood.systems/l/j/<random id>
Here, we run the script every 6 hours and, on success, issue an HTTP request via curl to a monitoring system. The latter part is optional but a good way to get notified should your script ever stop running at the expected interval. The monitoring service I use is allgood.systems, which I recently built.
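If you also want a local trail for debugging, you can redirect the script's output to a log file in the same crontab entry. A sketch; the log path is just an example and needs to be writable by the cron user:
0 */6 * * * /path/to/script/backup.sh >> /var/log/db-backup.log 2>&1 && /usr/bin/curl -s https://ping.allgood.systems/l/j/<random id>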
Cleaning up old backups
Once the backup files are in your S3 bucket you can decide how to proceed. Generally, it is recommended to archive some of the dumps at a different location.
To regularly clean up your old files I would additionally recommend setting up a “Lifecycle rule” for your S3 bucket or even just a prefix in that bucket. This would then automatically assign an expiry date to your uploaded files.
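If you prefer the CLI over the console for this, a lifecycle rule scoped to the backup prefix could look roughly like the following; the rule name and the 30-day expiry are arbitrary examples:
/usr/bin/aws s3api put-bucket-lifecycle-configuration --bucket <my bucket> --lifecycle-configuration '{
  "Rules": [
    {
      "ID": "expire-old-db-dumps",
      "Filter": { "Prefix": "<some directory/prefix>/" },
      "Status": "Enabled",
      "Expiration": { "Days": 30 }
    }
  ]
}'
Applying a lifecycle configuration needs the corresponding bucket permissions, so you would typically run this with your own admin credentials rather than from the instance role.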
Final thoughts
If you are working with a larger, busier, or more critical database, there are a couple of things you might want to tweak (e.g. with regard to buffering, single transactions, or checking for enough disk space), or you might choose a different approach altogether (e.g. full snapshots or setting up replication).
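For instance, if the database uses InnoDB and sees writes while the dump runs, you might add --single-transaction (a consistent snapshot without locking tables) and --quick (stream rows instead of buffering the whole result set in memory) to the dump command. A sketch of that variant of the line from the script above:
/usr/bin/docker exec <my container> sh -c 'exec mysqldump --single-transaction --quick --hex-blob -uroot -p"$MYSQL_PASSWORD" <my database>' > "$temp_file"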