Docker Compose

May 13, 2024

Following the post for Docker Volumes, I wanted to cover Docker Compose for two reasons:

Some projects that are part of the “modern data stack” use it as an entry point for local testing. (A single docker run command still works well when Compose is overkill.)
Volumes work a little differently when used with Compose vs. a container started with the docker run command

ℹ️

A lot of projects recommend deployment on Kubernetes in production, with Compose suggested for local testing.

Local Testing

As we explore more of the modern data stack, we’ll see most of these products aren’t just a single service or daemon; they’re a lot of moving parts that work together as “Airflow” or “Superset”. Compose is used to logically separate different functions of an application into different containers. For example, OpenMetadata below requires:

Elasticsearch
Postgres
A few containers built from its own images (openmetadata-server, openmetadata-ingestion, etc.)

The file that specifies these requirements, docker-compose.yml, gives you an idea of the config required to run this application.
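To make the idea of "one container per function" concrete, here is a minimal sketch of what such a compose file can look like. It is not the actual OpenMetadata file: the app-server service, its image name, the port 8080, and the placeholder password are all hypothetical, and the image tags are just examples.

```yaml
# Hypothetical compose.yml sketch; NOT the real OpenMetadata compose file.
services:
  postgres:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: example        # placeholder credential, local testing only
    volumes:
      - db-data:/var/lib/postgresql/data  # named volume so data survives restarts

  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.13.4
    environment:
      - discovery.type=single-node      # run as a one-node cluster
      - xpack.security.enabled=false    # skip auth/TLS for local testing
    ports:
      - "9200:9200"

  app-server:
    image: example/app-server:latest    # hypothetical application image
    depends_on:                         # start the backing services first
      - postgres
      - elasticsearch
    ports:
      - "8080:8080"                     # hypothetical web port

volumes:
  db-data:
```

Each top-level key under services becomes its own container, and Compose puts them on a shared network so they can reach each other by service name (for example, the app would connect to the host postgres).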

Here’s a sample list of applications in some of the modern data stack that use Compose as an entry point for local testing:

Project	Docs Link	GitHub
Airbyte	Deploy airbyte - Using Docker Compose
Apache Airflow	Running Airflow in Docker
Apache Superset	Installation - Using Docker Compose
Hydra	Run locally
OpenMetadata	Local Docker Deployment
Redash	Setting up a Redash Instance - Docker

In the list above, each of those software projects has provided a compose file, usually in its repo. It’s theoretically possible to try to run the different components (web/db/cache) in one mega-container. You could also write a separate docker run command for each service needed, along with its port mappings, volumes/mounts, etc.

In my opinion, it’s a lot easier to clone a repo and type docker compose up.

Compose Volumes vs. Docker Container Volumes

Technically they’re the same thing. However, volumes created as part of a compose up command are scoped to the Compose project, so you usually won’t find them under the short name used in the compose file. When a volume name is specified in a compose.yml file, Compose automatically prepends the project name to it: either the top-level name field in the compose file or, if that isn’t set, the name of the folder that contains the compose file.
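As a small illustration of that prefixing, consider the sketch below. The project name demo, the ingestion service, the alpine image, and the /opt/dags mount path are all made up for this example; only the volume name matches the one discussed in this post.

```yaml
# Hypothetical compose.yml used only to illustrate volume naming.
name: demo                      # project name; falls back to the folder name if omitted

services:
  ingestion:
    image: alpine:3.19
    command: ["sleep", "3600"]  # keep the container alive for inspection
    volumes:
      - ingestion-volume-dags:/opt/dags  # short name, as referenced inside the file

volumes:
  ingestion-volume-dags:        # created by Compose as demo_ingestion-volume-dags
```

After docker compose up, docker volume ls should list the volume as demo_ingestion-volume-dags rather than ingestion-volume-dags, which is why a volume created by hand under the short name is not picked up.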

For the OpenMetadata compose file, I can see a volume named ingestion-volume-dags.

Creating Docker Volume `ingestion-volume-dags`

docker volumes screenshot 1

Even if I already have the volume created from earlier, when I run the compose up command for this Compose app, I can see it adds the default name of the compose application to the front of the volumes (image reference).

Docker Volumes after running `docker compose up`

docker volumes screenshot 2

Exception to volume naming with Compose

You can specify external: true for a volume, and Compose will use an existing volume of that exact name, with all of the data already contained there. However, Compose treats that volume as managed outside of its own process: it won’t create it, and running compose up without that volume already present will fail.
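A minimal sketch of the external case, reusing the volume name from this post. The ingestion service, the alpine image, and the /opt/dags mount path are hypothetical placeholders.

```yaml
# Hypothetical compose.yml sketch showing an externally managed volume.
# The volume must exist before "docker compose up", e.g. created with:
#   docker volume create ingestion-volume-dags
services:
  ingestion:
    image: alpine:3.19
    command: ["sleep", "3600"]
    volumes:
      - ingestion-volume-dags:/opt/dags

volumes:
  ingestion-volume-dags:
    external: true              # use the existing volume as-is; no project prefix is added
```

Because the volume is external, Compose looks it up by the exact name ingestion-volume-dags and does not create it if it is missing.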

References

Here’s a short list of good references I’ve found on Docker Compose, particularly the tutorial from “TechWorld with Nana”. If you have the time (about an hour), it takes you through a sample project without too much bloat.

Last updated on May 13, 2024
