Serving Django inside Docker the Right Way

Date: 2023-12-29 14:22:27
Categories: Software Development
Tags: django, docker, python

I've seen a number of tutorials that incorrectly configure django to run inside docker containers by leveraging its built-in dev server. In this post I explore the benefits of using django with gunicorn and nginx and how to set this up using Docker and docker-compose.

I'm working on a couple of side projects that use django and I'm a big fan of docker for simplifying deployment.

Django is a bit of a strange beast in that it has a simple, lightweight development server that you can start with python manage.py runserver, but it can also be run in prod mode using WSGI. Once in this mode, it doesn't serve static files any more. This can be a little off-putting for people who are used to packaging a single server that does everything (like a nodejs app). It is especially confusing to people used to packaging an app along with everything it needs inside docker.

Part 1: Why not just use runserver?

If you already understand why it's better to use WSGI than runserver and just want to see the working config, skip down to Part 2 below.

I've seen a few tutorials for packaging up django apps inside docker containers that just use the runserver mechanism. The problem with this is that you don't get any of the performance benefits of using a proper WSGI runner and in order to handle server load, you end up needing to run multiple copies of the docker container very quickly.

A Rudimentary Performance Test

I did a quick performance test against the python runserver versus my WSGI + Nginx configuration (below) to illustrate the difference on my desktop machine. I used bombardier and asked it to make as many requests as it can for 10s with up to 200 concurrent connections:

bombardier -c 200 -d 10s http://localhost:8000

The thing at that address is the index view of my django app so we're interested in how quickly we can get the Python interpreter to run and return a response.

The python runserver results:

Statistics        Avg      Stdev        Max
  Reqs/sec      1487.50     633.11    2988.81
  Latency      133.84ms   259.97ms      7.12s
  HTTP codes:
    1xx - 0, 2xx - 15042, 3xx - 0, 4xx - 0, 5xx - 0
    others - 0
  Throughput:    13.70MB/s

And the WSGI config results:

Statistics        Avg      Stdev        Max
  Reqs/sec      1754.20     666.40   16224.55
  Latency      115.05ms     7.23ms   174.44ms
  HTTP codes:
    1xx - 0, 2xx - 17472, 3xx - 0, 4xx - 0, 5xx - 0
    others - 0
  Throughput:    15.95MB/s

As you can see, using a proper deployment configuration, the average number of requests handled per second goes up by about 18% (1487.50 to 1754.20). We also get a much more consistent latency: 115ms on average with a deviation of about 7ms. In the first example latency is all over the place and, if you're really unlucky, you're the person waiting 7s for the index page to load.

Testing Static File Service

Now let's look at handling files. When we use runserver we are relying on the python script to serve up the files we care about. I ask bombardier to request the logo of my app as many times as it can for 10 seconds like before:

bombardier -c 200 -d 10s http://localhost:8000/static/images/logo.png

First we run this with django runserver:

Statistics        Avg      Stdev        Max
  Reqs/sec       731.51     252.55    1795.93
  Latency      270.50ms   338.53ms      5.01s
  HTTP codes:
    1xx - 0, 2xx - 7504, 3xx - 0, 4xx - 0, 5xx - 0
    others - 0
  Throughput:   255.27MB/s

And again with Nginx and WSGI.

Statistics        Avg      Stdev        Max
  Reqs/sec      6612.33     705.07    9332.41
  Latency       30.27ms    19.95ms      1.30s
  HTTP codes:
    1xx - 0, 2xx - 66156, 3xx - 0, 4xx - 0, 5xx - 0
    others - 0
  Throughput:     2.25GB/s

And suddenly the counter-intuitive reason for Django splitting static file service from code execution makes a little bit more sense. Since we are just requesting static files, Python never actually gets called. Nginx, which is an efficient server that is written in battle-hardened C, is able to just directly serve up the static files.

In the first example, Python is the bottleneck and using Nginx + WSGI just makes some of the shifting around of information a little bit smoother. In the second example, we can completely sidestep python.
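To put rough numbers on the two cases, here is a small sketch that turns the average Reqs/sec figures from the bombardier tables above into speedup ratios. The speedup helper is just a name I've made up for illustration, and the figures are from a single run on my desktop, so treat them as indicative rather than as a benchmark.

```python
def speedup(before: float, after: float) -> float:
    """Return how many times larger `after` is than `before`."""
    return after / before


# Average Reqs/sec figures copied from the bombardier tables above.
index_view = speedup(1487.50, 1754.20)   # runserver vs gunicorn + nginx
static_file = speedup(731.51, 6612.33)   # runserver vs nginx serving the file

print(f"index view:  {index_view:.2f}x")
print(f"static file: {static_file:.2f}x")
```

This prints roughly 1.18x for the index view and 9.04x for the static file, which is the gap between "python is still doing the work" and "python is not involved at all".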

If you still need convincing...

The devs literally tell you not to use runserver in prod in the official docs:

DO NOT USE THIS SERVER IN A PRODUCTION SETTING. It has not gone through security audits or performance tests. (And that's how it's gonna stay. We're in the business of making web frameworks, not web servers, so improving this server to be able to handle a production environment is outside the scope of Django.)

django-admin and manage.py - Django documentation

Part 2: Packaging Django + WSGI in Docker

Getting Started

Ok so I'm going to assume that you have a django project that you want to deploy and it has a requirements.txt file containing the dependencies that you have installed. If you are using a python package manager, I'll drop some hints but you'll have to infer what is needed in a couple of places.

Install and Configure Gunicorn

Firstly, we need to add a WSGI server component that we can run inside the docker container. I will use gunicorn.

pip install gunicorn (or you know, pdm add/poetry add etc)

We can test that it's installed and working by running:

gunicorn -b :8000 appname.wsgi

If you go to localhost:8000 you should see your app there but, wait a minute, there are no images or css or js. As I mentioned, django won't serve your static resources so we'll pair gunicorn up with nginx in order to do that.
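If you are wondering what the appname.wsgi part of that command points at: gunicorn imports that module and looks for a callable named application, which it then calls once per request. Django generates this for you in wsgi.py. The sketch below is not Django's implementation, just a standard-library-only illustration of the WSGI interface; call_once is a hypothetical helper I've added to play the role of the server.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """A minimal WSGI callable; a server such as gunicorn calls this per request."""
    body = f"Hello from {environ['PATH_INFO']}\n".encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def call_once(app, path="/"):
    """Invoke a WSGI app the way a server would and return (status, body)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the keys the WSGI spec requires
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body
```

Notice that nothing in this interface knows about files on disk, which is part of why static files get handed off to a separate web server.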

Collect Static Resources

nginx needs a folder that it can serve static files from. Thankfully django's manage.py has a command to do this so we can simply run:

python manage.py collectstatic --noinput

The --noinput argument prevents the script from asking you questions in the terminal. It will simply dump the files into the folder that your STATIC_ROOT setting points at, which in this post is a static folder in the project directory.

Try running the command in your django project to see how it works. We'll be using this in the next step.
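Where collectstatic writes to is controlled by the STATIC_ROOT setting. One way to make that configurable from Docker is to read it from an environment variable in settings.py. This is a minimal sketch under that assumption: resolve_static_root is a hypothetical helper name, and your own settings.py may already do something different.

```python
import os
from pathlib import Path


def resolve_static_root(base_dir: Path, environ=os.environ) -> str:
    """Return the folder collectstatic should write to.

    Uses the STATIC_ROOT environment variable if it is set, otherwise
    falls back to a "static" folder inside the project directory.
    """
    return environ.get("STATIC_ROOT", str(base_dir / "static"))


# In settings.py (BASE_DIR is defined by Django's default project template):
# STATIC_URL = "/static/"
# STATIC_ROOT = resolve_static_root(BASE_DIR)
```

With this in place, whatever path the container sets in its environment is where the collected files end up, and it falls back to a sensible default when you run things locally.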

Build a Dockerfile for the app

We can produce a docker file that builds our django app and packages any necessary files along with it.

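FROM python:3
WORKDIR /app
# Copy the project (including entrypoint.sh) into the image
COPY . /app

RUN python3 -m pip install -r requirements.txt
# nb if you are using poetry or pdm you might need to do something like:
# RUN python3 -m pip install pdm
# RUN pdm install

# Make sure the startup script can be executed
RUN chmod +x /app/entrypoint.sh

# Static folder inside this container; keep it in line with the path mounted in docker compose
ENV STATIC_ROOT=/app/static
CMD ["/app/entrypoint.sh"]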

NB: if you are using pdm or poetry or similar, you will want to install them inside the image and use them to install your dependencies, as in the commented-out lines above.

We also need to create the entrypoint.sh file which docker will run when the container starts up. Save this file in the root of your project so that it can be picked up by Docker when it builds:

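#!/usr/bin/env bash
# Stop immediately if any step fails
set -e
# nb if you installed with pdm or poetry, prefix these commands with "pdm run" or "poetry run"
python manage.py collectstatic --noinput
python manage.py migrate --noinput
# exec replaces the shell so gunicorn receives docker's stop signals directly
exec gunicorn -b :8000 appname.wsgi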

This script runs the collectstatic command which, with a little bit of docker magic we will hook up to our nginx instance later. Then we run any necessary database migrations and then we use gunicorn to start the web app.
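The gunicorn command above starts with gunicorn's defaults, which means a single worker process. If you want more than that, gunicorn can read its settings from a Python config file. The sketch below is a suggestion rather than part of my setup: the file name gunicorn.conf.py and the worker count are assumptions, with the count following the (2 x cores) + 1 rule of thumb from the gunicorn docs.

```python
import multiprocessing

# Listen on all interfaces inside the container, same port as "-b :8000".
bind = "0.0.0.0:8000"

# Rule-of-thumb starting point; tune for your own app and hardware.
workers = multiprocessing.cpu_count() * 2 + 1

# "-" sends the access log to stdout so it shows up in the container logs.
accesslog = "-"
```

If you save this next to manage.py you could then start the server with gunicorn -c gunicorn.conf.py appname.wsgi instead of passing -b on the command line.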

Build the nginx.conf

We need to configure nginx to serve static files when someone asks for /static/something and forward any other requests to the django app. Create a file called nginx.conf and copy the following:

events {
    worker_connections  1024;  # Adjust this to your needs
}

http {
    include       mime.types;
    default_type  application/octet-stream;
    sendfile        on;
    keepalive_timeout  65;

    # Server block
    server {
        listen       80;
        server_name  localhost;

        # Static file serving
        location /static/ {
            alias /static/;
            expires 30d;
        }

        # Proxy pass to WSGI server
        location / {
            proxy_pass http://frontend:8000;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;
        }
    }
}

This configuration should be relatively self-explanatory but a couple of notes:

  • The alias directive tells nginx to serve static files from the /static/ folder on the filesystem. This is where we will mount the static files using docker later. The behaviour of alias is described in the nginx documentation.
  • The expires 30d directive tells nginx that it can let browsers cache these static files for up to 30 days at a time - hopefully saving bandwidth and speeding things up.
  • If the request does not start with /static/ then nginx will assume it is a request for django and send it to http://frontend:8000 - again we will configure docker so that the django gunicorn process is listening there.
  • Note that we use /static/ twice but the two values mean different things. The location /static/ prefix is the URL path and should match django's STATIC_URL. The alias /static/ path is a folder inside the nginx container and needs to match wherever we mount the collected static files in docker compose.
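If it helps to see the routing decision spelled out, here is a deliberately simplified model of what the two location blocks do. It is a sketch of this one config only, not of nginx's full location matching rules, and the route function and its parameter names are made up for illustration.

```python
def route(request_path, location="/static/", alias="/static/",
          upstream="http://frontend:8000"):
    """Model the nginx.conf above: serve a file or proxy to gunicorn.

    With alias, the matched location prefix is replaced by the alias path
    to find the file on disk. Everything else is forwarded unchanged.
    """
    if request_path.startswith(location):
        return ("file", alias + request_path[len(location):])
    return ("proxy", upstream + request_path)


print(route("/static/images/logo.png"))  # served straight from disk
print(route("/admin/"))                  # forwarded to the django container
```

Changing the alias argument to something like /srv/assets/ shows why the URL prefix and the folder on disk are independent: the request for /static/images/logo.png would then be read from /srv/assets/images/logo.png.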

Glue it together with docker compose

We will use docker compose to combine our nginx and django containers together.

services:
  frontend:
    build: .
    restart: unless-stopped
    volumes:
      - ./static:/app/static
    environment:
      DJANGO_CSRF_TRUSTED_ORIGINS: 'http://localhost:8000'

  frontend-proxy:
    image: nginx:latest
    ports:
      - "8000:80"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
      - ./static:/static:ro
    depends_on:
      - frontend

Ok so what we are doing here is using volume mounts to share a single static folder on the host between the two containers. It is mounted at /app/static inside the django container, where the results of collectstatic are dumped, and at /static/ inside the nginx container, where the static files are served from.

We also mount the nginx.conf file in the nginx container. You'll probably end up using docker compose to add database connections too or perhaps a volume mount for a sqlite database file.

Finally we bind port 8000 on the host machine to port 80 in nginx so that when we go to http://localhost:8000 we can see the running app.
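One thing to be aware of is that DJANGO_CSRF_TRUSTED_ORIGINS is not something django reads by itself, so settings.py has to pick it up and feed it into the CSRF_TRUSTED_ORIGINS setting. A minimal sketch, assuming the variable holds a comma-separated list (trusted_origins is a hypothetical helper name):

```python
import os


def trusted_origins(environ=os.environ):
    """Parse DJANGO_CSRF_TRUSTED_ORIGINS into a list of origins.

    Accepts a comma-separated string and ignores blank entries and
    stray whitespace. Returns an empty list if the variable is unset.
    """
    raw = environ.get("DJANGO_CSRF_TRUSTED_ORIGINS", "")
    return [origin.strip() for origin in raw.split(",") if origin.strip()]


# In settings.py:
# CSRF_TRUSTED_ORIGINS = trusted_origins()
```

With the compose file above this would give django a single trusted origin, http://localhost:8000, which matches the host port that nginx is bound to.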

Running it

Now we need to build and run the solution. You can do this by running:

docker compose build
docker compose up -d

Now we can test it out by going to http://localhost:8000. Hopefully you will see your app running in all its glory. We can debug it by using docker compose logs -f if we need to.
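If you would rather check from a script than a browser, something like the following would do as a quick smoke test. It is a sketch using only the standard library, check is a name I've picked for illustration, and it deliberately ignores any proxy settings since we are only talking to localhost.

```python
import urllib.error
import urllib.request

# Bypass any configured HTTP proxy; we only want to reach the local stack.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def check(url, timeout=5.0):
    """Return the HTTP status code for url, or None if nothing answered."""
    try:
        with _opener.open(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, just not with a success code (e.g. 404 or 500).
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure and so on.
        return None
```

Calling check("http://localhost:8000/") exercises the gunicorn route and check("http://localhost:8000/static/images/logo.png") exercises the nginx static route (swap in a static file that exists in your own app). A None result usually means the containers are not up yet, and a 404 on the static path usually means collectstatic output is not landing in the shared folder.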

Conclusion

Hopefully this post has shown you why it is important to set up Django properly rather than relying on runserver and how to do that using Docker, Nginx and Gunicorn. As you can see, it is a little bit more involved than your average npm application install but it isn't too complicated.