
I've got a setup using Django 1.6.7 and Postgres 9.3 on Ubuntu 14.04 LTS.

At any given time, the site holds about 250 simultaneous connections to the PostgreSQL database, which runs on a quad-core Xeon E5-2670 at 2.5GHz with 16GB of RAM. The load average on that machine throughout the day is around 20 to 30.

Occasionally I get emails from Sentry about connections to the database timing out, and I figure enabling some sort of connection pooling will help mitigate this issue, as well as lower the load on the database a bit.

Since we are using Django 1.6, we do have the built-in pooling available to us. However, when I set CONN_MAX_AGE to 10 seconds or 60 seconds, the number of simultaneous connections almost immediately jumps to the maximum allowed setting (which is about double what we usually see), and connections start getting rejected.
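For reference, the setting described above would look roughly like the following in `settings.py`. This is only a sketch: the database name, user, and host are placeholders, not the actual values from this setup. Note that the Django documentation describes `CONN_MAX_AGE` as enabling persistent connections that each thread keeps for itself, rather than a shared pool.

```python
# settings.py (sketch; NAME, USER and HOST are placeholder assumptions)
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql_psycopg2",
        "NAME": "mydb",          # placeholder
        "USER": "myuser",        # placeholder
        "HOST": "db.internal",   # placeholder
        # Keep each connection open for up to 60 seconds after it is opened.
        # 0 closes the connection at the end of every request (the default).
        "CONN_MAX_AGE": 60,
    }
}
```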

So it appears that, for whatever reason, the connections ARE persisting, but they ARE NOT being reused.

What could be the cause of this?

PS. We are also using gunicorn with --worker-class=eventlet. Perhaps this is the source of our woes?

synic

1 Answer


After some more experimenting, I found that the cause of our problem was indeed gunicorn's eventlet worker class. Each microthread made its own persistent connection, and there was no way at all to reuse any of them.

Disabling eventlet has made the load on our webservers go up (but not by much), but the Postgres load average is now down to about 3, from 30.
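To illustrate the mechanism, here is a minimal simulation using only the standard library. It assumes the connection is stored in thread-local storage (which is how Django keeps its connections, as far as I can tell), and it uses plain OS threads as a stand-in for eventlet's green threads. `FakeConnection`, `get_connection`, and `handle_request` are hypothetical names for this sketch, not Django or eventlet APIs.

```python
import threading


class FakeConnection:
    """Stand-in for a database connection; counts how many were opened."""
    opened = 0
    _lock = threading.Lock()

    def __init__(self):
        with FakeConnection._lock:
            FakeConnection.opened += 1


# Each thread sees its own independent attributes on this object.
_local = threading.local()


def get_connection():
    # Reuse a connection only if *this* thread already opened one.
    conn = getattr(_local, "conn", None)
    if conn is None:
        conn = FakeConnection()
        _local.conn = conn
    return conn


def handle_request():
    first = get_connection()
    second = get_connection()
    # Within one thread the connection is reused...
    assert first is second


# ...but every new thread (one per simulated request) opens a fresh one.
threads = [threading.Thread(target=handle_request) for _ in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(FakeConnection.opened)  # 50
```

With a handful of long-lived worker threads, each persistent connection gets reused across many requests. When every request runs in a new short-lived microthread, each one opens its own connection, and with a non-zero CONN_MAX_AGE those connections are left open instead of being closed at the end of the request.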

synic
  • You've just saved us a ton of time! We observe exactly the same behaviour and we are using eventlet. Will try to switch to connection pooling and see how it will work. – silentser Oct 18 '14 at 23:19
  • Update: pooling database connections with pgBouncer seemed to solve the problem (we are still using eventlet) – silentser Nov 13 '14 at 21:07
  • Apparently there's also psycogreen: https://pypi.python.org/pypi/psycogreen/1.0 (I've not tried it as once I set the CONN_MAX_AGE to zero it takes our system 20ms to make a DB connection so we simply don't need pooling) – Darren Jun 30 '16 at 10:37
  • It took me some time googling to turn up this answer to the exact same problem we were having. – Alper Aug 05 '19 at 10:15
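For anyone trying the pgBouncer route mentioned in the comments, the Django side might look something like this. It is a sketch under assumptions: pgBouncer is taken to be running on the same host on port 6432 (its default listen port), and the database name and user are placeholders. Persistent connections are turned off in Django so that pgBouncer alone handles the reuse of server connections.

```python
# settings.py (sketch; assumes pgBouncer listens on 127.0.0.1:6432)
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql_psycopg2",
        "NAME": "mydb",        # placeholder
        "USER": "myuser",      # placeholder
        "HOST": "127.0.0.1",   # pgBouncer, not Postgres directly (assumption)
        "PORT": "6432",        # pgBouncer's default listen port
        # Close Django's connection after every request; connecting to a
        # local pgBouncer is cheap, and it keeps the real Postgres
        # connections open and shared.
        "CONN_MAX_AGE": 0,
    }
}
```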