
Our app's REST API is served by Gunicorn (not behind Nginx) running on AWS EC2 instances with a typical auto-scaling/load balancing setup. The load balancer's idle timeout is 60 seconds, and Gunicorn's keep-alive timeout is 2 seconds. We've been seeing sporadic 504 Gateway Timeout responses from this configuration. According to Amazon docs, this may be because the server's keep-alive timeout is lower than the load balancer's idle timeout setting:

Cause 2: Registered instances closing the connection to Elastic Load Balancing.

Solution 2: Enable keep-alive settings on your EC2 instances and set the keep-alive timeout to greater than or equal to the idle timeout settings of your load balancer.

With Nginx, the default keepalive_timeout is 75 seconds, which apparently works well with the ELB default settings. However, Gunicorn docs recommend a keepalive setting in the range of 1-5 seconds.

Does it make sense to bump Gunicorn's keepalive to 75 seconds, or is there a good reason to keep it below 5 seconds even though we're not using a reverse proxy in front of it?

handsofaten

1 Answer


You will almost certainly want to raise the keepalive timer per the ELB recommendation, because ELB reuses connections. It will hold them until the timeout expires and if another request arrives at the ELB, it will often use one of the already open connections to send it to you.

504 Gateway Timeout is an odd error for this condition but it appears that's what ELB returns when the reuse of a connection coincides with the back-end's premature close.

The 5-second recommendation might make sense if browsers were communicating with the back-end directly, but that isn't the case with ELB, which is itself a proper reverse proxy when running in HTTP mode.
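As a rough illustration of the change, here is a minimal sketch of a Gunicorn config file that derives the keep-alive value from the load balancer's idle timeout. Only `keepalive` is a real Gunicorn setting; `ELB_IDLE_TIMEOUT` and `MARGIN` are hypothetical helper names, and the 15-second margin is an assumption chosen to land on the 75 seconds mentioned in the question.

```python
# gunicorn.conf.py: illustrative sketch, not the asker's actual config.

# Idle timeout configured on the load balancer, in seconds (60 in the question).
# Hypothetical helper name; Gunicorn ignores names it does not recognize.
ELB_IDLE_TIMEOUT = 60

# Hypothetical safety margin so Gunicorn is never the side that closes an
# idle connection first.
MARGIN = 15

# Real Gunicorn setting: seconds to wait for the next request on a
# keep-alive connection. Kept >= the load balancer's idle timeout.
keepalive = ELB_IDLE_TIMEOUT + MARGIN
```

The file would be loaded with something like `gunicorn -c gunicorn.conf.py app:app` (where `app:app` is a placeholder); passing `--keep-alive 75` on the command line should have the same effect. As far as I know, Gunicorn's default sync worker does not hold persistent connections at all, so the setting only matters with a worker type that supports keep-alive.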

Michael - sqlbot
  • Thanks, this is what I suspected. I'll try this change out this week and mark your answer correct if everything goes smoothly :) – handsofaten Jun 05 '16 at 17:10
  • We merged the change about a week ago and 504s have become much less common (a couple of times a week instead of a couple hundred times a week). – handsofaten Jul 01 '16 at 19:22