
Currently, we have a queue size of 3000 requests.

location /api/v2 {
    limit_req zone=bursted burst=3000;
    include /etc/nginx/proxy.conf;
}

The rate limit is 10 requests per second.

limit_req_zone $limit zone=api_slow:10m rate=1r/s;
limit_req_zone $server_name zone=bursted:10m rate=10r/s;

The keep-alive timeout is 30 seconds. In other words, I would expect 2700 requests to be rejected with error code 408 every 30 seconds when the queue is full.

reset_timedout_connection on;
client_body_timeout 10;
send_timeout 2;
keepalive_timeout 30;

During rush hours, I could not find a single request in the logs that NGINX rejected with error code 408 due to a timeout while waiting in the queue to be forwarded to the servlet container. I only see rejections with error code 503, which corresponds to exceeding the request rate.

delaying request, excess: 2958.320, by zone "bursted"
limiting requests, excess: 3000.730 by zone "bursted"

Does NGINX reject requests in such queues by timeout if they wait too long? If so, what is this timeout, and where is it configured?

1 Answer


It seems there is a bit of confusion about how nginx rate limiting and timeouts work. There is no timeout for rate limiting: you just set a rate and a queue size. Any request exceeding the rate is added to the queue to be processed later. Once the queue is completely full, any additional request is rejected with a 503 status code.


In your example you have set a rate of 10 requests per second (10r/s), a burst size of 3000, and a zone 'bursted' with a size of 10 megabytes. Because the zone key is $server_name, the rate limit is counted separately for each defined server.

In other words, your server accepts and processes one request every 0.1 seconds and can queue up to 3000 excess requests, which are then processed at the defined rate: one every 0.1 seconds. And your 10 MB zone can store about 160,000 states (one per key; here, one per $server_name, so far more than you need).

That means if 3011 requests arrive within one second, nginx processes the first 10 requests immediately, puts another 3000 requests in the queue, and rejects the 3011th request with a 503 status code. The queue is then processed at the defined rate of one request every 0.1 seconds. As long as no new requests arrive, the queue gets shorter and new requests can be added to it again. But while the queue already holds 3000 requests, every additional request is rejected with a 503 status code.

This behaviour, linear processing of the burst queue, might make your site appear slow. To prevent that, you can add the nodelay parameter: limit_req zone=bursted burst=3000 nodelay;. With nodelay, all requests in the burst queue are processed immediately, while the slots in the queue are marked as 'taken' and then 'freed' slot by slot at the defined rate, so the configured rate limit is still met over time.
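Applied to the location block from your question, the nodelay variant would look like this (a sketch; I assume the proxy.conf include stays unchanged):

```nginx
location /api/v2 {
    # Allow up to 3000 excess requests, but serve them immediately;
    # queue slots are then freed at the zone's rate of 10r/s.
    limit_req zone=bursted burst=3000 nodelay;
    include /etc/nginx/proxy.conf;
}
```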

By the way: you can change the status code for rejected requests from 503 to 444 by adding limit_req_status 444; to your http config block.
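For instance (a minimal sketch of the http block; 444 is nginx's special code that closes the connection without sending a response):

```nginx
http {
    limit_req_zone $server_name zone=bursted:10m rate=10r/s;
    # Status returned when limit_req rejects a request;
    # the default is 503, while 444 closes the connection silently.
    limit_req_status 444;
}
```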

The two timeouts from your config:

The client_body_timeout 10; will make your server wait up to 10 seconds between successive reads of the client request body. If the client sends nothing within this time, the server closes the connection with a 408 status code.

The keepalive_timeout 30; will make your server close an idle keep-alive connection to a client after 30 seconds. But according to my tests, the time a request spends waiting in the burst queue does not count toward the keepalive_timeout.


You can verify this behaviour yourself by performing load tests with ab or siege.
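For example, with ab (the host and request counts are placeholders; adjust them to your setup):

```shell
# Fire 4000 requests with 100 concurrent clients at the rate-limited
# endpoint; the summary reports how many responses were non-2xx (the
# 503 rejections once the 3000-slot queue fills up).
ab -n 4000 -c 100 http://example.com/api/v2/
```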
