What is the relationship between keep-alive on an HTTP request and a TCP socket in TIME_WAIT? Should they be correlated?

Furthermore, should the system and web server settings be aligned, e.g. server.max-keep-alive-idle = 60? According to "How to reduce number of sockets in TIME_WAIT?", the TIME_WAIT state in Linux is hardcoded at 60 seconds (at least for the Ubuntu/Debian flavours of Linux).

In lighttpd the default is server.max-keep-alive-idle = 5, and they recommend an even lower value under high load. It seems a waste to close an idle HTTP connection after 5 seconds if the TCP socket is still available, assuming of course that the setting net.ipv4.tcp_tw_reuse = 1 does what it says on the tin.

The related question "How does tcp keep a connection alive? [closed]" touches on the issue but doesn't fully answer it for me.

aland
  • TCP is layer 4, HTTP is layer 7. There's no relationship except if you mean HTTP/1.1 persistent connections instead of HTTP Keep-Alive (which is used for HTTP/1.0). – Xavier Lucas Oct 24 '14 at 22:10
  • I wasn't aware that there was much difference between 1.0 and 1.1, other than it being the default in 1.1 - http://www.w3.org/Protocols/rfc2616/rfc2616-sec8.html – aland Oct 24 '14 at 22:20
  • Just to mention, I've since found a great explanation at http://vincent.bernat.im/en/blog/2014-tcp-time-wait-state-linux.html – aland Oct 25 '14 at 00:02

1 Answer

TCP is layer 4, HTTP layer 7.

In HTTP 1.0, HTTP Keep-Alive is used at layer 7 to simulate persistent connections, using the Connection header.

In HTTP 1.1, connections are assumed to be persistent by default and rely on TCP alone to do that job. Requests can be pipelined over the same TCP connection. One side then sets Connection: close in the headers of the last request or response, so both sides know that no more HTTP requests will be exchanged, and the connection is closed.
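To make the persistent-connection behaviour concrete, here is a minimal Python sketch using only the standard library. It is an illustration, not lighttpd's implementation: the handler `PortEchoHandler`, the helper `client_ports_seen` and the `/last` path are made-up names. The server replies with the client's source port, so seeing the same port twice means both requests travelled over one TCP connection.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class PortEchoHandler(BaseHTTPRequestHandler):
    # Speaking HTTP/1.1 makes the connection persistent by default.
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        # Reply with the client's source port so socket reuse is visible.
        body = str(self.client_address[1]).encode()
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        if self.path == "/last":
            # Hypothetical path: announce that the server will close after this reply.
            self.send_header("Connection", "close")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the example quiet


def client_ports_seen(paths):
    """Request each path through one HTTPConnection; return the source ports the server saw."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), PortEchoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
        ports = []
        for path in paths:
            conn.request("GET", path)
            ports.append(int(conn.getresponse().read()))
        conn.close()
        return ports
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(client_ports_seen(["/a", "/b"]))
```

Calling `client_ports_seen(["/last", "/b"])` instead should make the client open a second TCP connection for `/b`, because the first response carried Connection: close.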

For a web server, TIME_WAIT is usually the state it enters once it has decided to actively close the connection, has received the client's FIN packet, and has sent the final ACK of the four-way tear-down. It then waits for 2 * MSL, which is a way to be sure that the connection is really closed. That's where the 60s compiled into the kernel comes from. This way we are sure that a new connection using the same 4-tuple won't receive out-of-sequence packets left over from the previous connection.
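To see how many sockets are sitting in TIME_WAIT on a Linux box, one option is to read /proc/net/tcp, where the fourth column is the socket state as a hex code (06 is TIME_WAIT). The sketch below is a rough illustration under that assumption; the function names are made up, and it only looks at IPv4 (IPv6 sockets live in /proc/net/tcp6).

```python
from collections import Counter

# Hex state codes used by the Linux kernel in /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def count_tcp_states(lines):
    """Count sockets per TCP state from lines in /proc/net/tcp format."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        # Socket rows start with an index like "12:"; the header row does not.
        if len(fields) < 4 or not fields[0].endswith(":"):
            continue
        code = fields[3].upper()
        counts[TCP_STATES.get(code, code)] += 1
    return counts


def count_from_proc(path="/proc/net/tcp"):
    """Read the kernel's IPv4 socket table (Linux only)."""
    with open(path) as fh:
        return count_tcp_states(fh)


if __name__ == "__main__":
    try:
        print(count_from_proc().get("TIME_WAIT", 0), "sockets in TIME_WAIT")
    except OSError:
        print("/proc/net/tcp is not available on this system")
```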

You don't want to change it.

server.max-keep-alive-idle, on the other hand, is the timeout after which an ESTABLISHED connection is considered idle if no HTTP request comes in, and is then actively closed by the web server. When this decision is made, as you now understand, the TCP tear-down takes place.

Be very careful with tcp_tw_recycle: if your visitors come from behind a large NATed network, connections from different clients sharing the same public address can reach the server with out-of-order TCP timestamps, and the server will then silently drop those connection attempts.

So the best option is to adjust the parameter you saw in lighttpd. System-wide, you can safely lower the FIN_WAIT2 timeout with net.ipv4.tcp_fin_timeout and raise the number of buckets for sockets in TIME_WAIT state with net.ipv4.tcp_max_tw_buckets.
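The idle-timeout behaviour can be sketched with a plain socket echo server. This is not how lighttpd is implemented; `serve_one_with_idle_timeout` and its `idle_seconds` argument are hypothetical stand-ins for the role that server.max-keep-alive-idle plays. The point is only that when the timer fires, the server is the side that closes first, so the server is the side that ends up holding the socket in TIME_WAIT.

```python
import socket
import threading


def serve_one_with_idle_timeout(listener, idle_seconds):
    """Echo data for one client; close the connection after idle_seconds of silence."""
    conn, _ = listener.accept()
    with conn:
        conn.settimeout(idle_seconds)
        while True:
            try:
                data = conn.recv(4096)
            except socket.timeout:
                # Idle for too long: the server closes actively,
                # so the TIME_WAIT socket is left on the server side.
                break
            if not data:
                # The client closed first, so the client holds TIME_WAIT instead.
                break
            conn.sendall(data)


if __name__ == "__main__":
    listener = socket.create_server(("127.0.0.1", 0))
    worker = threading.Thread(
        target=serve_one_with_idle_timeout, args=(listener, 0.5), daemon=True
    )
    worker.start()
    client = socket.create_connection(listener.getsockname(), timeout=5)
    client.sendall(b"ping")
    print(client.recv(4096))  # echoed data while the connection is active
    print(client.recv(4096))  # empty bytes once the server has closed the idle connection
    client.close()
    worker.join(2)
    listener.close()
```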
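Before changing anything, it can help to look at the current values of the sysctls mentioned in this answer. On Linux each sysctl is exposed as a file under /proc/sys, with the dots in the name mapped to directory separators. The sketch below relies on that layout; `read_sysctls` is a made-up helper name. It reports None for settings that do not exist, which is what you would expect for tcp_tw_recycle on kernels 4.12 and later, where it was removed.

```python
from pathlib import Path

# The settings discussed in this thread.
SETTINGS = [
    "net.ipv4.tcp_fin_timeout",
    "net.ipv4.tcp_max_tw_buckets",
    "net.ipv4.tcp_tw_reuse",
    "net.ipv4.tcp_tw_recycle",
]


def read_sysctls(names, root="/proc/sys"):
    """Return {name: value-as-string, or None if the setting is not readable}."""
    values = {}
    for name in names:
        # net.ipv4.tcp_fin_timeout -> <root>/net/ipv4/tcp_fin_timeout
        path = Path(root, *name.split("."))
        try:
            values[name] = path.read_text().strip()
        except OSError:
            values[name] = None
    return values


if __name__ == "__main__":
    for name, value in read_sysctls(SETTINGS).items():
        print(f"{name} = {value}")
```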

Xavier Lucas
  • I thought that it was tcp_tw_recycle that carried the NAT warning - or is it both? http://www.stolk.org/debian/timewait.html – aland Oct 24 '14 at 23:05
  • @aland Yes it is, I was editing :). The other one, `tcp_tw_reuse`, has no effect on incoming connections; it only applies to new outgoing connections. – Xavier Lucas Oct 24 '14 at 23:06
  • Ah, if reuse is only for outgoing connections then it's not very useful for a web server, I think – aland Oct 24 '14 at 23:25
  • @aland Depends what you do with it, for example if you use it as a reverse proxy it can be useful. – Xavier Lucas Oct 24 '14 at 23:27