
What kernel parameter or other settings control the maximum number of TCP sockets that can be open on a Linux server? What are the tradeoffs of allowing more connections?

I noticed while load testing an Apache server with ab that it's pretty easy to max out the open connections on the server. If you leave off ab's -k option, which allows connection reuse, and have it send more than about 10,000 requests, then Apache serves the first 11,000 or so requests and then halts for 60 seconds. A look at netstat output shows 11,000 connections in the TIME_WAIT state. Apparently this is normal: connections are kept open for a default of 60 seconds even after the client is done with them, for TCP reliability reasons.

It seems like this would be an easy way to DoS a server and I'm wondering what the usual tunings and precautions for it are.

Here's my test output:

# ab -c 5 -n 50000 http://localhost/
This is ApacheBench, Version 2.0.40-dev <$Revision: 1.146 $> apache-2.0
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Copyright 2006 The Apache Software Foundation, http://www.apache.org/

Benchmarking localhost (be patient)
Completed 5000 requests
Completed 10000 requests
apr_poll: The timeout specified has expired (70007)
Total of 11655 requests completed

Here's the netstat command I run during the test:

 # netstat --inet -p | grep "localhost:www" | sed -e 's/ \+/ /g' | cut -d' ' -f 1-4,6-7 | sort | uniq -c 
  11651 tcp 0 0 localhost:www TIME_WAIT -
      1 tcp 0 1 localhost:44423 SYN_SENT 7831/ab
      1 tcp 0 1 localhost:44424 SYN_SENT 7831/ab
      1 tcp 0 1 localhost:44425 SYN_SENT 7831/ab
      1 tcp 0 1 localhost:44426 SYN_SENT 7831/ab
      1 tcp 0 1 localhost:44428 SYN_SENT 7831/ab
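As an aside, the same TIME_WAIT count can be taken without netstat by reading /proc/net/tcp directly, where the kernel encodes TIME_WAIT as state 06:

```shell
# Count sockets currently in TIME_WAIT. Column 4 of /proc/net/tcp is the
# connection state in hex; 06 is TIME_WAIT. NR > 1 skips the header line.
awk 'NR > 1 && $4 == "06"' /proc/net/tcp | wc -l
```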
Ben Williams

8 Answers


I finally found the setting that was really limiting the number of connections: net.ipv4.netfilter.ip_conntrack_max. It was set to 11,776, and whatever I set it to is the number of requests I can serve in my test before having to wait tcp_fin_timeout seconds for more connections to become available. The conntrack table is what the kernel uses to track the state of connections, so once it's full, the kernel starts dropping packets and printing this in the log:

Jun  2 20:39:14 XXXX-XXX kernel: ip_conntrack: table full, dropping packet.

The next step was getting the kernel to recycle all those connections in the TIME_WAIT state rather than dropping packets. I could get that to happen either by turning on tcp_tw_recycle or by increasing ip_conntrack_max to be larger than the number of local ports made available for connections by ip_local_port_range. I guess once the kernel is out of local ports it starts recycling connections. Raising the conntrack limit uses more memory for tracking connections, but it seems like a better solution than turning on tcp_tw_recycle, since the docs imply that that option is dangerous.
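As a sanity check while testing, the fill level of the conntrack table can be compared against the limit with sysctl. These key names match the kernel in this answer; newer kernels expose the same counters under net.netfilter.nf_conntrack_count and nf_conntrack_max instead:

```shell
# Current number of tracked connections vs. the table limit
# (on newer kernels, substitute net.netfilter.nf_conntrack_count / _max)
sysctl net.ipv4.netfilter.ip_conntrack_count
sysctl net.ipv4.netfilter.ip_conntrack_max
```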

With this configuration I can run ab all day and never run out of connections:

net.ipv4.netfilter.ip_conntrack_max = 32768
net.ipv4.tcp_tw_recycle = 0
net.ipv4.tcp_tw_reuse = 0
net.ipv4.tcp_orphan_retries = 1
net.ipv4.tcp_fin_timeout = 25
net.ipv4.tcp_max_orphans = 8192
net.ipv4.ip_local_port_range = 32768    61000

The tcp_max_orphans setting didn't have any effect on my tests and I don't know why. I would think it would close the connections in TIME_WAIT state once there were 8192 of them but it doesn't do that for me.
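For reference, settings like the ones above can be applied at runtime with sysctl -w, or placed in /etc/sysctl.conf so they survive a reboot; a sketch, using the conntrack value from this answer:

```shell
# Apply one setting immediately (it is lost on reboot)
sysctl -w net.ipv4.netfilter.ip_conntrack_max=32768

# Or, after adding the lines to /etc/sysctl.conf, reload them all
sysctl -p
```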

chicks
Ben Williams

You really want to look at what the /proc filesystem has to offer you in this regard.

In particular, you might find the following to be of interest:

  • /proc/sys/net/ipv4/tcp_max_orphans, which controls the maximum number of sockets held by the system that are not attached to anything. Each orphan socket can consume as much as 64 KB of non-swappable memory, so raise this with care.
  • /proc/sys/net/ipv4/tcp_orphan_retries, which controls the number of retries before a socket is orphaned and closed. There is a specific note on that page about web servers that is of direct interest to you...
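On a Linux box, the current values of both tunables can be read straight out of /proc:

```shell
# Show the current orphan-related tunables; defaults vary with kernel
# version and available memory
cat /proc/sys/net/ipv4/tcp_max_orphans
cat /proc/sys/net/ipv4/tcp_orphan_retries
```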
Avery Payne
  • tcp_max_orphans is interesting, but it seems like it's not working. When I try to measure orphaned sockets during my test I see 11,651 of them while tcp_max_orphans is 8,192. # netstat --inet -p | grep "localhost:www" | sed -e 's/ \+/ /g' | cut -d' ' -f 1-4,6-7 | sort | uniq -c 11651 tcp 0 0 localhost:www TIME_WAIT - – Ben Williams May 21 '09 at 19:27
  • Look at the tcp_orphan_retries setting - the idea being, the sockets are "culled" quicker... – Avery Payne May 21 '09 at 19:39
  • @Jauder Ho's suggestion + tcp_orphan_retries sound like potential win for your situation. – Avery Payne May 21 '09 at 19:41

I don't think there is a tunable to set that directly. This falls under the category of TCP/IP tuning. To find out what you can tune, try 'man 7 tcp'. sysctl ('man 8 sysctl') is used to set these values; 'sysctl -a | grep tcp' will show you most of what you can tune, but I am not sure if it will show all of them. Also, unless this has changed, open TCP/IP sockets look like file descriptors, so this and the next section in that link might be what you are looking for.

Kyle Brandt

Try setting the following, as well as setting tcp_fin_timeout. This should close out TIME_WAIT more quickly.

net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_tw_recycle = 1
Jauder Ho
  • Careful here! Experienced the hard way. "This may cause dropped frames with load-balancing and NATs, only use this for a server that communicates only over your local network." - https://wiki.archlinux.org/index.php/Sysctl – Henk Aug 09 '11 at 13:23
  • @Henk I guess it is `tcp_tw_recycle` that is potentially dangerous. `tcp_tw_reuse` is safer and I don't see any reason to use them simultaneously. – Vladislav Rastrusny Mar 17 '12 at 09:36

The stock apache(1) used to come predefined to only support 250 concurrent connections - if you wanted more, there was one header file to modify to allow more concurrent sessions. I don't know if this is still true with Apache 2.

Also, you need to add an option to allow many more open file descriptors for the account that runs Apache - something that the previous comments fail to point out.
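A sketch of what that looks like via /etc/security/limits.conf, assuming the Apache processes run under a user named apache (the account name and numbers here are illustrative, not recommendations):

```
apache  soft  nofile  32768
apache  hard  nofile  65536
```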

Pay attention to your worker settings and to the keepalive timeouts you have inside Apache itself, how many spare servers you have running at once, and how fast these extra processes are getting killed.
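For example, with Apache's prefork MPM these are the sort of directives involved; the values below are illustrative only:

```
KeepAlive On
KeepAliveTimeout 5
MinSpareServers 5
MaxSpareServers 10
MaxClients 256
MaxRequestsPerChild 4000
```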

rasjani

You could reduce the time spent in the TIME_WAIT state (set net.ipv4.tcp_fin_timeout). You could also replace Apache with YAWS or nginx or something similar.

Tradeoffs of more connections generally involve memory usage, and, if you have a forking server, lots of child processes that swamp your CPU.

Devdas
  • tcp_fin_timeout is not for setting TIME-WAIT expiration, which isn't changeable outside of rebuilding the kernel, but for FIN, as the name indicates. – Alexandr Kurilin Oct 29 '15 at 18:28

The absolute number of sockets that can be open on a single IP address is 2^16 and is defined by TCP/UDP, not the kernel.

Jason Tan
  • No it isn't. You can open more, as the local port doesn't need to be unique as long as the remote addresses are different. Moreover, the OP said per server, and you can have >1 address per server. – MarkR Sep 04 '09 at 22:13

The Apache HTTP server benchmarking tool, ab, has a -s timeout option in version 2.4. See also ab (Apache Bench) error: apr_poll: The timeout specified has expired (70007) on Windows.

This option solves your problem.
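For instance, an invocation like the one in the question with the per-response timeout raised (the 120 seconds here is illustrative; the default is 30):

```shell
# -s sets the maximum seconds ab will wait for each response
ab -s 120 -c 5 -n 50000 http://localhost/
```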

Peter Mortensen