3

While investigating complaints about poor HTTP server performance, I discovered these lines in dmesg on my Xen XCP host, which runs a guest OS with said server:

[11458852.811070] net_ratelimit: 321 callbacks suppressed
[11458852.811075] nf_conntrack: table full, dropping packet.
[11458852.819957] nf_conntrack: table full, dropping packet.
[11458852.821083] nf_conntrack: table full, dropping packet.
[11458852.822195] nf_conntrack: table full, dropping packet.
[11458852.824987] nf_conntrack: table full, dropping packet.
[11458852.825298] nf_conntrack: table full, dropping packet.
[11458852.825891] nf_conntrack: table full, dropping packet.
[11458852.826225] nf_conntrack: table full, dropping packet.
[11458852.826234] nf_conntrack: table full, dropping packet.
[11458852.826814] nf_conntrack: table full, dropping packet.

These messages repeat every five seconds (the number of suppressed callbacks is different each time).

What can these symptoms mean? Is that bad? Any hints?

(Note that the question is narrower than "how to solve a specific case of bad HTTP server performance", so I am not giving more details on that.)

Additional info:

$ uname -a
Linux MYHOST 3.2.0-24-generic #37-Ubuntu SMP Wed Apr 25 08:43:22 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 12.04 LTS
Release:    12.04
Codename:   precise

$ cat /proc/sys/net/netfilter/nf_conntrack_max 
1548576

The server is under about 10M hits / day load.

Update:

iptables on Dom0:

$ iptables -L -t nat -v
Chain PREROUTING (policy ACCEPT 23155 packets, 1390K bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain INPUT (policy ACCEPT 9 packets, 720 bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain OUTPUT (policy ACCEPT 27 packets, 1780 bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain POSTROUTING (policy ACCEPT 23173 packets, 1392K bytes)
 pkts bytes target     prot opt in     out     source               destination

$ iptables -L -v
Chain INPUT (policy ACCEPT 13976 packets, 1015K bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain FORWARD (policy ACCEPT 241K packets, 24M bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain OUTPUT (policy ACCEPT 13946 packets, 1119K bytes)
 pkts bytes target     prot opt in     out     source               destination

iptables on one of the DomUs:

$ iptables -L -t nat -v
Chain PREROUTING (policy ACCEPT 53465 packets, 2825K bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain INPUT (policy ACCEPT 53466 packets, 2825K bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain OUTPUT (policy ACCEPT 51527 packets, 3091K bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain POSTROUTING (policy ACCEPT 51527 packets, 3091K bytes)
 pkts bytes target     prot opt in     out     source               destination

$ iptables -L -v
Chain INPUT (policy ACCEPT 539K packets, 108M bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain OUTPUT (policy ACCEPT 459K packets, 116M bytes)
 pkts bytes target     prot opt in     out     source               destination
Alexander Gladysh

2 Answers

4

I was a little curious about this one and found a pretty good explanation for your symptoms. They are well described in "nf_conntrack: table full - how the absence of rules can lead to unexpected behaviour".

TL;DR: Merely running iptables -t nat -vnL loads the nf_conntrack module, which gives you stateful connection tracking you never asked for. I haven't verified this myself yet, but you can bet I'll do it first thing tomorrow at work.

Solution: If you don't need NAT because you're bridging anyway, unload the nf_conntrack_* modules and every other module that depends on them. Disabling the firewall altogether through chkconfig ip[6]tables off would be a good idea, too.

On Ubuntu, you can disable the firewall with sudo ufw disable, then follow these instructions if you don't want to reboot.
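
As a rough sketch of the module check, something like the following could list which connection-tracking modules are currently loaded. The helper name list_conntrack_modules is made up for illustration, and the module names in the commented-out modprobe line are only examples of what a 3.2 kernel might have loaded; check your own lsmod output before removing anything.

```shell
#!/bin/sh
# list_conntrack_modules (hypothetical helper): read `lsmod`-style output on
# stdin and print the names of modules whose name contains "conntrack".
list_conntrack_modules() {
    # NR > 1 skips the "Module  Size  Used by" header line.
    awk 'NR > 1 && $1 ~ /conntrack/ { print $1 }'
}

# Show the conntrack-related modules loaded on this machine, if any.
if command -v lsmod >/dev/null 2>&1; then
    lsmod | list_conntrack_modules
fi

# Unloading is left commented out because it changes firewall behaviour.
# modprobe -r refuses to remove a module that is still in use, so modules
# that depend on nf_conntrack have to be removed first. The names below are
# examples only (an assumption, not taken from the host in question):
# modprobe -r iptable_nat nf_conntrack_ipv4 nf_conntrack
```

If the list comes back empty, connection tracking is not loaded and the "table full" messages should stop.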

Alexander Janssen
3

Xen must be NATting connections to your domU server, and the sheer number of connections is overwhelming the kernel's ability to keep track of them. While you could increase the space allocated to tracking the connections by increasing nf_conntrack_max, you would probably be better off with bridged networking instead of NAT. That way, the domU server gets its own virtual Ethernet card, avoiding the problem altogether.
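
To see how close the table is to its limit before changing anything, a minimal sketch along these lines may help. It assumes the nf_conntrack module is loaded, so that nf_conntrack_count is exposed next to nf_conntrack_max under /proc/sys/net/netfilter. The helper name conntrack_usage and the value in the commented-out sysctl line are illustrative, not recommendations.

```shell
#!/bin/sh
# conntrack_usage COUNT MAX (hypothetical helper): print the table usage
# as an integer percentage (rounded down).
conntrack_usage() {
    echo $(( $1 * 100 / $2 ))
}

# These files exist only while the nf_conntrack module is loaded.
count_file=/proc/sys/net/netfilter/nf_conntrack_count
max_file=/proc/sys/net/netfilter/nf_conntrack_max

if [ -r "$count_file" ] && [ -r "$max_file" ]; then
    count=$(cat "$count_file")
    max=$(cat "$max_file")
    echo "conntrack entries: $count / $max ($(conntrack_usage "$count" "$max")%)"
else
    echo "nf_conntrack does not appear to be loaded"
fi

# Raising the limit until the next reboot (run as root). The number is an
# arbitrary example, not a tuned value:
# sysctl -w net.netfilter.nf_conntrack_max=2097152
```

A count that sits at or near the maximum while the messages appear would confirm that the table really is filling up.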

200_success
  • Thanks, I'll look into that and will get back with what I found. If it is true, will it affect performance? Lead to other problems? – Alexander Gladysh Oct 01 '12 at 05:27
  • @AlexanderGladysh If you have exceeded nf_conntrack_max, then that means that the kernel is losing track of some connections. It's not just a matter of performance, but also of correctness. With bridged networking, the virtualization layer rewrites packets at the Ethernet layer rather than at the IP layer, so it would be stateless. – 200_success Oct 02 '12 at 03:23
  • It seems that we do _not_ use NAT, but are already using bridged networking as you suggested. Is there an easy way to confirm that? What else can lead to the symptoms we observe? – Alexander Gladysh Oct 02 '12 at 05:00
  • @AlexanderGladysh It's also possible that you are doing stateful firewalling. What kinds of rules appear when you do iptables -L -v? iptables -t nat -L -v? On domU? On dom0? – 200_success Oct 02 '12 at 08:07
  • Updated the question. – Alexander Gladysh Oct 02 '12 at 15:16