I have a TCP server listening on a machine ("the server") running Ubuntu 12.04.3 (kernel 3.8.0-31-generic). It receives connections from 2 different client machines. Machine A is running Ubuntu 12.04.4 (3.11.0-17-generic) and machine B is running Ubuntu 11.10 (3.0.0-32-server).

If TCP timestamps are enabled on the server (sysctl net.ipv4.tcp_timestamps=1) then sometimes SYN packets from machine A are "ignored". Using tcpdump on the server (in non-promiscuous mode) I can see the SYNs arrive OK and with correct checksums - there is just no response: no SYN/ACK and no RST. Machine A retransmits the SYN a number of times before giving up. The client software running on machine A (wget in this case) immediately retries with a new connection and succeeds, getting an instant SYN/ACK.

Machine B has no problems with the same server and its traffic looks normal. As far as I can see from the capture files, it is using the same TCP options as machine A. Disabling TCP timestamps on the server makes everything work as it should.

However, the timestamps in the ignored SYN packets look valid to me, so I'm not sure why they are causing problems, or whether they are the underlying cause at all.

I have put an anonymised pcap here https://www.dropbox.com/s/onimdkbyx9lim70/server-machineA.pcap . It was taken on the server (10.76.0.74) and shows machine A (10.4.0.76) successfully performing an HTTP GET (packets 1 to 10) and then, 1 second later, trying to fetch the same URL again (packets 11 to 17) but having its SYNs ignored. Packets 18 to 27 are another success.

I suspect this is a similar problem to that described in "Why would a server not send a SYN/ACK packet in response to a SYN packet" and whilst disabling timestamps is a workaround I would like to understand what is going on. Is this just a bug?

There is no local firewall running. The server handles quite a few TCP connections (approx 32K at any one time) but has plenty of free memory/CPU. At the time of the test shown in the pcap there were no other TCP connections between machine A and the server. There is no sign that the server application's accept queue is suddenly filling up (besides that should affect both clients I would presume). As the packets look OK in a pcap taken on the server it doesn't seem like an intervening network device is breaking things.

I originally posted this in the ubuntu forums but in hindsight this may be a more appropriate location. Hoping for the loan of a clue.

user133831

1 Answer

In my case the following command fixed the problem of missing SYN/ACK replies from a Linux server:

sysctl -w net.ipv4.tcp_tw_recycle=0

I think it is more correct than disabling TCP timestamps, as TCP timestamps are useful after all (PAWS, round-trip time measurement, etc.).
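Before changing anything it can help to confirm what the server is actually set to. The following is only a minimal sketch: read_sysctl is a hypothetical helper name, and it assumes the usual Linux layout in which the sysctl a.b.c is exposed as the file /proc/sys/a/b/c. Note that tcp_tw_recycle was removed in Linux 4.12, so on newer kernels the file will simply be missing.

```python
from pathlib import Path


def read_sysctl(name, root="/proc/sys"):
    """Return the current value of a sysctl as a string, or None if it is absent.

    Hypothetical helper: assumes sysctl 'a.b.c' maps to the file <root>/a/b/c.
    """
    path = Path(root) / name.replace(".", "/")
    try:
        return path.read_text().strip()
    except OSError:
        # Missing file (e.g. option removed from this kernel) or not readable.
        return None


if __name__ == "__main__":
    for name in ("net.ipv4.tcp_timestamps", "net.ipv4.tcp_tw_recycle"):
        value = read_sysctl(name)
        shown = value if value is not None else "not available on this kernel"
        print(f"{name} = {shown}")
```

If both values come back as 1, the server is in the configuration discussed here.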

The documentation for tcp_tw_recycle explicitly states that enabling it is not recommended: many NAT routers pass TCP timestamps through unchanged, so the timestamps arriving from a single IP address are not consistent and PAWS kicks in.

   tcp_tw_recycle (Boolean; default: disabled; since Linux 2.4)
          Enable fast recycling of TIME_WAIT sockets.  Enabling this
          option is not recommended for devices communicating with the
          general Internet or using NAT (Network Address Translation).
          Since some NAT gateways pass through IP timestamp values, one
          IP can appear to have non-increasing timestamps.  See RFC 1323
          (PAWS), RFC 6191.
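To illustrate why non-increasing timestamps from one IP lead to silently dropped SYNs, here is a deliberately simplified model of a per-host timestamp check. It is a sketch, not the kernel's code: the class name, the two constants (60 seconds and a tolerance of 1 tick) and the point at which the timestamp is remembered are assumptions made for illustration, and 32-bit timestamp wraparound is ignored.

```python
PAWS_MSL = 60     # assumed: seconds for which a remembered timestamp is enforced
PAWS_WINDOW = 1   # assumed: tolerated backwards step, in timestamp ticks


class PerHostTimestampCheck:
    """Toy model: remember the last TCP timestamp seen per remote IP address."""

    def __init__(self):
        self._last = {}  # remote IP -> (tsval, time in seconds when it was seen)

    def accept_syn(self, remote_ip, tsval, now):
        """Return True if a SYN would be answered, False if silently dropped."""
        seen = self._last.get(remote_ip)
        if seen is not None:
            last_tsval, last_seen = seen
            recent = now - last_seen < PAWS_MSL
            went_backwards = last_tsval - tsval > PAWS_WINDOW
            if recent and went_backwards:
                return False  # no SYN/ACK and no RST is sent
        self._last[remote_ip] = (tsval, now)
        return True


if __name__ == "__main__":
    check = PerHostTimestampCheck()
    nat_ip = "203.0.113.7"  # documentation address standing in for a NAT gateway
    print(check.accept_syn(nat_ip, 5_000_000, now=0))    # first host behind the NAT
    print(check.accept_syn(nat_ip, 1_200_000, now=1))    # second host, lower clock
    print(check.accept_syn(nat_ip, 1_200_000, now=120))  # same host, much later
```

In this model the second SYN is rejected because, from the server's point of view, the same IP address appears to have gone back in time, while the third one is accepted because the remembered timestamp is no longer considered recent.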
lav
  • The machines in question have all been upgraded and I believe the problem is no longer happening, so I can't try this now. In this case, however, there was no NAT involved between the client and server. It still seems suspiciously bug-like to me. – user133831 Jun 27 '16 at 17:58