7

Using iptraf, tcpdump and Wireshark I can see a SYN packet coming in, but the reply packet has only the ACK flag set instead of SYN+ACK.

I'm running Debian 5 with kernel 2.6.36.

I've turned off tcp_window_scaling, tcp_timestamps, tcp_tw_recycle and tcp_tw_reuse:

cat /etc/sysctl.conf

net.ipv4.tcp_tw_recycle = 0
net.ipv4.tcp_tw_reuse = 0
net.ipv4.tcp_window_scaling = 0
net.ipv4.tcp_timestamps = 0
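
To double-check that the running kernel actually uses these values (and not just the file), something like the following could help. It is only a minimal sketch: `parse_sysctl_conf` and `live_value` are hypothetical helper names, and it assumes the usual Linux mapping of a key such as `net.ipv4.tcp_timestamps` to `/proc/sys/net/ipv4/tcp_timestamps`. A key that does not exist on the running kernel is reported as `None`.

```python
from pathlib import Path

# Sample text copied from the sysctl.conf excerpt in the question.
SAMPLE = """\
net.ipv4.tcp_tw_recycle = 0
net.ipv4.tcp_tw_reuse = 0
net.ipv4.tcp_window_scaling = 0
net.ipv4.tcp_timestamps = 0
"""


def parse_sysctl_conf(text):
    """Parse sysctl.conf-style text into a {key: value} dict of strings."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, comment lines and anything without '='.
        if not line or line[0] in "#;" or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def live_value(key, proc_root="/proc/sys"):
    """Return the running kernel's value for a sysctl key, or None if absent."""
    path = Path(proc_root) / key.replace(".", "/")
    try:
        return path.read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    for key, wanted in parse_sysctl_conf(SAMPLE).items():
        print(key, "configured:", wanted, "running:", live_value(key))
```

Any line where "configured" and "running" differ would suggest the file was not applied.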

I've attached an image of the wireshark output.

http://imgur.com/pECG0.png

Output of netstat:

netstat -natu | grep '72.23.130.104'

tcp        0      0 97.107.134.212:18000    72.23.130.104:42905     SYN_RECV

I've been doing everything possible to find a solution and have yet to figure out the problem, so any help/suggestions are much appreciated.

UPDATE 1: I've set tcp_syncookies = 0 and noticed I am now replying with 1 SYN+ACK for every 50 SYN requests. The host trying to connect is sending a SYN request about once every second.

PCAP FILE

jeff
  • Did you run `sysctl -p` after changing `/etc/sysctl.conf`? – Mark Wagner Aug 03 '11 at 16:51
  • yes and rebooted. – jeff Aug 03 '11 at 20:31
  • I assume your server is the machine with the IP 97.107.134.212? This packet capture was generated on that machine itself, not on a firewall in between? What state is the TCP socket in when you run netstat? SYN_RECV? SYN_SENT? – akramer Aug 05 '11 at 06:55
  • Correct, the local server is 97.107.134.212. The packet capture was generated on the local server, not on the firewall. netstat -natu reports the state as SYN_RECV. – jeff Aug 05 '11 at 17:36
  • Can you post the pcap somewhere? Or at least turn off relative sequence numbers? – MikeyB Aug 05 '11 at 17:57
  • Yes, I've updated the original question. Cheers, thank you. – jeff Aug 05 '11 at 20:49
  • We have exactly the same issue, so I'm wondering whether you ever found a root cause for this? At the moment our tests show that the issue occurs when traffic crosses a Fortinet firewall and does not occur when it bypasses it. I can't explain how the firewall could affect how the server replies to a SYN, but that is what we see. – radius Jun 14 '13 at 21:29

5 Answers

7

After having the same issue, I finally found the root cause.

On Linux, when a socket is in TIME_WAIT and a new SYN arrives (for the same source ip/port and destination ip/port pair), the kernel checks whether the SEQ number of the SYN is < or > the last SEQ received for this socket.

(PS: in the image of the Wireshark output attached to the question, SEQ numbers are shown as relative; unless you set them to absolute you can't see the issue. The capture would also have to include the old session to be able to compare SEQ numbers.)

  • if the SEQ number of the SYN is > the SEQ number of the previous packet, a new connection is created and everything works
  • if the SEQ number of the SYN is < the SEQ number of the previous packet, the kernel sends an ACK related to the previous socket, because it thinks the received SYN is a delayed packet from the previous socket.
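
The two cases above can be sketched as a small model. This is an illustration under assumptions, not the kernel's code: the function names are hypothetical, the sequence numbers in the usage note are made up, the comparison is done modulo 2^32 (as TCP sequence arithmetic generally is), and TCP timestamps are assumed to be off, as in the question.

```python
def seq_after(a, b):
    """True if 32-bit sequence number a comes after b, allowing wraparound."""
    # Equivalent to treating (b - a) as a signed 32-bit value and testing < 0.
    return ((b - a) & 0xFFFFFFFF) >= 0x80000000


def time_wait_reply(syn_seq, last_rcv_seq):
    """Reply of a TIME_WAIT socket to a new SYN in this simplified model."""
    if seq_after(syn_seq, last_rcv_seq):
        # SYN looks newer than the old session: start a new connection.
        return "SYN/ACK"
    # SYN looks like a delayed packet of the old session: re-ACK the old session.
    return "ACK"
```

For example, `time_wait_reply(2000, 1000)` gives `"SYN/ACK"`, while `time_wait_reply(500, 1000)` gives `"ACK"`, which is the symptom described in the question.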

The behaviour is like that because in the early days of TCP the SEQ numbers generated by computers were incremental, so it was almost impossible to receive a SEQ number < the SEQ number of a previous socket still in TIME_WAIT.

The increase in bandwidth has turned this from almost impossible into merely rare. More importantly, most systems now use a random ISN (initial SEQ number) to improve security, so nothing guarantees that the SEQ number of a new socket is > the SEQ number of a previous one.

Each OS uses different algorithms, more or less safe, to avoid this particular issue; http://www.bsdcan.org/2006/papers/ImprovingTCPIP.pdf gives a good presentation of the issue.

There is one last tricky point: the kernel sends an ACK related to the old session, and then what? The client OS receives this ACK (for the previous session), does not recognize it because from its side the session is closed, and replies with a RST. When the server receives this RST, it immediately clears the socket (so it is no longer in TIME_WAIT). Meanwhile the client is still waiting for a SYN/ACK; since none arrives, it sends a new SYN. By then the RST has been sent and the session has been cleared on the server, so this second SYN works: the server replies with a SYN/ACK and the handshake continues.

So the normal behaviour is that the connection works but is delayed by a second (until the second SYN is sent). In Jeff's case, he said in a comment that a Fortinet firewall is involved. By default these firewalls drop the ACK related to the old session (because the firewall sees no open session matching the ACK), so the client never sends a RST and the server cannot clear the session from the TIME_WAIT state (except, of course, when the TIME_WAIT timer expires). The "set anti-replay loose" command on Fortinet allows this ACK packet to be forwarded instead of dropped.
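
The difference between the two outcomes can be illustrated with a toy model. It is a sketch under stated assumptions: `handshake_attempts` is a hypothetical name, every SYN is assumed to carry a SEQ number lower than the old session's, and the TIME_WAIT timer is ignored, so the server only leaves TIME_WAIT when a RST arrives.

```python
def handshake_attempts(firewall_drops_stale_ack, max_syns=3):
    """Return the server's reply to each client SYN in this toy model."""
    server_state = "TIME_WAIT"  # the old connection is still lingering
    replies = []
    for _ in range(max_syns):
        if server_state == "TIME_WAIT":
            # The SYN is taken for a delayed packet of the old session.
            replies.append("ACK (old session)")
            if not firewall_drops_stale_ack:
                # The client sees an ACK for a closed session and answers
                # with a RST, which makes the server drop the old socket.
                server_state = "CLOSED"
        else:
            # No lingering socket any more: the normal handshake proceeds.
            replies.append("SYN/ACK")
            break
    return replies
```

With `handshake_attempts(False)` the second SYN is answered with a SYN/ACK; with `handshake_attempts(True)` every SYN in the model only ever gets the stale ACK, as in the capture from the question.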

radius
3

It appears that 97.107.134.212 already believes there is a connection (72.23.130.104:42905, 97.107.134.212:18000).

When 72.23.130.104:42905 sends its SYN packet, its sequence number is 246811966. Next should be a SYN/ACK packet with its own SEQ number and an ACK value of 246811967.

But it's sending an ACK with SEQ=1736793629 and ACK=172352206. Those are probably values from an earlier connection.
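
As a quick check of that arithmetic, here is a minimal sketch (the helper names are hypothetical). The acknowledgement number in a SYN/ACK should be the SYN's sequence number plus one, modulo 2^32.

```python
def expected_synack_ack(syn_seq):
    """ACK value a SYN/ACK should carry for a SYN with this sequence number."""
    return (syn_seq + 1) & 0xFFFFFFFF  # sequence numbers wrap at 2**32


def reply_matches_syn(syn_seq, reply_ack):
    """True if the reply's ACK value acknowledges this particular SYN."""
    return reply_ack == expected_synack_ack(syn_seq)
```

With the values from the capture, `reply_matches_syn(246811966, 172352206)` is false, so the ACK cannot be a response to this SYN.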

Any new connection attempts should be coming FROM a different port number... is that happening? Wireshark points this out in pkt#11: "TCP Port numbers reused".

Looks like the problem is on the sender.

FWIW, I can connect just fine:

1   0.000000    192.168.0.135   97.107.134.212  TCP 45883 > biimenu [SYN] Seq=809402803 Win=14600 Len=0 MSS=1460 SACK_PERM=1 TSV=2319725 TSER=0 WS=7
2   0.022525    97.107.134.212  192.168.0.135   TCP biimenu > 45883 [SYN, ACK] Seq=4293896301 Ack=809402804 Win=14600 Len=0 MSS=1360 SACK_PERM=1
3   0.022553    192.168.0.135   97.107.134.212  TCP 45883 > biimenu [ACK] Seq=809402804 Ack=4293896302 Win=14600 Len=0
MikeyB
  • Yes, new connections are working fine. There are only a few instances where the devices are not connecting, and I feel it has something to do with the configuration of *their* firewall. Are you aware of any settings on their side that could be causing these issues? Some sort of routing config that isn't allowing the SYN/ACK packet to reach our host device? – jeff Aug 08 '11 at 14:26
  • If their connections are always coming from the same source port, that *will* be a problem. – MikeyB Aug 08 '11 at 14:44
  • I have seen this behaviour on various NAT firewalls. If a new connection is opened before a limit of X seconds has passed, the NAT firewall will reuse the source port for the new connection towards the outside. – espenfjo Jun 27 '13 at 14:32
1

The one time I've seen this before it was because the outbound and inbound packets were taking different routes on the network, and there was a stateful connection-tracking device on the inbound leg. Since that device (a load-balancer in my case, but it could just as easily be a firewall) never saw the initial SYN, the SYN-ACK was dropped on the floor as spurious.

sysadmin1138
0

It must be more than just asymmetry, because we are missing an outgoing packet too:

The SYN goes out, but we don't see the incoming SYN-ACK, or the outgoing ACK from the local server. So something else must have proxied both those packets and then we see the incoming ACK - which is really the fourth packet in the sequence.

My guess is a WAN accelerator misconfigured in between.

Paul
  • My concern is we are not generating the outgoing SYN/ACK packet. I'm not sure how it could be missing if I'm running the capture locally on 97.107.134.212. All we are seeing is the ACK packet in response to a SYN request. I know the firewall is a fortigate 50b but I do not know how it is configured as I do not manage it. – jeff Aug 05 '11 at 17:40
0

I would check a few things:

Is your host multi-homed (i.e. do you have more than one ethernet interface)? If so, your routes might be messed up. The easiest way to test this would be to disable your secondary interface(s) and see if the problem goes away.

The other thing to check is whether iptables (or some other firewall) is enabled. `service iptables stop` will shut it down until your next reboot; if that resolves the issue, you need to tweak your iptables settings.

Also, if you have IPv6 enabled on your interface, sometimes there's a route over IPv4 but not over IPv6. When this happens and the IPv6 route is the "default", your packets can go out via the wrong address (even on the correct interface). Try disabling IPv6 to see if that is the issue.

Keith