
Ubuntu 12.04

I am trying to better understand how many times TCP will attempt to retransmit a packet when it receives no acknowledgment from the destination. After reading the tcp man page, it seemed clear that this is controlled by the sysctl tcp_retries1:

tcp_retries1 (integer; default: 3)
           The number of times TCP will attempt to retransmit a  packet  on
           an  established connection normally, without the extra effort of
           getting the network layers involved.  Once we exceed this number
           of retransmits, we first have the network layer update the route
           if possible before each new retransmit.  The default is the  RFC
           specified minimum of 3.

My system is set to the default value of 3:

# cat /proc/sys/net/ipv4/tcp_retries1 
3

Wanting to test this, I connected from system A (172.16.249.138) to system B (172.16.249.137) over ssh and started a simple print loop on the console. I then disconnected B abruptly from the network while this communication was occurring.

In another terminal, I was running 'tcpdump host 172.16.249.137' on system A. Below are the relevant lines from the output (line numbers added for clarity).

00: ...
01: 13:29:46.994715 IP 172.16.249.138.50489 > 172.16.249.137.ssh: Flags [.], ack 5989441, win 80, options [nop,nop,TS val 1957286 ecr 4294962520], length 0
02: 13:29:46.995084 IP 172.16.249.138.50489 > 172.16.249.137.ssh: Flags [.], ack 5989441, win 186, options [nop,nop,TS val 1957286 ecr 4294962520], length 0    
03: 13:29:47.040360 IP 172.16.249.138.50489 > 172.16.249.137.ssh: Flags [P.], seq 29136:29184, ack 5989441, win 186, options [nop,nop,TS val 1957298 ecr 4294962520], length 48
04: 13:29:47.086552 IP 172.16.249.138.50489 > 172.16.249.137.ssh: Flags [.], ack 5989441, win 376, options [nop,nop,TS val 1957309 ecr 4294962520], length 0
05: 13:29:47.680608 IP 172.16.249.138.50489 > 172.16.249.137.ssh: Flags [P.], seq 29136:29184, ack 5989441, win 376, options [nop,nop,TS val 1957458 ecr 4294962520], length 48
06: 13:29:48.963721 IP 172.16.249.138.50489 > 172.16.249.137.ssh: Flags [P.], seq 29136:29184, ack 5989441, win 376, options [nop,nop,TS val 1957779 ecr 4294962520], length 48
07: 13:29:51.528564 IP 172.16.249.138.50489 > 172.16.249.137.ssh: Flags [P.], seq 29136:29184, ack 5989441, win 376, options [nop,nop,TS val 1958420 ecr 4294962520], length 48
08: 13:29:56.664384 IP 172.16.249.138.50489 > 172.16.249.137.ssh: Flags [P.], seq 29136:29184, ack 5989441, win 376, options [nop,nop,TS val 1959704 ecr 4294962520], length 48
09: 13:30:06.936480 IP 172.16.249.138.50489 > 172.16.249.137.ssh: Flags [P.], seq 29136:29184, ack 5989441, win 376, options [nop,nop,TS val 1962272 ecr 4294962520], length 48
10: 13:30:27.480381 IP 172.16.249.138.50489 > 172.16.249.137.ssh: Flags [P.], seq 29136:29184, ack 5989441, win 376, options [nop,nop,TS val 1967408 ecr 4294962520], length 48
11: 13:31:08.504033 IP 172.16.249.138.50489 > 172.16.249.137.ssh: Flags [P.], seq 29136:29184, ack 5989441, win 376, options [nop,nop,TS val 1977664 ecr 4294962520], length 48
12: 13:31:13.512437 ARP, Request who-has 172.16.249.137 tell 172.16.249.138, length 28
13: 13:31:14.512336 ARP, Request who-has 172.16.249.137 tell 172.16.249.138, length 28
14: 13:31:15.512241 ARP, Request who-has 172.16.249.137 tell 172.16.249.138, length 28

If I am interpreting this correctly (and I may not be), line 3's packet is never acknowledged by system B. A then retransmits this packet 7 times (lines 5-11), roughly doubling its retransmission timer each time.
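The doubling can be checked directly from the capture. The following is a minimal Python sketch, not part of the original test; the timestamps are copied from lines 3 and 5-11 of the output above, and the variable names are only illustrative.

```python
from datetime import datetime

# Timestamps copied from the capture: line 3 is the original send,
# lines 5-11 are the retransmissions of seq 29136:29184.
stamps = [
    "13:29:47.040360",
    "13:29:47.680608",
    "13:29:48.963721",
    "13:29:51.528564",
    "13:29:56.664384",
    "13:30:06.936480",
    "13:30:27.480381",
    "13:31:08.504033",
]

times = [datetime.strptime(s, "%H:%M:%S.%f") for s in stamps]

# Seconds between each send and the next one.
gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]

# How much each gap grew relative to the one before it.
ratios = [later / earlier for earlier, later in zip(gaps, gaps[1:])]

for n, gap in enumerate(gaps, start=1):
    print(f"retransmit {n}: {gap:.3f} s after the previous send")
print("growth ratios:", [round(r, 2) for r in ratios])
```

The gaps run from about 0.64 s up to about 41 s, and each one is very close to twice the previous, which is consistent with an exponentially backed-off retransmission timer.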

Why is the packet being retransmitted 7 times instead of 3?

Note: I performed this formal test after noticing a few pcap files where retransmits were occurring 6-7 times over HTTP connections, so that number of retransmits does not seem specific to SSH.

HodB
  • Did you read the explanation of the setting? It's not the number of retries to attempt. It's the number of retries to attempt before changing strategies. – David Schwartz Mar 21 '14 at 17:11
  • As mentioned above, yes, I read the setting. In this case there would be no route to update as they are both on the same subnet. Why 7 retries? What determines how many retries occur in total? – HodB Mar 21 '14 at 17:24
  • What is your value for the sysctl net.ipv4.tcp_retries2? The net.ipv4.tcp_retries2 variable is the one that actually controls the number of retries that will be attempted. The net.ipv4.tcp_retries1 variable just controls the number of retries before the system signals a lower level to try verifying that networking is available. – hrunting Jul 16 '14 at 02:55

1 Answer


I believe you created an orphan socket by killing the connection on the .137 server, so the kernel parameter in use would be tcp_orphan_retries, which has a generic Linux default of 7.

You can get a description of both the condition you created and the results here: http://www.linuxinsight.com/proc_sys_net_ipv4_tcp_orphan_retries.html
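To compare the three settings named in this thread on a given machine, something like the following could be used. It is only a sketch: it assumes the usual Linux location /proc/sys/net/ipv4 for these sysctls, and the helper name read_tcp_retry_settings is made up for illustration.

```python
from pathlib import Path

# Sysctl names taken from this thread.
NAMES = ("tcp_retries1", "tcp_retries2", "tcp_orphan_retries")


def read_tcp_retry_settings(base="/proc/sys/net/ipv4"):
    """Return {name: int or None}; None means the file could not be read."""
    settings = {}
    for name in NAMES:
        try:
            settings[name] = int((Path(base) / name).read_text().strip())
        except (OSError, ValueError):
            # Missing file (non-Linux system, restricted /proc) or odd content.
            settings[name] = None
    return settings


if __name__ == "__main__":
    for name, value in read_tcp_retry_settings().items():
        print(f"net.ipv4.{name} = {value}")
```

Seeing the values side by side makes it easier to tell which limit matches the number of retransmits observed in a capture.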

Andrew S