Edit: there were actually two problems: a buggy TCP implementation on the device running the RTOS, and an issue that caused the Linux network stack to receive the TCP segments out of order when more than one core was active.

I have a sender on IP 192.168.2.250 running an embedded RTOS and a receiver on IP 192.168.2.1 running Linux 4.9.x.

The receiver is configured as a wireless access point, and the sender is connected directly to the receiver over WiFi.

I captured a tcpdump on the receiving side during a TCP data transfer, and I'm noticing quite a lot of duplicate ACKs being sent by the receiver without any actual packet loss occurring (or at least that's what I think, because I see no retransmissions and the ACKs eventually catch up with the sent sequence numbers).

[Image: Wireshark trace showing TCP duplicate ACKs]

Does anybody have an idea what might be causing the receiver's behaviour?

Edit: you are not seeing fast retransmissions from the sender because I turned them off to prove that the stream is not missing data (and throughput went up a lot by doing that). One explanation would be that the packets are seen out of order by the TCP stack. Can I make Linux more tolerant of out-of-order packets, as in not sending duplicate ACKs immediately?
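To illustrate the out-of-order hypothesis, here is a minimal sketch of a simplified receiver that answers every arriving segment with a cumulative ACK. It is not how the Linux stack is implemented (no SACK, no delayed ACKs), and the function names, segment sizes, and sequence numbers are made up for the example. It only shows that reordering alone, with no loss, is enough to produce duplicate ACKs.

```python
def cumulative_acks(segments, start_seq=0):
    """Return the ACK number a simplified receiver emits after each segment.

    segments: list of (seq, length) tuples in arrival order (hypothetical data).
    """
    next_expected = start_seq
    buffered = {}  # out-of-order segments held until the gap is filled
    acks = []
    for seq, length in segments:
        if seq == next_expected:
            next_expected += length
            # Drain buffered segments that have now become contiguous.
            while next_expected in buffered:
                next_expected += buffered.pop(next_expected)
        elif seq > next_expected:
            # A gap exists: keep the segment, the ACK number cannot advance.
            buffered[seq] = length
        # seq < next_expected is old data; the receiver just re-ACKs.
        acks.append(next_expected)
    return acks


def count_duplicate_acks(acks):
    """Count ACKs that repeat the immediately preceding ACK number."""
    return sum(1 for prev, cur in zip(acks, acks[1:]) if cur == prev)


in_order = [(0, 100), (100, 100), (200, 100), (300, 100)]
reordered = [(0, 100), (200, 100), (300, 100), (100, 100)]  # same data, no loss

print(cumulative_acks(in_order), count_duplicate_acks(cumulative_acks(in_order)))
print(cumulative_acks(reordered), count_duplicate_acks(cumulative_acks(reordered)))
```

In the reordered case the ACK number stays at 100 until the late segment arrives and then jumps straight to 400, which matches the pattern of ACKs that "eventually follow the sent sequence numbers" without any retransmission.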

Output of sysctl net | grep tcp:

net.ipv4.tcp_abort_on_overflow=0
net.ipv4.tcp_adv_win_scale=1
net.ipv4.tcp_allowed_congestion_control=cubic reno
net.ipv4.tcp_app_win=31
net.ipv4.tcp_autocorking=1
net.ipv4.tcp_available_congestion_control=cubic reno
net.ipv4.tcp_base_mss=1024
net.ipv4.tcp_challenge_ack_limit=1000
net.ipv4.tcp_congestion_control=cubic
net.ipv4.tcp_delack_seg=1
net.ipv4.tcp_dsack=1
net.ipv4.tcp_early_retrans=3
net.ipv4.tcp_ecn=2
net.ipv4.tcp_ecn_fallback=1
net.ipv4.tcp_fack=1
net.ipv4.tcp_fastopen=1
net.ipv4.tcp_fin_timeout=60
net.ipv4.tcp_frto=2
net.ipv4.tcp_fwmark_accept=0
net.ipv4.tcp_invalid_ratelimit=500
net.ipv4.tcp_keepalive_intvl=75
net.ipv4.tcp_keepalive_probes=9
net.ipv4.tcp_keepalive_time=7200
net.ipv4.tcp_limit_output_bytes=262144
net.ipv4.tcp_low_latency=0
net.ipv4.tcp_max_orphans=16384
net.ipv4.tcp_max_reordering=300
net.ipv4.tcp_max_syn_backlog=128
net.ipv4.tcp_max_tw_buckets=16384
net.ipv4.tcp_mem=33249 44333 66498
net.ipv4.tcp_min_rtt_wlen=300
net.ipv4.tcp_min_tso_segs=2
net.ipv4.tcp_moderate_rcvbuf=1
net.ipv4.tcp_mtu_probing=0
net.ipv4.tcp_no_metrics_save=0
net.ipv4.tcp_notsent_lowat=4294967295
net.ipv4.tcp_orphan_retries=0
net.ipv4.tcp_pacing_ca_ratio=120
net.ipv4.tcp_pacing_ss_ratio=200
net.ipv4.tcp_probe_interval=600
net.ipv4.tcp_probe_threshold=8
net.ipv4.tcp_recovery=1
net.ipv4.tcp_reordering=3
net.ipv4.tcp_retrans_collapse=1
net.ipv4.tcp_retries1=3
net.ipv4.tcp_retries2=15
net.ipv4.tcp_rfc1337=0
net.ipv4.tcp_rmem=4096 87380 6291456
net.ipv4.tcp_sack=1
net.ipv4.tcp_slow_start_after_idle=1
net.ipv4.tcp_stdurg=0
net.ipv4.tcp_syn_retries=6
net.ipv4.tcp_synack_retries=5
net.ipv4.tcp_syncookies=1
net.ipv4.tcp_thin_dupack=0
net.ipv4.tcp_thin_linear_timeouts=0
net.ipv4.tcp_timestamps=0
net.ipv4.tcp_tso_win_divisor=3
net.ipv4.tcp_tw_recycle=0
net.ipv4.tcp_tw_reuse=0
net.ipv4.tcp_use_userconfig=0
net.ipv4.tcp_window_scaling=1
net.ipv4.tcp_wmem=4096 16384 4194304
net.ipv4.tcp_workaround_signed_windows=0

1 Answer

For some reason, .250 is sending old ACK values for SEQs it has already received. See packet #551: .1 states SEQ=290541. In packet #552, .250 says ACK=267181. Hence, as .250's acknowledgment number (267181) is lower than .1's sequence number (290541), .1 assumes that .250 lost #551, because all packets between #552 and #558 use the outdated SEQ=267181, and it sends another ACK for each packet it receives with an outdated ACK number.

If the RTOS is reporting no loss, we can only assume that its scheduler is prioritizing pushing data over processing acknowledgments.

PEdroArthur