
I have an application that sends 100 186-byte TCP messages (excluding headers) back to back, with no gap, from host A to host B.

I ran tcpdump to capture the packets on host A (where the sender is), and I noticed that after a few messages (around 9), the next ~25 messages were merged into a single segment of over 5 KB.

I have already turned off Nagle's algorithm through setsockopt() in the sender application, and the calculated TCP window stays above 14 KB the whole time, so it doesn't look like the first 9 messages filled up host B's buffer and host B asked host A to slow down.
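For reference, the call I use to turn Nagle off looks roughly like the sketch below (not the exact code from my application; the socket setup is omitted):

#include <netinet/in.h>
#include <netinet/tcp.h>
#include <stdio.h>
#include <sys/socket.h>

/* Disable Nagle's algorithm on an already-connected TCP socket. */
static int disable_nagle(int sockfd)
{
    int flag = 1;

    if (setsockopt(sockfd, IPPROTO_TCP, TCP_NODELAY, &flag, sizeof(flag)) < 0) {
        perror("setsockopt(TCP_NODELAY)");
        return -1;
    }
    return 0;
}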

Any tips on how to figure out why the TCP messages got merged?

Thanks!

Hei
  • You are aware that unless you configure jumbo frames, the max ethernet message size is around 1k? – TomTom Dec 26 '15 at 12:12

2 Answers


I have an application that sends 100 186-byte TCP messages (excluding headers) back to back, with no gap, from host A to host B.

Then you may be sending them faster than the network can transport them. By the time the TCP implementation on the sender is ready to put a packet on the network, there may be several messages queued up, and it will send as many of them as it can in a single TCP segment. TCP offers a byte-stream service with no notion of message boundaries, so it is permitted to do that.
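In other words, the receiving application has to impose its own framing on the byte stream. Since the question says every message is exactly 186 bytes, a minimal sketch of such a receive loop might look like this (the function and buffer names are just for illustration):

#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

#define MSG_SIZE 186   /* fixed message size from the question */

/* Read exactly one MSG_SIZE-byte message, no matter how TCP split or
 * merged the sender's writes into segments.  Returns 0 on success,
 * -1 on error or if the peer closed the connection mid-message. */
static int read_message(int sockfd, unsigned char buf[MSG_SIZE])
{
    size_t got = 0;

    while (got < MSG_SIZE) {
        ssize_t n = recv(sockfd, buf + got, MSG_SIZE - got, 0);

        if (n == 0)
            return -1;          /* connection closed by the peer */
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal, retry */
            return -1;          /* real error */
        }
        got += (size_t)n;
    }
    return 0;
}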

I have already turned on Nagle's algorithm

Nagle's algorithm explicitly does what you're saying the TCP on the sender is doing:

Nagle's algorithm works by combining a number of small outgoing messages, and sending them all at once.

so turning it on won't prevent the merging. Turning it off might, in some cases, but given that your application sends its messages in a back-to-back burst, it probably won't.

(I.e., the answer to "why did the TCP on the sender merge the messages?" is "because it can".)

  • I am an idiot -- another typo in my post...I turned off Nagle's algo actually. Thanks for your detailed explanation. I am looking for a way to identify whether it is caused by some settings (e.g. as kasperd suggested -- tcp-segmentation-offload, tx-tcp-segmentation, tx-tcp-ecn-segmentation, and tx-tcp6-segmentation were on), or something else (such as the NIC or the kernel can't process fast enough as you mention). Any suggestion? – Hei Dec 27 '15 at 03:14
  • Try turning off all the settings in question, and see what happens. You *can't* prevent it from happening - programs using TCP cannot rely on message boundaries being preserved, and must be able to deal with multiple messages per TCP segment and messages split between TCP segments. –  Dec 27 '15 at 06:01

What you are seeing is most likely due to functionality being offloaded from the kernel network stack to the network interface and/or its driver.

The network interface will still be receiving the individual packets from the network. But before the packets are handed off to the kernel they are merged by either the interface or the driver.

You can see the current settings of all the offload features using this command:

ethtool -k eth0

If you want to disable this particular feature, it can be done with this command:

ethtool -K eth0 generic-receive-offload off
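If you'd rather check or flip this from a program instead of the shell, the same feature is reachable through the SIOCETHTOOL ioctl. A rough C sketch, assuming the interface is eth0 and with only minimal error handling:

#include <linux/ethtool.h>
#include <linux/sockios.h>
#include <net/if.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    struct ethtool_value eval;
    struct ifreq ifr;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);        /* any socket works for the ioctl */

    memset(&ifr, 0, sizeof(ifr));
    strncpy(ifr.ifr_name, "eth0", IFNAMSIZ - 1);    /* interface name is an assumption */
    ifr.ifr_data = (char *)&eval;

    eval.cmd = ETHTOOL_GGRO;                        /* query generic-receive-offload */
    if (ioctl(fd, SIOCETHTOOL, &ifr) == 0)
        printf("GRO is currently %s\n", eval.data ? "on" : "off");

    eval.cmd = ETHTOOL_SGRO;                        /* turn it off (requires root) */
    eval.data = 0;
    if (ioctl(fd, SIOCETHTOOL, &ifr) != 0)
        perror("ETHTOOL_SGRO");

    close(fd);
    return 0;
}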

You can read more about offloading in this older question.

kasperd
  • sorry, my question was ambiguous. The messages got merged on the sender's side (host A) not the receiver's side. I updated my question to reflect it. Sorry about that. – Hei Dec 26 '15 at 11:38
  • @Hei In that case look at the transmit (tx) settings rather than the receive ones. Settings related to this include: `tcp-segmentation-offload`, `tx-tcp-segmentation`, `tx-tcp-ecn-segmentation`, `tx-tcp6-segmentation`. – kasperd Dec 26 '15 at 21:58