Root cause behind increase in throughput

Question

I am running a simple TCP client and server application on two Linux hosts (2.6.x kernel, RHEL 6.3). In an infinite loop, the client sends a 1024-byte message and the server responds with a 100-byte ack. Then the client sends another 1024-byte message, and so on. The latency (RTT) between the two hosts, as measured by ping, averages around 0.23 ms.

Normally the client and server exchange only about 3200 messages per second, but after running for 2-3 minutes the message rate climbs as high as 5100 messages per second. This rate lasts for a few seconds and then falls back to 3200. How can I figure out what causes these jumps in throughput?

UPDATE: The two hosts are on the same VLAN, connected by a Cisco Catalyst switch, and the network bandwidth is 1 Gb/s.
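
One way to narrow this down without a packet capture is to time every round trip in the client itself, so the fast and slow phases can be compared sample by sample. The sketch below is a minimal illustration, not the original application: `recv_exact`, `timed_ping_pong`, and `summarize` are hypothetical helper names, and it assumes an already connected socket and the 1024-byte message and 100-byte ack sizes from the question.

```python
import socket
import time


def recv_exact(sock, n):
    """Read exactly n bytes, or raise if the peer closes early."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf.extend(chunk)
    return bytes(buf)


def timed_ping_pong(sock, count, msg_size=1024, ack_size=100):
    """Send `count` messages one at a time; return each round trip in seconds."""
    payload = b"x" * msg_size
    samples = []
    for _ in range(count):
        start = time.perf_counter()
        sock.sendall(payload)        # one request on the wire
        recv_exact(sock, ack_size)   # block until the full ack arrives
        samples.append(time.perf_counter() - start)
    return samples


def summarize(samples):
    """Return (messages_per_second, min_us, median_us, max_us)."""
    ordered = sorted(samples)
    total = sum(ordered)
    return (
        len(ordered) / total,
        ordered[0] * 1e6,
        ordered[len(ordered) // 2] * 1e6,
        ordered[-1] * 1e6,
    )
```

Calling `summarize` on each batch of a few thousand samples and logging the result with a timestamp would show whether the median round trip itself drops during the 5100 msg/s periods, or whether only a few slow outliers disappear.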

Is this with the [realtime kernel from this question](http://serverfault.com/questions/445077/how-to-troubleshoot-latency-between-2-linux-hosts) or not? – ewwhite, Nov 06 '12 at 17:35
No, it is the plain RHEL 2.6.32 kernel. When the throughput increases, top shows no corresponding increase in CPU utilization. – Jimm, Nov 06 '12 at 17:40
By investigating what the heck this application is doing? How would WE know what is going on? – adaptr, Nov 06 '12 at 17:32

score 1 · Answer 1 · answered Nov 06 '12 at 17:44


One possibility is that the TCP window grows over time, allowing more outstanding bytes on the wire and therefore higher throughput. A condition on the network that causes a packet to be lost would then shrink the window again and bring the rate back down.

Run Wireshark and look at the fields in the TCP header around the points where throughput increases and where it drops. If this is the cause, it should be pretty clear from the capture.
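
If a packet capture is not available, the kernel's own TCP counters give a rough check for loss. This is a minimal sketch, assuming a Linux host where `/proc/net/snmp` exposes a `Tcp:` header line followed by a `Tcp:` value line; the function names are hypothetical. These counters cover the whole host, not a single connection, so other TCP traffic on the machine will also show up in them.

```python
def parse_tcp_counters(snmp_text):
    """Parse the 'Tcp:' header/value line pair from /proc/net/snmp text."""
    tcp_lines = [
        line.split() for line in snmp_text.splitlines()
        if line.startswith("Tcp:")
    ]
    if len(tcp_lines) < 2:
        raise ValueError("no Tcp: header/value pair found")
    # First line holds the counter names, second line the values.
    names, values = tcp_lines[0][1:], tcp_lines[1][1:]
    return {name: int(value) for name, value in zip(names, values)}


def read_tcp_counters(path="/proc/net/snmp"):
    """Read and parse the host-wide TCP counters (Linux only)."""
    with open(path) as f:
        return parse_tcp_counters(f.read())
```

Take one snapshot before and one after a throughput change and compare `RetransSegs`. If it does not move, retransmissions are unlikely to explain the drop.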


Vatine


I am on a corporate network and it is a long process to get access to Wireshark. In the meantime, I ran ping alongside the client and server, and ping shows no message loss. – Jimm Nov 06 '12 at 17:54
@Jimm: Use the switch monitoring capabilities (SPAN/RSPAN) and dump that flow on another server running tcpdump. If you are expected to do network troubleshooting you should be allowed to do this. – petrus Nov 06 '12 at 17:57
The most important thing to remember is that the client ONLY sends a message after one RTT (i.e. after it receives an ACK from the server). So at any point in time there is ONLY ONE packet in flight, and it's highly unlikely that the window size is changing (shrinking). Also, both hosts are on the same VLAN, with average CPU utilization of 0.1%. And, as I said, when I run ping alongside the client and server, there is 0 message loss. – Jimm Nov 06 '12 at 18:00
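
With only one message in flight at a time, the message rate is simply the inverse of the time per round trip, so the two observed rates can be converted into per-message latencies. The arithmetic below uses only the figures from the question; `round_trip_us` is a hypothetical helper name, and the bandwidth estimate counts payload bytes only, ignoring TCP/IP and Ethernet headers.

```python
def round_trip_us(messages_per_second):
    """Time per request/ack cycle in microseconds when one message is in flight."""
    return 1_000_000 / messages_per_second


for rate in (3200, 5100):
    print(f"{rate} msg/s -> {round_trip_us(rate):.1f} us per round trip")

# Average ping RTT reported in the question, converted to microseconds.
print(f"ping RTT -> {0.23 * 1000:.1f} us")

# Payload bandwidth at the peak rate: (1024 + 100) bytes per cycle.
peak_bits_per_second = (1024 + 100) * 8 * 5100
print(f"peak payload bandwidth -> {peak_bits_per_second / 1e6:.1f} Mbit/s")
```

This works out to about 312.5 µs per cycle at 3200 msg/s and about 196 µs at 5100 msg/s, against an average ping RTT of 230 µs, and roughly 46 Mbit/s of payload at the peak, far below the 1 Gb/s link. Ping measures ICMP rather than the application's TCP path, so the figures are not directly comparable, but they suggest the jump reflects a change in per-message latency rather than in link capacity.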
