
I am running a simple TCP client and server application on two Linux hosts (2.6.x kernel, RHEL 6.3). In an infinite loop, the client sends a 1024-byte message and the server responds with a 100-byte ack. The client then sends another 1024-byte message, and so on. The round-trip latency (RTT) between the two hosts, as measured by ping, averages around 0.23 ms.

Normally the client and server exchange only about 3200 messages per second, but after running for 2-3 minutes the message rate climbs as high as 5100 messages per second. That rate holds for a few seconds and then falls back to 3200. How can I figure out what causes these jumps in throughput?
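As a rough sanity check on these numbers, the sketch below converts each observed message rate into the time spent per message, and converts the ping RTT into the rate it would allow if every message cost exactly one round trip. The helper names are illustrative only, and the one-round-trip-per-message model is an assumption (it ignores application and kernel processing time).

```python
# Figures taken from the question; the helper names are hypothetical.
PING_RTT_S = 0.00023      # average ping RTT, 0.23 ms
NORMAL_RATE = 3200        # messages per second, steady state
BURST_RATE = 5100         # messages per second, during a jump


def per_message_time_us(rate_per_s):
    """Time spent on one request/ack exchange, in microseconds."""
    return 1_000_000 / rate_per_s


def rate_for_rtt(rtt_s):
    """Message rate if each exchange took exactly one RTT (assumed model)."""
    return 1 / rtt_s


print(f"normal: {per_message_time_us(NORMAL_RATE):.1f} us per message")
print(f"burst:  {per_message_time_us(BURST_RATE):.1f} us per message")
print(f"rate implied by ping RTT: {rate_for_rtt(PING_RTT_S):.1f} msg/s")
```

Under that assumption the steady-state rate corresponds to about 312.5 us per exchange and the burst rate to about 196.1 us, while the ping average sits between the two.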

UPDATE: The two hosts are on the same VLAN, connected by a Cisco Catalyst switch, and the network bandwidth is 1 Gb/s.

Jimm
  • Is this with the [realtime kernel from this question](http://serverfault.com/questions/445077/how-to-troubleshoot-latency-between-2-linux-hosts) or not? – ewwhite Nov 06 '12 at 17:35
  • No, it is the standard RHEL 2.6.32 kernel. When the throughput increases, there is no corresponding increase in CPU utilization as shown by top. – Jimm Nov 06 '12 at 17:40
  • By investigating what the heck this application is doing? How would WE know what is going on? – adaptr Nov 06 '12 at 17:32
  • What language are the client and server written in? – James Nov 06 '12 at 19:26

1 Answer


It may be that the TCP window size changes: a larger window allows more outstanding bytes on the wire, which increases throughput. A condition on the network may then cause a packet to be lost, which drops the TCP window size again.

Run Wireshark and look at the various fields in the TCP header around the points where you see throughput increase and where it drops. If that is the cause, it should be pretty clear.
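To make "the fields in the TCP header" concrete, here is a minimal sketch that decodes the fixed 20-byte TCP header from raw bytes, including the advertised window. It is not a replacement for Wireshark: it assumes you already have the TCP segment bytes from some capture, and `TcpHeader` and `parse_tcp_header` are hypothetical names used only for illustration.

```python
import struct
from collections import namedtuple

# Hypothetical container for the fields of interest.
TcpHeader = namedtuple(
    "TcpHeader", "src_port dst_port seq ack data_offset flags window"
)

# Low six flag bits of the TCP header.
FLAG_NAMES = [
    (0x01, "FIN"), (0x02, "SYN"), (0x04, "RST"),
    (0x08, "PSH"), (0x10, "ACK"), (0x20, "URG"),
]


def parse_tcp_header(raw):
    """Decode the fixed 20-byte part of a TCP header (options are ignored)."""
    if len(raw) < 20:
        raise ValueError("need at least 20 bytes for a TCP header")
    (src, dst, seq, ack, off_flags,
     window, _checksum, _urgent) = struct.unpack("!HHIIHHHH", raw[:20])
    data_offset = (off_flags >> 12) * 4   # header length in bytes
    flags = [name for bit, name in FLAG_NAMES if off_flags & bit]
    # Note: 'window' is the raw 16-bit field. If window scaling was
    # negotiated in the SYN, the effective window is window << scale.
    return TcpHeader(src, dst, seq, ack, data_offset, flags, window)


# Example with a hand-built header (all values are made up).
sample = struct.pack("!HHIIHHHH", 40000, 5001, 1, 1025, (5 << 12) | 0x18, 14600, 0, 0)
print(parse_tcp_header(sample))
```

Comparing the window and flags of segments captured just before and just after a throughput jump is the kind of check the answer has in mind.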

Vatine
  • I am on a corporate network and it is a long process to get access to Wireshark. In the meantime, I ran ping alongside the client and server, and ping shows no message loss. – Jimm Nov 06 '12 at 17:54
  • @Jimm: Use the switch monitoring capabilities (SPAN/RSPAN) and dump that flow on another server running tcpdump. If you are expected to do network troubleshooting you should be allowed to do this. – petrus Nov 06 '12 at 17:57
  • The most important thing to remember is that the client ONLY sends the next message after one RTT (i.e. after it receives the ack from the server). So at any point in time there is ONLY ONE message in flight, and it is highly unlikely that the window size is changing (shrinking). Also, both hosts are on the same VLAN, with an average CPU utilization of 0.1%. And, as I said, when I run ping alongside the client and server, there is no message loss. – Jimm Nov 06 '12 at 18:00
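
With only one message in flight at a time, the per-message round-trip time can be recorded directly in the application, which shows whether the jumps come from shorter individual exchanges. The following is a minimal sketch of such a measurement, not the poster's actual client or server: the function names are hypothetical, it runs both ends on loopback for simplicity, and it assumes `TCP_NODELAY` is set on both sockets.

```python
import socket
import threading
import time

MSG_SIZE = 1024   # request size from the question
ACK_SIZE = 100    # ack size from the question


def recv_exact(sock, n):
    """Read exactly n bytes, or raise if the peer closes early."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf.extend(chunk)
    return bytes(buf)


def ack_server(listener, count):
    """Accept one client and answer each request with a fixed-size ack."""
    conn, _ = listener.accept()
    with conn:
        conn.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        for _ in range(count):
            recv_exact(conn, MSG_SIZE)
            conn.sendall(b"a" * ACK_SIZE)


def measure_round_trips(count, host="127.0.0.1"):
    """Return the per-message round-trip times, in seconds."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind((host, 0))              # port 0: the OS picks a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    server = threading.Thread(target=ack_server, args=(listener, count), daemon=True)
    server.start()

    samples = []
    with socket.create_connection((host, port)) as client:
        client.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        payload = b"m" * MSG_SIZE
        for _ in range(count):
            start = time.perf_counter()
            client.sendall(payload)       # one request in flight
            recv_exact(client, ACK_SIZE)  # wait for the ack before sending again
            samples.append(time.perf_counter() - start)
    server.join()
    listener.close()
    return samples


if __name__ == "__main__":
    rtts = measure_round_trips(1000)
    print(f"min {min(rtts) * 1e6:.0f} us, max {max(rtts) * 1e6:.0f} us")
```

Logging these samples with timestamps in the real client, rather than only a per-second count, would show whether the individual round trips get shorter during the 5100 msg/s periods.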