4

Setup: a quad-CPU machine, plenty powerful and not loaded at all (neither CPU nor network). The client runs Windows Server 2008 64-bit; the server is a Linux box.

I have four threads that are all issuing HTTP requests starting at the same time. The connections are initiated to IPs X, X, Y, Z (two connections to X, one to Y and Z). All targets are on the local LAN.

I am seeing that the connections to X, Y and Z are formed (SYN, SYN/ACK), but the second connection to X comes with a 100 ms delay. That is, the machine does not send the second SYN to X for a full 100 ms.

Could this be related to TCP Offload Engine? What else could be causing this delay?

Edit - Another suspect is the client code: it's written in Java and uses HttpURLConnection.
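
To take HttpURLConnection out of the picture, the scenario above can be approximated with plain sockets. This is only a minimal sketch: the class name `ConcurrentConnectTimer`, the `host:port` argument format and the 5-second timeout are illustrative assumptions, not the actual client code. All threads wait on a latch so the connect attempts start together, and each one reports how long `connect()` took.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ConcurrentConnectTimer {

    /** Connects to all targets at once; returns each connect time in ms, or -1 on failure. */
    public static List<Long> timeConnects(List<InetSocketAddress> targets, int timeoutMs)
            throws Exception {
        // One thread per target, so no connect attempt waits for a free worker.
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, targets.size()));
        CountDownLatch startGate = new CountDownLatch(1);
        List<Future<Long>> futures = new ArrayList<>();
        for (InetSocketAddress target : targets) {
            Callable<Long> task = () -> {
                startGate.await();                      // block until every task is released
                long start = System.nanoTime();
                try (Socket socket = new Socket()) {
                    socket.connect(target, timeoutMs);  // returns once the TCP handshake completes
                    return (System.nanoTime() - start) / 1_000_000;
                } catch (IOException e) {
                    return -1L;                         // refused, unreachable, or timed out
                }
            };
            futures.add(pool.submit(task));
        }
        startGate.countDown();                          // release all threads together
        List<Long> results = new ArrayList<>();
        for (Future<Long> future : futures) {
            results.add(future.get());
        }
        pool.shutdown();
        return results;
    }

    public static void main(String[] args) throws Exception {
        List<InetSocketAddress> targets = new ArrayList<>();
        for (String arg : args) {                       // each argument is host:port (assumed format)
            String[] parts = arg.split(":");
            // Hostnames are resolved here, so DNS time is not part of the measurement.
            targets.add(new InetSocketAddress(parts[0], Integer.parseInt(parts[1])));
        }
        System.out.println(timeConnects(targets, 5000));
    }
}
```

Passing the same address twice plus two other addresses mirrors the X, X, Y, Z pattern. Running it alongside a packet capture should show whether a late SYN lines up with a longer connect time for the duplicate target.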

ripper234
  • 5,710
  • 9
  • 40
  • 49

6 Answers

4

A network trace (e.g., Wireshark) will show if the delay is in waiting for a response. It would also point out other "detours" like the suggestion about DNS. Sounds like you may have done this already, but you didn't say.

gbarry
  • 615
  • 5
  • 11
  • +1. I would recommend running wireshark on the server first. That will allow you to precisely measure when the request hits the server and when the server responds. If no delay is seen on the server itself then it's a network issue. If a delay is seen on the server, then it's a server issue. – joeqwerty Dec 09 '09 at 17:10
  • The SYN for the request is not leaving until later. In fact, I have written a standalone utility that simply opens N connections concurrently to the same machine, and I am seeing delays of 3 seconds before some of the SYN packets leave the client machine. – ripper234 Dec 09 '09 at 17:21
3

A different possibility: the limit on outgoing half-open connections that was introduced in Windows XP SP2 and defaults to 10. I'm not sure how you can see how many connections are in this state, but I believe that if this rate limiter kicks in, it shows up in the System event log (Event ID 4226, source Tcpip).

Half-open.com

pjc50
  • 1,720
  • 10
  • 12
2

Does it have to do a DNS lookup for each request? Is that limiting the rate?
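
One quick way to check is to time the resolution step on its own. The sketch below is only an illustration; the class name `DnsLookupTimer` is made up for this example. Note that if the client connects by literal IP address, as the question suggests, `InetAddress.getByName` only parses the address and no lookup happens, and that the JVM caches successful lookups, so only the first call for a name reflects the resolver.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class DnsLookupTimer {

    /** Returns how long a single name resolution took, in milliseconds. */
    public static long timeLookupMillis(String host) throws UnknownHostException {
        long start = System.nanoTime();
        // Literal IPs are only validated; hostnames go through the resolver (or the JVM cache).
        InetAddress.getByName(host);
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws UnknownHostException {
        for (String host : args) {
            System.out.println(host + ": " + timeLookupMillis(host) + " ms");
        }
    }
}
```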

pjc50
  • 1,720
  • 10
  • 12
1

The initial connection goes through fine, but your second connection is getting queued. I would review the client implementation. I don't know whether more recent JDKs have changed this, but it used to be that even if you created individual HttpURLConnection instances, the underlying protocol handler would still reuse the socket connection.

You should check in at StackOverflow and see if anyone there has dealt with this issue before.
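
If socket reuse is the suspect, one way to test it is to ask for a non-persistent connection on each request and see whether the delay changes. The following is a rough sketch under that assumption; the class name `NoKeepAliveFetch` and the 5-second timeouts are invented for illustration. It sends `Connection: close`, which HttpURLConnection permits, so the handler should not keep the socket around for a later request. The `http.keepAlive` system property is the JVM-wide alternative.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

public class NoKeepAliveFetch {

    /** Issues a GET with keep-alive disabled for this request and returns the HTTP status. */
    public static int fetchStatus(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setRequestProperty("Connection", "close"); // request a non-persistent connection
        conn.setConnectTimeout(5000);                   // illustrative timeouts, in milliseconds
        conn.setReadTimeout(5000);
        try {
            return conn.getResponseCode();              // connects, sends the request, reads the status line
        } finally {
            conn.disconnect();                          // release the underlying socket
        }
    }

    public static void main(String[] args) throws IOException {
        for (String url : args) {
            System.out.println(url + " -> " + fetchStatus(url));
        }
    }
}
```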

Shial
  • 1,017
  • 1
  • 9
  • 14
  • I've cross-posted to SO. Also the problem reproduces when I simply use a socket and bypass HTTP completely. – ripper234 Dec 09 '09 at 18:02
1

OK, there's a LOT of possible places this could be going wrong. You mentioned the TCP offload engine, and that's a reasonable suspect (especially if you've got Broadcom NICs in there), so let's rule it out and disable it (consult your documentation for this).

After that you want to start reducing other possible candidates, so look to switches, network cables and so on. If you can, connect source to target via a crossover and see if you can reproduce it there.

It's also worthwhile trying dear old ping: from the sounds of it, you should be able to reproduce the aberrant behaviour with 4 concurrent pings.

But what it boils down to is that there is no point in suspecting anything in particular at this early stage, as there are just too many places where it could be going wrong (including your app).

Maximus Minimus
  • 8,937
  • 1
  • 22
  • 36
  • 1. I wrote a small test app to reproduce the problem. 2. I've disabled the offload engine. 3. I doubt it's the switches, we're getting this phenomenon all over the place with different switches. I'll try a cross to make sure. 4. I doubt I'll see this problem with 4 concurrent pings - will try it though. – ripper234 Dec 09 '09 at 20:24
0

Have you checked your firewall settings? There might be something in the firewall settings that is rate limiting the connections.

Do you have any special sysctl settings for the server? There are lots of minor tweaks that one can do for networking in sysctl.

Have you checked against different servers/clients? This is to help isolate the cause of the problem - whether it is the particular server, client or both.

sybreon
  • 7,357
  • 1
  • 19
  • 19
  • No firewall. I believe the problem is on the Windows side, not the Linux side. It happens on multiple servers/clients, so I suspect Windows limitations or registry flags. – ripper234 Dec 09 '09 at 15:38