
Issue

I run an IRC server for 20-50 users. We sometimes have issues with messages not arriving in a timely fashion or at all. After some packet captures we determined that messages sit in the server's "Send-Q". When a message doesn't arrive I'll look at "netstat -ct" output and see something like this:

Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0   1756 ubuntu:ircd             10.8.1.7:63602          ESTABLISHED

Sometimes if I wait for a couple of minutes, the Send-Q will go to 0 and the message will be delivered, other times the client times out. My question is, why doesn't it just deliver the messages? What causes them to sit in Send-Q so long?

sshd exhibits similar behavior: my SSH sessions freeze, and sometimes they come back, sometimes they time out.

Background

Not sure if the infrastructure here could be related to the issue, so here's what it looks like: the clients are on Windows 7, connecting through OpenVPN. The OpenVPN server runs on pfSense, and the IRC server is on a local (NAT'd) LAN connected to pfSense. I have a firewall rule in place to allow clients to reach port 6667 on the server.

Investigating...

Latency/loss - looks decent enough. Not the best link ever, but I would think this would be fine for IRC and SSH. Here is a ping from my client to the server, taken while my IRC and SSH sessions are intermittently hanging:

Ping statistics for 10.8.5.2:
    Packets: Sent = 4478, Received = 4460, Lost = 18 (0% loss)

Approximate round trip times in milli-seconds:
    Minimum = 17.2 ms, Maximum = 273.4 ms, Average = 32.3 ms
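
The "0% loss" in that summary is a whole-number percentage, so the raw counts are more informative. A minimal sketch of the arithmetic, using the Sent/Received figures above (the function name `loss_percent` is made up for illustration):

```python
def loss_percent(sent: int, received: int) -> float:
    """Return packet loss as a percentage of packets sent."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    return (sent - received) * 100.0 / sent


# Counts taken from the ping summary above.
sent, received = 4478, 4460
print(f"lost {sent - received} of {sent}: {loss_percent(sent, received):.2f}% loss")
```

This only gives an average over the whole run. It says nothing about whether the 18 lost packets were spread out or came in bursts.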

MSS/MTU issues - MTU appears to be fine. OpenVPN mtu-test on my client says:

Thu Dec 03 12:41:21 2015 NOTE: Empirical MTU test completed [Tried,Actual] local->remote=[1589,1589] remote->local=[1589,1589]

...and here's my manual test:

> ping -f -l 1472 10.8.5.2

Pinging 10.8.5.2 with 1472 bytes of data:
Reply from 10.8.5.2: bytes=1472 time=23ms TTL=63

> ping -f -l 1473 10.8.5.2

Pinging 10.8.5.2 with 1473 bytes of data:
Packet needs to be fragmented but DF set.
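
For reference, the 1472/1473 boundary is what a 1500-byte path MTU would produce. A small sketch of that arithmetic, assuming IPv4 without options (20-byte header) and the 8-byte ICMP echo header; the function names are hypothetical:

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
ICMP_HEADER = 8   # bytes, ICMP echo request header


def packet_size(payload: int) -> int:
    """Total IPv4 packet size for an ICMP echo with the given payload."""
    return payload + ICMP_HEADER + IPV4_HEADER


def max_ping_payload(mtu: int) -> int:
    """Largest ICMP echo payload that fits in one unfragmented packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER


print(packet_size(1472))       # 1500: fits a 1500-byte MTU
print(packet_size(1473))       # 1501: one byte too large, needs fragmentation
print(max_ping_payload(1500))  # 1472
```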

Bandwidth/throughput - did some iperf tests to make sure there wasn't a throughput issue. Again, looks decent enough:

iperf -c 10.8.5.2
------------------------------------------------------------
Client connecting to 10.8.5.2, TCP port 5001
TCP window size: 63.0 KByte (default)
------------------------------------------------------------
[  3] local 10.8.0.23 port 18587 connected with 10.8.5.2 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  26.0 MBytes  21.8 Mbits/sec

Thanks, any help understanding "Send-Q" or more specific ideas about this issue would be much appreciated. Let me know if I can provide any more info here.

Update

Found out that I actually had massive packet loss. Pings from client->VPN didn't show this, but it was very apparent when using fping from VPN->client. I noticed it was only the Windows clients, and reinstalling the newest OpenVPN client seems to have fixed the loss. It might have been related to the OpenVPN TAP adapter being installed via disk imaging. Installing it manually per-machine seems to fix the problem.

Cory J

2 Answers


Data goes into the send queue when the application writes it to its local kernel TCP stack. Data is removed from the send queue when the other side's TCP stack acknowledges receipt of it. If data is sitting in the send queue, your IRC server code has handed it to your kernel, but the other side of the connection hasn't acknowledged it yet. That may be because it hasn't been sent yet, or because it was sent and no acknowledgment has come back. The cause can be server bandwidth limitations or server performance limitations, but most commonly it's simply that the other side isn't receiving the data as fast as the server is sending it.
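
To watch for this, one option is to scan netstat output for connections with a non-zero Send-Q. The following is a rough sketch, not a definitive tool: it assumes the Linux net-tools column layout shown in the question (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State), and the function name `stuck_connections` and the second sample row are made up for illustration.

```python
def stuck_connections(netstat_output: str, min_send_q: int = 1):
    """Return (local, foreign, send_q) for TCP rows with Send-Q >= min_send_q."""
    results = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed layout: Proto Recv-Q Send-Q Local Foreign State
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue  # skips the header row and non-TCP rows
        try:
            send_q = int(fields[2])
        except ValueError:
            continue  # not a numeric Send-Q column, ignore the row
        if send_q >= min_send_q:
            results.append((fields[3], fields[4], send_q))
    return results


# First data row is from the question; the second is invented sample data.
sample = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0   1756 ubuntu:ircd             10.8.1.7:63602          ESTABLISHED
tcp        0      0 ubuntu:ssh              10.8.1.9:50122          ESTABLISHED
"""

print(stuck_connections(sample))
```

A Send-Q that is briefly non-zero is normal. What matters is a value that stays above zero across repeated samples for the same connection, which means the peer is not acknowledging the data.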

David Schwartz
  • Thanks for answering the question. I finally found the culprit was packet loss from the VPN to the clients. For some reason it didn't show up when pinging from client->VPN, but VPN->client showed massive loss. – Cory J Dec 03 '15 at 22:31
  • Wow, that's a weird one. Packet loss can certainly result in long delays between when data is sent and when it's received. – David Schwartz Dec 03 '15 at 22:35
  • In my case, the host veth bridge dropped my packets – user5723841 Feb 25 '22 at 02:49

I had a similar problem:

  • ssh connection remained ESTABLISHED
  • terminal freezes and un-freezes repeatedly
    • packets are visible in SEND-Q while terminal freezes
    • freeze time about 2-4 minutes
    • un-frozen for about 30 seconds

Turns out the firmware of my router was outdated!

  • fixed issue with firmware update
Leevi L