I have exactly the same issue as the one described here, but as a new user I cannot comment on that question to ask the author for clarification, so I am posting a new question. (I tried posting this as an answer on the same thread for reference, and it was deleted because it doesn't provide an answer.)

How do I prevent TCP connection freezes over an OpenVPN network?

Question: Does anyone have any recommendations for how to troubleshoot and/or determine the root cause of the TCP issue described on that thread? It's as if the remote end isn't accepting the ACK messages sent by the VPN client.

My setup is exactly the same as in the original question: a CentOS server (topology subnet) and two clients, one CentOS and one Ubuntu14.03. When I do an 'ssh cat abc.txt' from the ubuntu-client to the centos-client, the VPN connection of the centos-client stalls. The only way to get it back up is to restart both the openvpn server (on a CentOS box) and the openvpn client on the centos-client. Restarting only the centos-client connection doesn't make it operational: it brings up tun0 after ~1-2 minutes, but I can no longer ping or ssh the box via the VPN. I also tried all the MTU adjustment suggestions found in other threads (tun-mtu 1300 / fragment 1100 / mssfix etc.) and none of them helps.

What makes this even weirder is that if I do the same ssh cat from the ubuntu-client, using the CentOS server VPN for internet access, to the public IP address of the centos-client (thus bypassing the centos-client<->centos-server VPN leg), everything works fine (no stalls, ever).

UPDATE 1: I found a workaround for this, but it is a very ugly one. I am posting it here in case it gives anyone other ideas or hints. When I set the verbosity level to 9 on the openvpn server (server only, not on the client), the issue never occurs again. Verb 9 causes the openvpn server to log lots of data and use up 100% of the CPU it is running on. This limits the transfer speed and lets the scp complete successfully with no stalls; scp now copies at 40-50Kb/sec, while before it was stalling after going above 100Kb/sec.

UPDATE 2: I believe this is a buffering problem. The size of the file transferred (via scp or ssh cat) matters, a lot. If I scp a 700KB file (or smaller), it will always succeed, no matter how many times I try it. If I try for an 800KB file instead, it will always fail/stall after 7xxKb+.
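To narrow down the size threshold described in UPDATE 2, the test can be scripted. The following is only a sketch: the helper names (`make_test_file`, `probe_sizes`, `scp_with_timeout`), the address `10.8.0.6`, the user name, and the 60-second timeout are all assumptions for illustration, not part of the original setup. It relies on `head -c`, `mktemp`, and the coreutils `timeout` command being available.

```shell
#!/bin/sh
# Hypothetical helper: create a file of exactly SIZE_KB * 1024 random bytes.
make_test_file() {
    size_kb=$1
    path=$2
    head -c "$((size_kb * 1024))" /dev/urandom > "$path"
}

# Run a transfer command once per size and print whether it succeeded.
# $1 = command (or shell function) that takes one file path argument
# remaining arguments = file sizes in KB
probe_sizes() {
    transfer_cmd=$1
    shift
    for size_kb in "$@"; do
        file=$(mktemp)
        make_test_file "$size_kb" "$file"
        if "$transfer_cmd" "$file"; then
            echo "${size_kb}KB ok"
        else
            echo "${size_kb}KB failed"
        fi
        rm -f "$file"
    done
}

# Example transfer: scp wrapped in a hard timeout, so that a stalled
# connection is reported as a failure instead of hanging forever.
# user@10.8.0.6 is a placeholder for the centos-client's VPN address.
scp_with_timeout() {
    timeout 60 scp -q "$1" user@10.8.0.6:/tmp/probe.bin
}
```

Calling `probe_sizes scp_with_timeout 700 720 740 760 780 800` would then show at which size the transfer first fails. Since a stall takes the VPN down, the tunnel has to be restarted before testing the next size.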

Atomo
  • I am also suffering from this issue, but using an Amazon AWS micro instance as the VPN server (Ubuntu) and two Debian clients. This evening I will try a more powerful Ubuntu instance (with more RAM, etc.) and see if that changes anything; if not, I'll try a microserver with different OSs. I have tried literally every setting in my OpenVPN config, but most things like fragment and MTU have no effect on TCP connections. Only mssfix does, and it does nothing to fix the problem. – J.J Jan 12 '16 at 14:30
  • Do you have a `tcpdump` or Wireshark capture of the connection reset process? And of the 700KB vs 800KB file transfers? – zymhan Sep 27 '18 at 15:07
  • I would make a network capture of both the encrypted and the unencrypted traffic, on both endpoints if possible, and compare the packets in the captures. Some firewalls, antivirus products or IDSs have bugs and change the traffic in unexpected and curious ways. – Mircea Vutcovici Dec 10 '19 at 20:31
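The captures suggested in the comments could be taken roughly as follows. This is a sketch under assumptions: the function names are made up for illustration, `tun0` is the tunnel interface mentioned in the question, and 1194 is assumed as the OpenVPN port because it is the default (the question does not state the port or the physical interface name). `tcpdump` usually needs root.

```shell
#!/bin/sh
# Capture the decrypted traffic inside the tunnel to or from one VPN peer.
# $1 = VPN address of the peer (placeholder, e.g. 10.8.0.6)
capture_tunnel() {
    peer_ip=$1
    tcpdump -i tun0 -n -s 0 -w "tunnel-$(hostname).pcap" host "$peer_ip"
}

# Capture the encrypted OpenVPN packets on the physical interface.
# $1 = physical interface (e.g. eth0, an assumption)
# $2 = OpenVPN port, defaulting to 1194 (an assumption)
capture_transport() {
    iface=$1
    port=${2:-1194}
    tcpdump -i "$iface" -n -s 0 -w "transport-$(hostname).pcap" port "$port"
}
```

Running both captures on the server and on each client during a failing 800KB transfer, then opening the files side by side in Wireshark, makes it possible to see on which leg the packets or ACKs stop arriving.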

2 Answers

We eliminated openvpn by creating ssh tunnels. Basically, our remote hosts establish an ssh tunnel towards us, and if it goes down they initiate it again. We had to use a few shell scripts to get it working, but once it was running it worked flawlessly for years. It allows us to ssh through the tunnel back to the servers.
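The scripts themselves are not shown in this answer, so the following is only a guess at the general shape of such a setup: a loop on the remote host that keeps a reverse tunnel (`ssh -R`) open and reconnects when it drops. The function name `keep_tunnel`, the `RETRY_DELAY` variable, and the choice of a reverse forward to local port 22 are assumptions for illustration.

```shell
#!/bin/sh
# Keep a reverse ssh tunnel open, reconnecting whenever it drops.
# $1 = login on the central host (placeholder, e.g. tunnel@hq.example.com)
# $2 = port to open on the central host, forwarded to this host's sshd
keep_tunnel() {
    remote=$1
    remote_port=$2
    while true; do
        # -N: no remote command, forwarding only.
        # ServerAlive*: detect a dead connection instead of hanging.
        # ExitOnForwardFailure: exit if the forward cannot be set up.
        ssh -N \
            -o ServerAliveInterval=30 \
            -o ServerAliveCountMax=3 \
            -o ExitOnForwardFailure=yes \
            -R "${remote_port}:localhost:22" "$remote"
        # ssh returned, so the tunnel is down: wait, then retry.
        sleep "${RETRY_DELAY:-10}"
    done
}
```

With `keep_tunnel tunnel@hq.example.com 2222` running on a remote host, `ssh -p 2222 localhost` on the central host would reach that remote host through the tunnel. Key-based authentication is needed for unattended reconnects.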

TekOps

I've seen similar issues and have been able to work around them by disabling TCP window scaling.

sysctl -w net.ipv4.tcp_window_scaling=0

Maybe this will point you in the right direction of where the problem may be.
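Because the setting is system-wide, it may be convenient to disable it only for the duration of one test and then restore the previous value. A minimal sketch, assuming root privileges; the wrapper name `with_scaling_disabled` is made up for illustration:

```shell
#!/bin/sh
# Run a command with TCP window scaling disabled, then restore the
# previous value. Must be run as root, since it calls `sysctl -w`.
with_scaling_disabled() {
    key=net.ipv4.tcp_window_scaling
    previous=$(sysctl -n "$key")          # remember the current value
    sysctl -w "$key=0" >/dev/null         # disable window scaling
    status=0
    "$@" || status=$?                     # run the test command
    sysctl -w "$key=$previous" >/dev/null # restore the old value
    return "$status"
}
```

For example, `with_scaling_disabled scp big.bin user@10.8.0.6:/tmp/` (file name and address are placeholders) would show whether the stall still occurs. Window scaling is negotiated when a connection is opened, so only connections started while the setting is off are affected, and each host only controls what it offers itself.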

chutz