TLS negotiation hangs

2

I have a setup that worked quite well for the past few years but stopped working a few days ago. In my living room, I have a "bare metal" machine with a few VMs on it. One of the VMs is responsible for my email (dovecot, postfix) and one is responsible for the HTTP/S server (nginx). Each of those two makes an OpenVPN connection to a droplet at Digital Ocean. The droplet has firewall rules (iptables) that forward each packet to the OpenVPN client responsible for it: if it's coming in on port 80, forward it to the internal VM that runs nginx; if it's an email-related packet, forward it to the email VM.
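To make the forwarding idea concrete, here is a minimal sketch that builds DNAT rules of the kind described above. It is not a copy of my actual rules (those are linked further down): the VPN-side addresses, the port list and the helper names `dnat_rule` and `build_rules` are placeholders made up for illustration, and a real setup would most likely need additional FORWARD/masquerading rules as well.

```python
# Hypothetical mapping from public TCP port to the VPN-side address of the
# VM that should receive the traffic. Addresses and ports are placeholders.
FORWARDS = {
    80: "10.8.0.2",   # nginx VM (HTTP)
    443: "10.8.0.2",  # nginx VM (HTTPS)
    25: "10.8.0.3",   # mail VM (SMTP)
}


def dnat_rule(port: int, target_ip: str) -> str:
    """Build one iptables DNAT rule that forwards a TCP port to a VPN client."""
    return (
        f"iptables -t nat -A PREROUTING -p tcp --dport {port} "
        f"-j DNAT --to-destination {target_ip}:{port}"
    )


def build_rules(forwards: dict[int, str]) -> list[str]:
    """Return the DNAT rules for every forwarded port, sorted by port."""
    return [dnat_rule(port, ip) for port, ip in sorted(forwards.items())]


if __name__ == "__main__":
    # Print the rules instead of applying them, so this is safe to run anywhere.
    for rule in build_rules(FORWARDS):
        print(rule)
```

The script only prints the rule strings; it does not touch the firewall.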

As of a few days ago, anything related to TLS has stopped working (see edits). SMTP transactions just hang during the STARTTLS phase, and HTTPS connections hang during the TLS negotiation. Everything else works fine: plain HTTP connections are fulfilled (see edits) and if I disable TLS on postfix, emails come in just fine.

Also, if I'm on the same network and add the hostname to /etc/hosts with the internal IP of the VM, both HTTPS and STARTTLS work fine. It only fails when the connection comes from "the outside" (i.e., through the tunnel).

If I switch the "bare metal" machine's internet connection to my phone's mobile data connection, it works as expected as well.

All in all, it seems to me that my router and/or my ISP is playing a key role in this equation, but I find that hard to accept, because the traffic to/from the VM goes through OpenVPN (encrypted). If there's one thing I've learned, it's that the simplest explanation is usually the right one, and the only explanation I can find is not simple at all, so I'm interested in other possible causes for this problem.

What I have tried so far:

  • Another VM on the bare metal, with another OS: doesn't work
  • Raspberry Pi on my internal network as OpenVPN client with nginx: doesn't work
  • Another droplet as OpenVPN client with nginx: works
  • A VM on my laptop as OpenVPN client with nginx: doesn't work
  • My laptop connected to the internet via my phone's mobile data connection, plus the scenario above: works
  • An OpenVPN server at a different provider (Amazon EC2): doesn't work
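The attempts above can be written down as a small table to check which factor lines up with the outcome. This is only a sketch of that reasoning: the column values are my reading of the list, and two of them are assumptions (that the laptop VM in the failing case was on my home connection, and that the EC2 test used a client at home).

```python
from collections import defaultdict

# (description, client uplink, OpenVPN server location, worked?)
# The uplink for "VM on laptop" and for the EC2 test is assumed, not measured.
TRIALS = [
    ("other VM on the bare metal", "home ISP", "Digital Ocean", False),
    ("Raspberry Pi on internal network", "home ISP", "Digital Ocean", False),
    ("another droplet as client", "datacenter", "Digital Ocean", True),
    ("VM on laptop", "home ISP", "Digital Ocean", False),
    ("VM on laptop via phone", "mobile data", "Digital Ocean", True),
    ("server moved to Amazon EC2", "home ISP", "Amazon EC2", False),
]


def outcomes_by(trials, column):
    """Group the set of observed outcomes by the value in the given column."""
    grouped = defaultdict(set)
    for trial in trials:
        grouped[trial[column]].add(trial[3])
    return dict(grouped)


def is_consistent(grouped):
    """True if every value of the factor always led to the same outcome."""
    return all(len(results) == 1 for results in grouped.values())


if __name__ == "__main__":
    for name, column in (("client uplink", 1), ("server location", 2)):
        grouped = outcomes_by(TRIALS, column)
        print(name, grouped, "consistent:", is_consistent(grouped))
```

With these values, the client's uplink is the only column where each value always gives the same result, while the server location does not separate the working cases from the failing ones.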

Logs that might help:

TCP dump from the internal VM

TCP dump from the OpenVPN server (Droplet)

Firewall rules on the Droplet

Successful connection from the public internet to the internal VM, through the Droplet, on port 80

Sample of a connection that hangs when connecting from the public internet to the internal VM, through the Droplet, on port 443

Successful connection from the internal network on port 443

Edit:

Wireshark log for the VM

Wireshark log for the OpenVPN server

Edit 2:

I tested it a bit more, and created a new vhost on nginx with a self-signed cert. It seems that the data exchange between the OpenVPN server and the VM went further, but has not completed:

Wireshark log for the VM

Wireshark log for the OpenVPN server

Edit 3:

It seems that, in the end, it's not about TLS at all. It's a "coincidence" that it happened whenever TLS was involved; I'm becoming convinced that it's about packet size. After removing the "forced redirect" on nginx that sends all HTTP connections to HTTPS, I see that the problem also happens for plain HTTP connections when the payload is bigger than a few KB.
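One way to pin down "a few KB" more precisely would be a binary search over the payload size. The sketch below assumes a hypothetical `probe(size)` callable that returns True when a transfer of that size completes through the tunnel (for example, an HTTP request for a test file of that size with a timeout); the probe itself is not implemented here, and the search only makes sense if small sizes pass and everything above some threshold fails.

```python
from typing import Callable


def largest_passing_size(probe: Callable[[int], bool], low: int, high: int) -> int:
    """Return the largest size in [low, high] for which probe(size) is True.

    Assumes probe is monotonic: every size up to some threshold passes and
    every larger size fails. Returns low - 1 if even `low` fails.
    """
    best = low - 1
    while low <= high:
        mid = (low + high) // 2
        if probe(mid):
            best = mid      # mid passes, so look for something larger
            low = mid + 1
        else:
            high = mid - 1  # mid fails, so look for something smaller
    return best


if __name__ == "__main__":
    # Stand-in probe with an arbitrary, made-up threshold, for demonstration.
    fake_threshold = 1400
    print(largest_passing_size(lambda size: size <= fake_threshold, 1, 65535))
```

Each probe would be a real connection attempt, so the logarithmic number of steps keeps the test short even for a large size range.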

I also tested disconnecting my router/modem from my ISP's fiber connection and having it use USB tethering from my phone instead, to check whether the problem was in the router/modem, and it worked. So it seems the problem is now isolated to the ISP's connection rather than the router/modem.

jpkrohling

Posted 2015-11-02T19:30:28.067

Reputation: 121

is wireshark any use in figuring out what's going on? – barlop – 2015-11-02T23:59:34.943

I just added links to the wireshark logs, but they only help confirm that packets are leaving the VM but not arriving at the other end of the tunnel, at the OpenVPN server. – jpkrohling – 2015-11-03T08:33:58.637

I also added logs for a further attempt, with a self-signed cert. – jpkrohling – 2015-11-03T09:14:39.450

No answers