I have set up a Linux box (on ESXi 5) which acts as an OpenVPN server. The server is configured to use bridging for the clients, which essentially works, with one exception.

If the client pings a machine on the network other than the server itself, the ping fails. I ruled out everything I know of (iptables, etc.), and running tcpdump narrowed it down to the following:

  • I see ARP requests on tap0 and br0
  • I see the ARP replies on br0
  • I do NOT see the ARP replies on tap0

Question: why does the br0 device not forward ARP replies to the tap0 device?

Michal Sokolowski
fen
  • OK, I got a step further. When I watch the MAC table of the bridge using `brctl showmacs`, I see the MAC address of my VPN client on the tap0 side. If I now start pinging from the VPN client to the subnet, the MAC address moves over to the other bridge port, which of course blocks the ARP reply from the subnet. The MAC switches back almost immediately when the ping stops. What I do not know is why the MAC address switches to the wrong switch port; all my searches have yielded no results so far. – fen Dec 12 '11 at 09:46
  • If it "moves over" to another port, that would be a definite clue that the MAC address is either present more than once in your network or that you are seeing the effects of a network loop (two ports of the same bridge connected by an active path). Both are configuration problems which need to be corrected. – the-wabbit Dec 10 '14 at 06:55
  • Isolate the issue by first using a static ARP entry on your client. If pings work after that, you can move on to troubleshooting ARP. If they do not, you have a bigger networking issue than just ARP. – Ricardo Mar 20 '15 at 14:20
  • We can't know **anything** about what your network looks like. Long shot: do you have `client-to-client` in your server's OpenVPN config file? If your servers are connected to the VPN network using OpenVPN as a client, then that could apply. PS. What distro are you using? – Michal Sokolowski May 23 '15 at 10:42
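
The MAC "moving over" described in the comments above can be confirmed by comparing two `brctl showmacs` snapshots, one taken while idle and one taken during the ping. What follows is only a rough sketch: the helper names, the sample MAC addresses, and the port numbers are made up for illustration, and it assumes the usual four-column `brctl showmacs` layout (port no, mac addr, is local?, ageing timer).

```python
def parse_showmacs(text):
    """Map each non-local MAC address to its bridge port number."""
    table = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows: <port no> <mac addr> <is local?> <ageing timer>.
        # The header splits into more than four words and is skipped.
        if len(fields) != 4 or not fields[0].isdigit():
            continue
        port, mac, is_local, _age = fields
        if is_local == "no":
            table[mac.lower()] = int(port)
    return table


def moved_macs(before, after):
    """Return {mac: (old_port, new_port)} for MACs seen on a different port."""
    return {
        mac: (before[mac], after[mac])
        for mac in before.keys() & after.keys()
        if before[mac] != after[mac]
    }


# Hypothetical snapshots; real ones would come from `brctl showmacs br0`.
IDLE = """port no  mac addr           is local?  ageing timer
  1      00:0c:29:11:22:33  yes        0.00
  2      00:ff:aa:bb:cc:dd  no         1.20
"""
PINGING = """port no  mac addr           is local?  ageing timer
  1      00:0c:29:11:22:33  yes        0.00
  1      00:ff:aa:bb:cc:dd  no         0.05
"""

print(moved_macs(parse_showmacs(IDLE), parse_showmacs(PINGING)))
# prints {'00:ff:aa:bb:cc:dd': (2, 1)}
```

If the client's MAC shows up in the result, the bridge is learning that address from the wrong side, which matches the loop or duplicate-MAC explanation given in the comments.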

2 Answers

Without more info, we are guessing, but let's try:

First make sure that both eth0 and tap0 are in promiscuous mode. br0 should not be in promiscuous mode.

Next, check whether you have any arptables or iptables rules that might be interfering.

Since you already get ARP replies, you probably don't have any, but check anyway.

Finally, check the rp_filter settings, and also check any extra sysctl parameters you may have set.
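
A quick way to run the promiscuous-mode and rp_filter checks above in one go is sketched below. It is an assumption-laden sketch, not a definitive tool: the interface names `eth0`, `tap0` and `br0` are taken from the question, the helper names are made up, and it relies on `/sys/class/net/<iface>/flags` exposing the kernel's interface flags as a hex value (as far as I know it does on Linux, but verify on your system).

```python
from pathlib import Path

IFF_PROMISC = 0x100  # promiscuous-mode bit, as defined in <linux/if.h>


def is_promiscuous(flags_text):
    """True if the IFF_PROMISC bit is set in a sysfs flags value like '0x1103'."""
    return bool(int(flags_text.strip(), 16) & IFF_PROMISC)


def read_or_none(path):
    """Return the stripped file contents, or None if the file cannot be read."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def report(interfaces=("eth0", "tap0", "br0")):
    """Print the promiscuous state and rp_filter value of each interface."""
    for name in interfaces:
        flags = read_or_none(f"/sys/class/net/{name}/flags")
        rp_filter = read_or_none(f"/proc/sys/net/ipv4/conf/{name}/rp_filter")
        promisc = "unknown" if flags is None else is_promiscuous(flags)
        print(f"{name}: promiscuous={promisc} rp_filter={rp_filter}")


if __name__ == "__main__":
    report()
```

Note that the kernel uses the higher of `conf/all/rp_filter` and the per-interface value, so `/proc/sys/net/ipv4/conf/all/rp_filter` is worth checking as well.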

higuita

If your ESXi host has redundant connections to the network, there are a variety of ARP issues that can appear due to the default setting of Net.ReversePathFwdCheckPromisc. pfSense users using CARP were among the earliest to debug this, described over at https://doc.pfsense.org/index.php/CARP_Configuration_Troubleshooting

In a similar environment, we have OpenVPN bridging set up on FreeBSD, with the additional complication of VLANs. On a host where Net.ReversePathFwdCheckPromisc has not been set to 1, and where multiple uplinks to the network exist, we see massive packet loss (95%+) on inbound traffic to the tap device. It works just fine when set to 1.

JG23