I'm trying to implement a Direct Server Return (DSR) scheme for our web cluster, and I think I've got stuck on some ARP issues. For testing purposes I deployed two virtual servers (inside an ESXi environment).

Host A: eth0 10.0.0.1/24 (VIP). This is our director, with its virtual IP on eth0.
Host B: eth0 10.0.0.2/24, lo:0 10.0.0.1/32. This is one of the webapp nodes, which runs the httpd daemon.

These two servers are on the same Ethernet segment. As you can see, host B has its loopback interface aliased to hold the VIP (10.0.0.1). To keep host B from answering ARP requests for the VIP, ARP filtering is implemented via arptables:

arptables -A IN -d 10.0.0.1 -j DROP
arptables -A OUT -s 10.0.0.1 -j mangle --mangle-ip-s 10.0.0.2

Everything seemed fine until I tried to ping host B from host A: "Destination Host Unreachable" is what I got. By running tcpdump on host B, I discovered that it did receive ARP requests from host A but didn't send replies. Meanwhile, ARP requests from other nodes were answered by host B successfully. So it looks like host A can't communicate with another machine that holds its VIP, even though I set up ARP filtering. This seems weird to me.
Any suggestions? By the way, I'm running CentOS 6.

Paul Rin

1 Answer

From the way this is set up, when host B gets the ping from host A and tries to respond, it won't even have to send an ARP request (the one you want to mangle) in order to reach 10.0.0.1, because from host B's point of view that address is already directly connected. You can confirm this by printing the ARP table on host B and seeing that there is already a record for 10.0.0.1. I've never used DSR in a setup where the VIP is in the same subnet as the real server IPs. Usually the VIPs are bound to loopback and sit in a different subnet, with routing set up to reach them. That prevents both this issue and the need to mangle ARP, which can lead to some really big debugging headaches later on.
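
One way to check the "already directly connected" point is to see whether the VIP is configured as a local address on host B. The following is only a sketch: `is_local_addr` is a made-up helper name, and it assumes the one-line-per-address output format of `ip -o -4 addr show`.

```shell
#!/bin/sh
# is_local_addr ADDR
# Reads `ip -o -4 addr show` output on stdin and exits 0 if ADDR is
# configured on any interface (loopback aliases such as lo:0 included),
# otherwise exits 1. The helper name is hypothetical.
is_local_addr() {
    awk -v want="$1" '
        {
            # The address follows the "inet" keyword, e.g. "inet 10.0.0.1/32".
            for (i = 1; i < NF; i++) {
                if ($i == "inet") {
                    split($(i + 1), parts, "/")   # drop the prefix length
                    if (parts[1] == want) found = 1
                }
            }
        }
        END { exit(found ? 0 : 1) }
    '
}

# Typical use on host B (kept as a comment so sourcing this file has no side effects):
#   ip -o -4 addr show | is_local_addr 10.0.0.1 && echo "10.0.0.1 is a local address"
```

If the check succeeds, host B considers the VIP one of its own addresses, so it never needs to resolve that address over the wire. `ip route get 10.0.0.1` should tell the same story by reporting a `local` route.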

polynomial
  • Thanks for the reply. Actually, I figured out my mistake: I forgot to add a primary IP to host A, so it should look like `eth0 10.0.0.3/24, lo:0 10.0.0.1/32` – Paul Rin Aug 22 '11 at 16:48
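
For reference, the corrected layout for host A described in the comment above could be applied roughly as follows. This is a sketch rather than the poster's actual commands: it assumes the iproute2 `ip` tool, must be run as root, and is not persistent across reboots.

```shell
# Host A (director): give eth0 its own primary address and move the VIP
# to a loopback alias, as described in the comment above.
# Assumption: the old 10.0.0.1/24 address on eth0 is removed first.
ip addr del 10.0.0.1/24 dev eth0
ip addr add 10.0.0.3/24 dev eth0
ip addr add 10.0.0.1/32 dev lo label lo:0

# Confirm the result.
ip -o -4 addr show
```

With this layout host A sends its ARP requests and pings from 10.0.0.3, an address that host B does not also hold.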