I have a client machine (Ubuntu 14.04.2 LTS) that connects directly to two server machines. The servers, while on the same subnet, cannot route to each other. Packets must not be routed between the client's eth1 and eth2 either (i.e., the client is not a router), and because the servers have to be discovered, the routing table cannot be pre-programmed with the serverX IPs.
Below is a crude diagram. eth0 is left out of this picture as it cannot route to the subnet 172.16.37.0/24. There are no firewalls (both iptables and ufw are clean as a whistle).
          client                           server01
    +-------------------+           +-------------------+
    | 172.16.37.53 eth1-|-----------|-eth1 172.16.37.11 |
    |                   |           +-------------------+
    |                   |                  server02
    |                   |           +-------------------+
    | 172.16.37.54 eth2-|-----------|-eth1 172.16.37.12 |
    +-------------------+           +-------------------+
The problem that I'm encountering is that ping -I eth2 172.16.37.12 does not work from the client machine. In short, what I observe via tcpdump is:
- client:eth2 issues ping-request
- server02:eth1 receives ping-request
- server02:eth1 issues an ARP who-has for 172.16.37.54
- client:eth2 receives the ARP who-has
- ARP never responds (and the ARP cache isn't updated)
If I send gratuitous ARPs with arping -A -I eth2 172.16.37.54 from the client, I observe the following instead:
- client:eth2 issues ping-request
- server02:eth1 receives ping-request
- server02:eth1 issues ping-reply to client
- client:eth2 receives ping-reply
- ping application never receives packet
When I strace my ping session, it keeps retrying recvmsg; nothing shows up on the socket.
Here's the routing table for the aforementioned observations:
    $ route -n
    Kernel IP routing table
    Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
    0.0.0.0         172.16.187.2    0.0.0.0         UG    0      0        0 eth0
    172.16.37.0     0.0.0.0         255.255.255.0   U     0      0        0 eth1
    172.16.37.0     0.0.0.0         255.255.255.0   U     0      0        0 eth2
    172.16.187.0    0.0.0.0         255.255.255.0   U     0      0        0 eth0
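The table lists 172.16.37.0/24 twice, once for eth1 and once for eth2. As a small illustration of that overlap, here is a sketch that parses the route -n output quoted above and reports any subnet reachable through more than one interface. The function name duplicate_subnets is made up for this example, and the parser assumes the eight-column net-tools layout shown here.

```python
from collections import defaultdict

# Output of `route -n` as quoted in the post, pasted in verbatim.
ROUTE_N = """\
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         172.16.187.2    0.0.0.0         UG    0      0        0 eth0
172.16.37.0     0.0.0.0         255.255.255.0   U     0      0        0 eth1
172.16.37.0     0.0.0.0         255.255.255.0   U     0      0        0 eth2
172.16.187.0    0.0.0.0         255.255.255.0   U     0      0        0 eth0
"""


def duplicate_subnets(route_output):
    """Map (destination, genmask) -> interfaces, keeping only subnets
    that appear on more than one interface."""
    by_subnet = defaultdict(list)
    for line in route_output.splitlines():
        fields = line.split()
        # Data rows have exactly 8 columns and start with a digit;
        # this skips the banner and the column-header line.
        if len(fields) != 8 or not fields[0][0].isdigit():
            continue
        destination, _gateway, genmask = fields[:3]
        by_subnet[(destination, genmask)].append(fields[7])
    return {k: v for k, v in by_subnet.items() if len(v) > 1}


if __name__ == "__main__":
    for (dest, mask), ifaces in duplicate_subnets(ROUTE_N).items():
        print(f"{dest}/{mask} is claimed by: {', '.join(ifaces)}")
```

Interfaces are listed in table order, so for this table the only reported subnet is 172.16.37.0 with eth1 before eth2.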
If I add a host route for 172.16.37.12 via eth2 to the routing table, then both ARP resolution and ping work as intended. However, I cannot pre-program the host routes, because I first need to discover the connected machines: all I have are the active interfaces and their subnets.
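For reference, the host-route workaround can be sketched as follows. The post does not say which command was used to add the route; this sketch assumes the iproute2 form (ip route add ADDRESS/32 dev IFACE), and the helper name host_route_argv is hypothetical. It only builds and prints the command, since actually adding a route requires root.

```python
import ipaddress
import shlex


def host_route_argv(server_ip, iface):
    """Build the iproute2 command that pins one host to one interface."""
    # ip_address() raises ValueError on malformed input, so a typo
    # is caught before anything touches the routing table.
    addr = ipaddress.ip_address(server_ip)
    # max_prefixlen is 32 for IPv4, which makes this a single-host route.
    return ["ip", "route", "add", f"{addr}/{addr.max_prefixlen}", "dev", iface]


if __name__ == "__main__":
    # Printed rather than executed: adding the route needs root.
    print(shlex.join(host_route_argv("172.16.37.12", "eth2")))
```

This only helps once the peer address is known, which is exactly the limitation described above.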
Packets are not received on eth1 when testing eth2 (I was watching pretty much all of the interfaces). It's also worth noting that this problem does not happen with eth1: ping -I eth1 works correctly without a host route entry (because of route ordering, the -I option is redundant in this case, but in general I cannot rely on the order of the table).
Why isn't the application (the ARP cache in one case, ping in the other) receiving the packets? How can I track down where the data is going? Everything seems to be operating correctly right up until the packet should be delivered to the application.
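One possible starting point for tracking this down is to dump the per-interface IPv4 settings that can affect whether an incoming packet or ARP request is accepted. The sketch below reads rp_filter, arp_filter, arp_ignore and log_martians from /proc/sys/net/ipv4/conf. Whether any of them is actually responsible is an assumption on my part, not something the observations above establish; the function name read_ipv4_conf is made up for this example.

```python
from pathlib import Path

# Per-interface IPv4 settings that can influence whether an incoming
# packet or ARP request is accepted. Which of them (if any) matters
# here is an assumption, not something the post establishes.
KEYS = ("rp_filter", "arp_filter", "arp_ignore", "log_martians")


def read_ipv4_conf(ifaces, base="/proc/sys/net/ipv4/conf"):
    """Return {iface: {key: value-or-None}} read from procfs."""
    result = {}
    for iface in ifaces:
        settings = {}
        for key in KEYS:
            path = Path(base) / iface / key
            try:
                settings[key] = path.read_text().strip()
            except OSError:
                settings[key] = None  # interface or key not present
        result[iface] = settings
    return result


if __name__ == "__main__":
    # "all" is the kernel's global entry; eth1/eth2 are the two
    # interfaces from the diagram.
    for iface, settings in read_ipv4_conf(["all", "eth1", "eth2"]).items():
        print(iface, settings)
```

Comparing the values for eth1 and eth2 side by side at least shows whether the two interfaces are configured identically, which narrows down where to look next.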