16

I've got a robot running linux with wired and wireless adapters. When I boot up, it connects to the wireless fine. When I assign an IP to the wired (either statically or with DHCP), it looks like it works. As in, ifconfig shows a proper IP and route shows proper routes. However, when I do an ARP request of the wired IP, the ARP reply contains the wireless MAC.

??? There's no bridge running on the robot, so why don't I get the wired MAC???

When the wire is disconnected, the wired IP replies to ping...

Why is the robot replying over the wireless interface to IP requests on the wired???

EDIT: both the wired and wireless adapters are on the same IP subnet. I do an ARP request from a computer (tried with different computers) on the same IP subnet.

relevant ifconfig output:

eth0      Link encap:Ethernet  HWaddr 00:01:C0:04:BD:F7  
          inet addr:192.168.0.110  Bcast:192.168.0.255  Mask:255.255.255.0
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
ra0       Link encap:Ethernet  HWaddr 24:3C:20:06:3E:6D  
          inet addr:192.168.0.101  Bcast:192.168.0.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:59 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:31023598 (29.5 MiB)  TX bytes:85640627 (81.6 MiB)

relevant route output:

Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.0.0     0.0.0.0         255.255.255.0   U     0      0        0 ra0
192.168.0.0     0.0.0.0         255.255.255.0   U     0      0        0 eth0

It's a very cut-down Linux, so I don't have tools like arptables, iptables, sysctl, brctl, etc.

EDIT: diagram as requested

network diagram

EDIT: I am dumping traffic and looking at the ARP table. An ARP request of 192.168.0.110 returns an ARP reply containing 24:3C:20:06:3E:6D. The source MAC of the ARP reply packet is also 24:3C:20:06:3E:6D. I've tried fiddling with _filter, _ignore, and _announce, as mentioned here (http://serverfault.com/questions/22253/ubuntu-linux-multiple-nics-same-lan-arp-responses-always-go-out-a-single-ni), but to no avail.

EDIT: setting a gateway (on either interface) makes no difference (as it shouldn't).

EDIT: this worked fine on a previous version of the OS (based on openembedded). is it possible they changed something?

Jayen
  • maybe a diagram would be cool, and you could put a robot in it...extra cool points – The Unix Janitor Mar 15 '11 at 08:30
  • Are both the wired and wireless adapters on the same IP subnet? Where do you "do an ARP request" from? It might help to include the results of 'ifconfig' and show your routing table. – Dave Mar 16 '11 at 05:12
  • Did this ever get resolved? I'm seeing a similar issue, and have been thoroughly unable to find the resolution. – Kirk Nov 01 '11 at 21:24
  • my solution was to wait for an update to the distribution. – Jayen Nov 02 '11 at 01:03
  • i believe the kernel module for the wireless card was broken. – Jayen Nov 02 '11 at 01:36
  • @Jayen You can post your own solution as an answer to your question. – Skyhawk Nov 02 '11 at 14:28
  • Not really an expert, but it seems odd to me that both of your interfaces have the same value for metric. – nickgrim Nov 07 '11 at 11:54
  • @nickgrim it's the same on my laptop running debian wheezy. metric 1 in ifconfig. metric 0 in the routing tables. – Jayen Nov 07 '11 at 22:53
  • The question is *WHY* are you doing this... I'm guessing that you are wanting it so that when the robot is mobile it can talk, but when you plug in an Ethernet cable you can get high transfer speeds? If so, have you considered bonding the wired and wireless interfaces, putting them both on the same IP, and then configuring it so that if the wired is up, it gets priority, but if not traffic goes over the wireless? I used to set up my laptop this way and it worked great, but now I have 300Mbps wireless rather than 2Mbps, so I don't do that any more. – Sean Reifschneider Nov 08 '11 at 01:51
  • @SeanReifschneider that sounds like a great idea! i wish someone suggested that 8 months ago. can i make the bonded address come from dhcp? – Jayen Nov 08 '11 at 02:10
  • It's just a network interface, the address can come from DHCP or anything else you'd use to set the address. So you'd run DHCP on the "bond" interface rather than one of the other underlying interfaces. – Sean Reifschneider Nov 09 '11 at 03:36
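A rough sketch of the bonding setup described in the comments above. It assumes the kernel has the bonding driver and that ifenslave and a DHCP client (BusyBox udhcpc here) are present, which may not hold on a cut-down image. The name bond0 and the miimon=100 polling interval are illustrative choices, not values from this thread.

```shell
# Load the bonding driver in active-backup mode, preferring the wired link.
modprobe bonding mode=active-backup miimon=100 primary=eth0

# Bring up the bond and enslave both adapters to it.
ifconfig bond0 up
ifenslave bond0 eth0 ra0

# Request an address for the bond itself, not for eth0 or ra0.
udhcpc -i bond0
```

The wireless adapter still has to associate with the access point on its own, and some wireless drivers may need extra bonding options (for example around MAC address handling), so treat this as a starting point rather than a tested recipe.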

6 Answers

12

What you are seeing is normal behavior when you have two interfaces on the same network. It is described in this LWN article.

sciurus
  • setting arp_filter has no effect. why not? – Jayen Mar 17 '11 at 22:27
  • Since both your interfaces have routes to the local network, linux will send packets from either of your IPs out either of your interfaces. Thus it will answer ARP requests for either of your IPs from both interfaces. To change this, you must not only set arp_filter to 1, you must also enable source-based routing and set up routing tables that ensure that traffic for each IP goes out the interface you want. It's a slightly different scenario than what you have, but http://www.wlug.org.nz/SourceBasedRouting might help you. – sciurus Mar 18 '11 at 01:48
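The combination described in that comment (arp_filter plus source-based routing) might look roughly like the sketch below. It assumes the iproute2 `ip` tool and a kernel built with policy routing, neither of which is confirmed for the robot's image. The table numbers 100 and 101, the `run` helper, and the DRY_RUN switch are made up for illustration; the addresses are the ones from the question.

```shell
#!/bin/sh
# Sketch: per-source routing tables so each IP leaves via its own interface.
# Table numbers 100 and 101 are arbitrary choices, not from the original post.
# Set DRY_RUN=1 to print the commands instead of running them.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

setup_source_routing() {
    # Traffic sourced from the wired address uses table 100 (eth0 only).
    run ip route add 192.168.0.0/24 dev eth0 src 192.168.0.110 table 100
    run ip rule add from 192.168.0.110 table 100
    # Traffic sourced from the wireless address uses table 101 (ra0 only).
    run ip route add 192.168.0.0/24 dev ra0 src 192.168.0.101 table 101
    run ip rule add from 192.168.0.101 table 101
    # Answer ARP only on the interface the kernel would route that IP out of.
    run sh -c 'echo 1 > /proc/sys/net/ipv4/conf/all/arp_filter'
}

# Usage (as root): setup_source_routing
```

Running it with DRY_RUN=1 first shows exactly which commands would be issued, which is handy on a device where a bad route can cut off your only connection.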
4

When you say you get an ARP response for the wrong interface, are you actually dumping traffic or just looking at the resulting ARP table? It's possible you're getting ARP replies for both interfaces...

Anyway, I believe the answer to your problem lies in properly manipulating rp_filter and arp_filter. The documentation for each of them is included below.

I suggest first trying this:

echo 1 > /proc/sys/net/ipv4/conf/all/arp_filter

You may need to make this change as well:

echo 0 > /proc/sys/net/ipv4/conf/all/rp_filter
rp_filter - BOOLEAN
    1 - do source validation by reversed path, as specified in RFC1812
        Recommended option for single homed hosts and stub network
        routers. Could cause troubles for complicated (not loop free)
        networks running a slow unreliable protocol (sort of RIP),
        or using static routes.

    0 - No source validation.

    conf/all/rp_filter must also be set to TRUE to do source validation
    on the interface

    Default value is 0. Note that some distributions enable it
    in startup scripts.

arp_filter - BOOLEAN
    1 - Allows you to have multiple network interfaces on the same
    subnet, and have the ARPs for each interface be answered
    based on whether or not the kernel would route a packet from
    the ARP'd IP out that interface (therefore you must use source
    based routing for this to work). In other words it allows control
    of which cards (usually 1) will respond to an arp request.

    0 - (default) The kernel can respond to arp requests with addresses
    from other interfaces. This may seem wrong but it usually makes
    sense, because it increases the chance of successful communication.
    IP addresses are owned by the complete host on Linux, not by
    particular interfaces. Only for more complex setups like load-
    balancing, does this behaviour cause problems.

    arp_filter for the interface will be enabled if at least one of
    conf/{all,interface}/arp_filter is set to TRUE,
    it will be disabled otherwise

For a more thorough treatment, see this article:

http://www.embedded-bits.co.uk/tag/rp_filter/
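On a system without sysctl, the current values of these settings can still be read straight from /proc. A minimal sketch, assuming a POSIX shell; the function name show_arp_knobs and the CONF_ROOT override are hypothetical, added only so the loop can be tried against a scratch directory:

```shell
#!/bin/sh
# Print the ARP-related settings for every interface, without needing sysctl.
# CONF_ROOT is a hypothetical override (useful for trying this on a scratch directory).
show_arp_knobs() {
    conf_root="${CONF_ROOT:-/proc/sys/net/ipv4/conf}"
    for dir in "$conf_root"/*/; do
        ifname=$(basename "$dir")
        for knob in arp_filter arp_ignore arp_announce rp_filter; do
            f="$dir$knob"
            # Skip knobs this kernel does not expose.
            [ -r "$f" ] && echo "$ifname $knob=$(cat "$f")"
        done
    done
    return 0
}

# Usage: show_arp_knobs
```

Checking the output before and after each echo makes it easy to confirm that a write actually took effect on the interface you intended.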

Insyte
  • I am dumping traffic and looking at the ARP table. An ARP request of 192.168.0.110 returns an ARP reply containing 24:3C:20:06:3E:6D. The source MAC of the packet is also 24:3C:20:06:3E:6D. I've tried both of your suggested filter settings, but to no avail. I've also tried playing with _ignore and _announce, as mentioned [here](http://serverfault.com/questions/22253/ubuntu-linux-multiple-nics-same-lan-arp-responses-always-go-out-a-single-ni). – Jayen Mar 17 '11 at 01:01
4

I know this is an old issue but I recently encountered the exact same situation with an embedded device. The device has both an ethernet and wifi interface and the requirements are that both interfaces can be active and on the same network at any time, but network traffic must be routed through the "preferred" interface.

Most users wouldn't configure their devices this way but in theory it should be possible.

We first picked up the issue with Netgear routers because they would report an IP address conflict: two MAC addresses were sharing a single IP. Apparently the router would start behaving badly in this scenario and mess up the user's network.

I created a private network that only contained the router (Ethernet + wifi), a Windows laptop (Ethernet only), and the embedded device (Ethernet + wifi). Using Wireshark, tcpdump on the device, and arp on Windows, I can see the following behaviour:

  1. ifconfig on the device shows distinct wln and eth0 IPs and distinct MAC addresses
  2. Sometimes (very rarely) an arp -a from Windows shows the correct IP-MAC combination.
  3. Most of the time arp -a from Windows shows the wln and eth0 IPs mapped to the same MAC address
  4. When pinging either wln or eth0 from Windows, the ping response comes from wln and very rarely from eth0. tcpdump shows the wln only responded to 1 of the 4 pings (for example)
  5. When Windows sends an arp “who has” message for the eth0 IP, both the eth0 and wln interfaces respond saying they have that IP

I believe that item 3 is caused by item 5. The arp tables are being messed up because the wln is responding to arp messages that only eth0 should respond to. I believe item 4 is also caused by item 5. Ping is sent based on MAC address and since the last arp message received was from wln saying it has the eth0 IP, the pings are routed incorrectly to the wln interface.
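The symptom in item 3 can also be checked from a Linux test host instead of Windows. The sketch below reads the host's ARP cache from /proc/net/arp and reports whether two IPs map to the same MAC. The function names and the ARP_TABLE override are hypothetical, and the addresses in the usage line are the ones from the question, used purely as placeholders.

```shell
#!/bin/sh
# Look up the MAC address cached for an IP in a Linux ARP table.
# ARP_TABLE is a hypothetical override; the default is the kernel's /proc/net/arp.
mac_for_ip() {
    awk -v ip="$1" 'NR > 1 && $1 == ip { print $4 }' "${ARP_TABLE:-/proc/net/arp}"
}

# Report whether two IPs currently resolve to the same MAC (the symptom in item 3).
same_mac() {
    a=$(mac_for_ip "$1")
    b=$(mac_for_ip "$2")
    if [ -n "$a" ] && [ "$a" = "$b" ]; then
        echo "conflict: $1 and $2 both map to $a"
        return 0
    fi
    echo "ok: $1 -> ${a:-unresolved}, $2 -> ${b:-unresolved}"
    return 1
}

# Usage, after pinging both addresses from the test host:
#   same_mac 192.168.0.110 192.168.0.101
```

Because the ARP cache only holds the most recent reply, it is worth repeating the check a few times; a capture on the wire remains the more reliable evidence.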

After much digging and testing the solution was actually really simple. See this article - https://chrisdietri.ch/post/preventing-arp-flux-on-linux/

By default, the Linux kernel replies to an ARP request for any IP address configured on the host, even if the request arrives on a different interface from the one that holds that address.

This setting resolves the issue:

echo 1 > /proc/sys/net/ipv4/conf/wln/arp_ignore
echo 1 > /proc/sys/net/ipv4/conf/eth0/arp_ignore

Explanation:

arp_ignore - INTEGER
    Define different modes for sending replies in response to
    received ARP requests that resolve local target IP addresses:
    0 - (default): reply for any local target IP address, configured
    on any interface
    1 - reply only if the target IP address is local address
    configured on the incoming interface
    2 - reply only if the target IP address is local address
    configured on the incoming interface and both with the
    sender's IP address are part from same subnet on this interface
    3 - do not reply for local addresses configured with scope host,
    only resolutions for global and link addresses are replied
dev-rowbot
  • Very informative reply, but as the OP said, he tried that and linked to the similar answer http://serverfault.com/a/30648/57200 – Jayen Jan 23 '15 at 21:42
  • The link that you're pointing to is no longer valid. A similar working link is: https://chrisdietri.ch/post/preventing-arp-flux-on-linux/ Thanks for the answer. – Lethargos Apr 05 '22 at 13:30
  • @Lethargos - thank you, I have updated the post – dev-rowbot Apr 06 '22 at 10:43
1

As this worked fine on a previous version of the OS (based on openembedded), my solution was to wait for the next version of the OS. My best guess was that the wireless kernel module was buggy.

Jayen
  • Unlikely, since the behavior you are experiencing is expected as mentioned by @sciurus. It may be that the previous release where it "worked fine" was the buggy one and they have fixed it :-). Actually, it's just whichever one responds last is the one that sticks in the ARP table on the remote end. Since wireless is likely slower than wired, you're going to get wireless. – Sean Reifschneider Nov 08 '11 at 01:47
0

Following up on Insyte's comment.

Let's do some naming:

  • PC1 - Top Right
  • PC2 - Top Left
  • PC3 - Bottom Left

Your robot is reachable from all three PCs over both the wired and the wireless medium. Because both interfaces are on the same subnet, you cannot tell for sure which path an ARP request for the wired IP took. When the switch broadcasts an ARP request, the robot receives it on both interfaces [referring to your diagram]. Since the request for the wired IP also arrives on the wireless interface, and the box does have that IP configured, chances are it replied with the physical address of the wireless interface.

I have had this issue in the past; it wasn't identical to yours, but it was similar. By default, Linux replies with the physical address of the interface on which it receives the ARP request, regardless of which interface the IP is configured on. So in your case, if you connect PC3 directly to the robot's eth0 interface and do an ARP request for 192.168.0.101, it would reply with the physical address of eth0 instead of ra0.

My deployment scenario was:

[RTR] |------------eth0---[server]
|--------| switch1 |-----eth1-----[server]

It's the same switch that both interfaces connect to. Hopefully this helps you.

The router had primary and secondary IP addresses configured on its interface, one for each of the two networks on the server's two interfaces. But on receiving an ARP request on eth1 for the IP address of eth0, the server replied with the physical address of eth1.

To prevent this, the following has worked for me so far:

# echo 2 > /proc/sys/net/ipv4/conf/eth0/arp_announce
# echo 1 > /proc/sys/net/ipv4/conf/eth0/arp_ignore
# echo 2 > /proc/sys/net/ipv4/conf/ra0/arp_announce
# echo 1 > /proc/sys/net/ipv4/conf/ra0/arp_ignore

Put it somewhere on your robot so that it is applied at boot.
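One way to package those four writes for a boot script is sketched below. The function name apply_arp_settings and the CONF_ROOT override are hypothetical, and which init script should call it depends on the distribution, so that part is left open.

```shell
#!/bin/sh
# Apply the arp_announce/arp_ignore settings above to a list of interfaces.
# CONF_ROOT is a hypothetical override for trying this on a scratch directory.
apply_arp_settings() {
    conf_root="${CONF_ROOT:-/proc/sys/net/ipv4/conf}"
    status=0
    for ifname in "$@"; do
        if [ -d "$conf_root/$ifname" ]; then
            echo 2 > "$conf_root/$ifname/arp_announce"
            echo 1 > "$conf_root/$ifname/arp_ignore"
        else
            # Interface not present yet (e.g. wireless module not loaded).
            echo "no such interface: $ifname" >&2
            status=1
        fi
    done
    return "$status"
}

# Usage from an init script, as root: apply_arp_settings eth0 ra0
```

The per-interface directory only appears once the driver has registered the device, so the call probably needs to run after the wireless module is loaded.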

Recommendations: I would recommend configuring two different subnets (say 192.168.1.x/24 on ra0 and 192.168.2.x/24 on eth0). You can use IP aliases on your PCs, and your robot would then be accessible over either of the two IPs. You cannot have two outbound paths for the same subnet on the same host unless something makes your robot prefer one over the other. Your robot can only take one path to send packets out.

Some readings: arp_announce, arp_ignore

Gaumire
  • arp_announce & arp_ignore didn't work for me. see my comment on insyte's solution. Two different subnets is unfortunately, not an option. Also, as per the last edit of my question, this worked fine on a previous version of the OS, so I can have two outbound paths for same subnet on a same host. – Jayen Nov 02 '11 at 22:10
-2

i think there is a misconfiguration between your wireless AP and your switch. switch and AP are getting confused where to send packets. not sure about this though. also, i think you should try to define a gateway where programs can know where to send packets. something like

route add default gw 192.168.0.1

fmysky
  • laptops on both the wired and wireless work fine, so it's not the AP or switch. also, a gateway is only necessary when i'm trying to get outside the subnet. (i tried it anyway, and it doesn't change anything. doesn't matter if i add the gateway to eth0 or ra0.) – Jayen Mar 17 '11 at 04:23
  • there is no RX/TX on your eth0, indicates some config. error? – fmysky Mar 17 '11 at 21:47
  • hrm, i guess that's true. eth0 should at least receive the ARP request, since it's a broadcast, right? – Jayen Mar 17 '11 at 22:28
  • yes, thats what i thought – fmysky Mar 17 '11 at 23:16
  • arp problem in that lwn article seems to affect only 2.4.x kernels – fmysky Mar 17 '11 at 23:26
  • can you try changing your MAC address of your wired adapter of robot to that of wireless adapter? – fmysky Mar 18 '11 at 01:27