If I remove one network connection, shouldn’t my UNIX host just use the other?

An interesting problem.

If there are two network connections attached to my MacBook Pro, and I remove one, shouldn’t the networking just use the other?

Well, it didn’t last night - when I went to bed, my wife asked about an ethernet cable I had run through the kitchen, and thinking that the wireless connection would handle the traffic if the ethernet was removed, I told her just to remove the ethernet cable, retract the line out of the kitchen, shut the door, and go to bed, as I rolled over... :-D

This morning, Mac Mail, and other tools (Safari) couldn’t get to the Internet, even though there was a wireless connection to my WiFi router. So, I naively restarted both Mail and Safari, and they still couldn’t see the Internet.

So, I opened Network in System Preferences, and changed the order so that Wi-Fi came first, not second, as in:

[OSX 10.9.2] System Preferences --> Network

And immediately, Safari and Mail were re-connected to the Internet.

I thought that TCP/IP would simply use the network interface which is available if the other is not?

This leads me to a more general question: How do I divide the network traffic up to take advantage of both network interfaces, thus increasing my throughput?

Billy McCloskey

Posted 2014-04-26T14:05:44.133

Reputation: 1 487

I can do some script trickery to detect which interfaces are actually working and issue the necessary commands to up/down the interface(s). My background is safety critical embedded system programming. From what I understand, TCP/IP was designed with these kinds of outages in mind given the KAOS (Max?) of a battlefield scenario, but I gave too much credit to these particular safety critical engineers. Anti-missile missiles and unmanned space planes simply cannot fail; I thought complete fault tolerance was a design goal and specification for TCP/IP. Use working routes? TMTOWTDI I'm sure. – Billy McCloskey – 2014-04-27T14:50:33.923

Answers

Great question btw! I have had a similar situation happen to me back when I was working for Intel, but I do not remember the interfaces not automatically switching over when one of them was disconnected.

My answer is regarding your general question

This leads me to a more general question: How do I divide the network traffic up to take advantage of both network interfaces, thus increasing my throughput?

Now, if you had two Ethernet interfaces (as opposed to Ethernet and WiFi), you could use a technology known as Link Aggregation to 'bond' two similar interfaces to make your computer think it was a single Ethernet connection. The network protocol that provides this functionality is called LACP (link aggregation control protocol). Unfortunately, in order to do Link Aggregation, not only does the dual NIC need to support this functionality, but the gear you are connecting to also needs to support LACP and have all of its settings be the same as on the client side. This feature is common on high end routers and switches, but most consumer grade gear does not support this.

Since your WiFi adapter and your Ethernet adapter are two totally separate interfaces, you cannot simply join the two together and get increased throughput. This is also due to the fact that the theoretical link speeds of Wireless (varies depending on technology used) and Ethernet (10/100/1000) are quite different. According to Wikipedia:

In most implementations, all the ports used in an aggregation consist of the same physical type, such as all copper ports (10/100/1000BASE‑T), all multi-mode fiber ports, or all single-mode fiber ports. However, all the IEEE standard requires is that each link be full duplex and all of them have an identical speed (10, 100, 1,000 or 10,000 Mbit/s).

Many switches are PHY independent, meaning that a switch could have a mixture of copper, SX, LX, LX10 or other GBICs. While maintaining the same PHY is the usual approach, it is possible to aggregate a 1000BASE-SX fiber for one link and a 1000BASE-LX (longer, diverse path) for the second link, but the important thing is that the speed will be 1 Gbit/s full duplex for both links. One path may have a slightly longer transit time but the standard has been engineered so this will not cause an issue.
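
To make the rule in that quote concrete, here is a minimal Python sketch of the check it describes: every link full duplex, all links the same speed. The `Link` class and `can_aggregate` function are names I made up for illustration, not part of any real LACP implementation, and modeling WiFi as a half-duplex link is my own simplification.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Link:
    name: str          # label only, e.g. "en0" (hypothetical)
    speed_mbit: int    # nominal link speed in Mbit/s
    full_duplex: bool  # True if the link is full duplex


def can_aggregate(links: List[Link]) -> bool:
    """Apply the rule from the quote: full duplex and identical speed on every link."""
    if len(links) < 2:
        return False  # nothing to aggregate
    same_speed = len({link.speed_mbit for link in links}) == 1
    return same_speed and all(link.full_duplex for link in links)
```

With two gigabit full-duplex Ethernet links this returns True; swap one of them for a link marked half duplex, or one running at a different speed, and it returns False.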

Richie086

Posted 2014-04-26T14:05:44.133

Reputation: 4 299

Well, the theoretical limit and what I have should be in the same ballpark - gigabit wired and gigabit wireless (802.11ac), or so the advertising claims. ;-) Thank you for the nice reply. This is the gravy part of my question. I do want to find out why the router just didn't split my traffic, even if just a trickle, over the other, extant network interface. All or nothing seems a little Extreme, and I'm not even using an AirPort, anymore... :-D – Billy McCloskey – 2014-04-26T15:26:09.757

So, the actual speed of 802.11ac depends on a wide range of factors (mainly the number of antennas your router has, how many streams it uses, etc). I think the primary issue with not being able to bond the two interfaces comes down to the fact that they both use different mediums to transmit. – Richie086 – 2014-04-28T15:24:01.420

I just tested failover with my Mac mini. I have two network interfaces configured, Ethernet and WiFi. Both were in good standing as the test commenced, with the Ethernet having precedence in service order.

I repeated the test a few times, looking at different network indicators. Failover to WiFi happened as one would expect. One ping packet dropped on disconnect; none dropped on reconnect.

`en0` -  Built-in Broadcom Gigabit Ethernet  
`en1` -  Built-in Apple Wireless Network Adapter

Console.app reports these messages when I pull the RJ-45 out:

2014-04-26 1:14:40.000 PM kernel[0]: AppleBCM5701Ethernet [en0]: Link down (womp disabled, proxy idle)
2014-04-26 1:14:41.267 PM configd[54]: network changed: v4(en1:192.168.2.22, en0-:192.168.2.122) DNS! Proxy! SMB

One ping was dropped.

Upon reconnecting the Ethernet cable, these messages were logged:

2014-04-26 1:14:47.000 PM kernel[0]: Ethernet [AppleBCM5701Ethernet]: Link up on en0, 1-Gigabit, Full-duplex, Symmetric flow-control, Debug [796d,2321,0de1,0300,cde1,3c00]
2014-04-26 1:14:47.901 PM configd[54]: network changed: v4(en0+:192.168.2.122, en1) DNS! Proxy! SMB

No dropped packets.

A route monitor showed a flurry of routes being changed.

In summary: On Mac OS X, Version 10.9.2, running on a Mid-2011 Mac Mini, failover works as expected.

So, why mightn't this have happened for you..? I thought one reason might be a Thunderbolt transceiver dongle not reporting a carrier drop to the kernel, but in your screen capture, it seems like the system is aware there is no Ethernet connection.

Is the problem repeatable, and if so, what messages get logged?

Are the logs from this particular event still accessible?

Nevin Williams

Posted 2014-04-26T14:05:44.133

Reputation: 3 725

My configuration is MBP <--> Thunderbolt/Ethernet dongle <--> Cat6 <--> gigabit switch (a) <--> (b) Cat5e <--> ASUS RT-AC68U (Router). The severed connection was between (a) and (b). Mac OS X, 10.9.2, Late-2013 MBP (upgraded everything). I'm working on getting a functioning test case, or isolating the event in syslog. Stay tuned. – Billy McCloskey – 2014-04-26T22:00:26.127

You narrowed my Q/A in that by describing what I have, & trying to recreate the problem, I noticed that by disconnecting the thunderbolt dongle from the MBP, the network failover'd immediately. However, when the network is severed after the gigabit switch, the failover fails. :-D Also, I notice that once the computer receives a self-assigned IP after not getting one while a/b are disconnected, after I reconnect a/b, the self-assigned IP disappears, and the MBP re-negotiates for a new IP via DHCP. If a/b disconnected, ethernet keeps previously assigned IP, I guess, as expected. – Billy McCloskey – 2014-04-26T23:09:49.577

I assigned each of my interfaces' MAC-addrs different IPs in my router's DHCP config; Both have an IP on the same network, but the wifi routes are tagged 'I'-nactive in the route tables. I'd done this some time ago. I suppose if the wifi didn't have its lease, it'd have dropped more than one packet on failover. If configured, the secondary route may be activated upon expiry of the ARP cache. 'sysctl -a net.link' will give you a list of (possibly) tuneable parameters. Their exact functions and timescales would have to be googled. Also, shorter lease times...? – Nevin Williams – 2014-04-27T07:31:08.543

I will indeed check into those parameters, and publish anything that will benefit the community, once it is determined what that is. – Billy McCloskey – 2014-04-27T15:03:11.160

+1 for "Is the problem repeatable?" – MattBianco – 2014-04-29T13:53:23.060

I'm not very familiar with Macintosh computers, but I am with Ethernet and TCP/IP networking. Other than using some kind of aggregation/link sharing/splitting scheme, the answer to your "why didn't it use the other link" question is.. it depends on what interface(s) the applications in question are bound to, if they can find another route, and so on.

A few concepts I see a lot of people new to networking have trouble with..

Ethernet is NOT TCP/IP. TCP/IP is NOT Ethernet.
IP Addresses are NOT assigned to your computer; They're assigned to your network interface card's MAC address.
Data is sent over the network by MAPPING an IP address to the matching MAC address (and back again), using the ARP--Address Resolution Protocol.
Within any single segment of a network, every address, whether it be an Ethernet MAC address or a TCP/IP address, MUST be unique. Think of it like your telephone--Imagine how annoyed you'd get if you had to share your phone with your neighbors so that they had the same phone number as you. The Telco would have to ring you both, and then you and your neighbor would have to figure out who was calling, and for which of you.
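
If it helps, the IP-to-MAC mapping can be pictured as a simple lookup table. This is only a toy Python sketch: the addresses are made-up examples, and `resolve` is a hypothetical helper. A real stack sends an ARP request on the wire when the cache has no entry, instead of just giving up.

```python
from typing import Dict, Optional

# Toy ARP cache: IP address -> MAC address. All values are made-up examples.
ARP_CACHE: Dict[str, str] = {
    "192.168.2.1": "aa:bb:cc:00:00:01",    # hypothetical router
    "192.168.2.122": "aa:bb:cc:00:00:02",  # hypothetical wired interface
    "192.168.2.22": "aa:bb:cc:00:00:03",   # hypothetical wireless interface
}


def resolve(ip: str, cache: Dict[str, str] = ARP_CACHE) -> Optional[str]:
    """Return the cached MAC for ip, or None on a cache miss."""
    return cache.get(ip)


def all_unique(cache: Dict[str, str]) -> bool:
    """Check the 'every address must be unique' rule for the MACs in the table."""
    return len(set(cache.values())) == len(cache)
```

Note that the wired and wireless interfaces show up as two separate entries, which is the point made in the next paragraph.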

With the above in mind, it should start to make sense why you lost "connectivity".. Your wired connection has one MAC address which is mapped to one IP address, and your wireless has a different, unique MAC/IP address pair. Between your computer and your router, these pairs of MAC/IP are used to identify where any network data should be sent.. BUT.. if one link fails, neither your computer nor your router has an easily feasible way to suddenly "change" the source/destination address pairs after the fact.

Think of it like the post office. Someone writes you a letter, and they put your name and address on the envelope. What if you moved? Without doing anything to redirect your mail, it would still get delivered to your old address and you'd never see it. Of course, as I said, you could have it redirected/forwarded, and that's what some aggregation schemes do--but they require additional hardware or software to do it, just like the post office needs additional workers/machines to sort and redirect your mail. It's not something that just happens on its own.

Many applications also "bind" to a specific network interface--Without a binding, it would be like telling a stranger to "call me tomorrow" without giving them the number to call. They'd have no way to know how to contact you. Likewise, your mail client needs to tell the mail server what IP address to send any data back to you. It usually happens transparently because the source and destination addresses are embedded in the packet headers.
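
As a rough illustration of binding, here is a small Python sketch that pins a socket to one local address before it is used. `bound_source_address` is a name I invented for the example, and the loopback address is only there so it runs anywhere; on a real machine you would pass an address that belongs to the interface you care about. Whether packets then actually leave through that interface still depends on the OS routing rules, so treat this as a sketch, not a guarantee.

```python
import socket


def bound_source_address(local_ip: str) -> str:
    """Bind a TCP socket to local_ip and report the address it is bound to."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((local_ip, 0))  # port 0 lets the OS pick any free port
        return sock.getsockname()[0]


if __name__ == "__main__":
    # Loopback is used here only because every host has it.
    print(bound_source_address("127.0.0.1"))
```

A socket bound this way keeps that source address for its whole life, so if the interface that owns the address goes away, the connection cannot simply move to the other one.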

Now, given what you described, not all of the above applies to you.. In a properly configured system, most outgoing connections should indeed find any available interface to be able to send network traffic on, and as routers and switches become more sophisticated, automatic translation of addresses happens without you needing any additional hardware/software. But there are still other issues to contend with; One of them is interface metric (priority), which is what it sounds like was your original problem.

Without being told that it can use the other interface, your computer only tried the first interface in your list. For all it knows, the second interface could be linked to a nuclear launch facility and any data will start WW3.. Not a good idea, even if extreme :-P Anyhow, the point is, without you configuring it to do so, your computer does not have a real brain to figure out the other connection will work just as well as the first. That's why it started working after you properly configured it to "failover".
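
The priority idea can be sketched in a few lines of Python. This is my own simplified model, not how any particular OS implements it: `pick_interface` and the `usable` flags are hypothetical, and the interface names are just examples.

```python
from typing import Dict, List, Optional


def pick_interface(service_order: List[str], usable: Dict[str, bool]) -> Optional[str]:
    """Return the first interface in service_order that is marked usable."""
    for name in service_order:
        if usable.get(name, False):
            return name
    return None  # nothing usable at all


if __name__ == "__main__":
    order = ["en0", "en1"]  # wired first, wireless second (example order)
    print(pick_interface(order, {"en0": False, "en1": True}))
```

One caveat with a model like this: if "usable" only means "the cable has link", then an interface plugged into a live switch stays the first choice even when the path beyond that switch is broken. That is only a guess at what could have happened in your case, but it shows why changing the order made a difference.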

Again, as computers and OSes become more sophisticated, more and more of them DO come out of the box pre-configured to do this sort of thing automatically these days.. but understanding the underlying issues and configurations will help you understand and correct problems when they do occur. Likewise, I'm not entirely sure on how Macintosh computers are set up to handle these situations, but I understand the newer un*x-based ones of the past few years generally have solid iface up/down handling and logic.

Another possible issue/cause of your exact symptoms may be in the routing tables. Generally speaking, when the TCP/IP stack tries to route a data packet, it looks up the destination address in the routing table. If it finds an exact match, it sends the packet out the assigned interface. If it doesn't find an exact match, but finds a close/masked match, it will send the packet to that assigned interface. If it finds no match at all, it will send the packet to whatever is the "default gateway" to let the gateway decide how to route the packet.
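
The lookup described above is usually called longest-prefix matching, and it can be sketched in Python with the standard `ipaddress` module. The table contents and the `lookup` function are invented for this example; a real routing table also carries gateways, flags and metrics, which I have left out. Here the default gateway is modeled as the `0.0.0.0/0` entry, which matches everything but loses to any more specific route.

```python
import ipaddress
from typing import List, Tuple

Route = Tuple[str, str]  # (destination network, outgoing interface)

# Made-up example table.
ROUTES: List[Route] = [
    ("0.0.0.0/0", "en0"),         # default route
    ("192.168.2.0/24", "en1"),    # masked (subnet) match
    ("192.168.2.122/32", "lo0"),  # exact (host) match
]


def lookup(dest: str, routes: List[Route] = ROUTES) -> str:
    """Return the interface of the most specific route that contains dest."""
    addr = ipaddress.ip_address(dest)
    best = None  # (network, interface) of the best match so far
    for network, iface in routes:
        net = ipaddress.ip_network(network)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, iface)
    if best is None:
        raise LookupError(f"no route to {dest}")
    return best[1]
```

In this table, `192.168.2.122` hits the host route, `192.168.2.50` falls back to the /24, and anything else, such as `8.8.8.8`, goes out via the default route.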

However, there can only be one default gateway! This is because the computer (well, your computer's TCP/IP stack) is unable to determine where to send the packet, so it needs to send it to the gateway to make that determination. However, if you had two default gateways--your computer would not know which one to send it to, since it's not able to determine where to send the packet in the first place! Your computer would have a nervous breakdown trying to decide, and eventually divide by zero, which we know would not be good..

Again, how you configure the system would affect how it responds when the link assigned to the default gateway suddenly goes down. But that brings us back to configuration--Your computer would need to know that if the default gateway becomes inaccessible, it should switch to the default gateway of the other iface. It sounds like that's what it eventually did once you changed the priority of the NICs, which is why it started working again.

Sorry if this all is clear as mud! But to help you diagnose these kinds of issues, learn how to use some of the network diagnostic tools such as 'arp' (tells you what MAC is bound/mapped to an IP address, or vice versa) and 'route' (will tell you how it'll try to route packets based on destination address).

C. M.

Posted 2014-04-26T14:05:44.133

Reputation: 687