I am managing a small handful of newish Dell laptops and desktops that use similar ethernet hardware -- Intel I217-LM for the desktops and Intel I218-LM for the laptops. These all are running the same Intel driver in Windows 7, currently "Intel(R) PROSet Version: 18.1.59.00", or driver version "12.11.77.0, 2/11/2014" (not sure what the two versions are for, but whatever).
These machines are having problems dropping packets to a server a few hops away on our local campus network. My diagnostic tool for these issues has been running a simple ping -t -l 3500 targetserver01 for a few hours at a time, and comparing the number of dropped packets with a control.
What I find is that these new machines are dropping dozens of packets per hour, while an ancient desktop next door drops almost none. The last trial I ran had this old desktop drop 14 packets over 2.5 hours, while all of the newer machines dropped between 110 and 130 over the same period of time. Even running the same laptops on wifi has them drop fewer packets than when they are using ethernet. I've also controlled for network infrastructure -- I am 100% sure (+/- 10%) at this point that the variable coincident with this issue is the Intel ethernet driver on Windows, and this is proved by booting one of the laptops into Ubuntu on a USB stick. When running the default Ubuntu driver on the same exact Intel chipset, the issue disappears and packet loss rates are back in line with the "old desktop" control.
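To put those counts in perspective, here is a minimal sketch that converts them into loss rates. It assumes ping -t sends roughly one echo request per second (so a 2.5-hour run is about 9000 requests); that pace is an assumption, and timeouts make the real count slightly lower. The function name is made up for illustration.

```python
# Rough loss-rate estimate from the counts quoted above.
# Assumption: ping -t sends about one echo request per second,
# so a 2.5-hour run is roughly 9000 requests.

def loss_percent(lost: int, hours: float, pings_per_second: float = 1.0) -> float:
    """Return packet loss as a percentage of the estimated requests sent."""
    sent = hours * 3600 * pings_per_second
    return 100.0 * lost / sent


# Counts taken from the trial described above (2.5 hours each).
for label, lost in [("old desktop", 14), ("new Dell, low end", 110), ("new Dell, high end", 130)]:
    print(f"{label}: {loss_percent(lost, 2.5):.2f}% loss")
```

Under that assumption the old desktop sits well below 1% loss, while the newer machines land a little above 1%.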
I've tried playing with all the settings I can get my hands on in the driver settings in Device Manager, but to no avail. These machines are required to run Windows software, so I can't just install Linux on all of them. The best workaround I have at this point is to buy USB-Ethernet adapters for all of these machines to use instead of the built-in interfaces, but I figure there's got to be a better way since the issue is with the driver software, not the interface itself.
I found this question, which seems to indicate I'm not going to find "generic" drivers for this Intel chipset:
Generic Ethernet drivers for WINDOWS
So, what are my options? Does this warrant further research, perhaps by running Wireshark on the affected machines? Does Intel take user feedback?
EDIT:
The new drivers haven't made any difference. All of the Windows 7 machines I have access to at the moment are affected, with the exception of a virtual machine running on an iMac. The virtual machine does NOT experience the ~1% packet loss issue.
Next steps are to simultaneously test ping every router between this office and the server (there are only 2), read up on how to use NTttcp, and find another native Windows 7 machine that uses a different chipset. I'll report back.
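A rough sketch of how the simultaneous per-hop test could be scripted, so that every hop is measured over the same time window. It assumes Python 3.7+ on the Windows machine and the English-locale summary line that Windows ping prints ("Packets: Sent = ..., Received = ..., Lost = ..."). The function names and the router hostnames in the usage comment are hypothetical placeholders; only targetserver01 comes from the command above.

```python
import re
import subprocess
from concurrent.futures import ThreadPoolExecutor

# Matches the summary line of English-locale Windows ping output (assumption).
SUMMARY = re.compile(r"Sent = (\d+), Received = (\d+), Lost = (\d+)")


def parse_summary(output):
    """Return (sent, received, lost) from ping output, or None if no summary is found."""
    match = SUMMARY.search(output)
    if match is None:
        return None
    sent, received, lost = map(int, match.groups())
    return sent, received, lost


def ping_host(host, count=100, size=3500):
    """Run Windows ping once against a host: -n is the request count, -l the payload size."""
    result = subprocess.run(
        ["ping", "-n", str(count), "-l", str(size), host],
        capture_output=True,
        text=True,
    )
    return host, parse_summary(result.stdout)


def ping_all(hosts, count=100, size=3500):
    """Ping every host at the same time and return {host: (sent, received, lost)}."""
    with ThreadPoolExecutor(max_workers=len(hosts)) as pool:
        return dict(pool.map(lambda h: ping_host(h, count, size), hosts))


# Example usage (hypothetical router names, replace with the real hops):
# results = ping_all(["router1", "router2", "targetserver01"], count=3600)
# for host, stats in results.items():
#     print(host, stats)
```

If loss shows up at the second router and the server but not at the first router, that would point at the link between the two routers rather than at the client's driver.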
Oh, also I did a trial with a USB-Ethernet dongle and got the same approximate 1% packet loss. So now I'm just weirded out.
New question: How can I keep myself from slowly descending into madness?
2nd Edit:
This is finally starting to look like a network issue after all. Still haven't had the chance to explore some of the suggested analysis tools (but thank you for that), but performance on previously unaffected machines has started to degrade over the past 24 hours -- and now my testing has narrowed this down to anything past the first switch in the route -- pings to a machine on the same switch and in the same office succeed with 0% packet loss.
Why are you dropping 0.1% - 1.5% of packets over cable in the first place? I just ran a stress test and lost 0 packets out of 10000, while you lost 14 in your "good" attempt. – LatinSuD – 2014-06-12T17:22:21.723
I lack the networking chops to answer that question -- my network admin says the switches are not reporting any errors. I also agree that my method of testing -- pinging an application server -- is somewhat barbaric, but it's the >1% packet loss that I'm worried about here, not the .1%. – NReilingh – 2014-06-12T17:26:10.630
Some instruction on a more refined manner of testing these things would be appreciated as well. – NReilingh – 2014-06-12T17:28:17.127
You need to use a different ping tool which allows more than 1 pps. I personally use Cygwin's ping or Linux's ping, but you may try psping (I haven't tried it personally). – LatinSuD – 2014-06-12T17:33:10.933
Interestingly, I can run a 10000-packet flood ping to this same server over the same network, and get 0 packet loss completed in a few seconds -- but this is from an iMac. Perhaps the 14 packets lost from the old desktop are short losses of connectivity that happened over time. Still doesn't explain the drastically higher loss rate on this newer Dell hardware. – NReilingh – 2014-06-12T17:43:55.287
Have you also tested the connection between those new computers as well i.e. on the local network? I would also suggest using a more proper testing tool, perhaps NTttcp which even Dell recommends. – Cristian Ciupitu – 2014-06-13T11:20:40.857
There is a newer driver available at Intel since April: Version 19.1, might give it a try. Also have a look in the Event Viewer for computers that dropped frames. – harrymc – 2014-06-13T13:05:33.057
Why 3500 bytes? Does it fail with the default size? If not, the problem could be related to jumbo frames. The I217/218 adapters support jumbo frames whereas your old desktops probably do not. – Jason – 2014-06-13T22:52:53.340
@Jason Ooh, this is interesting. I was using 3500 instead of the default at the suggestion of our networking person, simply to stress the interface a bit more. Failures were still happening with the default size, but I didn't realize that was over the jumbo-frame threshold. – NReilingh – 2014-06-13T22:55:08.987