Windows network bandwidth issue

5

0

We have a computer (running Windows 10) with five Ethernet ports (1 Gb each): two are built in, and the other three are on two PCIe cards. Four of the Ethernet ports carry six cameras in total (via two switches, so no port handles more than two cameras at once). The system was originally designed to run distributed over several computers because the cameras send uncompressed images, so there is a service (originally running on each computer) that grabs the frames and hands them over, now in a compressed format, to a recording/display program.

When the system is running, the four Ethernet ports are way below their theoretical limit.

On the other hand, when looking at the service handling the incoming traffic, I see 99% network usage (it was 100%, but after I set all cards to gigabit full duplex it dropped to 99%), while the actual throughput is pretty much the sum of the four incoming streams (column headers in the screenshot, in order: CPU, memory, network, disk, GPU).

As you can see, memory and CPU usage are very low, and the 800 Mb/s speed should be WAY below the capacity of the network, yet it shows 100% and the capturing program functions as if it were having serious bandwidth issues. Downscaling to four cameras (and around 600 Mb/s total) restores normal behaviour.

The strangest thing of all is that for a few trials all six cameras were working perfectly, so my feeling is that Windows 10 somehow thinks we only have 1000 Mb/s of bandwidth and is trying to limit usage, and that this limit only kicked in later.
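
As a rough sanity check on these figures, the sketch below compares the reported totals against the capacity of a single gigabit link and against the four camera adapters combined. The totals (800 and 600 Mb/s) come from the description above; comparing against "one link" is only an illustration of the hypothesis that the usage percentage is computed against 1000 Mb/s, not a statement about how Windows actually computes it.

```python
# Totals are taken from the question; the one-link comparison is an assumption.
LINK_CAPACITY_MBPS = 1000   # one gigabit adapter
CAMERA_ADAPTERS = 4         # adapters carrying camera traffic


def utilisation(total_mbps: float, links: int) -> float:
    """Return total traffic as a percentage of the combined capacity of `links` adapters."""
    return 100.0 * total_mbps / (links * LINK_CAPACITY_MBPS)


for label, total in (("six cameras", 800), ("four cameras", 600)):
    one = utilisation(total, 1)
    four = utilisation(total, CAMERA_ADAPTERS)
    print(f"{label}: {one:.0f}% of one link, {four:.0f}% of four links")
```

With six cameras the total is 80% of one link but only 20% of the four adapters combined, so the adapters themselves should have plenty of headroom.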

What am I missing?

Hardware (edit)

Motherboard: GA-X99-Designare EX

Devices listed in device manager:

  • Intel Ethernet Connection (2) I218-V
  • Intel I211 Gigabit Network Connection
  • Intel PRO/1000 PT Dual Port Server Adapter
  • Intel PRO/1000 PT Dual Port Server Adapter #2
  • Realtek PCIe GBE Family Controller

Two PCIe NICs:

  • TP LINK TG-3468
  • GigE Card PCIe Intel PRO/1000 PT Dual Port Server Adapter

The interface statuses:

Name                      InterfaceDescription                    ifIndex Status       MacAddress             LinkSpeed
----                      --------------------                    ------- ------       ----------             ---------
Ethernet                  Intel(R) I211 Gigabit Network Connec...      12 Up           1C-1B-0D-6C-A0-27         1 Gbps
Ethernet 2                Intel(R) Ethernet Connection (2) I218-V      15 Up           1C-1B-0D-6C-A0-29         1 Gbps
Slot04 x16                Realtek PCIe GBE Family Controller           14 Up           18-D6-C7-01-C9-F6       100 Mbps
Ethernet 4                Intel(R) PRO/1000 PT Dual Port Ser...#2       9 Up           68-05-CA-3F-CB-32         1 Gbps
Ethernet 3                Intel(R) PRO/1000 PT Dual Port Serve...      20 Up           68-05-CA-3F-CB-33         1 Gbps

It seems that the Realtek GBE (which should be the TP LINK TG-3468, which is specified for 1 Gbps) is running at 100 Mbps. It's connected with a Cat5e cable to a gigabit switch. I'm not sure this is relevant, but it also seems strange.
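
For repeated checks, a listing like the one above could be scanned for adapters that negotiated less than 1 Gbps. This is only a sketch: it assumes the listing has been saved as plain text in exactly the column layout shown (name first, link speed and unit as the last two fields), and the helper name `parse_link_speed` is made up for illustration.

```python
import re

# Adapter rows copied from the interface listing above.
LISTING = """\
Ethernet                  Intel(R) I211 Gigabit Network Connec...      12 Up           1C-1B-0D-6C-A0-27         1 Gbps
Ethernet 2                Intel(R) Ethernet Connection (2) I218-V      15 Up           1C-1B-0D-6C-A0-29         1 Gbps
Slot04 x16                Realtek PCIe GBE Family Controller           14 Up           18-D6-C7-01-C9-F6       100 Mbps
Ethernet 4                Intel(R) PRO/1000 PT Dual Port Ser...#2       9 Up           68-05-CA-3F-CB-32         1 Gbps
Ethernet 3                Intel(R) PRO/1000 PT Dual Port Serve...      20 Up           68-05-CA-3F-CB-33         1 Gbps
"""

UNIT_TO_MBPS = {"Mbps": 1, "Gbps": 1000}


def parse_link_speed(line):
    """Return (adapter name, link speed in Mbps) for one listing row."""
    name = re.split(r"\s{2,}", line.strip())[0]  # columns are separated by 2+ spaces
    value, unit = line.split()[-2:]              # last two fields, e.g. ["100", "Mbps"]
    return name, float(value) * UNIT_TO_MBPS[unit]


speeds = dict(parse_link_speed(row) for row in LISTING.splitlines() if row.strip())
below_gigabit = {name: mbps for name, mbps in speeds.items() if mbps < 1000}
print(below_gigabit)
```

On this listing only "Slot04 x16" (the Realtek adapter) is flagged.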

fbence

Posted 2018-03-28T17:15:32.630

Reputation: 101

So, there's no actual problem except that Windows may have an incorrect metric measuring Network usage? – music2myear – 2018-03-28T17:20:08.470

@music2myear No, unfortunately, after a brief period of working fine, the capture system is acting as if it does not have enough bandwidth. This is why I have the feeling Windows might think we don't have enough and is trying to artificially limit usage by one program. – fbence – 2018-03-28T17:24:22.837

I'm not aware of any built-in throttling behavior in Windows. – music2myear – 2018-03-28T17:47:51.747

How are the devices connected (ie IP addresses of cameras and ports)? What type of NICs are you using? – davidgo – 2018-03-28T19:04:53.853

The adapters all have local addresses 192.168.X.1, where X is 0 to 3; the cameras all have 192.168.X.Y, where X is the respective adapter they're plugged into and Y is 2 or 3. The one used for internet (the fifth) has a static non-local IP. The single-port NIC is a TP LINK TG-3468, there are two on the motherboard (GA-X99-Designare EX is the board: Dual Intel® GbE LAN with cFosSpeed Internet Accelerator Software), and for the third one I'd have to check. I think the cameras use port 80, and the service and recorder use some random ports at the high end. – fbence – 2018-03-28T19:25:26.873
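
The addressing scheme in this comment can be checked mechanically. The sketch below uses Python's standard `ipaddress` module and assumes /24 masks; which adapters carry two cameras and which carry one is not stated here, so the `cameras` mapping is a hypothetical example.

```python
import ipaddress

# Adapter X is 192.168.X.1; the /24 prefix length is an assumption.
adapters = {x: ipaddress.ip_interface(f"192.168.{x}.1/24") for x in range(4)}

# Hypothetical camera placement: two adapters with two cameras, two with one.
cameras = {
    0: ["192.168.0.2", "192.168.0.3"],
    1: ["192.168.1.2", "192.168.1.3"],
    2: ["192.168.2.2"],
    3: ["192.168.3.2"],
}


def subnets_are_disjoint(interfaces):
    """True if no two adapter subnets overlap."""
    nets = [iface.network for iface in interfaces]
    return not any(a.overlaps(b) for i, a in enumerate(nets) for b in nets[i + 1:])


def misplaced_cameras(adapters, cameras):
    """List camera addresses that fall outside their adapter's subnet."""
    return [
        ip
        for x, ips in cameras.items()
        for ip in ips
        if ipaddress.ip_address(ip) not in adapters[x].network
    ]


print("adapter subnets disjoint:", subnets_are_disjoint(adapters.values()))
print("misplaced cameras:", misplaced_cameras(adapters, cameras))
```

If the subnets are disjoint and no camera is misplaced, each adapter only has to carry traffic for its own cameras at the IP level; this says nothing about how the cabling and switches are arranged.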

What's the motherboard model this is running on? – Tim_Stewart – 2018-03-29T05:10:48.880

@Tim_Stewart: GA-X99-Designare EX – fbence – 2018-03-29T05:16:19.017

the name of the capture service software might be useful. – Yorik – 2018-04-11T16:03:40.787

It's the Etoservice.exe in the picture; unfortunately, it was custom made. The creator says there should not be any limits due to the software, but I'm trying to lay my hands on the code. – fbence – 2018-04-11T16:19:46.843

Should use "adapters" instead of "ports". – harrymc – 2018-04-11T17:10:13.900

Answers

3

The 800 Mb/s combined network performance that you are getting means that you are only using one of your network adapters at a time.

The explanation may be found in the Microsoft article How multiple adapters on the same network are expected to behave, from which I quote (the article uses an example case of two adapters) :

In this scenario, you may expect the two adapters on the same physical network to perform load balancing. However, by definition, only one adapter may communicate on the network at a time in the Ethernet network topology. Therefore, both adapters cannot be transmitting at the same time and must wait if another device on the network is transmitting. Additionally, broadcast messages must be handled by each adapter because both are listening on the same network. This configuration requires significant overhead, excluding any protocol-related issues. This configuration does not offer a good method for providing a redundant network adapter for the same network.

If all your adapters are on the same physical network and protocol subnet, the above text explains the network performance hit that you are seeing, since they are working in series and not in parallel.

The overhead mentioned in the article is responsible for the fact that you cannot actually reach the limit of 1 Gbps but only get up to 800 Mbps.

To use 5 adapters with your configuration, you will at least need to connect the computer to 5 different VLANs, but may hit other limits.

I am not convinced that Windows 10 is a good platform for such a configuration. Windows Server 2016 might be able to do better under the right configuration. In Server 2012 and later, Windows natively supports bridging/aggregating NICs, also called NIC Teaming.

An alternative solution is to replace the five 1 Gbps adapters by one 10 Gbps network adapter. This might be a better solution and maybe even cheaper than Windows Server. A fundamental rule of building switched networks is that a faster technology is always needed to aggregate multiple lower-speed segments. 10 Gigabit can aggregate these five 1 Gigabit segments. In case of problems, you might need your network verified by a cabling specialist.

harrymc

Posted 2018-03-28T17:15:32.630

Reputation: 306 093

By the same physical network, do you mean that, for example, all cards are connected to the same switch and everything else is connected to that switch? In that case, this is not true. All the adapters have different IPs, and the cameras are separately connected to them. Since four adapters can be used for the cameras (the fifth adapter is for connecting to the internet), this means that two adapters handle two cameras each and the other two handle one each. On the other hand, Windows Server might be a good idea. I do not know if I can procure one fast. – fbence – 2018-04-11T19:19:41.567

I mean that all are on the same network segment, for example 192.168.0.x/24. Only one transmitter is allowed in a network, otherwise the collisions will cause transmission errors. I believe that using one 10 Gbps network adapter here will be a better solution and maybe even cheaper than Windows Server. A fundamental rule of building switched networks is that a faster technology is always needed to aggregate multiple lower-speed segments. 10 Gigabit can aggregate these five 1 Gigabit segments. In case of problems, you might need your network verified by a cabling specialist. – harrymc – 2018-04-11T20:03:44.430

I have to check this tomorrow, but I believe they are not on the same network, each adapter is on 192.168.x.1/24. – fbence – 2018-04-11T20:17:14.490

The point is whether they are transmitting on the same physical network. If so, they have to take turns. – harrymc – 2018-04-12T06:55:30.353

Sorry! I haven't had time to physically go there and pester the machine. I scheduled it for tomorrow, I will clear up this question after that. – fbence – 2018-04-18T06:26:21.657

Ah, I see I have to award the bounty in 31 hours. I will take that into account :) – fbence – 2018-04-18T06:28:11.273

2

Your Windows is in a foreign language, so I can't say for sure...

On the English version of Windows, the third column in the task manager is disk usage.

Your hard disk does not appear to be able to handle the data throughput, which makes far more sense. If this is the case, you need to calculate the necessary disk throughput and get a much faster disk subsystem to handle that load. Probably a high-speed RAID array, or even two, to distribute the load.
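
A back-of-the-envelope version of that calculation, as a sketch: it takes the roughly 800 Mb/s total from the question and treats it as an upper bound, assuming everything received were written to disk. The question says the recorder receives the frames in a compressed format, so the real write rate is presumably lower; the compression ratio of 4 used below is purely hypothetical.

```python
def required_disk_mb_per_s(network_mbit_per_s: float, compression_ratio: float = 1.0) -> float:
    """Convert an incoming network rate (megabits/s) to a disk write rate (megabytes/s).

    compression_ratio is input size divided by stored size; 1.0 means no compression.
    """
    return network_mbit_per_s / 8 / compression_ratio


# Upper bound: all 800 Mb/s written uncompressed.
print(required_disk_mb_per_s(800))        # 100.0 MB/s
# Hypothetical 4:1 compression before writing.
print(required_disk_mb_per_s(800, 4.0))   # 25.0 MB/s
```

Comparing these figures with the sustained sequential write speed of the actual disk would show whether the disk is a plausible bottleneck.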

Keltari

Posted 2018-03-28T17:15:32.630

Reputation: 57 019

Google reckons that it is "network" – Baldrickk – 2018-04-18T08:41:32.983