
I have 2 physical servers:

  1. Dual-socket Intel Xeon E5504 @ 2 GHz, 24 GB RAM, 12x 32 GB Intel X25-E SSDs in RAID 10.
  2. Intel Core 2 6400 @ 2.12 GHz, 3 GB RAM, plain 80 GB SATA drive.

Both machines currently run Windows Server 2008 R2 and have 10 Gbit Supermicro AOC-STGN-i2S NICs (actually Intel 82599 chips wearing a Supermicro logo) in PCIe x4 slots, with an SFP+ direct-attach twinax cable between them.

The second server is for testing only.

First I installed ESXi on the 2nd and used the 1st as a datastore.

I noticed that according to CrystalDiskMark, a VM on ESXi only got a 325 MB/s sequential transfer rate (tried with both NFS and iSCSI).

I ran the same test locally on the first server and got ~1000 MB/s. I wondered whether the network link really eats two thirds of the throughput, so I replaced the 2nd server's hard drive, installed Windows Server 2008 R2 on it, and tried Jperf and NTttcp. Jperf showed 400 MB/s and NTttcp showed 4300-4600 Mbit/s. Windows Task Manager showed around 600 000 000 bytes per interval, which translates to about 4.47 gigabits.
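To spell out the unit math behind those figures, here is a quick Python sketch (my own arithmetic, assuming Task Manager's interval is one second and counting a gigabit as 2^30 bits for the 4.47 figure):

# Converting the benchmark figures above into line rate.
# Assumption: Task Manager's refresh interval is one second.
def to_gbit(bytes_per_s, binary=False):
    bits = bytes_per_s * 8
    return bits / (2**30 if binary else 10**9)

print(to_gbit(600_000_000))               # ~4.8 Gbit/s (decimal gigabits)
print(to_gbit(600_000_000, binary=True))  # ~4.47 Gbit/s, the Task Manager figure
print(to_gbit(325 * 10**6))               # ~2.6 Gbit/s, the CrystalDiskMark run
print(to_gbit(400 * 10**6))               # ~3.2 Gbit/s, the Jperf result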

I verified that both ends had full duplex and tried toggling jumbo frames on and off at both ends, but the difference was only 580 000 000 vs 600 000 000 bytes per interval.

Why is the throughput I'm seeing only about half the theoretical maximum of 10 gigabits?

ADDENDUM

NTttcp command lines:

ntttcpr -m 6,0,192.168.137.1 -a 6 (receiver)
ntttcps -m 6,0,192.168.137.1 -a 6 (sender)
Henno
  • When I had ESXi installed, I briefly had iSCSI mistakenly directed through the motherboard's onboard NICs, which are 1 Gbit/s. Then I saw 95% NIC utilization (~120 000 000 bytes/sec on the wire). – Henno May 18 '11 at 21:58

1 Answer


I'd suspect your PCIe x4 slots are the bottleneck. The theoretical throughput of those slots should be in the range of 16 Gbps (enough to saturate the NIC with room to spare), but that's not always well implemented on the controller side.

Got an x8 or higher slot you can steal from something else to test?
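For a rough sanity check, here are the nominal slot bandwidths (a minimal Python sketch, assuming 250 MB/s per lane for PCIe 1.0 and 500 MB/s per lane for PCIe 2.0, before packet overhead):

# Nominal PCIe slot bandwidth, ignoring TLP/packet overhead.
# Assumption: 250 MB/s per lane for PCIe 1.0, 500 MB/s per lane for PCIe 2.0.
PER_LANE_MB = {"1.0": 250, "2.0": 500}

def slot_gbit(gen, lanes):
    return PER_LANE_MB[gen] * lanes * 8 / 1000

print(slot_gbit("1.0", 4))  # 8.0 Gbit/s  - tight for a 10 Gbit NIC
print(slot_gbit("2.0", 4))  # 16.0 Gbit/s - the figure assumed above
print(slot_gbit("1.0", 8))  # 16.0 Gbit/s - an x8 card in a PCIe 1.0 x16 slot
print(slot_gbit("2.0", 8))  # 32.0 Gbit/s - the storage server's slot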

Shane Madden
  • Actually the storage server's slot is PCIe 2.0 x8 and the testing server's slot is x4. I was also suspecting this to be the culprit, but Wikipedia dispelled my suspicions. Also, the specs say the card works with x1 too and don't say it would run slower in that case. – Henno May 18 '11 at 22:03
  • Actually, PCIe transfer rates depend on the PCIe version implemented. While PCIe 1.0 gives a theoretical 250 MB/s per lane (adding up to 1 GB/s total for x4), PCIe 2.0 doubles the data rate to 500 MB/s per lane. But you are right that it is quite a narrow fit. – the-wabbit May 18 '11 at 22:13
  • http://download.intel.com/design/network/specupdt/322421.pdf says that the PCIe compliance pattern cannot be transmitted if the 82599 is in x4 or lower mode, but from what I can tell by googling, the compliance pattern is just a generated repeating test signal for ensuring that everything is functioning properly. – Henno May 18 '11 at 22:15
  • @syneticon-dj Good call - for some strange reason I assumed it was 2.0 (thus the 16 Gbps in the post). @Henno Which version are the slots on the test system? – Shane Madden May 18 '11 at 22:21
  • Well, I just found out that the testing server has an Asus P5B Deluxe motherboard, which has 2 PCIe slots: x16 and x4. But it's a PCIe 1.0 board, so based on @syneticon-dj's data it is getting 250 * 4 = 1 GB/s. I will see tomorrow if I can put the video card into the x4 slot and the NIC into the x16 and see if that changes anything. The storage server is PCIe 2.0 (and an x8 slot). – Henno May 18 '11 at 22:21
  • I swapped the video card and the NIC and now see an improvement of 2 Gbit/s, resulting in a total of 6 Gbit/s. The NIC itself is x8, so putting it into the x16 slot (PCIe 1.0) should give it 16 Gbit/s. The card shows as being in x8 mode (Device Manager > Network adapters > Properties > Link Speed > Identify Adapter > Negotiated Link Width). – Henno May 19 '11 at 08:01