3

When I test my own '10 Gigabit' instances (c3.8xlarge) with iperf, I never see transfer rates exceeding 1.73 Gbps. This is at least four times worse than what a blogger at scalablelogic reports, where tests show results of 7 Gbps and 9.5 Gbps.

I'm testing between two c3.8xlarge instances located in the same zone and region, so these should be optimal benchmarking conditions. One c3.8xlarge acts as the iperf server and the other as the iperf client. Both instances are launched with Amazon Linux AMI 2013.09.2 - ami-5256b825 (64-bit).

Why am I seeing such poor results?

What should I look at if I want to improve throughput?

niemion
  • Please paste your iperf configuration, as it very well could be the culprit. – MDMarra Jan 29 '14 at 22:43
  • @MDMarra, please explain how I find the iperf configuration? I simply installed iperf on both instances using `wget http://iperf.fr/download/iperf_2.0.2/iperf_2.0.2-4_i386 ; chmod +x iperf_2.0.2-4_i386 ; sudo mv iperf_2.0.2-4_i386 /usr/bin/iperf`, then started the server with `iperf -s` and connected from the client using `iperf -c elastic_ip_of_iperf_server`. – niemion Jan 29 '14 at 22:52
  • When you are running these iperf tests, are you sure your instance isn't maxing out the CPU or anything? Have you tried with an alternate OS? Have you checked with Amazon's support? – Zoredache Jan 29 '14 at 23:02
  • @Zoredache, CPU usage is only a few percent. I have not tried another OS, but I could try Red Hat Enterprise Linux 6.4, SUSE Linux Enterprise Server 11 or Ubuntu Server 13.10. Which one would you suggest? Amazon support has not answered me. I guess it's because I haven't paid for support, so I only have access to sales "support". – niemion Jan 29 '14 at 23:07
  • @niemion things like window size, threads, etc. will play in here. They are all detailed in the iperf manpage. I'm not sure that running iperf with no options will ever yield "good" results, but I don't have any 10GbE hardware to test with. – MDMarra Jan 29 '14 at 23:10
  • What settings would you suggest? Interestingly, testing with http://www.wowza.com/resources/LoadTestingTool.pdf simulating a high number of concurrent connections, I hit the exact same limit. So I'm not sure this is iperf specific. – niemion Jan 29 '14 at 23:16
  • I have now tried launching two instances with Ubuntu Server 13.10 instead. I tried setting the window size to 64KB, 128KB and 512KB. I also tried setting the number of parallel client streams to 2 and 10. These settings offered no real improvement, as the reported throughput still maxed out at 1.74 Gbps. – niemion Jan 29 '14 at 23:50
  • Have you asked EC2 support? – EEAA Jan 30 '14 at 01:18
  • Amazon requires a support subscription for helping with technical matters. Before paying that premium I wanted serverfault to have a crack at it. – niemion Jan 30 '14 at 01:50
  • Did you enable Enhanced Networking http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/enhanced-networking.html – Dusan Bajic Jan 30 '14 at 08:14
  • No risk that you're confusing GBps and Gbps ? – mveroone Jan 30 '14 at 10:28
  • Enhanced Networking is disabled for now. It doesn't account for the poor performance though, see the blog post in #0. I'm not mistaking GBps and Gbps. – niemion Jan 30 '14 at 12:42
  • Amazon has finally recognized that something caps the throughput at 1.73 Gbps. My findings were initially received with a fair amount of skepticism, but were accepted after they agreed to test themselves. Support has promised to perform further tests to find out why the instance cannot achieve a higher throughput when connecting to its public IP. There is something to note, however. This limit is only seen when testing against an instance's public IP. When we tested against an instance's private IP, which of course cannot be tested from outside Amazon's environment, we saw speeds up to 9.65 Gbps. – niemion Feb 01 '14 at 13:21

4 Answers

12

AWS Support admits that 10 GbE speeds can only be achieved between instances on the private subnet network. The private IP must be used rather than the public IP, which in my case always maxes out at 1.73 Gbps. That might change depending on zone and region. If you see different results, please post them here.
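Before a run it is easy to check which kind of address the iperf client is pointed at. This is a minimal sketch, assuming a POSIX shell. `is_private_ipv4` is a hypothetical helper (not part of iperf or any AWS tool) and the address 10.0.0.12 is made up for illustration. It only tests for the RFC 1918 private ranges.

```shell
# Hypothetical helper: succeeds if $1 is an IPv4 address in an RFC 1918
# private range (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16).
# Note: POSIX sh has no local variables, so a/b/c/d/rest are globals.
is_private_ipv4() {
  ip=$1
  # Accept only digits and dots, with exactly three separating dots.
  case $ip in
    *[!0-9.]* | *..* | .* | *. | *.*.*.*.* ) return 1 ;;
    *.*.*.* ) ;;
    * ) return 1 ;;
  esac
  # Split the address into its four octets.
  a=${ip%%.*}; rest=${ip#*.}
  b=${rest%%.*}; rest=${rest#*.}
  c=${rest%%.*}; d=${rest#*.}
  for octet in "$a" "$b" "$c" "$d"; do
    [ "$octet" -le 255 ] || return 1
  done
  if [ "$a" -eq 10 ]; then return 0; fi
  if [ "$a" -eq 172 ] && [ "$b" -ge 16 ] && [ "$b" -le 31 ]; then return 0; fi
  if [ "$a" -eq 192 ] && [ "$b" -eq 168 ]; then return 0; fi
  return 1
}

# Example use (assumed address, iperf not run here):
#   is_private_ipv4 10.0.0.12 && iperf -c 10.0.0.12
```

If the check fails for the address you are testing against, you are most likely measuring the public-IP path that is capped in my tests.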

This means that when it comes to external throughput, the c3.8xlarge (or similar 10 GbE instances) offers terrible value compared to smaller instances with "High" network capabilities. A c1.medium instance comes at 1/16 the price of a c3.8xlarge, but it allows for over half the throughput (~0.95 Gbps) of a c3.8xlarge 10 GbE instance (~1.7 Gbps).
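To put a number on that comparison, here is the arithmetic as a small sketch. It assumes only the figures quoted above: the c1.medium price is taken as 1 unit and the c3.8xlarge price as 16 units, so no actual prices are involved. `throughput_per_price` is a hypothetical helper name.

```shell
# Hypothetical helper: external throughput per unit of price.
# $1 = throughput in Gbps, $2 = relative price (c1.medium = 1, c3.8xlarge = 16).
throughput_per_price() {
  awk -v gbps="$1" -v price="$2" 'BEGIN { printf "%.3f\n", gbps / price }'
}

throughput_per_price 0.95 1    # c1.medium:  prints 0.950
throughput_per_price 1.7 16    # c3.8xlarge: prints 0.106
```

With these figures the c1.medium delivers roughly nine times more public-IP throughput per unit of price.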

See this thread on the Wowza forums for AWS Support's answers.

niemion
5

Because of the virtualization layer, the networking layer can't use DMA directly and the CPU has to copy data back and forth, spending time handling softirqs. When too many packets are being transferred, you need to tell the kernel to use more than one CPU core for that work.

You can monitor this by running `watch -n1 cat /proc/softirqs` and looking at the NET_RX row.
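The full `/proc/softirqs` table is wide on a 32-core instance, so it can help to pull out just the NET_RX row. This is a minimal sketch, assuming the usual layout of that file (a header row of CPU names, then one row per softirq type). `net_rx_per_cpu` is a hypothetical helper name.

```shell
# Hypothetical helper: print the NET_RX counter for each CPU, one per line.
# $1 = file to read (defaults to /proc/softirqs; "-" reads standard input).
net_rx_per_cpu() {
  awk '$1 == "NET_RX:" {
         for (i = 2; i <= NF; i++) printf "CPU%d %s\n", i - 2, $i
       }' "${1:-/proc/softirqs}"
}

# Example use on a Linux box:
#   watch -n1 "awk '\$1 == \"NET_RX:\"' /proc/softirqs"
#   net_rx_per_cpu
```

If only one CPU's counter keeps climbing during a transfer, receive processing is probably landing on a single core.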

Fortunately, there is a feature called packet steering which allows us to use more CPU cores for receiving and transmitting packets.

To allow the kernel to use more than one core for receiving packets, you can run `echo f > /sys/class/net/eth0/queues/rx-0/rps_cpus`

For transmitting, you can run `echo f0 > /sys/class/net/eth0/queues/tx-0/xps_cpus`

This way the first 4 cores are used for receiving and the next 4 for transmitting.

f  => 1+2+4+8 = 15 decimal = f in hexadecimal
f0 => 16+32+64+128 = 240 decimal = f0 in hexadecimal
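Each mask is a bitmap with one bit per core, so it can be computed instead of added up by hand. This is a minimal sketch, assuming a POSIX shell. `cpu_mask` is a hypothetical helper, and it only covers masks that fit in a single shell integer (it does not produce the comma-separated groups the kernel uses for very large core counts).

```shell
# Hypothetical helper: print the hexadecimal CPU bitmask for the given
# zero-based core numbers, in the form expected by rps_cpus / xps_cpus.
cpu_mask() {
  mask=0
  for core in "$@"; do
    mask=$(( mask | (1 << core) ))   # set the bit for this core
  done
  printf '%x\n' "$mask"
}

cpu_mask 0 1 2 3   # prints f
cpu_mask 4 5 6 7   # prints f0
```

The output can then be written to the sysfs files as root, for example `cpu_mask 0 1 2 3 > /sys/class/net/eth0/queues/rx-0/rps_cpus`.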
Bogdan
  • Thank you. I tried it on both the iperf server and client, but I'm still limited to ~1.73 Gbps. – niemion Jan 30 '14 at 12:44
  • Not true (any more?): *SR-IOV* aka *AWS Enhanced Networking*, which gets rid of one layer of buffer copying for virtual machines, can be used on [Linux](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/enhanced-networking.html) and [Windows](http://docs.aws.amazon.com/AWSEC2/latest/WindowsGuide/enhanced-networking.html). – Evgeniy Berezovsky May 28 '15 at 04:15
1

Hope this helps you; we've wondered about EC2's true public-facing throughput for a while. We just finished running several Wowza Edge instances on C4.8xl instances and had no issues at 6+ Gbps per instance. Per http://www.aerospike.com/blog/boosting-amazon-ec2-network-for-high-throughput/, the benchmarks below seem to be very accurate:

*Network Bandwidth Amazon offers a range of instance types with varying amounts of memory and CPU. What is not well “documented” however, is network capabilities which are simply categorized as – Low, Moderate, High, and 10Gb. Based on our experiments running Aerospike servers on AWS and iperf runs on AWS, we were able to better define these categories to the following numbers:

  • Low – Up to 100 Mbps
  • Moderate – 100 Mbps to 300 Mbps
  • High – 100 Mbps to 1.86 Gbps
  • 10Gb – up to 8.86 Gbps*
1

I am not sure how you are running iperf for your tests, but sometimes it needs to be run multi-threaded to yield results that better reflect the actual maximum throughput of the underlying network stack. I have sometimes found it necessary to raise the thread count as high as 96 to get what appeared to be close to the optimal throughput.
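A sweep over thread counts could look like the sketch below. It only prints the commands (a dry run) so you can review them first. `iperf_sweep_commands` is a hypothetical helper, the address 10.0.0.12 and the 30 second duration are assumptions, and `-c`, `-P` and `-t` are the iperf 2 client, parallel-streams and duration options.

```shell
# Hypothetical helper: print one iperf client command per stream count.
# $1 = server address; remaining arguments = parallel stream counts to try.
iperf_sweep_commands() {
  server=$1
  shift
  for streams in "$@"; do
    printf 'iperf -c %s -P %s -t 30\n' "$server" "$streams"
  done
}

iperf_sweep_commands 10.0.0.12 1 8 32 96
```

Piping the output to `sh` runs the tests one after another. Comparing the summary lines should show where adding threads stops helping.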