
We frequently need to transfer huge files (upwards of 50 GB) between two hosts, and the transfer rate never seems to reach the expected throughput for the network. There are several points that could be the bottleneck, but each of their theoretical upper limits is well above the actual transfer rate. Here's a typical setup:

Laptop --> 802.11n --> AP --> CAT 6 cable --> 10/100 Mbits router --> Desktop

In this connection, the bottleneck is clearly the router, which would limit the transfer rate to 100 Mbps. Even then, I rarely see a transfer rate (with scp) exceeding 9.5 MB/s, which is 76 Mbps, or only 76% of the theoretical maximum.

Can there really be a 24% overhead at the access point, or is there something else limiting the speed? It could be disk I/O (although SATA is rated at 1.5 Gbps), or anything on the motherboard between the disk and the NIC (how can I measure that?).

Is there a way to know for sure(*) where the bottleneck is? If I can't get more than 76 Mbps from a 100 Mbps router, will upgrading the network to gigabit increase throughput or will I still get 76 Mbps because the bottleneck is elsewhere?

(*) or at least in a way convincing enough that a boss would agree to invest in upgrading that one part of the network

Fred
  • SSH (/sftp) can add significant overhead (I believe it's due to encryption). I can see a steady 900kb/s over HTTP but that drops to 250kb/s when using SSH. – BuildTheRobots Jan 02 '10 at 22:03
  • sftp's poor transfer performance is due to an absurdly small default window size. – womble Jan 03 '10 at 03:11
  • This won't help you find the bottleneck, but for others running into this problem: there's an 'HPN-SSH' (high-performance networks) build that's supposed to adjust the buffer size ... but it also enables the 'none' cipher (i.e., it doesn't encrypt the data). If that's not a problem for what you're transmitting: https://www.psc.edu/index.php/hpn-ssh – Joe H. Jun 30 '16 at 14:07

4 Answers


Your problem is that you are testing too many things at once:

  • disk read speed
  • SSH encryption
  • wireless
  • SSH decryption
  • disk write speed

Since you mentioned SSH, I am going to assume this is a Unix system.

You can rule out any problems with disk read speed with a simple

dd if=yourfile of=/dev/null # or
pv yourfile > /dev/null

On the receiving end you can do a simple disk write test:

dd if=/dev/zero of=testfile bs=1M count=2000 # or
dd if=/dev/zero bs=1M count=2000 | pv > testfile

dd is not really a "benchmark", but since scp uses sequential I/O, it is close enough.

You can also test SSH throughput by doing something like:

dd if=/dev/zero bs=1M count=100 | ssh server dd of=/dev/null # or
dd if=/dev/zero bs=1M count=100 | pv | ssh server dd of=/dev/null

Finally, to rule out SSH being the bottleneck, you can use nc to test raw network performance:

server$ nc -l 1234 > /dev/null
client$ dd if=/dev/zero bs=1M count=100 | pv | nc server 1234 # or
client$ dd if=/dev/zero bs=1M count=100 | nc server 1234

If you really want to properly test the network, install and use something like iperf, but nc is a good start.
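
For example, with iperf (version 2 syntax; this assumes iperf is installed on both hosts and that its default port, TCP 5001, is reachable):

server$ iperf -s              # start the server side, listening on TCP 5001 by default
client$ iperf -c server -t 30 # run a 30-second throughput test against it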

I'd start with the nc test, as that will rule out the most things. You should also definitely run the tests while not using wireless. 802.11n can easily max out a 100 Mbit port, but only if you have it properly set up.
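
If you do test over wireless anyway, it can help to check the negotiated link rate first (a quick sketch; the interface name wlan0 is an assumption, and iwconfig comes from the wireless-tools package):

client$ iwconfig wlan0 | grep -i 'bit rate'   # shows the current 802.11 link rate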

(Ubuntu >= 12.04 defaults to netcat-openbsd. nc -l -p 1234 > /dev/null may be what you want if you're using netcat-traditional).

mrm
Justin
  • OP: See if you can get a 1000BT router. I just did a test between two servers, both with Atom 330 CPUs (3k bogomips according to /proc/cpuinfo), connected via Powerline AV and a 1000BT router. `dd | ssh` ran at 6.5 MB/s with 45% CPU utilization over Powerline AV, and 10.4 MB/s at 75% CPU when I switched to the 1000BT switch. `dd | nc` ran at 10-15% CPU utilization on both networks, but 6 MB/s (Powerline AV) versus 40 MB/s on the 1000BT router (!). – mrm Aug 26 '13 at 05:07

Think of it this way:

You have a slow SATA disk (laptop disks are slow) running one file system or another, which then feeds an IP-based file sharing protocol such as SMB. That gets turned into WiFi frames which hit an AP, then travel over wired Ethernet (which requires some reformatting) to a pretty slow switch/router, then on to a probably-quite-slow desktop, where the data is broken back out into your file system format of choice and finally written to disk. All of this happens for every packet, most if not all of which require an acknowledgement packet to come back before the next one is sent.

I'm surprised you're seeing as much speed as you are!

Here's a tip: wire the laptop to the 100 Mbps switch/router when you need to transfer the files - seriously, it'll be much, much quicker. Also consider faster disks at each end, and make sure you're using an efficient file transfer mechanism too.

Hope this helps.

Chopper3

You can use this command to measure the read speed of your disks:

hdparm -tT /dev/sdX

(replace X with drive letter)
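
For example, assuming the first disk shows up as /dev/sda (an assumption; the timing tests usually need root):

sudo hdparm -tT /dev/sda   # -T times cached reads, -t times buffered reads from the disk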

1.5/3/6 Gbit/s is technically the SATA interface speed, but most disks can only sustain around 50-60 MB/s of continuous reads.

Nick

As Chopper3 alludes to, also try using rsync over SSH for files of that size, since there's a good chance that something will go wrong; nothing sucks more than to get 45 GB through a 50 GB transfer and have it fail. It may also decrease your overhead, but I haven't personally tested it with file sizes this large.
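
A minimal sketch (the paths and hostname are placeholders): rsync's --partial keeps a partially transferred file around, so an interrupted copy can be picked up again rather than restarted from zero.

rsync -avP /path/to/bigfile user@desktop:/destination/   # -P is shorthand for --partial --progress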

When transferring thousands of small files, rsync can also decrease the overhead substantially -- in a test I ran once with 75K files / 1,500 dirs / 5.6K average file size, the transfer took 12 min with FTP and SFTP, 10 min with SCP, but only 1 min 50 sec with rsync over SSH, due to the decreased setup/teardown overhead. Rsync without SSH was only 20 sec faster, at 1 min 33 sec.

  • For lots of small files it is hard to beat the 'tar cv dir/ | ssh box tar x' idiom. Multi-core friendly, and no sftp/rsync protocol overhead. – Justin Jan 02 '10 at 23:19