46

A few days ago I noticed something rather odd (at least to me). I ran rsync copying the same data to an NFS mount, called /nfs_mount/TEST, and deleting it afterwards each time. This /nfs_mount/TEST is hosted/exported from nfs_server-eth1. The MTU on both network interfaces is 9000, and the switch in between supports jumbo frames as well. If I do rsync -av dir /nfs_mount/TEST/ I get a network transfer speed of X MBps. If I do rsync -av dir nfs_server-eth1:/nfs_mount/TEST/ I get a network transfer speed of at least 2X MBps. My NFS mount options are: nfs rw,nodev,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountvers=3,mountproto=tcp.
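
In case it helps anyone reproduce this, the comparison is just the two commands above run against an empty destination each time; `time` here is only for measurement, nothing beyond what is already described is assumed:

  time rsync -av dir /nfs_mount/TEST/                    # write through the NFS mount
  rm -rf /nfs_mount/TEST/dir                             # clear the destination again
  time rsync -av dir nfs_server-eth1:/nfs_mount/TEST/    # write via rsync over SSH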

Bottom line: both transfers go over the same network subnet, same wires, same interfaces, read the same data, write to the same directory, etc. The only difference is that one writes via NFSv3 and the other via rsync over SSH.

The client is Ubuntu 10.04, the server Ubuntu 9.10.

How come rsync is that much faster? How can I make NFS match that speed?

Thanks

Edit: please note that I use rsync either to write to the NFS share, or to SSH into the NFS server and write locally there. Both times I do rsync -av, starting with a clear destination directory. Tomorrow I will try with a plain copy.

Edit2 (additional info): File sizes range from 1 KB to 15 MB. The files are already compressed; I tried to compress them further with no success. I also made a tar.gz file from that dir. Here is the pattern:

  • rsync -av dir /nfs_mount/TEST/ = slowest transfer;
  • rsync -av dir nfs_server-eth1:/nfs_mount/TEST/ = fastest rsync, with jumbo frames enabled; without jumbo frames it is a bit slower, but still significantly faster than the one directly to NFS;
  • rsync -av dir.tar.gz nfs_server-eth1:/nfs_mount/TEST/ = about the same as its non-tar.gz equivalent;

Tests with cp and scp:

  • cp -r dir /nfs_mount/TEST/ = slightly faster than rsync -av dir /nfs_mount/TEST/, but still significantly slower than rsync -av dir nfs_server-eth1:/nfs_mount/TEST/;
  • scp -r dir /nfs_mount/TEST/ = fastest overall, slightly beating rsync -av dir nfs_server-eth1:/nfs_mount/TEST/;
  • scp -r dir.tar.gz /nfs_mount/TEST/ = about the same as its non-tar.gz equivalent.

Conclusion, based on these results: for this test there is no significant difference between using one large tar.gz file and many small ones. Jumbo frames on or off also makes almost no difference. cp and scp are faster than their respective rsync -av equivalents. Writing directly to the exported NFS share is significantly slower (at least 2 times) than writing to the same directory over SSH, regardless of the method used.

The differences between cp and rsync are not relevant in this case. I decided to try cp and scp just to see whether they show the same pattern, and they do: a 2X difference.

As I use rsync or cp in both cases, I can't understand what prevents NFS from reaching the transfer speed of the same commands over SSH.

How come writing to the NFS share is 2X slower than writing to the same place over SSH?

Edit3 (NFS server /etc/exports options): rw,no_root_squash,no_subtree_check,sync. The client's /proc/mounts shows: nfs rw,nodev,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountvers=3,mountproto=tcp.
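
Written out as a single /etc/exports line, that would look roughly like this (the exported path and the client field are guesses on my part; only the option list above is actual):

  /nfs_mount    *(rw,no_root_squash,no_subtree_check,sync)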

Thank you all!

grs
  • Should this be the same result for many small files and one large file? – Xiè Jìléi May 11 '11 at 03:54
  • @notpeter - added the options in the original post. Thank you! – grs May 12 '11 at 02:16
  • I realize this is a rather old question, but one major difference between SCP and rsync that does account for a slight difference in transfer time is the automatic file-transfer checksum done to show that the file transferred correctly. This is different from the -c option of rsync, which uses a checksum to validate whether a file has been updated between hosts. If you are only copying new files, that doesn't come into play. – Rowan Hawkins Jun 06 '17 at 22:28

7 Answers

24

NFS is a sharing protocol, while Rsync is optimized for file transfers; there are lots of optimizations which can be done when you know a priori that your goal is to copy files around as fast as possible instead of providing shared access to them.

This should help: http://en.wikipedia.org/wiki/Rsync

Massimo
  • 2
    If you know the data beforehand (which you usually do), you can turn off compression selectively with the option `-e "ssh -o Compression=no"` to get possibly quicker transfer speeds. This will keep it from compressing files that are possibly already compressed. I've noticed a speedup a lot of times. – lsd May 10 '11 at 22:31
  • 5
    @lsd - ssh compression is usually off by default, and not recommended for rsync. Allowing rsync to compress the data with the options `-z`, `--compress-level`, and `--skip-compress` will get better performance than with a compressed transport. – JimB May 11 '11 at 14:35
24

Maybe it's not slower transfer speed, but increased write latency. Try mounting the NFS share async instead of sync and see if that closes the speed gap. When you rsync over ssh, the remote rsync process writes asynchronously (quickly). But when writing to the synchronously mounted nfs share, the writes aren't confirmed immediately: the NFS server waits until they've hit disk (or more likely the controller cache) before sending confirmation to the NFS client that the write was successful.

If 'async' fixes your problem, be aware that if something happens to the NFS server mid-write, you very well might end up with inconsistent data on disk. As long as this NFS mount isn't the primary storage for this (or any other) data, you'll probably be fine. Of course you'd be in the same boat if you pulled the plug on the nfs server during/after an rsync-over-ssh run (e.g. rsync returns having 'finished', the nfs server crashes, and uncommitted data in the write cache is lost, leaving inconsistent data on disk).
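
If you want to try the server-side variant of this (the 'sync' in the export options you posted), a rough sketch, assuming the export path matches the client's mount point:

  # /etc/exports on the NFS server: change sync to async for this export
  /nfs_mount    *(rw,no_root_squash,no_subtree_check,async)

  sudo exportfs -ra    # re-read /etc/exports and apply the changed options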

Although not an issue with your test (rsyncing new data), do be aware that rsync over ssh can make significant CPU and IO demands on the remote server before a single byte is transferred, while it calculates checksums and generates the list of files that need to be updated.

notpeter
  • 1
    I think this answer is the right one. If the media (disks) on the two machines are comparable (same RPM / bandwidth / RAID configuration), you can get a good idea as to whether this is the case by doing the inverse operation: 'rsync -av /nfs_mount/TEST/ dir'. Otherwise, turning sync off and trying it is the way to test. – Slartibartfast May 12 '11 at 05:37
  • I did quick tests with sync vs async and I think this answer has a great chance of being the right one. Choosing async closes the gap significantly, but it is still a bit slower than the SSH one. I will make further tests and let you guys know. Thanks a lot! – grs May 12 '11 at 19:02
  • 3
    Update: my new tests demonstrated a significant difference in speed between the sync and async NFS export options. With the NFS export set to async and `rsync -av dir.tar.gz /nfs_mount/TEST/` I got about the same transfer speed as with `rsync -av dir nfs_server-eth1:/nfs_mount/TEST/`. I will mark this answer as the correct one, but I am curious if I can improve the setup further. Thank you! Well done notpeter! – grs May 12 '11 at 22:01
5

Rsync is a file-transfer protocol that transfers only the changed parts of files. NFS is a remote directory file protocol that handles everything, every time ... kind of like SMB in a way. The two are different and serve different purposes. You could use rsync to transfer between two NFS shares.
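
For example (the share paths here are just placeholders), running the same rsync twice shows the point: the second pass skips anything that hasn't changed, which a plain copy over NFS won't do for you:

  rsync -av /nfs_mount/share_a/dir/ /nfs_mount/share_b/dir/    # first pass copies everything
  rsync -av /nfs_mount/share_a/dir/ /nfs_mount/share_b/dir/    # second pass sends almost nothing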

pcunite
    I feel a little bad down-voting you because you didn't say anything technically wrong, but it doesn't seem like you added anything to the discussion, and you came in after much more specific information had been made available. Also, from his post it looks like the author was aware of these things. – Slartibartfast May 12 '11 at 05:27
  • I thought I was the second post and the first to mention that both were protocols with different goals in mind. It's okay, I thought the first edit of the question was a bit daft. – pcunite May 14 '11 at 01:12
3

This is interesting. A possibility that you may not have considered is the content / type of file you are transmitting.

If you have scads of little files (e.g. emails in individual files), NFS efficiency may be tanking because it doesn't make use of the full MTU (maybe this is less likely with TCP than with UDP, though).

Alternatively, if you have highly compressible files/data, fast CPUs, and a network that doesn't have quite the speed of the CPU(*), you could get a speedup just from implicit compression over the ssh link.

A third possibility is that the files (or one version thereof) already exist in the destination. In this case the speedup would be because the rsync protocol saves you transferring the files.

(*) In this case by 'speed', I'm referring to the rate at which the CPU can compress data compared to the rate at which the network can transmit data. E.g. it takes 5 seconds to send 5MB across the wire, but the CPU can compress that 5MB into 1MB in 1 second. In this case your transmit time for the compressed data would be slightly over 1 second, whereas the uncompressed data takes 5 seconds.
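
A quick way to check whether compression is the factor (just a sketch; -z enables rsync's own compression, which is off by default):

  rsync -av  dir nfs_server-eth1:/nfs_mount/TEST/     # no compression
  rsync -avz dir nfs_server-eth1:/nfs_mount/TEST/     # rsync-level compression

If the -z run is noticeably faster, the CPU is winning against the network; if not, compression isn't where the 2X is coming from.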

Slartibartfast
  • Very good! The files I test with are many small images. They vary in size. I have to double check if I can compress them further. The files definitely do not exist at the destination, as I start from scratch every time. Tomorrow, I will make tests with simple `cp -r` vs `rsync` and then I will compress the files to have larger files in order to benefit from the MTU. Thanks! – grs May 11 '11 at 03:52
1

If your goal is just to copy all files from one place to another, then tar/netcat will be the fastest option. If you know that you have lots of whitespace in your files (zeros), then use the -i option.

SOURCE: tar cvif - /path/to/source | nc DESTINATION PORTNUM
DESTINATION: cd /path/to/destination && nc -l PORTNUM | tar xvif -

If you know your data is compressible, then use compression on your tar commands: -z, -j, or -Ipixz.

I am a fan of pixz (parallel xz): it offers great compression, and I can tune the number of CPUs I use to the network bandwidth. If I have slower bandwidth I'll use higher compression, so I'm waiting on CPU more than network; if I have a fast network I'll use very low compression:

SOURCE: tar cvif - /path/to/source | pixz -2 -p12 | nc DESTINATION PORTNUM    # tar, ignore zeros, level 2 pixz compression using 12 cpu cores
DESTINATION: nc -l PORTNUM | tar -Ipixz -xvif -

If you tune the compression level and cores right, depending on your data set, you should be able to keep the network close to saturated and do enough compression that your bottleneck becomes the disk (usually the write side, if the read and write disk systems are the same).

As for rsync, I believe it skips zeros similarly to the way tar does with that option, so it's transmitting less data than NFS. NFS can't make assumptions about the data, so it has to transmit every byte along with the NFS protocol overhead. rsync has some overhead of its own.

netcat has basically none: it'll send full TCP packets which contain nothing but the data you care about.

With netcat, as with scp, you have to send all the source data all the time; you can't be selective as with rsync, so it's not suitable for incremental backups or that sort of thing, but it's good for copying data or archiving.

  • This is way faster than rsync, and it was perfect for my use case: transferring ~10TB of new files from one storage array to another over a 10Gb network connection (meaning it's limited by the target array's write speed). Some test numbers - Single 38GB file - rsync: 215s, tar | netcat: 141s; 32GB of various files with various sizes: rsync: 187s, tar | netcat: 112s – Kayson Aug 16 '22 at 06:28
0

Do you have file locking set up on the NFS share? You might get a lot more performance if that was disabled.
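
On an NFSv3 client you can test that by remounting with locking disabled via the nolock mount option; a sketch, assuming the server exports /nfs_mount (the question only shows the client-side path):

  sudo umount /nfs_mount
  sudo mount -t nfs -o vers=3,nolock nfs_server-eth1:/nfs_mount /nfs_mount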

n8whnp
  • How can I find out if it is enabled or not? This here: http://docstore.mik.ua/orelly/networking_2ndEd/nfs/ch11_02.htm suggests that NFS v3 does not have file locking capabilities. – grs May 12 '11 at 02:22
-1

I assume the increased speed is at least partly due to "rsync src host:/path" spawning a local process on the remote machine for sending/receiving, effectively cutting your I/O in half.

squillman