18

We use rsync to back up servers.

Unfortunately the network to some servers is slow.

It takes up to five minutes for rsync to detect that nothing has changed in huge directory trees. These trees contain a lot of small files (about 80k files).

I guess that the rsync client sends data for each of the 80k files.

Since the network is slow, I would like to avoid sending information about each of the 80k files individually.

Is there a way to tell rsync to make a hash sum of a subdirectory tree?

This way the rsync client would send only a few bytes for a huge directory tree.

Update

Up to now my strategy has been to use rsync, but if a different tool fits better here, I am able to switch. Both machines (server and client) are under my control.

Update2

There are 80k files in one directory tree. No single directory has more than 2k files or subdirectories.

Update3

Details on the slowness of the network:

    time ssh einswp 'cd attachments/200 && ls -lLR' >/tmp/list
    real    0m2.645s

Size of the /tmp/list file: 2 MByte

    time scp einswp:/tmp/list tmp/
    real    0m2.821s

Conclusion: scp has the same speed (no surprise).

    time scp einswp:tmp/100MB tmp/
    real    1m24.049s

Speed: 1.2 MB/s

guettli
  • You might read up on zsync. I have not used it myself, but from what I read, it pre-renders the metadata on the server side and might just speed up transfers in your case. It might be worth testing anyway. Beyond that, the only other solution I am aware of is the real-time block-level synchronization that comes with some SAN/NAS solutions. – Aaron Jan 05 '16 at 04:47

6 Answers

41

Some unrelated points:

80K is a lot of files.

80,000 files in one directory? No operating system or app handles that situation very well by default. You just happen to notice this problem with rsync.

Check your rsync version

Modern rsync handles large directories a lot better than in the past. Be sure you are using the latest version.

Even old rsync handles large directories fairly well over high-latency links... but 80k files isn't large... it is huge!

That said, rsync's memory usage is directly proportional to the number of files in a tree. Large directories take a large amount of RAM. The slowness may be due to a lack of RAM on either side. Do a test run while watching memory usage. Linux uses any left-over RAM as a disk cache, so if you are running low on RAM, there is less disk caching. If you run out of RAM and the system starts using swap, performance will be really bad.

Make sure --checksum is not being used

--checksum (or -c) requires reading each and every block of every file. You can probably get by with the default behavior of just checking the modification times (stored in the inode).
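
For reference, a run that relies on the default quick check might look like this (the paths are placeholders; einswp is the remote host from the question):

    # Default quick check: a file is skipped when size and mtime match,
    # so unchanged file contents are never read. No -c/--checksum.
    rsync -av einswp:attachments/ /backup/attachments/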

Split the job into small batches.

There are some projects like Gigasync which will "Chop up the workload by using perl to recurse the directory tree, building smallish lists of files to transfer with rsync."

The extra directory scan is going to be a large amount of overhead, but maybe it will be a net win.
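
If you want to try the batching idea without an extra tool, a minimal hand-rolled sketch (paths are placeholders) is one rsync run per top-level subdirectory, so each run builds a much smaller file list:

    # One rsync run per top-level subdirectory keeps each file list small.
    # (Assumes the same top-level directories already exist locally.)
    cd /backup/attachments
    for d in */; do
        rsync -a "einswp:attachments/$d" "$d"
    done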

OS defaults aren't made for this situation.

If you are using Linux/FreeBSD/etc. with all the defaults, performance will be terrible for all your applications. The defaults assume smaller directories so as not to waste RAM on oversized caches.

Tune your filesystem to better handle large directories: Do large folder sizes slow down IO performance?

Look at the "namei cache"

BSD-like operating systems have a cache that accelerates looking up a name to the inode (the "namei cache"). There is a namei cache for each directory. If it is too small, it is more of a hindrance than an optimization. Since rsync does an lstat() on each file, the inode is accessed for every one of the 80k files. That might be blowing your cache. Research how to tune file/directory lookup performance on your system.
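
On Linux (an assumption here; the BSDs have their own knobs) the equivalent is the dentry/inode cache, which you can inspect and bias toward keeping metadata in RAM:

    # Current dentry cache statistics
    cat /proc/sys/fs/dentry-state
    # vfs_cache_pressure defaults to 100; lower values make the kernel
    # prefer keeping dentry/inode caches over page cache.
    sysctl vm.vfs_cache_pressure
    sudo sysctl vm.vfs_cache_pressure=50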

Consider a different file system

XFS was designed to handle larger directories. See Filesystem large number of files in a single directory

Maybe 5 minutes is the best you can do.

Consider calculating how many disk blocks are being read, and calculate how fast you should expect the hardware to be able to read that many blocks.

Maybe your expectations are too high. Consider how many disk blocks must be read to do an rsync with no changed files: each server will need to read the directory and read one inode per file. Let's assume nothing is cached because, well, 80k files has probably blown your cache. Let's say that it is 80k blocks to keep the math simple. That's about 40M of data, which should be readable in a few seconds. However if there needs to be a disk seek between each block, that could take much longer.

So you are going to need to read about 80,000 disk blocks. How fast can your hard drive do that? Considering that this is random I/O, not a long linear read, 5 minutes might be pretty excellent. That's 300 seconds / 80,000 reads, or a disk read every 3.75 ms. Is that fast or slow for your hard drive? It depends on the model.
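
A quick back-of-the-envelope check of that number (the 80,000-read count from above is the assumption):

    # 5 minutes = 300 s spread over ~80,000 random reads
    awk 'BEGIN { printf "%.2f ms per read\n", 300 * 1000 / 80000 }'    # prints: 3.75 ms per read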

Benchmark against something similar

Another way to think about it is this. If no files have changed, ls -lLR does the same amount of disk activity but never reads any file data (just metadata). The time ls -lLR takes to run is your upper bound (see the timing sketch after the list below).

  • Is rsync (with no files changed) significantly slower than ls -lLR? Then the options you are using for rsync can be improved. Maybe -c is enabled, or some other flag that reads more than just directories and metadata (inode data).

  • Is rsync (with no files changed) nearly as fast as ls -lLR? Then you've tuned rsync as well as you can. You have to tune the OS, add RAM, get faster drives, change filesystems, etc.
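
A rough way to run that comparison, pulling from the remote host as in the question (the local path is a placeholder):

    # Metadata-only traversal of the remote tree
    time ssh einswp 'cd attachments && ls -lLR > /dev/null'
    # rsync with nothing changed: builds the file list and compares
    # size/mtime, but transfers no file data
    time rsync -a --dry-run einswp:attachments/ /backup/attachments/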

Talk to your devs

80k files is just bad design. Very few file systems and system tools handle such large directories very well. If the filenames are abcdefg.txt, consider storing them in abcd/abcdefg.txt (note the repeated prefix). This breaks the directories up into smaller ones, but doesn't require a huge change to the code.
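
A sketch of that kind of sharding in plain bash (the directory names and the four-character prefix length are arbitrary choices for illustration):

    # Move files from a flat directory into per-prefix subdirectories.
    cd /data/attachments/flat
    for f in *; do
        [ -f "$f" ] || continue        # skip anything that isn't a regular file
        prefix=${f:0:4}                # e.g. abcdefg.txt -> abcd
        mkdir -p "../sharded/$prefix"
        mv -- "$f" "../sharded/$prefix/"
    done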

Also.... consider using a database. If you have 80k files in a directory, maybe your developers are working around the fact that what they really want is a database. MariaDB or MySQL or PostgreSQL would be a much better option for storing large amounts of data.

Hey, what's wrong with 5 minutes?

Lastly, is 5 minutes really so bad? If you run this backup once a day, 5 minutes is not a lot of time. Yes, I love speed. However, if 5 minutes is "good enough" for your customers, then it is good enough for you. If you don't have a written SLA, how about an informal discussion with your users to find out how fast they expect the backups to be?

I assume you wouldn't have asked this question if there weren't a need to improve the performance. However, if your customers are happy with 5 minutes, declare victory and move on to other projects that need your effort.

Update: After some discussion we determined that the bottleneck is the network. I'm going to recommend 2 things before I give up :-).

  • Try to squeeze more bandwidth out of the pipe with compression. However, compression requires more CPU, so if your CPU is overloaded, it might make performance worse. Try rsync with and without -z, and configure ssh with and without compression. Time all four combinations to see if any of them performs significantly better than the others (see the sketch after this list).
  • Watch network traffic to see if there are any pauses. If there are pauses, you can find what is causing them and optimize there. If rsync is always sending, then you really are at your limit. Your choices are:
    • a faster network
    • something other than rsync
    • move the source and destination closer together. If you can't do that, can you rsync to a local machine then rsync to the real destination? There may be benefits to doing this if the system has to be down during the initial rsync.
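
The compression test from the first bullet could be timed like this (pull direction and paths as before):

    # All four combinations of rsync-level (-z) and ssh-level compression
    time rsync -a  -e 'ssh -o Compression=no'  einswp:attachments/ /backup/attachments/   # neither
    time rsync -az -e 'ssh -o Compression=no'  einswp:attachments/ /backup/attachments/   # rsync -z only
    time rsync -a  -e 'ssh -o Compression=yes' einswp:attachments/ /backup/attachments/   # ssh only
    time rsync -az -e 'ssh -o Compression=yes' einswp:attachments/ /backup/attachments/   # both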
TomOnTime
  • 80K is a lot of files.: There are 80k files in one directory **tree**. Each single directory does not have more than 2k files/subdirectories. – guettli Jan 07 '16 at 08:50
  • Check your rsync version: done. Make sure --checksum is not being used: done. Split the job into small batches: thank you, I will have a look at Gigasync. OS defaults aren't made for this situation: done (the bottleneck is the network, not the OS). Look at the "namei cache": done (it is the net, not the OS). Consider a different file system: again the net, not the OS. Maybe 5 minutes is the best you can do: I think it could be much faster. Talk to your devs (use a DB): this would be a giant change. Maybe a filesystem with better backup support would solve it. – guettli Jan 07 '16 at 08:58
  • 2k files per directory is a lot better. thank you for the update. You hadn't mentioned that the network was slow. Is it low bandwidth, high latency, or both? rsync usually performs well on high latency links (it was developed by someone working on his PhD from Australia while dealing with computers in the U.S.). Try doing that "ls -lLR" over ssh and time how long it takes to transmit the result. "time ssh remotehost 'cd /dest && ls -lLR' >/tmp/list". Make sure the /tmp/list gets created on the local host. – TomOnTime Jan 07 '16 at 13:40
  • yes, the network is slow. It is a pity. – guettli Jan 07 '16 at 15:40
  • How slow? If you use "scp" to copy a 100M file, how long does it take? Also, what is the output of "time ssh remotehost 'cd /dest && ls -lLR' >/tmp/list"? – TomOnTime Jan 07 '16 at 15:41
  • I updated the question with details about the slowness. – guettli Jan 07 '16 at 16:01
  • (sorry, I didn't see the update). Hmm... on a 1.5M link: when I do a similar benchmark, if "ls -lLR" takes 1 second, rsync takes about 1.3 seconds. A 30% overhead for the additional processing that rsync is doing seems reasonable. What is the ratio for you? If you do a "tcpdump", do you see network traffic the entire time, or a pause? If there is a pause, then the disks are your limit. If there are no pauses, it is the network. – TomOnTime Jan 07 '16 at 16:11
  • I ran the `time ls -lLR` command on the remote, too, and it took 0.0 secs. I guess the disks are not the limit. – guettli Jan 07 '16 at 16:54
  • I added an update to the end of my answer. – TomOnTime Jan 07 '16 at 17:08
  • your hint "move the source and destination closer together" is our current solution. I guess rsync could be optimized to detect unchanged directory trees, but we will run a second server next to the source and rsync over the long and slow connection only once a week. Thank you very much for your constant engagement. You helped me. – guettli Jan 08 '16 at 10:00
  • Um, 80k files is not a lot of files at all. My dinky personal backup has directories with this many files. – mlissner Mar 08 '18 at 00:54
  • I had to LOL at "80k files is huge". I'm trying to migrate data from an NFS filer to GCP; one directory alone has 535k first-level folders, each of these has 255 sub-level folders, each of those has 255 sub-level folders, followed by anywhere from 1000-400k files beneath it. Not counting any files, that's 34.7M folders in one directory... I LOL hard at your "80k files". – Evan R. Mar 14 '19 at 18:39
  • This *lengthy* answer is based on a miscommunication, and as such the advice it gives is not very useful. – Gringo Suave Mar 22 '19 at 21:18
  • I would say that it _is_ in fact a useful answer as it gives many tips on diagnosing and improving the performance of rsync. – DaVince Sep 27 '21 at 17:15
5

You can also try lsyncd, which runs rsync only when changes are detected on the filesystem, and only for the changed subdirectories. I've been using it for directories with up to two million files on a decent server.
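
A minimal way to start it for a setup like the one in the question is lsyncd's built-in rsync-over-ssh mode (the source path, backup hostname, and target directory below are placeholders):

    # Run on the server that holds the data: watch the tree with inotify
    # and push changes to the backup host via rsync over ssh.
    lsyncd -rsyncssh /data/attachments backuphost /backup/attachments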

Juanga Covas
3

I think that 80k files is nothing extraordinary today.

My explanation for the problem lies in the way rsync works: see here. They say: "While it is being built, each entry is transmitted to the receiving side in a network-optimised way." This leads to write-stop-write-stop-write sending over the network, which is supposedly inferior to preparing the full data first and then sending it over the network at full speed. The write-stop-write-stop-write sequence might require many more network round trips, in the worst case even 80k of them ...

See the information about TCP packet handling, Nagle's algorithm and so on. This also corresponds with empirical evidence: when designing a system that processes bulk data, one should use batch techniques rather than the techniques of real-time systems that process each item/record individually.

I did a practical test with a sync program that indeed works batch-wise: the local synchronizer Zaloha.sh has recently been extended to allow remote backup: Zaloha2.sh. Get it from Fitus/Zaloha.sh; the new version is under the Zaloha2.sh link near the "three cats".

The program works by running find on the directories to obtain CSV files. The find on the remote directory runs in an ssh session, and after it finishes, the CSV file is downloaded to the local system by scp. The find on the local directory runs locally. The two CSV files are then compared locally using GNU sort and mawk.
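
The same batch idea can be sketched with standard tools (this is not Zaloha2.sh itself; it assumes GNU find on both sides, and the paths are placeholders):

    # One metadata listing per side, each produced in a single pass,
    # then compared locally -- no per-file network round trips.
    ssh einswp 'cd attachments && find . -printf "%p\t%s\t%T@\n" | sort' > /tmp/remote.lst
    ( cd /backup/attachments && find . -printf "%p\t%s\t%T@\n" | sort ) > /tmp/local.lst
    diff /tmp/local.lst /tmp/remote.lst    # differing lines = candidates to transfer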

I have chosen one of my directories to most closely match 80k files (in fact it is nearly 90k files and 3k directories). The hardware used during the test is nothing special or "bleeding edge": an eight-year-old notebook running Linux, and a desktop PC of approximately the same age, also running Linux, serving as the remote backup host. The link between them is a plain vanilla home Wi-Fi network.

The notebook has its data on a USB-connected (!) external HDD; the desktop PC has its data on an internal HDD.

The data is in a synchronized state (same condition as yours) except for one out-of-sync file (to prove that Zaloha2.sh indeed detects it).

Results of the practical test:

The find scan of the USB-connected external HDD took 1 minute and 7 seconds. The find scan of the internal HDD took 14 seconds. The scp-transferring of the CSV files over the Wi-Fi and their sort and mawk processing took 34 seconds.

Overall: 1 minute and 56 seconds. The one differing file was indeed detected.

Interestingly, when running the whole test again, both finds finished almost immediately. I assume that this is due to caching of directory data in the Linux kernels.

The second test lasted just 35 seconds ...

Hope this helps.

Petas
2

No, that's not possible with rsync and it would be quite inefficient in another regard:

Normally, rsync only compares file modification dates and file sizes. Your approach would force it to read and checksum the content of all files twice (once on the local and once on the remote system) to find changed directories.

Sven
  • AFAIK rsync checks mtime and size. If both match, the file is not transferred again (at least in the default settings). It would be enough to send the hash of the tuples (filename, size, mtime). There is no need to checksum the content. – guettli Jan 04 '16 at 10:59
  • Yes, you are correct, but anyway, `rsync` doesn't do this. – Sven Jan 04 '16 at 10:59
  • You're both right. rsync reads the contents of every file only with the "-c" option. Otherwise, it compares the filename,size,mtime tuple and skips reading the file contents if they match. – TomOnTime Sep 27 '21 at 21:03
2

For synchronisation of large numbers of files (where little has changed), it is also worth setting noatime on the source and destination partitions. This saves writing access times to the disk for each unchanged file.
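
For example (the device and mount point are placeholders):

    # Remount an existing filesystem with noatime right away
    sudo mount -o remount,noatime /data
    # Make it persistent via /etc/fstab, e.g.:
    # /dev/sdb1  /data  ext4  defaults,noatime  0  2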

  • Yes, the noatime option makes sense. We have been using it for several years. I guess an alternative to rsync is needed. – guettli Aug 28 '17 at 07:07
2

Use rsync in daemon mode at the server end to speed up the listing/checksum process.

Note that the connection isn't encrypted, but it may be possible to tunnel it without losing the improvement in listing performance.

Also, having rsync do the compression rather than ssh should improve performance.
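
A minimal sketch of such a setup (the module name and paths are assumptions):

    # On the server: a read-only module in /etc/rsyncd.conf
    #   [attachments]
    #       path = /data/attachments
    #       read only = yes
    rsync --daemon

    # On the client: pull over the rsync protocol (TCP 873), with
    # compression done by rsync (-z) instead of ssh
    rsync -az rsync://einswp/attachments/ /backup/attachments/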

Gringo Suave