7

I have a large directory called servers that contains many hard links created by rsnapshot. The structure is more or less like this:

./servers
./servers/daily.0
./servers/daily.0/file1
./servers/daily.0/file2
./servers/daily.0/file3
./servers/daily.1
./servers/daily.1/file1
./servers/daily.1/file2
./servers/daily.1/file3
...

The snapshots were created with rsnapshot in a space-saving way: if /servers/daily.0/file1 is the same as /servers/daily.1/file1, both names point to the same inode via a hard link, instead of a complete copy of the snapshot being made every cycle.
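
A quick way to confirm that two snapshot copies really do share an inode is to compare inode numbers; this sketch uses the example paths above, so adjust them to your layout:

# If the inode numbers (first column) match, the files are hard links to the same data.
ls -li servers/daily.0/file1 servers/daily.1/file1

# Or print just the inode numbers:
stat -c '%i' servers/daily.0/file1 servers/daily.1/file1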

I've tried to copy it with the hard links structure, in order to save space on the destination drive, using:

nohup time rsync -avr --remove-source-files --hard-links servers /old_backups

After some time, rsync freezes: no new lines are added to nohup.out, and no files seem to be moving from one drive to the other. Removing the nohup didn't solve the problem.

Any idea what's wrong?

Adam

Adam Matan

3 Answers

14

My answer, which I give from hard-earned experience, is: Don't do this. Don't try to copy a directory hierarchy that makes heavy use of hard links, such as one created using rsnapshot or rsync --link-dest or similar. It won't work on anything but small datasets. At least, not reliably. (Your mileage may vary, of course; perhaps your backup datasets are much smaller than mine were.)

The problem with using rsync --hard-links to recreate the hard-linked structure of files on the destination side is that discovering the hard-links on the source side is hard. rsync has to build a map of inodes in memory to find the hard-links, and unless your source has relatively few files, this can and will blow up. In my case, when I learned of this problem and was looking around for alternate solutions, I tried cp -a, which is also supposed to preserve the hard-link structure of files in the destination. It churned away for a long time and then finally died (with a segfault, or something like that).
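
To get a rough sense of how much link-tracking work rsync is in for, you can count the source-side files that have more than one link. A sketch assuming GNU find and the servers directory from the question:

# Number of file entries that belong to some hard-link group:
find servers -type f -links +1 | wc -l

# Number of distinct inodes behind those entries (each is a row in rsync's in-memory map):
find servers -type f -links +1 -printf '%i\n' | sort -u | wc -l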

My recommendation is to set aside an entire partition for your rsnapshot backup. When it fills up, bring another partition online. It is much easier to move around hard-link-heavy datasets as entire partitions, rather than as individual files.
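
If you do have to migrate an existing rsnapshot partition, copying it at the block level sidesteps the hard-link mapping entirely. A minimal sketch, assuming the backups live on /dev/sdb1, the destination partition /dev/sdc1 is at least as large, and both are unmounted (the device names are placeholders; status=progress needs a reasonably recent GNU dd):

# Block-for-block copy of the whole filesystem, hard links included:
dd if=/dev/sdb1 of=/dev/sdc1 bs=64M status=progress

Tools like partclone, or moving the data with LVM's pvmove, achieve the same result while skipping unused blocks, but the principle is the same: move the filesystem, not the individual files.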

Steven Monday
  • Plus one to that. You may also want to consider a different approach to the problem, like `rdiff-backup`, which uses binary diffs instead of hardlinks. (It's got some problems of its own, unfortunately.) – mattdm Dec 01 '10 at 03:30
  • The `hardlink` program can search for identical files and hardlink them, but it requires them to all have exactly the same attributes (size, contents, permission, owner, group, etc.) to work properly. I was able to use it to relink several tens of gigabytes of a music backup in about half an hour. – Robbie Jul 07 '12 at 08:10
  • What do you mean by "blow up"? Exponential slowness? Or does an error occur to let you know that it's not going to work? – Steve Pitchers Jul 22 '14 at 11:16
  • Even COW file systems (LVM/ZFS/Btrfs) get slow with many copies, but they are more robust than hard links. – user1133275 Dec 02 '19 at 17:10
7

At the point where rsync seems to hang, is it actually hung or just busy? Check for CPU activity with top and for disk activity with iotop -o.
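
For example (iotop usually needs root):

top            # is an rsync process burning CPU?
sudo iotop -o  # show only processes currently doing disk I/O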

It could be busy copying over a large file. You would see this in iotop or similar, or in rsync's display if you ran it with the --progress option.

It could also be busy scanning through lists of inodes to check for linked files. If incremental recursion is being used, which is the default for recursive transfers in most cases if both client and server have rsync v3.0.0 or later, it could have just hit a directory with many files and be running the link check between all the files in it and all those found previously. The --hard-links option can be very CPU intensive over large sets of files (this is why it is not included in the list of options implied by the general --archive option). This will manifest itself as high CPU use at the time rsync seems paused/hung.
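
To tell "busy" from "hung", it can help to re-run the transfer with progress output. A sketch based on the command and paths from the question, with --progress added (-a already implies -r):

rsync -av --hard-links --remove-source-files --progress servers /old_backups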

David Spillett
  • Many people report that it's actually hung. It makes a lot of `stat()` system calls (see https://bugzilla.samba.org/show_bug.cgi?id=10678#c1); for the end user, the difference between very slow and completely stopped is probably non-existent. – Yaroslav Nikitenko Feb 17 '21 at 18:24
  • @YaroslavNikitenko - many people *incorrectly* report that it is actually hung. It *has* to make all those `stat()` calls and keep track of the extra information (consuming memory and CPU resource) to do the job it is being asked to do when run over a large tree with those options enabled. Perhaps there could be more progress information while it churns over the required processing, though that wouldn't help in many scripted circumstances when rsync is told to run quiet unless there are actual errors to report. – David Spillett Feb 18 '21 at 10:47
-1

I had the same problem. It was solved by adding the --no-inc-recursive option.

From https://download.samba.org/pub/rsync/rsync.html:

If incremental recursion is active (see --recursive), rsync may transfer a missing hard-linked file before it finds that another link for that contents exists elsewhere in the hierarchy.

This does not affect the accuracy of the transfer (i.e. which files are hard-linked together), just its efficiency (i.e. copying the data for a new, early copy of a hard-linked file that could have been found later in the transfer in another member of the hard-linked set of files).

One way to avoid this inefficiency is to disable incremental recursion using the --no-inc-recursive option.
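
Applied to the command from the question, that looks roughly like this (-a already implies -r, and --hard-links is the long form of -H):

rsync -av --hard-links --no-inc-recursive --remove-source-files servers /old_backups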

Andrey