39

I am working on an rsync script for directory replication. It syncs only new and modified files or directories, but when a file or directory is renamed in the source, rsync copies it as a new item and leaves the old name in the destination, so the two trees drift out of sync. I have also set a bandwidth limit of about 1 MB/s (--bwlimit=1024, in KiB/s), since this will run during business hours.

Here's my script:

rsync -zvru --bwlimit=1024  /mymounts/test1/ /mymounts/test2

How can I keep the files and directories in sync if someone renames something and still only copy new or modified files?

Here are the files in question:

ls "/mymounts/test1/some stuff"
new directory  newfile1.txt  newfile3.txt  renamedFile.txt

ls "/mymounts/test2/some stuff"
new directory  newfile1.txt  newfile2.txt  newfile3.txt  renamedFile.txt

Alternatively, is there a way to move the renamed files to another directory, say /mymounts/VerControl?

Matthias Braun
jmituzas

6 Answers

32

You may want to look at rsync's -y | --fuzzy option. Other than that, rsync has no way of tracking renames, so you'll end up transferring the renamed file again.

From rsync manpage:

   -y, --fuzzy
          This option tells rsync that it should look for a basis file for
          any  destination  file  that  is missing.  The current algorithm
          looks in the same directory as the destination file for either a
          file  that  has  an identical size and modified-time, or a simi-
          larly-named file.  If found, rsync uses the fuzzy basis file  to
          try to speed up the transfer.
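
A minimal sketch of how the option might be applied to the question's case. The helper name sync_fuzzy is made up for illustration; the flags are standard rsync options. The man page also warns that --delete can remove potential fuzzy-match files before they are used, which is why --delete-after is paired with --fuzzy here. The renamed file is still transferred; --fuzzy only gives rsync's delta algorithm a basis file to work from.

```shell
#!/bin/sh
# sync_fuzzy is a hypothetical helper name, not part of rsync.
sync_fuzzy() {
    src=$1
    dst=$2
    # -a              archive mode (recursion, times, permissions, ...)
    # --fuzzy         look in the destination directory for a basis file
    # --delete-after  remove extraneous destination files only after the
    #                 transfer, so the old name can still serve as the basis
    rsync -a --fuzzy --delete-after "$src"/ "$dst"/
}
```

With the paths from the question this would be called as: sync_fuzzy /mymounts/test1 /mymounts/test2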
artyom
  • 13
    Note that, if using `--fuzzy`, you should consider coupling it with `--delete-delay`, as *"rsync by default does a `--delete-before`, thus removing the base file before it can be copied/moved"*. Source: [Sonia Hamilton](http://www.snowfrog.net/2011/03/03/fuzzy-rsync/) – Ronan Jouchet Oct 23 '17 at 14:13
  • 4
    You might also need --delay-updates. I was able to use `--fuzzy --delay-updates --delete-delay` together to NOT copy a renamed dir – Alec Istomin Jan 05 '20 at 06:05
  • 2
    [Note that](https://unix.stackexchange.com/a/620080/59989): a rename of `/test/10GBfile` to `/test/otherdir/10GBfile_newname` would still resend the data, since it's not in the same directory. – Basj Nov 17 '20 at 12:58
28

You can handle moved and renamed files with rsync if the filesystems of the source and target directories support hard links. The idea is to let rsync reconstruct hard links before the real transfer. You can find a brilliant explanation here.

We ended up with a simple solution that creates a hidden tree of hard links inside the source and target directories. The basic script could look like this:

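# Directories to sync (the paths from the question; adjust as needed)
Source="/mymounts/test1"
Target="/mymounts/test2"

# Name of the hidden directory that holds the hard-link tree
Shadow=".rsync_shadow"

# Do the real sync. -H preserves hard links, so a file renamed in the source
# can be re-linked on the target via its shadow entry instead of being re-sent.
rsync -ahHv --stats --no-inc-recursive --delete --delete-after "$Source"/ "$Target"

# Update/create the hidden tree of hard links in the source
rsync -a --delete --link-dest="$Source" --exclude="/$Shadow" "$Source"/ "$Source/$Shadow"

# Update/create the hidden tree of hard links in the target
rsync -a --delete --link-dest="$Target" --exclude="/$Shadow" "$Target"/ "$Target/$Shadow"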

I have an example script on GitHub, but I advise you to do a large amount of testing before using this method in production.
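
To illustrate why the shadow tree helps, here is a small sketch that uses only coreutils, no rsync. It assumes a throwaway directory from mktemp and reuses the .rsync_shadow name from the script above; the file names are borrowed from the question. A hard link in the shadow tree shares its inode with the original file, and a rename does not change the inode, so the renamed file is still recognisably "the same file" as its shadow entry.

```shell
#!/usr/bin/env bash
# Throwaway demo layout (hypothetical); nothing here touches real data.
demo=$(mktemp -d)
mkdir -p "$demo/src/.rsync_shadow"

echo "payload" > "$demo/src/newfile2.txt"

# Roughly what the shadow-update step amounts to for an unchanged file:
# a hard link, not a copy.
ln "$demo/src/newfile2.txt" "$demo/src/.rsync_shadow/newfile2.txt"

# A user renames the file; the inode, and therefore the link, is unchanged.
mv "$demo/src/newfile2.txt" "$demo/src/renamedFile.txt"

# -ef is true when both paths refer to the same device and inode.
if [ "$demo/src/renamedFile.txt" -ef "$demo/src/.rsync_shadow/newfile2.txt" ]; then
    same_inode=yes
else
    same_inode=no
fi
echo "renamed file still linked to its shadow entry: $same_inode"
```

On the next sync, rsync -H sees both names as links to one inode. Since the target already holds the shadow entry, the expectation is that rsync recreates the link there rather than sending the data again.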

Giacomo1968
dparoli
  • Also the “brilliant explanation” you link to also refers to this great basic `rsync` article, FWIW. http://everythinglinux.org/rsync/ – Giacomo1968 May 17 '14 at 15:17
  • I guess "support for hard links" includes folders? Most current systems do not allow that: https://askubuntu.com/questions/210741/why-are-hard-links-not-allowed-for-directories – Martian2020 Nov 30 '21 at 11:57
  • I gave it some thought, and as I understand it, it won't detect a file renamed in the destination, only in the source (and I have not found an easy fix for that). So it is not good for sync, only for backup (your script has the word "backup" in its description), but the question was about sync. – Martian2020 Nov 30 '21 at 13:18
  • @Martian2020 As I warned, my answer has limited uses and needs testing; I never claimed it to be "the answer". Going from "it is not the answer" to "it is not an answer" seems a long step to me. The exact difference between "sync" and "backup" escapes me without a practical example, so I can only speak in general. The solution uses rsync, which is a sync and backup tool, so yours is, in the end, an objection to using rsync as a synchronization tool. As for hard links: in my case the script did not need hard links for directories, but each filesystem has to be tested for that. – dparoli Dec 02 '21 at 10:58
5

Here is a tool that is designed to work before rsync is run: rsync-sidekick

It propagates the following changes from the source directory to the destination directory (or any combination of them):

  1. Change in file modification timestamp
  2. Rename of file/directory
  3. Moving a file from one directory to another

Using this tool saves you from the painful situation where renaming a directory leads to transferring GBs of files.

Disclaimer: I'm the author of the above tool.

Manu Manjunath
  • Good idea, I want to handle renames. I looked in the README (since the source seems long) for the criteria used to decide which files were renamed, but I have not found that info. Where is it in the code, then? – Martian2020 Nov 30 '21 at 05:05
  • @Martian2020 Can you create a Github issue? – Manu Manjunath Dec 01 '21 at 06:26
  • This tool is a great approach, thanks! Sceptical sysadmins like me should read [GitHub issue #1](https://github.com/m-manu/rsync-sidekick/issues/1). Many points I was worried about are mentioned there and solved in a sensible way. @ManuManjunath if you put these points into an FAQ section of the docs, then your tool will have a great future, I guess. – rudimeier Feb 05 '22 at 16:51
4

While this is an old post, it was the first discussion I found when searching, so I thought I'd share a solution. Since about 2010, there have been patches to rsync that detect renames. One patch that worked for me is here.

So, if you can patch your rsync in your environment, you can use rsync --detect-renamed or --detect-renamed-lax.

Derek Frye
  • If you sync, does rsync know which side has the renamed file (maybe by checking with `stat file`, looking at the `change` timestamp, and not renaming if the destination timestamp is later than that of the source)? – Martian2020 Nov 30 '21 at 12:58
1

You can't.

Rsync has three modes,

  • Copy all files
  • Copy existing (modified/non-modified) files -- on this you have many options
  • Copy non-existing files.

However these three modes have two subcategories,

  • Exclude on
  • Include on

Rsync doesn't track which files are renamed; it has no state. Consider instead copying all files and excluding the ones you don't want. You can't have a shifting whitelist, but you can have a blacklist.

rsync [..stuff..] --exclude 'lib/'
Evan Carroll
  • gotcha thanks for the input will have to live with the renamed files then. – jmituzas Mar 19 '13 at 15:31
  • You can have it delete the files that don't exist, that's perfectly fine. But as a function of copy, if you rename on source, it will copy the renamed file. – Evan Carroll Mar 19 '13 at 15:34
  • I can keep the versions with -b and set suffix=`date +%Y%m%d%k%M%S`. Is there any way I can get suffix=`date +%Y%m%d%k%M%S`.ORIGINALSUFFIX? – jmituzas Mar 19 '13 at 20:32
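
On keeping old versions: a hedged sketch that combines -b with --backup-dir, so that files rsync would overwrite or delete in the destination are moved aside with a timestamp suffix instead. The helper name mirror_keep_versions is an assumption made for illustration; the flags are standard rsync options. Note that rsync appends the suffix after the full original name, so the original extension ends up before the timestamp, not after it.

```shell
#!/bin/sh
# mirror_keep_versions is a hypothetical helper name, not part of rsync.
mirror_keep_versions() {
    src=$1
    dst=$2
    versions=$3   # directory that collects replaced or deleted files
    mkdir -p "$versions"
    # -a            archive mode
    # --delete      remove destination files that are gone from the source
    # -b            back up files instead of overwriting or deleting them
    # --backup-dir  where those backups are stored
    # --suffix      appended to each backed-up file name
    rsync -a --delete -b --backup-dir="$versions" \
        --suffix=".$(date +%Y%m%d%H%M%S)" "$src"/ "$dst"/
}
```

With the directories named in the question this would be called as: mirror_keep_versions /mymounts/test1 /mymounts/test2 /mymounts/VerControl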
0

As far as I know, rsync cannot recognize renamed files. The files will have to be transferred again under their new names. See artyom's answer.

To delete files that are gone from the source, make sure to use the --delete option.

Also, for mirroring, I would recommend using -a (archive), which is shorthand for a set of useful options.

Have a look at man 1 rsync for details.
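
Put together, a mirroring call along these lines might look like the sketch below. The helper name mirror_dir is hypothetical; --bwlimit=1024 is carried over from the question. Because --delete removes files from the destination, a first run with --dry-run added is a sensible precaution.

```shell
#!/bin/sh
# mirror_dir is a hypothetical helper name, not part of rsync.
mirror_dir() {
    src=$1
    dst=$2
    # -a              archive mode: recurse, preserve times, permissions, symlinks
    # --delete        remove destination files that no longer exist in the source
    # --bwlimit=1024  same bandwidth cap as in the question
    rsync -a --delete --bwlimit=1024 "$src"/ "$dst"/
}
```

For the question's directories: mirror_dir /mymounts/test1 /mymounts/test2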

Lukas