Are cp/rsync asynchronous?


We're running a backup script which first copies a file to a destination and then runs tar over it.

DIR2BCK='/foo/bar'
TMPDIR=$(mktemp -d)
rsync -a ${DIR2BCK} ${TMPDIR}/ > /dev/null 2>&1
tar czf /tmp/foo.backup.tar ${TMPDIR}

After running this last command, sometimes the following warning is shown:

/tmp/tmp.blqspkA136: file changed as we read it

We copy the directory to a temporary location first precisely to avoid file changes at compression time. This behavior is also reproducible when using the cp command instead of rsync. All my life I thought these commands were synchronous, but this warning seems to suggest otherwise.

If I put a sleep command between the rsync/cp and the tar lines, the warning doesn't show up, but I consider this a not quite clean solution.

Some facts:

  • I tried adding a sync command between the rsync and tar commands, with the same result.
  • As suggested by @jcbermu I also tried changing the script so the two lines are:

    rsync -a ${DIR2BCK} ${TMPDIR}/ > /dev/null 2>&1 &
    wait
    

    I ran the script several times, and some of the runs showed the same behavior, claiming the file changed while it was being read.

  • The filesystem used is EXT4 both for ${TMPDIR} and ${DIR2BCK}.

  • ${DIR2BCK} is on a remote filesystem; it is actually a Samba mount point of a remote machine. ${TMPDIR} is on the local filesystem. However, changing ${DIR2BCK} to the local filesystem makes no difference.
  • All filesystems are hardware RAID-5 based.

Are these commands actually synchronous? If not, is there a way to make them so, or an alternative command?

nKn

Posted 2017-09-29T11:18:14.173

Reputation: 4 960

Yes, these commands are done when they exit. You are not quoting file/directory names. Please fix that ASAP. – Daniel B – 2017-09-29T11:43:43.057

@DanielB My mistake. Although the logic is the same, this is not the actual code. The real one includes the quotes. – nKn – 2017-09-29T11:49:27.980

Well then perhaps there is a subtle mistake in the real code. Or is the code above actually enough to reproduce the issue? – Daniel B – 2017-09-29T11:54:17.977

Yes, it should be enough, as paths are correct. The major issue is that when tar starts, rsync/cp don't seem to have ended copying yet, and the warning is shown. I assumed these commands are synchronous, that's why I'm surprised. – nKn – 2017-09-29T11:58:38.113

Don’t assume, try. Otherwise, a qualified answer cannot be provided. // sync is not helping here. You’re not accessing raw data on devices. // Please also provide information about which filesystems both $TMPDIR and $DIR2BCK reside on. – Daniel B – 2017-09-29T20:06:57.137
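One way to test the assumption rather than rely on it is to compare the source and the copy immediately after the copy command returns. The sketch below uses hypothetical throwaway directories, and cp -a stands in for rsync -a; neither comes from the original script. If diff -r reports no differences, the copy was complete by the time the next command started.

```shell
#!/bin/sh
# Hypothetical throwaway directories so the sketch is self-contained.
SRC=$(mktemp -d)
WORK=$(mktemp -d)
echo "sample data" > "$SRC/file.txt"
mkdir "$SRC/sub"
echo "more data" > "$SRC/sub/other.txt"

# The shell does not run the next line until cp has exited.
cp -a "$SRC" "$WORK"/

# diff -r exits with 0 when both trees have identical contents,
# and with 1 when they differ.
diff -r "$SRC" "$WORK/$(basename "$SRC")"
diff_status=$?
echo "diff exit status: $diff_status"

rm -rf "$SRC" "$WORK"
```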

Are both $TMPDIR and $DIR2BCK on the same filesystem, or are they on separate drives or partitions? Are you using regular hard drives/SSDs or some sort of RAID? – arielnmz – 2017-10-02T17:09:03.093

@arielnmz They are on separate filesystems; ${TMPDIR} is actually a samba share on a different machine, but this also happened when both were on the same filesystem. All filesystems are on RAID5. – nKn – 2017-10-02T17:24:07.007

Answers


One solution is to rewrite it as:

rsync -a ${DIR2BCK} ${TMPDIR}/ > /dev/null 2>&1 ; tar czf foo.backup.tar ${TMPDIR}

So, tar won't start until rsync ends.
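As a minimal sketch of that sequencing, the copy and the archive step can also be chained with && so that tar runs only if the copy exits successfully. The paths below are throwaway directories created just for illustration, cp -a stands in for rsync -a, and the quoting and the exit-status check are additions; none of this is taken from the original script.

```shell
#!/bin/sh
# Hypothetical throwaway directories so the sketch is self-contained.
SRC=$(mktemp -d)
WORK=$(mktemp -d)
OUT=$(mktemp -d)
echo "sample data" > "$SRC/file.txt"

# cp -a stands in for rsync -a here.
# With &&, tar is started only after the copy has exited with status 0.
cp -a "$SRC" "$WORK"/ && tar czf "$OUT/foo.backup.tar" -C "$WORK" .
status=$?
echo "exit status: $status"

# Count the archive entries that match the copied file name.
entries=$(tar tzf "$OUT/foo.backup.tar" | grep -c 'file.txt')
echo "matching entries: $entries"

rm -rf "$SRC" "$WORK" "$OUT"
```

A plain ; or a newline also makes the shell wait for the copy to exit before starting tar; && additionally skips tar when the copy fails.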


The other solution is to send the cp/rsync to the background and then use the wait command to wait until it finishes.

For example:

rsync -a ${DIR2BCK} ${TMPDIR}/ > /dev/null 2>&1 &
wait
tar czf foo.backup.tar ${TMPDIR}

The trailing & on the rsync line sends the command to the background (it becomes a child of the current shell session), and wait then forces this shell session to wait until all of its children have finished before continuing.
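The background-and-wait pattern can be sketched in isolation as follows. This is only an illustration: the subshell with sleep is a stand-in for the long-running copy, and the marker file and variable names are hypothetical. Waiting on the specific PID in $! also makes the child's exit status available, which a bare wait does not report.

```shell
#!/bin/sh
# Hypothetical marker file used to observe when the background job is done.
MARKER=$(mktemp)

# Stand-in for the long-running copy: sleep briefly, then write to the marker.
( sleep 1; echo "copy finished" > "$MARKER" ) &
copy_pid=$!

# wait blocks until that specific child exits and returns its exit status.
wait "$copy_pid"
copy_status=$?

# By this point the background job has exited, so the marker is written.
marker_content=$(cat "$MARKER")
echo "child exit status: $copy_status, marker: $marker_content"

rm -f "$MARKER"
```

Note that a command started in the foreground is already waited for by the shell, so this pattern should not change the ordering compared with running rsync without the &.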

jcbermu

Posted 2017-09-29T11:18:14.173

Reputation: 15 868

The problem with the first approach is that it is equivalent to the one I'm using. Splitting the two commands into two lines is the same as a one-liner with ; as the separator. Neither should let the tar line start until the rsync/cp command has ended if these commands are synchronous, but that doesn't happen in this case. I'll try the second approach and get back. – nKn – 2017-09-29T12:07:38.753

Unfortunately, the change you proposed showed the same behavior in some runs. I edited my question to add this test. – nKn – 2017-09-30T09:57:16.353


If I put a sleep command between the rsync/cp and the tar lines, the warning doesn't show up, but I consider this a not quite clean solution.

Good for you for having standards. What happens if, instead of sleep, you use:

sudo sync; echo 3 | sudo tee /proc/sys/vm/drop_caches

Do you consider that to be a good solution?

Note: /proc/sys/vm/drop_caches seems to be what Ubuntu uses, and is not expected to be an approach that works on all Unixes (although maybe all Linuxes). I'm mentioning it after having read https://ubuntuforums.org/showthread.php?t=589975 and, after reading an initial report questioning the safety of doing this, reading more forum thread posts that confirm its safety.

TOOGAM

Posted 2017-09-29T11:18:14.173

Reputation: 12 651

I forgot to mention in my question, but I already tried the sync command with no success. I tried both a local filesystem and a remote one as the destination of the copy, but the same happens. The drop_caches approach is not good for us, as we use a lot of different OSes (CentOS, Ubuntu, Debian...). – nKn – 2017-09-29T16:06:17.750