Would rsync save a meaningful amount of data transfer for compressed/encrypted files?

1

Would rsync save a meaningful amount of data transfer when syncing

1) zip files,

2) ASCII armored GPG encrypted files, and

3) Mathematica .mx files,

respectively?

A typical scenario is that I already have an old copy and the compressed and/or encrypted file is NOT the only file to sync, i.e. uncompressed and unencrypted files can exist.

qazwsx

Posted 2012-02-01T17:17:37.013

Reputation: 6 739

Are you talking about the case where you already have an older copy of those files on the remote system? Or is this a brand new copy? – Zoredache – 2012-02-01T17:36:56.937

It is for the former case. – qazwsx – 2012-02-01T18:07:04.673

Answers

1

In the case where you are rsync'ing only one file and that file is encrypted or compressed, the only bandwidth you would likely save is that of not needing to transfer it at all if unchanged.

However, if you had a directory full of ZIP or JPEG or GPG files, rsync still only transfers those files that have changed, and is a great way to easily transfer only new files.

Note: I find it useful to rsync the uncompressed data whenever possible, and then compress it for storage on each side of the link if necessary. That way rsync can match the unchanged parts against the old copy and send only the differences, which saves transfer bandwidth. For example:

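# Unpack locally so rsync works on the uncompressed data
mkdir -p /tmp/torsync
cd /tmp/torsync || exit 1
unzip /home/me/somefile.zip
# -z only compresses the data in transit
rsync -avz . remote:/tmp/somefile
# Rebuild the archive on the remote side with paths relative to the synced tree
ssh remote 'cd /tmp/somefile && zip -r "$HOME/somefile.zip" .'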

YMMV of course.

OT: with its backup options, I find rsync useful even when it does not save bandwidth as it will create backup copies of replaced files, allowing me to retrieve historical copies easily.
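A minimal local sketch of the backup behaviour mentioned above, using rsync's `--backup` and `--backup-dir` options. The directory and file names are placeholders made up for the demonstration, and the script skips itself if rsync is not installed:

```shell
#!/bin/sh
# Sketch: keep the previous version of every file that rsync replaces.
# All directory and file names below are placeholders.
if command -v rsync >/dev/null 2>&1; then
    work=$(mktemp -d)
    mkdir "$work/src" "$work/dest"
    echo "first version" > "$work/src/notes.txt"
    rsync -a "$work/src/" "$work/dest/"

    # Change the file (different size, so rsync notices), then sync again
    # with backups enabled.
    echo "second version, with more text" > "$work/src/notes.txt"
    rsync -a --backup --backup-dir="$work/old-versions" "$work/src/" "$work/dest/"

    # The destination holds the new file; the replaced copy was moved aside.
    current=$(cat "$work/dest/notes.txt")
    previous=$(cat "$work/old-versions/notes.txt")
    echo "current:  $current"
    echo "previous: $previous"
    rm -rf "$work"
else
    echo "rsync not found; skipping demonstration"
fi
```

In real use the backup directory would normally live on the receiving side, for example one directory per day.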

Follow-up: this applies to all formats where compression or encryption is involved, but I'm not familiar with Mathematica's .mx files.

mikebabcock

Posted 2012-02-01T17:17:37.013

Reputation: 1 019

Does your description apply to all three types? – qazwsx – 2012-02-01T18:10:10.520

1

The problem with encrypted or compressed files is that even if only one byte of the underlying data is modified, typically everything in the output file from that point onward changes (and with encryption often the entire file), not just the changed byte.

This defeats one strategy used by rsync to reduce data transfer - namely only transferring the changed sections of a file and not the whole file.
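The effect can be measured locally. The following is a rough sketch, assuming rsync, gzip and seq are installed; the test data and file names are invented for the demonstration. It asks rsync how many bytes of "literal data" it had to send after a one-line edit, once for a plain file and once for the gzip-compressed version of the same file:

```shell
#!/bin/sh
# Sketch: how much must rsync's delta algorithm send after a one-line edit,
# for a plain file versus the gzip-compressed version of the same file?
# File names and test data are made up for this demonstration.

# Print the "Literal data" byte count (data actually sent) that rsync
# reports when updating the receiver's copy $2 from the sender's file $1.
#   --no-whole-file  force the delta algorithm (off by default for local copies)
#   --ignore-times   do not skip files whose size and mtime happen to match
literal_bytes() {
    LC_ALL=C rsync --no-whole-file --ignore-times --stats "$1" "$2" |
        sed -n 's/^Literal data: \([0-9.,]*\) bytes.*/\1/p' | tr -cd '0-9'
}

if command -v rsync >/dev/null 2>&1; then
    work=$(mktemp -d)
    seq 1 200000 > "$work/old.txt"
    # Replace a single line near the start of the file.
    sed '10s/.*/changed line/' "$work/old.txt" > "$work/new.txt"
    gzip -c < "$work/old.txt" > "$work/old.txt.gz"
    gzip -c < "$work/new.txt" > "$work/new.txt.gz"

    # The receiver starts out with the old versions.
    cp "$work/old.txt" "$work/dest.txt"
    cp "$work/old.txt.gz" "$work/dest.txt.gz"

    plain_size=$(wc -c < "$work/new.txt" | tr -d ' ')
    gz_size=$(wc -c < "$work/new.txt.gz" | tr -d ' ')
    plain_sent=$(literal_bytes "$work/new.txt" "$work/dest.txt")
    gz_sent=$(literal_bytes "$work/new.txt.gz" "$work/dest.txt.gz")

    echo "plain: sent $plain_sent of $plain_size bytes"
    echo "gzip:  sent $gz_sent of $gz_size bytes"
    rm -rf "$work"
else
    echo "rsync not found; skipping demonstration"
fi
```

For the plain file only a small fraction should need to be sent. For the compressed file the fraction is usually far larger, but the exact numbers depend on the data, the gzip version and the block size rsync picks.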

So don't compress data unless you need space on the disk (in which case use disk-based compression as that is transparent to applications like rsync).

Don't encrypt data unless you need to protect its privacy should the computer (or disk) be stolen or lost. (Do keep backups of your data, and especially of your encryption keys or recovery keys.) Again, whole-disk encryption is likely to be the least detrimental to rsync performance, but it is also the most likely to lead to a catastrophic loss of data when a hard disk fails and no backups are available, or when you reinstall the OS without backing up a recovery key for other data disks/partitions.

The above assumes that (a significant number of) the relevant uncompressed/unencrypted files are likely to have partial changes from time to time - by editing or appending of some sort - whilst the bulk of the data in the file remains unchanged.

RedGrittyBrick

Posted 2012-02-01T17:17:37.013

Reputation: 70 632

Rsyncrypto advertises itself as rsync-friendly encryption. https://rsyncrypto.lingnu.com/index.php/Home_Page – Matthew Hannigan – 2018-03-21T06:49:14.593

2

In the specific case of zip files, if the archive contains many files and only some of them have changed, then rsync can avoid having to resend most of the zip file. You can get similar behavior out of gzip with the --rsyncable switch (though it costs a small amount of compression ratio). – psusi – 2012-02-01T19:13:41.317
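A small sketch of the --rsyncable suggestion, assuming a gzip build that has the switch (not every build does, so the script probes for it first). `data.txt` is a placeholder file generated only for this demonstration:

```shell
#!/bin/sh
# Sketch: compress with gzip's --rsyncable switch when the local gzip has it,
# and compare the output size with a normal gzip run.
# "data.txt" is a placeholder created only for this demonstration.
work=$(mktemp -d)
seq 1 50000 > "$work/data.txt"

if gzip --rsyncable -c < /dev/null > /dev/null 2>&1; then
    gzip --rsyncable -c < "$work/data.txt" > "$work/data.txt.gz"
    normal=$(gzip -c < "$work/data.txt" | wc -c | tr -d ' ')
    rsyncable=$(wc -c < "$work/data.txt.gz" | tr -d ' ')
    echo "normal: $normal bytes, rsyncable: $rsyncable bytes"

    # The result is still an ordinary gzip file and decompresses as usual.
    roundtrip_ok=no
    gzip -dc < "$work/data.txt.gz" | cmp -s - "$work/data.txt" && roundtrip_ok=yes
    echo "round trip ok: $roundtrip_ok"
else
    echo "this gzip has no --rsyncable switch; skipping"
fi
rm -rf "$work"
```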