Backing up 80G hard drive 1G per day

I want to securely back up my 80G HD, but doing a complete backup takes forever and slows down my machine, so I want to back up just 1G per day. Details:

  • First hurdle: on the first day, I want to back up the "first" 1G of the hard drive. Of course, there really is no "first" 1G on a hard drive.
  • After 80 days, I'll have my whole HD backed up... assuming none of my files ever change, which of course they do. So the backup plan/program must also catch file creations/changes as they come along.
  • The backups must be consistent, in that I can restore my system by restoring the backups sequentially. In other words, "dd if=/harddrive" probably won't work.
  • The backups should encrypt file contents AND names, but I don't see this as a major hurdle.
  • Once the backup has backed up everything (even changed files), it can back up the first 1G of my hard drive again. Even though this backup is redundant, that's OK, because I always want to be backing up something (e.g., if I'm backing up to optical media, the older media might start to go corrupt).

Is there a magic backup plan/program that does this?

In reality, I want to do this for multiple machines with multiple drives each, but think that solving the above will solve the general case.

barrycarter

Posted 2010-12-16T00:37:51.520

Reputation: 695

1 Sounds like a good question for http://superuser.com. – Carl Norum – 2010-12-16T00:39:15.957

1 What is the backup target? Is it disk space on a separate server, external plug-in (USB?) disks, or somewhere over the Internet? This will help determine the best feasible strategy. – Linker3000 – 2010-12-16T09:00:00.837

Right now, I'm thinking of burning to DVD, one DVD per session. However, this may change. I particularly like backing up over the Internet, but it's fairly slow for large amounts of data (then again, 1G/day might not be too bad?) – barrycarter – 2010-12-16T15:21:44.213

DVDs? Good grief, does anyone still back up to DVD? Surely removable USB (or external SATA) or even portable flash-based storage is far more convenient/practical these days? Personally I back up multiple machines over the LAN to an onsite NAS, then periodically back up the backups to removable storage for offsite peace of mind. – timday – 2011-01-08T10:43:42.960

Answers

2

Problem

I'm familiar with rsync and have tried using it, along with other tools, to write a Perl script that does what I want. However, rsync by itself does not do what I want. Unfortunately, if a file changes slightly, the encrypted version of the file changes a lot, so rsync doesn't even work that well for single files.

Solution?

rsyncrypto is a utility to encrypt files in an rsync-friendly fashion. The rsyncrypto algorithm ensures that two almost identical files, when encrypted with rsyncrypto and the same key, will produce almost identical encrypted files. This allows for the low-overhead data transfer achieved by rsync while providing encryption for secure transfer and storage of sensitive data in a remote location.

(from Wikipedia)
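A minimal sketch of how this could look in practice (the directory layout, key and certificate file names, and the remote host are placeholders, and the exact flags should be checked against the rsyncrypto man page for your version):

    # Encrypt the tree in an rsync-friendly way; --name-encrypt keeps a local
    # map file so that file names are also encrypted on the destination side.
    # backup.keys / backup.crt are placeholder key material.
    rsyncrypto -r --name-encrypt=filemap.txt /home /var/backups/encrypted \
        backup.keys backup.crt

    # Ship the encrypted tree; rsync only transfers the blocks that changed.
    rsync -av --delete /var/backups/encrypted/ user@offsite:/backups/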

Also

Off Site Encrypted Backups using Rsync and AES

RedGrittyBrick

Posted 2010-12-16T00:37:51.520

Reputation: 70 632

rsyncrypto solves a big part of this for me. It's not 100%, but definitely a big help. I couldn't find rsyncrypto directly, but Fedora 11 has a "duplicity" program that might do this.

@RedGrittyBrick, could you answer http://stackoverflow.com/questions/4535620/rsync-useful-w-encrypted-files so I can give you credit? Your answer here solves that problem completely.

– barrycarter – 2011-01-11T18:49:10.027

2

CrashPlan is free and does everything you need it to, I think.

Carl Norum

Posted 2010-12-16T00:37:51.520

Reputation: 2 621

Looking into this now. It's free, but doesn't appear to be open source (which would be nice). – barrycarter – 2010-12-16T00:50:02.900

1

Try using rsync. You would have to complete one full backup, but then you would only need to move compressed tarballs of the changed files on a daily basis. A little Googling will turn up numerous shell scripts to accomplish this, and there are Windows implementations of rsync that work very well.
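For example, a rough sketch of that approach (paths, host name, and the stamp-file location are placeholders, and a real script would want error handling):

    # One-time full copy to the backup host:
    rsync -aH --delete /home/ backupuser@backuphost:/backups/full/

    # Then, from a daily cron job: tar up files changed since the last run
    # and ship the compressed tarball.
    STAMP=/var/backups/last-backup.stamp
    [ -f "$STAMP" ] || touch -d '1970-01-01' "$STAMP"   # first run: take everything
    OUT=/var/backups/changed-$(date +%F).tar.gz
    find /home -type f -newer "$STAMP" -print0 \
        | tar --null --files-from=- -czf "$OUT"
    touch "$STAMP"
    scp "$OUT" backupuser@backuphost:/backups/incrementals/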

themicahmachine

Posted 2010-12-16T00:37:51.520

Reputation: 353

Thanks. I'm familiar with rsync and have tried using it, along with other tools, to write a Perl script that does what I want. However, rsync by itself does not do what I want.

Unfortunately, if a file changes slightly, the encrypted version of the file changes a lot, so rsync doesn't even work that well for single files. – barrycarter – 2010-12-16T17:39:28.173

I use rsync to back up multiple machines with hundreds of GB of data to a "backup server" nightly. It works - and works well - because only a relatively small percentage of the backed-up machines' data changes each day. Occasionally you'll see a big spike in the amount of data transferred because someone added/rearranged multiple GB of data, but that doesn't happen very often. A much better solution than backing up everything (apart from the huge initial copy when you first set it up, of course). – timday – 2011-01-08T10:29:16.417

0

My immediate gut reaction is that it isn't viable. The reasoning I'm following is this: let's say you back up a gig of files, including parts of your OS, program files, and data files. The next day, you continue and start backing up the next gig... The problem is that on that second day, maybe the OS got updated. Maybe some program files got updated. Maybe some data files got changed. Now the first backup has obsolete files. In other words, you're dealing with a moving target. Even if you used attribute flags and always backed up only the files whose flags are cleared, this still isn't efficient, because you keep backing up the files that constantly change.

That being said, some programs such as Carbonite or Mozy back up in the background and slowly manage to eventually back up everything, but they are data only - no OS or programs. You could set up something like that and have the program run in the background constantly making updates, but it would take forever to back up, the backup itself could be monstrously inefficient in terms of disk space, and reliability wouldn't be all that great either.

Have you ever considered getting another internal drive and just cloning the system overnight? That would be best for speed/reliability/efficiency.
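If you went that route, the overnight clone itself could be as crude as the following sketch run from cron (device names are placeholders, everything on the target drive gets overwritten, and the source should be as idle as possible while it runs):

    # Whole-disk clone from the primary drive to the spare internal drive.
    # /dev/sda (source) and /dev/sdb (target) are placeholders - double-check
    # them before running, because the target is overwritten completely.
    dd if=/dev/sda of=/dev/sdb bs=1M conv=noerror,sync
    sync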

Blackbeagle

Posted 2010-12-16T00:37:51.520

Reputation: 6 424

0

If you are using (or don't mind switching to) a file system that supports snapshots(*), then backing up 1 GB per day is pretty simple:

  • take a snapshot of the current state of the local disk
  • back up 1 GB of that snapshot every day to the remote machine at your off-site location (possibly using "zip" with the "--encrypt" option and the "--split-size 1g" option; see the sketch after this list)
  • after the complete snapshot has been backed up, discard the local snapshot.
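For instance, assuming the disk sits on LVM (volume names, sizes, and paths below are placeholders; on btrfs or ZFS you would use their native snapshot commands instead):

    # Take a read-only snapshot of the root volume.
    lvcreate --size 5G --snapshot --name backup_snap /dev/vg0/root
    mkdir -p /mnt/backup_snap
    mount -o ro /dev/vg0/backup_snap /mnt/backup_snap

    # Encrypt the snapshot and split it into 1 GB pieces
    # (-e prompts for a password, -s sets the split size).
    cd /mnt/backup_snap
    zip -r -e -s 1g /var/backups/snapshot.zip .

    # Copy one ~1 GB piece (snapshot.z01, snapshot.z02, ..., snapshot.zip)
    # to the off-site machine each day, e.g.:
    scp /var/backups/snapshot.z01 user@offsite:/backups/

    # Once every piece is off-site, discard the local snapshot:
    cd /
    umount /mnt/backup_snap
    lvremove -f /dev/vg0/backup_snap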

After you have the first snapshot backed up, you could keep creating fresh new snapshots in exactly the same way. But I think you would get exactly the same results in less time by:

  • running a copy command on the remote machine, copying the most recent complete snapshot it has to create a rough draft of a fresh new snapshot
  • taking a fresh new snapshot of the current state of the local disk
  • using rsyncrypto (thanks, RedGrittyBrick) to encrypt the fresh snapshot, then letting rsync modify the remote rough draft into an exact copy of it, with rsync's --bwlimit throttled to roughly 1 GB/day (about 12 KB/s); see the sketch below.

(*) By "supports snapshots", I mean either (a) run inside a virtual machine that supports system snapshots, or (b) use ext3cow, btrfs, ZFS, or some other file system that supports snapshots -- all the ones I know of are indicated by the "snapshot" column in the "comparison of file systems" article.
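A sketch of that refresh cycle (host, paths, and key/certificate names are placeholders; note the throttle is rsync's --bwlimit, which takes KB/s):

    # 1. On the remote machine: seed a draft of the new snapshot from the
    #    most recent complete one.
    ssh user@offsite 'cp -a /backups/snap-previous /backups/snap-current'

    # 2. Locally: take a fresh snapshot (LVM here, as in the earlier sketch)
    #    and encrypt it with rsyncrypto, so small plaintext changes stay
    #    small after encryption.
    lvcreate --size 5G --snapshot --name backup_snap /dev/vg0/root
    mkdir -p /mnt/backup_snap
    mount -o ro /dev/vg0/backup_snap /mnt/backup_snap
    rsyncrypto -r /mnt/backup_snap /var/backups/encrypted backup.keys backup.crt

    # 3. Let rsync turn the remote draft into an exact copy, throttled to
    #    roughly 1 GB/day (1 GB / 86400 s is about 12 KB/s).
    rsync -av --bwlimit=12 /var/backups/encrypted/ user@offsite:/backups/snap-current/

    # 4. Clean up the local snapshot once the transfer completes.
    umount /mnt/backup_snap
    lvremove -f /dev/vg0/backup_snap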

David Cary

Posted 2010-12-16T00:37:51.520

Reputation: 773