
What tool(s) would you use to verify that a restored file structure is whole and complete? My environment is a Windows Server 2008 file server. (We use tape for backup, but that is inconsequential.)

I am specifically looking for a tool that will:

  • Record the names of all files and folders below a specified directory
  • Optionally calculate checksums of each file encountered
  • Save this index in a human-readable format
  • Compare the index against restored data and show differences

Some background: I recently had to replace the disks in our file server. The upgrade was scheduled to start 36 hours after the most recent full backup, so I created a differential backup. However, it turns out that one of our applications was clearing the archive bit on files saved to the server, so these were not included in the differential backup. I was unaware of this until my users reported some files as missing.

Aside from this, are there any other common methods for validating the integrity and completeness of a restore? I am frequently told that testing backups by restoring them is the only way to know that backups are working, but how do you deal with the case where it works 99% correctly and the other 1% silently fails?


Update: Apparently I need to clarify a few things.

  • I do already use full backups when possible, but sometimes the situation calls for a differential backup. When that happens, I need to verify that every file in the original data is also in the restored data.
  • I am already using the "verify" feature in Backup Exec, but that only ensures that everything written to tape can be read back again.
  • I do conduct occasional spot-check restores to ensure that the backup media is intact.

I am already familiar with the common wisdom that "the best way to test a backup is to restore it." This is a necessary step, but it is NOT sufficient. Being able to restore the files you backed up does NOT guarantee that all the files you need were backed up in the first place. That is the problem I need solved.

Nic

  • Not an answer, but once a year I try to re-build our entire network inside VMs using nothing but our backups. It can be... very insightful and often shows huge shortcomings in backup procedures. – Mark Henderson Jan 27 '11 at 03:57
  • Mark, I don't think that's "not an answer", I think that's an exceptionally good answer, and I urge you to put it as a proper answer for voting. In many ways, the "does the server do what it used to do" test is the best possible one. – MadHatter Jan 31 '11 at 18:14
  • Mark's suggestion is really the only way to "test" a restore: If you can't completely recover a system using nothing but base OS install media and your backups you failed the test. – voretaq7 Feb 02 '11 at 20:12
  • Although I agree that Mark's suggestion is excellent advice, it does not specifically address my question. – Nic Feb 02 '11 at 21:26
  • Mark's is the way to go. A backup tool can "lie". I worked one place where those charged with doing backups trusted the tool's output for backup integrity, location, etc. We needed a backup to keep the business from going down. The tool's response: "Backup file unknown". That was a CEM for one person. – jl. Feb 03 '11 at 15:12

8 Answers


There are a variety of tools available on Linux which are well-suited to this task. You can use mount.cifs to mount Windows shared folders on a Linux host, or you could just run Cygwin right on the file server.
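A minimal sketch of the mount step, assuming a share named data on the file server; the share name, mount point, and user below are placeholders:

# Mount the Windows share read-only so the verification pass
# cannot accidentally modify the data it is checking.
mkdir -p /mnt/fileserver
mount -t cifs //fileserver/data /mnt/fileserver -o username=backupuser,ro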

Before starting the backup, use the find command to recursively list everything below a specified directory and write the results to a file. This listing can be saved along with the backup for future use.

find /path/to/dir > list_before.txt

If you want a checksum for every file, feed the file list to md5sum via xargs (on BSD-flavoured systems the command is md5). This version matches only regular files (-type f), since directories have no content to hash.

find /path/to/dir -type f -print0 | xargs -0 md5sum > md5_before.txt

After restoring the backup, build another file list using the same command, then use diff to find differences between them. Ideally, this command should give no output.

diff list_before.txt list_after.txt
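To extend this to content verification, a hedged sketch: if the restore lands at the same path (or both lists were built from relative paths), md5sum can re-check the recorded hashes directly, and sorting both hash lists keeps a changed directory-traversal order from showing up as a false difference:

# Re-verify every recorded checksum; missing or altered files
# are reported as FAILED and the exit status is non-zero.
md5sum -c md5_before.txt

# Or regenerate the list after the restore and compare,
# sorting both lists by filename first.
find /path/to/dir -type f -print0 | xargs -0 md5sum | sort -k2 > md5_after.txt
sort -k2 md5_before.txt | diff - md5_after.txt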
Nic
  • Why did I get downvoted for this? – Nic Feb 02 '11 at 17:47
  • I'm not sure why this was downvoted -- It's an OK way to test the integrity of a restore (provided your server data hasn't changed) AND if you use relative paths it lets you restore locally to an "alternate restore path" & validate that way. What it **doesn't** do though is prove that your restore is sufficient to recover your environment. See the comments on the original question Re: that. – voretaq7 Feb 02 '11 at 20:14
  • @voretaq7, there are a very small number of issues which would mean that this does not prove the environment is recoverable; however, AFAICS it does specifically address every point asked in the original question, without introducing the complication of switching backup software. And it is easily implemented on MSWindows using one of the free POSIX kits available. I don't see how the comments elsewhere explain why you think this is not a valid approach. – symcbean Feb 03 '11 at 12:24
  • @symcbean, re-read my comment. I believe this is a valid approach to testing the *contents* of a restore, but the only way to test that "all the files you need (to restore your environment)" were backed up is to perform a restore in a clean-room environment. You could back up "all the Documents folders", verify that their contents are correct in this way, and still be missing something critical for rebuilding your environment. – voretaq7 Feb 03 '11 at 15:43
  • @voretaq7: yes, I got that - but it's not what was asked in the original question. – symcbean Feb 04 '11 at 12:21
  • I had to use this today. It only takes about 5 minutes to list a quarter million files over the network. – Nic Feb 05 '11 at 02:59

First of all, enable the 'verify' option in your chosen backup app, and then stick to complete backups wherever possible.

You can use additional apps to perform all the actions you want, but they will take as long as the backup itself does. Maybe something to add to the weekend workload of your servers?

DutchUncle
  • Can you mention any specific apps which do this? – Nic Jan 27 '11 at 18:37
  • Maybe. What environment/OS do you fancy? GUI or command line? – DutchUncle Jan 27 '11 at 19:11
  • The file server is running Windows Server 2008. However, I can connect a Linux computer to the file server if there are better tools available that way. – Nic Jan 27 '11 at 22:22
  • You'd best stick to the platform that actually performs the backup, because it can preserve ACLs, and there are cross-platform subtleties (e.g. Windows apps don't understand inodes much). So what backup solution do you use on Windows 2008, and why can't you run full backups each night? – DutchUncle Jan 27 '11 at 22:29
  • We use Symantec Backup Exec with LTO-3 tapes, but we have too much data to fit on a single LTO-3 tape. There is no room in this year's budget for a new tape drive. – Nic Jan 27 '11 at 23:49
  • I stopped using tape in the nineties :-) If you want to run additional data checks, you'll need the full backup (set) on your network. Start saving for a NAS or SAN, or break your backup up into (coherent/logical) parts? – DutchUncle Jan 28 '11 at 20:29
  • @UnisoftDesign: How do you handle off-site and off-line backups? – Evan Anderson Feb 02 '11 at 20:56
  • I only get involved with setting up the backup system for small businesses (let's say up to 50 employees). For Windows: built-in backup & shadow copy, sometimes robocopy for certain file shares, together with TrueCrypt for external USB/hotswap-eSATA hard disks or hotswap SATA modules in a 3.5" bay. They have 7,200 rpm laptop hard disks inside. These are employee-resistant and fit in hand bags or sport jackets :-) – DutchUncle Feb 04 '11 at 19:18
  • For OpenBSD I use the built-in tools and rsync for backup, plus built-in OpenSSL encryption. – DutchUncle Feb 04 '11 at 19:27

Backup Exec (in recent versions) should verify after backup by default. Double-check it, though; there should be a checkbox in the options.

You might look at the "Write checksums to media" option to save checksums after each backup, and consider saving the job logs to compare from run to run. I don't know the format of these files, but you may be able to get file lists or at least size details to compare, as a starting point.
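The answer doesn't specify the log format, so purely as a hypothetical illustration: if the job logs can be exported as plain text, a crude run-to-run comparison might look like the following, where the log file names and the grep pattern are invented and would need adapting to the real format.

# Hypothetical: extract whatever lines list backed-up files from
# two exported job logs, then diff the sorted lists. The pattern
# '^\\\\' (UNC-style paths) is a guess and must match the real logs.
grep '^\\\\' joblog_mon.txt | sort > files_mon.txt
grep '^\\\\' joblog_tue.txt | sort > files_tue.txt
diff files_mon.txt files_tue.txt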

Robert Novak
  • This is a valid answer, but I thought I should mention that Backup Exec verifying that it has what it believes it has is not equal to Backup Exec having everything you believe it should have. – Chris Thorpe Feb 02 '11 at 20:02
  • @Chris +1 you understand exactly what I am trying to ask – Nic Feb 02 '11 at 20:56
  • @ChrisThorpe Or even that your restore procedure didn't leave something out. – Scott Pack Feb 04 '11 at 16:46

The best way to check a backup is to restore it. Anything else is a compromise - there is nothing wrong with compromises, but you really do need to do restores of your data to test.

In an ideal world you'd do a DR restore every 6 to 12 months, and restore random files on a more frequent basis, but any routine where you restore at least one server onto a virtual machine and check it boots afterwards is a great start.

This is something you'd do in addition to any verification routine that the backup software itself has.

Rob Moir
  • Neither restoring random files nor having the machine be able to boot actually provides any guarantee that all of the original data has been restored. – Nic Feb 02 '11 at 21:32
  • Fair point Nic, so you would do a verify too, as I say - your suggestion there is a good one. But I also stand by the idea that actually doing a restore from cold is a very important part of testing a backup. – Rob Moir Feb 02 '11 at 22:49
  • I definitely agree that testing full restores is a necessary part of validating backups. – Nic Feb 03 '11 at 03:49

I use a combination of methods for backups: an on-line backup, plus weekly images of my production servers. On a monthly basis I do test restores of random files such as SQL databases, attach them, and verify they are functional.

With my imaging, I do P2V backups of my servers using SCVMM into a big SAN. For DR testing I can bring them all up in a separate IP environment. If a server ever physically fails, I can bring up a VM of the server, which is always less than a week old, and restore any discrepancies from the on-line backup. I also have a single XP machine joined to the domain that sits in that closed environment where I can test all my apps and email. I do this every 6 months or so to ensure a good DR environment.

DanBig

(Sorry, I can't post a comment.)

As far as I can tell (I'm not a Windows guy), Nic's solution should work in Windows "natively" (just find and download UnixUtils for win32 or any similar package).

You can also diff directories directly (optionally redirecting the output to a file with a trailing > difffile):

diff -r /path/to/what-to-backup /path/to/restored-data
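A small extension, using the same placeholder paths: adding -q reports only which files differ or exist on one side, which is usually all you need for a large tree:

# -r recurses; -q names differing or missing files without
# printing line-by-line content differences.
diff -rq /path/to/what-to-backup /path/to/restored-data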
brownian

Not what you want to hear, but I have the luxury of full 1:1 reference environments for all my platforms for just such tests.

Chopper3

I would restore the files to a test location and use a tool like WinMerge:

http://lifehacker.com/290657/compare-and-merge-files-and-folders-with-winmerge

to compare them to the original source. There is also WinDiff:

http://www.computerperformance.co.uk/w2k3/utilities/windiff.htm

I would also recommend backing up your valuable data three different ways, especially if you aren't verifying backups every day. I would suggest Backup Exec to tape, an offsite rsnapshot file backup, and a disk-based backup:

http://backuppc.sourceforge.net/

running locally. Try BackupPC; you'll thank me. When something goes wrong, you'll appreciate the variety of options.
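For the rsnapshot leg, a minimal sketch of a configuration, assuming rsync/SSH access to the file server; the snapshot root, host name, share path, and retention counts are illustrative only:

# /etc/rsnapshot.conf (fields MUST be separated by tabs, not spaces)
snapshot_root	/backup/snapshots/

# Keep 7 daily and 4 weekly rotations (names match the cron jobs
# that invoke "rsnapshot daily" and "rsnapshot weekly").
retain	daily	7
retain	weekly	4

# Pull the share from the file server over SSH into fileserver/.
backup	backupuser@fileserver:/data/	fileserver/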

Antonius Bloch