
Is there a way to guarantee consistency across volumes when doing backups from LVM snapshots? Consider this scenario:

  • Some system upgrade is in progress. It will write some files to the /usr volume, and once completed, will record success in the /var volume.
  • As the upgrade is just about complete, I run a backup script that creates snapshots of the /usr and /var volumes, along with the rest of the system's volumes, and proceeds to create backups from those snapshots.
  • Just before the upgrade's last write/flush on the /usr volume completes, the backup script takes its snapshot of /usr.
  • That write completes, and the upgrade operation's success is quickly recorded in the nebulous depths of /var.
  • The backup script takes a snapshot of /var.
  • The backup script creates backups from the snapshots it has, er, snapshotted.

So the result of all of this tomfoolery is that the resulting /usr backup contains a file which is missing a few bits, and the /var backup contains metadata indicating that that file is complete and approved for use.

Without delving into the details of which operating systems' system upgrade systems would be unfazed by such trifles, is there a way to avoid such problems? At the least this seems like it could cause some application to fail unexpectedly after restoration of such a backup.

intuited
  • What if you check exit status of write/flush of /usr and then take snapshot. – vnix27 Feb 18 '11 at 05:23
  • Hmmrmm.. not sure I understand what you mean. The upgrade and the backup script's operation are totally unrelated (except for the conflict described in the question). Actually it's more likely that the backup script would be being invoked by `cron` and I would be doing the upgrade myself, unaware of the fact that the automated backup was soon to start. – intuited Feb 18 '11 at 05:36

1 Answer


The problem is a bit more general than that even. Even with one volume getting a snapshot, there's no guarantee that the data on that volume make sense at a file level. LVM snapshots only ensure block-level consistency.

The only way to be 100% sure that your files are in a consistent state is to get your applications to flush everything to disk and suspend writes while you create the snapshot. If you really care about that, and your applications support it, you should script it into your backup procedure.
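As a rough sketch of what scripting that might look like, the following wraps the snapshot step so that writes are suspended before the first snapshot and resumed after the last one, even if a snapshot fails. The `quiesce` and `resume` hooks are hypothetical placeholders for whatever your applications offer, and the volume group name, snapshot size and `_snap` suffix are assumptions, not anything prescribed by LVM.

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def writes_suspended(quiesce, resume):
    """Call quiesce() before the block and resume() after it, even on error."""
    quiesce()
    try:
        yield
    finally:
        resume()


def snapshot_commands(vg, volumes, size="1G", suffix="_snap"):
    """Build one `lvcreate --snapshot` command per logical volume.

    The size and suffix defaults are illustrative assumptions.
    """
    return [
        ["lvcreate", "--snapshot", "--size", size,
         "--name", lv + suffix, f"/dev/{vg}/{lv}"]
        for lv in volumes
    ]


def snapshot_all(vg, volumes, quiesce, resume, run=None):
    """Quiesce the applications, snapshot every volume, then resume them."""
    if run is None:
        # Default runner: execute the command and raise if it fails.
        run = lambda cmd: subprocess.run(cmd, check=True)
    with writes_suspended(quiesce, resume):
        for cmd in snapshot_commands(vg, volumes):
            run(cmd)
```

Something like `snapshot_all("vg0", ["usr", "var"], stop_writers, start_writers)` would then take both snapshots inside a single quiet window, where `stop_writers` and `start_writers` stand in for your own application-specific hooks. This only helps for applications that actually honour those hooks; anything else can still write between the two snapshots.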

For your particular use case, why not just ensure that the backup doesn't run while you are performing the upgrade? Either disable it for that time period or delay it until your upgrade has completed. At a minimum, the consistency of the files being upgraded will be preserved.
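One way to arrange that without remembering to disable the cron job is a shared lock file: whatever runs the upgrade holds the lock, and the backup script declines to start while it is held. A minimal sketch, assuming both sides agree on a lock path (the path below is hypothetical) and run on a system with `flock`-style advisory locks:

```python
import fcntl

# Hypothetical path; any location both scripts can open will do.
LOCK_PATH = "/var/lock/system-maintenance.lock"


def try_lock(path=LOCK_PATH):
    """Return an open, exclusively locked file, or None if the lock is held."""
    handle = open(path, "a")
    try:
        # LOCK_NB makes the call fail immediately instead of waiting.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    return handle


def release(handle):
    """Drop the lock and close the underlying file."""
    fcntl.flock(handle, fcntl.LOCK_UN)
    handle.close()
```

The upgrade wrapper would call `try_lock()` before starting and `release()` when done; the backup script would call `try_lock()` first and skip or postpone the run if it gets `None`. The lock is advisory, so it only protects against programs that check it.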

Kamil Kisiel
  • I'm thinking that if it's possible to take two or more snapshots "atomically", then the worst-case scenario will be basically the same as, or maybe a bit better than, if there was a power failure, which is a better worst case than what's described above. – intuited Feb 18 '11 at 15:54
  • I think that writes are effectively suspended while snapshots are being taken; there was something in the LVM manual about this, mentioning that it integrates with common Linux filesystems. [This kernel commit](http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=c4be0c1dc4cdc37b175579be1460f15ac6495e9a) is apparently responsible for bringing that underlying facility to ext* and some others. – intuited Feb 18 '11 at 16:25
  • I guess what I'm getting at is that while most applications are designed to recover in the case of a power failure, they may not be designed to deal with a scenario such as that painted in the question. – intuited Feb 18 '11 at 16:27