Does anybody have a good solution for backing up /var on a live system?

All the recommendations I have seen so far have not taken into account the fact that /var holds live data and that restoring a file that was being simultaneously written to at the time it was captured is potentially disastrous. I wouldn't be caught dead backing up /var/lib/pgsql with a straight copy.

fthinker
  • This depends on the type of data being backed up. If you are backing up a database directory, you have to do it with a DB tool, or at least tell the DB that a backup is in progress so that it flushes its in-memory changes and you get a consistent copy. – Khaled Apr 20 '11 at 20:35
  • You can do a two-part backup: 1) Back up the databases of all services that keep their data in /var (MySQL, PgSQL, mail daemon spool DBs, the SpamAssassin DB, etc.) with their respective backup tools. 2) Back up the static /var data with the backup tool of your choice (tar, rsync, a commercial tool, etc.) and add exclusions for the dynamic files and folders mentioned above, plus the ones you don't need to back up (e.g. /var/run). – desasteralex Apr 20 '11 at 20:44
  • desasteralex's answer is the best way to go. If you can use specific backup tools to save your data, do so, and save to /myspecialbackupsdirectory if you like; then use your general backup tools to back up whatever lands in that directory along with everything else. You get some duplication of data in your backup set, but I'd rather have 100% of the data in multiple forms than 99% in one form. – Mister IT Guru Apr 20 '11 at 21:11
  • @desasteralex I was afraid that was the answer. That's something I have been avoiding, since it's going to be difficult to a) find out which services own which files, b) figure out how to gracefully shut those services down when the backup script runs, and c) even work out which files are static data. So yeah, it's a heavy operation and I was hoping for some sage one-shot Unix advice :) – fthinker Apr 21 '11 at 02:24
  • When restoring, are you OK with losing all the changes made after the backup was made? If you have lost all of /var, you have no choice, but what if you have lost *some* of /var and want to restore that? – reinierpost Nov 11 '16 at 13:19
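
Editorial sketch of the second step in desasteralex's comment above (the file-level copy of /var with exclusions). It only builds and prints an rsync command; nothing is executed. The exclusion list and the destination `/backup/var/` are hypothetical and would have to be adapted to the services actually present on the machine.

```python
import shlex

# Hypothetical exclusion list: directories that are dumped with their own
# tools, plus runtime state that is not worth backing up. Patterns with a
# leading "/" are anchored at the root of the transfer (here: /var).
EXCLUDES = [
    "/lib/pgsql/",  # PostgreSQL data directory: dump separately
    "/lib/mysql/",  # MySQL data directory: dump separately
    "/run/",        # PID files and sockets
    "/lock/",
    "/tmp/",
]


def rsync_var_command(dest, excludes=EXCLUDES, source="/var/"):
    """Build (but do not run) an rsync command copying /var minus the excludes."""
    cmd = ["rsync", "-a"]
    cmd += [f"--exclude={pattern}" for pattern in excludes]
    cmd += [source, dest]
    return cmd


if __name__ == "__main__":
    # Print the command for review; hand the list to subprocess.run to execute it.
    print(shlex.join(rsync_var_command("/backup/var/")))
```

Keeping the exclusions in one list makes it easy to check that every excluded directory is covered by a tool-specific dump elsewhere.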

3 Answers

Generally I don't back up /var -- Things like Postgres should be backed up in accordance with the backup procedures in their respective manuals, and restored similarly (e.g. only an idiot would restore over a running instance).

Should you need to back up a specific subset of /var (like Postgres' data directory, OpenLDAP's BDB directory, etc.), you should follow the procedures outlined by the software vendor, or exercise good common sense (ensure the files are quiescent, etc.).

voretaq7
  • Well... /var is a very important and active part of a system's current state. And yes, I realize that about Postgres; that's why I said what I said. The difficulty in backing up /var arises in exactly what you said in your second sentence. In some cases it may be very difficult to track down which files belong to which process and how to lock those files when the backup runs. Just look at /var/lib – fthinker Apr 21 '11 at 02:20
  • I'm looking at another question you answered here: http://serverfault.com/questions/230354/should-i-be-using-lvm-snapshots-along-with-rsnapshot | I'm also using Bacula, and LVM snapshots are looking more and more appealing. – fthinker Apr 21 '11 at 04:34
  • LVM snapshots work to a first approximation - databases can still be tricky though. If you're backing up something like Postgres my suggestion is either a dump/restore or backing up a slave (shut the slave down, grab your backup & start it up again) to ensure that the data on disk will be clean & sufficient to recover to at least the point in time where you shut the slave down. – voretaq7 Apr 21 '11 at 14:17
  • It doesn't look like LVM snapshots will solve my problem, if I understand correctly how they work. It's strange that many comments I've seen have not raised the point you just raised: that even when using an LVM snapshot you must still ensure that your database has a workable state on disk that can be backed up (i.e. locking all tables, flushing them to disk, then unlocking the tables, as seen here: http://pointyhair.com/tiki-view_blog_post.php?blogId=1&postId=5 and here: http://mike.kruckenberg.com/archives/2006/05/mysql_backups_u.html ). Not even sure of the benefit of LVM right now. – fthinker Apr 21 '11 at 17:03
  • Many of the comments I've seen have just amounted to "With LVM snapshots you can just ignore any active files. Your backups made from this snapshot will be fully usable" – fthinker Apr 21 '11 at 17:07
  • @fthinker - those comments (and answers) are wrong - LVM snapshots will ensure that the file you grab is quiescent, but they won't ensure that what you grab is usable. As you noted you still need to ensure that your DB is in a usable state. The upshot however is that "Shut down the database, Make an LVM snapshot, restart the database, back up the snapshot" takes a lot less time than "Shut down the database, make your backup, restart the database" :-) – voretaq7 Apr 21 '11 at 17:14
  • Thanks for confirming what I was thinking. I have two follow-up questions if you don't mind :) 1) Does the LVM system maintain the integrity of files in your snapshot by ensuring that they are only updated once a file has been written to and has _stopped_ being written to on the original filesystem? 2) I'm not following you on the difference in speed between the two options you have presented. – fthinker Apr 21 '11 at 17:43
  • Re: integrity, once an LVM snapshot has been made it always appears as it did at that point in time (much like a Polaroid snapshot, it doesn't change after it's made). Re: performance, creating a snapshot takes a few seconds, and when it's done you can go back to using the main disk as you normally would. The `stop->snapshot->restart->backup` chain I described above means a few seconds where your database is unavailable, versus minutes-to-hours for `stop->backup->restart` – voretaq7 Apr 21 '11 at 17:50
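
Editorial sketch of the stop -> snapshot -> restart -> backup chain from the comments above, written as a dry-run plan: the function only returns the commands in order and runs nothing. Everything named here is an assumption for illustration: the volume group `vg0`, the logical volume `var`, the snapshot name `varsnap`, the mount point, the destination, and the use of `systemctl` (older systems would use their init scripts instead). The snapshot size must be large enough to absorb the writes that happen while the backup runs.

```python
import shlex


def snapshot_backup_plan(service, vg, lv, snap="varsnap", snap_size="1G",
                         mountpoint="/mnt/varsnap", dest="/backup/var/"):
    """Return the commands for stop -> snapshot -> restart -> backup, in order."""
    origin = f"/dev/{vg}/{lv}"
    snap_dev = f"/dev/{vg}/{snap}"
    return [
        # Downtime window: only these three steps keep the service stopped.
        ["systemctl", "stop", service],
        ["lvcreate", "--snapshot", "--size", snap_size, "--name", snap, origin],
        ["systemctl", "start", service],
        # The slow part runs against the frozen snapshot, not the live volume.
        ["mount", "-o", "ro", snap_dev, mountpoint],
        ["rsync", "-a", f"{mountpoint}/", dest],
        ["umount", mountpoint],
        ["lvremove", "-f", snap_dev],
    ]


if __name__ == "__main__":
    # Print the plan for review instead of executing it.
    for command in snapshot_backup_plan("postgresql", "vg0", "var"):
        print(shlex.join(command))
```

A real script would also need error handling, in particular restarting the service even if the snapshot step fails.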

I back up /var with rsync the way I do everything else, but I then run a secondary set of backups just for the databases, using the database tools to do a data dump (or hot copy in some cases). Surprisingly, the file system backups have proven useful more often than the database data backups.

Also, the drives my databases are on all run file systems (or hardware) that allow snapshotting the entire file system at an instant in time. Doing this periodically is another great way to keep your data safe and sound.

Caleb
  • Are you sure your method is safe for files in /var/lib and /var/spool? It sounds dangerous. What file systems are you using? That feature sounds awesome! – fthinker Apr 21 '11 at 02:12
  • First of all, remember that it's not dangerous for the host, only for the possible usefulness of your backup. In that regard it is dangerous, but it can still be useful to have all the pieces. For mail queues it seems to work pretty well; I've had to restore those a time or two and it's worked fine. Anything that you can reasonably freeze while the backup syncs should be. Anything that you can't reasonably freeze probably has its own backup routine. – Caleb Apr 21 '11 at 07:27
  • You can snapshot almost any file system by using a volume manager like LVM (highly recommended). Some file systems like ext4 also have internal mechanisms for it. Almost all Virtual Machine and cloud computing systems have the ability to snapshot drives no matter what is running on them. – Caleb Apr 21 '11 at 07:29
  • Thanks for the input, that's the third vote I've seen for LVM snapshotting now. I think it's probably the most realistic, because in my situation determining all the services that I should freeze would take more time. Not to mention it's not the most reusable solution when I eventually need to back up /var on another machine. – fthinker Apr 21 '11 at 14:02
  • Granted, the advantages of LVM are something you do kind of have to know about and plan for ahead of time. It's a little more complicated to implement on some box you 'inherited'. – Caleb Apr 21 '11 at 20:30

But if for some reason you still want to take a consistent backup of the entire /var, you may consider putting it on an LVM logical volume and taking an LVM snapshot: http://tldp.org/HOWTO/LVM-HOWTO/snapshotintro.html http://tldp.org/HOWTO/LVM-HOWTO/snapshots_backup.html

You mentioned Postgres backup: for a consistent Postgres backup you can use pg_dump, or a combination of archived write-ahead logs (WAL) and a base backup of the Postgres data directory (so-called PITR): http://www.postgresql.org/docs/9.0/static/continuous-archiving.html#BACKUP-BASE-BACKUP
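
Editorial sketch of the pg_dump option, which takes a consistent dump while the database stays in use. It only builds the command line and executes nothing. The database name `mydb`, the output directory `/backup/pgsql` and the dated file naming are hypothetical.

```python
import shlex
from datetime import date


def pg_dump_command(dbname, outdir="/backup/pgsql", day=None):
    """Build (but do not run) a pg_dump command writing a dated custom-format dump."""
    day = day or date.today()
    outfile = f"{outdir}/{dbname}-{day.isoformat()}.dump"
    # Custom-format archives are restored with pg_restore rather than psql.
    return ["pg_dump", "--format=custom", f"--file={outfile}", dbname]


if __name__ == "__main__":
    # Print the command for review; hand the list to subprocess.run to execute it.
    print(shlex.join(pg_dump_command("mydb")))
```

The resulting dump file sits outside the live data directory, so the ordinary file-level backup can pick it up safely.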

There is no silver bullet that will back up everything in one shot and guarantee that every service's data is logically consistent.

Ruslan
  • I am aware of how to do a Postgres dump. I just mentioned it because it's the most obvious example of why backing up /var is not simply a 'cp -a' operation. I am going to look into this snapshot solution, however. Thank you for the help. – fthinker Apr 21 '11 at 02:15