
Assuming a non-virtualized environment, is it a good idea to take actual images of servers (using something like Acronis True Image) and store them on/off site?

Backing up data is great, but I feel it would be good to have copies of OS images, so that if hardware dies or an upgrade gets botched I can always revert.

What would be your recommended way to do this (preferably using a NAS and an online backup service)?

I was talking with the Iron Mountain folks and the service they described is more geared toward taking incremental snapshots of data. I'm not sure if there's a way to back up images incrementally, such that only the changes between them are saved (that way I'm not wasting X GB each time I take an image).

MadHatter
ServerAdminGuy45
    It's all based on risk, comfort levels, DR RTO/RPO, backup windows, amount of data, change rate, etc. As such, any answer here will be opinion and not necessarily black and white. You have to evaluate your own needs/wants. That said, your question _could_ be a good subjective question (http://blog.stackoverflow.com/2010/09/good-subjective-bad-subjective/) if you got rid of the "What would be your recommended way to do this?" and focused on the pros/cons of doing image level backups of servers...maybe. – TheCleaner Oct 22 '13 at 15:08

2 Answers


My answer doesn't differ between virtualized and non-virtualized environments:
Images are useful tools, but they don't make good backups.

Here's why:

  • In a virtual environment you can do a base snapshot of the system and then store multiple delta snapshots (which would need to be replayed in order).
    Eventually you have so many deltas that you need to do another base snapshot.

  • In a physical environment you can't really do proper snapshots without quiescing the system.
    That's fancy-pants talk for "Outage Window".
    There are non-outage techniques for snapshotting which produce usable results, but there are LOTS of caveats.

  • In either case above, you can't pick and choose what you're restoring: You need to restore the last snapshot (plus some deltas) to get the whole system at a specific point in time.
    You can't just get back "the one file the CEO deleted this morning that is ABSOLUTELY CRITICAL for this afternoon's presentation" - you have to restore the whole system somewhere and then pluck that file out of it.
    This basically means you still need traditional backups if you're using snapshots!
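
To make the base-plus-deltas idea concrete, here is a minimal sketch in Python. It is not how Acronis, VMware, or any particular product stores snapshots; the 4 KiB block size and the names `make_delta`, `apply_delta` and `restore` are made up for illustration, and the "images" are just byte strings held in memory.

```python
from typing import Dict, List

BLOCK_SIZE = 4096  # assumed block size for this sketch only


def split_blocks(image: bytes) -> List[bytes]:
    """Cut an image into fixed-size blocks (the last one may be short)."""
    return [image[i:i + BLOCK_SIZE] for i in range(0, len(image), BLOCK_SIZE)]


def make_delta(previous: bytes, current: bytes) -> dict:
    """Keep only the blocks of `current` that differ from `previous`."""
    prev_blocks = split_blocks(previous)
    changed: Dict[int, bytes] = {}
    for index, block in enumerate(split_blocks(current)):
        if index >= len(prev_blocks) or block != prev_blocks[index]:
            changed[index] = block
    # The total length is stored so that a shrinking image restores correctly.
    return {"length": len(current), "blocks": changed}


def apply_delta(image: bytes, delta: dict) -> bytes:
    """Overlay one delta on top of the image it was taken against."""
    blocks = split_blocks(image)
    for index, block in sorted(delta["blocks"].items()):
        if index < len(blocks):
            blocks[index] = block
        else:
            blocks.append(block)
    return b"".join(blocks)[:delta["length"]]


def restore(base: bytes, deltas: List[dict]) -> bytes:
    """Replay every delta, oldest first, on top of the base image."""
    image = base
    for delta in deltas:
        image = apply_delta(image, delta)
    return image
```

Each delta only makes sense on top of the exact image it was computed against, which is why the deltas have to be replayed in order and why a single file can't be pulled out without rebuilding the whole image first.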


What would I recommend instead? Traditional backups.
(Yes, even on virtual machines. Treat them exactly like a physical host.)
Think about what is involved if you have a hardware failure in each candidate workflow:

Snapshots

  1. Network boot (or boot off CD) into snapshot recovery software.
  2. Restore the appropriate base snapshot, plus any incremental snapshots.
  3. If you have traditional backups, restore them from the snapshot point forward.

Traditional Backups (With Standard Images)

  1. Network boot (or boot off CD) into imaging software.
  2. Load standard image (including backup/restore software & needed apps).
  3. Configure the network as needed.
  4. Restore from backups.

So we're adding one more step (network configuration), but it's not a big step.
With standard images, the time to recovery is about the same for backups as for snapshots, and not much longer with a manual installation if your staff is reasonably competent.

How you handle off-site backups is your call - you can certainly go with traditional magnetic tape (or removable hard drives), and Iron Mountain provides tape service where they'll bring you big blue bins of media on a standard rotation.
You can also drop a storage system (NAS, mirrored SAN, computer running some software) at an off-site location, or use commercial solutions like Amazon Glacier or http://rsync.net


Note that you can (and should) use snapshots for one of the cases you described in your question: To have something to revert to in the event an upgrade or other major change goes wrong. Snapshots are great for that.

Prior to doing an upgrade or major change, snapshot critical systems and store the snapshots locally. If you need to revert, simply restore from the image you made.
In this case you're already in an outage window (hopefully, if you're doing it right) so you can shut the system down to take a cold snapshot, and it's a single point-in-time that you're interested in ("before we start making changes"), so none of the potential drawbacks of snapshotting apply.
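
As a rough illustration of the "cold snapshot before a change" step, here is a small Python sketch. It assumes the system is already shut down (or booted from rescue media) so nothing is writing to the source, and that the source disk or VM disk file can be read like an ordinary file. The function names and the file naming scheme are hypothetical; real imaging tools do considerably more.

```python
import hashlib
import shutil
from pathlib import Path

CHUNK = 1024 * 1024  # copy and hash in 1 MiB pieces


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large images don't have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(CHUNK), b""):
            digest.update(chunk)
    return digest.hexdigest()


def take_cold_snapshot(source: Path, dest_dir: Path, label: str) -> Path:
    """Copy a quiesced image to local storage and verify the copy.

    Assumes nothing writes to `source` while this runs.
    """
    dest_dir.mkdir(parents=True, exist_ok=True)
    target = dest_dir / f"{source.name}.{label}.img"
    with open(source, "rb") as src, open(target, "wb") as dst:
        shutil.copyfileobj(src, dst, CHUNK)
    # Refuse to keep a snapshot that doesn't match what was read.
    if sha256_of(source) != sha256_of(target):
        target.unlink()
        raise IOError("snapshot checksum mismatch for %s" % source)
    return target
```

The checksum comparison is the useful part: you want to know the "before we start making changes" copy is good before the upgrade begins, not when you try to revert.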

voretaq7

The answer to your question is: yes. For years, I've made it a habit to image servers, or at least the important ones, with tools like Mondo, Acronis, Ghost, etc. The bare-metal recovery time with images is much faster than restoring from backups. When you restore from a filesystem backup, you must first install and configure the operating system, and potentially compile your applications all over again. With image restores, the amount of manual labor involved (and the amount of "thinking") is much less. This has two benefits:

(a) substantially faster recovery (I very much disagree with the above poster who says the time to recovery is the same)

(b) less chance for error. There's only one step involved in restoring from an image. Contrast this with multiple steps that you need to do if you manually compile and install applications after you recover the OS.

This applies to VMs as much as physical hosts. With VMware you have "clones" and other tools you can use, but you are still faced with the task of transferring the clones or disk images offsite.

When I set up production server environments, I use both imaging software as well as regular backup software as part of a disaster recovery solution.

Michael Martinez
  • Like I said in my answer I'm all in favor of *standard OS images* (operating system + all required apps) so you don't have to waste time reinstalling everything by hand, but *data* needs a much more granular backup and restore solution than images can provide. I don't believe it's viable to maintain ongoing images (with requisite outage windows) as a "backup" solution, and you yourself said you are using traditional backup software as a component of your recovery strategy (presumably to mitigate those issues & avoid a nightly outage to cut new images?) – voretaq7 Oct 22 '13 at 21:35
  • Yeah, the imaging schedule is much less aggressive. The main benefit to imaging is that you don't have to rebuild applications and config files. You won't face a case where you're, like, "oh cr*p, how did I tune apache last time to get it just right for my environment." "What was that change I made to sysctl?" – Michael Martinez Oct 22 '13 at 22:09
  • Right - in my case apps and (most) config files are already handled (as part of the "standard image" that gets installed on new bare metal machines) -- the only extra step is configuring the network and then restoring from the traditional backups (which include all the `/etc` configuration stuff & gets me to where I was the day the backup ran). It *would* be nice not to have to type in the network configs, but that's really my only procedural difference. Our standard images get pushed offsite for DR, but those update quarterly which keeps the huge off-site dumps to a minimum :-) – voretaq7 Oct 22 '13 at 23:18
  • Where I'm currently working (large internet company), they've got things split into two pieces: (a) an OS restore (specific versions of kernel, specific versions of Red Hat, etc.) and (b) a package restore which uses their own package management system that pushes out the correct config files and all. So, doing a bare-metal restore is a two-part process, but it's all automated and launchable via either GUI or command line. – Michael Martinez Oct 22 '13 at 23:26
  • That's a sensible way to handle it (we do basically the same thing, except (a) and (b) are combined as part of a first-boot script). The configuration-related stuff I'm getting out of backups *should* really come from proper configuration management tools like puppet, but I haven't had a chance to actually deploy nice things yet :'( – voretaq7 Oct 22 '13 at 23:34
  • @voretaq7 - Let's say I used an image-only backup on something like an SQL server. Sure, the image takes longer (not substantially) to restore than the MySQL data, but do I risk corrupting the database by backing it up as an image (since different parts of the DB can be written while the image is being taken)? – ServerAdminGuy45 Oct 24 '13 at 02:08
  • @ServerAdminGuy45 As I said [in my answer](http://serverfault.com/a/547860/32986), I would generally ***not*** recommend images as a *backup* solution. In my experience the only reliable way to get an image that is certain to work when it's restored is to shut the system down and take the image in that state. If you take a "live" image then, much like any other tool making a (filesystem) copy of a running database, there is a risk of corruption. You should always follow the backup and restore procedures recommended for your database (and *test restores*) to be sure you have good backups. – voretaq7 Oct 24 '13 at 15:29
  • this is what replication slaves and hot dumps are for. – Michael Martinez Oct 24 '13 at 18:53
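
To illustrate the "use the database's own backup procedure and test the restore" point from the comments above, here is a small Python sketch. It uses SQLite purely as a stand-in, because Python's standard library can run it without a server; the thread is about MySQL, which has its own tools and procedures, so treat this only as the shape of the idea. The names `hot_dump` and `verify_dump` are made up for this example.

```python
import sqlite3
from contextlib import closing


def hot_dump(live_path: str, dump_path: str) -> None:
    """Copy a database through its own backup API while it stays online.

    SQLite's online backup produces a consistent copy even if other
    connections are using the database, which a raw file copy of a
    live database does not guarantee.
    """
    with closing(sqlite3.connect(live_path)) as live, \
            closing(sqlite3.connect(dump_path)) as dump:
        live.backup(dump)


def verify_dump(dump_path: str) -> bool:
    """Open the dump and run the database's own integrity check."""
    with closing(sqlite3.connect(dump_path)) as dump:
        result = dump.execute("PRAGMA integrity_check").fetchone()
    return result is not None and result[0] == "ok"
```

A dump that has never been opened and checked is only a hope, so something like `verify_dump` (or a full restore onto a scratch machine) belongs in the routine, whatever database you actually run.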