2

I looked a little bit around at other questions, and none of them really answered what I needed to know. What are the steps I should take towards building an efficient disaster recovery strategy, both for servers and workstations?

Here I'm talking about setting up backups, ghosting systems, and the like.

What I'm looking for are recommendations I could pass to my boss, ideally with a "graceful cheapening" of solutions.

For example: we cannot afford to have a replacement quickly set up in case our server fails (as was the case just now), and we cannot keep spare parts on hand in case shit hits the fan. Ideally, I'd suggest a powerful hardware RAID on two domain controllers, then less costly options, and then the cheapest available. As it is now, we're making nightly backups from one shared drive to another, which is plugged into my co-worker's PC.

I'm pretty sure the cheapest option would be the best, in my boss' eyes, but I want to make sure he understands how critical it is that we get what we need to keep the servers and workstations up and running.

Olivier Tremblay
  • 347
  • 3
  • 16

4 Answers

4

Just to be 10000% clear.

YOUR SERVER SHOULD HAVE RAID. Also, RAID IS NOT BACKUP.

That being covered -- VMware makes server disaster recovery easy. You can script a once-a-day snapshot and copy of all of your server VMDKs (virtual drives) to another workstation or a cheap network-attached storage device. In the event your ESX/ESXi server crashes, you can run ESXi on a laptop in a pinch, or on a cheap server. There would be no reconfiguration, no recovery, and little setup.
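The copy half of that once-a-day job could look something like the sketch below. It is only an illustration under a few assumptions: the VM disk files are reachable as ordinary files, they are in a consistent state when the script runs (VM powered off, or a snapshot already taken by other means), and the backup folder contains nothing but the dated folders this script creates. The function name `copy_vm_disks` and the `keep` parameter are made up for this example; the script does not talk to VMware at all.

```python
import shutil
from datetime import date
from pathlib import Path


def copy_vm_disks(vm_dir: Path, backup_root: Path, keep: int = 7) -> Path:
    """Copy every *.vmdk under vm_dir into a dated folder, then prune old folders."""
    if keep < 1:
        raise ValueError("keep must be at least 1")
    # One folder per day, named like 2009-08-19 (hypothetical layout).
    target = backup_root / date.today().isoformat()
    target.mkdir(parents=True, exist_ok=True)
    for disk in sorted(vm_dir.rglob("*.vmdk")):
        dest = target / disk.relative_to(vm_dir)
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(disk, dest)  # copies contents and timestamps
    # ISO dates sort chronologically, so the oldest folders come first.
    dated = sorted(p for p in backup_root.iterdir() if p.is_dir())
    for old in dated[:-keep]:
        shutil.rmtree(old)
    return target
```

You would call it once a night from Task Scheduler or cron, pointing `vm_dir` at the datastore folder and `backup_root` at the NAS share or spare workstation, whatever those paths are in your setup.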

In my experience working with clients, desktop workstations typically die because of non-hard-drive issues. If a power supply dies and fries your hard drive, it will fry both drives if you had RAID.

My recommendation for workstations:

  1. Purchase quality, business class machines (Dell Optiplex, not BestBuy deal of the day).
  2. Consider ghosting them weekly to an external hard drive (?), or use Windows NTBackup to back up essential files daily to a "backup server".
  3. UPSes (APC name brand) on critical workstations.
  4. Keep workstations under warranty, so parts are delivered next day (e.g., Dell).

My recommendations for servers:

  1. RAID. Required. Software RAID seems to work just as well as hardware RAID in many cases if you can't afford hardware RAID.
  2. Backups. Every night. Realtime if you can afford it.
  3. VMware ESXi.
  4. APC battery backup.

A backup server could be an old PC you have in the office, a cheap SATA controller, and three 1TB drives in RAID5. Total investment for machine backups to a backup server with 2TB of storage should be under $500.
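The 2TB figure follows from how RAID5 works: one drive's worth of space goes to parity, so usable capacity is (number of drives - 1) × drive size. A tiny sketch of that arithmetic, where the function name is just for illustration:

```python
def raid5_usable_tb(drive_count: int, drive_size_tb: float) -> float:
    """Usable RAID5 capacity: one drive's worth of space is used for parity."""
    if drive_count < 3:
        raise ValueError("RAID5 needs at least three drives")
    return (drive_count - 1) * drive_size_tb


print(raid5_usable_tb(3, 1.0))  # prints 2.0
```

Adding a fourth 1TB drive to the same array would give 3TB usable, which helps when deciding how many drives to buy.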

After you have a plan and an implementation -- TEST IT. Then schedule tests on a regular schedule.
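One cheap test you can automate is comparing checksums of the source files against the backup copy. The sketch below assumes a plain file-level backup where the backup folder mirrors the source layout; `verify_backup` is a hypothetical helper, not part of any backup product. It does not replace an actual restore drill, since it only tells you the copied files match.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large files do not fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(source: Path, backup: Path) -> list[str]:
    """Return one line per source file that is missing or different in the backup."""
    problems = []
    for src in sorted(p for p in source.rglob("*") if p.is_file()):
        rel = src.relative_to(source)
        copy = backup / rel
        if not copy.is_file():
            problems.append(f"missing: {rel}")
        elif sha256_of(src) != sha256_of(copy):
            problems.append(f"differs: {rel}")
    return problems
```

An empty list means every source file has an identical copy. Files changed since the last backup run will show up as "differs", so run it right after the backup finishes.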

With regard to having spare hardware -- in a small sample of machines, I think you will find that failures are entirely random. Having an extra hard drive and power supply may never be useful. I would keep a spare configured workstation on hand, and just order parts from Newegg on demand if needed.

SirStan
  • 2,373
  • 15
  • 19
  • About VMware, is VMware Server 2 enough? Say that I have to fit all that under $1000, which is the case at the moment, and that we are under 20 users? – Olivier Tremblay Aug 19 '09 at 19:52
  • Look at VMware ESXi, which will run faster than VMware Server 2. For $1000 you could pick up a low-end Dell PowerEdge, or a used machine that could easily handle 20 users. Feel free to email me with questions. – SirStan Aug 19 '09 at 21:01
2

Chapter 21 of "The Practice of System and Network Administration" gives you the kind of detail you really need to understand backups, especially strategy.

Keep in mind: backups aren't just for when your server catches on fire, or there's a disk failure. And RAID is not a backup solution; RAID is a hardware failover solution. Backups are there for when you or your users accidentally delete files they shouldn't have. Backups are for when some software corrupts files or makes changes it really shouldn't have. Backups are also for archival purposes, like logs of DHCP leases, so that when the police come knocking and say "We've detected illegal activity X coming from IP Y on date Z, 4 months ago. Who had that IP?", you can actually answer.

Also, backups don't necessarily have to be expensive, but even if they are they're worth 10x more when you don't have them. Our backup server uses cheap consumer hardware, hard drives instead of tape, and sits on-site (in the datacenter, which happens to sit in a basement that is apparently rated on the proximity of an atomic bomb blast).

Ernie
  • 5,324
  • 6
  • 30
  • 37
1

Cheap/Inexpensive Resources (in no particular order)

BOOK -- Backup & Recovery (Inexpensive Backup Solutions for Open Systems) http://oreilly.com/catalog/9780596102463

Get a PLAN first... in fact get THREE... then talk to your boss and let him "choose" one.


Clonezilla -- Free way to back up/image a drive.

http://www.clonezilla.org

JungleDisk -- Inexpensive (and easy) way to back up your critical files offsite.

http://www.jungledisk.com


As for hardware... that's always going to be a problem. If your power supply dies... you need a new one. If your motherboard dies... you need a new one... If your... you get the idea.

And I'm sure everyone is going to scream this but remember... RAID IS NOT A BACKUP! :-)

Only you (and your boss) will be able to determine if your downtime (and rebuild time) is worth having a "warm standby" server or just a few common spare parts.

If you're REALLY trying to keep things cheap... at the very least go and buy yourself a 1TB USB drive for about $80 and use Clonezilla every ___ days/weeks to back up the server.

I think the key here is to come up with a plan (or several options) and then talk to your boss in an intelligent manner. Give him the pros and cons of each option and then let him decide how he wants to proceed.

KPWINC
  • 11,274
  • 3
  • 36
  • 44
-1

As far as I know, I'd need, in no particular order:

For data backup
- Hardware RAID
- Server Replication
- Off-site backup, either served by a third party or homemade

For system stability
- System backup (using Ghost or an equivalent)
- Uniform hardware on as many workstations as possible
- Essential spare parts for the hardware that's most likely to die first

Olivier Tremblay
  • 347
  • 3
  • 16
  • 1
    You can edit your questions, you know. This doesn't belong in the "answers" section. :) – Ernie Aug 19 '09 at 16:13
  • It doesn't belong within the question either, I was trying to answer myself, both because I think it's a "not bad" start and because I think I was somewhat in a vaguely correct direction. – Olivier Tremblay Aug 19 '09 at 16:44