When someone mentions RAID in a conversation about backups, invariably someone declares that "RAID is not a backup."
Sure, for striping, that's true. But what's the difference between redundancy and a backup?
RAID guards against one kind of hardware failure. There are lots of failure modes that it doesn't guard against.
and more.
Q: Why is RAID not a backup?
A: Because the whole purpose of a RAID is to make sure that nothing in the world can interrupt that accidental rm -rf / (or DELTREE /X C:\), not even yanking the power cord in panic.
Q: But what's the difference between redundancy and a backup?
A: If you accidentally overwrite your PhD thesis with garbage, redundancy ensures that you have multiple copies of garbage, in case one goes bad. A backup ensures that you can restore your PhD thesis.
(And an archive ensures that you can retrieve multiple older versions of your thesis, and a version control system also tells you why you made a new version in the first place.)
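The distinction can be sketched with a toy model. This is only an illustration, not how any real RAID controller or backup tool works: the `Mirror` and `VersionedBackup` classes and the file name are hypothetical, and disks are modelled as plain dictionaries.

```python
class Mirror:
    """Toy mirror: every write lands on all surviving disks at once."""

    def __init__(self, disk_count=2):
        self.disks = [{} for _ in range(disk_count)]

    def write(self, name, data):
        for disk in self.disks:
            if disk is not None:
                disk[name] = data

    def read(self, name):
        # Serve the read from the first disk that is still alive.
        for disk in self.disks:
            if disk is not None:
                return disk[name]
        raise IOError("all disks failed")

    def fail_disk(self, index):
        self.disks[index] = None


class VersionedBackup:
    """Toy backup: keeps a separate copy of the files per generation."""

    def __init__(self):
        self.generations = []

    def take(self, mirror):
        files = next(d for d in mirror.disks if d is not None)
        self.generations.append(dict(files))

    def restore(self, name, generation=-1):
        return self.generations[generation][name]


mirror = Mirror()
backup = VersionedBackup()

mirror.write("thesis.tex", "chapter 1")
backup.take(mirror)

# Hardware failure: the mirror keeps serving the data.
mirror.fail_disk(0)
after_disk_failure = mirror.read("thesis.tex")

# User error: the mirror faithfully stores the garbage.
mirror.write("thesis.tex", "garbage")
after_overwrite = mirror.read("thesis.tex")

# Only the backup still has the thesis.
restored = backup.restore("thesis.tex")
```

In this sketch the mirror survives the disk failure but not the overwrite, while the backup generation taken earlier is unaffected by either.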
Redundancy protects you against your hardware failing. It does not protect against user error, nor against malicious activity (e.g., crackers getting into your system).
See: Why Mirroring is Not a Backup Solution for a hard-earned lesson.
The number one reason you want a backup is not because the physical media died (this is rare), but because of some error that caused the data to be lost or corrupted.
RAID doesn't protect you against a file being deleted.
RAID doesn't protect you against a file being overwritten.
RAID doesn't protect you from your system being compromised and all of your data being overwritten, deleted, or corrupted.
RAID doesn't protect you from your ops team accidentally paving a machine with important data on it.
RAID doesn't protect you from a foolish DBA running a drop command on the production server (mistaking it for a test environment).
RAID doesn't protect you if the building burns down.
P.S. http://ma.gnolia.com/. This is what can happen if you don't have good backups. Your site is snuffed out of existence (note: this tends to be bad for business).
Redundancy is great if one of your disks fails. It's not so great if your computer gets a virus, or you mistakenly delete a file, or you need to restore the disk to a previous version for some other reason. That's when you need a backup.
RAID helps you recover from failures, but backups let you go back in time.
It should also be mentioned that a hardware fault in the RAID controller can easily corrupt the data on all attached disks. So while you reduce the danger from disk failures, you add the danger of RAID controller failures.
Asked in a comment to the accepted question:
Will a backup refuse to copy a corrupt file?
Even if a backup copies corrupt or bad data, the point of a backup is that you can and should have multiple copies. For instance, last hour, yesterday, last week, etc. You can get a similar effect from using rotating snapshots on your storage device.
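As a rough sketch of the "last hour, yesterday, last week" idea, the function below picks which snapshots to retain from a list of timestamps. The function name and the default counts (24 hourly, 7 daily, 4 weekly) are assumptions chosen for illustration, not a recommendation from this thread or the behaviour of any particular backup tool.

```python
from datetime import datetime, timedelta


def snapshots_to_keep(snapshots, now, hourly=24, daily=7, weekly=4):
    """Return the snapshot timestamps a simple rotation would retain.

    For each rule, time is cut into windows counting back from `now`,
    and the newest snapshot inside each window is kept.
    """
    rules = [
        (timedelta(hours=1), hourly),
        (timedelta(days=1), daily),
        (timedelta(weeks=1), weekly),
    ]
    keep = set()
    for step, count in rules:
        for slot in range(count):
            window_end = now - slot * step
            window_start = window_end - step
            in_window = [s for s in snapshots if window_start < s <= window_end]
            if in_window:
                keep.add(max(in_window))
    return sorted(keep)


# Example: one snapshot per hour for the past 30 days.
now = datetime(2024, 1, 31, 12, 0)
all_snapshots = [now - timedelta(hours=h) for h in range(24 * 30)]
kept = snapshots_to_keep(all_snapshots, now)
```

Anything not in the returned list could be pruned. Because older copies are kept at coarser intervals, a corrupt file that was backed up an hour ago does not displace the good copy from last week.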
But the other reason for backups is geographic redundancy. You should certainly keep copies of critical data in two different geographic locations. How separate those locations are depends on how critical the data is; keeping copies in two different buildings in the same city protects against fire or theft. Keeping copies in two different countries protects against bigger problems.
RAID can be a great way to mitigate risks due to hardware failures, but RAID won't help you when your users delete (accidentally or otherwise) their data. To recover data you need some archival facilities, either through local snapshots or online/offline backups.
In a RAID 5 array built from disks over 400 GB, if you lose a disk there's something like a 75% chance of hitting an unrecoverable read error while the array is being rebuilt. Think about that for a second and it becomes pretty obvious why someone will always remind you that "RAID is not a backup".
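The exact figure depends heavily on how much data has to be read during the rebuild and on the drive's rated unrecoverable read error (URE) rate. A minimal sketch of the arithmetic, assuming independent bit errors and a rated URE of 1 in 10^14 bits (a value commonly quoted on consumer drive datasheets; both are assumptions, not numbers from this thread):

```python
import math


def rebuild_ure_probability(surviving_disks, disk_bytes, ure_per_bit=1e-14):
    """Chance of at least one URE while reading every surviving disk in full.

    Assumes each bit fails independently with probability `ure_per_bit`,
    so P = 1 - (1 - p)^n for n bits read.
    """
    bits_read = surviving_disks * disk_bytes * 8
    # expm1/log1p keep the result accurate for very small p and large n.
    return -math.expm1(bits_read * math.log1p(-ure_per_bit))


# Hypothetical array: 5 disks of 2 TB each, one has failed,
# so the rebuild must read the 4 survivors end to end.
p = rebuild_ure_probability(surviving_disks=4, disk_bytes=2e12)
```

The probability climbs quickly as disks get larger or the array gets wider, which is the underlying point: a rebuild is exactly when a second problem is most likely to surface.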
RAID gives you higher reliability and performance, but it's not infallible.
What's the difference between redundancy and backup? Ok, configure a RAID 5 disk set. Store some business-critical stuff on it. Pull a disk out. Everything still works! That's redundancy. Now delete all the data (don't cheat with the recycle bin). Now restore it from the most recent backup. You don't have one? Oops. Well at least you can tell your boss your disks are using RAID 5 redundancy (as you get marched out of the building...)
RAID helps you avoid downtime in a limited, but highly probable, set of HDD failure scenarios. Usually that means one drive failing at a time.
RAID does not protect you from having stored invalid data on the drives: an application or system software bug wiping some or all of the data, a human mistakenly deleting the wrong data, malicious users, or viruses. In such scenarios, RAID ensures that the data loss also happens on the redundant drives.
RAID does not protect you from losing the whole array at once. Fires, floods, or other catastrophes destroy it all at the same time. Similarly, thieves can steal the whole NAS at once, or a very drunk roommate in a very bad mood can play "throw it as far away as possible" with the NAS.
Backups help you get back in time. Restore what was once stored as current/live data.
Backups help you to restore previous versions of lost data in case of catastrophic failure.
Mirroring data to another site protects you from the catastrophic loss of a single physical location, but it doesn't necessarily stop data loss or corruption caused by hackers, viruses, or other means from propagating to the mirrors.
Also consider that with RAID you have multiple hard drives that were probably built at the same time and then exposed to the same conditions for years. What are the chances that they will all fail at about the same time? Pretty high.
[Not an answer, I already know. But an instructive tale nevertheless. Feel free to not upvote it. I'm posting it as an answer simply because it's too long for a comment.]
"We have redundant web servers on a load balancer, redundant database servers in a cluster and redundant hard drives in every server. So how did this happen? According to our server company there was a manufacturers bug in the firmware of the specific model that 6 of our 8 hard drives were on. That bug caused the disks to die after a certain number of hours running."
https://tvtropes.org/pmwiki/posts.php?discussion=15941624520A37147500