What's the point of having a RAID 1 configuration over incremental backups to a secondary drive?

25

3

I have a Synology NAS and I don't understand the point of a RAID 1 system. Why bother having a mirror? If I delete a file by accident, it's deleted on both drives.

Fractale

Posted 2019-07-08T07:34:00.010

Reputation: 509

2I would always use RAID 1 in a two disk NAS. What if a disk dies? Without RAID you have lost your backups, gg. – Džuris – 2019-07-08T19:36:54.327

2If you can live with the downtime when a disk fails (i.e. you don’t mind restoring your backup to a newly purchased drive), not using RAID 1 is a budget option. You will get double the usable space (writes might also get a bit faster, usually reads will get a bit slower). Keep in mind that you will lose all modifications since the last backup in that case. – eckes – 2019-07-08T23:16:37.003

1Do you back up your NAS to your NAS? – Criggie – 2019-07-09T04:04:45.927

I have a backup server that performs incremental backups from two computers on another site. Those backups take around 20 minutes every morning. Then, to protect against disk failure of the backup disk, I rsync the backup disk to another disk, locally on the same computer. That takes around 6 hours. So in this case at least, I believe it would have been better to set up the two disks as raid. – Thomas Padron-McCarthy – 2019-07-09T08:42:17.303

15@Džuris "Without RAID you have lost your backups" Absolutely, utterly wrong. RAID of any level is no backup. RAID does nothing to protect you against rm -f -r ~/All/My/Data. RAID provides nothing but availability - it improves your access to your data, but it does not protect your data at all. – Andrew Henle – 2019-07-09T11:29:26.163

4@AndrewHenle What is wrong in what I said? Backup to NAS protects you from losing data by that rm. RAID on NAS protects you from losing the backups when one disk on the NAS dies. – Džuris – 2019-07-09T14:26:42.547

If you want to compare similar things, you have to compare a RAID System with regular filesystem snapshots to your backup solution. And the RAID+Snapshot system wins in all categories! – Josef says Reinstate Monica – 2019-07-10T11:08:22.740

RAID 1 is not for backups. It's for disk failure. – njzk2 – 2019-07-11T02:39:35.423

Answers

75

RAID is not a backup mechanism; it's a redundancy mechanism, and it does a completely different job – one protects against disk failures, the other protects individual files. So you wouldn't use one instead of the other; you'd normally use RAID along with backups or snapshots.

The main advantage of a redundant system is that it will not go down completely when a complete disk failure happens – the mirror allows you to continue using the NAS without interruption while the array is rebuilding.

(In other words: If you had a backup system but no RAID, you'd have to spend time restoring the complete system from backups every time a disk failed. If you have a backup system and a RAID 1 mirror, you only have to use backups when both disks fail at once, which happens much less often.)

Likewise, the redundant array should also allow you to replace disks that are only about to fail (e.g. if you see bad sectors increasing), and even to swap them with larger ones (if you're running out of space), without any downtime.


1 Doesn't apply to RAID 0.

user1686

Posted 2019-07-08T07:34:00.010

Reputation: 283 655

Backups avoid data loss, which is priceless. If RAID 1 only avoids downtime when a disk fails, that's quite expensive! If I have to choose, I should use backups, right? (Limited budget) – Fractale – 2019-07-08T08:02:43.160

"without any downtime" ? Rebuild time can be very very long for large disks. – harrymc – 2019-07-08T08:54:08.470

8If losing an hour of data is OK but losing all your data isn't, then backups are more important than RAID. – gmatht – 2019-07-08T08:54:17.840

Thanks @gmatht, that's what I was feeling. Downtime will be short. It doesn't take long to copy backup data to a new drive anyway – Fractale – 2019-07-08T09:18:17.780

18@harrymc, is there a RAID rebuild that requires the system go offline? Sure you might have reduced performance, but rebuild is done while the array is operational. – JPhi1618 – 2019-07-08T15:59:00.480

2Many hardware cards have the option to rebuild while on-line. Whether that is wise is another question, especially when you RAID 1 several large drives and risk a URE. – Hennes – 2019-07-08T16:11:01.853

1The really nice reason for RAID 1, though, is real-time protection against a single disk failure. That allows you to get a new disk, restore from backup and copy the difference between the last backup and the files on the remaining disk[s] of the RAID 1 array. – Hennes – 2019-07-08T16:12:11.357

47For emphasis (for future readers): RAID is not a backup mechanism – FreeMan – 2019-07-08T18:36:36.260

8Backups are also supposed to be off-site. If your server was stolen, flooded, set on fire, offline, crypto-ransomed, etc, you still have a backup. They do different things. – Nelson – 2019-07-09T08:35:12.323

1@Nelson exactly. Many people repeat "RAID isn't a backup" but are perfectly happy with one on-site copy on a NAS, which isn't safer as a backup mechanism. – Eric Duminil – 2019-07-09T09:02:31.637

3@FreeMan: RAID is a backup mechanism that protects against drive failures. It just isn't (usually) as good a mechanism as incremental backups over the past year. Which isn't as good as all of the above plus multiple off-site backups. – MichaelS – 2019-07-09T11:16:15.677

10Am I blind, or is the footnote not referenced anywhere in the text of the answer? – Falco – 2019-07-09T11:26:25.997

1@Falco: I don't see it anywhere either. It's not hard to figure out though, so I didn't bother mentioning it. :) – MichaelS – 2019-07-09T11:49:51.507

2@Nelson it's worth repeating. "If you don't have at least three back up copies, of which at least two are offsite you are not properly backed up." – dgnuff – 2019-07-09T18:13:06.140

3If you haven't performed test restores, you are not properly backed up. – Paused until further notice. – 2019-07-09T19:38:42.730

RAID may not be a backup mechanism, but it is a FAILOVER mechanism. The risk traditionally was that in cases of corruption to the point of a "broken mirror", updates could be lost if the wrong disk is selected. Further, in a SOHO NAS, rebuilding after a failure often stresses the remaining disk to fail as well, especially if they are from the same batch. – mckenzm – 2019-07-10T01:46:33.993

5"you only have to use backups when both disks fail at once" <-- or when data loss was not caused by disk failure but by operator error or malice which is the whole point of backups. – R.. GitHub STOP HELPING ICE – 2019-07-10T14:33:01.757

All of the other comments are correct, but I think it's also worth pointing out that RAID is also a performance boost. Reading from multiple disks simultaneously is usually faster than reading from a single disk. – Charles Burge – 2019-08-10T16:11:17.463

18

RAID 1 isn't meant to protect you from deleted files. It only provides protection (redundancy) in the case of disk failure: if one drive fails, the other has a complete copy of all your data.

RAID 1 consists of an exact copy (or mirror) of a set of data on two or more disks; a classic RAID 1 mirrored pair contains two disks.

This layout is useful when read performance or reliability is more important than write performance or the resulting data storage capacity.

The array will continue to operate so long as at least one member drive is operational.

From Wikipedia

Incremental backups, on the other hand, will help you in case you end up deleting a file by mistake, but if your main drive fails, you will lose all the data that hasn't been backed up.

It is always recommended that you use one in conjunction with another, such as by having your NAS work in a RAID 1 configuration, and then taking regular backups on a third drive (or possibly another RAID config!).
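As a rough illustration of the incremental-backup side of this, here is a minimal Python sketch, not a replacement for a real backup tool. The function name `incremental_backup` and its copy rule (copy a file if it is missing from the backup, differs in size, or is newer than the backed-up copy) are assumptions made for this example. The key point is that it never deletes anything from the backup, so an accidental delete on the main drive does not propagate.

```python
import shutil
from pathlib import Path


def incremental_backup(source: Path, backup: Path) -> list[Path]:
    """Copy files that are new or changed since the last run; never delete."""
    copied = []
    for src_file in source.rglob("*"):
        if not src_file.is_file():
            continue
        dst_file = backup / src_file.relative_to(source)
        if dst_file.exists():
            src_stat, dst_stat = src_file.stat(), dst_file.stat()
            # Same size and not newer than the backed-up copy: treat as unchanged.
            if (src_stat.st_size == dst_stat.st_size
                    and src_stat.st_mtime <= dst_stat.st_mtime):
                continue
        dst_file.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src_file, dst_file)  # copy2 also copies the modification time
        copied.append(dst_file)
    return copied
```

Note that this sketch overwrites the backed-up copy when a file changes, so it keeps only the latest version. Real incremental backup tools usually keep older versions as well, which is what protects you when a bad edit is only noticed after the next run.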

rahuldottech

Posted 2019-07-08T07:34:00.010

Reputation: 5 095

I don't have an unlimited budget. I don't see the point of RAID 1. If I have an hourly backup, it only avoids losing 1h of work? That's it? – Fractale – 2019-07-08T07:58:11.930

@Fractale Yes, that is it. – rahuldottech – 2019-07-08T07:58:33.727

Quite expensive for just 1h of work – Fractale – 2019-07-08T08:03:48.063

8@Fractale It is of use in cases where each and every bit of data is important, and where you can't afford any downtime - such as servers – rahuldottech – 2019-07-08T08:04:31.930

For a personal NAS I should use the second hard drive for backup. Do you agree? – Fractale – 2019-07-08T08:06:34.430

4@Fractale If you are confident that you'll be okay with losing some data in the case of disk failure, sure. Be sure to test your backups and ensure that they're functioning as expected. – rahuldottech – 2019-07-08T08:07:36.313

I'm not comfortable with losing some data. So if I transfer data, wait an hour, and then delete it from my main computer, I should be fine, right? – Fractale – 2019-07-08T08:10:59.610

1@Fractale Let us continue this discussion in chat. – rahuldottech – 2019-07-08T08:11:28.137

7@Fractale, it's not just an hour of lost work you'd avoid. It's also the extended period of downtime needed to restore from backup. For example, I estimate that if I needed to restore my home fileserver from backup, it would take on the order of a week (10 hours restoring files from an external hard drive, and the rest of the time spent re-ripping CDs and DVDs). – Mark – 2019-07-08T22:31:51.947

3Also noting that it may be expensive for 1 hour of work in your situation, for a business or even home office situation, the cost of a drive vs a team's wages is pretty much rounding error. – Hugoagogo – 2019-07-09T00:14:32.643

Is your "hourly backup" actually a backup? Usually these kinds of systems provide no protection against accidental deletion if not caught before the next (or next few) "backups" occur, and provide no protection whatsoever against destruction by malware/ransomware (since the compromised machine has full write access to the "backup"). In most cases, they're the "worst of both worlds" between RAID and backup. – R.. GitHub STOP HELPING ICE – 2019-07-10T14:36:39.170

1Worth pointing out that accidental deletion vs. drive failure aren't the only ways to lose data. There's also filesystem corruption due to buggy software or failing RAM or other hardware in your desktop (or in the NAS). This can corrupt the data on both mirrors of a RAID1 because the corruption happens before duplication. – Peter Cordes – 2019-07-10T17:15:09.433

7

Some points the other answers have glossed over.

  • A backup is a point-in-time copy of your data. A RAID1 or higher array is a right-now redundant storage of your data. So if you did a daily backup (and it completed quickly enough) then you could be restoring data that is up to 24 hours out of date. Can your use-case cope with losing a day's changes?

  • Cost - you mentioned in a comment that RAID1 feels wasteful. It is. But if the cost of losing your data is high enough then the cost of doubling the drives is minuscule. The cost of downtime also has to be considered.
    Would I RAID and backup my /photos directory? Absolutely!
    Would I RAID and backup my /TV+Movies directory? No, not at all.
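The two points above can be put into rough numbers. The following Python sketch is only a back-of-the-envelope model under stated assumptions: the `Setup` class and `single_disk_failure_impact` function are hypothetical names for this example, the figures are illustrative, and a degraded mirror is treated as "no downtime" even though performance drops while it rebuilds.

```python
from dataclasses import dataclass


@dataclass
class Setup:
    name: str
    backup_interval_h: float  # hours between backups
    restore_time_h: float     # hours needed to restore everything from backup
    mirrored: bool            # True if the data lives on a RAID 1 mirror


def single_disk_failure_impact(setup: Setup) -> tuple[float, float]:
    """Return (worst-case hours of changes lost, hours of downtime) when one disk dies."""
    if setup.mirrored:
        # The surviving disk holds current data and keeps serving it.
        return 0.0, 0.0
    # Without a mirror, everything since the last backup is gone,
    # and the system is down until the restore has finished.
    return setup.backup_interval_h, setup.restore_time_h


# Illustrative figures only: a daily backup and a 10-hour restore.
for setup in (Setup("backups only", 24, 10, mirrored=False),
              Setup("RAID 1 + backups", 24, 10, mirrored=True)):
    lost, down = single_disk_failure_impact(setup)
    print(f"{setup.name}: up to {lost} h of changes lost, about {down} h of downtime")
```

This only models a single disk failure. For an accidental deletion, the mirror changes nothing and the worst-case loss is the backup interval in both setups.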

If your budget is limited, RAID may not be feasible. However, good backups are priceless. You can't replace some data like family photos, scans and documents.

TL;DR Backups are mandatory, RAID is optional.

Criggie

Posted 2019-07-08T07:34:00.010

Reputation: 985

2In fairness, photos are probably among the easier to keep multiple copies of on disparate media anyway. Any time I've been photographing, I'll copy the files from the memory card to the computer (where they're stored redundantly, but that's not important here), thus creating a second copy. Only after the first backup after that has run without problems (thus creating a third copy) do I delete the files from the card (thus reducing to two copies). If I'm not in a hurry, I might even wait until the cloud backup has run as well as having switched backup drives, for five to four copies. – a CVn – 2019-07-10T15:56:28.150

@aCVn fair points - I was trying to show the relative "value" or irreplaceability of different types of files. You can't go back and re-take the same photo, but "MASH-final-episode.avi" is in another category. – Criggie – 2019-07-10T20:54:31.320

6

If your aim is to protect against deletes, then RAID 1 is not for you.

RAID 1 will reduce the available disk-space by half, by making two disks serve as one disk, with the additional inconvenience that if one disk fails and is replaced, then the RAID might be inaccessible or very slow to access while it is rebuilding itself.

As your aim is backups, rather than sub-second data accuracy, you would be better served by using the two disks as two stand-alone disks and keeping two copies of the data, one perhaps somewhat behind the other.

With that simple setup, you would avoid the problems that can cause a RAID to fail, as some RAID failures may result in the total loss of data on both disks (some such cases are found on our site).

From your post and comments I get the impression that resiliency and resistance to accidental deletes are the most important to you. In that case two classic backups are better and safer than one RAID backup.

harrymc

Posted 2019-07-08T07:34:00.010

Reputation: 306 093

3

Aside from the redundancy and uptime benefits of RAID highlighted in other answers, there is another factor: data corruption.

A decent implementation of RAID will protect you from bit-rot (unrecoverable read errors) automatically. Backups only provide manual* protection against bit-rot. This is reduced if you discard old backups (you could lose the last intact copy of a file) or the backup medium itself suffers bit-rot.

You can further improve the catching of bit-rot by performing regular scans of your RAID, if the implementation allows it. Even better is to use a filesystem-based RAID like BTRFS or ZFS, where checksumming of data is done in software, reducing the reliance on disks to report ECC errors correctly.

If bitrot is something that concerns you, you should use RAID or a checksumming filesystem (plus backups). Ideally, use both.

* For example you could perform regular drive scans, and then cross-reference any read error sectors to files using filesystem debug tools, and then replace the referenced files with backed-up copies.
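On a filesystem without built-in checksumming, one way to approximate this manual check is to keep a manifest of file hashes and re-verify it periodically. The Python sketch below is one possible approach under stated assumptions, not what any particular RAID or NAS product does; `file_sha256`, `build_manifest` and `find_damaged` are hypothetical helper names for this example.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Record a checksum for every file under root, keyed by relative path."""
    return {str(p.relative_to(root)): file_sha256(p)
            for p in sorted(root.rglob("*")) if p.is_file()}


def find_damaged(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return files that are missing or whose content no longer matches the manifest."""
    damaged = []
    for rel_path, expected in manifest.items():
        path = root / rel_path
        if not path.is_file() or file_sha256(path) != expected:
            damaged.append(rel_path)
    return damaged
```

A mismatch only says that the content changed, not why, so this works best for data that is not supposed to change (photos, archives). Any file it flags would then be replaced from a backup, as described in the footnote above.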

Luke F

Posted 2019-07-08T07:34:00.010

Reputation: 321

1I'm not sure about Btrfs, but ZFS can do data checksumming and validation just fine without any redundancy. It can even do (up to double = three copies) redundancy on a single disk, if you want to, without you needing to do crazy stuff like combining multiple partitions within a vdev. And when there's a data error, as long as something still works, it can tell you which files are affected, often by name. For all their good sides, I don't think many hardware solutions can do that. (As an aside, don't run ZFS on top of a RAID array. Give ZFS the raw disks. Your later self will thank you.) – a CVn – 2019-07-10T16:01:08.090

3

In addition to everything else mentioned here, RAID also improves performance. A large file can be read in parallel from both physical disks, one half from each disk, reducing read time by nearly half.

Tsahi Asher

Posted 2019-07-08T07:34:00.010

Reputation: 191

Assuming that the RAID implementation is intelligent enough to do so. Not all are, so this might be a side benefit, but it's not something I'd count on happening without verifying with the particular RAID implementation. – a CVn – 2019-07-10T16:03:20.337

This is tricky for one sequential read stream because it introduces seeking. If the RAID alternates reads between drives with chunks that are too small, it won't be anywhere near 2x and could even be worse than 1 drive. But with aggressive readahead in large chunks from alternating drives then yes it can certainly win for one large sequential read with rotational media. It's a bigger win if there's parallelism in the accesses, like two programs both reading data from the filesystem, or parallel requests for multiple small files. – Peter Cordes – 2019-07-10T17:20:51.990

-1

I have built about 20 RAID 1 systems as home computers for individuals who have no technical knowledge. Over the years, about half a dozen of these have had single drive failures. From the point of view of the non-technical user, nothing much happens when half of a RAID 1 fails - the computer continues to work, but gives an error message complaining about a problem with the RAID drive.

The computer owner phones me, and I check out the problem. The solution is relatively simple - identify which drive has dropped out of the array, remove it, and install a new disk. The array is rebuilt over a period of perhaps 5 or 10 hours. The clients are impressed when I tell them that had I not insisted on building them a RAID system, they would have lost some or all of their files.

But what about using a single drive with regular backups? The problem is that non-computer-literate individuals struggle to successfully do a single backup (I have gone through coaching a senior citizen how to do it, and it isn't easy from their point of view). Forget about doing monthly backups, let alone backups every day.

Bruce Dunn

Posted 2019-07-08T07:34:00.010

Reputation: 1

2You could configure a scheduled back up. Most backup software support that. The only thing to worry about is making sure the backup media stays connected to the computer. – Tsahi Asher – 2019-07-11T13:04:51.067

And disconnected otherwise. Else ransomware will have your main data and your backups. And yes, getting people to do that can be a challenge. – Hennes – 2019-08-10T13:46:13.253