BTRFS, RAID5 or RAID6 for data storage?

3

I need to setup a storage for my server. The hardware is a 5 bay enclosure and 5 WD RED 8TB.

I have read about the statistical probability of a catastrophic failure during a RAID5 (= normal RAID) rebuild after the complete failure of one HD, in case a URE occurs.

This calculator, fed with the WD specs, gives only a 4.1% chance of a successful rebuild after one disk failure. I know that these kinds of calculations are disputed, but I still have some questions:

  • In "normal" RAID5 (let's say mdadm raid5), if there is a URE during the rebuild, does it mean that the rebuild will be aborted with no other possibility, or will the rebuild continue, leaving "only" the affected data (across stripes) inconsistent?

  • Will BTRFS with its journaling mechanism lower this probability?

My storage will hold videos and pictures. Of course one drive failure must be tolerated. I can accept a rebuild that leaves a few corrupted files, but it must not simply stop because of a single URE.
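For reference, the kind of figure such a calculator produces can be reproduced with a rough sketch. It assumes a URE rate of 1 per 10^14 bits read (an assumed value, to be checked against the drive datasheet), independent bit errors, and decimal terabytes; the function name is made up for illustration. Under these assumptions, reading 40 TB gives about 4.1%, reading only the four surviving 8 TB disks gives a somewhat higher value, and one full read of a single 8 TB drive succeeds only about half the time.

```python
import math

TB = 1e12  # decimal terabyte, as used on drive labels


def read_success_probability(bytes_read: float, ure_per_bit: float = 1e-14) -> float:
    """Chance of reading `bytes_read` bytes without hitting a single URE.

    Assumes every bit fails independently with probability `ure_per_bit`
    (an assumed spec-sheet value, not a measured one).
    """
    bits = bytes_read * 8
    # (1 - p) ** bits, computed through log1p to stay numerically stable
    return math.exp(bits * math.log1p(-ure_per_bit))


whole_array = read_success_probability(5 * 8 * TB)      # all five disks, 40 TB
surviving_disks = read_success_probability(4 * 8 * TB)  # four remaining disks, 32 TB
single_drive = read_success_probability(8 * TB)         # one full pass over one disk

print(f"40 TB read: {whole_array:.1%}")
print(f"32 TB read: {surviving_disks:.1%}")
print(f" 8 TB read: {single_drive:.1%}")
```

This model treats the datasheet number as an exact average rate, which is the assumption that is usually disputed.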

Menion

Posted 2018-01-24T15:24:41.377

Reputation: 41

Can you please reformat your question so it's not a "wall of text" to make it easier to read. At this moment, it's not clear what your actual question is. – Darren – 2018-01-24T15:41:28.447

Well, a long story short:

  1. In case of a URE, will the RAID5 rebuild just stop, or will it continue, leaving some inconsistent data?
  2. Would BTRFS, thanks to its journaling, lower the rebuild failure rate of the RAID5 scheme to better values?
  3. < – Menion – 2018-01-24T16:27:35.480

@user2910710, Darren was requesting that you make the question clearer. StackExchange (SE) is a Question and Answer site, not a discussion site. A properly formatted question is posted, an answer is selected from the responses as THE answer. Yours needs formatting to remove it from the 'poor quality' status. – Xalorous – 2018-01-24T17:47:15.427

Note: you were asked to edit the question to improve it. You can [edit] your question anytime; improving is highly encouraged. This time I did it for you. – Kamil Maciorowski – 2018-01-24T18:31:45.827

Can you please tell me where my statement and questions are not clear? I started with a short summary of the real use case, linked the resources my questions come from, placed the two questions in separate bullets, and finally explained what I can tolerate for my problem. – Menion – 2018-01-24T19:37:10.143

Those error numbers and the calculator are BS. According to it, simply trying to read all of the data from one drive one time, has a 50% chance of getting an error. Try it. Try it 10 times. You won't see any error despite the statistics supposedly saying that is nearly impossible. – psusi – 2018-01-24T19:54:58.870

@Menion are you still looking for another answer? I recently went on a research rampage for my server, and I believe I could shed some light if you're still looking. – Michael – 2018-07-05T15:29:29.583

Answers

-1

RAID does not protect data, it only potentially reduces downtime in the event of a single drive failure.

First and foremost, RAID is no substitute for backups. If you do not have a backup system in place, no RAID system will prevent data loss in the event of a rebuild failure.

RAID allows potential recovery from hardware failure. BTRFS' journaling system allows recovery from filesystem errors. They don't influence each other.

RAID 6 is more expensive than RAID 5 and potentially allows recovery from two disk failures.

The answer of what RAID (if any) to use is determined by the purpose of the array.

For the operating system, the goal is continued operation, and the size requirement is normally relatively small. Two drives in a mirrored (RAID 1) setup is pretty good for this. The cost of RAID 1 is high: basically it is one half of the drives in the array. Then, keeping your data separate, RAID 5 or 6 are the most economically efficient. The "cost" is one disk for RAID 5 or two disks for RAID 6. So, basically, can you afford to reduce the overall size of your data storage capacity by one disk or two?
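To put numbers on that cost for the five 8 TB drives in the question, here is a minimal sketch. The function name and the layout labels are made up for illustration, and it ignores filesystem and metadata overhead.

```python
def usable_capacity_tb(disks: int, disk_tb: float, layout: str) -> float:
    """Usable space for equal-sized disks, ignoring filesystem overhead."""
    if layout == "mirror":
        # every disk holds the same data, so only one disk's worth is usable
        return disk_tb
    parity_disks = {"raid5": 1, "raid6": 2}[layout]
    if disks <= parity_disks:
        raise ValueError("not enough disks for this layout")
    return (disks - parity_disks) * disk_tb


print(usable_capacity_tb(2, 8, "mirror"))  # two-drive mirror
print(usable_capacity_tb(5, 8, "raid5"))   # one disk's worth of parity
print(usable_capacity_tb(5, 8, "raid6"))   # two disks' worth of parity
```

With five 8 TB drives, the choice is between 32 TB usable with RAID 5 and 24 TB usable with RAID 6.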

Now, back to the purpose of RAID. RAID protects availability of whatever is stored on the array. Backups protect the integrity AND the availability. RAID protects against a drive going bad (or two with RAID 6, or more with more sophisticated storage schemes).

The purpose of Backups is to protect data. Backups provide disaster recovery from any of a list of potential disasters. Anywhere you see me use the word backups, I mean GOOD, VERIFIED backups created using a system that you design to provide a frequency of backups that satisfies your needs, in a rotation that ensures you can recover from "Oops" errors (Hey sysadmin, I accidentally deleted this file 30 days ago), and with a copy stored offsite so your data is safe from system destruction type disasters. And your backup routine should include restoring a random file after each backup to confirm that the backup is readable.

Since drive space is not infinite, and backups can get expensive, and we're talking at the superuser level here, assume you're going to have to make a bunch of tradeoffs. Personally, I make duplicate backups on a yearly basis of an external drive that holds the stuff that I do not want to lose. I have terabytes of junk that I only keep for the convenience of not having to download it again. The stuff I keep is backed up to an external drive automatically on a weekly basis. That drive is copied twice on a yearly basis. The two copies are stored offsite. One copy in a local safe deposit box. The other at a family member's house.

So, short answer. Backups to protect your data. RAID 6 to protect availability of your system.

Edit: Another way to view this is that RAID recovery is performed block level against disk sectors. File system journaling recovery tools are at the file level.

Xalorous

Posted 2018-01-24T15:24:41.377

Reputation: 459

Thanks, I think I knew more or less what you have explained, but this does not answer my questions, which are whether a URE will stop a RAID 5 rebuild and, if the answer is yes, whether BTRFS is more tolerant and can complete the rebuild, even if it leaves some inconsistent data where the URE occurred. – Menion – 2018-01-24T19:39:40.807

You ask about recovering data after a failure when using RAID. RAID is hardware redundancy. BTRFS (or ext4 or xfs for that matter) are file systems with journaling and may allow you to recover data lost due to bad sectors. RAID is for performance and to allow fast recovery of the system when hardware fails. If the recovery fails, you then have to try any file system tools, and if that fails you go to backups. A journaling file system will give you another set of tools that may allow faster and/or more complete recovery than restoring from backup. But the filesystem does not improve RAID. – Xalorous – 2018-06-12T23:18:08.193