RAID 5 Array: Probability of Failing to Rebuild Array

7

I have a QNAP TS-463U-RP storage array, which contains 4x4TB HDDs.

My question is: is it safe to configure 3x4TB as RAID 5?

From what I understand, it's not recommended to configure 12TB of storage as RAID 5 because of the risk of an Unrecoverable Read Error when rebuilding the array. Is this right?

M. Ahmed

Posted 2018-06-27T14:05:01.650

Reputation: 71

Answers

4

It's not necessarily bad to configure your RAID 5 array with N TB. What you need to worry about is how much data has to be read AFTER you lose a drive. The number and size of the disks in your RAID 5 array determine the probability that rebuilding the array will fail.

What is a URE?

A URE (Unrecoverable Read Error) occurs when reading a sector of a drive fails and the error cannot be corrected. A URE rating simply means that "on average 1 error is detected while reading n bits". In our scenario, the risk is that a data or parity sector on one of the surviving disks cannot be read while the array is being rebuilt.

How to find out my probability of failure?

A typical consumer grade hard drive is rated at one URE per 10¹⁴ bits read. To calculate the probability, we use the formula Probability = 1-((X-1)/X)^R, where X is the number of outcomes and R is the number of trials. In this case X is the drive's rating, 10¹⁴ bits per error, and R is the number of bits that must be read from the surviving disks after one disk is lost. For a 3x4TB array, that is 8 TB.

Math Time!

First, we want to convert 8 TB to bits: 8*10¹² bytes times 8 bits per byte equals 6.4*10¹³ bits. Now we can finally calculate the probability of our RAID 5 failing to rebuild after losing a disk.

Probability = 1-((10¹⁴-1)/10¹⁴)^(6.4*10¹³) ... plug this bad boy into Wolfram Alpha and you get 0.4727... Multiply that by 100 and you have a 47% chance of your RAID 5 array failing to rebuild in a 3x4TB setup that loses a drive. If you have a 4x4TB setup and you lose one drive, 12 TB (9.6*10¹³ bits) has to be read, and you have a 61% chance of the array failing to rebuild.
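If you would rather check the numbers locally than paste them into Wolfram Alpha, here is a minimal Python sketch of the same formula. The function name rebuild_failure_probability and its parameters are my own naming, not from any library. It assumes decimal terabytes (1 TB = 10¹² bytes) and independent bit errors, just like the formula does. It goes through log1p/expm1 because raising a number that close to 1 to such a large power loses precision in plain floating point.

```python
import math


def rebuild_failure_probability(surviving_disks, disk_tb, bits_per_ure=1e14):
    """Chance of at least one URE while reading every surviving disk in full.

    Hypothetical helper: assumes decimal TB and independent bit errors.
    """
    # TB -> bytes -> bits, summed over all surviving disks
    bits_to_read = surviving_disks * disk_tb * 1e12 * 8
    # 1 - (1 - 1/X)^R, evaluated as -expm1(R * log1p(-1/X)) for accuracy
    return -math.expm1(bits_to_read * math.log1p(-1.0 / bits_per_ure))


print(f"3x4TB, one disk lost: {rebuild_failure_probability(2, 4):.1%}")
print(f"4x4TB, one disk lost: {rebuild_failure_probability(3, 4):.1%}")
```

The two printed values should land close to the 47% and 61% figures worked out by hand.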

So should I risk it?

The overall conclusion is that if you are using consumer grade hard drives, it can be risky to use large drives with RAID 5. It's a much different story with enterprise grade drives and hardware. As an example, with a Seagate Enterprise drive rated at one URE per 10¹⁵ bits, a 4x4TB RAID 5 array that loses a drive has only a 9% chance of failing to rebuild. With consumer grade hard drives, the same array has a 61% chance of failure.
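To see the consumer versus enterprise difference side by side, a rough Python sketch like the one below loops over both URE ratings and both array sizes. The helper p_fail and the constant BITS_PER_TB are names I made up for illustration, and the script assumes 4 TB disks in decimal terabytes with exactly one disk lost.

```python
import math

BITS_PER_TB = 8e12  # assumption: decimal TB, 10^12 bytes * 8 bits


def p_fail(bits_to_read, bits_per_ure):
    """Hypothetical helper: 1 - (1 - 1/X)^R, evaluated stably."""
    return -math.expm1(bits_to_read * math.log1p(-1.0 / bits_per_ure))


for label, bits_per_ure in [("consumer 1e14", 1e14), ("enterprise 1e15", 1e15)]:
    for total_disks in (3, 4):
        # After one disk fails, every remaining 4 TB disk is read in full
        bits = (total_disks - 1) * 4 * BITS_PER_TB
        print(f"{label:16s} {total_disks}x4TB: {p_fail(bits, bits_per_ure):.1%}")
```

Changing the disk size or the URE rating in the loop is an easy way to try other setups before committing to a RAID level.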

In response to a comment by Christopher, I found this sleek RAID rebuild failure chance calculator. It quickly and easily calculates the probability of a URE occurring during a RAID 5 or RAID 6 rebuild.

RAID rebuild failure chance calculator - created and maintained by magJ

DrZoo

Posted 2018-06-27T14:05:01.650

Reputation: 8 101

Neat! Do you have a source and/or similar information for RAID10? – Christopher Hostage – 2018-06-29T15:11:51.343

@ChristopherHostage No, I don't. But thanks to your comment I did some looking around and happened to stumble upon a nice calculator for calculating the probability of failure for RAID 5 and RAID 6. – DrZoo – 2018-06-29T15:29:17.520

@ChristopherHostage I showed this post to some co-workers today because this topic came up. I somehow stumbled upon a Reddit post that shows how to calculate the failure rate for RAID 10 due to UREs. – DrZoo – 2018-11-05T20:54:43.277

Excellent answer, thank you. I am, however, curious - what happens after an URE? Is the whole array lost, all the data "after" the URE or just files residing on blocks affected by the URE? – quantum – 2019-05-08T20:15:11.630

1

@quantum "If a RAID5 system experiences a URE during rebuild, is all the data lost?" I believe that will answer your questions. – DrZoo – 2019-05-08T21:20:13.923