How can a disk array use less parity than data?

2

0

I've been looking at disk array calculators like the one shown here:

https://www.synology.com/en-us/support/RAID_calculator

I put in 16TB (4+6+6) and it says I can have 10TB of data and 6TB of protection. How can it afford to offer this? I would think that after striping 4TB it should be unable to protect the remaining 2TB, because there is not enough information to resolve parity.

I assume the parity is compressed in some way. Is there a way to prove that a maximum quantity of space will be required for the compressed parity, or is it lying and would pull some hijinx like pretending the drive is full if it couldn't get the compression ratio it wants?

awiebe

Posted 2017-08-15T05:59:12.813

Reputation: 134

Don't use SHR - it will use RAID5. Don't use RAID5 either - with 4TB and 6TB disks you are simply asking for your data to be destroyed. RAID5 is a joke with big modern consumer drives. It's an enormous gamble to rebuild, and that's when you need it most. There is no protection here. This is a bad combination of disks to put into an array - just don't do it. – J... – 2017-08-15T09:58:41.010

^^ Hear, hear. It literally takes half a week at these sizes to build and rebuild. Which is no biggie when disks are new, there's no data on them anyway, and failures are a mostly theoretical thing. But when you have SMART degrading (or one disk already failed) then scrubbing the remaining disks for 4 days straight is not precisely what you dream of. – Damon – 2017-08-15T10:02:38.550

So even though this is doable, your advice is to always have a configuration with a spare. I think Synology has solved this "spare problem" with SHR-2, which has the requisite double redundancy, but will still automatically upscale my volumes if I add bigger drives. – awiebe – 2017-08-16T05:25:25.250

@awiebe No, the advice is to not use RAID5. If you have a single disk failure with RAID5 and 3-6TB consumer grade disks then your only hope to get the array back online is to swap in a good drive and wait half a week for the rebuild, crossing your fingers that you don't get a single read error on any other sector of any other drive - an outcome with such frighteningly low probability that you might as well just forget about it. RAID5 is basically a massive liability. – J... – 2017-08-16T14:22:40.960

Answers

2

With three drives of 4, 6, and 6 TB, the Synology would configure your data as:

  • A 12TB RAID5 array using the first 4TB of each disk -- 8 TB of data, 4 TB of parity.

  • A 4TB RAID1 array across the remaining space on the two 6TB disks, with 2TB of data mirrored to each disk (2TB data, 2TB "parity").

In total: 10 TB of data, 6 TB of redundant copies. No compression required.
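The arithmetic above can be reproduced with a short sketch. This is only an illustration of the tiered layout described in this answer, not Synology's actual algorithm: the function name `shr_layout` and the rule "RAID1 for two members, RAID5 for three or more" are assumptions made for the example.

```python
def shr_layout(disk_sizes_tb):
    """Hypothetical sketch: slice the disks at each distinct size boundary
    and build one single-redundancy array per slice.

    Returns a list of (level, members, data_tb, redundancy_tb) tuples.
    """
    sizes = sorted(disk_sizes_tb)
    tiers = []
    prev = 0
    for i, size in enumerate(sizes):
        height = size - prev          # thickness of this slice on each member disk
        members = len(sizes) - i      # disks large enough to take part in this slice
        if height > 0 and members >= 2:
            raw = height * members
            redundancy = height       # one member's worth of space per slice
            level = "RAID1" if members == 2 else "RAID5"
            tiers.append((level, members, raw - redundancy, redundancy))
        prev = size
    return tiers


tiers = shr_layout([4, 6, 6])
for level, members, data, redundancy in tiers:
    print(f"{level} over {members} disks: {data} TB data, {redundancy} TB redundancy")

total_data = sum(t[2] for t in tiers)
total_redundancy = sum(t[3] for t in tiers)
print(f"Total: {total_data} TB data, {total_redundancy} TB redundancy")
```

For the 4+6+6 case this gives an 8 TB + 4 TB RAID5 slice and a 2 TB + 2 TB RAID1 slice, matching the 10 TB / 6 TB split the calculator reports. Any space on a single largest disk that has no partner would simply be left unused in this sketch.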

duskwuff -inactive-

Posted 2017-08-15T05:59:12.813

Reputation: 3 824

Ah so the answer is that SHR will fall back to mirroring, meaning if any of the data in that 4TB block becomes corrupt there isn't actually enough information to know which blocks are corrupt, only enough to prevent loss. Which is why I would disagree and not call that 2TB of parity. – awiebe – 2017-08-16T05:22:45.023

None of it is "parity" in that sense. The extra disk in a RAID5 array can be used to detect errors, but not to correct them. (Modern hard disks internally perform enough error correction that this is rarely an issue, anyway.) – duskwuff -inactive- – 2017-08-16T05:26:06.163

Well, it's parity in the mathematical sense, but yes, it just tells you what disk to kick out of the array. I don't know if there's a hybrid solution that would allow you to mark that sector as bad across all the disks in the array in exchange for not having to replace a whole disk. I suppose the argument is that SMART is enough so that you can decide which disk was bad in the mirrored pair, which is why you can claim that SHR using mirroring is "protection". – awiebe – 2017-08-16T05:33:01.597

No, the "parity" disk in RAID5 doesn't even tell you that much. For the case of a three-drive RAID5 array, the three disks will hold stripes a, b, and a XOR b. It's possible to tell if the three are out of sync, but there's no way to tell which one is correct. – duskwuff -inactive- – 2017-08-16T05:37:43.393

0

Depending on how the array is set up, all you need to protect against is a single drive failing. The third drive contains what is known as "parity" information for the other two drives.

Effectively what you have is the following oversimplification:

  • Drive 1: data block 1
  • Drive 2: data block 2
  • Drive 3: block of data that contains sum of data block 1 and data block 2

If data block 1 becomes corrupt then you can simply take the data on drive 2 and subtract it from the data on drive 3 to recover the information on drive 1. In this way you have resilience, but don't need as much space as a full backup and it can be extended to any number of drives and give you fault tolerance for any single drive failure.

This will be slower than just reading the original data and will not protect your data if you have two drives fail, but it does mean that you only need to use a single drive for the protection rather than having to have a backup drive for every "data" drive.
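The recovery step described above can be sketched in a few lines. Real arrays use XOR rather than ordinary addition and subtraction, and XOR is its own inverse, so the same operation both builds the parity and rebuilds a lost block. The helper name `xor_blocks` and the sample block contents are made up for this illustration.

```python
def xor_blocks(*blocks: bytes) -> bytes:
    """XOR any number of equal-length blocks together, byte by byte."""
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("blocks must have equal length")
    out = bytearray(length)
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


block1 = b"data on drive 1"              # contents of drive 1
block2 = b"data on drive 2"              # contents of drive 2
parity = xor_blocks(block1, block2)      # what drive 3 stores

# Drive 1 is lost: rebuild it from the two surviving drives.
recovered = xor_blocks(parity, block2)
assert recovered == block1
```

The same function works for more drives: the parity is the XOR of all data blocks, and any one missing block is the XOR of the parity with the remaining blocks.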

As an example, for RAID 1 (mirror) you will always have one half of your total number of disks "wasted" to give you redundancy in case of failure. In the scheme above (RAID 5) you could have four disks and only need to use one of them for the parity information, meaning that rather than using half of your disk space for protection you are now only using one quarter, leaving you three quarters for your storage.

Mokubai

Posted 2017-08-15T05:59:12.813

Reputation: 64 434

This nicely explains what would happen in the 4x4TB case, but not the 4+6+6 case which was explained well by duskwuff. Of course when you say addition I assume you're actually using a finite field other than integers. – awiebe – 2017-08-16T10:10:22.020

Yes, duskwuff's answer is better from that perspective. "Addition" is a bad word, I admit, but I was at a loss at the time to describe something like a complicated logical XOR operation happening on the data. As you have your answer I'm happy to delete this unless you think it is in some way useful. – Mokubai – 2017-08-16T10:19:27.340