I'm looking to build a nice little RAID array for dedicated backups. I'd like to have about 2-4TB of space available, as I have this nasty little habit of digitizing everything. Thus, I need a lot of storage and a lot of redundancy in case of drive failure. I'll also essentially be backing up 2-3 computers' /home folders using one of the "Time Machine" clones for Linux. This array will be accessible over my local network via SSH.
I'm having difficulty understanding how RAID-5 achieves parity and how many drives are actually required. One would assume that it needs 5 drives, but I could be wrong. Most of the diagrams I've seen have only confused me further. It seems that this is how RAID-5 works; please correct me, as I'm sure I'm not grasping it properly:
/---STORAGE---\ /---PARITY----\
| DRIVE_1 | | DRIVE_4 |
| DRIVE_2 |----| ... |
| DRIVE_3 | | |
\-------------/ \-------------/
It seems that drives 1-3 appear and work as a single, massive drive (capacity * number_of_drives) and the parity drive(s) back up those drives. What seems strange to me is that I usually see 3+ storage drives in a diagram to only 1 or 2 parity drives. Say we're running four 1TB drives in a RAID-5 array, 3 running storage and 1 running parity: we have 3TB of actual storage, but only 1TB of parity!?
I know I'm missing something here. Can someone help me out? Also, for my use case, which would be better, RAID-5 or RAID-6? Fault tolerance is the highest priority for me at this point; since it's going to be running over a network for home use only, speed isn't hugely critical.
It was easy to "feel" that you already had (drives - 1)/drives of your information even without the parity on a single drive failure, but the explanation here makes the reason obvious. If you have n-1 drives' worth of bits from your XOR equation, comparing an XORing of the n-1 to your parity bit will always tell you if the "lost" bit is switched on or not. Nicely done. (Understanding RAID 6, heaven help me.) – ruffin – 2014-10-03T19:19:12.193
If the parity is just an XOR of the two other disks, how do you know which of the two disks was corrupted? Wouldn't a bit flip on either disk result in a bit flip in the parity? – Jay Sullivan – 2015-01-11T21:19:47.157
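On the question of which disk was corrupted: as far as I understand it, a single XOR parity can only show that a stripe is inconsistent, not which member is wrong. Rebuilding works because the array already knows which drive is missing (it reported an error or dropped out). A minimal sketch of that distinction, using one bit per disk; the names `xor_bits`, `disk_a`, and `disk_b` are made up for illustration:

```python
def xor_bits(bits):
    """Return the XOR of a sequence of 0/1 values (1 if the count of 1s is odd)."""
    result = 0
    for bit in bits:
        result ^= bit
    return result


# Two data disks plus one parity disk, one bit from each.
disk_a, disk_b = 1, 0
stored_parity = xor_bits([disk_a, disk_b])

# A silent flip on either data disk produces the same symptom: a parity
# mismatch. Nothing in the mismatch says which disk flipped.
mismatch_if_a_flips = xor_bits([disk_a ^ 1, disk_b]) != stored_parity
mismatch_if_b_flips = xor_bits([disk_a, disk_b ^ 1]) != stored_parity

# When the array knows disk_b is gone, the surviving bit and the parity
# pin down the missing value.
rebuilt_b = xor_bits([disk_a, stored_parity])

print(mismatch_if_a_flips, mismatch_if_b_flips, rebuilt_b == disk_b)
```

Both flips look identical from the parity's point of view, which is why the known-failed-drive case is the one RAID-5 is built for.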
Hi. I'm a little confused about situations like line 4 - (1,1,0 = 0). If you have (1,1,?) = 0, ? could be 1 or 0 and the XOR would still be correct. What am I missing? – MarkD – 2019-10-03T12:22:32.737
@MarkD Don't think of it as XOR, think of it as "even or odd number of 1s". (1,1,0 = 0), (1,1,1 = 1). – The Guy with The Hat – 2019-12-11T19:18:45.130
"If you have (1,1,?) = 0, ? could be 1 or 0 and the XOR would still be correct. What am I missing?" If you have a XOR b XOR c, you first compute a XOR b, and then compute the result XOR c. Think of it like [ ( a XOR b ) XOR c ]. – Vinny – 2020-02-27T00:33:48.987
So for a=1,b=1,c=0, you have [ ( 1 XOR 1 ) XOR 0 ] = [ ( 0 ) XOR 0 ] = 0
But for a=1,b=1,c=1, you have [ ( 1 XOR 1 ) XOR 1 ] = [ ( 0 ) XOR 1 ] = 1 – Vinny – 2020-02-27T00:39:55.840
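The XOR arithmetic in the comments above extends from single bits to whole blocks. Here is a small sketch, assuming three equal-sized data blocks and one parity block per stripe; `parity` and `rebuild` are illustrative names, not part of any RAID tool:

```python
from functools import reduce
from operator import xor


def parity(blocks):
    """XOR equal-sized byte blocks together, one byte column at a time."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


def rebuild(surviving_blocks, parity_block):
    """Recover the one missing block from the survivors plus the parity."""
    return parity(list(surviving_blocks) + [parity_block])


# One stripe: three data blocks (made-up contents) and their parity.
data = [b"\x01\x01\x00", b"\xf0\x0f\xaa", b"\x10\x20\x30"]
p = parity(data)

# Pretend the drive holding data[1] died; the other two plus parity suffice.
recovered = rebuild([data[0], data[2]], p)
print(recovered == data[1])
```

The same function computes the parity and rebuilds a lost block, because XORing the parity with the surviving blocks cancels them out and leaves the missing one.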
Excellent answer. I was thinking on too large a scale, on an actual complete hard-disk basis, rather than a bit-level. So does RAID-5 use a dedicated drive for parity, or rather all drives for parity? I'm confused on that. – Naftuli Kay – 2011-05-23T22:44:40.220
I believe the modern approach is to distribute the parity diagonally across all the drives. This has the effect of accelerating the read time to parity bits since multiple IO requests can be sent in parallel to different drives, but don't quote me on that. – Matt – 2011-05-23T22:55:02.920
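To picture the "diagonal" distribution, here is a rough sketch. The rotation rule below (parity moves one drive to the left on each stripe) is an assumption chosen for illustration; real implementations offer several layouts, and `parity_drive` and `layout` are hypothetical names:

```python
def parity_drive(stripe, n_drives):
    """Assumed rotation: parity starts on the last drive and moves left each stripe."""
    return (n_drives - 1 - stripe) % n_drives


def layout(n_drives, n_stripes):
    """Return one row per stripe, marking each drive as data (D) or parity (P)."""
    rows = []
    for stripe in range(n_stripes):
        p = parity_drive(stripe, n_drives)
        rows.append(["P" if drive == p else "D" for drive in range(n_drives)])
    return rows


for row in layout(4, 4):
    print(" ".join(row))
# D D D P
# D D P D
# D P D D
# P D D D
```

Every drive ends up holding both data and parity, so no single drive is "the parity drive".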
Is there a mathematical formula I can use to determine the capacity given x drives and y GB available on each drive? – Naftuli Kay – 2011-05-23T22:59:01.487
Yeah, it's the (smallest drive size) * (number of drives in array - 1) – Matt – 2011-05-23T23:01:20.650
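The formula above is easy to turn into a quick calculator. This is only a sketch: `usable_capacity_gb` is a made-up helper, and the `parity_drives` parameter is my addition (1 gives the RAID-5 formula from the comment; 2 is the usual figure for RAID-6, which the thread itself does not spell out).

```python
def usable_capacity_gb(drive_sizes_gb, parity_drives=1):
    """(smallest drive size) * (number of drives - parity_drives).

    parity_drives=1 matches the RAID-5 formula in the comment above;
    parity_drives=2 is an assumption covering RAID-6.
    """
    n = len(drive_sizes_gb)
    # RAID-5 needs at least 3 drives and RAID-6 at least 4.
    if n < parity_drives + 2:
        raise ValueError("not enough drives for this parity level")
    return min(drive_sizes_gb) * (n - parity_drives)


print(usable_capacity_gb([1000, 1000, 1000, 1000]))                   # 3000
print(usable_capacity_gb([1000, 1000, 1000, 1000], parity_drives=2))  # 2000
print(usable_capacity_gb([1000, 2000, 2000]))                         # 2000
```

The first line reproduces the four 1TB drives from the question: one drive's worth of space goes to parity and 3TB stays usable. The last line shows why mixed sizes waste space, since every drive is treated as if it were the smallest.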