Any study about RAID arrays with identical/different drive models?

5

2

We often see advice to mix hard-drive models or brands in RAID arrays (this also applies to any group of disks, for example a ZFS pool).
The rationale is that drives produced in the same batch tend to share the same intrinsic defects, and so tend to fail together.

I have used identical drives for RAID for years on 60+ systems and never noticed a problem.
But other people have.
Point of view, statistics, coincidence, luck, fate... or a real hazard?

Is there any serious study or source about drive pairing in a RAID array?

The only good argument I know of so far concerns firmware: when a drive bricks because of a firmware bug, its twin is very likely to fail within a narrow time frame. But so is a similar drive from another batch. This is a rare event, but since we are comparing small improvements between two methods, rare events count in the balance.

Gregory MOUSSAT

Posted 2014-02-07T18:00:33.793

Reputation: 1 031

Question was closed 2014-02-15T15:22:57.800

"Questions seeking product, service, or learning material recommendations are off-topic because they become outdated quickly and attract opinion-based answers. Instead, describe your situation and the specific problem you're trying to solve. Share your research." – Ƭᴇcʜιᴇ007 – 2014-02-07T18:07:19.693

1 – I can speak from experience - it does happen. I once watched a server with a 7-drive array lose almost every drive in that array over a 6-week period due to a bad batch of identical disks. Fortunately the spares came in fast enough and rebuilt in time to prevent data loss. That said, I assume the risk is small. – uSlackr – 2014-02-12T17:45:07.430

Answers

4

I know of two papers about hard drives and/or RAID:

Using Device Diversity to Protect Data against Batch-Correlated Disk Failures
This one is based on batch failures, but does not discuss how frequently such problems occur.

Disk failures in the real world: What does an MTTF of 1,000,000 hours mean to you?
This one is based on a study of 100,000 disks, with a little coverage of batches.

Bertrand SCHITS

Posted 2014-02-07T18:00:33.793

Reputation: 1 043

This question is from a few years ago, but since then BackBlaze has started releasing data on massive hard drive usage and longevity. – Christopher Hostage – 2018-02-13T17:16:42.953

7

I have never seen a study on RAID arrays specifically, but what you're referring to is called Common Mode Failure in the scientific community and there are plenty of studies on that. Google is your friend.

Anecdotally, like you, I've built RAID (5/6) arrays on many systems for many years. Of the half-dozen or so systems I built with identical drives, the ones that had drive failures all had multiple drive failures within months of each other. I had one array years ago with 8 identical 9 GB drives, and 6 of them failed within a 6-month window after running fine for 3+ years. This certainly firmed up my opinion of certain drive manufacturers. On the flip side, the arrays that didn't have failures are still working just fine, one going on 10 years with (enterprise) drives that had a 3-year warranty.

But Common Mode Failure still applies here. I try to mix and match manufacturers on same-sized (enterprise) drives to avoid the issue entirely. (I've also switched to ZFS to get past the RAID 5 write hole, but that's another topic.)

milli

Posted 2014-02-07T18:00:33.793

Reputation: 1 682

1 – I can't find any valuable reference to Common Mode Failure referring to RAID. Can you give me some pointers, please? – Bertrand SCHITS – 2014-02-13T10:44:57.990

I haven't either, hence the first sentence in my response. Just read any paper on Common Mode Failure and you can extrapolate to using identical drives in a RAID array... a flaw in one means there is very likely the same flaw in all of the drives, meaning your probability of losing the array is much higher than with different manufacturers' drives (of the same capacity). – milli – 2014-02-13T16:01:23.983
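The extrapolation above can be sketched numerically. Below is a hypothetical back-of-the-envelope model for a RAID 5 array (which survives one failure, and loses data if two or more drives fail before a rebuild completes). All the numbers (`p_good`, `p_bad`, `d`) are made-up illustrative values, not measurements; the point is only the structure: a shared batch defect hits identical drives all at once, while mixed drives fail more independently.

```python
def p_two_or_more(n, p):
    """Probability that >= 2 of n drives fail in the window,
    assuming each drive fails independently with probability p."""
    return 1 - (1 - p) ** n - n * p * (1 - p) ** (n - 1)

n = 8            # drives in the array (RAID 5 tolerates one failure)
p_good = 0.01    # per-drive failure probability in the window (assumed)
p_bad = 0.20     # same, for drives from a defective batch (assumed)
d = 0.05         # probability that a batch is defective (assumed)

# Identical drives: one batch, so a defect affects every drive at once.
p_identical = d * p_two_or_more(n, p_bad) + (1 - d) * p_two_or_more(n, p_good)

# Mixed drives: each drive independently has a d chance of coming from
# a bad batch, so its marginal failure probability is just the average.
p_marginal = d * p_bad + (1 - d) * p_good
p_mixed = p_two_or_more(n, p_marginal)

print(f"identical batch: {p_identical:.4f}")
print(f"mixed batches:   {p_mixed:.4f}")
```

With these (invented) numbers the identical-batch array loses data a few times more often than the mixed one, even though every individual drive has the same marginal failure probability in both setups. That is the common-mode effect: correlation, not per-drive quality, is what changes.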