Average life of SATA Drives?


What is the average life of a SATA hard drive?

Almost all the data I can find gives failure rates for the first 0-5 years, but none of it seems to follow drives to the actual end of their lives.

The reports, charts, and studies by Google, Backblaze, and the like only tell part of the story, as they focus on roughly the first 5 years.

Hypothetically, saying that 50% of drives die within 8 years does not imply that the other 50% die by year 16. Is there a chart that follows 100% of a set of drives to their death and gives the results? Or something that would provide equivalent information?
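To make that distinction concrete, here is a toy sketch in Python (with invented failure rates, not real drive data) of two fleets that look identical for the first 8 years, with 50% of drives dead by year 8, yet end up in very different places by year 16:

    # Toy illustration with invented numbers: two fleets where 50% of
    # drives are dead by year 8, but the survivors behave very differently.
    def survivors(yearly_failure_rates):
        """Fraction of the original fleet still alive after each year."""
        alive, curve = 1.0, []
        for rate in yearly_failure_rates:
            alive *= 1 - rate
            curve.append(alive)
        return curve

    fleet_a = survivors([0.083] * 16)              # constant ~8.3%/year forever
    fleet_b = survivors([0.083] * 8 + [0.01] * 8)  # survivors almost stop failing

    print(f"year 8:  A = {fleet_a[7]:.0%}, B = {fleet_b[7]:.0%}")    # both ~50%
    print(f"year 16: A = {fleet_a[15]:.0%}, B = {fleet_b[15]:.0%}")  # ~25% vs ~46%

The 0-8 year data for both fleets is identical, so any study that stops at 5-8 years cannot tell these two worlds apart. That is exactly the information gap I am asking about.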

Assuming a heavy consumer workload on consumer drives in a typical climate-controlled home/office, what is a real-world average hard drive lifespan? Again, not failure rates over a (short) fixed observation window.

Our real-world results: we've had less than 10% drive failure in 10 years, and never failures close together, so I am pretty comfortable using aged drives, but I like to be informed where possible. Our current set of drives ranges from 0 to 8 running years, averaging probably around 3-4 years; the most recent failure was a drive that had run for 5 years. Further, we have a 40 GB and an 80 GB drive that are each well over 10 years old (by manufacture date) and still get used reliably here and there. That is enough data to say SATA HDDs can last reliably well beyond 5 years, but not enough to show a trend of how long.

Background:

We are moving to an OBR10 (one big RAID 10) setup for a small business, using aged SATA drives of 4-6 years, and I am trying to figure out how prudent it would be to use a 3-copy MD RAID 10 versus a 2-copy one.

With daily data mirrors and full backups, a complete loss of the primary array would not be catastrophic (we would rebuild and restore from backup), but I would love to avoid that scenario. However, I cannot find data that looks well beyond the age of our current drives, and there is no indication that drives fail in droves at the ~5-year mark, which is where the data seems to stop.
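For a rough way to frame the 2-copy vs. 3-copy decision, here is a naive back-of-the-envelope sketch in Python. The 10% annual failure probability is an assumption for our aged drives, and the model ignores rebuild windows and correlated failures, which make real arrays riskier than this:

    # Naive independence model with assumed numbers, not measured ones:
    # chance that every copy in some mirror set fails within one year.
    p_drive = 0.10   # assumed annual failure probability per aged drive
    mirror_sets = 3  # e.g. 6 drives as 3 x 2-copy sets, or 9 as 3 x 3-copy sets

    for copies in (2, 3):
        p_set_lost = p_drive ** copies                       # all copies in one set die
        p_array_lost = 1 - (1 - p_set_lost) ** mirror_sets   # any dead set loses the array
        print(f"{copies}-copy RAID 10: ~{p_array_lost:.2%} chance of array loss per year")

Even this crude model suggests the third copy buys roughly an order of magnitude (about 3% vs. 0.3% per year here), which is the kind of margin I am weighing against the cost of the extra drives.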

Damon

Posted 2017-01-02T07:43:13.487

Reputation: 1 789


They tend to fail by usage, not age, so it really depends on workload. The best data I can think of is what Backblaze publishes: https://www.backblaze.com/blog/hard-drive-failure-rates-q2-2016/ - no one else I know of publishes anything like it.

– djsmiley2k TMW – 2017-01-02T07:57:06.423

@djsmiley2k Annualized failure rates for drives 0-5 years old say nothing about the average life span of drives, and further nothing about failure rates after 5 years. I would agree that life span without corresponding annualized failure rates for a given group is also problematic for making decisions, but where is the data for 5-10+ years? There have got to be many millions of drives, if not billions, older than 5 years and still running reliably. My assumption is someone somewhere has some insight. – Damon – 2017-01-02T08:09:08.400

Also, the other thing to bear in mind: if you've got a 10-year-old drive, it's likely running SATA 1. At some point it becomes harder to pick up replacement drives 'on the spot', so to speak, and also more expensive (if required) to recover data from said drives. – djsmiley2k TMW – 2017-01-02T08:09:59.680

@djsmiley2k I'm not sure I agree. SATA I still works on new hardware (SATA III), and the interface churn has slowed down. Not to mention that replacing an old failed drive with a new drive on the new SATA interface and adding it to the array is not a problem; further, we do not need to find a drive of the same vintage to replace a failed one, so no problem there. Also, data recovery would not be needed, given the mirrors and backups. – Damon – 2017-01-02T08:22:35.820

Then the question arises: why do you care if the disk is going to fail? – djsmiley2k TMW – 2017-01-02T08:31:16.147

@djsmiley2k Namely: downtime, the possibility of user error during a restore, and minimizing the time the array spends in a degraded state. We do care about the data; we just work within a budget, like to make informed decisions, and currently have no information on the question at hand. – Damon – 2017-01-02T08:38:08.750

Let us continue this discussion in chat.

– djsmiley2k TMW – 2017-01-02T08:50:56.440

OK, car analogy time: there are cars from the 1920s that still run. Would you trust yourself to them? Also, the fact that there's no 10-year data is due to the reasons I pointed out: the people who test on a large enough scale to generate this data (Google, Backblaze) don't run the hardware that long, because it doesn't make sense to do so. Interface and technology changes mean they move to newer versions before the hardware ever reaches that 10-year point. – djsmiley2k TMW – 2017-01-02T10:49:54.823

Answers


TL;DR: It's impossible to put a number on average hard drive life, 'cause it's too darned complex.

There's no real measure of average life, since it depends deeply on a whole load of different factors. It's a little like asking how long a piece of string is. For a specific drive, a datasheet may have some relevant information, though it's still a rough indication that may need to be interpreted with a pinch of salt and tea leaves.

To start with, a single drive failure when you have one drive is a tragedy; a single drive failure in a RAID array that's part of a cluster of arrays is a statistic. One cannot look at a specific drive and say "this will certainly last a decade". One can say "this drive ought to last 5 years" and plan to replace it in an orderly manner.

I'd also note that Backblaze, Google, and most of the industry are concerned with average failure rates and reliability over the lifespan of a drive under specific conditions. They want to buy a truckload of drives, run them as cheaply and efficiently as possible, and not really worry about them until planned replacement. It's even better to know "these are the signs a drive is about to die" than to have drives die unexpectedly, and to be able to balance the cost of cooling a facility against the hardware cost of toasty hard drives frying.

Practically speaking, hard drives are commodity devices, and most places don't actually keep track of reliability. It's only (relatively!) recently that large companies started deploying gigantic fleets of these drives and began sharing their reliability information.

There's a good reason the focus is on predictive failure analysis and on picking reliable models, rather than on long-term reliability: all hardware dies, and it's 'cheaper' in terms of manpower, downtime, and in some cases even accounting, to replace drives before they tend to die of mechanical failure.

Specific drives may have issues: the Seagate 7200.11, for example, was known for randomly dying due to bad firmware, which was later fixed. Other drive brands and models may have ridiculous levels of reliability; I've literally never had an HGST desktop drive fail, ever.

You could look up the mean time to failure (MTTF) for the model, which should correlate with the average life of the drive, but modern literature seems to consider it a load of horse hockey; Seagate has switched to AFR (annualized failure rate) anyway.
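For what it's worth, one common way an MTTF gets turned into an AFR is via a constant-failure-rate (exponential) model. A quick sketch in Python, assuming roughly 8,766 powered-on hours per year:

    import math

    def afr_from_mttf(mttf_hours, hours_per_year=8766):
        """Annualized failure rate implied by a datasheet MTTF under a
        constant-failure-rate (exponential) assumption. Real drives do
        not fail at a constant rate, which is exactly why MTTF misleads."""
        return 1 - math.exp(-hours_per_year / mttf_hours)

    # A 1,000,000-hour MTTF sounds like "114 years", but it really just
    # says ~0.87% of a big fleet fails per year during the design life:
    print(f"{afr_from_mttf(1_000_000):.2%}")

That 1,000,000-hour figure says nothing about how long any one drive lasts beyond its design life, which is the gap the question is asking about.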

While looking this up, I came across this great set of slides by someone from WD. I'm not sure whether the associated lecture is anywhere online.

It gives an excellent indication of the minimum reliability/lifetime that a major hard drive maker expects:

Avoid an un-manageable catastrophe midway (or beyond) through a product’s warranty life

The typical warranty for an enterprise device, and for older consumer hard drives, is 5 years; it's 3 years for newer consumer drives. So your hard drive maker assumes that their drives will not fail before 5 years, because in-warranty failures cost them money. Beyond that, they assume you'll either accept the risk or replace the drive.

The rest of the presentation is a good read too, though I'll skip over most of the physics.

This is a simple little graphic showing all the elements involved in hard drive reliability, taken from the same set of slides:

[Image: diagram of the design, manufacturing, and environmental factors that feed into hard drive reliability]

And while the classic bathtub curve is what people talk about with drive reliability, things like the actual duty cycle, when writes happen to a drive, and temperature matter too, in addition to all those design and environmental factors. It's just too complex to guess.
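If you want to picture how those regimes stack up, here is a toy bathtub-style hazard curve in Python; every constant is invented for illustration, not fitted to any real drive data:

    import math

    def bathtub_hazard(age_years,
                       infant=0.05,     # invented infant-mortality scale
                       baseline=0.015,  # invented constant "useful life" rate
                       wearout=0.002,   # invented wear-out scale
                       onset=4.0):      # invented wear-out onset age (years)
        """Toy annual hazard rate: early failures that decay away,
        a flat baseline, and wear-out that grows after an onset age."""
        early = infant * math.exp(-2.0 * age_years)
        late = wearout * max(0.0, age_years - onset) ** 2
        return early + baseline + late

    for year in range(0, 11, 2):
        print(f"year {year:2d}: hazard ~ {bathtub_hazard(year):.3f}")

Shift any of those invented knobs (duty cycle, temperature, firmware, vintage) and the whole right-hand side of the tub moves, which is why nobody can hand you a single average.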

Journeyman Geek

Posted 2017-01-02T07:43:13.487

Reputation: 119 122

It definitely makes sense that there is no real predictive model, given all the variables. I guess I had a notion that, with the billions of drives deployed, someone might have run a set of them to death and documented the real-world results on a basic level. Although I can see how the failure curve for Hitachi drives vs. a brand like Seagate, taken out to 15 years, would be drastically different, meaning generalizations would break down unless the data had enough diversity. Thank you for your insight! – Damon – 2017-01-02T16:41:56.457

I think we will go with the no-news-is-good-news concept. I sought out data or information saying drives fail, or don't fail much, after 5 years and got no answer specifically to that point, which probably means there is no cliff of failures at the 8-year mark or the like. We have a small data set on the drives we have used, so I will start tracking drive models, age, and running years and see if we can find trends over a longer period of time. We specifically buy Hitachi drives, based on the data we do have, although with HGST owned by WD, trends there will change. – Damon – 2017-01-02T16:46:00.677