How to interpret SMART attributes?

5

5

I want to estimate the health and the remaining lifetime of my hard drive using SMART (in my case gsmartcontrol). However I don't know how to interpret the attributes. More specifically:

  • Which attributes are most important for these estimates?
  • How do I interpret the raw values: how high or low are the raw values allowed to be for a given hard drive (for example, in my case, a WD Scorpio Black)?
  • Are there any tables from the manufacturer where I can compare the current values with some limits?

How would you interpret the current raw values for my WD Scorpio Black as shown below concerning health and remaining lifetime? (I have used the drive regularly for 3 years now, and I am not going to change the usage pattern.)

Smart

student

Posted 2012-05-28T10:09:08.497

Reputation: 455

Answers

8

First, here's what I can tell you about your drive's health:

  • Your hard drive doesn't show any signs of impending failure (0 reallocated/pending sectors, no problems spinning up or with the SATA cable, and the "bad, but not lethal" attributes are mostly 0s)
  • Your laptop has had a fair number of knocks while it's operating (G-Sense + Free Fall Protection are fairly high)
  • Your hard drive runs at a fairly average temperature for a laptop drive (although this depends on how much load it was under when you took this report)

Some research conducted by Google indicates that drives are most likely to fail in the first 6 months, especially if under heavy use. Since your drive has survived 3 years without any signs of failure, it is likely to continue working just fine. That doesn't mean you shouldn't keep backups though, just in case ;-)

Also, try to avoid knocking your laptop while it is powered on, and try to avoid picking it up until it's powered off. This might have something to do with your drive's high Free Fall Protection & G-Sense error counts. Your drive has shock detection capabilities (some drives don't, and will always report 0s even if shaken while running), so it will attempt to park the drive heads when it detects movement. Obviously it hasn't killed your drive, but a particularly hard knock at the wrong time could make the drive's heads hit and damage the platters.

And some trivia and guesswork:

  • You are fairly mobile with your laptop, and likely use it on the move a lot (due to the sizable number of G-Sense & Free Fall Protection counts — these would be near 0 for a laptop used at a desk and turned off while moving)
  • Your laptop is on at least a third of the time (power-on hours add up to about a year of continuous use, and you say the drive is 3 years old)
  • You turn your laptop on and off several times a day (based on power cycle count compared to drive age)
  • Your laptop doesn't appear to have all the power saving options turned on (based on the load/unload cycle compared to power cycle count, and head flying hours compared to power on hours)
  • Your hard drive has written approximately 120TB of data and read 866TB of data (based on total LBAs written & read)
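The arithmetic behind the "a third of the time" and "several times a day" guesses can be sketched roughly as follows. The function names and the input numbers are made up for illustration (the power cycle count in particular is a placeholder, not a value from the report); plug in the raw values from your own SMART output.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap days


def powered_on_fraction(power_on_hours, drive_age_years):
    """Share of the drive's life spent powered on (0.0 to 1.0)."""
    return power_on_hours / (drive_age_years * HOURS_PER_YEAR)


def power_cycles_per_day(power_cycle_count, drive_age_years):
    """Average number of power-ups per calendar day."""
    return power_cycle_count / (drive_age_years * 365)


# Illustrative inputs only: about one year of power-on hours on a
# 3-year-old drive, and a hypothetical power cycle count of 3285.
print(round(powered_on_fraction(8760, 3), 2))  # 0.33
print(power_cycles_per_day(3285, 3))           # 3.0
```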

Which attributes are the most important?

The most important attribute in terms of failure rates is the Reallocated Sector Count. If it is a number greater than 0, then your drive is many times more likely to fail. The other important attribute is the Current Pending Sector Count (these can later turn into reallocated sectors). If either of these is higher than 1, then you should replace your drive as soon as possible. (source: Google research paper)

A particularly bad UDMA CRC Error Count can show that the SATA cable needs replacing. (source: personal experience)
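As a rough sketch, the rules of thumb above could be written as a small check. The function name, the wording of the messages, and especially `crc_limit` are my own placeholders: the answer gives no number for what counts as a "particularly bad" CRC error count, so the default of 100 is arbitrary.

```python
def assess_drive(reallocated, pending, udma_crc_errors=0, crc_limit=100):
    """Apply the rules of thumb from the answer to raw SMART counts.

    crc_limit is an arbitrary placeholder, not a documented threshold.
    """
    notes = []
    if reallocated > 1 or pending > 1:
        notes.append("replace the drive as soon as possible")
    elif reallocated > 0 or pending > 0:
        notes.append("elevated failure risk: back up and watch the counts")
    if udma_crc_errors > crc_limit:
        notes.append("check or replace the SATA cable")
    return notes or ["no warning signs in these attributes"]


print(assess_drive(reallocated=0, pending=0))
print(assess_drive(reallocated=5, pending=0, udma_crc_errors=2000))
```

A clean result here only means these particular attributes look fine; it is not a guarantee, so keep your backups either way.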

How to interpret the raw values

Raw values differ from manufacturer to manufacturer. In the case of Western Digital, most of the numbers tend to be how often the specific condition has occurred. Seagate drives store some of the numbers in a different manner, which results in very high raw values for some attributes. Given this, for many attributes (other than reallocated sector count, pending sector count, and other obvious raw counts) it makes more sense to look at the normalized value. The drive comes up with the normalized values, not the program, so it's what the drive considers to be normal.

Are there any tables from the manufacturer where I can compare the current values with some limits?

Generally, if an attribute's normalised (or worst) value reaches the threshold or lower, then the drive is toast. (Normalized values get worse as they approach zero.)
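The threshold rule can be sketched like this. The function name and the sample numbers are hypothetical, and treating a threshold of 0 as "this attribute never trips" is my assumption about how informational attributes are usually reported.

```python
def attribute_failed(value, worst, threshold):
    """True if the normalized or worst value has reached the threshold."""
    if threshold == 0:
        # Assumption: a threshold of 0 marks an informational attribute
        # that is never reported as failed.
        return False
    return min(value, worst) <= threshold


# Hypothetical normalized values, not taken from a real report
print(attribute_failed(value=200, worst=200, threshold=140))  # False
print(attribute_failed(value=100, worst=36, threshold=36))    # True
```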

Hard drives also have spec sheets, which list how many start/stop cycles a drive is rated for, among other things.

William Lawn Stewart

Posted 2012-05-28T10:09:08.497

Reputation: 1 889

Good answer. Could you give more details on how you got from understanding his SMART parameters to the list of statements about his hard drive e.g "His laptop is on at least a third of the time". Thanks – bbaja42 – 2012-05-28T11:10:50.830

@bbaja42 I've added some explanations as to how I reached those conclusions =) – William Lawn Stewart – 2012-05-28T11:23:48.907

1

OK, I know this topic is rather old, but here are my 2 cents:

(I'm new here so I can't answer as a comment)

Head Flying Hours 40858023897390 => in hex: 0x2529 0000 292E

lowest 4 bytes 0x0000292E = 10,542, within an hour of the power-on hours (10,541, as seen below)

highest 2 bytes 0x2529 = ??? (milliseconds, maybe? The number goes up and down without the hours changing, so maybe it is in two's complement or has no relation to time at all)
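If you want to repeat the split on your own drive, here is a minimal sketch. It assumes the raw value is a 48-bit number whose low 32 bits hold the hours, which is only my reading of this one drive's output; `split_raw48` is a name I made up.

```python
def split_raw48(raw):
    """Split a 48-bit SMART raw value into (high 16 bits, low 32 bits)."""
    low32 = raw & 0xFFFFFFFF   # lowest 4 bytes
    high16 = raw >> 32         # remaining upper 2 bytes
    return high16, low32


raw = 40858023897390  # Head Flying Hours raw value from the report
high16, low32 = split_raw48(raw)

print(hex(raw))  # 0x25290000292e
print(low32)     # 10542, one more than the reported 10541 power-on hours
print(high16)    # 9513 (0x2529), meaning unknown
```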

As for Total LBAs read/written, they seem to be exactly that.

SMART information for Disk 1
SEAGATE 2 TB

Model:     ST2000DM001-1CH164
Serial number:    Z1E5716J
Firmware:   CC27

SMART attribute

Spin-Up Time  0
Start/Stop Count   32
Reallocated Sector Count    0
Power-On Hours         10541
Spin Retry Count   0
Power Cycle Count    32
Runtime Bad Block   1
End-to-End Error    0
Reported Uncorrect  0
Command Timeout 0
High Fly Writes 3
Airflow Temperature Cel 41
G-Sense Error Rate  0
Power-Off Retract Count   20
Load Cycle Count  32
Temperature Celsius   41
Current Pending Sector    0
Offline Uncorrectable 0
UDMA CRC Error Count   0
Head Flying Hours   40858023897390
Total LBAs Written  93750333994
Total LBAs Read 69405426987
ATA Error Count 0
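If the Total LBAs values really are plain sector counts, turning them into terabytes is a one-liner. This sketch assumes 512-byte logical sectors, which I have not verified for this drive; `lbas_to_terabytes` is just an illustrative name.

```python
def lbas_to_terabytes(lba_count, sector_bytes=512):
    """Convert an LBA count to decimal terabytes (1 TB = 10**12 bytes).

    sector_bytes=512 is an assumption about the logical sector size.
    """
    return lba_count * sector_bytes / 1e12


# Raw values from the listing above
print(round(lbas_to_terabytes(93750333994), 1))  # 48.0 (written)
print(round(lbas_to_terabytes(69405426987), 1))  # 35.5 (read)
```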

Hernexto

Posted 2012-05-28T10:09:08.497

Reputation: 76