Strange disk issues on a Sunfire x2200 running FreeBSD 8.3

So, my Sunfire x2200 M2 has two SATA drives in it, a 250GB and a 2TB. Sometime last night/this morning, the system rebooted by itself. It came back up fine, but after reviewing a few logs, I found this: http://pastebin.com/Bctbzwb9.

da0 is the 250GB drive, which is the OS drive. I reviewed the drive's information with smartctl and everything seemed fine. However, when I ran a self-test with smartctl, it failed with a read error. I then noticed this in /var/log/messages:

Jan  1 05:20:31 fuzzbox smartd[1160]: Device: /dev/da0 [SAT], 7 Currently unreadable (pending) sectors
Jan  1 05:20:31 fuzzbox smartd[1160]: Device: /dev/da0 [SAT], 7 Offline uncorrectable sectors
Jan  1 05:20:31 fuzzbox smartd[1160]: Device: /dev/da0 [SAT], previous self-test completed with error (read test element)
Jan  1 05:20:31 fuzzbox smartd[1160]: Device: /dev/da0 [SAT], Self-Test Log error count increased from 0 to 1

I'm not really sure what to make of this. Does this look like a failing drive or controller?

smartctl -a /dev/da0 output: http://pastebin.com/RJ6043KJ

user183784

Posted 2013-01-01T10:32:41.540

Answers

This looks like a failing drive.

Any modern SATA drive (and any ancient SCSI drive) stores a checksum with every sector. When a sector is read and the checksum does not match, the drive rereads the data. If rereading fails often enough, the drive assumes that the physical sector on the disk is bad.

Two things can happen if that occurs:

  1. The drive will make an effort to recover the data, and once it has been read successfully it will write that data to a spare sector. The next time you try to read the original sector you will be redirected to the spare sector instead. If this is in progress but has not yet completed, the status is pending (just as in your log).
  2. If this happens often enough the drive will run out of spare sectors. It can then no longer remap the sector, and reading it will result in a read error.

In your log you have entries for 7 currently unreadable (pending) sectors and 7 offline uncorrectable sectors. That seems a clear pointer to case 1.
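
If you want to keep an eye on those two counters over time, a small script can pull them out of the smartd lines. This is only a minimal sketch in Python. The regular expression is written against the four log lines quoted in the question, so other smartd versions may word the messages differently. The function names `bad_sector_counts` and `looks_like_failing` are made up for this example, and the "any non-zero count is a warning sign" rule is my own rule of thumb, not something smartd defines.

```python
import re

# Written against the smartd lines quoted in the question, e.g.:
#   "... smartd[1160]: Device: /dev/da0 [SAT], 7 Offline uncorrectable sectors"
SMARTD_LINE = re.compile(
    r"smartd\[\d+\]: Device: (?P<device>\S+) .*?, "
    r"(?P<count>\d+) "
    r"(?P<kind>Currently unreadable \(pending\)|Offline uncorrectable) sectors"
)


def bad_sector_counts(lines):
    """Return {device: {counter name: most recent count}} for the given lines.

    Lines that are not pending/uncorrectable sector reports are ignored.
    """
    counts = {}
    for line in lines:
        match = SMARTD_LINE.search(line)
        if match:
            per_device = counts.setdefault(match["device"], {})
            per_device[match["kind"]] = int(match["count"])
    return counts


def looks_like_failing(counts_for_device):
    """Rule of thumb (an assumption): any non-zero counter is a warning sign."""
    return any(value > 0 for value in counts_for_device.values())
```

Fed the lines of /var/log/messages (for example `bad_sector_counts(open("/var/log/messages"))`), it should report both counters for /dev/da0. If the numbers keep growing between runs, that strengthens the case for replacing the drive.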

Hennes

Posted 2013-01-01T10:32:41.540

Reputation: 60 739