possible disk problems - nothing in SMART

1

I keep getting these I/O errors:

Feb 22 07:08:19  kernel: [70724.773260] sd 0:0:0:0: [sda] CDB: Read(10): 28 00 50 17 f5 d8 00 00 08 00
Feb 22 07:08:19  kernel: [70724.773344] ata1: EH complete
Feb 22 07:09:04  kernel: [70769.804854] ata1.00: configured for UDMA/133
Feb 22 07:09:04  kernel: [70769.804880] ata1: EH complete
Feb 22 07:09:07  kernel: [70772.745086] ata1.00: configured for UDMA/133
Feb 22 07:09:07  kernel: [70772.745120] ata1: EH complete
Feb 22 07:09:10  kernel: [70775.685021] ata1.00: configured for UDMA/133
Feb 22 07:09:10  kernel: [70775.685055] ata1: EH complete
Feb 22 07:09:13  kernel: [70778.566529] ata1.00: configured for UDMA/133
Feb 22 07:09:13  kernel: [70778.566565] ata1: EH complete
Feb 22 07:09:16  kernel: [70781.473288] ata1.00: configured for UDMA/133
Feb 22 07:09:16  kernel: [70781.473323] ata1: EH complete
Feb 22 07:09:18  kernel: [70784.363288] ata1.00: configured for UDMA/133
Feb 22 07:09:18  kernel: [70784.363323] sd 0:0:0:0: [sda] Unhandled sense code
Feb 22 07:09:18  kernel: [70784.363349] sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Feb 22 07:09:18  kernel: [70784.363380] sd 0:0:0:0: [sda] Sense Key : Medium Error [current] [descriptor]
Feb 22 07:09:18  kernel: [70784.363414] Descriptor sense data with sense descriptors (in hex):
Feb 22 07:09:18  kernel: [70784.363442]         72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
Feb 22 07:09:18  kernel: [70784.363486]         50 17 f5 d8
Feb 22 07:09:18  kernel: [70784.363511] sd 0:0:0:0: [sda] Add. Sense: Unrecovered read error - auto reallocate failed
Feb 22 07:09:18  kernel: [70784.363560] sd 0:0:0:0: [sda] CDB: Read(10): 28 00 50 17 f5 d8 00 00 08 00

sda is my only disk; I don't have RAID set up here. What can I do to fix it?

Output from the SMART health check:

/var/log# smartctl -H /dev/sda
smartctl 5.40 2010-07-12 r3124 [x86_64-unknown-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

Is this anything to worry about? Since the errors started, the machine has shown higher CPU utilization (presumably iowait) and syslog has grown to gigabytes in size, but I see no other negative impact.

user210409

Posted 2014-02-22T06:16:17.690

Reputation:

3 Your disk is dying. Replace it, and make sure your backups are good. – None – 2014-02-22T06:22:24.080

Please run the long SMART self-test: smartctl -t long /dev/sda – neutrinus – 2014-02-22T08:48:35.830

Answers

1

The only software I know of that can recover a sector the drive's own error correction has given up on is SpinRite. It can safely put additional load on the disk so that the drive marks iffy sectors as bad and stops using them. It also has data recovery abilities. SpinRite may be able to bring your drive back to a usable state, but it's probably better to just get another drive.

I have never gotten any useful information from SMART, so don't rely on it. I don't have an explanation for the higher CPU usage; it may be unrelated.
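If you want to know exactly which sector the kernel is complaining about, the hex bytes in the log lines from the question can be decoded by hand. Below is a minimal Python sketch of that, assuming the standard SCSI READ(10) CDB layout and descriptor-format sense data. The function names are made up for illustration, and the 512-byte sector size is an assumption (check what your drive actually reports).

```python
# Decode the hex dumps from the kernel log quoted in the question.
# Function names are hypothetical; byte offsets follow the SCSI READ(10)
# CDB and the descriptor-format sense data header.

def decode_read10(cdb_hex: str) -> dict:
    """Return the starting LBA and block count of a READ(10) CDB."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a READ(10) CDB")
    return {
        "lba": int.from_bytes(cdb[2:6], "big"),     # bytes 2-5: logical block address
        "blocks": int.from_bytes(cdb[7:9], "big"),  # bytes 7-8: transfer length
    }

def decode_sense_header(sense_hex: str) -> dict:
    """Return the first four fields of descriptor-format sense data."""
    sense = bytes.fromhex(sense_hex)
    return {
        "response_code": sense[0] & 0x7F,  # 0x72: current error, descriptor format
        "sense_key": sense[1] & 0x0F,      # 0x3: MEDIUM ERROR
        "asc": sense[2],                   # additional sense code
        "ascq": sense[3],                  # additional sense code qualifier
    }

if __name__ == "__main__":
    cdb = decode_read10("28 00 50 17 f5 d8 00 00 08 00")
    sense = decode_sense_header("72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00")
    sector_size = 512  # assumption: 512-byte logical sectors
    print("failing read starts at LBA", hex(cdb["lba"]))
    print("blocks requested:", cdb["blocks"],
          "=", cdb["blocks"] * sector_size, "bytes")
    print("sense key:", sense["sense_key"],
          "ASC/ASCQ: %02x/%02x" % (sense["asc"], sense["ascq"]))
```

The LBA in the CDB (50 17 f5 d8) is the same value that trails the sense dump in the log, so every retry is hitting the same spot on the disk. Sense key 3 with ASC/ASCQ 11/04 is what the kernel prints as "Unrecovered read error - auto reallocate failed".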

Edwin

Posted 2014-02-22T06:16:17.690

Reputation: 113

+1 for mentioning SpinRite. It managed to rescue my data from bad sectors on old hard disks and even floppies. Of course, the main purpose of SpinRite, IMO, is to make the disk good enough to salvage the data on it. Copy the data to new media as soon as possible, because a bad sector appearing on a disk is usually only the first of a series: the drive's firmware is running out of spare sectors to replace the weak ones. – pepoluan – 2014-02-24T18:01:55.090