17

What's the best way to check for HDD errors and early signs of failure on CentOS?

030
inac

6 Answers

3

I would recommend installing smartmontools (http://sourceforge.net/apps/trac/smartmontools/wiki) on your machine. It can check the health of your disks. Otherwise, you are left checking /var/log/messages or /var/log/syslog for any mention of SCSI errors.

Paul
  • smartmon seems to be it, although its stats mention it would catch only 60% of failing drives. If I set smartmon to scan daily, would this actually make the HDD die faster? It's a Seagate 7200.10. – inac Jun 12 '10 at 04:18
  • @inac smartmon will help HDDs die faster? Where did you read this? Please add a URL. – 030 Feb 26 '15 at 12:19
3
dmesg

The kernel logs diagnostic messages about I/O devices, so you can review those messages with the dmesg command.
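
Since dmesg output can be long, a small filter helps. This is only a sketch: the function name disk_error_lines is made up for illustration, and the pattern list is an assumption based on messages the kernel commonly prints for failing disks. It is not exhaustive.

```shell
#!/bin/sh
# Keep only lines (read from stdin) that commonly accompany disk trouble.
# The patterns are an assumed, non-exhaustive list; extend them as needed.
disk_error_lines() {
    grep -iE 'I/O error|medium error|unrecovered read error|failed command'
}

# Usage:
#   dmesg | disk_error_lines
#   disk_error_lines < /var/log/messages
```

The same filter works on /var/log/messages, which keeps older entries after the kernel ring buffer has wrapped.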

Banjer
2

SMART monitoring is a good way. As root, run smartctl -a /dev/hda, where hda is the drive you want to check (it could be hdb, sda, etc.). I also recommend setting your email address in /etc/aliases as the recipient of root's mail.

That's a very vague answer though. If you have a server made by any of the big manufacturers (Dell, HP, etc.), chances are there are better monitoring capabilities available.
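
To make the smartctl check above a bit more concrete, here is a rough sketch that picks a few attributes out of the attribute table and reports them when their raw value is non-zero. The function name smart_warnings is hypothetical, and the three attributes chosen are an assumption: drives differ in what they report, so treat it as a starting point.

```shell
#!/bin/sh
# Read `smartctl -A` output on stdin and print selected attributes whose
# raw value (10th column) is greater than zero.
# The attribute selection is an assumption; vendors report different sets.
smart_warnings() {
    awk '$2 ~ /^(Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable)$/ && $10 + 0 > 0 { print $2 "=" $10 }'
}

# Usage (as root; /dev/sda is a placeholder device):
#   smartctl -H /dev/sda                    # overall health self-assessment
#   smartctl -A /dev/sda | smart_warnings   # non-zero warning attributes
```

No output from smart_warnings does not prove the drive is healthy; it only means these particular counters are still at zero.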

churnd
1

You can run fsck on the (unmounted) device to check for filesystem errors.

cdated
0

As Paul says, the SMART logs are a good place to check.

I'd also recommend running badblocks. If you've got a RAID card, you might have to use the monitoring on that instead.

Dentrasi
0

You can try a full check of a partition (/dev/sda1, for example) with

fsck -f /dev/sda1

or try a full non-destructive read-write test of the given partition with

badblocks -vn /dev/sda1
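
Both commands above are meant for a partition that is not mounted. A minimal guard, assuming the device shows up under the same name in /proc/mounts (device-mapper or UUID names would need extra handling), could look like this. The function names is_mounted and run_if_unmounted are made up for illustration.

```shell
#!/bin/sh
# Exit 0 if device $1 is listed as a mount source in mount table $2
# (defaults to /proc/mounts), otherwise exit 1.
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Run a command against device $1 only if that device is not mounted.
# Example: run_if_unmounted /dev/sda1 fsck -f /dev/sda1
run_if_unmounted() {
    dev=$1
    shift
    if is_mounted "$dev"; then
        echo "$dev is mounted; unmount it first" >&2
        return 1
    fi
    "$@"
}

# Usage (as root):
#   run_if_unmounted /dev/sda1 fsck -f /dev/sda1
#   run_if_unmounted /dev/sda1 badblocks -vn /dev/sda1
```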

Liibo