I maintain several Linux servers that we are migrating from SUSE Linux Enterprise Server (version 10.4) to Scientific Linux (version 5.9). The six SAS hard drives on each machine are connected to an Adaptec AAC-RAID controller and configured as three RAID-1 arrays.

On the machines still running SLES, smartctl -t short /dev/sg[3-8] successfully runs a self-test on each physical drive. On the machines running Scientific Linux, however, I can gather information from the drives using SMART (e.g. with --all or -l selftest), but attempting to run the tests (-t short, -t long, etc.) fails with:

Short offline self test failed [Operation not permitted]

Any idea what could be causing this?

We're running:

kernel-PAE-2.6.18-348.3.1.el5
smartmontools-5.42-2.el5
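
For reference, the invocations described above can be sketched as follows. This is a minimal sketch, assuming Python 3 is available on an admin host; the helper names selftest_commands and log_commands are hypothetical, and the script only prints the smartctl commands for /dev/sg3 through /dev/sg8 rather than running them.

```python
# Dry-run sketch: build the smartctl command lines for the six physical
# drives, which appear as /dev/sg3 .. /dev/sg8 behind the RAID controller.
# Helper names are hypothetical; nothing here is executed against hardware.


def selftest_commands(first=3, last=8, test="short"):
    """Return one 'smartctl -t <test>' command per SCSI generic device."""
    return [["smartctl", "-t", test, f"/dev/sg{n}"]
            for n in range(first, last + 1)]


def log_commands(first=3, last=8):
    """Return one 'smartctl -l selftest' command per SCSI generic device."""
    return [["smartctl", "-l", "selftest", f"/dev/sg{n}"]
            for n in range(first, last + 1)]


if __name__ == "__main__":
    # Print the commands so they can be reviewed before being run as root.
    for cmd in selftest_commands() + log_commands():
        print(" ".join(cmd))
```

Printing the commands first makes it easy to compare exactly what is run on the SLES and Scientific Linux machines.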
michel-slm

1 Answer

My recommendation is to use the Adaptec RAID monitoring software in this setup. It will work with your OS and, when run as a daemon, will provide alerts, logs, and SNMP traps to indicate drive failure.

While this doesn't address the smartctl situation, I don't think it's necessary to test your drives in this manner. Is this something you wish to do often? What are you trying to protect against or prevent?

ewwhite
  • We've had disk failures in the past and want advance notification of failing drives. The SMART attributes on these drives seem to be improperly calibrated (most of the attributes are flagged as pre-fail / old age), so it seems that the only proper way to get info from SMART is to actually run the self-tests. – michel-slm Jun 17 '13 at 07:34
  • Then leverage the RAID controller's capabilities. Not every hard drive failure can be predicted by S.M.A.R.T. – ewwhite Jun 17 '13 at 07:55