
I have a Synology 1812+ NAS with eight 3 TB drives configured as RAID 5, running DSM 4.1. It was purchased to replace USB drives, consolidate storage, and hold short-term OS X backups using Time Machine. The device and drives are only two months old.

Every other week I get I/O errors from two of the drives. The logs contain the following error:

Read error at internal disk [3] sector 2586312968.

And later on:

Bad sector at md2 disk3 sector 250049936 has been corrected.

The sectors never match. The recommendation is to run an extended S.M.A.R.T. test on the drives. I did, and these are the values I got:

[screenshot: extended S.M.A.R.T. test results for one of the drives that reported errors]

I then ran an extended S.M.A.R.T. test on one of the drives for which I received no complaints, and here are the values I got:

[screenshot: extended S.M.A.R.T. test results for a drive with no reported errors]

The values look very similar. It is unclear to me whether there is a problem, and if not, what is the point of a S.M.A.R.T. test if it doesn't reveal any real problem? How should I interpret these results, and how do I know when it's time to replace an HDD?

bloudraak

2 Answers


The raw data column usually represents the number of events that occurred, e.g., the number of read errors in the first row. However, your numbers are so high that I assume you have a Seagate drive, which always reports abnormally high raw error values (even when the drive is OK).

You can also look at the Status column. It says OK for all parameters, which means exactly the same thing: your drive is generally OK.

As written at http://www.linuxjournal.com/node/6983/print, the VALUE column presents the current "normalized value", which should always be greater than the threshold.

So your S.M.A.R.T. data shows that all drives are OK. However, if you get a lot of read errors (not just the one found in the logs for the last year :), it suggests that your drives are going to die soon. It's somewhat "normal" to have several (up to 1–2 thousand; see How many SMART sector reallocations indicate problems?) bad sectors on the drive, which will be remapped to spares and therefore corrected. But if you get too many of these messages, or they come very often, you should replace your drive.
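If you want to keep an eye on the counters that actually predict failure, you can pull just those attributes out of the full table (a minimal sketch, assuming smartctl is installed, with /dev/sda standing in for your drive):

# smartctl -A /dev/sda | grep -Ei 'reallocated|pending|uncorrectable'

Attributes 5 (Reallocated_Sector_Ct), 197 (Current_Pending_Sector) and 198 (Offline_Uncorrectable) are the ones to watch; steadily growing raw values there are a far stronger warning sign than a high raw read-error rate on a Seagate.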

You can probably run S.M.A.R.T. tests or some other tests (both depend on your NAS)... E.g., if you have smartctl and can log in to the NAS via ssh, you could try:

# smartctl -t short /dev/<device>

This command runs a short test on the selected drive. After it finishes, you can view the results with

# smartctl -H /dev/<device>
# smartctl -l selftest /dev/<device>
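
Note that smartctl calls the extended test a "long" test, so the equivalent of the extended test DSM runs is (again, <device> is a placeholder):

# smartctl -t long /dev/<device>

The self-test log shown by smartctl -l selftest reports each test's completion status and, for a failed test, the LBA of the first error, which you could compare against the sector numbers in the Synology log.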
Andrey Sapegin
  • I do have Seagate drives. The drives are relatively new and the messages have stopped, but it is usually a different drive. The Synology 1812+ runs S.M.A.R.T. tests once a week automatically and notifies me if there are problems. The links were very helpful. – bloudraak Jan 06 '13 at 05:12

I have another option you could try. I had a similar issue with my DS1812, and a friend of mine had the same with his DS1512. If the drives are new and you are getting these errors, it may be that the drives had a few bad blocks when you first created the volumes (which is normal, by the way). If you do not choose the option to check for bad blocks when creating the volume, the Synology skips that step and never really deals with the bad blocks on the drives.

That is why you get those errors. Assuming your volume can handle two drive failures and keep running, you can pull one bad drive out at a time, leaving the good ones in the NAS along with the other bad one. Using a USB adapter, or by plugging the drive in directly, attach the drive you just pulled to another computer and check its integrity from there.
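If that other computer runs Linux, a read-only surface scan is one way to do that integrity check (a sketch; /dev/sdX is a placeholder for whatever name the drive gets on that machine, and you should avoid the destructive -w write mode on a drive that still holds data):

# badblocks -sv /dev/sdX

Here -s shows progress and -v lists every unreadable block; any block numbers it prints mean the drive has sectors it could not read and has not yet remapped.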

If you have a Windows box, you could run CHKDSK. Once the task completes, see whether it reports any problems. If not, format the drive you pulled from your Synology NAS with NTFS and place it back into the volume. When you do, instruct the NAS to repair the volume; at that stage the NAS will reformat the drive to the file system used on the NAS and will also look for and fix the bad blocks.
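For reference, the CHKDSK switch that actually exercises the disk surface is /r (X: is a placeholder for the letter Windows assigns the drive, and CHKDSK needs a file system Windows recognizes, so you may have to format the drive NTFS before this step):

chkdsk X: /r

/r locates bad sectors and recovers readable information, and it implies /f, so the file system is repaired in the same pass.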

Once the first drive is done and the volume has been repaired, repeat these steps with your second "bad" drive. With any luck you won't get those I/O errors any more. I figured this trick out when I first encountered the same kind of I/O errors you are getting, and now all is good; the same went for my friend when I had him perform these steps.

Good luck; I hope this helps you out.

Frank R