-1

We received some replacement drives from HP (PN 454273-001, 1TB 7.2k) and installed them in the MSA. The rebuild completes, but when we run the HP Insight Diagnostics tests, the drives come back with "read write error threshold reached". At first we thought it might just be a faulty disk, but we have now received three disks and they all exhibit the same behaviour in different slots.

The drives we received are slightly different. The part number is the same, but the sticker has an extra "3G" on it, and they are HP OEM-branded disks rather than the standard Seagate ones we normally get. They also don't have the normal HP serial number on them, so when I logged a call with HP they had trouble identifying the drive, though they eventually found it.

Is it a compatibility issue? I think we upgraded the firmware on the MSA about half a year ago.

ewwhite
lbanz
  • 1
    ARE YOU USING RAID5? – ewwhite Oct 19 '12 at 13:28
  • How many disks are in your 12-bay MSA60? What's the RAID level? What is the MSA60 connected to? Which operating system are you using? What color are the LED's on the disks? How about the MSA60's LED's? – ewwhite Oct 19 '12 at 13:30
  • 12 disks in each enclosure, and there are two enclosures. Yes, I'm using RAID5. The MSA60 is connected to a DL380 G5 running Windows Server 2003 32-bit. The LEDs show all healthy and green on the disks and the MSA. I'm aware it's a terrible design, but I didn't set this up and it will be retired in a few months. – lbanz Oct 19 '12 at 13:32

2 Answers

4

If this is the 21 disk RAID5 array you have, the issue is definitely a URE in the array preventing the rebuild from succeeding. In fact, that's probably what it is anyway, as a URE in a parity RAID array is much more likely than receiving 3 bad disks.
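To put a rough number on that likelihood, here is a minimal back-of-the-envelope sketch. It assumes an unrecoverable read error rate of 1 in 10^14 bits (a figure commonly quoted for SATA drives, not taken from this drive's datasheet), treats 1TB as 10^12 bytes, and assumes errors are independent. The function name `p_ure_during_rebuild` is purely illustrative.

```python
import math

def p_ure_during_rebuild(surviving_disks, disk_bytes, ure_per_bit=1e-14):
    """Chance of at least one URE while reading every surviving disk in full.

    ure_per_bit is an ASSUMED spec-sheet style rate, not a measured value.
    """
    bits_read = surviving_disks * disk_bytes * 8
    # 1 - (1 - p)^n, written with log1p/expm1 to stay accurate for tiny p
    return -math.expm1(bits_read * math.log1p(-ure_per_bit))

# 21-disk RAID5: a rebuild has to read the 20 surviving 1TB disks end to end.
print(f"21-disk RAID5: {p_ure_during_rebuild(20, 1e12):.0%}")
# For comparison, a single 12-disk RAID5 enclosure (11 surviving disks).
print(f"12-disk RAID5: {p_ure_during_rebuild(11, 1e12):.0%}")
```

Under those assumptions a 21-disk rebuild is more likely than not to hit a read error. Real drives often do better than the quoted rate, so treat this as an illustration of the scaling rather than a prediction.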

You can try upgrading the firmware, and HP support will generally suggest it, but it probably won't help. If you've got errors on your array, you're going to have to recreate it and restore the data to it. (Oh, but when you recreate it, do so in a sane fashion that doesn't involve a couple dozen disks or so in RAID5.)

HopelessN00b
  • Yes, that is the one. A bit of googling turned up http://www.raidtips.com/raid5-ure.aspx. I've lost count on how many times the raid failed to rebuild after I put in a new disk. This is especially the case in my scenario, as 8 drives were detected as faulty but had not yet failed completely with amber lights. – lbanz Oct 19 '12 at 13:48
  • 1
    `I've lost count on how many times the raid failed to rebuild after I put in a new disk.` This is what a lot of techies refer to as a "red flag." Your array is very unhealthy, back up what data you can and recreate it, this time in a sane fashion. – HopelessN00b Oct 19 '12 at 13:56
  • Just got the results. I took those two brand new HP replacement drives out of the MSA this morning and plugged them into two separate desktops. I ran the BIOS IDE test on one and Seagate's desktop tools on the other. Both showed bad sectors on the drives. Is my test valid, or shouldn't those disks be used in a normal desktop? – lbanz Oct 19 '12 at 14:06
4

If the lights are healthy on your disks and MSA array, you may be okay. If you are relying solely on Insight Manager, restart the agents on your Windows 2003 server. You can also just try a reboot.

You didn't explain what actions you took before this... You received replacement disks... But what were they replacing? Did you have a multiple disk failure? If on RAID 5, that's a bit of a problem.

Look for an error or status in the Array Configuration Utility that says "Waiting for Rebuild". If you see that, it's an indication that the Smart Array controller cannot rebuild the RAID 5 array due to a read error on one of the existing disks.

ewwhite
  • Before this, we started losing the storage several times a week. There were no SMART failures or amber lights on the MSA or any of the disks, but when we ran the Insight Diagnostics tools, they showed 8 drives with read/write errors. We replaced them one by one. Two of them were brand new drives that we had only put in 2 months earlier, so we logged a call with HP. They sent new drives and we put them in. The first new drive crashed the storage 3 times in a row after reaching 99% of the rebuild, so I logged another call to get that drive replaced. Once they were all replaced, I found that it still shows errors on those HP drives. – lbanz Oct 19 '12 at 13:44
  • I put in an old drive to replace the HP drive and the errors are gone. – lbanz Oct 19 '12 at 13:44
  • 1
    @lbanz Alrighty then, good luck with that. Next time you have an issue with this array, keep in mind that "making the errors go away" is not the same thing as fixing the problem. – HopelessN00b Oct 19 '12 at 13:52