
I've been trying to learn more about the product before posting a question. I tried my luck elsewhere and didn't get a proper answer, so I'm posting it here.

We had an issue with the RAID due to a disk failure. I replaced the disk, and the Array Configuration Utility showed that the array was OK. However, it later showed that slot 11 was rebuilding, and so was the overall array. I left the server for two days to rebuild, but at the end of the process it simply said "interim recovery failed". Knowing that, I've ordered another two disks so that I can get it replaced.

Having said that, I have a couple of questions:

  • One of the disks I have is a raw Seagate SCSI disk with the same model number as the disks currently in the server. Will it work if I hot-swap it in after fitting it into a disk cage (a cage I took from a previously removed disk)?
  • When I check the array through the ACU offline utility, I see an option to erase each drive. Can I try erasing the drive in slot 11, then remove and replace it, or is there any other way to rebuild?
  • Meanwhile, I still have the original drive that I replaced in slot 1, as mentioned above. Can I at least replace that one? The ADU report is attached for further understanding.

I've set up iLO management, but I don't see anything related to the RAID controller; it shows only the basics. What can I do about it?

Our ESXi image was not obtained from HP, so I can't see the array using the CLI, and it doesn't show up in vSphere either. What could be the cause of this "interim recovery failed" error? I'm totally lost with this, and it's pretty difficult to find a proper article on the HP site. The part number for the disk in the server is 461289 - 001 (1TB SAS disk).

Please advise on this.


EDIT

HP Model: HP ProLiant DL180 G6 // RAID 6 // P410 Smart Array

Where I saw the interim failure: I booted the server using HP's ACU to check the RAID in more detail. It showed that slot 11 was rebuilding, so I left the server alone for two days to let it finish. Once it reached 100%, it suddenly showed that the disk had failed, with an interim recovery failure on that disk's array. Refer to this post to see the attached ADU report, as I don't see any option to attach it here.

ESXi Version: 5.0.0 // Build 469512

[Image: utility report screenshot]

AzkerM
  • Voted to close as "I am a newbie" translates into "that really does not know what I am doing" (as shown in the rest of the question) and while this is a common scenario that is described - this type of question is off topic as per FAQ. – TomTom May 24 '14 at 11:27
  • @TomTom My bad. I tried my best to search for similar scenarios but couldn't find any informative answer. I'm just seeking advice that would help me. – AzkerM May 24 '14 at 11:30
  • Please provide the model of the servers and version of ESXi? What RAID level is this? RAID5? RAID 1+0? Also, where did you see the "Interim Recovery Error"? – ewwhite May 24 '14 at 12:00
  • @ewwhite Added the info as requested. – AzkerM May 24 '14 at 12:20
  • Hi @ewwhite! I've added an image of the utility report, which is different from what I showed before. This one says **disk 11 - has physically failed**. This is what I was talking about earlier, which later showed as an interim failure. – AzkerM May 24 '14 at 13:04
  • This isn't a best-practice question. This is a pretty specific question for a pretty specific situation. However, you just need to replace two bad disks. That's all. – ewwhite May 24 '14 at 13:28
  • @ewwhite - Thank you for your advice! I've swapped in the raw Seagate drive and the array accepted it. I'm waiting for it to rebuild. Once that's done, I will replace the other one too. Thank you. – AzkerM May 24 '14 at 13:36

1 Answer


This is from your Array Diagnostics Utility report...

[Image: excerpt from the ADU report]

You have one disk being rebuilt. Another disk is in prefailure mode. You can attempt to rebuild again by just reinserting disk 11 and letting it try once more. Are you absolutely sure this is RAID 6 (ADG)? You didn't mention "ADG", and I wanted to clarify.
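To make the reasoning above concrete, here is a minimal Python sketch of how RAID 6 fault tolerance plays out with one drive rebuilding and another in prefailure. It is an illustration only, not output from or an API of the HP tools: the function name `assess_raid6`, the state labels, the 12-bay layout, and slot 4 as the prefailure drive are all assumptions made for the example (only slot 11 comes from the report).

```python
from collections import Counter

# RAID 6 (HP's ADG) keeps two parity blocks per stripe, so the array
# survives the loss of up to two drives.
RAID6_PARITY_DRIVES = 2


def assess_raid6(drive_states):
    """Summarise a RAID 6 array from a {slot: state} mapping.

    The state labels are hypothetical, not the controller's exact strings:
    "ok", "rebuilding", "predictive_failure", "failed".
    """
    counts = Counter(drive_states.values())
    # A failed or still-rebuilding drive does not yet hold a usable copy.
    missing = counts["failed"] + counts["rebuilding"]
    # A drive predicting failure still serves reads but may return errors,
    # which is what can make a rebuild of another drive fail.
    at_risk = counts["predictive_failure"]

    if missing > RAID6_PARITY_DRIVES:
        return "data loss: more drives missing than parity can cover"
    if missing or at_risk:
        margin = RAID6_PARITY_DRIVES - missing
        return (f"needs attention: {missing} missing, {at_risk} at risk, "
                f"{margin} more loss(es) survivable")
    return "healthy"


# Assumed layout: 12 bays, slot 11 rebuilding, slot 4 (hypothetical) in prefailure.
states = {slot: "ok" for slot in range(1, 13)}
states[11] = "rebuilding"
states[4] = "predictive_failure"
print(assess_raid6(states))
# prints: needs attention: 1 missing, 1 at risk, 1 more loss(es) survivable
```

The point of the sketch is that the array still has margin, but the prefailure drive is the one the rebuild has to read from, so both drives end up needing replacement.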

ewwhite
  • Yes, it is ADG according to the array utility, and it's on RAID 6. As for disk 11, I'm planning on hot-swapping the disk. But the replacement is a raw Seagate disk with the same part number, so I placed it into the disk cage that I previously removed from the server. Will that work? – AzkerM May 24 '14 at 12:40
  • You can try, but understand that the disk with the prefailure is probably making the recovery effort more difficult. In the end, you need to replace TWO disks. – ewwhite May 24 '14 at 12:45
  • Yes!! I'm going to try the Seagate now. If the array doesn't take it, I will put in the newly bought HP disk instead. Will keep you posted in a few minutes. – AzkerM May 24 '14 at 12:55