
Failed Dell PowerEdge T410: Moving a PERC H700 Hardware RAID (3 x 300 GB SAS, 1 faulty) to a 2nd T410 for Disk Cloning & Boot

FYI - PRE-QUESTION CONTEXT:

  • We barely got the original T410 to boot once; it kept flagging CMOS battery errors, even though we replaced the battery twice.
  • We then figured we might as well come back to this T410 later and move the data & OS off it first.

Infrastructure Scenario

Source:

  • Dell T410 + PERC H700 hardware RAID controller - 6 slots/bays
  • 3 x 300 GB SAS drives (Slots 0, 1, 2 on the H700) configured in RAID 5 as a 557 GB "Virtual Disk" volume (with 2 partitions, C & D)

    • Dell OMSA tests & diagnostics report SMART issues on Drive 3
  • SAS replacement hardware is not easily available to us

Destination:

  • 1 x 1 TB SATA drive (Slot 4 on the H700) set up as a single-drive RAID 0, used as the backup/clone target (a quick capacity check is sketched below)
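
A rough sanity check that the cloned data will fit on the destination: a 3-drive RAID 5 set gives (n - 1) x drive size of usable space, which matches the ~557 GB virtual disk above and comfortably fits on the 1 TB SATA target. A minimal sketch of the arithmetic (in Python, using decimal drive sizes; the exact figure the H700 reports depends on its own rounding/coercion) is:

    # RAID 5 usable capacity check - decimal GB figures, only an approximation
    # of what the H700 actually reports for the virtual disk.
    RAID5_DRIVES = 3
    DRIVE_GB = 300           # each SAS member drive
    TARGET_GB = 1000         # the 1 TB SATA clone target

    # RAID 5 keeps one drive's worth of capacity for parity.
    usable_gb = (RAID5_DRIVES - 1) * DRIVE_GB          # 600 GB raw
    usable_gib = usable_gb * 1000**3 / 1024**3         # ~559 GiB, close to the 557 "GB" shown

    print(f"RAID 5 usable: ~{usable_gb} GB (~{usable_gib:.0f} GiB)")
    print(f"Fits on the 1 TB target: {usable_gb <= TARGET_GB}")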

Objectives & Scenario

  • Our goal at the moment is to get this array up and running on the other T410 box.

    • Our concern is how to make sure the array keeps running without a glitch until we can recover or clone the data off this RAID (see the monitoring sketch after this list).

    • With software RAID things are simpler; we are not sure what RAID settings/configuration, if any, are kept inside the H700 itself.

  • Then recover/clone this volume, with its few partitions, and figure out another way to go forward.
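
Since one member of the RAID 5 set already shows SMART issues, one way to keep an eye on it while recovering/cloning is to poll its SMART health from the running OS. Below is a minimal sketch, assuming smartctl is available and that the suspect drive sits behind the H700 as megaraid device 2 on /dev/sda - both the device path and the megaraid index are placeholders to confirm first (e.g. with `smartctl --scan`):

    # Minimal SMART health poll for a drive behind a PERC/MegaRAID controller.
    # ASSUMPTIONS: smartctl is installed, the array appears as /dev/sda, and the
    # suspect drive is megaraid device 2 - confirm both via `smartctl --scan`.
    import subprocess
    import time

    DEVICE = "/dev/sda"
    MEGARAID_ID = 2          # hypothetical index of the failing drive
    INTERVAL_SECONDS = 300   # check every 5 minutes; stop with Ctrl+C

    while True:
        result = subprocess.run(
            ["smartctl", "-H", "-d", f"megaraid,{MEGARAID_ID}", DEVICE],
            capture_output=True, text=True,
        )
        print(time.strftime("%Y-%m-%d %H:%M:%S"))
        print(result.stdout.strip())
        # Crude heuristic: SAS drives report "SMART Health Status", ATA drives
        # report "PASSED"/"FAILED"; flag anything that mentions a failure.
        if "FAIL" in result.stdout.upper():
            print("SMART health looks bad - consider pausing the clone.")
        time.sleep(INTERVAL_SECONDS)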

Request:

Thoughts on the steps to take, and on what we should or should not do, would be good inputs to get.

In particular: how do we ensure the cloned 1 x 1 TB drive is bootable?

PS: We are serving a small Asian SMB NGO, so we are limited on resources.

Alex S

2 Answers


You don't write anything about the kind of failure on the non-working server. I'll assume that the problem is not a broken RAID due to failed hard drives; obviously, in that situation, it would be of little help to move the broken disks to another server.

Generally, the RAID configuration is stored on the disks themselves by the H700 (as it is by most other RAID controllers nowadays). This is supposed to make it easy to move RAID sets between similar controllers/servers.

You just need to move the disks to the working server (I would make sure to plug them into the same slots nevertheless). When booting up you'll have to enter the RAID BIOS, where there will be a menu option "Import foreign config".

In a normal situation (i.e. all disks working perfectly) the controller is even supposed to detect this by itself:

When a controller firmware detects a physical disk with existing foreign metadata, it flags the physical disk as foreign and generates an alert indicating that a foreign disk was detected. Press F at this prompt to import the configuration (if all member drives of the virtual disk are present) without loading the BIOS configuration utility. Or, press C to enter the BIOS configuration utility and either import or clear the foreign configuration.

Source: How to Troubleshoot Hard Drive and RAID Controller Errors on Dell PowerEdge Servers
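
Once the foreign configuration has been imported and the system is up, it can also help to confirm from inside the OS that the controller sees the virtual disk and its member drives as expected. A minimal sketch using Dell OMSA's command-line tool is below; it assumes OMSA is installed in the booted OS and that the H700 shows up as controller 0 - check `omreport storage controller` for the real ID:

    # Minimal post-import check via Dell OMSA's CLI (omreport).
    # ASSUMPTIONS: OMSA is installed and the H700 is "controller=0".
    import subprocess

    def omreport(*args):
        """Run an omreport query and return its text output."""
        return subprocess.run(["omreport", *args],
                              capture_output=True, text=True).stdout

    # Virtual disk state - the imported RAID 5 VD should be listed here
    # (likely "Degraded" if the suspect member drive is already failing).
    print(omreport("storage", "vdisk", "controller=0"))

    # Physical disk states - look for "Online"/"Ready" vs "Failed"/"Foreign".
    print(omreport("storage", "pdisk", "controller=0"))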

s1lv3r
  • We did suspect some issues, but were called in to help this SMB NGO at a crucial point where things were going down. More updates above. – Alex S Feb 14 '16 at 09:28

Summary of the approach taken and the solution steps:

  • We moved the PERC controller and the 3 SAS RAID 5 hard drives to another T410

    • We got a warning/error message at boot about the RAID data, so we went into the controller BIOS to make sure the configuration/settings carried over

      • The other answer here, https://serverfault.com/a/756013/152268, was right on the dot about the RAID controller - but one has to be careful to ensure the configuration & disks carry over and are not changed by mistake
    • An additional step is to boot the Dell OMSA disc, as it does a thorough analysis and gives clear insight into the RAID as well

  • We used Paragon Hard Drive Manager (HDM Server 12) and cloned the 557 GB "virtual disk" volume (2 partitions, C & D) from the RAID 5 set (Slots 0, 1, 2)

    • To: a single 1 TB SATA drive (Slot 4) configured as RAID 0

    • To: a single 1 TB SATA drive on the motherboard's SATA port

  • We did get boot errors, which I suspected required a "Startup Repair" from a Windows Server 2008 bootable CD/DVD, but after a few tries we realized there were some hard drive issues (a quick sanity check on the clone's boot files is sketched below)

    • Once those were resolved we got working clones of the failed server
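
As a rough way to sanity-check a freshly cloned Windows system disk before trying to boot from it, one can at least confirm that the boot files came across. A minimal sketch is below; the drive letter (E:) and the BIOS/MBR-era layout (bootmgr plus \Boot\BCD, which Windows Server 2008 uses) are assumptions to adjust for the actual setup:

    # Minimal sanity check that a cloned Windows Server 2008 system partition
    # still carries its boot files. ASSUMPTION: the clone is mounted as E:.
    from pathlib import Path

    CLONE_ROOT = Path("E:/")                 # hypothetical mount point of the clone
    BOOT_FILES = ["bootmgr", "Boot/BCD"]     # BIOS/MBR boot files for Win 2008

    for name in BOOT_FILES:
        path = CLONE_ROOT / name
        status = "OK" if path.exists() else "MISSING - Startup Repair likely needed"
        print(f"{path}: {status}")
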
Alex S