
I need help. To date, most of my Linux 'software RAID' work has been done through the graphical installer. Now one of the drives comprising my RAID-0 volume has failed, and at boot the system bails out to a command line so I can fix the problem. I'm not quite sure how to do that, but I believe dmraid and mdadm are the tools I'm going to need. Let me start by describing the system configuration. (And please, no need to go into the pros and cons of why I chose to configure it this way. Thanks.)
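For reference, here's the kind of read-only poking around I plan to do first from that emergency shell, assuming mdadm is available there (the /dev/md0 name is just a placeholder for whatever /proc/mdstat actually reports):

    # Show what the kernel has assembled (or failed to assemble) right now
    cat /proc/mdstat

    # List every array mdadm can find, with its member devices
    mdadm --detail --scan

    # Detailed status of a single array; substitute the md name from /proc/mdstat
    mdadm --detail /dev/md0

None of that writes anything, so it should be safe to run before deciding what to tear down.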

The system has four (4) 1TB drives: sda, sdb, sdc, and sdd. Two drives (sda & sdb) are combined to make the RAID-0 volumes; the other two (sdc & sdd) are combined to make the RAID-1 volumes. Here's the breakdown (with a quick sanity check shown just after the list):

  • Partitions sda1 and sdb1 are each 8GB partitions, striped to make a 16GB swap volume.
  • Partitions sda2 and sdb2 are each 992GB, striped to make a 2TB bulk storage volume.
  • Partitions sdc1 and sdd1 are each 500MB, mirrored to make a 500MB /boot volume.
  • Partitions sdc2 and sdd2 are each 120GB, mirrored to make a 120GB volume where CentOS 6.8 is installed.
  • Partitions sdc3 and sdd3 are each 880GB, mirrored to make an 880GB redundant storage volume.
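Before touching anything, I'd like to confirm which partition feeds which array; something along these lines (again read-only, partition names taken from the list above) should do it:

    # Partitions as the kernel currently sees them
    cat /proc/partitions

    # Filesystem / RAID-member signatures on each partition
    blkid

    # RAID superblock details for the striped (RAID-0) members
    mdadm --examine /dev/sda1 /dev/sda2 /dev/sdb1 /dev/sdb2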

Everything the system needs to boot and run is on the RAID-1 drives, with the exception of swap, which (ugh!) is on the RAID-0 drives. Apart from that, I should be able to pull out sda and sdb and bring the system up on the RAID-1 drives alone. (Silly me, I figured putting swap on the striped volume would give me faster swapping. Live and learn, I guess.)
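Assuming /etc/fstab references the md devices by name (if it uses UUIDs instead, I'll match them against blkid output), a quick check that swap and /rd0 are the only things pointing at the striped arrays would be:

    # Swap devices currently in use (probably none from the emergency shell)
    swapon -s

    # Every fstab entry that mentions an md device; in my layout only the swap
    # line and the /rd0 line should refer to the RAID-0 arrays
    grep md /etc/fstab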

Once the system boots and the complete OS loads, it mounts the 880GB RAID-1 volume on /rd1 and the 2TB RAID-0 volume on /rd0. /rd0 holds a collection of VM image files used to bring up the VMs. /rd1 holds non-critical files: pcap files, .iso image files, a local share for temporary storage, etc.
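If I end up taking the live-disk route, my rough plan is below; the md device names, filesystem type, and fstab lines are placeholders for whatever my real configuration contains:

    # From a live environment: assemble whatever arrays can still be assembled
    # (the degraded RAID-0 sets simply won't start)
    mdadm --assemble --scan

    # Mount the RAID-1 root volume (md2 is a placeholder for the 120GB mirror)
    mount /dev/md2 /mnt

    # Then, in /mnt/etc/fstab, comment out the entries that live on the RAID-0
    # arrays, for example lines like:
    #   /dev/md0   swap   swap   defaults   0 0
    #   /dev/md1   /rd0   ext4   defaults   0 0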

So, sda's SMART trigger fired, and during the initial boot the system seems to try to assemble all the volumes, fails, and bails to a command line. Nothing is mounted, and there are big differences between what /proc/mounts and mount are telling me. Pulling out sda and sdb doesn't change that.

I figure I need to go in, deconstruct and delete the volumes that are on the RAID-0 drives, remove them from whatever config files they may appear in, and reboot. I think I need dmraid and mdadm to do that, but I'm not quite sure if these are the right tools or exactly how to use them.
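From reading the mdadm man page, my working notes for the teardown look like this; the md names are placeholders, and as far as I can tell dmraid only comes into play for BIOS/firmware fakeRAID, so installer-built software RAID should be an mdadm-only job:

    # Stop the degraded RAID-0 arrays (names are whatever /proc/mdstat shows)
    mdadm --stop /dev/md0
    mdadm --stop /dev/md1

    # Wipe the RAID superblocks from the striped members so they are never
    # auto-assembled again (sda may not even respond if it's truly dead)
    mdadm --zero-superblock /dev/sda1 /dev/sda2 /dev/sdb1 /dev/sdb2

    # Remove the matching ARRAY lines from /etc/mdadm.conf, fix /etc/fstab as
    # noted above, then rebuild the initramfs so early boot stops looking for
    # the old arrays (CentOS 6 uses dracut)
    dracut -f

If that's off base, corrections are very welcome.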

I know the contents of the striped volumes are most likely lost (although I am going to try dd-ing the entire sda disk onto a fresh 1TB drive and see what happens -- you never know; I might get lucky), and I'm OK with losing that data. I just want to get things running so I can work with the important stuff.
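For that salvage attempt, the usual incantation is something like the following (sde standing in for the fresh drive -- please double-check the device names; GNU ddrescue, if installed, handles a failing source disk much more gracefully than dd):

    # Raw copy of the failing disk onto the replacement, padding unreadable blocks
    dd if=/dev/sda of=/dev/sde bs=64K conv=noerror,sync

    # Alternative with ddrescue, which retries bad sectors and keeps a log
    # ddrescue -f /dev/sda /dev/sde /root/sda-rescue.log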

If anyone has any pointers, suggestions, guidance, links to any good tutorials, or useful incantations, I'd sure appreciate them.

Thanks in advance for the assist!

ChiefEngr
  • Not really sure what your question is. Do you simply want to boot from the RAID-1 without using the RAID-0 at all? Does your system give you any warning at startup? Usually there are two potential problems: the MBR is on sda instead of sdc (and thus unable to point to your bootloader), and the missing swap, which should normally just give you an error on boot and let you proceed if you want. But you can try booting a live disk and disabling the swap in /etc/fstab together with all other mounts of rd0 – Broco Sep 26 '18 at 09:13
  • Basically Broco, that is my goal - to remove all of the RAID-0 volumes so I can successfully boot from the RAID-1 volumes. I hadn't thought about bringing a live disk into the picture, so I'll have to give that a try. I suspect I'm not even getting to the point of mounting via /etc/fstab; I think it's bailing long before that. Unfortunately, I'm out of town and won't get back to that machine until next week. Right now I'm just doing my research on manually manipulating these drives. – ChiefEngr Sep 27 '18 at 12:50

0 Answers