
If you remove a component HDD from the array, boot drops into a BusyBox shell at the "(initramfs)" prompt with a message similar to "cannot mount root device", because the RAID1 array comes up as "inactive".

It is possible to start it using:

(initramfs): mdadm --run /dev/md0
(initramfs): exit

After that, it boots up normally from the started RAID1 (the root filesystem is on the RAID1 array) and keeps booting normally until you remove another drive, in which case it does exactly the same thing.
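Once the system is up in this degraded state, /proc/mdstat shows what is going on. A minimal sketch of reading the status markers (the device name and block count here are made up):

```shell
# Hypothetical /proc/mdstat entry for a two-disk RAID1 running on one disk
line='md0 : active raid1 sdb1[1]
      976630336 blocks super 1.2 [2/1] [_U]'

# [2/1] means the array wants 2 members but only 1 is active;
# the underscore in [_U] marks the missing member
echo "$line" | grep -o '\[[0-9]*/[0-9]*\]'   # prints [2/1]
```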

Google spat out a bunch of posts about Ubuntu using "BOOT_DEGRADED=true", but that doesn't work on Debian.

There is also a post about passing "md-mod.start_dirty_degraded=1" as a boot argument to the kernel. I have tried adding it to the kernel line in the GRUB menu, to no avail.

There might be something that explains this, but I am too much of a newbie to understand it :(

Any ideas?

Bob

1 Answer


The initramfs executes /scripts/local-top/mdadm to handle RAID assembly. That script contains the statement:

if $MDADM --assemble --scan --run --auto=yes${extra_args:+ $extra_args}; then
  verbose && log_success_msg "assembled all arrays."
else
  log_failure_msg "failed to assemble all arrays."
fi
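The `${extra_args:+ $extra_args}` at the end is ordinary shell parameter expansion: it appends nothing when `extra_args` is unset or empty, and the value with a leading space otherwise. A standalone sketch (the `--homehost=any` value is just an example, not something the script sets):

```shell
# ${var:+ $var} expands to nothing if var is unset or empty,
# otherwise to " $var" -- this is how optional flags get appended
extra_args=''
printf 'mdadm --assemble --scan --run --auto=yes%s\n' "${extra_args:+ $extra_args}"
# -> mdadm --assemble --scan --run --auto=yes

extra_args='--homehost=any'
printf 'mdadm --assemble --scan --run --auto=yes%s\n' "${extra_args:+ $extra_args}"
# -> mdadm --assemble --scan --run --auto=yes --homehost=any
```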

With the version of mdadm shipping with Debian Jessie, the --run parameter seems to be ignored when used in conjunction with --scan. According to the man page it is supposed to activate all arrays even if they are degraded. But instead, any arrays that are degraded are marked as 'inactive'. If the root filesystem is on one of those inactive arrays, the boot process is halted.

It is possible to modify this script and then rebuild the initramfs with the command update-initramfs -u.

  1. Copy the script to the local override directory.
  2. Patch the script with some additional lines to run mdadm --run on each array individually if the first attempt fails.
  3. Update the initramfs.

The following commands will perform the previous steps. Verify that you don't already have a /etc/initramfs-tools/scripts/local-top/mdadm file before you copy on top of it.

cd /etc/initramfs-tools/scripts/local-top
cp /usr/share/initramfs-tools/scripts/local-top/mdadm .
patch --verbose --ignore-whitespace <<'EndOfPatch'
--- mdadm
+++ mdadm
@@ -76,7 +76,15 @@
   if $MDADM --assemble --scan --run --auto=yes${extra_args:+ $extra_args}; then
     verbose && log_success_msg "assembled all arrays."
   else
-    log_failure_msg "failed to assemble all arrays."
+    log_warning_msg "failed to assemble all arrays...attempting individual starts"
+    for dev in $(cat /proc/mdstat | grep md | cut -d ' ' -f 1); do
+      log_begin_msg "attempting mdadm --run $dev"
+      if $MDADM --run $dev; then
+        verbose && log_success_msg "started $dev"
+      else
+        log_failure_msg "failed to start $dev"
+      fi
+    done
   fi
   verbose && log_end_msg

EndOfPatch
update-initramfs -u
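The device list in the patched loop is just the first field of every /proc/mdstat line that names an md device. A self-contained sketch of that extraction on made-up mdstat contents (grep and cut must be available inside the initramfs, which is why busybox is needed):

```shell
# Hypothetical contents of /proc/mdstat with one inactive array
mdstat='Personalities : [raid1]
md0 : inactive sda1[0]
      976630336 blocks super 1.2
unused devices: <none>'

# The same extraction the patched script performs: take the first
# field of every line mentioning an md device
echo "$mdstat" | grep md | cut -d ' ' -f 1   # prints md0
```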

With this updated initramfs, it is possible to boot without intervention when a RAID1 containing the root filesystem is missing a drive.

    Confirming that this is still the case in Jessie update 1 (8.1). I have been chasing this same issue for hours and @Mark Neyhart you sir are a gentleman and a scholar –  Sep 15 '15 at 00:41
  • Still a problem in 8.2. It is tracked at https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=784070 I had a small problem with this patch, "cut: not found" and "grep: not found" when booting. Installing busybox before running update-initramfs -u is the solution. – Johan Nilsson Oct 09 '15 at 08:30
  • Thanks, this was incredibly helpful. Note that if you're using LVM you will see an error like "Unable to find LVM volume vg1/rootfs" upon boot-up. After starting the degraded array using `mdadm --run /dev/md0` you'll need to also run `vgchange -a y` to activate your volume group before exiting the BusyBox shell. – twaddington Jan 11 '16 at 00:17