
The simple question: how does initramfs know how to assemble mdadm RAID arrays at startup?

My problem: I boot my server and get:

Gave up waiting for root device.
ALERT! /dev/disk/by-uuid/[UUID] does not exist. Dropping to a shell!

This happens because /dev/md0 (which is /boot, RAID 1) and /dev/md1 (which is /, RAID 5) are not being assembled correctly. What happens is that /dev/md0 isn't assembled at all, and /dev/md1 is assembled, but instead of using /dev/sda2, /dev/sdb2, /dev/sdc2, and /dev/sdd2, it uses the whole disks /dev/sda, /dev/sdb, /dev/sdc, and /dev/sdd.
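
From the (initramfs) shell you can see the wrong membership with, for example:

$(initramfs) cat /proc/mdstat
$(initramfs) mdadm --detail /dev/md1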

To fix this and boot my server I do:

$(initramfs) mdadm --stop /dev/md1
$(initramfs) mdadm --assemble /dev/md0 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1
$(initramfs) mdadm --assemble /dev/md1 /dev/sda2 /dev/sdb2 /dev/sdc2 /dev/sdd2
$(initramfs) exit

And it boots properly and everything works. Now I just need the RAID arrays to assemble properly at boot so I don't have to manually assemble them. I've checked /etc/mdadm/mdadm.conf and the UUIDs of the two arrays listed in that file match the UUIDs from $ mdadm --detail /dev/md[0,1].
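
For reference, this is roughly how I compared them (just eyeballing the UUID lines, device names as above):

$ mdadm --detail /dev/md0 | grep -i uuid
$ mdadm --detail /dev/md1 | grep -i uuid
$ grep ^ARRAY /etc/mdadm/mdadm.conf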

Other details: Ubuntu 10.10, GRUB2, mdadm 2.6.7.1

UPDATE: I have a feeling it has to do with superblocks. $ mdadm --examine /dev/sda outputs the same thing as $ mdadm --examine /dev/sda2. $ mdadm --examine /dev/sda1 seems to be fine because it outputs information about /dev/md0. I don't know if this is the problem or not, but it seems to fit with /dev/md1 getting assembled with /dev/sd[abcd] instead of /dev/sd[abcd]2.
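
To be concrete, this is the sort of comparison I mean (only looking at the UUID lines; exact output depends on the superblock format):

$ mdadm --examine /dev/sda | grep -i uuid    # same UUID as /dev/sda2, i.e. md1's
$ mdadm --examine /dev/sda2 | grep -i uuid   # md1's UUID
$ mdadm --examine /dev/sda1 | grep -i uuid   # md0's UUID, as expected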

I tried zeroing the superblock on /dev/sd[abcd]. This removed the superblock from /dev/sd[abcd]2 as well and prevented me from being able to assemble /dev/md1 at all. I had to run $ mdadm --create to get it back, which also put the superblocks back to the way they were.

Brad

7 Answers


Well looking at the scripts used to assemble the initramfs, I'm thinking the problem is probably just that your /etc/mdadm/mdadm.conf is out of date.

When your system is up with the arrays assembled, execute the following command to update your mdadm config. You may want to double-check the result just in case as well.

mdadm --detail --scan > /etc/mdadm/mdadm.conf

Once done, update your initramfs with:

update-initramfs -u

If this consistently fails, then your superblocks (the metadata used to assemble the arrays) may be shot. You may want to examine each of your drives and their partitions to verify. Worst case, zero out the superblocks via mdadm and recreate the arrays.
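
Something along those lines (device names taken from the question; --zero-superblock is destructive, so it's commented out and should only be run on stopped arrays once you're sure):

for d in /dev/sd[abcd] /dev/sd[abcd]1 /dev/sd[abcd]2; do
    echo "== $d =="
    mdadm --examine "$d" | grep -iE 'version|uuid|raid devices'
done
# worst case only, after stopping the arrays:
# mdadm --zero-superblock /dev/sd[abcd]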

jonathanserafini

It sounds like your initramfs was created when your RAID setup was wrong (or just different to now) and hasn't been updated since.

You could run update-initramfs (which is normally run after kernel updates) and hopefully this will rebuild your initramfs file, including building in the right raid configuration files.
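
For example (-u updates the existing initramfs for the running kernel; -k all does it for every installed kernel):

update-initramfs -u          # current kernel
update-initramfs -u -k all   # or every installed kernel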

David Spillett
  • I've tried that a couple times, it doesn't help. It still loads /dev/sd[abcd] into /dev/md1 instead of /dev/sd[abcd]2. – Brad Dec 05 '10 at 22:16

Here's a workaround I came up with:

Add this script to /etc/initramfs-tools/scripts/local-top:

 #!/bin/sh
 # give the devices a moment, tear down the mis-assembled arrays,
 # then reassemble them from the mdadm.conf inside the initramfs
 sleep 6
 mdadm --stop /dev/md1
 mdadm --stop /dev/md0
 sleep 6
 mdadm --assemble --scan

This fixes the RAID arrays before the system tries to mount md1 on /root. I had to add the pauses in order to get the commands to work consistently.
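
The script also has to be executable, and the initramfs rebuilt afterwards, for it to run at boot (the filename below is just an example; use whatever you named the file):

chmod +x /etc/initramfs-tools/scripts/local-top/mdadm-workaround
update-initramfs -u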

This doesn't actually fix the problem, but it's the best solution I've found that doesn't require changing the RAID arrays or upgrading software.

Brad

I have the same problem, and found this link that explains why it happens: https://bugs.launchpad.net/ubuntu/+source/debian-installer/+bug/599515

It seems that your sda2 partition goes all the way to the end of the disk, so its superblock also sits at the end of the whole disk. sda and sda2 are therefore the same thing to mdadm, and it ends up assembling md1 with sda instead of sda2.
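
One way to check (a rough sketch; the old 0.90 superblock sits near the end of the device, so if sda2 ends at or very close to the end of the disk, sda and sda2 resolve to the same superblock):

blockdev --getsz /dev/sda        # size of the whole disk, in 512-byte sectors
fdisk -lu /dev/sda | grep sda2   # start/end sectors of the partition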

MrGuga

To answer the question: yes, it does have to do with superblocks. The technical documentation is here: https://raid.wiki.kernel.org/index.php/RAID_superblock_formats
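
For example, to see which format your array and its members are using (0.90 keeps the superblock at the end of the device, which is what can make sda and sda2 ambiguous; the 1.1/1.2 formats keep it at or near the start):

mdadm --detail /dev/md1 | grep -i version
mdadm --examine /dev/sda2 | grep -i version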

mattdm
  • I recreated the array: `$ mdadm --create -n 4 -l 5 -e 1.1 /dev/md1 /dev/sd[abcd]2` and it seemed to work and mdadm --examine looked better. However, I can no longer mount /dev/md1 or /dev/md/1. – Brad Dec 05 '10 at 23:46

Are /dev/sd[abcd]2 set as type "fd" (Linux RAID auto-detect) in the partition table? Run fdisk -l | less to see the partition tables. It sounds like the initrd is not detecting the partitions, but is then seeing the superblock on the raw device. Or it could be that there's an incorrect mdadm.conf on the initrd, but I would expect update-initramfs to fix that.

You can extract the initrd by creating a directory, cd'ing into it, and then running:

gunzip </path/to/initrd | cpio -ivd

Then you can see all the files that make up the initrd and any scripts that it's running. Investigating these may help track down what exactly is causing it.
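
For instance, inside the extracted tree the RAID-related pieces are roughly these (paths as initramfs-tools lays them out on Ubuntu; adjust if your layout differs):

cat etc/mdadm/mdadm.conf     # the config that was baked into the initrd
ls scripts/local-top/        # boot-time scripts, including the mdadm one
grep -r mdadm conf/ scripts/ # anything else referencing mdadm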

But first check the partition tables...

Sean Reifschneider
  • All of the partitions are marked as fd - Linux raid autodetect. – Brad Dec 06 '10 at 00:05
  • Thanks for the tip on how to look at the initrd file. It looked correct, but it helped me figure out what initramfs is doing. – Brad Dec 06 '10 at 06:55
  • @Brad: Any luck fixing it? I was really hoping the partition type would do it, otherwise I think it's going to take some digging. – Sean Reifschneider Dec 06 '10 at 07:34
  • I was able to come up with a workaround to get it to boot, but I wasn't able to actually fix the issue. – Brad Dec 13 '10 at 04:22

I had a similar problem with RAID + LVM on a Debian Lenny box. Before exiting the initramfs shell, do:

$(initramfs) vgchange MyLvmVol -a y
$(initramfs) exit

then

update-initramfs -u

Tom O'Connor