I'm trying to set up a RAID array on an existing Ubuntu Linux install.

I'm following this tutorial... http://howtoforge.org/software-raid1-grub-boot-fedora-8

After going through the list of steps a million times, I finally understand what's going on. You create the RAID device on your new blank drive, copy your old / filesystem to it, and set up the grub menu.lst, fstab, mtab, initrd and grub MBR to all point to the RAID device (which I have defined and is working). Then you reboot, and you're now running on the RAID device (/dev/md0). Finally, you add your original drive to the RAID array, it syncs, and voilà, you're done.
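For reference, the sequence described above can be sketched as a dry run. This is only an outline under assumptions: the device names are the ones from the fdisk output further down, the mount point /mnt/md0 is made up for illustration, and the `run` helper only prints each command rather than executing it.

```shell
#!/bin/sh
# Dry run: print each command instead of executing it.
run() { echo "+ $*"; }

NEW_PART=/dev/sdc1   # partition on the new, blank drive (from the question)
OLD_PART=/dev/sda1   # current root partition (from the question)
MD=/dev/md0
MNT=/mnt/md0         # assumed temporary mount point

migrate_plan() {
    # Create a two-member RAID 1 with one slot left empty ("missing").
    run mdadm --create "$MD" --level=1 --raid-devices=2 missing "$NEW_PART"
    run mkfs.ext3 "$MD"
    run mount "$MD" "$MNT"
    # Copy the running root filesystem onto the array.
    run cp -dpRx / "$MNT"
    # ...edit fstab and menu.lst, rebuild the initrd, reboot onto the array, then:
    run mdadm --add "$MD" "$OLD_PART"
}

migrate_plan
```

Nothing here touches a disk. To act on it you would replace `run` with something that really executes, and only after checking every device name.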

So I set up my menu.lst to primarily load the kernel and initrd from the raid device, and failover to my original (still intact) old disk. And it always fails over when I reboot. I boot the machine, run my new grub entry and it says "error 15 file not found." Lots of stuff on the web about it, none seem to help.

The only thing that's weird is what happens when I go to set up the MBR with grub. You say "root (hd0,0)", which I finally understand, and it's supposed to report "Filesystem type is ext2fs, partition type 0xfd" or something like that. Mine says nothing. But when I run setup (hd0) and setup (hd2), it says it's doing the right thing to the right drive, so I assume it's working. It still can't load the initrd or the kernel from the md0 device.

The only other thing I'm thinking, is how on earth does grub know what a raid device is. The kernel hasn't loaded, the software raid modules haven't loaded, how can stupid little grub have any idea at all where to load initrd from? So I'm thinking, okay there's a mapping somewhere from /dev/md0 to /dev/sdc1 (the new raid drive) but I don't see where that could be happening. And for kicks, (I did this SO many times in various combinations) I tried setting the grub menu.lst to try and load the initrd and kernel from root=/dev/sdc1 (my new drive) and it still says file not found. So either the grub mbr setup isn't working, or I'm missing something really simple.

Any ideas?

Here's some more info...

root@io:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid1 sdc1[1]
      18771840 blocks [2/1] [_U]

root@io:~# fdisk -l

Disk /dev/sda: 20.8 GB, 20847697920 bytes
255 heads, 63 sectors/track, 2534 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x9d949d94

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1        2337    18771921   83  Linux
/dev/sda2            2338        2434      779152+   5  Extended
/dev/sda5            2338        2434      779121   82  Linux swap / Solaris

Disk /dev/sdb: 320.0 GB, 320072933376 bytes
16 heads, 63 sectors/track, 620181 cylinders
Units = cylinders of 1008 * 512 = 516096 bytes
Disk identifier: 0x00000000

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *           1        4064     2048224+  83  Linux
/dev/sdb2            4065      620181   310522968   83  Linux

Disk /dev/sdc: 20.0 GB, 20020396032 bytes
255 heads, 63 sectors/track, 2434 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000080

   Device Boot      Start         End      Blocks   Id  System
/dev/sdc1   *           1        2337    18771921   fd  Linux raid autodetect
/dev/sdc2            2338        2434      779152+   5  Extended
/dev/sdc5            2338        2434      779121   82  Linux swap / Solaris

Disk /dev/md0: 19.2 GB, 19222364160 bytes
2 heads, 4 sectors/track, 4692960 cylinders
Units = cylinders of 8 * 512 = 4096 bytes
Disk identifier: 0x00000000

Disk /dev/md0 doesn't contain a valid partition table

root@io:~# mdadm -E
mdadm: No devices to examine

root@io:~# cat /etc/mdadm.conf
ARRAY /dev/md0 level=raid1 num-devices=2 UUID=5248ed76:cba39cc2:3082255a:649c0d18

root@io:~# cat /boot/grub/menu.lst

default         0
# 8/14/09 added this
fallback        1

## timeout sec
# Set a timeout, in SEC seconds, before automatically booting the default entry
# (normally the first entry defined).
timeout         3

## hiddenmenu
# Hides the menu by default (press ESC to see the menu)

# added this 8/14/09 for raid boot, note this will get blown away on next kernel update
# if it's after the magic marker
# this means we will have to manually update this when there's a kernel upgrade :-(
# in grub land hd0 = /dev/sda and hd1 = /dev/sdb and hd2 = /dev/sdc I hope
# we're putting sdc first for now
title           Ubuntu 8.04.3 LTS, kernel 2.6.24-24-generic (raid)
root            (hd2,0)
#kernel         /boot/vmlinuz-2.6.24-24-generic root=UUID=b11d6b08-fdfe-4b0d-adec-4e263455be23 ro
kernel          /boot/vmlinuz-2.6.24-24-generic root=/dev/md0 ro
initrd          /boot/initrd.img-2.6.24-24-generic

title           Ubuntu 8.04.3 LTS, kernel 2.6.24-24-generic
root            (hd0,0)
kernel          /boot/vmlinuz-2.6.24-24-generic root=UUID=d8c402cc-7445-4878-b3aa-c9568b740b51 ro
initrd          /boot/initrd.img-2.6.24-24-generic

title           Ubuntu 8.04.3 LTS, kernel 2.6.24-24-generic (recovery mode)
root            (hd0,0)
kernel          /boot/vmlinuz-2.6.24-24-generic root=UUID=d8c402cc-7445-4878-b3aa-c9568b740b51 ro single
initrd          /boot/initrd.img-2.6.24-24-generic

root@io:~# blkid
/dev/sda1: UUID="d8c402cc-7445-4878-b3aa-c9568b740b51" SEC_TYPE="ext2" TYPE="ext3"
/dev/sda5: TYPE="swap" UUID="e0509276-30eb-4dcb-8e17-20f8244f5403"
/dev/sdb1: LABEL="alt" UUID="ea1789eb-9d6f-47a9-a074-18121792b30a" SEC_TYPE="ext2" TYPE="ext3"
/dev/sdb2: LABEL="sp" UUID="3b6d1173-f9fd-4a3e-8e5d-249fc682355b" SEC_TYPE="ext2" TYPE="ext3"
/dev/sdc1: UUID="76ed4852-c29c-a3cb-5a25-8230180d9c64" TYPE="mdraid"
/dev/md0: UUID="b11d6b08-fdfe-4b0d-adec-4e263455be23" SEC_TYPE="ext2" TYPE="ext3"

  • I think I see the problem. when I say in grub root(hd0,0) it isn't able to mount the partition's filesystem. But it doesn't say why. – Stu Aug 22 '09 at 11:51

For anybody else who ends up suffering the error 15 grief that I did: it turns out that grub's device naming (hd0, hd1, hd2...) was different at boot time from what the grub shell reports once the system is up and running. I spent a week with root (hd2,0) because that's what grub told me the drive I wanted was called. But when I dropped to the grub shell on bootup, I was surprised to find that what is hd2 when the machine is up is hd1 on boot. So I changed the menu.lst to use root (hd1,0) and it started working. I hope to save somebody else lots of hair pulling with that one.
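A sketch of how the two views can be compared, assuming GRUB legacy (0.97). On the running system the grub shell takes its names from /boot/grub/device.map, while at boot grub numbers the drives in whatever order the BIOS presents them. The helper name `grub_name_for` and the sample map are made up for illustration; the map contents are a guess at what a three-disk system like this one might have.

```shell
#!/bin/sh
# Print the GRUB (hdN) name that a device.map-style listing assigns to a disk.
# Usage on a real system: grub_name_for /dev/sdc < /boot/grub/device.map
grub_name_for() {
    awk -v dev="$1" '$2 == dev { print $1 }'
}

# Hypothetical device.map contents for the three disks in the question.
sample_map='(hd0) /dev/sda
(hd1) /dev/sdb
(hd2) /dev/sdc'

printf '%s\n' "$sample_map" | grub_name_for /dev/sdc
```

To see the boot-time numbering instead, press c at the grub menu and run `find /boot/grub/stage1`, which lists the partitions that grub itself can read that file from. If that name differs from the one the running system gives, menu.lst needs the boot-time one.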

The thing about Grub is that it is invoked before the rest of the Linux system is (obviously), so it doesn't know anything about your software RAID. It only sees the bare hard drives.

So it is very important to install grub on both drives of your RAID1 array. The BIOS will pick one to boot from, and if grub is not installed on that drive, it will not boot. (I found this out the hard way when one of my drives in a software RAID1 configuration failed: the system refused to boot, saying it had no boot partition. The drive that had grub installed had failed, and I was left with a non-bootable HDD. Installing Grub on it fixed it.)

So run grub from a shell prompt (you can do this with Linux running) to get the grub prompt.

Then, for each drive, run root (hdN,0) followed by setup (hdN), where N is that drive's number. The root command points grub at the first partition on the drive (*** if your boot partition is elsewhere on the drive, change that 0 to reflect the correct partition), and setup then installs the grub boot files.

That should be everything you need to do. If it isn't working correctly, are you sure you have the right boot partition, and that your drives are laid out identically?
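As a rough sketch of the steps above, assuming GRUB legacy and that the boot files live on the first partition of each drive: the helper below (`grub_setup_cmds`, a made-up name) just prints the grub-shell commands for the drive numbers you give it, so you can read them before feeding them to grub. The drive numbers 0 and 1 are placeholders, not a statement about any particular machine.

```shell
#!/bin/sh
# Emit the grub-shell commands that install GRUB legacy to the MBR of each
# listed drive, using partition 0 on that drive as the root.
grub_setup_cmds() {
    for n in "$@"; do
        printf 'root (hd%s,0)\nsetup (hd%s)\n' "$n" "$n"
    done
    printf 'quit\n'
}

# Preview for two drives (the drive numbers are assumptions).
grub_setup_cmds 0 1

# To apply for real, as root:
#   grub_setup_cmds 0 1 | grub --batch
```

Printing first and piping second keeps the destructive step explicit, which helps when you are not yet sure which hdN is which.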

  • yeah I did that on both drives and it says its working when I do setup, but no love. there's only one partition (and one swap) so it's not like I can screw that up. I copied the partition table from one drive to the other so they're identical. It's the file not found that I don't get, it's obviously looking at the non-original drive, but I can't see why it can't find the initrd. – Stu Aug 22 '09 at 13:47
  • Then it's more likely to be something simple. Try booting off each drive without the other present. If one works and the other doesn't, then maybe the MBR is screwed on one. I assume the partition has synced fully and isn't still in progress, and that the files are present on both drives. Don't look at md0, look at sda1 and sdc1 instead. Lastly, add 'debug' to the menu.lst to start it up in debug mode. http://www.gnu.org/software/grub/manual/html_node/Command_002dline-and-menu-entry-commands.html#Command_002dline-and-menu-entry-commands – gbjbaanb Aug 23 '09 at 15:34
  • More progress. As somebody pointed out, grub knows not of raid, but since a raid 1 drive is a drive, it should be able to mount it and find /boot/initrd... So I tried it by hand. I ran grub interactive, said root (hd2) which is /dev/sdc which is now the working half of the raid array (that won't boot) Then I figured out this neat trick, I type root and hit tab for the completion, without the open paren, and it tries to mount the filesystem so it can display the file options, and it says "error 17" When I try mount... mount /dev/sdc1 sdc1 mount: unknown filesystem type 'linux_raid_member' – Stu Aug 24 '09 at 20:59
  • So although I can mount it as md0, I can no longer mount it as a plain drive, and neither can grub. So that's what's wrong. The question is why? Is there a newer version of grub that is raid 1 drive aware? It says Grub 0.97 – Stu Aug 24 '09 at 21:01
  • Grub is not RAID aware. The filesystem layout is the same, but grub sees it as 2 separate drives. If you have your RAID "working" but the underlying drive is not, then the RAID system has not finished syncing the files. Look at /proc/mdstat to see the progress. If in doubt, reformat the drive and start again: set up as RAID1, run the resync, wait for it to finish, then install grub on the new drive. – gbjbaanb Aug 25 '09 at 12:48
  • Thanks. I'll try making it again from scratch, but the drive I'm trying to mount is the good one. The one that it will be syncing FROM. I'm sooo confused. – Stu Aug 25 '09 at 18:02
  • Well, I half figured it out. I rebuilt the RAID device, no love. Then I got the bright idea to drop to the grub command line on boot and poke around. Sure enough, what is hd2 when the machine is fully up is hd1 on boot. Lord knows why. So I changed my menu.lst from root (hd2,0) to root (hd1,0) and now it will boot off the second drive. Yaaay. Or rather, it will start to boot. It gets to the SCSI device list and then hangs. Sigh. Kernel boot problems. But at least that part is solved. One wonders why the devices map differently at boot vs. with the kernel running. – Stu Aug 28 '09 at 00:17
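Since the comments keep coming back to checking /proc/mdstat, here is a small sketch of reading the member count from it. It assumes the `[configured/active]` notation shown in the question's output (for example `[2/1]`); the function name `md_state` is made up. Note that a degraded array is expected at the stage the question describes, because the original drive has not been added yet.

```shell
#!/bin/sh
# Print "degraded" if any array in an mdstat-style listing has fewer active
# members than configured (e.g. [2/1]), otherwise "ok".
md_state() {
    awk '
        match($0, /\[[0-9]+\/[0-9]+\]/) {
            split(substr($0, RSTART + 1, RLENGTH - 2), n, "/")
            if (n[2] + 0 < n[1] + 0) bad = 1
        }
        END { print (bad ? "degraded" : "ok") }
    '
}

# Sample taken from the /proc/mdstat output in the question.
sample='md0 : active raid1 sdc1[1]
      18771840 blocks [2/1] [_U]'

printf '%s\n' "$sample" | md_state
```

On a live system the equivalent is `md_state < /proc/mdstat`.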

Grub doesn't know about your RAID device; it just reads directly from the drive, which (in a RAID-1 setup) is still fine, because an entire copy of the filesystem is right there (not chopped up into bits as it would be on a RAID-5 or RAID-10 configuration).

You haven't really provided enough info to determine what's going on, though; what would be handy is:

  • Partition tables for all your drives;
  • RAID configuration details (output of /proc/mdstat, mdadm -E, etc.)

  • If you want your root partition on RAID 5 or 10, the solution is to make a small /boot partition and make that RAID 1, as grub just needs to load your kernel and initrd. – David Pashley Aug 22 '09 at 05:03