
I am running a 14-disk RAID 6 array on mdadm behind two LSI SAS2008 controllers in JBOD mode (no hardware RAID) on Debian 7, booting in legacy BIOS mode.

Grub2 drops to a rescue shell, complaining "no such device" for "mduuid/b1c40379914e5d18dddb893b4dc5a28f".

Output from mdadm:

# mdadm -D /dev/md0
/dev/md0:
        Version : 1.2
  Creation Time : Wed Nov  7 17:06:02 2012
     Raid Level : raid6
     Array Size : 35160446976 (33531.62 GiB 36004.30 GB)
  Used Dev Size : 2930037248 (2794.30 GiB 3000.36 GB)
   Raid Devices : 14
  Total Devices : 14
    Persistence : Superblock is persistent

    Update Time : Thu Sep 18 19:44:56 2014
          State : clean
 Active Devices : 14
Working Devices : 14
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 512K

           Name : media:0  (local to host media)
           UUID : b1c40379:914e5d18:dddb893b:4dc5a28f
         Events : 2319862

    Number   Major   Minor   RaidDevice State
      13       8       82        0      active sync   /dev/sdf2
      15       8      130        1      active sync   /dev/sdi2
      14       8       98        2      active sync   /dev/sdg2
      21       8      194        3      active sync   /dev/sdm2
      16       8      226        4      active sync   /dev/sdo2
      12       8      162        5      active sync   /dev/sdk2
      18       8       50        6      active sync   /dev/sdd2
      17       8      146        7      active sync   /dev/sdj2
      20       8      210        8      active sync   /dev/sdn2
      19       8       66        9      active sync   /dev/sde2
      11       8       34       10      active sync   /dev/sdc2
      24       8      178       11      active sync   /dev/sdl2
      23       8      114       12      active sync   /dev/sdh2
      22       8       18       13      active sync   /dev/sdb2

Output from blkid:

# blkid
/dev/md0: UUID="2c61b08d-cb1f-4c2c-8ce0-eaea15af32fb" TYPE="xfs"
/dev/md/0: UUID="2c61b08d-cb1f-4c2c-8ce0-eaea15af32fb" TYPE="xfs"
/dev/sdd2: UUID="b1c40379-914e-5d18-dddb-893b4dc5a28f" UUID_SUB="09a00673-c9c1-dc15-b792-f0226016a8a6" LABEL="media:0" TYPE="linux_raid_member"
/dev/sdc2: UUID="b1c40379-914e-5d18-dddb-893b4dc5a28f" UUID_SUB="ce717500-cadf-3b12-e893-48d43c1408e7" LABEL="media:0" TYPE="linux_raid_member"
/dev/sdf2: UUID="b1c40379-914e-5d18-dddb-893b4dc5a28f" UUID_SUB="071afb12-f78f-4f15-f65a-a6298eadcfa7" LABEL="media:0" TYPE="linux_raid_member"
/dev/sdb2: UUID="b1c40379-914e-5d18-dddb-893b4dc5a28f" UUID_SUB="822fd02b-454d-a94c-57f6-8535964996b1" LABEL="media:0" TYPE="linux_raid_member"
/dev/sde2: UUID="b1c40379-914e-5d18-dddb-893b4dc5a28f" UUID_SUB="de3f41b8-3016-870c-344f-2a92c08e1085" LABEL="media:0" TYPE="linux_raid_member"
/dev/sdg2: UUID="b1c40379-914e-5d18-dddb-893b4dc5a28f" UUID_SUB="e319bdaa-22bc-1153-c43b-48788a9c1832" LABEL="media:0" TYPE="linux_raid_member"
/dev/sdi2: UUID="b1c40379-914e-5d18-dddb-893b4dc5a28f" UUID_SUB="3dd1df1b-203c-6453-0964-ebad245b1670" LABEL="media:0" TYPE="linux_raid_member"
/dev/sdh2: UUID="b1c40379-914e-5d18-dddb-893b4dc5a28f" UUID_SUB="f5477580-9435-7948-6e97-fe82c8805bcd" LABEL="media:0" TYPE="linux_raid_member"
/dev/sdj2: UUID="b1c40379-914e-5d18-dddb-893b4dc5a28f" UUID_SUB="4a013330-37c5-65f9-cb76-1d357ce4ddb4" LABEL="media:0" TYPE="linux_raid_member"
/dev/sdm2: UUID="b1c40379-914e-5d18-dddb-893b4dc5a28f" UUID_SUB="b750b4e4-2b1b-ac5f-cbd3-bde5eab657e7" LABEL="media:0" TYPE="linux_raid_member"
/dev/sdk2: UUID="b1c40379-914e-5d18-dddb-893b4dc5a28f" UUID_SUB="d5521994-6c4f-04f9-f7ca-0dd9dff3c6cd" LABEL="media:0" TYPE="linux_raid_member"
/dev/sdn2: UUID="b1c40379-914e-5d18-dddb-893b4dc5a28f" UUID_SUB="4670b36c-07cb-e661-20e3-d314f7c3fd42" LABEL="media:0" TYPE="linux_raid_member"
/dev/sdl2: UUID="b1c40379-914e-5d18-dddb-893b4dc5a28f" UUID_SUB="c1514b9f-2461-6fed-324a-50fb9469043a" LABEL="media:0" TYPE="linux_raid_member"
/dev/sdo2: UUID="b1c40379-914e-5d18-dddb-893b4dc5a28f" UUID_SUB="6c33c472-af1f-fd8f-22d1-0ea39edc75bb" LABEL="media:0" TYPE="linux_raid_member"

blkid reports the UUID for md0 as 2c61b08d-cb1f-4c2c-8ce0-eaea15af32fb, so I do not understand why grub insists on looking for b1c40379914e5d18dddb893b4dc5a28f.

Here is the output from bootinfoscript 0.61. It contains a lot of detailed information, and I couldn't find anything wrong with any of it:

http://pastebin.com/bPgGN68L

At the grub rescue prompt, ls shows the member disks and also shows (md/0), but ls (md/0) returns an unknown disk error. Running ls on any member device results in unknown filesystem. The filesystem on md0 is XFS, and I assume the unknown filesystem error is normal if it's trying to read an individual disk instead of md0.

I have come close to losing my mind over this. I've tried uninstalling and reinstalling grub, running update-initramfs -u -k all, running update-grub, and running grub-install to all member disks (without error), each numerous times.

I even tried manually editing grub.cfg to replace all instances of mduuid/b1c40379914e5d18dddb893b4dc5a28f with (md/0) and then reinstalling grub, but the exact same error, no such device mduuid/b1c40379914e5d18dddb893b4dc5a28f, still occurred.

EDIT TO ADD

I don't have IPMI on this box, so please forgive the embarrassing cell phone picture:

http://imgur.com/zooX12b

One thing I noticed is that it is only showing half the disks. I am not sure whether this matters, but one theory is that it is because there are two LSI cards physically in the machine.

This last screenshot was shown after I specifically altered grub.cfg to replace all instances of mduuid/b1c40379914e5d18dddb893b4dc5a28f with mduuid/2c61b08d-cb1f-4c2c-8ce0-eaea15af32fb and then re-ran grub-install on all member drives. Where it is getting this old b1c* address I have no clue.

I even tried installing a SATA drive as /dev/sda, outside of the array, then installing grub on it and booting from it. Still the same error.

EDIT TO CLARIFY

Grub installation is to each individual member disk, not to /dev/md0, and completes without error, but the machine drops to grub rescue on reboot.

EDIT TO ADD

These operations were suggested by a friend. They did not work; I still need help!

enter image description here

I could really use some assistance from anyone/everyone to help me get GRUB working on this box.

Does anyone have other suggestions or fixes?

EDIT 5

Grub bug report:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=764798

HopelessN00b
ctrlbrk

2 Answers


Look in /dev/disk/by-id for the entries for the RAID device that are prefixed with md-uuid. Those are the correct IDs to use with mduuid/ in grub. You probably need to insmod mdraid1x too if you are using current metadata.
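To make the point about IDs concrete, here is a minimal sketch that compares the different spellings of the UUIDs quoted in the question. The helper name normalize_uuid is made up for illustration, and the three values are copied from the mdadm -D and blkid output above. It only does string comparison and does not touch any device.

```shell
#!/bin/sh
# Hypothetical helper (not part of mdadm or grub): strip the ':' and '-'
# separators and lowercase, so differently formatted UUIDs can be compared.
normalize_uuid() {
    printf '%s\n' "$1" | tr -d ':-' | tr 'A-Z' 'a-z'
}

# Values copied from the question.
array_uuid='b1c40379:914e5d18:dddb893b:4dc5a28f'    # "UUID" line of mdadm -D /dev/md0
member_uuid='b1c40379-914e-5d18-dddb-893b4dc5a28f'  # blkid, TYPE="linux_raid_member"
fs_uuid='2c61b08d-cb1f-4c2c-8ce0-eaea15af32fb'      # blkid, TYPE="xfs" on /dev/md0

# The name grub asks for is the array UUID with the separators removed.
grub_name="mduuid/$(normalize_uuid "$array_uuid")"
echo "grub device name: $grub_name"

if [ "$(normalize_uuid "$member_uuid")" = "$(normalize_uuid "$array_uuid")" ]; then
    echo "member UUID from blkid matches the array UUID"
fi
if [ "$(normalize_uuid "$fs_uuid")" != "$(normalize_uuid "$array_uuid")" ]; then
    echo "filesystem UUID is a different identifier"
fi
```

With the values from the question, the array UUID and the linux_raid_member UUID normalize to the same string grub is looking for, while the 2c61b08d... value is the UUID of the XFS filesystem on top of the array.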

user262022

As I said here: Cannot install grub, segmentation fault, unable to identify filesystem, superfluous RAID member, found two disks with same index — Debian 7

You cannot install grub on an mdadm device. It exists by virtue of the RAID software (mdadm) and does not point to a physical device. You need to install grub on a physical device.
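As a rough sketch of what installing to physical devices looks like for this array, the loop below prints one grub-install command per member disk without executing anything. The member list is copied from the mdadm -D output in the question; the helper name disk_of is made up for illustration, and it assumes simple sdXN partition names where stripping the trailing digits gives the whole disk.

```shell
#!/bin/sh
# Dry run only: prints the commands, does not run grub-install.
# Member partitions as listed by mdadm -D /dev/md0 in the question.
members='sdf2 sdi2 sdg2 sdm2 sdo2 sdk2 sdd2 sdj2 sdn2 sde2 sdc2 sdl2 sdh2 sdb2'

# Hypothetical helper: map a partition name (sdf2) to its whole-disk device
# (/dev/sdf). Assumes sdXN naming; it would be wrong for names like nvme0n1p2.
disk_of() {
    printf '/dev/%s\n' "$1" | sed 's/[0-9]*$//'
}

plan=$(for part in $members; do
    echo "grub-install $(disk_of "$part")"
done)

printf '%s\n' "$plan"
```

Review the printed list before running any of the commands by hand as root.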

It's best not to start a new question, but to add the info to the existing one. This one may well be voted to be closed as a duplicate of your other question.

aseq
  • I'm not installing grub to mdadm, I'm installing it to all member disks as described in my post and shown by the attachment – ctrlbrk Sep 19 '14 at 19:25
  • I created a new article because this issue is quite different than the original issue, as grub does install, there is no seg fault, and no raid errors – ctrlbrk Sep 19 '14 at 19:26