
I have a problem on a server with 4 x 1 TB drives running Debian wheezy and GRUB 1.99-27+deb7u3.

sda and sdb have partitions mirrored using (Linux software) RAID1, including /boot. sdc and sdd have a single partition each, mirroring an LVM physical volume for data. GRUB is installed to sda and sdb. I used mdadm to --fail and --remove the 1 TB sdc, and replaced the old drive (an ST91000640NS) with a new 2 TB ST2000NX0243.
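
For reference, the removal was the usual mdadm sequence, roughly the following (md4 being the data array that sdc1 belongs to, as shown in the lsblk output further down):

# mark the outgoing member as failed, then remove it from the array
mdadm /dev/md4 --fail /dev/sdc1
mdadm /dev/md4 --remove /dev/sdc1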

With the new drive in, GRUB gets as far as

GRUB loading.
Welcome to GRUB!

but fails to show the menu. The drive light on sdc is lit continuously, so presumably the GRUB core is trying to read that drive, even though it's not needed to access /boot/grub. I've tried two drives of the same model, both of which test fine with smartctl, with the same result. With the sdc drive bay empty, everything boots normally. The system boots from live USB and the new drive is accessible, so it's not a hardware incompatibility(*). I'm sure it was sdc that was removed, and there's no indication the BIOS reordered the drives.

(*) this may not have been a safe assumption. See answers.

So I have the following related questions:

  1. Could the changed logical sector size (4096 rather than 512 bytes) be causing a problem, perhaps in the RAID support built into the GRUB core? Why don't I at least get a grub rescue> prompt? Could a 4K problem also prevent using the drive for Linux RAID?
  2. What's the quickest way to solve this? [Previous suggestions included: Do I need to reinstall GRUB with the new drive in place, and in that case how? Would a GRUB rescue USB (made from the same system) have the same problem? Is it a known bug in GRUB, and should I upgrade? Answers to these appear to be: no, yes and no.] Can I permanently configure the GRUB image prefix used by Debian?
  3. How would one go about debugging this stage of GRUB? It might be sensitive to what modules are built in, but how do you find that out?

I'm thinking of a debug.cfg with just debug=all and something like:

grub-mkimage -c debug.cfg -o dcore.img configfile normal raid fs multiboot
grub-setup -c dcore.img /dev/sda

Would that work? (I address this point 3 in my own answer, but the hang in my case appears to happen before embedded configuration is acted on.)

More system details

In case it helps visualise, here's part of lsblk output:

NAME                             MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sdb                                8:16   0 931.5G  0 disk  
├─sdb1                             8:17   0   957M  0 part  
│ └─md0                            9:0    0 956.9M  0 raid1 /boot
├─sdb2                             8:18   0   9.3G  0 part  
│ └─md1                            9:1    0   9.3G  0 raid1 /
├─sdb3                             8:19   0 279.4G  0 part  
│ └─md2                            9:2    0 279.4G  0 raid1 /var
└─sdb4                             8:20   0 641.9G  0 part  
  └─md3                            9:3    0 641.9G  0 raid1 
    ├─vg0-home (dm-0)            253:0    0   1.4T  0 lvm   /home
    └─vg0-swap (dm-2)            253:2    0    32G  0 lvm   [SWAP]
sdc                                8:32   0 931.5G  0 disk  
└─sdc1                             8:33   0 931.5G  0 part  
  └─md4                            9:4    0 931.5G  0 raid1 
    └─vg0-home (dm-0)            253:0    0   1.4T  0 lvm   /home
sdd                                8:48   0 931.5G  0 disk  
└─sdd1                             8:49   0 931.5G  0 part  
  └─md4                            9:4    0 931.5G  0 raid1 
    └─vg0-home (dm-0)            253:0    0   1.4T  0 lvm   /home
sda                                8:0    0 931.5G  0 disk  
├─sda1                             8:1    0   957M  0 part  
│ └─md0                            9:0    0 956.9M  0 raid1 /boot
├─sda2                             8:2    0   9.3G  0 part  
│ └─md1                            9:1    0   9.3G  0 raid1 /
├─sda3                             8:3    0 279.4G  0 part  
│ └─md2                            9:2    0 279.4G  0 raid1 /var
└─sda4                             8:4    0 641.9G  0 part  
  └─md3                            9:3    0 641.9G  0 raid1 
    ├─vg0-home (dm-0)            253:0    0   1.4T  0 lvm   /home
    └─vg0-swap (dm-2)            253:2    0    32G  0 lvm   [SWAP]

This is a pre-2010 BIOS and has no EFI capability.

Probably irrelevant: on the running system, the following gives the same LVM error from GRUB 1.99 as grub-probe gives during grub-install, although everything appears to work (this seems to be fixed in GRUB 2.02).

# grub-fstest /dev/sda cp '(loop0,msdos1)/grub/grub.cfg' grub.cfg
error: unknown LVM metadata header.

The debug methods in the answer below show that the prefix of the image being installed to sd[ab] is:

grub-mkimage -d /usr/lib/grub/i386-pc -O i386-pc --output=/boot/grub/core.img '--prefix=(mduuid/<UUID of sdN1>)/grub' biosdisk ext2 part_msdos part_msdos raid mdraid09

I don't know why 'part_msdos' is repeated. There are no GPT tables. md0 (boot) uses RAID superblock version 0.9, as do md1, md2 and md4 (these are old arrays). md3 uses superblock 1.2, but shouldn't be involved in booting.
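
As a sanity check of what ends up in that prefix, the metadata version and UUID of the /boot array can be read back with mdadm, for example:

# metadata version and UUID of the /boot array
mdadm --detail /dev/md0 | grep -E 'Version|UUID'
# or from a member partition
mdadm --examine /dev/sda1 | grep -E 'Version|UUID'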


Update

Thanks for the suggestions so far. After further testing:

  • The BIOS was already set to boot using sda (ata1.00). After GRUB was reinstalled to all drives with dpkg-reconfigure grub-pc, nothing changed: GRUB still hangs before the menu when the new drive is connected by SATA. (So the hang can't be explained by the /boot/grub contents not matching the core image anyway.) Similarly, physically rearranging the drives makes no difference.
  • Upgrading GRUB to 2.02 from Debian jessie changes only one thing: the Welcome to GRUB! messages are no longer printed; instead it gets as far as changing the graphics mode. It still hangs under the same conditions.
  • The hang appears to occur before the embedded configuration sets the debug variable. No useful debug information is emitted.
  • GRUB shows a menu when booted from a removable medium where the prefix does not use UUIDs, and in this way it is possible to boot the system with the drive physically present. However, TAB enumeration of drives freezes. As expected, chainloading GRUB from a hard drive hangs as before. Booting from a USB drive made by grub-mkrescue from the same system also hangs.
  • As a separate fault, on the live system (Linux 3.2.0-4-amd64), trying to add the new 4Kn drive to the RAID1 array (the add command is sketched just after this list), whether via internal SATA or USB, results in Bad block number requested on the device, followed by the md system failing the drive, BUG: unable to handle kernel paging request and a kernel oops. (mdadm --remove then says the failed element is busy, and the md-resync process doesn't respond to SIGKILL; I didn't try echo frozen > /sys/block/mdX/md/sync_action. Testing the drive with dd over SATA, everything appears fine.) Surely the Linux MD drivers are capable of syncing a 4Kn drive with older drives, and do not use the BIOS?
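
The add that triggers this is just the standard mdadm hot-add, something like the following (assuming the new drive has been partitioned to match the old one):

# hot-add the replacement partition to the degraded data array
mdadm /dev/md4 --add /dev/sdc1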

So workarounds might include mounting a non-RAID partition as /boot/; installing GRUB with a device-dependent prefix; or flashing the BIOS. The most sensible thing is probably to contact the supplier to exchange the drives.

In other words, question 3 has a solution, whose ineffectiveness here is possibly the subject of a GRUB feature request; question 2 was barking up the wrong tree, so I've revised it; and question 1, if it's not going too far off topic, is now also about why the drive apparently cannot be used for Linux RAID.

I'd be happy to award the bounty for a decent explanation of any of this: something about the RAID resync bug, anecdotes of using flashrom for 4Kn support, how to tell grub-install not to use UUIDs, or any relevant sysadmin tips.

Cedric Knight
  • It is strange. Are you sure that you have replaced the `sdc` disk? Because the `boot` and `root` partitions are on the `sda` and `sdb` disks. – Mikhail Khirgiy Aug 20 '17 at 13:22
  • Yes, I'm certain it was `sdc`, by serial numbers, `mdstat`, and returning it to its bay and seeing it resync. Good point though. If sdb had been removed, would it give similar symptoms or boot normally? I'd also not quite expect the behaviour mentioned here: https://serverfault.com/questions/241109/linux-software-raid1-how-to-boot-after-physically-removing-dev-sda-lvm-md – Cedric Knight Aug 20 '17 at 13:55
  • If the `sdb` disk were unplugged now, there might be the same symptom, because `sda` could have an old version of the boot record and/or the BIOS could now be booting from a different drive. – Mikhail Khirgiy Aug 20 '17 at 14:05
  • Also, I strongly recommend you update the BIOS. – Mikhail Khirgiy Aug 20 '17 at 16:30
  • Why update the BIOS? This is a live server so the risk is significant. Although the BIOS miscalculates drive size and there is an update available, I can boot from USB so is there reason to think GRUB is using the BIOS here? – Cedric Knight Aug 20 '17 at 16:45
  • Well, does GRUB on the USB drive have the same version as on the live server? – Mikhail Khirgiy Aug 20 '17 at 17:00
  • No, I was using 2.02 on the USB drive. But doesn't that point to updating GRUB (via jessie-backports maybe) rather than updating BIOS? – Cedric Knight Aug 20 '17 at 17:04
  • If you didn't change the boot method from legacy to EFI, then a GRUB update will help you. – Mikhail Khirgiy Aug 20 '17 at 19:09
  • If you physically detach `sdc`, does the boot complete? – shodanshok Aug 22 '17 at 18:12
  • @shodanshok yes, as I say 'With the sdc drive bay empty, everything boots normally.' The menu displays after about two or three seconds, and I can boot a kernel. Obviously the RAID array is incomplete. Something about the drive seems to be confusing GRUB or the BIOS. – Cedric Knight Aug 22 '17 at 19:03
  • Maybe the BIOS is enumerating the disks in the wrong order. With `sdc` connected, can you try to specify (via BIOS or at POST) to boot precisely from `sda`? If that does not work, you can try booting from a livecd and installing grub on `sdc` also. – shodanshok Aug 22 '17 at 19:18
  • Yes, BIOSes do reorder drives and that may be part of the problem. The boot order options in the Phoenix BIOS only have one entry for 'All HDDs'. I can try using a live USB to chainload the MBRs from each disk to similar effect though. Maybe sdd has a GRUB boot MBR but tries to read the core image from sdc. – Cedric Knight Aug 22 '17 at 19:26
  • @shodanshok Thanks. I've updated my question. Please have a look, and consider answering, as there is a +100 bounty. Reinstalling GRUB to all drives had no effect. BTW the welcome message is in GRUB (1.99) core, not the MBR (you'd get 'GRUB' if the core couldn't load). – Cedric Knight Aug 27 '17 at 18:35
  • @MikhailKhirgiy Thanks for the BIOS suggestion. And yes it is BIOS, and EFI boot won't be possible. Do you know where to get changelogs for Phoenix BIOSes? – Cedric Knight Aug 27 '17 at 18:37
  • The BIOS changelog is usually published with the BIOS updates on the motherboard manufacturer's site. – Mikhail Khirgiy Aug 28 '17 at 03:01
  • @MikhailKhirgiy Yes, one would hope so, but unfortunately the motherboard manufacturer is Supermicro and they say even 'BIOS release notes are confidential'. I'm wondering when Phoenix started including 4Kn support in general. – Cedric Knight Aug 28 '17 at 09:47
  • @MikhailKhirgiy If you want to give your answer to my question suggesting a motherboard problem, please do, for the +100 bounty. – Cedric Knight Aug 29 '17 at 19:13
  • @Cedric Thank you very much. But I don't have a real answer. I only know that many BIOSes can boot an OS from a 4K disk only in EFI mode, and usually only after an update. I'm not using 4K disks at the moment and don't have any practical experience. – Mikhail Khirgiy Aug 30 '17 at 05:17

3 Answers


I'm going to answer the third part of my question, about a procedure to install GRUB with debugging enabled. I'd still appreciate informed suggestions about where the trouble may lie, or strategies to solve it with minimal downtime and maximum information as to the cause.


Some general points: GRUB provides other methods of debugging. grub-mkrescue will produce an .iso that has every module you might possibly need built in, so, like a live USB, it can be used to navigate a RAID array and try to load the .cfg file or even the kernel. The grub-emu emulator is available in most distros, but is more oriented towards how the menu will look. More advanced is the standard GRUB module for debugging with gdb over a serial cable.

Procedure to install GRUB with debugging enabled

So, the procedure to get debug messages is referred to in the GRUB manual, section 6, but not in detail. The first thing you may want to consider is doing the debugging over a serial console, running script before screen to record the debug messages. Obviously you need root privileges. Note that the drive layout in this answer does not necessarily match the question and is just an example. Assume that a normal (non-debug) GRUB is installed to the other drives as appropriate: this is just the procedure for installing a debug GRUB to the drive that you expect to boot. (That way the debug messages make it obvious which drive is booting. For installing to a RAID partition, the prefix is likely to be the same in both cases, so you can just run the same command for /dev/sda as for /dev/sdb.)
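
As an example of the capture side, something like the following on the machine at the other end of the serial cable records the whole session (the device name and speed are assumptions; adjust them to your setup):

# record the serial session to a typescript file while watching it live
script -c 'screen /dev/ttyUSB0 115200' grub-serial.log

On the GRUB side, sending output to the serial port needs the serial module in the image and lines such as serial --unit=0 --speed=115200 and terminal_output serial console in the embedded configuration.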

Firstly, check where the existing grub files are, /boot/grub or more likely /boot/grub/<platform>. In this case assume they are in /boot/grub/i386-pc/. We'll not modify the files already there, but add an additional core image with debug enabled. If the .cfg files are missing or have been modified, regenerate them as standard with grub-mkconfig -o /boot/grub/grub.cfg.

Checking installed modules and prefix

The quick and dirty way to show which modules are already compiled into your core image is just to run grub-install again. This works in GRUB 2.02:

grub-install -v /dev/sda 2>&1 | grep '\(mkimage\|setup\)'

In a simple case without RAID or LVM this might reveal a list like ext2 part_gpt biosdisk. However, GRUB 1.99 does not use -v for verbose, so use --debug instead. We'll combine this with a trick to not actually install the image, to save a little time:

grub-install --debug --grub-setup=/bin/true /dev/sda 2>&1 | grep '\(-mkimage\|-setup\|true\)'

Note that grub-install can run shell scripts in place of the programs it calls, so instead we could have done something like:

# create a grub-mkimage wrapper that logs its arguments, then runs the real program
cat > /usr/local/bin/grub-mkimage.sh <<"EOF"
#!/bin/bash
echo "Arguments to grub-mkimage: $*"
exec /usr/bin/grub-mkimage "$@"
EOF
# create a dummy grub-setup that only logs its arguments
cat > /usr/local/bin/grub-setup.sh <<"EOF"
#!/bin/bash
echo "Arguments to grub-setup: $*"
EOF
# run grub-install using the wrappers above
chmod u+x /usr/local/bin/grub-*.sh
grub-install --grub-mkimage=/usr/local/bin/grub-mkimage.sh \
  --grub-setup=/usr/local/bin/grub-setup.sh /dev/sda 2>&1 \
  | grep 'Arguments' | tee grub-args.txt

Paths of course may vary according to your distribution and chosen shell.

Setting the debug variable

We now create a file we can call debug.cfg with the debug settings. (The core generates a non-fatal error if it encounters a comment at this stage, so we won't use any.)

set pager=1
set debug='init modules disk ata,scsi,linuxefi,efi,badram,drivemap linux,fs,elf,dl,chain serial,usb,usb_keyboard,video'
set

Any combination of whitespace, ,, ; or | can be used to separate the facility names within the string.

I extracted the list of debug facilities from the GRUB 2.02 source and ordered them semantically. 'all' produces too much memory information from the scripting interpreter. There are additional facilities for particular filesystems like 'xfs' and 'reiserfs', as well as 'net', 'partition' and 'loader' ('loader' is too late for what we're interested in before the menu; if we can get a menu, we can set the debug variable there, as shown below). Unfortunately there are no debug messages in the 'mdraid_linux' source, but 'disk' shows the most important operations.
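
For instance, if a menu does appear, pressing c gives a grub> prompt where the same variables can be set interactively before booting an entry:

set pager=1
set debug=disk,fs,modules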

The pager variable is needed to read the debug messages if you are not capturing them over a console (for instance with script). I've found that pager doesn't work without including an additional module like sleep or configfile, which more than doubles the size of the image. The debug environment variable takes effect regardless.

Installing

Now make a variant image of the one you want to debug:

grub-mkimage -p '(,msdos3)/boot/grub' -c debug.cfg \
   -O i386-pc -o dcore.img -C auto ext2 part_msdos biosdisk

where the list of modules is the one from grub-install that you want to debug; include sleep or anything else you need. The prefix -p should be copied from the output of grub-install too, as obviously it has a huge effect on what happens after the GRUB banner. You may however want to experiment with using a GRUB device code (as in this case) rather than the standard UUID. You can show UUIDs with lsblk -o NAME,TYPE,FSTYPE,LABEL,SIZE,STATE,UUID or ls -l /dev/disk/by-id/, and for RAID arrays with mdadm --detail /dev/md0 (or mdadm --examine /dev/sda1 on a member partition).
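
For illustration, assuming the layout in the question (where /boot is the md0 array on the first partition of each drive), the following prefixes should all name the same grub directory; the first is the form grub-install generates, the second names the assembled array, and the third uses a raw BIOS drive and partition:

-p '(mduuid/<UUID of md0 without colons>)/grub'
-p '(md/0)/grub'
-p '(hd0,msdos1)/grub'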

Now install the core that has just been created to whichever disk is normally booted:

cp dcore.img /boot/grub/i386-pc
grub-bios-setup -d /boot/grub/i386-pc -c dcore.img /dev/sda

For versions of GRUB before 2.00, the grub-bios-setup command may still be called grub-setup, as in the manual.

Reboot. You should see the Welcome to GRUB! message followed by several pages of debug messages before the menu is shown (or not, as the case may be).

Cedric Knight
  • Nice. But you need to repeat `grub-install` on `/dev/sdb`. If `sda` dies then you will be able to boot from `sdb`. Otherwise you will get the same error. – Mikhail Khirgiy Aug 24 '17 at 05:13
  • I was assuming that the _normal_ grub is installed to the desired drives already. It seems to me to be an advantage to only install the _debug_ grub to the single drive you expect to boot as it's then obvious that the correct core is booting. – Cedric Knight Aug 24 '17 at 13:46
  • This procedure could be applied, but there must be some drive access, presumably to find the prefix directory, before the embedded configuration is acted on. The above answer may still be useful on later hangs, for developers, or finding why GRUB drops to a rescue prompt. – Cedric Knight Aug 27 '17 at 18:43

I'm now answering my own question 1. Is this a 4Kn ('advanced format') problem?

Yes.

4Kn drives aren't as widely supported as you might think; for example, they are not compatible with Windows 7, GRUB Legacy (GRUB 1), or a lot of Intel chipsets. In my case the problem seems to be the Intel 82801I Enterprise Southbridge controller chip (ICH9 family) on the motherboard. I think this is also the reason the drive partially fails to resync under md, even over USB. The analysis in the above link seems to find that the Linux ata_piix driver worked fine for 4Kn over Intel ICH10, despite the lack of official support from Intel; I may have found differently for ICH9. I've not tested whether the drive might work in AHCI or SAS mode.

Only the motherboard manufacturer, or someone else who has conducted a thorough test, is likely to know drive compatibility information. I concluded too soon that "it's not a hardware incompatibility" just because simple reads and writes worked. There's a reason why an updated BIOS for this motherboard wouldn't support 4Kn: the motherboard does not do so reliably.

There is no reason the equivalent 512e drive should not work in these situations.
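
If in doubt whether a given drive is 4Kn or 512e, the logical and physical sector sizes can be checked from a running Linux system, for example:

# logical then physical sector size: 4096/4096 means 4Kn, 512/4096 means 512e
blockdev --getss --getpbsz /dev/sdc
# or for all disks at once
lsblk -d -o NAME,LOG-SEC,PHY-SEC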

Cedric Knight

To answer your 2nd question, there is a bug related to raid1 that was patched in 2.02.

I hope it'll help, even if I can't tell whether this bug was present prior to 2.02~beta1 (the version against which the bug was reported).

Edit: Also, a question came to mind right after posting this: is your RAID1 a software or hardware RAID?

Taz8du29
  • Given that OP "used mdadm to --fail and --remove the 1 TB sdc", I'd say it's likely MD RAID, so software. – user Aug 24 '17 at 18:48
  • Thanks, but not sure the bug is relevant as it seems specific to RAID1 managed by lvm. The server in question has lvm over mdadm-managed RAID1, and none of the LVs are mirrored by lvm. – Cedric Knight Aug 24 '17 at 19:04
  • @CedricKnight okay. But why do you keep grub 1.99? can't you get grub 2.02 from jessie's or stretch's repos? – Taz8du29 Aug 24 '17 at 21:51
  • Yes, an upgrade to the jessie grub, along with its various libc dependencies, is one of several things I'm planning to try when sufficiently out-of-hours. As does happen, a long story prevents a full dist upgrade. It would be good to know the _specific_ bug that's being patched; without that, it's not very reassuring. – Cedric Knight Aug 24 '17 at 23:46
  • @MichaelKjörling Thanks, yes, Linux software RAID and I also had the question tagged 'software-raid' at that point. (Not sure who would use hardware RAID on Linux.) There's an update to this question and a +100 bounty. – Cedric Knight Aug 27 '17 at 18:47