grub rescue on Ubuntu 18.04 with btrfs partition

Question

I have a small server (HP ProLiant MicroServer Gen8) running Ubuntu 18.04 64 bit with the latest HWE generic kernel (5.4.0.74.83~18.04.67); it has two SATA drives, GPT partitioned but booting in legacy BIOS mode. Both drives are partitioned as follows:

partition 1 (1 MB): "shadow" boot partition, for GRUB code;
partition 2 (several TB): BtrFS, containing the @ subvolume for / and the @home subvolume for /home; /boot is not a separate partition but lives inside the root subvolume, at /@/boot
partition 3 (4 GB): swap

The BtrFS partitions of the two drives are tied together in a BtrFS mirror configuration.

A few days ago the power went out, and the NAS never recovered. At boot, I just got the dreaded "grub rescue" prompt, saying it couldn't find the target partition (specified by the BtrFS volume UUID, I think). Running ls (hd0,gpt2) (or really, any other partition) told me "unknown filesystem" (even though insmod btrfs seemed to work correctly).

Thinking something was fried in the GRUB installation, I booted from an Ubuntu 18.04 server installation USB key and tried to repair GRUB, as follows:

mount /dev/sda2 /mnt (/dev/sda2 is the BtrFS partition); I can see that all data is still there;
mount --bind /dev /mnt/@/dev; same for /dev/pts, /proc, /sys, /run (otherwise DNS, which is managed by resolvconf in both the live system and the "broken" system, wouldn't work)
chroot /mnt/@
inside the chroot, I did mount /dev/sda2 / -o subvol=@, otherwise grub-probe/grub-install would get confused; outside the chroot, I redid the bind mounts, as they were shadowed by the remount;
mount -a
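
For reference, the part of the mount sequence above that runs outside the chroot can be sketched as a small script. This is only an illustration: it assumes /dev/sda2 and /mnt as in the question, and the run helper and the DRY_RUN switch are illustrative additions, not part of what was actually typed. By default it only prints the commands instead of executing them.

```shell
#!/bin/sh
# Sketch of the chroot preparation from a live system.
# DRY_RUN=1 (default) only prints each command; DRY_RUN=0 executes it (needs root).
DRY_RUN="${DRY_RUN:-1}"
PART="${PART:-/dev/sda2}"   # BtrFS partition, as in the question
TARGET="${TARGET:-/mnt}"    # mount point in the live system

# Illustrative helper: print the command, or run it for real.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

prepare_chroot() {
    # Top-level BtrFS volume; the root subvolume shows up as $TARGET/@
    run mount "$PART" "$TARGET"
    # Make the live system's pseudo-filesystems visible inside the chroot
    for fs in dev dev/pts proc sys run; do
        run mount --bind "/$fs" "$TARGET/@/$fs"
    done
    # Enter the installed system; the subvol=@ remount and mount -a happen inside
    run chroot "$TARGET/@"
}

prepare_chroot
```

With DRY_RUN=0 and root privileges the same script performs the mounts and opens the chroot shell; the remount with subvol=@ and the final mount -a are still done by hand inside it.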

Then (across multiple tries) I tried several things:

reinstalled GRUB to /dev/sda (grub-install /dev/sda); tried with --recheck and without, tried grub-mkdevicemap (removing manually the USB key device from the generated device.map file);
recreated the GRUB configuration (update-grub);
upgraded the kernel, not really thinking it's a kernel version problem, but to make sure that (1) I'm trying to boot something that is surely not corrupted and (2) I have a freshly generated initrd;
checked if the kernel/initrd files were all readable (e.g. with sha1sum /boot/*), as the GRUB messages about that were scary - see later;
downgraded GRUB (which is at the latest 2.02 revision available in the repos) to a few patch revisions earlier, as I suspected that the power loss had little to do with it and that some upgrade had broken it instead;
ran btrfs check /dev/sda2 and btrfs check /dev/sdb2; both completed without errors, the only problem (reported for both) being in the free space cache of a single block (I later mounted /dev/sda2 with the clear_cache option just to be sure).
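
The readability check mentioned above (sha1sum /boot/*) can be wrapped in a small helper. This is a minimal sketch, not the exact command that was run; the function name check_readable and the BOOT_DIR variable are made up for the example.

```shell
#!/bin/sh
# Sketch: check that every regular file in a directory can be read end to end,
# similar in spirit to running sha1sum /boot/* by hand.
check_readable() {
    dir="$1"
    failed=0
    if [ ! -d "$dir" ]; then
        echo "not a directory: $dir"
        return 2
    fi
    for f in "$dir"/*; do
        [ -f "$f" ] || continue      # skip subdirectories and the like
        if ! sha1sum "$f" >/dev/null 2>&1; then
            echo "unreadable: $f"
            failed=1
        fi
    done
    return "$failed"
}

# Example run; inside the chroot, /boot is the installed system's /boot.
if check_readable "${BOOT_DIR:-/boot}"; then
    echo "all files could be read"
else
    echo "some files could not be read"
fi
```

A clean result only shows that the Linux BtrFS driver can read the files; it says nothing about GRUB's own BtrFS reader, which is a separate implementation.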

The results of all this were far from encouraging: when done correctly, it always leads me back to grub rescue, but this time with a more interesting twist:

it can read the BtrFS volume, but only barely: ls (hd0,gpt2)/ shows the subvolumes, but both ls (hd0,gpt2)/@ and ls (hd0,gpt2)/@home show empty directories (which is why it cannot find (hd0,gpt2)/@/boot/, and with it everything it needs to enter normal mode);
the most interesting thing is that there are other subvolumes, namely three snapshots made by the Ubuntu version upgrade script, and GRUB can read those; not only that: if I adjust prefix, root & co. to point inside the snapshot subvolumes, I can get into "normal" GRUB mode and, again adjusting the paths in the menu entries, boot the older kernel that was in the snapshot (the only problem is that it cannot find its modules, as I'm loading an older kernel version that is no longer installed in the current root filesystem, but given that they aren't really needed it boots just fine).
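
The prefix/root adjustment described above amounts to a handful of commands at the grub rescue> prompt. The helper below is a hedged sketch that merely prints them for a given device and subvolume; the function name is made up, SNAPSHOT_NAME is a placeholder for one of the names listed by ls (hd0,gpt2)/, and the boot/grub location is assumed to be the usual Ubuntu one.

```shell
#!/bin/sh
# Sketch: print the commands to type at the "grub rescue>" prompt so that
# GRUB loads its modules and configuration from a given subvolume.
grub_rescue_commands() {
    device="$1"   # GRUB device name, e.g. (hd0,gpt2)
    subvol="$2"   # subvolume directory at the top level of the BtrFS volume
    cat <<EOF
set root=$device
set prefix=$device/$subvol/boot/grub
insmod normal
normal
EOF
}

# SNAPSHOT_NAME is a placeholder, not a real subvolume name.
grub_rescue_commands '(hd0,gpt2)' 'SNAPSHOT_NAME'
```

Once in normal mode, the linux and initrd paths in each menu entry still have to be edited by hand to point inside the same subvolume.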

Once (when I probably botched something in the aforementioned mount dance in the chroot) I managed to reboot straight into normal mode, but GRUB found something wrong with every entry in the boot menu: for each of them it could not load either the kernel or the initrd, giving a scary "inode not found" error message (or, in another case, "couldn't find the chunk descriptor"). Both messages have no matches on Google besides the GRUB sources themselves, which is always a bad sign. Note that, as I said above, all these files were readable from inside the live USB session.

Extra thoughts:

the server motherboard has a somewhat odd problem with the SATA controller: disk enumeration seems to take a long time, which led to this bug when I upgraded to 18.04: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1752961; the only (sad) fix was to add a sleep 5 in the initrd before btrfs scan; this also led to the second disk showing up as /dev/sdc in some of the live USB runs, but it didn't seem to matter (when it happened I adjusted the device file name under /dev accordingly, which didn't seem to make any difference to the result);
I thought about bad RAM, although it felt unlikely because (1) once booted the system runs normally and (2) it's ECC RAM, so I'd expect at least some kind of error message if it were failing; anyhow, I started MemTest and it has been running for several hours without errors;
~~iLO reports a self-test error at boot, which however seems to be unrelated.~~ The error was later solved by upgrading to the latest firmware version and clearing its NVRAM; as expected, this made no difference at all, but at least the POST screen is a bit cleaner.

My current guess is something like this: the HWE kernel is using a BtrFS version with some feature incompatible with the "regular" Ubuntu 18.04 GRUB version, so all files/directories that are written "now" are potentially problematic for GRUB to read; indeed, I saw that between GRUB 2.02 and the current version (2.06) there has been some work on BtrFS stuff, although for zstd compression, that isn't enabled on my disks. Maybe trying a more recent GRUB could fix the issue? But is there some packaged GRUB 2.06 build that works for 18.04?

Long story short: any idea about what the problem might be and how to fix it?

score 1 · Answer 1 · answered Aug 04 '21 at 01:21

> any idea about what the problem might be

Not really, though I share your suspicion that grub and btrfs may not play along well.

> and how to fix it?

Have you considered saving yourself this headache and moving the /boot outside of btrfs?

You could carve out a bit of space (shrink down the swap partition by 1G - swap is overrated anyway) and get yourself a new /boot partition of this extra 1G.

For disk redundancy you can use software raid1 - grub supports it quite well.

You would lose the snapshot capability for /boot, though /boot is small enough to have some other way to recover it.
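
A rough sketch of what such a migration could look like follows. Everything specific in it is an assumption: /dev/sda4 and /dev/sdb4 stand for hypothetical partitions created in the space freed from swap, /dev/md0 and /mnt/newboot are arbitrary names, and the run helper with its DRY_RUN switch is an illustrative addition. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch of moving /boot to a small ext4 filesystem on an MD RAID1 array.
# DRY_RUN=1 (default) only prints each command; DRY_RUN=0 executes it (needs root).
DRY_RUN="${DRY_RUN:-1}"
BOOT_A="${BOOT_A:-/dev/sda4}"        # hypothetical new partition on disk 1
BOOT_B="${BOOT_B:-/dev/sdb4}"        # hypothetical new partition on disk 2
MD_DEV="${MD_DEV:-/dev/md0}"         # name chosen for the array
STAGING="${STAGING:-/mnt/newboot}"   # temporary mount point

# Illustrative helper: print the command, or run it for real.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

move_boot_to_raid1() {
    # Mirror the two small partitions and put a plain filesystem on top
    run mdadm --create "$MD_DEV" --level=1 --raid-devices=2 "$BOOT_A" "$BOOT_B"
    run mkfs.ext4 "$MD_DEV"
    # Copy the current contents of /boot, then mount the array over it
    run mkdir -p "$STAGING"
    run mount "$MD_DEV" "$STAGING"
    run cp -a /boot/. "$STAGING/"
    run umount "$STAGING"
    run mount "$MD_DEV" /boot
    # Reinstall GRUB on both disks and regenerate its configuration
    run grub-install /dev/sda
    run grub-install /dev/sdb
    run update-grub
}

move_boot_to_raid1
```

A real migration also needs an /etc/fstab entry for the new /boot and the array recorded in the mdadm configuration; those steps, and the repartitioning itself, are left out of this sketch.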

chutz


That's one of the solutions I thought of, but in that case probably I'd just redo the whole server from scratch using more boring choices for everything (MDRaid + LVM + ext4 probably), as btrfs mostly just gave me headaches and bad performance. What bugs me most is that this setup worked flawlessly for 5 years or so and then suddenly GRUB decides it doesn't like the partition anymore, even though it mounts fine on pretty much every kernel I tried - that's a thing that just shouldn't happen. – Matteo Italia Aug 08 '21 at 08:43
@MatteoItalia It's off topic now, but I was in your situation. Then I converted my machines from btrfs to ZFS when I realized that with 20.04 Ubuntu made the ZFS on root (and `/boot`) well mature and fully supported. – chutz Aug 10 '21 at 14:21
If that's finally working fine I may give it a try... Do you use it with its native mirror capabilities or with MDRaid underneath? How do you handle keeping the bootloader in sync on the various disks of the array? – Matteo Italia Aug 10 '21 at 21:27
Grub supports zfs (mirror), and the grub package automatically installs bootloader to multiple disks. I use UEFI mode though. https://openzfs.github.io/openzfs-docs/Getting%20Started/Ubuntu/Ubuntu%2020.04%20Root%20on%20ZFS.html – chutz Aug 11 '21 at 22:39
