4

I've just set up a PXE boot server on Ubuntu 14 (running kernel 3.13.0-30-generic), following the instructions at https://help.ubuntu.com/community/DisklessUbuntuHowto, on Supermicro X9DRFF hardware.

I only have remote access to the client machine via KVM. The client starts booting normally, but then I get a kernel panic.

Is it at all possible to determine the cause of this kernel panic?

[Image: screenshot of the kernel panic output on the client console]

Pro Backup
Grigory
  • Can you get the rest of it? – Michael Hampton Jul 07 '14 at 19:32
  • @MichaelHampton the output actually stops at this point, so there is nothing more to show. – Grigory Jul 07 '14 at 19:37
  • what are you trying to boot? a regular Ubuntu install or a Live distribution? – Pat Jul 09 '14 at 10:56
  • @Pat a regular Ubuntu install. It is Lubuntu 14.04 – Grigory Jul 09 '14 at 12:41
  • ok see my answer. – Pat Jul 09 '14 at 13:53
  • Gentlemen, thank you for your comments, but the problem appeared to be deeper. Please see my new question here: http://serverfault.com/questions/612024/ubuntu-14-04-pxe-boot-with-initrd-initramfs-kernel-panic – Grigory Jul 11 '14 at 20:50
  • your problem is not deeper; this question and your new question have the same answer that for some reason you refuse to take. If you want to PXE boot/install Lubuntu just do what the answer says. – Pat Jul 15 '14 at 15:19

5 Answers

3

Some of the output is missing, since it scrolled off the screen already, but it's possible to see that the kernel crashed in mount_root(). This means that it had a problem with mounting whatever you passed as the root filesystem. Check to ensure that you have passed the correct parameters to the kernel to boot from whatever media it is supposed to be booting from.

Michael Hampton
  • Hmm. The root is actually supposed to be mounted from NFS. I'll check this for sure. Thank you. – Grigory Jul 07 '14 at 19:47
  • There's a panic call in the code: `panic("VFS: Unable to mount root fs on %s", b);` so my money's on a filesystem issue too. – Nathan C Jul 07 '14 at 19:52
3

It is actually possible. You need the debug kernel for that particular distro.

On a separate host:

  • Download the debug version of that kernel. It will contain a vmlinux file.
  • Open the vmlinux file in gdb.

$ gdb /usr/lib/debug/lib/modules/3.14.9-200.fc20.x86_64/vmlinux
GNU gdb (GDB) Fedora 7.7.1-13.fc20
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/lib/debug/lib/modules/3.14.9-200.fc20.x86_64/vmlinux...done.
(gdb) 

Judging from the stack trace, the most useful function the kernel was in just before the panic was mount_block_root.

To determine where it failed, we need to feed the function name plus an offset into GDB. This is done by dereferencing the address of the function plus the offset. The stack trace supplies the offset as the first value after the function name.

For example, mount_block_root+0x225 means "I was at mount_block_root plus 549 bytes" (0x225 in hexadecimal is 549 in decimal).
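
As a rough illustration of that translation, here is a minimal Python sketch that pulls the symbol and offset out of a stack-trace entry and builds the matching gdb command. The helper names are hypothetical, and the "/0x2b0" function size in the sample input is made up for the example; only the "symbol+0xOFFSET" part matters here.

```python
import re

# Matches the "symbol+0xOFFSET/0xSIZE" form used in kernel stack traces.
# The "/0xSIZE" part (the function's length) is optional for this sketch.
FRAME_RE = re.compile(
    r"(?P<symbol>[A-Za-z_][\w.]*)\+(?P<offset>0x[0-9a-fA-F]+)(?:/0x[0-9a-fA-F]+)?"
)


def parse_frame(text):
    """Return (symbol, offset_in_bytes) for the first frame in text, or None."""
    match = FRAME_RE.search(text)
    if match is None:
        return None
    return match.group("symbol"), int(match.group("offset"), 16)


def gdb_list_command(symbol, offset):
    """Build the gdb command that lists the source around symbol+offset."""
    return "list *({}+{:#x})".format(symbol, offset)


# Sample input: the "/0x2b0" size is an invented value for illustration.
symbol, offset = parse_frame("mount_block_root+0x225/0x2b0")
print(symbol, offset)                    # mount_block_root 549
print(gdb_list_command(symbol, offset))  # list *(mount_block_root+0x225)
```

The resulting string is what gets typed at the (gdb) prompt.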

Finally, we tell GDB to print the source code around that address. On my Linux system, this results in the following:

(gdb) list *(mount_block_root+0x225)
0xffffffff81d26513 is in mount_block_root (init/do_mounts.c:422).
417                "explicit textual name for \"root=\" boot option.\n");
418 #endif
419         panic("VFS: Unable to mount root fs on %s", b);
420     }
421 
422     printk("List of all partitions:\n");
423     printk_all_partitions();
424     printk("No filesystem could mount root, tried: ");
425     for (p = fs_names; *p; p += strlen(p)+1)
426         printk(" %s", p);

From here we can tell exactly where we were at the point of the crash. Note that my kernel is not your kernel, so the offsets are probably off. Since the two kernels are likely to be nearly the same, I'll bet that the real panic actually occurs at line 419, not at line 422 as the gdb output suggests.

Reading slightly further up the code indicates it was unable to open the specified block device, but without a crash dump it's not possible to tell why from this information. So it is probably one of the following:

  • You don't actually want to mount a block device (likely).
  • You specified a non-existent block device address (or partition).
  • Your initrd does not contain the filesystem module needed to mount it.
  • There is no filesystem on the disk.
  • The superblock for the filesystem is not at the beginning of that location.

The link you referenced suggests you are trying to mount NFS as the root, in which case you should never end up in this function at all. If so, then either:

  • Your kernel command line contains multiple root directives.
  • You have mistyped your NFS address such that it does not get parsed correctly to go into the real function you want (mount_nfs_root).

So overall, based on the information in the question, I assume you have omitted something or made a typo.
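
A quick way to rule out those two mistakes is to inspect the command line before booting. The following is only a sketch of such a check in Python: check_root_args is a hypothetical helper, it does not reproduce the kernel's own parsing, and it only assumes the usual nfsroot shape of an optional server address, a colon, an absolute export path, and optional comma-separated options.

```python
def check_root_args(cmdline):
    """Return a list of human-readable problems found in a kernel command line."""
    args = cmdline.split()
    roots = [a.split("=", 1)[1] for a in args if a.startswith("root=")]
    nfsroots = [a.split("=", 1)[1] for a in args if a.startswith("nfsroot=")]

    problems = []
    if not roots:
        problems.append("no root= directive")
    if len(roots) > 1:
        problems.append("multiple root= directives: " + ", ".join(roots))
    if "/dev/nfs" in roots and not nfsroots:
        problems.append("root=/dev/nfs given without nfsroot=")
    for value in nfsroots:
        # Take what follows the first ":" (or the whole value if there is
        # none), drop any ",options" suffix, and expect an absolute path.
        path = value.split(":", 1)[-1].split(",", 1)[0]
        if not path.startswith("/"):
            problems.append("nfsroot does not look like server:/path: " + value)
    return problems


print(check_root_args("root=/dev/nfs nfsroot=192.168.1.123:/path/to/exportfs"))
print(check_root_args("root=/dev/sda1 root=/dev/nfs nfsroot=192.168.1.123/path"))
```

An empty list means nothing obviously wrong was spotted; it does not prove the line is correct.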

Matthew Ife
  • Before debugging a kernel, it is always a better idea to look around and see what's going on. You usually do not need a mass spectrometer to know what's in the meatloaf. – Pat Jul 09 '14 at 15:07
  • @Pat how would you determine what was going on in this case? – Matthew Ife Jul 09 '14 at 15:16
  • Easy: PXE boot, using NFS, and the kernel panic mentions a mounting problem. The first thing that comes to my mind is to check whether the distro has the required NFS support (most of them do), then check mount points, NFS parameters, and loop devices. If I had to debug a kernel every time I saw a kernel panic, my life would really be a nightmare. – Pat Jul 09 '14 at 15:22
  • @MatthewIfe thank you for your comments. Please see my new question: http://serverfault.com/questions/612024/ubuntu-14-04-pxe-boot-with-initrd-initramfs-kernel-panic – Grigory Jul 11 '14 at 20:52
1
  1. The quoted link does not have the complete set of parameters (append line) required for PXE booting Lubuntu 14.04.
  2. The kernel panic occurs because, as a result of 1), the mount cannot be performed correctly.

You can see how Serva constructs the correct lines here (I'm involved in Serva development): http://vercot.com/~serva/an/NonWindowsPXE3.html

Serva uses a CIFS share instead of NFS, but you could very well use NFS if you want. Of course you do not need to use Serva; you can use its parameters in your own PXE server.

[PXESERVA_MENU_ENTRY]
asset    = Lubuntu 14.04 Desktop Live
platform = amd64
kernel   = NWA_PXE/$HEAD_DIR$/casper/vmlinuz
append   = showmounts toram root=/dev/cifs initrd=NWA_PXE/$HEAD_DIR$/casper/initrd.lz,NWA_PXE/$HEAD_DIR$/casper/INITRD_N11.GZ boot=casper netboot=cifs nfsroot=//$IP_BSRV$/NWA_PXE_SHARE/$HEAD_DIR$ NFSOPTS=-ouser=serva,pass=avres,ro ip=dhcp ro

Please consider:

  1. Ubuntu/Lubuntu have a bug: if you PXE boot them using CIFS, you must add the complementary initrd INITRD_N11.GZ (freely available from Serva's page).
  2. If you are installing the 64-bit version, the parameters above require you to rename the file \casper\vmlinuz.efi to \casper\vmlinuz.
Pat
1

I had the same issue with Ubuntu 14.04 today, and it was quite obnoxious so I want to share the solution I found with the world here...

I was using pxelinux.0, NFS for the root filesystem, and TFTP for serving up the kernel image and initramfs. As mentioned above by @MatthewIfe, looking at the stack backtrace and functions being called clearly indicates this issue was occurring in a block device related function, and mount_nfs_root was never being called.

So I turned to the TFTP logs, as indicated by the author of this post, and noted my configuration file was named as:

tftproot/pxelinux.cfg/default

Also it looked like this:

DEFAULT vmlinuz
LABEL Ubuntu 14.04 Blah Blah
KERNEL vmlinuz
APPEND initrd=initrd root=/dev/nfs nfsroot=192.168.1.123:/path/to/exportfs

My iPXE loader was also looking for other files, just like in the post:

pxelinux.cfg/40709cda-a8e0-d411-8c6c-001e68e210ae
pxelinux.cfg/01-00-1e-68-e2-10-ae
pxelinux.cfg/C0A8010E
pxelinux.cfg/C0A8010
pxelinux.cfg/C0A801
pxelinux.cfg/C0A80
pxelinux.cfg/C0A8
pxelinux.cfg/C0A
pxelinux.cfg/C0
pxelinux.cfg/C
pxelinux.cfg/default
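
The order in that log follows a pattern: the client UUID, then the MAC address prefixed with the ARP hardware type "01", then the IP address in uppercase hexadecimal with one digit dropped per attempt, and finally "default". As a sketch only (pxelinux_config_candidates is a hypothetical helper, and the IP 192.168.1.14 is inferred from C0A8010E in the list above), the same list can be reproduced in Python:

```python
import ipaddress


def pxelinux_config_candidates(mac, ip, uuid=None):
    """Return the config file names in the order seen in the TFTP log."""
    names = []
    if uuid:
        names.append(uuid.lower())
    # "01" is the ARP hardware type for Ethernet, followed by the MAC
    # address written with dashes instead of colons.
    names.append("01-" + mac.lower().replace(":", "-"))
    # The IPv4 address as 8 uppercase hex digits, then progressively
    # shorter prefixes of it.
    ip_hex = "{:08X}".format(int(ipaddress.IPv4Address(ip)))
    for length in range(8, 0, -1):
        names.append(ip_hex[:length])
    names.append("default")
    return ["pxelinux.cfg/" + name for name in names]


for name in pxelinux_config_candidates(
    "00:1e:68:e2:10:ae",
    "192.168.1.14",  # inferred from C0A8010E in the log
    uuid="40709cda-a8e0-d411-8c6c-001e68e210ae",
):
    print(name)
```

This makes it easy to see which file name a given client will request first, and so which file to create for it.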

But I saw no record in the log of the initrd being pulled down, so I decided to test whether my APPEND line was being used at all. I added "panic=10", again as in the linked post, and it had no effect. So none of my kernel command-line directives were being used! On a hunch I decided to do two things -- simplify my file to match the post

DEFAULT linux
LABEL linux
KERNEL vmlinuz
APPEND root=/dev/nfs nfsroot=192.168.1.123:/path/to/exportfs initrd=initrd panic=10

and rename it to something like

tftproot/pxelinux.cfg/01-00-1e-68-e2-10-ae

And voilà -- the initrd gets pulled down, no more kernel panic, and NFS is mounted as the root filesystem properly using the default/generic kernel and initramfs. I'm sure I can change the label back, etc. I think the actual issue was with the naming of the configuration file and what pxelinux.0 expects.

Zack
0

I got it working. The problem turned out to be very simple: I was giving the PXE client a 3.13.0-30 kernel, but I had run mkinitramfs on a machine running a 3.13.0-24 kernel, so the initramfs did not match the kernel.

Once I gave the PXE client the 3.13.0-24 kernel, it worked.
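
A mismatch like this can be caught before booting by comparing the release strings of the two files. The sketch below is in Python and assumes the files keep the usual Ubuntu names (vmlinuz-<release> and initrd.img-<release>); the helper names and the "-generic" suffix on the 3.13.0-24 file are assumptions for illustration.

```python
def release_of(filename):
    """Extract the kernel release from names like vmlinuz-3.13.0-30-generic."""
    for prefix in ("vmlinuz-", "initrd.img-"):
        if filename.startswith(prefix):
            return filename[len(prefix):]
    raise ValueError("unrecognised file name: " + filename)


def versions_match(kernel_file, initrd_file):
    """True when the kernel and the initramfs were built for the same release."""
    return release_of(kernel_file) == release_of(initrd_file)


# The combination that panicked: 3.13.0-30 kernel with a 3.13.0-24 initramfs.
print(versions_match("vmlinuz-3.13.0-30-generic", "initrd.img-3.13.0-24-generic"))  # False
# The combination that worked.
print(versions_match("vmlinuz-3.13.0-24-generic", "initrd.img-3.13.0-24-generic"))  # True
```

If the files have been renamed for TFTP (for example to plain vmlinuz and initrd), the check has to be done before renaming.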

Grigory