It turns out that 'nash' does a lot of "stuff" under the covers, while with the 'busybox' all-in-one you have to do most of the hard work yourself. In my case we have a CUSTOM CentOS-5 kernel that must boot into a diskless (root-on-NFS) environment, as well as on disk-based systems with SATA, SAS, MPT RAID controllers, MegaRAID*, ... attached disks. The kernel configuration is SO custom that the standard RH/CentOS .spec build procedures barf when trying to process it and build the kernel RPM. I've massively hacked the distributed .spec file to get around these problems, and that in itself is a major problem. In trying to revert to a kernel that is as standard as possible I've done the following, and tripped over many issues, some of which I'll describe:
a) Network booting with busybox and initramfs
This works nicely. My initramfs.cpio archive contains a statically-linked bb, lvm, and a custom lightweight DHCP client, along with all of the Ethernet drivers that the system will ever boot over via the BIOS PXE boot code. The NFS client modules (sunrpc, nfs, lockd, ...) are also in the cpio archive and loaded by bb. NB: the size of the vmlinuz image that is booted via PXE/TFTP is limited to 8MB on the Dell systems that I have. The procedure is to load drivers from a list until an eth0 appears, run the DHCP client to get an IP address, netmask, hostname, and gateway, and bring the network up. The root f/s specified via the "nfsroot=server:remote-root-fs" command-line argument is then NFS-mounted onto a directory in the rootfs (/newroot), and finally 'switch_root -c /dev/console /newroot /sbin/init' transfers control from the initramfs to the NFS-based root f/s (see the sketch just below).
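Stripped down, the NFS-boot path of my busybox /init looks something like the sketch below. The module names, paths, and the 'my-dhcp-client' invocation are placeholders, not my exact script:

    #!/bin/sh
    # Sketch of the busybox /init for the NFS-root case (placeholders throughout).
    mount -t proc proc /proc
    mount -t sysfs sysfs /sys
    mount -t tmpfs tmpfs /dev
    mdev -s                            # populate /dev (assumes mdev is built into bb)

    # Load NIC drivers from a list until an eth0 appears.
    for m in tg3 bnx2 e1000; do
        insmod /lib/modules/$m.ko 2>/dev/null
        [ -d /sys/class/net/eth0 ] && break
    done

    # NFS client modules.
    for m in sunrpc lockd nfs; do
        insmod /lib/modules/$m.ko
    done

    # Get IP/netmask/gateway/hostname and bring eth0 up, then mount the
    # root f/s named by nfsroot=server:remote-root-fs on the command line.
    my-dhcp-client eth0                # placeholder for the custom client
    NFSROOT=$(sed -n 's/.*nfsroot=\([^ ]*\).*/\1/p' /proc/cmdline)
    mount -t nfs -o ro,nolock "$NFSROOT" /newroot

    # Hand control over to the real init on the NFS root.
    exec switch_root -c /dev/console /newroot /sbin/init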
The only difference here is that I supply a cpio archive to be built into the kernel. In all other respects the kernel image and modules are as supplied by RH/CentOS. Everything that can be built as a module (and that's almost everything) is!
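For what it's worth, building the archive into the kernel is just a matter of pointing CONFIG_INITRAMFS_SOURCE at the cpio (the kernel build compresses and embeds it itself); the commands below are a sketch with made-up paths:

    # Generate the cpio archive from a staging tree (paths are examples only).
    cd /build/initramfs-root
    find . | cpio -o -H newc > /build/initramfs.cpio

    # In the kernel .config the archive is then embedded via:
    #   CONFIG_INITRAMFS_SOURCE="/build/initramfs.cpio"
    # and the kernel image is rebuilt as normal.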
b) Disk booting via initramfs and nash
Why switch from busybox to 'nash' when booting to a locally-attached (no iSCSI/FC/etc) disk or RAID/LVM setup? The simple answer is that I cannot figure out the recipe to get the job done using bb alone, so my cpio archive also includes the nash binary along with a reduced init.nash startup script. Once I've determined that we're not booting over the network (no "nfsroot=..." on the kernel command line) I simply load the appropriate modules, then 'exec' the nash script and let it do the final stages of the bring-up (a sketch of that hand-off follows the script below). The bits that nash does that I could NOT figure out how to replicate using busybox are:
    mkrootdev -t ext3 -o defaults,ro /dev/VolGroup00/NAS_ROOTFS
    echo Mounting root filesystem.
    mount /sysroot
    echo Setting up other filesystems.
    setuproot
    echo Switching to new root and running init.
    switchroot
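For completeness, the hand-off from the busybox side to that nash script looks roughly like the sketch below. The module list, the LVM activation step, and the script path are illustrative guesses, not my exact init:

    # Disk-boot branch of the busybox /init (sketch; module names are examples).
    if ! grep -q nfsroot= /proc/cmdline; then
        for m in mptsas megaraid_sas ahci sd_mod dm_mod ext3; do
            insmod /lib/modules/$m.ko 2>/dev/null
        done
        # Assumption: the root LV has to be activated somewhere before nash's
        # mkrootdev/mount can see /dev/VolGroup00/NAS_ROOTFS; the static lvm
        # in the archive can do it here.
        lvm vgscan --ignorelockingfailure
        lvm vgchange -ay --ignorelockingfailure
        # init.nash starts with a '#!' line pointing at the nash binary in the
        # archive and contains the seven-line script above.
        exec /init.nash
    fi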
In particular, the last two nash builtins, 'setuproot' and 'switchroot', do something that I cannot replicate, despite having tried. If I ignore what 'setuproot' does and instead do what I think should be done (create an fstab entry for the /sysroot mount, transfer any logical-volume/device-mapper info from /dev to /sysroot/dev, etc.) and then 'switch_root /sysroot /sbin/init', it appears to work but then crashes once rc.sysinit gets a short way into processing the startup commands. In the emergency shell I see that /dev/mapper/ is empty apart from the control file, and that the /dev/VolGroup00/ directory, which normally has softlinks to the LVM device nodes in /dev/mapper/, has completely disappeared. Why this happens with busybox and NOT nash I do not know, and for the moment I'll just live with exec'ing a minimal nash script to do the last bit of setuproot and switchroot to the real on-disk root f/s.
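For the record, the closest busybox-only approximation of setuproot + switchroot that I can come up with is the untested sketch below (it assumes the initramfs /dev is a tmpfs and that the busybox build has the mount "move" option compiled in). Moving the initramfs /dev, with its device-mapper nodes and VolGroup00 softlinks, into the new root before switch_root deletes the old rootfs contents is my best guess at what nash's setuproot is doing for me and what my own attempt was missing:

    # Rough busybox stand-in for nash's setuproot + switchroot (untested sketch).
    mount -t ext3 -o defaults,ro /dev/VolGroup00/NAS_ROOTFS /sysroot

    # setuproot, roughly: give the new root live /proc and /sys, and hand it the
    # initramfs /dev that currently holds /dev/mapper/* and /dev/VolGroup00/*.
    mount -o move /proc /sysroot/proc
    mount -o move /sys  /sysroot/sys
    mount -o move /dev  /sysroot/dev

    # switchroot, roughly: make /sysroot the new / and exec the real init there.
    exec switch_root -c /dev/console /sysroot /sbin/init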