
I have an old server running Solaris 10 x86 (64-bit). I haven't had issues with it, but it was recently powered off without a clean shutdown. That hasn't been a problem in the past when rebooting from the power port, but this time I'm stuck.

This is what I see when it boots up (part of the left side is cut off because of the KVM software):

[Screenshot: boot screen where the system hangs]

It doesn't do anything and doesn't respond to any keyboard commands.

I rebooted into failsafe mode and got an error about a corrupt boot_archive, so I had it rebuild the archive. After that was done I tried rebooting, but the issue was the same.

I rebooted into failsafe mode again and checked the disk for errors (format > analyze > read) and didn't find any. I also ran fsck on the drive, and it found nothing.

Then I tried to reinstall GRUB:

cd /a/boot/grub
installgrub -fm stage1 stage2 /dev/rdsk/c0d0s0

Then I rebuilt the boot archive again:

bootadm update-archive -fv -R /a

The output was:

Forced update of archive requested
Cannot find: /a/etc/cluster/nodeid: No such file or directory
Cannot find: /a/etc/devices/mdi_ib_cache: No such file or directory
Creating ram disk for /a
Updating /a/platform/i86pc/boot/boot_archive...this may take a minute

Finally, I unmounted /a and rebooted:

umount /a
reboot

No improvement. Nothing in /a/var/adm/messages since the shutdown. Any other ideas or suggestions on where I can look for next steps?

I set the verbose option in GRUB and now see the following before the boot stops:

[Screenshot: verbose boot output up to the point where the boot stops]

  • Boot in verbose mode. Select and edit the GRUB line you want to boot. See https://docs.oracle.com/cd/E19082-01/819-2379/fwbme/index.html Add a `-v` to that line, similar to https://docs.oracle.com/cd/E26502_01/html/E28983/glyas.html (that's for Solaris 11, but the `-v` option is the same) – Andrew Henle Oct 04 '16 at 10:37
  • Thanks @AndrewHenle at least I'm seeing something now. Not sure what I'm seeing as I've been up all night and haven't worked on this level of this system in a long time. I added a screenshot. Anything jump out at you? – OrganicLawnDIY Oct 04 '16 at 11:35

1 Answer


While the boot archive was indeed corrupted and needed to be rebuilt, there was also a different problem. The steps I used to fix the corrupted boot_archive were correct.

Thanks to Andrew's comment on my question, I was able to turn on verbose output and see where the system was hanging.

From the GRUB menu I selected the entry I normally boot, pressed 'e' to edit it, pressed 'e' again on the kernel line, and added -v to the end. I then pressed Enter to save the edit and 'b' to boot the edited entry.
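For illustration, the edit might look like the fragment below. This is only a sketch: the kernel path shown is the usual one on Solaris 10 x86 and is an assumption, since my actual entry isn't reproduced here. The lines starting with # are annotations and are not typed into GRUB.

```
# kernel line as it appears in the GRUB edit screen (assumed typical Solaris 10 x86 entry)
kernel /platform/i86pc/multiboot

# same line with verbose boot enabled
kernel /platform/i86pc/multiboot -v
```

An edit made this way applies only to the current boot; the entry in menu.lst is left unchanged.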

The device that was hanging was pci1458,5004, and after looking through /etc/device_aliases I was able to see that it was the USB controller. After some searching I found a suggestion to go into the BIOS and disable Legacy USB Support. After doing that, the system booted normally. A new device had been connected to the server, which must have caused the issue.
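If you need to track down a device ID from the verbose output, something like the following may save some scrolling. It is a rough sketch, not exactly what I ran: find_device is a hypothetical helper name, and the file list in the example call is an assumption (/etc/path_to_inst and /etc/driver_aliases are the files that normally map devices to drivers on Solaris).

```shell
# Hypothetical helper: print every line that mentions a device ID
# in the files given as the remaining arguments.
find_device() {
    id=$1
    shift
    # Plain grep is enough here: an ID such as pci1458,5004 contains no
    # regex metacharacters. Errors for missing files are suppressed.
    grep "$id" "$@" 2>/dev/null
}

# Example call from failsafe mode with the root file system mounted on /a
# (the file list is an assumption, adjust as needed):
# find_device 'pci1458,5004' /a/etc/path_to_inst /a/etc/driver_aliases
```

With more than one file, grep prefixes each match with the file name, so you can see which file the driver name came from.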