Debian Server Kernel Panic - Increasing Frequency

Question

The server first crashed with a kernel panic a week ago, then rebooted, ran fsck, and came back up as normal.

It happened again this morning: same panic, reboot, fsck, then a normal boot.

However, it has now happened again today. I asked the hosting company for the actual panic message and got the following, which seems to mention the ext3 file system. If anyone could help decode what exactly this means and what the issue might be, that would be great:

Kernel Panic Error 1

Kernel Panic Error 2

For some reason neither image is appearing, so here are the two URLs: http://i.stack.imgur.com/hjOZ5.jpg and http://i.stack.imgur.com/NrHwr.jpg

Have you done a RAM test (http://www.memtest.org/) and/or does the server have ECC RAM? While it's possible you have a genuine software issue I typically rule out bad RAM first when dealing with panics. — voretaq7, Feb 23 '11 at 20:34
could be legit then, though I've got no idea from what off the top of my head :-/ — voretaq7, Feb 23 '11 at 21:10

score 1 · Answer 1 · answered Feb 23 '11 at 21:51


If RAM is ruled out, look in the 'dmesg' output for disk-related messages; you may have a failing drive. Are you using RAID? Try running a SMART self-test on the drives:

smartctl -t short /dev/sdXX

Wait a couple of minutes, then run

smartctl -a /dev/sdXX

to read the results.
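
As a rough sketch of the checks above, two small shell filters can pull the interesting lines out of the `dmesg` and `smartctl -A` output. The function names and the grep pattern list are my own assumptions, not part of smartmontools or the kernel, and the attribute names are the ones commonly reported for ATA drives; your drive may expose different ones.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative only.

# Read kernel log text on stdin and keep lines that commonly accompany
# disk trouble. The pattern list is an assumption and is not exhaustive.
disk_errors() {
    grep -iE 'I/O error|EXT[0-9]-fs error|journal commit|medium error'
}

# Read `smartctl -A` output on stdin and print the attributes most often
# associated with a failing disk when their raw value (column 10) is non-zero.
suspect_attrs() {
    awk '$2 ~ /^(Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable)$/ && $10 + 0 > 0 { print $2, $10 }'
}

# Typical use, as root, with /dev/sdX replaced by the real device:
#   dmesg | disk_errors
#   smartctl -t short /dev/sdX && sleep 120
#   smartctl -H /dev/sdX
#   smartctl -A /dev/sdX | suspect_attrs
```

Non-zero raw values for these attributes usually suggest the drive is remapping or failing to read sectors, but treat the result as a hint rather than a verdict.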

answered Feb 23 '11 at 21:51

wazoox


It died again whilst the smartctl command was running. Just before it did so I received 'kernel:[ 9777.519387] journal commit I/O error'. Looking at the output of `dmesg` I notice 'EXT3-fs: sda1: orphan cleanup on readonly fs' and six 'ext3_orphan_cleanup: deleting unreferenced inode 6062133'. Would this point towards a failing drive? – Joe Feb 23 '11 at 22:36
If this is on a dedicated server: don't rule out RAID card or motherboard chip trouble. – DutchUncle Feb 24 '11 at 00:00
Indeed. I think a server move may be in order, considering that the provider doesn't seem to want to make any hardware changes until something actually blows up. – Joe Feb 24 '11 at 10:55
Sorry for the late answer but yes, one of your drives is dying, obviously. Replace it as soon as possible. – wazoox Mar 01 '11 at 12:33
