CentOS 5.10 / VMware ESXi 5.1
I've got an older email server running CentOS 5.10 (with Sendmail) that intermittently hangs and becomes completely unresponsive. During a hang I can't connect to it over the network, and the virtual console doesn't respond either.
The strange part is that our VMware admin group isn't seeing any resource or load spikes that would point to insufficient resources. Furthermore, when I examine the system logs (maillog, messages, etc.) there is no log activity at all during the hang, which suggests that these outages are severe enough to prevent logging (or perhaps that there's a filesystem/disk issue).
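To pin down exactly when the hangs start and end, the silent stretches in the logs can be located mechanically. Here is a rough sketch of how I'd do that, assuming the traditional syslog line format ("Mon DD HH:MM:SS host ...") and Python 3 on another machine working from a copy of the log (the box itself only has the old system Python). The function names, the 5-minute threshold, and the command-line arguments are my own choices for illustration, not anything standard.

```python
import sys
from datetime import datetime, timedelta


def parse_syslog_time(line, year):
    """Parse the leading 'Mon DD HH:MM:SS' stamp of a traditional syslog line."""
    # Classic syslog stamps are 15 characters wide and carry no year,
    # so the caller has to supply one (assumption: the log stays within one year).
    stamp = line[:15]
    return datetime.strptime("%d %s" % (year, stamp), "%Y %b %d %H:%M:%S")


def find_gaps(lines, year, threshold=timedelta(minutes=5)):
    """Return (last_seen, next_seen) pairs where the log was silent longer than threshold."""
    gaps = []
    previous = None
    for line in lines:
        try:
            current = parse_syslog_time(line, year)
        except ValueError:
            # Skip lines that do not start with a parseable timestamp.
            continue
        if previous is not None and current - previous > threshold:
            gaps.append((previous, current))
        previous = current
    return gaps


# Hypothetical usage: python find_gaps.py /path/to/copy/of/messages 2014
if __name__ == "__main__" and len(sys.argv) == 3:
    log_path, log_year = sys.argv[1], int(sys.argv[2])
    with open(log_path, errors="replace") as handle:
        for start, end in find_gaps(handle, log_year):
            print("no log entries between %s and %s" % (start, end))
```

On a busy mail server a 5-minute silence should be unusual, but the threshold is a guess and would need tuning to the normal traffic level. Running it against both maillog and messages and comparing the windows would show whether everything goes quiet at the same moment.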
The one abnormality is that the Sendmail log level on the box was very high (98 instead of the usual 9). I'm going to set it back to normal shortly.
I'm stumped on where to look for more information. Is there something like a thread dump that would tell me what the OS was working on during the hang?
Additional information:
- Kernel version is:
2.6.18-371.4.1.el5 #1 SMP Thu Jan 30 06:09:24 EST 2014 i686 i686 i386 GNU/Linux
- The storage is handled on a shared SAN.
- VMware Tools is not installed on the system, as per internal policy. However, we've been running without it for a long time, so we don't think its absence is necessarily the root cause.
- Specific version of VMware is: VMware ESXi 5.1.0 build-2000251
- Hardware is IBM 3850 M2, Model 7233AC1