I have a standalone ESXi 5.5.0 host (b2143827). It is running on a Dell R710 with 144GB of RAM and has approximately 20 VMs on it.
Right now, I cannot get onto the host via the VMware vSphere Client or SSH. It just acts as if the server does not exist. The host comes back at seemingly random times, and I can then get onto it via SSH and the vSphere Client, but at some undetermined point it goes off the network again. I can still access it through the emergency console on the physical host itself (Alt+F1).
However, all the VMs are active and working. But about 10 times a day, all the VMs drop off the network for between 15 seconds and 5 minutes. Then they come back just fine and everything keeps on ticking.
I have done the following:
- It was on a previous build; I updated it to b2143827. This made no difference.
- /sbin/services.sh restart - this does not help the situation.
- Restarted the physical host. This made no difference.
- From the physical console (Alt+F1), I have pinged another physical device on the network. It does not drop any packets at all.
- From the physical console, I have pinged a virtual machine on the host. It suffers approximately 80% loss.
- From a remote machine, I can ping the management IP address with 0% packet loss.
- From a remote machine, I can ping a VM on the host and can see the host clearly go off and back on the network occasionally.
- I watched tail -f /var/log/hostd.log for a while and saw nothing untoward happening there.
- The system is installed on an SD card. I have shut the server down, dd'd the card to another card, then booted it on the new card. Same issue.
- Tried a different network switch.
- Ran the Dell Update Manager and updated every single firmware to the latest version.
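To make the remote ping checks above repeatable, and to get exact timestamps for each drop that can be lined up against the host logs, something like the following could be left running on a remote machine. This is only a rough sketch: it assumes a Linux box with iputils ping (where -W is a reply timeout in seconds), and the function names probe and watch_host and the address 192.0.2.10 are placeholders, not part of the actual setup.

```shell
#!/bin/sh
# Hypothetical helper: send one ICMP echo with a 1-second timeout.
# Returns 0 if the target answered, non-zero otherwise.
# Assumes iputils ping on Linux (-c count, -W timeout in seconds).
probe() {
    ping -c 1 -W 1 "$1" >/dev/null 2>&1
}

# Hypothetical helper: watch_host HOST [COUNT] [INTERVAL]
# Prints one "timestamp host up|down" line per probe.
# COUNT=0 (the default) means run until interrupted.
watch_host() {
    host=$1
    count=${2:-0}
    interval=${3:-1}
    i=0
    while [ "$count" -eq 0 ] || [ "$i" -lt "$count" ]; do
        if probe "$host"; then
            state=up
        else
            state=down
        fi
        printf '%s %s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$host" "$state"
        i=$((i + 1))
        sleep "$interval"
    done
}

# Example (placeholder address): log a VM's reachability once per second.
# watch_host 192.0.2.10 >> vm-reachability.log
```

Running one instance against the management IP and another against a VM on the host would show whether the two drop at the same moments or independently.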
I'm at a loss as to where to go from here. This server has operated flawlessly for the past 2.5 years. ESXi used to be installed on a physical drive, but 6 months ago it was moved onto the SD card so we could reconfigure the physical drives.