
Strangely, I have a virtual machine that keeps freezing for several seconds or even minutes at a time.

On the OS side, CPU and memory usage are quite low. On the VMware side it's the same: the machine never uses more than 50% CPU. All other machines on the same VMware ESXi server are running perfectly fine.

After hours of trying to figure out what could be causing this problem, I decided to ask whether anyone has an idea. Of course I also searched the web intensively, and a colleague was also unable to find any hint.

What makes it difficult is that it could be anything, either on the Windows side or on the VMware side, as I don't see any graph or number showing a problem.

Thanks a lot in advance for any hint or help!

P.S. Version details: guest OS: Windows XP, running "Symantec Endpoint Antivirus Console"; host: VMware ESXi 4.1.0

These are the latency stats

db_ch
  • Probably a disk I/O problem. What's the datastore latency and the physical storage latency like? I'd also remove Symantec to see whether that makes any difference. It would be better to run vShield and not run AV on VMs. – Reality Extractor Aug 14 '14 at 16:43
  • Disk I/O: Write latency is between 60 and 300ms. Read latency between and 200. When I say "between" it means that I have a one-minute peak every 2 minutes and then a minimum, then it goes to almost zero for some dozens of seconds and then a peak again. – db_ch Aug 15 '14 at 08:17
  • I must add that the software installed is the Symantec console (which comes with MS-SQL light), used to manage the machines on the network. But it doesn't seem to use a lot of CPU. – db_ch Aug 15 '14 at 08:19
  • I added a screenshot of disk I/O latency. – db_ch Aug 15 '14 at 08:22
  • >> Windows XP, running "Symantec Endpoint Antivirus Console" You're running your central security application on an obsolete and unsupported OS? That's... interesting. – BlueCompute Aug 15 '14 at 13:00
  • @BlueCompute: yes, you're right. The thing is that we were trying to solve this problem first. But in a sense you're right, we should reinstall it on a newer OS! – db_ch Aug 15 '14 at 13:36
  • Which will also 'solve', or at least sidestep this issue. Just reinstall on a supported OS I would. – BlueCompute Aug 15 '14 at 14:14
  • The latencies are high but probably not high enough to explain the freezing. Do the latency spikes coincide with the freezing though? Also, what's the underlying storage? And, is this a production or a lab system? – Reality Extractor Aug 18 '14 at 15:15
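One way to answer the question of whether the latency spikes coincide with the freezes is to compare timestamps. The following is a minimal sketch, not a VMware tool: it assumes the latency samples have been exported by hand as `(seconds, latency_ms)` pairs and that the freeze times were noted manually. The function names, the 60 ms threshold (the lower bound of the write latency reported above) and the 30-second window are all illustrative assumptions.

```python
def spike_times(samples, threshold_ms=60.0):
    """Return timestamps (seconds) of samples whose latency exceeds the threshold.

    threshold_ms is an assumed cut-off, not a VMware-defined limit.
    """
    return [t for t, latency in samples if latency > threshold_ms]


def freezes_near_spikes(freezes, spikes, window_s=30.0):
    """Return the freeze timestamps that lie within window_s seconds of any spike."""
    return [f for f in freezes if any(abs(f - s) <= window_s for s in spikes)]


if __name__ == "__main__":
    # Hypothetical data: (seconds since start, latency in ms) and observed freeze times.
    samples = [(0, 5), (60, 250), (120, 3), (180, 300)]
    freezes = [70, 130, 400]
    spikes = spike_times(samples)
    matched = freezes_near_spikes(freezes, spikes)
    print(f"{len(matched)} of {len(freezes)} freezes fall near a latency spike")
```

If most freezes land near a spike, storage is a plausible suspect; if they do not, the guest OS is the more likely place to look.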

1 Answer


Check the VMware syslog. An easy way is to move /scratch/log to a datastore in the advanced configuration, then download the syslog using the file manager in vSphere.

You can also use the syslog collector: http://blogs.vmware.com/vsphere/2011/07/setting-up-the-esxi-syslog-collector.html

If the problems are specific to one guest and the hosts are fine, that points towards a guest OS software problem rather than anything to do with VMware. Read the event logs on the guest operating system for errors that might indicate driver problems. Consider upgrading the VM version and VMware Tools. If it is just a Symantec console, then, if the problem persists, you can build a new server, reinstall the software and re-add all the clients.

  • I checked the logs, but there are so many lines that I don't know what to look for (20,000 log lines each hour). I have a great many messages like "GuestInfo changed 'guest.disk'", "Received callback in WaitForUpdatesDone" and things like that... In the guest logs (the logs that are stored alongside the machine's vmx file) I only see two lines that seem to be recurrent: 1) vmx| GuestRpcSendTimedOut: message to toolbox timed out. 2) mks| SVGA: Restoring cursor bypass 3 from vm which took 3->2->3 roundtrip – db_ch Aug 15 '14 at 08:12
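With that many lines per hour, it may help to group identical messages and count them, so that the rare ones stand out. The following is a rough sketch under stated assumptions: it works on plain text lines from a downloaded log file, the `NOISE` list simply reuses the two chatty messages quoted above, and the function names are made up for illustration. It does not rely on any particular VMware log format.

```python
import re
from collections import Counter

# Assumed noise markers, taken from the messages quoted in the comment above.
NOISE = ("GuestInfo changed", "Received callback in WaitForUpdatesDone")


def normalize(line):
    """Replace hex values and digit runs so repeated messages group together."""
    line = re.sub(r"0x[0-9a-fA-F]+", "<hex>", line)
    return re.sub(r"\d+", "<n>", line).strip()


def summarize(lines, noise=NOISE, top=10):
    """Return the most common normalized messages, skipping blank and noisy lines."""
    counts = Counter(
        normalize(line)
        for line in lines
        if line.strip() and not any(marker in line for marker in noise)
    )
    return counts.most_common(top)


if __name__ == "__main__":
    # Sample lines copied from the comment above; a real run would read a log file.
    sample = [
        "vmx| GuestRpcSendTimedOut: message to toolbox timed out.",
        "GuestInfo changed 'guest.disk'",
        "mks| SVGA: Restoring cursor bypass 3 from vm which took 3->2->3 roundtrip",
        "vmx| GuestRpcSendTimedOut: message to toolbox timed out.",
    ]
    for message, count in summarize(sample):
        print(count, message)
```

Because timestamps and counters are collapsed to `<n>`, lines that differ only in their numbers are counted as one message, which keeps the summary short.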