About a year ago, we virtualized most of our small-business Windows and Linux infrastructure, which included building a fresh (not P2V) Server 2008 R2 VM. It performed acceptably until a few weeks ago, when the following changes were made:
- Xen was upgraded from 5.5 to 5.6FP2
- The 2k8R2 C: drive was expanded by 10 GB in XenCenter, and the volume was then extended in Windows to include the added space.
- Pagefiles were relocated from another virtual drive to the expanded C:
Since these changes, the server has become unresponsive a number of times. When this happens:
- The application and website that query the one database this machine hosts give timeout errors.
- RDP connections either never begin or never render the login screen.
- The virtual console in XenCenter can't authenticate if the GUI was locked; if a session was already open, any interaction (except mouse tracking) results in errors about the system being unresponsive.
My SNMP monitoring still reports the server and the SQL service as available, but any attempt to restart the server through proper means (XenCenter, shutdown /i from another machine, or the virtual console if I can interact with it) fails. The only way to recover is a "force reboot" from XenCenter.
Troubleshooting steps I've taken so far:
- Increased RAM allocation
- Moved to alternate Xen host
- Installed MS KB979149
- Brought up another virtual drive and moved all paging to it
- Set up a nightly reboot (just yesterday)
Any ideas on what sort of monitoring I should set up to find out what is happening, or on any known issues that could lead to this?