Introduction:
Time in an OS like Linux is typically derived from a clock chip (RTC), or maintained in software using either periodic interrupts or hardware registers (e.g. the CPU's TSC cycle counter).
Obviously in a virtual machine there is no direct hardware access (e.g. to the RTC), so keeping the correct time may be tricky.
Specifically I'm wondering about two of the POSIX clocks: CLOCK_REALTIME and CLOCK_MONOTONIC (there are more).
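To make the two clocks concrete, here is a minimal sketch that reads both from user space. It assumes a Unix-like guest where Python's `time.clock_gettime` is available; the helper name `read_clocks` is made up for illustration.

```python
import time


def read_clocks():
    """Return one (realtime, monotonic) pair of readings, in seconds."""
    # CLOCK_REALTIME: wall-clock time since the Unix epoch; it can be stepped.
    realtime = time.clock_gettime(time.CLOCK_REALTIME)
    # CLOCK_MONOTONIC: time since an unspecified starting point; it is never
    # set backwards, so only differences between readings are meaningful.
    monotonic = time.clock_gettime(time.CLOCK_MONOTONIC)
    return realtime, monotonic


if __name__ == "__main__":
    rt, mono = read_clocks()
    print(f"CLOCK_REALTIME : {rt:.6f}")
    print(f"CLOCK_MONOTONIC: {mono:.6f}")
```

Only the difference between two CLOCK_MONOTONIC readings carries meaning, which is why it is the usual choice for measuring durations.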
Disturbances
There are two major "disturbances" I'm considering:
- "CPU overcommitting": giving more virtual CPUs to VMs than there are physical ones
- "Live Migration": Moving a VM from one machine to another "without" affecting operations
Normal operation
Processes running in an operating system on bare hardware are interrupted only by the operating system (which then has control). So the operating system can keep the time easily.
VM operation
An operating system running in a VM does not continuously have control over the CPU. For example, while the OS "does not have the CPU", it cannot process timer interrupts. In turn, timer interrupts could be lost completely, be delayed by some seemingly random amount (jitter), or maybe even be processed in rapid sequence (catching up on "delayed" interrupts). As a result, the clock would not progress as linearly as expected.
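One rough way to observe such delays from inside a guest is to sleep for a fixed interval repeatedly and record how much longer each sleep actually took. This is only a sketch: the function name and default values are illustrative, and it relies on the guest's own monotonic clock, so it only shows delays that this clock accounts for. If the clock itself stands still while the VM is descheduled, nothing will show up here.

```python
import time


def measure_sleep_overshoot(interval=0.01, count=20):
    """Sleep `count` times for `interval` seconds each.

    Returns a list with the overshoot of every sleep (measured duration
    minus requested duration), in seconds, according to time.monotonic().
    """
    overshoots = []
    for _ in range(count):
        start = time.monotonic()
        time.sleep(interval)
        elapsed = time.monotonic() - start
        overshoots.append(elapsed - interval)
    return overshoots


if __name__ == "__main__":
    samples = measure_sleep_overshoot()
    print(f"max overshoot:  {max(samples) * 1000:.3f} ms")
    print(f"mean overshoot: {sum(samples) / len(samples) * 1000:.3f} ms")
```

Some overshoot is normal even on bare hardware because of ordinary scheduling; large or highly variable values would merely hint at the jitter described above.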
Choices
CLOCK_REALTIME
: If the OS is missing CPU, the real-time clock could either be slowed down (lag behind), or jump forward occasionally to keep up.
CLOCK_MONOTONIC
: If the OS is missing CPU, the monotonic clock could either be slowed down (in relation to other VMs or wall time), or jump forward occasionally to keep up.
Effects
CLOCK_REALTIME
: Obviously if the real-time clock is slow, it cannot be used as an absolute timing measure, but it would look consistent within the VM. If the clock keeps up by jumping forward variable amounts of time, it could be used as an absolute measure, but it would be bad for measuring any performance (duration) within the VM.
CLOCK_MONOTONIC
: Advancing the monotonic clock only if the VM "has the CPU" will provide a consistent view of elapsed time within the VM. Making the clock jump forward variable amounts of time would prevent use for performance (duration) measurements within the VM.
Live-Migration
When live migration requires copying gigabytes of RAM from one node to another, there will be some "freezing time" when the VM cannot run, let's say 3 seconds.
Now should the real-time clock also jump forward by 3 seconds, or should it lose those three seconds until being corrected manually or automatically at some later time? Likewise, when the monotonic clock is being used to measure "uptime", should it take those three seconds into account by adding them, or should it only account for the time when the VM actually had the CPU?
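Whether the two clocks are treated differently across such a freeze can be checked from inside the guest by sampling both and watching the offset between them. The sketch below works on a list of already collected (realtime, monotonic) pairs; the function name and the 0.5 second threshold are arbitrary choices for illustration. It can only reveal cases where the clocks diverge: if both jump or both stand still, the offset stays the same.

```python
def find_realtime_steps(samples, threshold=0.5):
    """Find points where CLOCK_REALTIME moved relative to CLOCK_MONOTONIC.

    samples:   list of (realtime, monotonic) pairs in seconds, in the order
               they were taken.
    threshold: minimum change of the offset, in seconds, to report.

    Returns a list of (index, change) tuples, where `change` is how much
    the offset (realtime - monotonic) differs from the previous sample.
    """
    steps = []
    for i in range(1, len(samples)):
        prev_offset = samples[i - 1][0] - samples[i - 1][1]
        cur_offset = samples[i][0] - samples[i][1]
        change = cur_offset - prev_offset
        if abs(change) > threshold:
            steps.append((i, change))
    return steps


# Made-up readings: between the 2nd and 3rd sample the real-time clock
# advanced 4 seconds while the monotonic clock advanced only 1 second.
example = [(1000.0, 10.0), (1001.0, 11.0), (1005.0, 12.0), (1006.0, 13.0)]
print(find_realtime_steps(example))  # [(2, 3.0)]
```

With the 3-second freeze from above, a positive change of about 3 would mean the real-time clock was stepped forward while the monotonic clock only counted the time the VM was running.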
Over-committing CPU
Like above, but there are more frequent short delays instead of occasional larger ones.
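On a Linux guest, the aggregate `cpu` line of `/proc/stat` has a "steal" field that counts time the virtual CPU was ready to run but the hypervisor ran something else. Whether a particular hypervisor actually reports it is an assumption here. The sketch below parses two such lines and computes the fraction of time stolen in between; the helper names and the sample numbers are invented for illustration.

```python
def parse_cpu_steal(stat_line):
    """Return (steal, total) in clock ticks from the aggregate 'cpu' line."""
    fields = stat_line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line of /proc/stat")
    values = [int(v) for v in fields[1:]]
    # Field order: user nice system idle iowait irq softirq steal guest guest_nice
    steal = values[7] if len(values) > 7 else 0
    # guest and guest_nice are already counted in user and nice, so only
    # the first eight fields are summed.
    total = sum(values[:8])
    return steal, total


def steal_fraction(before_line, after_line):
    """Fraction of CPU time stolen between two 'cpu' lines (0.0 to 1.0)."""
    steal0, total0 = parse_cpu_steal(before_line)
    steal1, total1 = parse_cpu_steal(after_line)
    elapsed = total1 - total0
    if elapsed <= 0:
        return 0.0
    return (steal1 - steal0) / elapsed


# Invented sample lines, not taken from a real system.
before = "cpu  100 0 50 800 10 0 5 35 0 0"
after = "cpu  150 0 70 1500 20 0 10 250 0 0"
print(steal_fraction(before, after))
```

On a real guest the two lines would come from reading the first line of `/proc/stat` at two points in time. A persistently high fraction would suggest that the host is overcommitted.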
Questions
What approach does Xen use?
How does VMware handle that? Are there configurable options? (I know that in Xen the VMs can be synced from the hypervisor, or run independently, e.g. synced externally using NTP.)
Are there any "best practices"?