
I have a weird problem with some servers here at work. We have a few Xen guests whose current time fluctuates.

# date;date;date;date;date;date;date
Thu Feb 25 16:00:40 PHT 2010
Thu Feb 25 16:00:48 PHT 2010
Thu Feb 25 16:00:40 PHT 2010
Thu Feb 25 16:00:48 PHT 2010
Thu Feb 25 16:00:40 PHT 2010
Thu Feb 25 16:00:48 PHT 2010
Thu Feb 25 16:00:40 PHT 2010

As seen above, the time jumps back and forth between 16:00:40 and 16:00:48. This is a problem for us because the time differences computed in some of our scripts become inaccurate (e.g. what should be a difference of a few milliseconds becomes a few seconds, and sometimes even a negative value).
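A rough way to quantify the jumps is to sample the clock many times and count how often a reading is earlier than the one before it. This is only a sketch: it assumes GNU date (for the %N nanoseconds format) and a shell with 64-bit integer comparison, and the function names are made up for illustration.

```shell
# count_backward_steps: read one integer timestamp per line on stdin and
# print how many readings are earlier than the reading just before them.
count_backward_steps() {
    prev=
    n=0
    while read -r t; do
        if [ -n "$prev" ] && [ "$t" -lt "$prev" ]; then
            n=$((n + 1))
        fi
        prev=$t
    done
    echo "$n"
}

# sample_clock N: print N wall-clock readings in nanoseconds since the epoch.
# Assumption: GNU date, which understands %N.
sample_clock() {
    i=0
    while [ "$i" -lt "$1" ]; do
        date +%s%N
        i=$((i + 1))
    done
}
```

Running `sample_clock 1000 | count_backward_steps` on an affected guest and on the host gives a number to compare before and after any change, instead of eyeballing the output of repeated date calls.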

The problematic servers are Linux guests on a Xen host. The time fluctuates on the guest systems but is fine on the host itself. I've ruled out ntpd, since this happens regardless of whether ntpd is running on the guests.

The guests are fully virtualised. The time on the host and the guest matches, except that the guest's time fluctuates by a few seconds around the host's time, while the host's time does not fluctuate.

/proc/sys/xen/independent_wallclock is 0 on the host and does not exist in the guest. The ntpd service was stopped and disabled. Setting independent_wallclock to 1 on the host has no effect (that is, the time still fluctuates in the guest). I have not been able to restart the guest, though, as it is a production server; I might be able to do that over the weekend.

Any ideas on what to check and how to resolve this problem?


After a lot of searching and testing, the kernel parameters that worked perfectly are:

divider=10 clocksource=acpi_pm

I used this page to determine which parameters were best for the OS we were using. The information is written for VMware but, surprisingly, also applies to Xen. We also re-enabled ntpd, so we have working time synchronisation.
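After rebooting with new parameters, it can be worth confirming that the kernel actually saw them and which clocksource it ended up using. The sketch below assumes a kernel that exposes /proc/cmdline and the clocksource files under /sys/devices/system/clocksource/clocksource0 (older kernels may not have the latter); the function names are hypothetical helpers, not part of any tool.

```shell
# has_boot_param PARAM [CMDLINE_FILE]: succeed if PARAM appears as a whole
# word on the kernel command line (default file: /proc/cmdline).
has_boot_param() {
    file=${2:-/proc/cmdline}
    tr ' ' '\n' < "$file" | grep -qxF -- "$1"
}

# show_clocksource [DIR]: print the current and the available clocksources,
# or "not available" when the kernel does not expose the file.
show_clocksource() {
    dir=${1:-/sys/devices/system/clocksource/clocksource0}
    for f in current_clocksource available_clocksource; do
        if [ -r "$dir/$f" ]; then
            printf '%s: %s\n' "$f" "$(cat "$dir/$f")"
        else
            printf '%s: not available\n' "$f"
        fi
    done
}
```

For example, `has_boot_param clocksource=acpi_pm && show_clocksource` prints the clocksource files only if the parameter is present on the running kernel's command line.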

Vin-G

3 Answers


Only recently I discovered that the Xen clock API may have a problem: I saw many occurrences where subsequent calls of clock_gettime() would return exactly the same time, up to the nanosecond (which obviously cannot be true as the call should take a few nanoseconds already)! Maybe you see a different variant of that problem.

The specific pattern in the question may also indicate that multiple (two?) sources are trying to synchronize the clock, and that those sources disagree about the time.

In the past I found that Xen's clock synchronization may be good enough for many people, but to get an offset of less than one second I had to use NTP and an "independent wall clock" in the PVM guest.

U. Windl

The answer may depend on a few details. Is this full virtualisation or para-virtualisation?

If it is a para-virtualised guest, check the output of:

cat /proc/sys/xen/independent_wallclock

Run this in both the guest and dom0. Also check the time in dom0.

When the time in dom0 is correct and /proc/sys/xen/independent_wallclock is '0' in both dom0 and domU, the time in the guest should be correct too. Do not run ntpd in any of the guests; let dom0 keep the time.

I am not sure about full-virtualisation, though (I guess /proc/sys/xen/independent_wallclock will not be available in domU, but you can still check the dom0).
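The check above can be wrapped so that it also copes with the file being missing, as it may be in a fully virtualised guest. This is a minimal sketch; the function name and the three labels it prints are my own and not anything Xen defines.

```shell
# wallclock_mode [FILE]: classify the independent_wallclock setting.
# Prints "absent" if the file does not exist, "independent" if it contains 1
# (the domain keeps its own wall clock), and "synced" otherwise (0, the
# domain follows the time provided by Xen).
wallclock_mode() {
    file=${1:-/proc/sys/xen/independent_wallclock}
    if [ ! -e "$file" ]; then
        echo "absent"
    elif [ "$(cat "$file")" = "1" ]; then
        echo "independent"
    else
        echo "synced"
    fi
}
```

Calling `wallclock_mode` with no argument in dom0 and in each domU shows at a glance which domains are expected to follow dom0's clock.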

quanta
Jacek Konieczny
  • Full virtualisation. /proc/sys/xen/independent_wallclock is 0 on the host. In the guest, /proc/sys/xen/independent_wallclock does not exist XD. I did stop and disable the ntpd service, though, and the problem is still there. I also set independent_wallclock to 1 with the ntpd service not running, and it is still the same (though I did not try restarting the guest; I'll try that, maybe over the weekend). The time on the guest and the host is the same. The only difference is that the output of `date;date;date;date;date;date;date` does not fluctuate on the host and does fluctuate in the guest. – Vin-G Feb 25 '10 at 11:52
  • *updated original question with that info – Vin-G Feb 25 '10 at 12:17

Typical time drift. You need to set up NTP and also boot the VM with the correct kernel command line:

https://access.redhat.com/solutions/27865

Oliver Salzburg
dyasny
  • Hmmmm, we originally had ntpd set up in both the guest and the host. The guest server still exhibited the problem (and the host was fine). I tried disabling the ntpd service in the guest, as I thought the VM host was already taking care of the synchronization (with /proc/sys/xen/independent_wallclock set to 0). Is this not so? I have not tried that tip on kernel boot parameters yet, though. We'll see over the weekend. – Vin-G Feb 25 '10 at 12:33
  • Usually, something like divider=10 notsc works. The problem is that inside a VM, a clocksource that relies on CPU ticks can go a bit insane, because the CPU isn't real and its ticking fluctuates with the vCPU cycles, which are actually threads in the physical machine's Xen kernel processes. So the more you load the guests and hosts, the more time drift you get. – dyasny Feb 25 '10 at 12:53
  • Thanks! notsc had no apparent effect (maybe because of the kernel's compile-time configuration), but this led me to the correct path (and solution). – Vin-G Apr 28 '10 at 03:59
  • perfect! glad I could help :) – dyasny Apr 28 '10 at 20:03