6

The server time was 7 hours off (instead of 10AM it was 3AM, even though date showed the correct timezone). The output for ntpq was:

$ ntpq -p

     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 xx.xxx.xxx.x.ar xxx.x.xx.xx      2 u   72 1024  177    6.516  2520657 1650156
 ntp.xxxx.ac.uk  xxx.xxx.xxx.x    2 u   7h 1024  377   14.039  2520655 1347346
 xxx.xxx.xxx.xx  xxx.xxx.xxx.x    2 u  114 1024  377    5.449  -18.941 2130343
 ns1.xxxxxxx.com xxx.x.xx.xx      2 u  148 1024  377    8.050  2520655 1650156

The time was fixed by:

ntpdate -u 0.europe.pool.ntp.org

However, it happened again a few days later. I suspect the second row in ntpq -p, which says it has been 7h since the last packet was received. But if that is the reason, then why didn't ntp use the other servers to sync the time?

What has happened? How would you prevent this from happening again?

Edit: Another thing that might be useful to consider is that it is a VM. Is it possible that the VM was in some kind of paused state?

Note that vmware-toolbox-cmd timesync status reports that time sync is disabled.

sina
  • The third line of output above is interesting; it looks like one of your servers has a wrong clock. Is that one a public server? (And Alessandro is right; obfuscating the server list makes no sense.) – MadHatter Apr 28 '15 at 10:27
  • I have had issues running `ntp` on virtual machines. Unless the clock is virtual, `ntpd` may try to modify the system clock. Normally an issue like this is a result of the system expecting the hardware clock to be set to a different time than it is actually set to. The container usually has a timesync mechanism, which appears to be disabled. – BillThor Apr 28 '15 at 12:51

2 Answers

5

When it starts, ntpd checks the time difference between your host and the remote NTP servers. If that difference is too big (10-15 minutes, typically; ntpd's default panic threshold is 1000 seconds), it refuses to change anything.
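
To see which peers are beyond that limit, here is a minimal sketch that filters `ntpq -p` output. It relies on the offset column being the ninth field and being expressed in milliseconds, and it uses ntpd's default panic threshold of 1000 seconds as the cutoff. The function name `flag_large_offsets` is my own, hypothetical helper, not part of ntp.

```shell
# Hypothetical helper: read `ntpq -p` output on stdin and print every peer
# whose absolute offset exceeds a limit, with the offset converted to seconds.
flag_large_offsets() {
    # $1: limit in milliseconds (default 1000000 ms = 1000 s, ntpd's default
    # panic threshold).
    awk -v limit="${1:-1000000}" '
        # Peer rows have 10 fields and a numeric offset in field 9; the
        # header and the ===== separator do not match and are skipped.
        NF == 10 && $9 ~ /^-?[0-9.]+$/ {
            off = $9 + 0
            if (off < 0) off = -off
            # $1 is the remote name, including any tally character (*, +, -).
            if (off > limit) printf "%s %.0f\n", $1, off / 1000
        }
    '
}
```

Usage would be something like `ntpq -pn | flag_large_offsets`. In the output from the question, three of the four peers report an offset of about 2520655 ms, i.e. roughly 42 minutes, which is well past 1000 seconds.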

When you executed ntpdate, you effectively used a simpler, one-shot SNTP implementation that brings your time to within milliseconds of what ntpd itself would give you. Now, if you restart the ntpd service, you should have a synchronized server (check this with ntpq -p).

A simple permanent solution would be to first use ntpdate early in the boot process, and after some time launch the "real" ntp daemon. For the record, CentOS 6.x and 7.x do the very same thing: if you install both ntpdate and ntp, the former will be used early in the boot process, while the latter will be used at a later stage.
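
The "step once, then start the daemon" sequence can be sketched roughly as below. This is only a sketch under assumptions: `run`, `step_then_start` and the `DRY_RUN` switch are hypothetical names of mine, the service name `ntpd` is the CentOS-style one, and I use `ntpd -g -q` as the one-shot step in place of ntpdate (`-g` allows one adjustment beyond the panic threshold, `-q` sets the clock and exits). It must run as root, and ntpd must not already be running when the one-shot step executes.

```shell
# Hypothetical helper: print the command when DRY_RUN is set, otherwise run it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical boot-time sequence: step the clock once, then start the daemon.
step_then_start() {
    # -g: accept an offset beyond the panic threshold, once.
    # -q: set the clock and exit instead of staying resident.
    run ntpd -g -q || return 1
    # Assumed service name (CentOS-style); adjust for your distribution.
    run service ntpd start
}
```

Setting `DRY_RUN=1` before calling `step_then_start` only prints the two commands, which is a cheap way to check the sequence before wiring it into the boot process or a post-resume hook.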

shodanshok
  • So, what could be the reason that the difference between my VM and the NTP server is too big? Why didn't it use the other 3 servers? – sina Apr 28 '15 at 10:11
  • 1
    The initial time skew can be due to a number of factors; for example, your VM was paused / suspended and then resumed. The reason why it did not synchronize is explained in the answer: the time skew was too high and ntp cautiously decided to do nothing. ntpdate is a simpler, "do-what-I-want" command, so it did not complain. – shodanshok Apr 28 '15 at 10:31
  • Thank you, the VM was in fact suspended. Does that mean that if my VM is suspended for more than 10-15 minutes, then NTP will be useless afterwards? This sucks because I have to use "ntpdate" manually to fix it. I tried adding "ntpdate" as an hourly cron job, but then realised that there is already an hourly "ntpd" job, so it would be overkill. – sina Apr 28 '15 at 13:26
  • 2
    Unfortunately, yes: you have to run `ntpdate`. To tell the truth, as ntpdate is deprecated in new Linux installations, you can also use `ntpd` with special parameters (see the man page for details). If your VM hypervisor provides special guest additions, install them; they often include some serial communication channel with the host machine, enabling on-demand time synchronization. – shodanshok Apr 28 '15 at 15:03
3

It seems that your ntp fails to sync due to excessive jitter / offset. I suggest trying a different pool of NTP servers closer to your country.

There is no need to obfuscate the IPs in your status output, because these are public, well-documented servers.

If your machine runs under VMware, please also check http://www.vmware.com/files/pdf/Timekeeping-In-VirtualMachines.pdf and keep the NTP clock of the physical server aligned.

About "Another thing that might be useful to consider is that it is a VM. Is it possible that the VM was in some kind of paused state?"

Yes, VMware re-syncs the clock after a pause even if VMware Tools time sync is set to disabled:

Regardless of whether you turn on VMware Tools periodic time synchronization, time synchronization occurs after certain operations:

  • When the VMware Tools daemon is started (such as during a reboot or power on operation)
  • When resuming a virtual machine from a suspend operation
  • After reverting to a snapshot
  • After shrinking a disk
  • That's interesting. Just to clarify, does that mean that in this case, it is possible that the vm was resumed from a suspend state, then synced the (wrong) time with its host, even though I had disabled timesync? – sina Apr 28 '15 at 10:08
  • 1
    It seems so. Check VMware KB 1189: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1189. "Time is resynchronized when you migrate the virtual machine using vMotion, take a snapshot, restore to a snapshot, shrink the virtual disk, or restart the VMware Tools service in the virtual machine (including rebooting the virtual machine)." – Alessandro Carini Apr 28 '15 at 10:11
  • It appears that the VM was suspended while taking a snapshot – sina Apr 28 '15 at 11:03