
Is there an existing mechanism that synchronizes a Linux system with NTP while online, and with a predictably drifting RTC while offline?


We operate remote "collectors": embedded Linux systems that collect and timestamp sensor data. We need their clock errors to stay reasonably small, say below 5 seconds. Usually we use NTP to sync their clocks, and that works fine - as long as the system is online.

The problem is that some collectors have very bad uplinks which can go down for hours, days or even weeks. That doesn't stop the local data collection, but without NTP, the Linux system clock drifts badly and quite unpredictably.

OTOH, the hardware's RTC drifts heavily too, but at a constant rate. The RTC drift rate varies from board to board, but is constant per board and can be measured.

I guess what we need is a mechanism that does the following:

  • Measure the RTC drift rate of a board before its deployment
  • Adjust the system time regularly via NTP when possible
  • Adjust the system time regularly from the RTC when NTP is unavailable, taking the known RTC drift rate into account
  • Optional: Measure and record the RTC drift rate continuously while online (1)

By 'mechanism' I mean some well-maintained, documented piece of software and/or configuration that can handle the two states "online" and "offline", ensure that the system clock is synchronized with the correct time source (NTP vs. RTC), detect changes of state, and correct for the RTC drift. It doesn't matter much whether it is implemented as a special ntpd configuration/plugin, a separate daemon, a cron job, or something else.

I had a look at Chrony, but according to its documentation it tries to predict the drift of the system clock, which in our case drifts far more unpredictably than the RTC. Chrony seems to use the RTC only to keep time across reboots.


(1) Note that ntpd activates the kernel's '11-minute mode' (update the RTC from the system clock every 11 minutes). There seems to be no way with current kernels and ntpd to prevent the 11-minute mode. Therefore, any RTC drift information gets lost while ntpd is running (thx @billthor).


Updates/edits:

  • We are considering adding an external radio clock for the MSF or DCF77 signal (we are based in Europe) via USB or serial, but we would rather keep the hardware lean.
  • Our collectors are located indoors, often in the basement. So adding GPS clocks won't help.
  • We use Debian 7. That means hwclock from util-linux-2.20.1, ntpdate-4.2.6p5, ntpd from ntp-4.2.6.p5, chrony-1.24 (potentially 1.30).
  • Note that our problem is not that we don't know how to use ntpdate(8), hwclock(8), date(1), etc. Please see the added section in italics about what I mean by 'mechanism'.
  • Added footnote about the '11-minute mode'
  • Here is a very interesting discussion about offline-sync and RTC drift
leventov
Nils Toedtmann
  • As I understand it, a combination of ntpd and hwclock already allows you to do all of these things. – Roy Oct 23 '14 at 10:27
  • @Roy Sure. The question is: *How* to combine ntp(d) and RTC (hwclock) coherently to achieve maximal accuracy? – Nils Toedtmann Oct 23 '14 at 10:41
  • I understand that the sys clock drifts more than RTC. I am curious what you found unacceptable regarding the manner/effectiveness of chrony's management of system drift? How did chrony fail for you? – dfc Oct 23 '14 at 15:25
  • @dfc chrony didn't fail on us. We haven't tried it yet because it seems not to use the RTC to keep time during offline periods, which I think would increase accuracy in our use case. We will test chrony if no other more-promising-looking methods get suggested. – Nils Toedtmann Oct 23 '14 at 15:51
  • I think you should look into chrony. Respectfully, it seems you are dismissing a good option based on a hunch. In my opinion it is backwards to investigate chrony iff no RTC-ntpd is found. It seems the easiest thing is to see if chrony meets your needs, and if not, then go down this rabbit hole. – dfc Oct 23 '14 at 16:01

3 Answers


Your situation is unusual, and I'd be surprised if anyone comes up with a standard ntpd-based configuration to do what you want. That said, I like being surprised, and it happens quite often around these parts.

But until someone comes up with a better idea, have you considered a crontab entry like this?

*/5 * * * *   ntpdate 0.pool.ntp.org || ( hwclock --adjust; hwclock --hctosys )

I.e., every five minutes, try to sync the clock via ntpdate. If (and only if) that fails, adjust the hardware clock for drift according to the /etc/adjtime file, then set the system clock from the RTC. The format of /etc/adjtime is detailed in man hwclock; you need to populate its first line appropriately, using your knowledge of that particular RTC's drift rate.
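As a rough sketch of how that first line could be derived from a pre-deployment measurement: the helper names `drift_per_day` and `make_adjtime` are made up for this example, and it assumes that the drift factor is the correction in seconds per day that hwclock adds to the RTC (so positive for an RTC that runs slow). Please check man hwclock for your util-linux version before relying on the sign.

```shell
#!/bin/sh
# Hypothetical helpers, not part of hwclock or util-linux.

# Correction (seconds per day) to apply to the RTC.
#   $1 = reference time minus RTC time at the end of the test, in seconds
#        (assumption: positive means the RTC fell behind)
#   $2 = length of the measurement interval, in seconds
drift_per_day() {
    awk -v err="$1" -v span="$2" 'BEGIN { printf "%.6f\n", err * 86400 / span }'
}

# Print the three lines of an adjtime file, as described in hwclock(8):
#   line 1: drift factor, time of last adjustment (epoch seconds), status (0)
#   line 2: time of last calibration (epoch seconds)
#   line 3: clock mode (UTC assumed here)
#   $1 = drift factor, $2 = epoch seconds at which the RTC was last set
make_adjtime() {
    printf '%s %s 0\n%s\nUTC\n' "$1" "$2" "$2"
}
```

For example, an RTC that ended up 30 seconds behind after a 10-day bench test gives `drift_per_day 30 864000`, i.e. 3 seconds per day. The output of `make_adjtime` could then be written to /etc/adjtime as root, provided nothing (such as the kernel's 11-minute mode) reset the RTC during the measurement.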

Note that if you go for a solution like this, and you are deploying any significant number of these systems, it is considered polite to work with the pool, and contribute servers back in proportion to your usage. You can find more information at http://www.pool.ntp.org/en/vendors.html .
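If you also want to notice when a collector switches between "online" and "offline", the crontab one-liner could be grown into a small script along these lines. This is only a sketch: the function name `sync_once` and the variables `NTPDATE`, `HWCLOCK`, `NTP_SERVER` and `STATE_FILE` (including its default path) are inventions of this example, added so that the real commands can be swapped out for a dry run.

```shell
#!/bin/sh
# Hypothetical wrapper around the ntpdate / hwclock fallback shown above.
NTPDATE=${NTPDATE:-ntpdate}
HWCLOCK=${HWCLOCK:-hwclock}
NTP_SERVER=${NTP_SERVER:-0.pool.ntp.org}
STATE_FILE=${STATE_FILE:-/var/run/timesync.state}   # assumed location

sync_once() {
    if $NTPDATE "$NTP_SERVER" >/dev/null 2>&1; then
        new=online
    else
        # Offline: apply the drift correction from /etc/adjtime to the RTC,
        # then copy the corrected RTC time to the system clock.
        $HWCLOCK --adjust
        $HWCLOCK --hctosys
        new=offline
    fi
    # Report only when the time source differs from the previous run.
    old=$(cat "$STATE_FILE" 2>/dev/null)
    [ "$new" = "$old" ] || echo "time source changed: ${old:-unknown} -> $new"
    echo "$new" > "$STATE_FILE"
}
```

A cron job would source or embed this and call `sync_once` every five minutes. The printed line goes wherever cron output goes, so you may prefer to pipe it to logger(1).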

MadHatter
  • You nailed the basic idea :-) But it doesn't count as answer (yet) since it doesn't account for the (constant, but significant) RTC drift. Can we improve it, e.g. utilizing `/etc/adjtime` and `hwclock --adjust`? – Nils Toedtmann Oct 23 '14 at 16:28
  • Yes; see above. – MadHatter Oct 23 '14 at 19:25
  • This is the kind of solution I had in mind when I wrote my previous comment. Also, if the system clock is currently in sync via ntpd, you can use hwclock to measure and set a fairly precise drift rate for the RTC. – Roy Oct 24 '14 at 11:29
  • Unfortunately not, see comments about ntpd & '11-minute mode' – Nils Toedtmann Oct 24 '14 at 12:00

Set it up to use the pseudo-clock "local clock". Add this to ntp.conf

server 127.127.1.1 iburst
fudge  127.127.1.1 stratum 8

`server 127.127.1.1` is the pseudo-clock, aka the local RTC. `iburst` makes ntpd sync to it quickly (optional). `fudge 127.127.1.1 stratum 8` raises its stratum from the default of 3; it should be higher than the stratum of your other servers.

This will cause ntpd to use the local clock as the sync source when connectivity to the lower-stratum servers is lost, and to revert to those servers when connectivity is regained.

cde
  • Are you sure the "local clock" is the RTC? I implemented it like this and I always got 0.000s between this source of time and System Time. – inversus Nov 23 '21 at 17:11

NTP already has mechanisms to know whether it is online or offline, and it will switch to lower-priority sources as required. It is very easy to check the reach value to trigger an alternate source, but I would stick with NTP. As discussed below, monitoring and correcting for RTC drift is likely to be difficult.
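For what it's worth, checking the reach value could look something like the following sketch. The function name `ntp_reachable` is made up for this example. It assumes the usual `ntpq -pn` billboard layout (two header lines, a one-character tally code in the first column, reach as the seventh field) and treats 127.127.x.x reference clocks as "not a network source".

```shell
#!/bin/sh
# Hypothetical helper: reads `ntpq -pn` output on stdin and exits 0 if at
# least one network peer has a non-zero reach register, 1 otherwise.
ntp_reachable() {
    awk 'NR > 2 {
        # Drop the tally character in column 1, then split the rest.
        n = split(substr($0, 2), f, " ")
        # f[1] = peer address, f[7] = reach (octal; 0 means no recent replies)
        if (n >= 7 && f[1] !~ /^127\.127\./ && f[7] + 0 != 0) up = 1
    }
    END { exit (up ? 0 : 1) }'
}
```

Usage would be along the lines of `ntpq -pn | ntp_reachable && echo online || echo offline`, which of course requires a running ntpd to query.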

In pre-Internet days I used a program which would phone out to a data source and sync the clock. There may still be services available that provide a time source over a modem. This would require access to a phone line.

There are known issues with the local clock that don't apply to the RTC. Some of these issues are documented in the list of NTP Known OS Issues. They may account for your clock drift, and resolving them may solve your problem. In the absence of missed ticks, I've found that the local (system) time source can be very stable.

You may be able to use the Dumb clock driver (33) with a program that writes the appropriate RTC time to the /dev/dumbclockX device.

There are a number of other drivers based on Radio clocks. Some of these use shortwave services like WWV and CHU, which may work in environments where GPS signals aren't available. For Europe this list would include BBC, TDF, RBU, and RMW.

Pavel Krejci has written an RTC driver as well, but it does not appear to be incorporated into the official drivers. This may work with PPS type synchronization.

It should be possible to measure RTC drift prior to deployment. However, you will need to ensure that the RTC is not being automatically updated. When the system clock is updated with the adjtimex function, the RTC may be updated every 11 minutes.

NTP will update the clock when it is connected. Normally NTP will refuse to make large adjustments to the system clock, but there are options to change how large an adjustment it will accept.

I've suggested options for using the RTC above. A radio clock may be more suitable than a GPS clock.

Measuring drift in the absence of a reliable time source to compare it against is likely a futile effort. If local time is unstable, you can't use it to monitor the RTC and vice versa. Measuring drift while NTP is connected will not work if the kernel is updating the RTC every 11 minutes. The RTCs I have used have a one second resolution, so they would have to drift significantly to be reliably measurable.

BillThor
  • I don't understand how this relates to my question. I don't think my problem is a lack of drivers ... or is it? – Nils Toedtmann Oct 23 '14 at 14:28
  • @NilsToedtmann As far as I can find there is no official driver for the RTC. I believe the `local` driver just uses the servers clock, which you report drifts. I'll update my response. – BillThor Oct 23 '14 at 22:22
  • When you say 'drivers', do you mean Linux kernel drivers (of which there are plenty), or ntpd features? Thx for your tips, some of them are interesting - though I think they should rather have been posted as comments. Thx in particular for mentioning the '11-minute mode', I had forgotten about that. I updated my question. – Nils Toedtmann Oct 24 '14 at 08:46
  • @NilsToedtmann No, I mean NTP clock drivers. It's been my experience that RTCs normally drift, but not at high rates if the battery is good. Missed ticks can be an issue with the system clock. – BillThor Oct 25 '14 at 03:03