
Issue description:

On boot we trigger the service initialization script shown below. The script is part of the instance user data.

The script copies the necessary service and timer unit files to the systemd folder and starts the timer.

From time to time after a reboot, our timers do not work and stay in n/a.

The sudo systemctl restart cleanup.timer command does not help.

Only sudo systemctl stop cleanup.timer and then sudo systemctl start cleanup.timer works.

The problem is not limited to one particular timer. We have a lot of timers with almost the same structure and initialization script. All timers are loaded, but some of them can be in active (elapsed) status and not working.

Our setup:

We run Ubuntu 18.04 LTS.

We have this service initialization script:

sudo cp <BASIC PATH>/services/cleanup.service /etc/systemd/system/cleanup.service
sudo cp <BASIC PATH>/services/cleanup.timer /etc/systemd/system/cleanup.timer
sudo systemctl daemon-reload
sudo systemctl start cleanup.timer

And this cleanup.service:

[Unit]
Description=Service to cleanup things
Requires=docker.service
After=docker.service

[Service]
ExecStart=<<SHELL COMMAND>>

And this cleanup.timer:

[Unit]
Description=Timer for service to cleanup things

[Timer]
OnBootSec=1min
OnUnitActiveSec=1800sec
AccuracySec=5sec

[Install]
WantedBy=timers.target

Now check these commands:

> sudo systemctl list-timers
NEXT                         LEFT          LAST                         PASSED       UNIT                         ACTIVATES
n/a                          n/a           n/a                          n/a          cleanup.timer                cleanup.service
> sudo systemctl status cleanup.timer
● cleanup.timer - Timer for service to cleanup things
   Loaded: loaded (/etc/systemd/system/cleanup.timer; disabled; vendor preset: enabled)
   Active: active (elapsed) since Wed 2021-11-17 21:07:22 UTC; 11h ago
  Trigger: n/a

Nov 17 21:07:22 ip-XX-XX-XX-XX systemd[1]: Started Timer for service to cleanup things.

Question:

What could be the reason for this situation, and how can we avoid it?

Cjoerg

2 Answers

1

It looks like you forgot to enable the timer.

systemctl start <unit> starts that timer right now (i.e. when you first install it and want it to run).

systemctl enable <unit> doesn't start the unit now but sets up the relevant hooks so the unit is started based on what is in the unit file...

E.g. the timer will start after a reboot because of...

[Install]
WantedBy=timers.target
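
A minimal sketch of an init script step that both enables and starts the timer, assuming it runs as root (prefix the commands with sudo otherwise). The function name `install_timer`, its arguments, and the `SYSTEMCTL` override are hypothetical and only there for illustration; `systemctl enable --now` enables the unit and starts it in one step.

```shell
#!/bin/sh
# Hypothetical helper: copy one service/timer pair, then enable and start the timer.
# $1 = directory holding the unit files
# $2 = unit base name (e.g. cleanup)
# $3 = target directory (defaults to /etc/systemd/system)
# SYSTEMCTL may be overridden for a dry run (assumption, e.g. SYSTEMCTL=echo).
install_timer() {
    src_dir=$1
    name=$2
    dest_dir=${3:-/etc/systemd/system}
    systemctl_cmd=${SYSTEMCTL:-systemctl}

    cp "$src_dir/$name.service" "$dest_dir/$name.service" || return 1
    cp "$src_dir/$name.timer" "$dest_dir/$name.timer" || return 1

    # Make systemd re-read the unit files that changed on disk.
    $systemctl_cmd daemon-reload || return 1

    # enable: hook the timer into timers.target for future boots;
    # --now: also start it immediately.
    $systemctl_cmd enable --now "$name.timer"
}

# Usage (the path is the placeholder from the question):
#   install_timer "<BASIC PATH>/services" cleanup
```

Running it with `SYSTEMCTL=echo` prints the systemctl invocations instead of executing them, which is handy for checking the script on a machine without the units installed.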

Running the service initialization script on every boot?

On bootup we trigger the service initialization script... We have this service initialization script:

That clearly isn't running successfully: the init script is supposedly running systemctl start cleanup.timer, but cleanup.timer has neither run nor failed in 11 hours.

So I would look at the logs around your init script to see what is failing and why. Could be your init script is running before docker is available or something like that.
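
One way to get such logs is to wrap each step of the init script, sketched below. The `run_logged` name and the `LOG_FILE` location are assumptions, not part of the original script. The wrapper records the command, its output, and its exit status, and it still writes the exit line when the calling script uses `set -e`.

```shell
#!/bin/sh
# Hypothetical logging wrapper for init-script steps.
# LOG_FILE is an assumed location; adjust as needed.
LOG_FILE=${LOG_FILE:-/var/log/timer-init.log}

# Run a command, append its output and exit status to LOG_FILE,
# and return the command's exit status to the caller.
run_logged() {
    printf '%s START %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$*" >> "$LOG_FILE"
    status=0
    # "|| status=$?" keeps a failing command from aborting under set -e
    # before its exit status has been logged.
    "$@" >> "$LOG_FILE" 2>&1 || status=$?
    printf '%s EXIT=%s %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$status" "$*" >> "$LOG_FILE"
    return "$status"
}

# Usage:
#   run_logged systemctl daemon-reload
#   run_logged systemctl start cleanup.timer
```

Comparing the START and EXIT lines across several reboots shows whether the start command ran at all and what it returned.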

mattpr
  • As I understood the docs, the main benefit of `enable` is reboot persistence, so the unit is started automatically by the system. But because the `User data` scripts do `start` at each boot, I saw no reason to do `enable`. Anyway thanks, I will check and test this out. About `service initialization script on every boot`: each of them contains `set -e` at the beginning of the script, so it should fail if any of the following commands exits with a non-zero status, but this has not happened. – Cjoerg Nov 29 '21 at 13:04
  • About `Could be your init script is running before docker is available or something like that.`: that is strange, because all my systemd services contain `Requires=` and `After=` options, and these should force the services to wait until the required unit is available. Also, there were situations where, at the same time, some of the services with the same options were working fine and others were not. – Cjoerg Nov 29 '21 at 13:05
  • I can only comment on what is obvious from what you posted. Your init script says to start the timer, but `journalctl -u ` shows it has not run in 11 hours. So either the `systemctl`/`journalctl` output is wrong (not likely) or your init script is not actually starting the timer for some reason (more likely). So I would add more debug/logging to your init script and look at that over a series of reboots. If you can post more I might be able to help more. – mattpr Nov 29 '21 at 13:41
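
To compare timer state over a series of reboots, the n/a rows from `systemctl list-timers` can be picked out mechanically. This is a rough sketch: the `stuck_timers` name is hypothetical, and it assumes the column layout shown in the question (other systemd versions may print the empty NEXT column differently). A timer with no next elapse is not necessarily broken, so treat the result as a hint only.

```shell
#!/bin/sh
# Print the names of timers whose NEXT column is "n/a".
# Reads `systemctl list-timers` output on stdin.
stuck_timers() {
    awk '$1 == "n/a" {
        # The unit name is the field ending in ".timer".
        for (i = 1; i <= NF; i++)
            if ($i ~ /\.timer$/) print $i
    }'
}

# Usage:
#   systemctl list-timers --all --no-legend | stuck_timers
```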
0

I'm still not very familiar with systemd, but as far as I know the PATH environment variable isn't available yet at boot, so some scripts and programs set PATH first. Assuming your service works whenever you run it from the console, this would always be the case.

Kenkoy