I'm asking this question because I couldn't find the answer here:
Why is my crontab not working, and how can I troubleshoot it?

Context

We have several servers running Debian wheezy.

One backup task requires that we deactivate the crontab of a specific user during the backup, so we have a script, run daily, which roughly does the following:

# user is legec :

# save the crontab to a file
crontab -ulegec -l > /home/legec/.backup/crontab
# empty the crontab
echo "" | crontab -ulegec

backup ...

# reload crontab
cat /home/legec/.backup/crontab | crontab -ulegec

This works as we expect the vast majority of the time.

The task runs on ~80 servers; depending on the server, the backup takes from 1 minute up to 2 hours.

Bug

Once in a while, cron does not detect the final reload, and from then on does not execute any of the jobs listed in the crontab.

The file in /var/spool/cron/crontabs/legec has the expected content and modification date:

$ ls -lh /var/spool/cron/crontabs/legec
-rw------- 1 legec crontab 6.7K Sep 22 04:03 /var/spool/cron/crontabs/legec

but the cron logs indicate that cron did not detect the last change:

$ cat /var/log/cron.log | grep -E "LIST|RELOAD|REPLACE"
...
# yesterday's backup : all went fine
Sep 21 04:00:06 lgserver crontab[6670]: (root) LIST (legec)
Sep 21 04:00:06 lgserver crontab[6671]: (root) LIST (legec)
Sep 21 04:00:06 lgserver crontab[6673]: (root) REPLACE (legec)
Sep 21 04:01:01 lgserver /usr/sbin/cron[2025]: (legec) RELOAD (crontabs/legec)
Sep 21 04:03:01 lgserver crontab[7071]: (root) REPLACE (legec)
Sep 21 04:03:01 lgserver /usr/sbin/cron[2025]: (legec) RELOAD (crontabs/legec)

# today's backup : no final RELOAD event
Sep 22 04:00:07 lgserver crontab[24163]: (root) LIST (legec)
Sep 22 04:00:07 lgserver crontab[24164]: (root) LIST (legec)
Sep 22 04:00:07 lgserver crontab[24166]: (root) REPLACE (legec)
Sep 22 04:01:01 lgserver /usr/sbin/cron[2025]: (legec) RELOAD (crontabs/legec)
Sep 22 04:03:01 lgserver crontab[24458]: (root) REPLACE (legec)
          # no RELOAD line here

"Once in a while" means there is no regularity: we see this bug maybe once a month, on one random server out of the ~80 that are running.
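
The log comparison above can be automated. The following is a minimal sketch, not something from our actual backup script: `last_replace_reloaded` is a hypothetical helper name, and it assumes the log lines have the same `REPLACE (user)` / `(user) RELOAD` shape as the excerpt above. It reads cron log lines on stdin and reports whether the most recent REPLACE for a user was followed by a RELOAD.

```shell
#!/bin/sh
# Hypothetical helper (name and output strings are assumptions for this sketch).
# Reads cron log lines on stdin and prints one of:
#   reloaded    the last REPLACE for the user was followed by a RELOAD
#   missed      the last REPLACE for the user has no RELOAD after it
#   no-replace  no REPLACE for the user was found at all
last_replace_reloaded() {
    user="$1"
    awk -v user="$user" '
        # crontab logs "(root) REPLACE (legec)" when the spool file is rewritten
        index($0, "REPLACE (" user ")") { state = "missed" }
        # cron logs "(legec) RELOAD (crontabs/legec)" when it picks up the change
        index($0, "(" user ") RELOAD") { if (state == "missed") state = "reloaded" }
        END { print (state == "" ? "no-replace" : state) }
    '
}

# Example use (log path as in the question):
#   grep -E "REPLACE|RELOAD" /var/log/cron.log | last_replace_reloaded legec
```

In the logs above the RELOAD lines appear at second :01 of a minute, so a check like this is only meaningful once the next minute boundary after the REPLACE has passed.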

Question

Does anyone have a lead on where to look?

LeGEC
  • Is there a difference in behavior between reload and restart? I have sometimes seen reload fail to update the info. Can you use restart, if possible? – Tux_DEV_NULL Sep 22 '17 at 08:56
  • Actually, when this bug occurs we simply reload the crontab once again, and cron detects the reload correctly. We plan to try other approaches on future occasions (restart, reload, touching the crontab file), but since we don't know how to reproduce the bug, we will have to wait for it to appear again. – LeGEC Sep 22 '17 at 09:20
  • Can you add something like `* * * * * /bin/touch /home/user/testfile 2>/tmp/error` under /etc/cron.d and then do a reload? Does it work? – Tux_DEV_NULL Sep 22 '17 at 10:18

1 Answer

First of all, just to be on the safe side, I'd advise using the proper forms of the crontab command, namely

crontab -u user -r

to delete the user's crontab, and

crontab -u user backed_up_crontab_file

to restore.
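
The two commands above could be wrapped as a pair of shell functions. This is only a sketch under assumptions: the function names `disable_crontab` and `restore_crontab` are made up for illustration, and the user name and backup path are whatever the calling script passes in. It relies on `crontab -u USER -l`, `crontab -u USER -r`, and `crontab -u USER FILE` only.

```shell
#!/bin/sh
# Hypothetical helpers (names are assumptions, not part of any existing script).

# Save the user's crontab to a file, then remove it.
disable_crontab() {
    user="$1"
    backup_file="$2"
    # List into a temporary file first, so a failed listing never
    # overwrites a previous good backup.
    crontab -u "$user" -l > "$backup_file.tmp" || { rm -f "$backup_file.tmp"; return 1; }
    mv "$backup_file.tmp" "$backup_file" || return 1
    # Remove the crontab only after the backup is safely in place.
    crontab -u "$user" -r
}

# Reinstall the crontab from the saved file.
restore_crontab() {
    user="$1"
    backup_file="$2"
    # Refuse to install from a missing or empty backup.
    [ -s "$backup_file" ] || return 1
    crontab -u "$user" "$backup_file"
}

# Example use (user and path taken from the question):
#   disable_crontab legec /home/legec/.backup/crontab
#   ... run the backup ...
#   restore_crontab legec /home/legec/.backup/crontab
```

Saving before removing, and refusing an empty backup on restore, is a design choice meant to keep a failed step from silently leaving the user without a crontab.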

Secondly, your timing may matter. If a job in the user's crontab runs only rarely, it may miss one run after the restore, because it would have fired a minute before the crontab was actually restored.

chicks
Gnudiff
  • Agreed. I thought `crontab` stopped processing stdin many years ago, because it led to too many people accidentally deleting their crontabs. The man page doesn't show the filename as optional when updating. – Barmar Sep 27 '17 at 13:42