On a server with a very high load, many of my daily cron jobs stopped working. I
have a postfix server running that only delivers locally, so that I can read the
output of the cron jobs with mutt.
I grepped for cron in the logs and saw this:
Feb 23 22:44:16 server10 cron[2276]: /usr/sbin/sendmail: Resource temporarily unavailable
In /var/log/mail I see this:
Feb 23 22:05:15 server10 postfix/sendmail[1113]: warning: fork: Resource temporarily unavailable
systemctl status cron and systemctl status postfix show that both services are
running.
So I added this cron job, which runs every minute:
#!/bin/bash
date
sleep 1
date
date >> ~/cron.log
echo "bye"
It took almost 5 minutes for the ~/cron.log file to appear. The log also shows
that the job does not run every minute, which explains why my daily cron jobs
are not being executed.
$ cat cron.log
Tue Feb 23 22:52:02 CET 2021
Tue Feb 23 22:53:02 CET 2021
Tue Feb 23 22:56:02 CET 2021
Tue Feb 23 22:58:02 CET 2021
Tue Feb 23 23:01:02 CET 2021
Tue Feb 23 23:02:02 CET 2021
Tue Feb 23 23:07:02 CET 2021
Tue Feb 23 23:08:02 CET 2021
Tue Feb 23 23:10:03 CET 2021
Tue Feb 23 23:11:02 CET 2021
Tue Feb 23 23:13:02 CET 2021
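To make the skipped runs easier to spot, the gaps in the log can be listed mechanically. This is only a rough sketch: the function name missed_minutes is made up for illustration, and it assumes the default date layout shown above (time in the 4th field) with entries in chronological order and no gap longer than a day.

```shell
#!/bin/bash
# Sketch: report gaps between consecutive entries of a log written by
# `date >> ~/cron.log`. Assumes field 4 is HH:MM:SS (default `date` layout).
# missed_minutes is a hypothetical helper name, not an existing tool.
missed_minutes() {
    awk '{
        split($4, t, ":")
        now = t[1] * 60 + t[2]            # minutes since midnight
        if (NR > 1) {
            gap = now - prev
            if (gap < 0) gap += 1440      # the log crossed midnight
            if (gap > 1)
                printf "%s -> %s: %d run(s) missing\n", prev_ts, $4, gap - 1
        }
        prev = now
        prev_ts = $4
    }' "$1"
}
```

Called as `missed_minutes ~/cron.log`, it prints one line per gap; for the log above, the first line reported would be the two runs missing between 22:53:02 and 22:56:02.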
When I run tail -f on /var/log/messages I see this:
$ tail -f messages | grep cron
Feb 23 23:17:03 server10 cron[2276]: /usr/sbin/sendmail: Resource temporarily unavailable
Feb 23 23:18:01 server10 cron[2276]: /usr/sbin/sendmail: Resource temporarily unavailable
Feb 23 23:18:01 server10 cron[2276]: /usr/sbin/sendmail: Resource temporarily unavailable
Feb 23 23:18:01 server10 cron[2276]: /usr/sbin/sendmail: Resource temporarily unavailable
Feb 23 23:19:16 server10 cron[2276]: /usr/sbin/sendmail: Resource temporarily unavailable
Feb 23 23:19:16 server10 cron[2276]: /usr/sbin/sendmail: Resource temporarily unavailable
Feb 23 23:20:02 server10 cron[2276]: /usr/sbin/sendmail: Resource temporarily unavailable
So I googled for that and found "Cron jobs not working anymore", which describes
a very similar issue, but that didn't help me. I don't know which resource is
hitting the limit. sysctl kernel.pid_max shows 32768, which seems kind of low
for an x86_64 system, so I raised the value to 4194303, but that didn't help
either: the Resource temporarily unavailable messages keep appearing.
So how can I determine which resource is hitting the limit? Sadly, the log files don't tell me much.
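For context, the fork(2) man page lists several limits that can produce this error (EAGAIN): kernel.threads-max, kernel.pid_max, the per-user RLIMIT_NPROC, and a pids limit on the cgroup (TasksMax under systemd). Below is a rough sketch of how current usage could be compared against each of them. The function names usage_line and show_fork_limits and the 90 % threshold are made up for illustration, and a Linux /proc plus procps ps are assumed.

```shell
#!/bin/bash
# Sketch only: usage_line and show_fork_limits are hypothetical helper names.

# usage_line LABEL CURRENT LIMIT: print one row, flagging usage at or above 90 %.
usage_line() {
    local label=$1 current=$2 limit=$3 flag=""
    if [[ $limit =~ ^[0-9]+$ && $current =~ ^[0-9]+$ ]] &&
       (( limit > 0 && current * 100 >= limit * 90 )); then
        flag="  <-- close to limit"
    fi
    printf '%-24s %s / %s%s\n' "$label" "$current" "$limit" "$flag"
}

# show_fork_limits [PID] [UNIT]: compare usage with the limits relevant to fork().
show_fork_limits() {
    local pid=${1:-$$} unit=${2:-cron.service} total ruid user_threads nproc
    # 4th field of /proc/loadavg is "runnable/total" scheduling entities.
    total=$(awk '{ split($4, a, "/"); print a[2] }' /proc/loadavg)
    usage_line "kernel.threads-max" "$total" "$(cat /proc/sys/kernel/threads-max)"
    usage_line "kernel.pid_max" "$total" "$(cat /proc/sys/kernel/pid_max)"

    # RLIMIT_NPROC is counted per real UID; note the kernel does not enforce
    # it for processes with CAP_SYS_ADMIN or CAP_SYS_RESOURCE (typically root).
    ruid=$(ps -o ruid= -p "$pid" | tr -d ' ')
    user_threads=$(ps -L -U "$ruid" --no-headers | wc -l)
    nproc=$(awk '/^Max processes/ { print $3 }' "/proc/$pid/limits")
    usage_line "RLIMIT_NPROC (uid $ruid)" "$user_threads" "$nproc"

    # Under systemd, the unit's cgroup can carry its own task limit.
    if command -v systemctl >/dev/null 2>&1; then
        systemctl show "$unit" -p TasksCurrent -p TasksMax
    fi
}
```

With the cron PID from the log lines above, this could be run as `show_fork_limits 2276 cron.service`, and again with the postfix unit name, to see which of the rows, if any, is near its limit.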