125

I have scheduled a cron job to run every minute but sometimes the script takes more than a minute to finish and I don't want the jobs to start "stacking up" over each other. I guess this is a concurrency problem - i.e. the script execution needs to be mutually exclusive.

To solve the problem I made the script look for the existence of a particular file ("lockfile.txt") and exit if it exists or touch it if it doesn't. But this is a pretty lousy semaphore! Is there a best practice that I should know about? Should I have written a daemon instead?
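For reference, here is a minimal sketch of the naive approach I described (the path is hypothetical):

```shell
#!/bin/sh
# Naive lockfile approach. Note the race: the existence check and the
# touch are two separate steps, so two near-simultaneous runs can both
# get past the check before either creates the file.
LOCKFILE=/tmp/myjob.lock

if [ -e "$LOCKFILE" ]; then
    echo "already running, exiting"
    exit 0
fi
touch "$LOCKFILE"

# ... the real work goes here ...

rm -f "$LOCKFILE"
```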

Tom

11 Answers

168

There are a couple of programs that automate this feature, take away the annoyance and potential bugs from doing this yourself, and avoid the stale lock problem by using flock behind the scenes, too (which is a risk if you're just using touch). I've used lockrun and lckdo in the past, but now there's flock(1) (in newish versions of util-linux) which is great. It's really easy to use:

* * * * * /usr/bin/flock -n /tmp/fcj.lockfile /usr/local/bin/frequent_cron_job
womble
  • lckdo is going to be removed from moreutils, now that flock(1) is in util-linux. And that package is basically mandatory in Linux systems, so you should be able to rely on its presence. For usage, look below. – jldugger Apr 09 '12 at 21:44
  • Yeah, flock is now my preferred option. I'll even update my answer to suit. – womble Apr 10 '12 at 06:22
  • Does anyone know the difference between `flock -n file command` and `flock -n file -c command` ? – Nanne Feb 05 '15 at 14:40
  • @Nanne, I'd have to check the code to be sure, but my educated guess is that `-c` runs the specified command through a shell (as per the manpage), while the "bare" (non-`-c`) form just `exec`s the command given. Putting something through the shell allows you to do shell-like things (such as running multiple commands separated with `;` or `&&`), but also opens you up to shell expansion attacks if you're using untrusted input. – womble Feb 06 '15 at 01:13
  • What is the minutely option? – Mario Trucco Oct 31 '17 at 08:36
  • It was an argument to the (hypothetical) `frequent_cron_job` command that tried to show it was being run every minute. I've removed it since it added nothing useful, and caused confusion (yours, if nobody else's over the years). – womble Oct 31 '17 at 08:54
  • Damn, I just spent an hour building out a bash script that does exactly what `flock` does automatically. Ah well, learned a thing or two. ;) For anybody interested, here's the article I went through and it does a good job teaching you basic fundamentals of what `flock`, no doubt, is doing (but likely better): https://bencane.com/2015/09/22/preventing-duplicate-cron-job-executions/ – Joshua Pinter Nov 17 '20 at 19:23
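To Nanne's question in the comments, a quick sketch of the observable difference (lock path arbitrary): without `-c`, flock exec()s the command directly; with `-c`, it hands the string to a shell, so shell syntax works inside it:

```shell
# Without -c: flock exec()s the command directly, no shell involved,
# so the arguments are passed exactly as given:
flock -n /tmp/demo.lock echo direct

# With -c: flock runs the string via "sh -c", so shell constructs
# like && and ; work inside it:
flock -n /tmp/demo.lock -c 'echo first && echo second'
```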
31

The best way in shell is to use flock(1):

(
  flock -x -w 5 99
  ## Do your stuff here
) 99>/path/to/my.lock
Philip Reynolds
    I can't not upvote a tricky use of fd redirection. It's just too arcanely awesome. – womble Nov 09 '09 at 11:57
  • Doesn't parse for me in Bash or ZSH, need to eliminate the space between `99` and `>` so it is `99> /...` – Kyle Brandt Nov 09 '09 at 12:36
  • @Kyle Yup, correct! Fixed! @womble Haha, agreed :) – Philip Reynolds Nov 09 '09 at 13:51
  • This is brilliant simplicity. – sclarson Nov 09 '09 at 15:08
  • beautiful! and @womble, it's documented in flock's man page – Javier Nov 09 '09 at 15:09
  • @Javier: Doesn't mean it's not tricky and arcane, just that it's *documented*, tricky, and arcane. – womble Nov 11 '09 at 12:29
  • What would happen if you restart while this is running, or the process gets killed somehow? Would it be locked forever then? – Alex R Aug 22 '14 at 11:33
  • I understand this structure creates an exclusive lock, but I don't understand the mechanics of how this is accomplished. What is the function of the '99' in this answer? Anyone care to explain this please? Thanks! – Asciiom Feb 10 '16 at 09:30
  • @JMoons 99 is just an arbitrary fd number that's high enough it likely isn't already being used by the shell. The shell opens it when executing the subshell (parentheses). Then when it uses fork() to spawn the child process (including 'flock'), the fd remains open. Finally, the '99' parameter as the last argument to 'flock' tells the flock program to blindly use the 99 fd for the lock file. – Conrad Meyer Sep 18 '17 at 22:17
  • If you have a sufficiently recent bash: ( flock -x -w ${FD}; do stuff ) {FD}>/path/to/my.lock – Kjetil Joergensen Oct 19 '17 at 21:30
  • @Asciiom `flock` lets you pass a file number instead of a file name or directory name. There doesn't seem to be a way to specify that you want a file named `99`, if you pass `99`, it automatically interprets it as file with no name with the number 99. Then bash lets you redirect the output of file number 99 to a particular file name using `99>/path/to/my.lock`. – Flimm Mar 07 '22 at 09:06
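For the every-minute cron case from the question, the same construct can be made non-blocking with `-n`, so an overlapping run bails out immediately instead of queueing (a sketch; the lock path is arbitrary):

```shell
(
  # -n: give up immediately if another instance already holds the lock,
  # instead of waiting for it to be released
  flock -n 99 || { echo "another instance is running"; exit 1; }
  ## Do your stuff here
  echo ok
) 99>/tmp/my.lock
```

The kernel releases the lock automatically when the fd is closed, so a killed or rebooted process cannot leave the lock held forever.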
25

Actually, `flock -n` may be used instead of lckdo, so you will be using code from kernel developers.

Building on womble's example, you would write something like:

* * * * * flock -n /some/lockfile command_to_run_every_minute

BTW, looking at the code, all of flock, lockrun, and lckdo do the exact same thing, so it's just a matter of which is most readily available to you.

Amir
5

Now that systemd is out, there is another scheduling mechanism on Linux systems:

A systemd.timer

In /etc/systemd/system/myjob.service or ~/.config/systemd/user/myjob.service:

[Service]
ExecStart=/usr/local/bin/myjob

In /etc/systemd/system/myjob.timer or ~/.config/systemd/user/myjob.timer:

[Timer]
OnCalendar=minutely

[Install]
WantedBy=timers.target

If the service unit is already activating when the timer next activates, then another instance of the service will not be started.

An alternative, which starts the job once at boot and one minute after each run is finished:

[Timer]
OnBootSec=1m
OnUnitInactiveSec=1m 

[Install]
WantedBy=timers.target
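To put the timer into service (assuming the unit names from the files above), the standard systemd commands apply:

```shell
# system-wide units (/etc/systemd/system/):
sudo systemctl daemon-reload
sudo systemctl enable --now myjob.timer

# or per-user units (~/.config/systemd/user/):
systemctl --user daemon-reload
systemctl --user enable --now myjob.timer

# confirm the schedule:
systemctl list-timers myjob.timer
```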
Amir
4

You haven't specified whether you want the script to wait for the previous run to complete. From "I don't want the jobs to start "stacking up" over each other", I take it you want the script to exit if it is already running.

So, if you don't want to depend on lckdo or similar, you can do this:


PIDFILE=/tmp/$(basename "$0").pid

if [ -f "$PIDFILE" ]; then
  if ps -p "$(cat "$PIDFILE")" > /dev/null 2>&1; then
      echo "$0 already running!"
      exit
  fi
fi
echo $$ > "$PIDFILE"

# Note: SIGKILL cannot be trapped, so it is not listed here
trap 'rm -f "$PIDFILE" >/dev/null 2>&1' EXIT HUP INT QUIT TERM

# do the work

Aleksandar Ivanisevic
  • Thanks, your example is helpful - I do want the script to exit if already running. Thanks for mentioning *lckdo* - it seems to do the trick. – Tom Nov 11 '09 at 11:36
  • FWIW: I like this solution because it can be included in a script, so the locking works regardless of how the script is invoked. – David G Nov 20 '18 at 15:36
2

You can use a lock file. Create this file when the script starts and delete it when it finishes. The script, before it runs its main routine, should check if the lock file exists and proceed accordingly.

Lockfiles are used by initscripts and by many other applications and utilities in Unix systems.
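One way to make the check-and-create atomic in plain shell is to use `mkdir`, which fails if the directory already exists (a sketch; the path is hypothetical):

```shell
#!/bin/sh
# mkdir performs the test and the creation in one atomic operation,
# which closes the race window of a separate "check, then touch" pair.
LOCKDIR=/tmp/myjob.lock.d

if ! mkdir "$LOCKDIR" 2>/dev/null; then
    echo "already running"
    exit 0
fi

# ... main routine ...

rmdir "$LOCKDIR"
```

A `trap 'rmdir "$LOCKDIR"' EXIT` after the mkdir makes the cleanup survive most abnormal exits as well.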

Born To Ride
  • This is the *only* way I've ever seen it implemented, personally. I use one as per the maintainer's suggestion as a mirror for an OSS project – warren Nov 09 '09 at 11:48
2

I would recommend using run-one command - much simpler than dealing with the locks. From the docs:

run-one is a wrapper script that runs no more than one unique instance of some command with a unique set of arguments. This is often useful with cronjobs, when you want no more than one copy running at a time.

run-this-one is exactly like run-one, except that it will use pgrep and kill to find and kill any running processes owned by the user and matching the target commands and arguments. Note that run-this-one will block while trying to kill matching processes, until all matching processes are dead.

run-one-constantly operates exactly like run-one except that it respawns "COMMAND [ARGS]" any time COMMAND exits (zero or non-zero).

keep-one-running is an alias for run-one-constantly.

run-one-until-success operates exactly like run-one-constantly except that it respawns "COMMAND [ARGS]" until COMMAND exits successfully (ie, exits zero).

run-one-until-failure operates exactly like run-one-constantly except that it respawns "COMMAND [ARGS]" until COMMAND exits with failure (ie, exits non-zero).
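Applied to the question's setup, usage is just a prefix on the crontab line (fragment; the job path is the hypothetical one used elsewhere in this thread):

```shell
* * * * * run-one /usr/local/bin/frequent_cron_job
```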

Yuri Astrakhan
1

Your cron daemon shouldn't be invoking jobs if previous instances of them are still running. I'm the developer of one cron daemon dcron, and we specifically try to prevent that. I don't know how Vixie cron or other daemons handle this.

dubiousjim
1

This might also be a sign that you're doing the wrong thing. If your jobs run that closely and that frequently, maybe you should consider de-cronning it and making it a daemon-style program.

  • I heartily disagree with this. If you have something that needs to run periodically, making it a daemon is a "sledgehammer for a nut" solution. Using a lockfile to prevent accidents is a perfectly reasonable solution I've never had a problem using. – womble Nov 09 '09 at 11:53
  • @womble I agree; but I like smashing nuts with sledgehammers! :-) – wzzrd Nov 09 '09 at 15:01
0

I have created a jar to solve the issue of duplicate running crons (which could be Java or shell crons). Just pass the cron name to Duplicates.CloseSessions("Demo.jar"); this will search for and kill any existing PIDs for that cron except the current one. I implemented a method to do this:

    String proname = ManagementFactory.getRuntimeMXBean().getName();
    String pid = proname.split("@")[0];
    System.out.println("Current PID:" + pid);

            // Find the PIDs of all processes matching the cron name
            // (grep -v grep keeps the grep process itself out of the list)
            Process proc = Runtime.getRuntime().exec(new String[]{"bash", "-c", "ps aux | grep " + cronname + " | grep -v grep | awk '{print $2}'"});

            BufferedReader stdInput = new BufferedReader(new InputStreamReader(proc.getInputStream()));
            String s = null;
            String killid = "";

            // Collect every matching PID except our own
            while ((s = stdInput.readLine()) != null) {
                if (!s.equals(pid)) {
                    killid = killid + s + " ";
                }
            }

Then kill the PIDs collected in killid with another shell command.

0

@Philip Reynolds' answer will start executing the code after the 5-second wait anyway, even without getting the lock, so I modified it to

(
  flock -w 5 -x 99 || exit 1
  ## Do your stuff here
) 99>/path/to/my.lock

so that the code is never executed simultaneously. Instead, after the 5-second wait, the process will exit with 1 if it did not get the lock by then.

user__42