
So I'm doing automated backups of a database. The backup script works properly, both when I run it manually, and also when Cron runs scheduled hourly and daily backups. The backup fails, however, on the weekly and monthly backups.

I am (obviously) not sure, but I suppose my problem is with the cron config. Perhaps a conflict because the script is being run multiple times at midnight? I'm not sure whether that's possible, but if so, I'd appreciate instructions on fine-tuning my crontab.

My crontab:

# *  *  *  *  * user-name  command to be executed
  00 *  *  *  *   /data/backup.sh -h  #hourly
  00 00 *  *  *   /data/backup.sh -d  #daily
  00 00 *  *  6   /data/backup.sh -w  #weekly
  00 00 1  *  *   /data/backup.sh -m  #monthly

Edit: I updated my crontab to stagger the minutes, but the weekly and monthly backups still fail:

# *  *  *  *  * user-name  command to be executed
  00 *  *  *  *   /data/backup.sh -h  #hourly
  05 00 *  *  *   /data/backup.sh -d  #daily
  10 00 *  *  6   /data/backup.sh -w  #weekly
  15 00 1  *  *   /data/backup.sh -m  #monthly

I access these via this command:

sudo crontab -u my_user_group_name -e

Linux version:

$ cat /proc/version 
Linux version 3.10.0-514.6.1.el7.x86_64 (builder@kbuilder.dev.centos.org) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Wed Jan 18 13:06:36 UTC 2017

The backup script works fine on its own when run manually from the shell with any of the flags (-h, -d, -w, -m). It is a WordPress backup script that uses wp-cli to export (essentially serialize) a MariaDB database. For the sake of completeness, I have included the script at the end of this question.

I have closely examined the general cron troubleshooting advice from this answer, but I do not see anything applicable to my issue:

  • I don't think the issue is with the backup script itself, as the issue occurs only during cron runs, not when it is run directly in the shell. Happy for somebody to prove me wrong.
  • I don't think the issue is with the broader environment, but is instead something to do with the cron configuration itself (above), since the issue occurs only during some cron runs, but others execute successfully. E.g., the Crontab is not misnamed, it has correct permissions, etc.
  • The cron answer says nothing about frequency of cron runs, conflicts between runs, or other dynamics which I believe are likely behind the problem.

Here are the permissions for the backup directories in question (in /data/backup/). As you can see, the hourly and weekly directories have the same permissions.

drwxr-xr-x. 2 libsys  libsys   4096 Feb 20 00:05 daily
drwxrwxr-x. 2 root    backup   4096 Feb 20 10:00 hourly
-rw-rw-r--. 1 root    backup  35644 Feb 20 10:00 log.txt
drwxrwxr-x. 2 root    backup   4096 Feb 13 11:23 manual
drwxrwxr-x. 2 aberry3 aberry3  4096 Feb  6 10:36 monthly
drwxrwxr-x. 2 aberry3 aberry3  4096 Feb  6 10:36 weekly

I just noticed the daily permissions don't have group write; I'll fix that and check back here in a week. It's probably a red herring, however; my problem is not with the daily backups, which work OK: only the weekly and monthly backups don't happen.

Here is the backup script:

#!/bin/bash

# Usage
# This script will make a backup of the WordPress database, into the
# defined backup directory, "/data/backups".
# Options are -hdwm, for "hourly", "daily", "weekly", "monthly"; these
# simply put the backups into different subdirectories.  Option -a runs
# several intervals in one pass; without options, no backup is made.
# The script also "cleans up" the directories afterward.

# constants
WP_DIR=/var/www/wordpress/docroot
DATA_DIR=/data/backups
LOG=$DATA_DIR/log.txt

# vars
TIMESTAMP=$(date +%Y-%m-%d.%H-%M-%S)

# run all commands from WP root directory
cd $WP_DIR

# the meat of the backup script
backup () { # arguments: "hourly", "daily", "weekly", "monthly", "manual"
  INTERVAL=$1
  BACKUP_DIR=$DATA_DIR/$INTERVAL

  # create directory hierarchy if not exists
  mkdir -p $BACKUP_DIR

  # create backup
  FILENAME=$(printf "%s/wp-mariadb-%s.sql" "$BACKUP_DIR" "$TIMESTAMP")
  /usr/local/bin/wp db export $FILENAME

  # make sure backup happened
  if [ -s $FILENAME ]
  then
      echo "√   backup OK   $TIMESTAMP $INTERVAL" >> $LOG
  else
      echo "!!! backup FAIL $TIMESTAMP $INTERVAL" >> $LOG
      exit 1 # terminate and indicate error
  fi

  # clean up backup directory
  BACKUP_FILES=$BACKUP_DIR/*.sql
  case $INTERVAL in
    "hourly")
      KEEP=24
      ;;
    "daily")
      KEEP=7
      ;;
    "weekly")
      KEEP=4
      ;;
    "monthly")
      KEEP=12
      ;;
    "manual")
      KEEP=999 # don't automatically delete manual backups
      ;;
  esac

  # evaluate which files to delete from directory
  for BACKUP in $BACKUP_FILES; do
    # if (BACKUP_FILES quantity > KEEP)
    # and if (BACKUP age in minutes) > (minutes ago)
      # delete backup
    ARR=($BACKUP_FILES) # convert to array
    LEN=${#ARR[@]} # length of array

    # if we have too many backups...
    if (($LEN > $KEEP)); then
      # ...delete the backup.
      rm $BACKUP
    fi
  done
}

# run particular backup scripts depending on options
while getopts "hdwma" arg; do
  case $arg in
    h)
      backup "hourly"
      ;;
    d)
      backup "daily"
      ;;
    w)
      backup "weekly"
      ;;
    m)
      backup "monthly"
      ;;
    a)
      # a stands for all; backup everywhere
      backup "hourly"
      backup "daily"
      backup "weekly"
      backup "monthly"
      ;;
    *)
      echo "Error: command not recognized"
      echo "!!! backup FAIL $TIMESTAMP illegal option in '$1'" >> $LOG
      ;;
  esac
done

Here is a sample of my log file, showing the problem:

...
√   backup OK   2017-02-17.22-00-01 hourly
√   backup OK   2017-02-17.23-00-01 hourly
√   backup OK   2017-02-18.00-00-02 hourly
√   backup OK   2017-02-18.00-05-01 daily
!!! backup FAIL 2017-02-18.00-10-02 weekly
√   backup OK   2017-02-18.01-00-01 hourly
√   backup OK   2017-02-18.02-00-02 hourly
√   backup OK   2017-02-18.03-00-02 hourly
√   backup OK   2017-02-18.04-00-02 hourly
√   backup OK   2017-02-18.05-00-01 hourly
√   backup OK   2017-02-18.06-00-01 hourly
√   backup OK   2017-02-18.07-00-01 hourly
√   backup OK   2017-02-18.08-00-02 hourly
√   backup OK   2017-02-18.09-00-02 hourly
√   backup OK   2017-02-18.10-00-01 hourly
√   backup OK   2017-02-18.11-00-04 hourly
√   backup OK   2017-02-18.12-00-03 hourly
√   backup OK   2017-02-18.13-00-02 hourly
√   backup OK   2017-02-18.14-00-02 hourly
√   backup OK   2017-02-18.15-00-01 hourly
√   backup OK   2017-02-18.16-00-02 hourly
√   backup OK   2017-02-18.17-00-04 hourly
√   backup OK   2017-02-18.18-00-02 hourly
√   backup OK   2017-02-18.19-00-02 hourly
√   backup OK   2017-02-18.20-00-02 hourly
√   backup OK   2017-02-18.21-00-02 hourly
√   backup OK   2017-02-18.22-00-03 hourly
√   backup OK   2017-02-18.23-00-02 hourly
√   backup OK   2017-02-19.00-00-03 hourly
√   backup OK   2017-02-19.00-05-02 daily
√   backup OK   2017-02-19.01-00-03 hourly
√   backup OK   2017-02-19.02-00-02 hourly
√   backup OK   2017-02-19.03-00-01 hourly
...
aljabear
  • 5
    You run parallel 3 instances of the script monthly and weekly. This can be a problem. – Ipor Sircer Feb 13 '17 at 17:48
  • 2
    Possible duplicate of [Why is my crontab not working, and how can I troubleshoot it?](http://serverfault.com/questions/449651/why-is-my-crontab-not-working-and-how-can-i-troubleshoot-it) – user9517 Feb 13 '17 at 17:50
  • @IporSircer yeah, I thought perhaps, but I figured the system would be smart enough to queue them – aljabear Feb 13 '17 at 17:53
  • @allanberry: sorry, cron doesn't have any mindreading capabilities yet. You have to be smart. – Ipor Sircer Feb 13 '17 at 18:00
  • @IporSircer ok, snarky response, but point taken. :) I've adjusted my crontab to stagger the script at 5min intervals, and I'll update this next week, I guess, after the weekly script has a chance to run – aljabear Feb 13 '17 at 18:08
  • I think you need to move the timing logic into the script. I don't think cron is your friend for this. IE run the script every hour, and have the script decide which level it needs to do. Or do something to insure only one copy of the script is running at one time. – Dylan Martin Feb 13 '17 at 18:23
  • 1
    @DylanMartin isn't "timing logic" what cron is for? – aljabear Feb 13 '17 at 18:31
  • Nope! ;-) Cron is really dumb. It's a guy who looks at a list and says "is it this time yet?" and throws a switch when it is. – Dylan Martin Feb 13 '17 at 18:52
  • Actually the backup script should be inhibiting being run multiple times. I can see in the above configuration where this might be problematic and cause corrupted backups depending on the contents of the scripting. – mdpc Feb 14 '17 at 21:35
  • man... downvote for what? tough crowd. – aljabear Feb 15 '17 at 17:04
  • @mdpc please elaborate; why would the backup script inhibit being run multiple times? What do you see in the above configuration which might be problematic? This stuff is exactly what I'm trying to discern, of course. – aljabear Feb 15 '17 at 17:11
  • I have updated the cron to not have overlapping runs; I am at a loss – aljabear Feb 20 '17 at 16:14
  • 1
    Did you take a look at the output of the actual backup process? Did you try to send emails via cron's `MAILTO=email@address.com` feature containing the full output for both the success cases as well as the failures? – Andre Klärner Feb 20 '17 at 22:35
  • @AndreKlärner Thanks for asking; yes I did; the mailto output is exactly the same as in the log file: just a failure to create the file. – aljabear Feb 21 '17 at 03:29
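
Several comments above suggest making sure only one copy of the script runs at a time. A minimal sketch of that idea follows, using an atomic mkdir as the lock. LOCK_DIR and the three function names are hypothetical and not part of the original script; where util-linux is installed, flock(1) is a common alternative. This only addresses the overlap concern, and since the failure persisted after the runs were staggered, it may not be the cause here.

```bash
#!/bin/bash
# Sketch: allow only one instance at a time using an atomic mkdir lock.
# LOCK_DIR is a hypothetical path, not taken from the original script.
LOCK_DIR=${LOCK_DIR:-/tmp/wp-backup.lock}

acquire_lock () {
  # mkdir is atomic: it succeeds for exactly one caller and
  # fails if the directory already exists
  mkdir "$LOCK_DIR" 2>/dev/null
}

release_lock () {
  rmdir "$LOCK_DIR" 2>/dev/null
}

run_exclusively () { # arguments: the command to run while holding the lock
  local rc=0
  if ! acquire_lock; then
    echo "another instance holds $LOCK_DIR; skipping" >&2
    return 75
  fi
  "$@" || rc=$?
  release_lock
  return $rc
}
```

In the script, a call such as `run_exclusively backup "weekly"` would replace the plain `backup "weekly"`. One caveat: if the script is killed while holding the lock, the directory stays behind and has to be removed by hand.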

3 Answers

3

This command:

sudo crontab -u my_user_group_name -e

Combined with the variety of user and group ownerships of your backup directories:

drwxr-xr-x. 2 libsys  libsys   4096 Feb 20 00:05 daily
drwxrwxr-x. 2 root    backup   4096 Feb 20 10:00 hourly
-rw-rw-r--. 1 root    backup  35644 Feb 20 10:00 log.txt
drwxrwxr-x. 2 root    backup   4096 Feb 13 11:23 manual
drwxrwxr-x. 2 aberry3 aberry3  4096 Feb  6 10:36 monthly
drwxrwxr-x. 2 aberry3 aberry3  4096 Feb  6 10:36 weekly

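Looks fishy. I'm guessing the actual user -- assuming you don't really have a user named my_user_group_name -- isn't aberry3. If I were to take a wild guess, I'd say the script is being run by libsys, which is a member of the backup group but not of the aberry3 group. That would explain the pattern: libsys owns daily and can write to hourly through the backup group, but weekly and monthly are writable only by aberry3 and its group.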

Since you're creating the directories in the script anyway, try renaming your existing ones and let the script create them with the owner/group of the actual user running the script.
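
To confirm the guess before renaming anything, a small check along these lines may help. It is only a sketch: check_backup_dirs is a hypothetical helper, and the DATA_DIR default is copied from the script in the question. It reports, for the user running it, which backup directories exist and are writable.

```bash
#!/bin/bash
# Sketch: report whether the current user can write to each backup directory.
# DATA_DIR default mirrors the script in the question; override it to test elsewhere.
DATA_DIR=${DATA_DIR:-/data/backups}

check_backup_dirs () {
  local failed=0 dir interval
  for interval in hourly daily weekly monthly manual; do
    dir=$DATA_DIR/$interval
    if [ -d "$dir" ] && [ -w "$dir" ]; then
      echo "OK    $dir"
    else
      echo "FAIL  $dir (missing or not writable by $(id -un))"
      failed=1
    fi
  done
  return $failed
}
```

The result is only meaningful when the function runs as the same user that owns the crontab, for example from a shell started with `sudo -u` for that user. Run as root, the writable test passes for every existing directory.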

Brandon Xavier
  • thanks. Yes I think this is the answer. Totally overlooked the directory ownership, and it was right there in what I posted. Total noob error I guess. I'll check this when I get in the office tomorrow. Much appreciated. – aljabear Feb 23 '17 at 04:33
1

Try:

sudo chown root:backup weekly

1

Add some more logging to the script.

Add "-x" to the first line of the script.

#!/bin/bash -x

This should give you more verbose output of the script in the email that is sent to the address that is specified in the "MAILTO" cron option.

Also, according to the documentation (https://wp-cli.org/commands/db/export/) you can add "--debug" to the "wp db export" command. Try like this.

/usr/local/bin/wp db export --debug $FILENAME

You can post the content of the mail that you'll receive when the script fails, that should give you/us enough data to pinpoint the problem.
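
If you would rather have the reason for a failure in log.txt than in the cron mail, something like the following could work. It is a sketch under assumptions: run_logged is a hypothetical helper, and the LOG default is copied from the script in the question. It runs a command and, only when that command fails, appends the exit code, the user name, and the command's error output to the log.

```bash
#!/bin/bash
# Sketch: run a command and append its error output to the log if it fails.
# LOG default mirrors the script in the question; run_logged is a hypothetical helper.
LOG=${LOG:-/data/backups/log.txt}

run_logged () { # arguments: the command to run
  local errfile rc=0
  errfile=$(mktemp)
  # keep stderr aside so it can be copied into the log on failure
  "$@" 2> "$errfile" || rc=$?
  if [ "$rc" -ne 0 ]; then
    {
      echo "!!! command failed (exit $rc) as user $(id -un): $*"
      sed 's/^/    /' "$errfile"
    } >> "$LOG"
  fi
  rm -f "$errfile"
  return $rc
}
```

In the backup function, the export line would then read `run_logged /usr/local/bin/wp db export "$FILENAME"`. Recording the user name alongside the error makes it easier to spot a run that happens under an unexpected account.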

user373333