27

I find it tedious to have to back up databases every week, and I also think weekly backups should be turned into daily backups. If I had to do that, I wouldn't want to do it manually. What's the best way to automate backing up PostgreSQL databases daily?

Randell
  • 1,133
  • 7
  • 18
  • 25
  • Small note: dumping the db can kill the performance, use a cluster and dump on non-active nodes. – neutrinus Sep 16 '15 at 19:14
  • You can make scheduled backups with the help of this free tool http://postgresql-backup.com/ – Olek Nilson May 05 '16 at 10:16
  • Next question: How large are you, what general backup mechanism do you have in place. For example, I am not ever backing up anything manually. Install agent of my backup system, select elements to be backed up in UI, schedule backup (in my case for databases: every 5 minutes).... finished. But that assumes having enough that it makes sense to install a proper system. – TomTom Jun 06 '16 at 18:43
  • I wouldn't call postgresql-backup.com a "free" solution. It is only free for the first 2 databases... @OlekNilson – aidanmelen Apr 24 '17 at 16:49

5 Answers

50

The same as for any other repetitive task that can be automated: you write a script to do the backup, and then set up a cron job to run it.

A script like the following, for instance:

(Note: it has to be run as the postgres user, or any other user with the same privileges)

#! /bin/bash

# backup-postgresql.sh
# by Craig Sanders <cas@taz.net.au>
# This script is public domain.  feel free to use or modify
# as you like.

DUMPALL='/usr/bin/pg_dumpall'
PGDUMP='/usr/bin/pg_dump'
PSQL='/usr/bin/psql'

# directory to save backups in, must be rwx by postgres user
BASE_DIR='/var/backups/postgres'
YMD=$(date "+%Y-%m-%d")
DIR="$BASE_DIR/$YMD"
mkdir -p "$DIR"
cd "$DIR"

# get list of databases in the system, excluding the template dbs
DBS=( $($PSQL --list --tuples-only |
          awk '!/template[01]/ && $1 != "|" {print $1}') )

# first dump entire postgres database, including pg_shadow etc.
$DUMPALL --column-inserts | gzip -9 > "$DIR/db.out.gz"

# next dump globals (roles and tablespaces) only
$DUMPALL --globals-only | gzip -9 > "$DIR/globals.gz"

# now loop through each individual database and backup the
# schema and data separately
for database in "${DBS[@]}" ; do
    SCHEMA="$DIR/$database.schema.gz"
    DATA="$DIR/$database.data.gz"
    INSERTS="$DIR/$database.inserts.gz"

    # export data from postgres databases to plain text:

    # dump schema
    $PGDUMP --create --clean --schema-only "$database" |
        gzip -9 > "$SCHEMA"

    # dump data
    $PGDUMP --disable-triggers --data-only "$database" |
        gzip -9 > "$DATA"

    # dump data as column inserts for a last resort backup
    $PGDUMP --disable-triggers --data-only --column-inserts \
        "$database" | gzip -9 > "$INSERTS"

done

# delete backup files older than 30 days
echo deleting old backup files:
find "$BASE_DIR/" -mindepth 1 -type d -mtime +30 -print0 |
    xargs -0r rm -rfv

EDIT:
The pg_dumpall -D switch is deprecated, now replaced with --column-inserts (which the script above already uses):
https://wiki.postgresql.org/wiki/Deprecated_Features
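
The "set up a cron job to run it" step could look like the following crontab entry for the postgres user; the script path and the 01:30 schedule are assumptions, so adjust them to wherever you saved the script:

```
# m h dom mon dow  command -- run the backup script daily at 01:30
30 1 * * * /usr/local/bin/backup-postgresql.sh
```

Install it with `crontab -e` as the postgres user, so the dump runs with the right privileges.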

cas
  • 6,653
  • 31
  • 34
  • 12
    +1 for a great script – rkthkr Aug 28 '09 at 05:57
  • I use something very similar as a PreDumpCmd for backuppc, with the exception that I don't encode the date into the path, as backuppc deals with keeping multiple copies. – David Pashley Aug 28 '09 at 06:03
  • 2
    Great script, but I found I needed to tweak the regex such that it didn't include pipes and blank lines as database names. `DBS=$($PSQL -l -t | egrep -v 'template[01]' | awk '{print $1}' | egrep -v '^\|' | egrep -v '^$')` – s29 Oct 09 '13 at 23:45
  • 1
    @s29 I think its better to use a direct query instead of all that grep hackery, like: DBS=( $(${PSQL} -t -A -c "select datname from pg_database where datname not in ('template0', 'template1')") ) – PolyTekPatrick Nov 10 '17 at 13:37
  • Awesome script - so how does this compare to services like ClusterControl? – karns Aug 27 '19 at 19:18
  • @karns I have no idea, I've never used ClusterControl (and am unlikely to ever do so. I've never heard of it before, but a quick google shows they claim to be "open source" but I can't find any sign of source code or even an explicit license other than a mention of "free to use" for a Community Edition and they require you to register for spam just to download a demo. nope.). I expect it doesn't compare at all, this is just a simple script to backup postgres databases. – cas Aug 27 '19 at 23:53
  • I found I had to use -d postgres when trying to avoid the grep. so my command was : psql -d postgres -t -A -c "select datname from pg_database where datname not in ('template0', 'template1')" – kdubs Apr 09 '21 at 15:56
8
Dump a single database to a compressed file:

pg_dump dbname | gzip > filename.gz

Reload with

createdb dbname
gunzip -c filename.gz | psql dbname

or

cat filename.gz | gunzip | psql dbname

If the dump is too large for a single file, use split. The split command lets you break the output into pieces whose size the underlying file system can accept. For example, to make chunks of 1 megabyte:

pg_dump dbname | split -b 1m - filename

Reload with

createdb dbname
cat filename* | psql dbname

You could toss one of those in /etc/cron.daily

Sourced from http://www.postgresql.org/docs/8.1/interactive/backup.html#BACKUP-DUMP-ALL
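
The split-and-reassemble step can be sanity-checked without a database; this sketch uses random bytes as a stand-in for pg_dump output and verifies the round trip that `cat filename* | psql` relies on:

```shell
#!/bin/sh
# Verify that split + cat reproduces the original stream byte-for-byte.
set -e
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

# 3 MB of random data as a stand-in for pg_dump output
head -c 3145728 /dev/urandom > "$tmp/dump.orig"

# split into 1 MB chunks (GNU split spells the suffix 'M')
split -b 1M "$tmp/dump.orig" "$tmp/chunk."

# reassemble in lexical order, which is what the shell glob gives cat
cat "$tmp"/chunk.* > "$tmp/dump.rebuilt"

cmp "$tmp/dump.orig" "$tmp/dump.rebuilt" && echo "round trip OK"
```

Because split names chunks in lexical order (chunk.aa, chunk.ab, ...), the glob reassembles them in the right sequence.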

amc
  • 105
  • 5
Nick Anderson
  • 669
  • 2
  • 5
  • 11
  • Splitting the file is a great idea. It is better to split the dump, using `split -C`, so that a line is never split. The debugging of a failed restore is easier. – Gianluca Della Vedova Sep 11 '14 at 07:12
4

Whatever commands you issue "by hand", write them to a script, and put a call to that script in cron or whatever scheduler you use.

You can of course make the script fancier, but generally I think you'll get there: start simple, and refine later.

Simplest possible script:

#!/bin/bash
/usr/local/pgsql/bin/pg_dumpall -U postgres -f /var/backups/backup.dump

Save it as /home/randell/bin/backup.sh, add to cron:

0 0 * * * /home/randell/bin/backup.sh
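
As a first refinement of the simple script above, you could factor the dump into a small function that date-stamps and compresses each run and prunes old files. This is only a sketch: the pg_dumpall path, backup directory, and 14-day retention are all assumptions.

```shell
#!/bin/sh
# do_backup DIR CMD [ARGS...] -- run CMD, gzip its stdout into a
# date-stamped file under DIR, and prune dumps older than 14 days.
do_backup() {
    dir=$1; shift
    mkdir -p "$dir"
    stamp=$(date +%Y-%m-%d)
    "$@" | gzip -9 > "$dir/backup-$stamp.dump.gz"
    find "$dir" -name 'backup-*.dump.gz' -mtime +14 -delete
}

# real invocation (paths are assumptions, adjust to your install):
# do_backup /var/backups/postgres /usr/local/pgsql/bin/pg_dumpall -U postgres
```

Date-stamping the filename means daily runs no longer overwrite each other, which is the main weakness of the one-liner above.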
  • If pg_dumpall is used, is it possible to restore a single table from it, or will it restore everything at once? Can you please share a script for restoring a single table from a dump created using dumpall? – Ashish Karpe Nov 26 '19 at 05:10
0

In case anyone has to back up their Postgres databases on a Windows machine without the aid of Cygwin etc., I have a batch file which does the job quite well.

This will back up the databases into individual files, in a new directory each day:

set dtnm=%date:~-4,4%%date:~-7,2%%date:~0,2%
set bdir=D:\backup\%dtnm%
mkdir %bdir%

FOR /F "tokens=1,2 delims=|" %%a IN ('psql -l -t -A -U postgres') DO (
    IF %%b EQU postgres pg_dump -U postgres -f %bdir%\%%a.sql.gz -Z 9 -i %%a
)
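
The batch file still needs a scheduler; on Windows you could register it with Task Scheduler, for example (the task name and script path are assumptions):

```
schtasks /Create /SC DAILY /ST 00:30 /TN "pg-backup" /TR "D:\backup\pg_backup.bat"
```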
l0ft13
  • 171
  • 7
0

If you want to back up an entire cluster with minimal system load, you can simply tar the root directory of the PostgreSQL cluster. For example:

echo "select pg_start_backup('full backup - `date`');" | psql
/usr/bin/rdiff-backup --force --remove-older-than 7D $BACKUP_TARGET
/usr/bin/rdiff-backup --include '/etc/postgresql' --include $PGDATA --exclude '/*' / $BACKUP_TARGET
/bin/tar -cjf /mnt/tmp/$SERVER_NAME.tbz2 $BACKUP_TARGET 2>&1
echo "select pg_stop_backup();" | psql

That's the bulk of my backup script.
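
On newer PostgreSQL releases, pg_basebackup wraps the pg_start_backup/pg_stop_backup sequence shown above into a single command; a minimal sketch (the target directory is an assumption):

```
pg_basebackup -D /var/backups/pg-base -Ft -z -P -U postgres
```

Here -Ft writes tar-format output, -z gzips it, and -P reports progress.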

lee
  • 599
  • 3
  • 7