I use pg_dump for my primary backup, once every three hours. I also use monit. When monit checks whether PostgreSQL is alive during the pg_dump run, the check sometimes times out and monit restarts postgres. This results in a failed backup.

What should I do? Move to write-ahead log (WAL) archiving? Disable monit during the backup? The database is serving an active web site at these times.

Monit config:

check process postgres with pidfile /usr/local/pgsql/data/postmaster.pid
  group database
  start program = "/etc/init.d/postgresql start"
  stop program = "/etc/init.d/postgresql stop"
  if failed unixsocket /tmp/.s.PGSQL.5432 protocol pgsql then restart
  if failed host 127.0.0.1 port 5432      protocol pgsql then restart
  if 5 restarts within 5 cycles then timeout
Terry G Lorber
  • What does your monit configuration look like? Can you simply configure it to be a bit more relaxed? Instead of restarting immediately when the service is not alive, maybe it should wait until the service has been in a failed state for a few cycles? – Zoredache Jan 09 '14 at 21:04

1 Answer

So something like this?

if failed unixsocket /tmp/.s.PGSQL.5432 protocol pgsql for 5 cycles then restart
if failed host 127.0.0.1 port 5432      protocol pgsql for 5 cycles then restart
if 5 restarts within 25 cycles then timeout

That way the monit check would have to fail for 15 minutes before a restart, assuming a 180-second cycle interval (5 cycles × 180 seconds). Obviously you can adjust to your tastes, but restarting after a single failed check can result in false positives if your server happens to be busy or otherwise occupied.
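
For reference, here is roughly how the whole check might look with those lines dropped in. This is only a sketch: the `set daemon 180` line is an assumption that matches the 180-second cycle used in the estimate above, so substitute whatever interval your monit instance actually polls at and redo the arithmetic in the comments.

```
# Assumption: monit polls every 180 seconds (one cycle = 180 s).
set daemon 180

check process postgres with pidfile /usr/local/pgsql/data/postmaster.pid
  group database
  start program = "/etc/init.d/postgresql start"
  stop program = "/etc/init.d/postgresql stop"
  # Restart only after 5 consecutive failed checks: 5 x 180 s = 15 minutes.
  if failed unixsocket /tmp/.s.PGSQL.5432 protocol pgsql for 5 cycles then restart
  if failed host 127.0.0.1 port 5432 protocol pgsql for 5 cycles then restart
  # Stop trying after 5 restarts within 25 cycles: 25 x 180 s = 75 minutes.
  if 5 restarts within 25 cycles then timeout
```

If your pg_dump run regularly takes longer than the failure window, you would probably want to raise the cycle count rather than the polling interval, so that a genuinely dead server is still noticed reasonably quickly.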

Zoredache
  • Thanks for this. pg_dump is probably the wrong tool for the job, but it's easiest for now. Will try loosening monit and see if that prevents a restart. The web application appears to be fine during these periods. – Terry G Lorber Jan 09 '14 at 21:51