I would say that heartbeat is what you're looking for.
If the monitored service (haproxy in your case) has an LSB-compliant init script, heartbeat will run the init script's status command. If that reports the service as down, heartbeat will attempt to start it. If starting it fails a couple of times, heartbeat will fail over to the other node. As long as the nodes have a way to communicate with each other, this is performed in a very controlled manner: the addresses are removed on one node and brought up on the other.
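The check, restart, then fail over sequence described above can be sketched roughly as follows. This is only an illustration of the decision logic, not heartbeat's actual code: `monitor_once`, `max_restarts`, and the two callables are hypothetical names, and the only outside fact relied on is that an LSB init script's status action exits with 0 when the service is running.

```python
from typing import Callable

# LSB init scripts exit with 0 from "status" when the service is running.
# This sketch treats every non-zero exit code as "not running".
LSB_RUNNING = 0


def monitor_once(
    status: Callable[[], int],
    start: Callable[[], int],
    max_restarts: int = 3,  # hypothetical limit, stands in for "a couple of times"
) -> str:
    """Return "ok", "restarted", or "failover" for one monitoring pass."""
    if status() == LSB_RUNNING:
        return "ok"
    for _ in range(max_restarts):
        start()  # try to bring the service back up locally
        if status() == LSB_RUNNING:
            return "restarted"
    # Local restarts did not help: hand the resources to the other node.
    return "failover"
```

In a real setup, `status` and `start` would wrap calls to the service's init script (for example via `subprocess.call`), and a "failover" result would mean releasing the addresses on this node so the other node can bring them up.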
If two heartbeat machines lose communication with each other, they may both attempt to perform failover. One way to solve this is to configure a STONITH plugin (Shoot The Other Node In The Head). This uses a management interface to attempt to power off the other node before the surviving node starts its own services. A mechanism like this is crucial if shared storage is involved in the failover.
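The ordering is the important part: fence first, and take over only once fencing has succeeded. A minimal sketch of that rule, with hypothetical names (`on_peer_lost`, `fence_peer`, `start_services`) that do not correspond to any real heartbeat API:

```python
from typing import Callable


def on_peer_lost(
    fence_peer: Callable[[], bool],
    start_services: Callable[[], None],
) -> bool:
    """Take over resources only after the peer is confirmed powered off.

    fence_peer is assumed to return True only when the management
    interface confirms that the other node is off.
    """
    if not fence_peer():
        # Fencing failed or could not be confirmed: starting services now
        # could leave both nodes active at once, so do nothing.
        return False
    start_services()
    return True
```

If fencing cannot be confirmed, the node stays passive. That costs availability, but it avoids two nodes writing to the same shared storage at the same time.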
Personally, I have never experienced haproxy dying - I consider it a very stable service. On the haproxy nodes, I use heartbeat only for failover of IP addresses.