
I set up a Corosync/Pacemaker cluster with HAProxy on Ubuntu 14.04 LTS using the following guide: http://www.sebastien-han.fr/blog/2012/04/15/active-passive-failover-cluster-on-a-mysql-galera-cluster-with-haproxy-lsb-agent/

I have not added the virtual IP setup; there are only two nodes, both with HAProxy installed. I am using the lsb:haproxy resource agent, and my config is as follows:

[Image: Corosync lsb:haproxy failed actions: insufficient privileges]

To test everything, I kill the haproxy process by running the following command: sudo kill -9 [PID#]

I then check the status of my cluster and receive the following error message: "Failed actions: insufficient privileges". I did not change the haproxy user/group definition, and my aisexec{} block uses root for both user and group.

What should my permissions be if I want Corosync/Pacemaker to manage Haproxy?

EDIT: When I run the service stop command below, haproxy restarts as expected, and crm status shows the haproxy daemon running normally:

# sudo service haproxy stop
# sudo crm status
HaproxyHA     (lsb:haproxy):    Started node1
Failed Actions:

But when I kill the pid manually, I keep seeing the error:

# sudo kill -9 $PID
HaproxyHA (lsb:haproxy): Started node1 (unmanaged) FAILED
Failed Actions:

After implementing the change Federico mentioned (/bin/kill $pid || return 7), my problem remains, and I find this in my logs:

pengine: warning: unpack_rsc_op: Processing failed op stop for HaproxyHA on node1: not running (7)
invulnarable27
  • Check /var/log/haproxy.log if there is any message – Federico Sierra Nov 12 '14 at 01:36
  • Check the cluster log file for what exactly went wrong (look for messages from the haproxy resource). Probably one of the haproxy directories or files has wrong permissions. – Federico Sierra Nov 12 '14 at 01:44
  • @FedericoSierra Corosync.log just emits: "Preventing HaproxyHA from restarting on node1: operation stop failed with insufficient privileges", but no actionable feedback. Haproxy only logs start and stop statuses, nothing useful for me to modify in the config file – invulnarable27 Nov 18 '14 at 02:05
  • can you post your crm config? – Federico Sierra Nov 19 '14 at 12:09
  • @FedericoSierra the image above is all there is in my crm config. Just added the lsb:haproxy resource and nothing else. – invulnarable27 Nov 19 '14 at 20:32
  • Have you set `property stonith-enabled=false` and `property no-quorum-policy=ignore`? – Federico Sierra Nov 19 '14 at 22:03
  • @FedericoSierra yes those are both set to the values you mentioned – invulnarable27 Nov 20 '14 at 00:37
  • Does failover occur? These messages may be normal when one of the nodes has failed. – Federico Sierra Nov 20 '14 at 01:28
  • @FedericoSierra No failover occurs when it is killed manually; it keeps showing the (unmanaged) FAILED error when I run [crm status]. Also, grepping for haproxy in ps aux returns nothing. The only way to get haproxy running and back to normal is to restart the corosync/pacemaker services – invulnarable27 Nov 20 '14 at 05:42

1 Answer


I think the problem is in the init script: it does not follow the LSB spec.

If you look at the function haproxy_stop, in file /etc/init.d/haproxy:

haproxy_stop()
{
    if [ ! -f $PIDFILE ] ; then
        # This is a success according to LSB
        return 0
    fi
    for pid in $(cat $PIDFILE) ; do
        /bin/kill $pid || return 4
    done
    rm -f $PIDFILE
    return 0
}

In particular, look at the line /bin/kill $pid || return 4. If the process has already been killed, /bin/kill fails and the function returns 4, which according to the spec means "user had insufficient privilege". That is not correct in this case.
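The failure mode described above can be reproduced outside the cluster. The following is a minimal sketch, not the packaged init script: demo_stop is a hypothetical copy of the stop logic, the pidfile is a temporary file, and a throwaway sleep process stands in for haproxy. It also uses the shell's built-in kill rather than /bin/kill, on the assumption that both fail the same way for a PID that no longer exists.

```shell
#!/bin/sh
# Hypothetical copy of the stop logic, taking the pidfile path as an argument.
demo_stop() {
    pidfile=$1
    [ -f "$pidfile" ] || return 0
    for pid in $(cat "$pidfile"); do
        # kill fails when the PID no longer exists, so this returns 4.
        kill "$pid" 2>/dev/null || return 4
    done
    rm -f "$pidfile"
    return 0
}

# Start a stand-in daemon and record its PID in a temporary pidfile.
sleep 300 &
stale_pid=$!
tmp_pidfile=$(mktemp)
echo "$stale_pid" > "$tmp_pidfile"

# Kill it hard, as in the question, so the pidfile is left behind and stale.
kill -9 "$stale_pid"
wait "$stale_pid" 2>/dev/null || true

# Run the stop logic against the stale pidfile and capture its return code.
demo_rc=0
demo_stop "$tmp_pidfile" || demo_rc=$?
echo "stop returned $demo_rc"
rm -f "$tmp_pidfile"
```

The stop action reports 4 even though nothing privilege-related happened; the process was simply gone while its pidfile remained.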

In case of an error while processing any init-script action except for status, the init script shall print an error message and exit with a non-zero status code:

1 generic or unspecified error (current practice)
2 invalid or excess argument(s)
3 unimplemented feature (for example, "reload")
4 user had insufficient privilege
5 program is not installed
6 program is not configured
7 program is not running
8-99  reserved for future LSB use
100-149   reserved for distribution use
150-199   reserved for application use
200-254   reserved
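To read a return code against the table above, a small lookup can help. This is only an illustrative sketch: lsb_exit_meaning is a hypothetical helper name, it covers the codes for actions other than status, and it treats 0 as success.

```shell
#!/bin/sh
# Hypothetical helper: map an LSB init-script exit code (non-status actions)
# to the meaning listed in the table.
lsb_exit_meaning() {
    case "$1" in
        0) echo "success" ;;
        1) echo "generic or unspecified error" ;;
        2) echo "invalid or excess argument(s)" ;;
        3) echo "unimplemented feature" ;;
        4) echo "user had insufficient privilege" ;;
        5) echo "program is not installed" ;;
        6) echo "program is not configured" ;;
        7) echo "program is not running" ;;
        *) echo "reserved range (see table)" ;;
    esac
}

lsb_exit_meaning 4
lsb_exit_meaning 7
```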

You can try changing it to:

/bin/kill $pid || return 7

The correct way is to stop the daemon with killproc(8); if that fails, killproc sets the return value according to LSB.

E.g.:

/sbin/killproc -p $PIDFILE $HAPROXY

sends the signal SIGTERM to the pid found in $PIDFILE if and only if this pid belongs to $HAPROXY. If the named $PIDFILE does not exist, killproc assumes that the daemon of $HAPROXY is not running. The exit status is set to 0 for successfully delivering the default signals SIGTERM and SIGKILL otherwise to 7 if the program was not running. It is also successful if no signal was specified and no program was there for Termination because it is already terminated.
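Where killproc is not available, a similar distinction can be approximated in plain shell. The sketch below is an assumption-laden illustration rather than the packaged script: stop_by_pidfile is a hypothetical name, it relies on Linux's /proc to tell "process gone" apart from "process exists but cannot be signalled", and it returns 0 for a stale pidfile because LSB treats stop on a service that is not running as success.

```shell
#!/bin/sh
# Hypothetical helper, not part of the haproxy package.
stop_by_pidfile() {
    pidfile=$1
    # No pidfile: already stopped, which counts as success for "stop".
    [ -f "$pidfile" ] || return 0
    for pid in $(cat "$pidfile"); do
        if kill -0 "$pid" 2>/dev/null; then
            # The process exists and we may signal it: ask it to terminate.
            kill "$pid" || return 1
        elif [ -d "/proc/$pid" ]; then
            # The process exists but we may not signal it: a real privilege problem.
            return 4
        fi
        # Otherwise the PID is gone (stale pidfile) and there is nothing to stop.
    done
    rm -f "$pidfile"
    return 0
}
```

Called as stop_by_pidfile "$PIDFILE", it returns 4 only when a live process refuses the signal, so a daemon killed with -9 no longer surfaces as a privilege error.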

Federico Sierra
  • Please see my edit above. When I tried your second suggestion, it kept complaining that [killproc] was not a known program name, so I had to change it back to [/bin/kill $pid || return 7] – invulnarable27 Nov 19 '14 at 06:39