keepalived registers failure but won't failover

Question

I'm running 2 keepalived servers that share a common IP (Server1 and Server2). Server1 is the master, and whenever haproxy dies, Server2 should take over. If Server1 comes back up, Server2 should release the vIP and let Server1 take over again.

I got that running using the two configs below; however, I recently noticed that it has stopped working.

The servers are running CentOS 7 and are fully updated. If I manually kill keepalived on Server1, the vIP fails over to Server2, and when keepalived comes back up, Server1 takes over again. However, if I kill haproxy, keepalived registers that the check_haproxy check failed, but doesn't fail over.

Just to make sure it's not the FW or SELinux, I've removed all IPtables rules and disabled SELinux.

The configs are:

Server1

global_defs {
    # Keepalived process identifier
    # Probably should be unique: http://www.keepalived.org/LVS-NAT-Keepalived-HOWTO.html
    lvs_id haproxy_DH
}
# Script used to check if HAProxy is running
vrrp_script check_haproxy {
    script "killall -0 haproxy"
    interval 2
    weight 2
}
# Virtual interface
# The priority specifies the order in which the assigned interface to take over in a failover
vrrp_instance VI_01 {
    state EQUAL
    interface eno16777984
    virtual_router_id 51
    notify /etc/keepalived/notify.sh
    priority 100
    # The virtual ip address shared between the two loadbalancers
    virtual_ipaddress {
        10.9.17.20
        10.9.17.19
    }
    track_script {
        check_haproxy
    }
}

Server2

global_defs {
    # Keepalived process identifier
    # Probably should be unique: http://www.keepalived.org/LVS-NAT-Keepalived-HOWTO.html
    lvs_id haps2a

# Script used to check if HAProxy is running
vrrp_script check_haproxy {
    script "killall -0 haproxy"
    interval 2
    weight 2
}
# Virtual interface
# The priority specifies the order in which the assigned interface to take over in a failover
vrrp_instance VI_01 {
    state EQUAL
    interface eno16777984
    virtual_router_id 51
    notify /etc/keepalived/notify.sh
    priority 100
    # The virtual ip address shared between the two loadbalancers
    virtual_ipaddress {
        10.9.17.20
        10.9.17.19
    }
    track_script {
        check_haproxy
    }
}

score 1 · Answer 1 · answered Mar 23 '16 at 15:32

Besides fixing the syntax in your configuration files, be aware that killall is no longer installed by default on CentOS 7. You will need to install the psmisc package, or you can use script "pidof haproxy" instead.

HTF

Anderson Medeiros Gomes · Accepted Answer · 2016-03-23T15:12:59.317

I could not find any documentation that explains the meaning of state EQUAL. I usually set the initial state to BACKUP and let the election process choose the master instance.

I copied your configuration files to a lab environment and found that the closing brace for global_defs is missing in Server2's keepalived.conf. However, ~~the failover seemed to work well despite the missing character~~.

Please use tcpdump -i eno16777984 vrrp to check whether unrelated VRRP packets with VRID=51 are present, or try changing virtual_router_id to another number. Because VRRP packets are sent to the multicast address 224.0.0.18, each virtual IP in a network must use a unique VRID.

Also, if you intend to let Server1 take over the virtual IP, I suggest setting priority 101 in its vrrp_instance. RFC 5798, section 6.4.3 (Master), says that if Server1's IP address is greater than Server2's IP address and both servers have the same priority, Server1 wins the election and gets the virtual IP. However, keepalived seems to compare priorities only.
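The effect of giving Server1 priority 101 can be checked with a little arithmetic. Below is a minimal Python sketch, assuming keepalived's usual handling of a track_script weight: a positive weight is added to the priority while the check succeeds, and a negative weight is added while the check fails. The function names are hypothetical, and this is a simplified model rather than keepalived's own logic; it does not cover ties or weight 0.

```python
def effective_priority(base, weight, check_ok):
    """Priority a node would advertise, given its track_script result.

    Simplified model: assumes a single tracked script with a nonzero weight.
    """
    if weight > 0:
        # Positive weight: bonus while the check succeeds.
        return base + weight if check_ok else base
    # Negative weight: penalty while the check fails.
    return base if check_ok else base + weight


def elect_master(nodes):
    """Return the node name with the highest effective priority.

    nodes maps a name to (base_priority, weight, check_ok). Ties are not
    modelled here.
    """
    return max(nodes, key=lambda name: effective_priority(*nodes[name]))


# Server1 at priority 101, Server2 at 100, both with weight 2.
healthy = {"Server1": (101, 2, True), "Server2": (100, 2, True)}
haproxy_down = {"Server1": (101, 2, False), "Server2": (100, 2, True)}

print(elect_master(healthy))       # Server1 (103 vs 102)
print(elect_master(haproxy_down))  # Server2 (101 vs 102)
```

With these numbers the weight of 2 is larger than the 1-point gap between the base priorities, so a failed check on Server1 is enough to hand the virtual IP to Server2, and Server1 gets it back once haproxy recovers.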

EDIT: Actually, I forgot to remove the closing brace in my second test. In fact, keepalived's startup process ignores the missing brace, but failover doesn't work at runtime.
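Since keepalived starts without complaining about the missing brace, it can help to check the file before reloading. The following is a rough Python sketch of such a check, not a real keepalived parser: the function name check_braces is hypothetical, and it assumes that # and ! start a comment and that braces inside double quotes should be ignored.

```python
def check_braces(conf_text):
    """Return a list of brace problems; an empty list means they balance."""
    problems = []
    open_lines = []  # line numbers of '{' that have not been closed yet
    for lineno, line in enumerate(conf_text.splitlines(), start=1):
        in_quote = False
        for ch in line:
            if ch == '"':
                in_quote = not in_quote
            elif in_quote:
                continue  # ignore everything inside a quoted string
            elif ch in "#!":
                break  # rest of the line is treated as a comment
            elif ch == "{":
                open_lines.append(lineno)
            elif ch == "}":
                if open_lines:
                    open_lines.pop()
                else:
                    problems.append(f"line {lineno}: unexpected '}}'")
    for lineno in open_lines:
        problems.append(f"line {lineno}: '{{' is never closed")
    return problems


# Shortened version of Server2's config, with the same missing brace.
sample = """global_defs {
    lvs_id haps2a

vrrp_script check_haproxy {
    script "killall -0 haproxy"
    interval 2
    weight 2
}
"""
print(check_braces(sample))  # ["line 1: '{' is never closed"]
```

To run it against a real file, pass in the file's contents, for example check_braces(open("/etc/keepalived/keepalived.conf").read()).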
