Piranha/Pulse, lvs.cf with persistence and server failure

Question

We have the following setup:

RedHat 6
LVS set up to fail over between two webservers
Connection persistence of 900 seconds

It's a pretty simple setup. However, when a server is marked as failed, the piranha/pulse/nanny process sets that server's weight in the table to 0 instead of removing it. This means persistent connections remain attached to the failed server, which defeats the load balancing.
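
The weight-0 state described here can be seen from the director with ipvsadm -L -n. A minimal sketch, assuming the usual column layout of that listing (the function name quiesced_servers is made up for illustration), that prints the real servers currently held at weight 0:

```shell
#!/bin/sh
# Hypothetical helper: reads `ipvsadm -L -n` output on stdin and prints
# every real server whose weight is 0 (quiesced, but still in the table).
# Assumes the standard listing, where real-server rows look like:
#   -> 10.1.1.51:80   Masq   0   0   0
quiesced_servers() {
    # Column 4 is the weight; the header row has "Weight" there and is skipped.
    awk '$1 == "->" && $4 == "0" { print $2 }'
}
```

Typical use would be ipvsadm -L -n | quiesced_servers, run as root on the director.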

How can we tell nanny to force the failed node out so that persistent connections fail over to a working node?

Thanks

We have the following lvs.cf:

serial_no = 201305302344
primary = 10.1.1.45
service = lvs
backup = 0.0.0.0
heartbeat = 1
heartbeat_port = 539
keepalive = 6
deadtime = 18
network = nat
nat_router = 10.1.1.70 eth0:1
nat_nmask = 255.255.255.0
debug_level = NONE
virtual http {
     active = 1
     address = 10.1.1.70 eth0:1
     vip_nmask = 255.255.255.0
     persistent = 900
     pmask = 255.255.255.0
     port = 80
     send = "GET / HTTP/1.0\r\n\r\n"
     expect = "HTTP/1.1 200 OK"
     use_regex = 0
     load_monitor = none
     scheduler = wlc
     protocol = tcp
     timeout = 6
     reentry = 15
     quiesce_server = 1
     server web1 {
         address = 10.1.1.51
         active = 1
         weight = 1
     }
     server web2 {
         address = 10.1.1.52
         active = 1
         weight = 1
     }
}
virtual https {
     active = 1
     address = 10.1.1.70 eth0:1
     vip_nmask = 255.255.255.0
     port = 443
     persistent = 900
     pmask = 255.255.255.0
     send = "GET / HTTP/1.0\r\n\r\n"
     expect = "up"
     use_regex = 0
     load_monitor = none
     scheduler = wlc
     protocol = tcp
     timeout = 6
     reentry = 15
     quiesce_server = 1
     server web1 {
         address = 10.1.1.51
         active = 1
         weight = 1
     }
     server web2 {
         address = 10.1.1.52
         active = 1
         weight = 1
     }
}

score 1 · Accepted Answer · answered Jun 11 '13 at 18:47

Try echo 1 > /proc/sys/net/ipv4/vs/expire_quiescent_template

More details here:

http://www.austintek.com/LVS/LVS-HOWTO/HOWTO/LVS-HOWTO.persistent_connection.html
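
As the kernel's IPVS sysctl documentation describes it, a non-zero expire_quiescent_template makes the director expire persistence templates whose real server is quiescent (weight 0), so the next connection from that client is scheduled afresh. A minimal sketch of applying it with a guard for a missing ip_vs module; the function name and its optional path argument are illustrative assumptions, not part of any tool:

```shell
#!/bin/sh
# Hypothetical wrapper around the one-liner above.
# $1 (optional) overrides the tunable path, e.g. to try it on a scratch file.
enable_expire_quiescent() {
    tunable="${1:-/proc/sys/net/ipv4/vs/expire_quiescent_template}"
    if [ ! -e "$tunable" ]; then
        echo "tunable not found: $tunable (is ip_vs loaded?)" >&2
        return 1
    fi
    # Write the new value, then echo back what the file now holds.
    echo 1 > "$tunable" && cat "$tunable"
}
```

Calling enable_expire_quiescent as root on the director applies the setting until the next reboot. To keep it, the equivalent sysctl key net.ipv4.vs.expire_quiescent_template = 1 can go in /etc/sysctl.conf, assuming the ip_vs module is already loaded when that file is applied.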

dmourati

This worked for me, thank you. It does seem to remove the ability to gracefully remove servers; however, as of right now I don't have that requirement. – Antitribu Jun 12 '13 at 08:49

I recall having both the ability to leave existing connections running and the ability to force new ones to a new server. I suppose it depends. Read that austintek LVS HOWTO link; there is a ton of information on LVS buried there. – dmourati Jun 12 '13 at 16:21

score 1 · Answer 2 · answered Jun 11 '13 at 21:25

You have to trigger a script on failure/recovery of a real server that removes/adds that server.

I use lvs-kiss for this, which has a syntax to include scripts for these cases.
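
A rough sketch of what such failure/recovery hooks might run, using plain ipvsadm rather than any lvs-kiss-specific syntax (which is not shown here). The function names and the IPVSADM override are illustrative assumptions; the VIP and NAT forwarding (-m) follow the lvs.cf in the question:

```shell
#!/bin/sh
# Set IPVSADM="echo ipvsadm" to print the commands instead of running them.
IPVSADM="${IPVSADM:-ipvsadm}"
VIP="10.1.1.70"   # virtual IP taken from the question's lvs.cf

# Both hooks take: $1 = real server IP, $2 = port

on_failure() {
    # Delete the real server from the virtual service.
    $IPVSADM -d -t "$VIP:$2" -r "$1:$2"
}

on_recovery() {
    # Re-add it with masquerading (NAT) and weight 1.
    $IPVSADM -a -t "$VIP:$2" -r "$1:$2" -m -w 1
}
```

For example, on_failure 10.1.1.51 80 would drop web1 from the http service, and on_recovery 10.1.1.51 80 would put it back.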

Nils

This would probably be a better long-term solution, as the expire option removes the ability to gracefully remove servers. Thanks, I'll investigate it when I have time! – Antitribu Jun 12 '13 at 08:48
