I'm using keepalived to provide high availability between two AlmaLinux 8 Nginx servers (hosted on VMware, if that's of any relevance). When firewalld is running, both hosts start to respond on the virtual IP, even though a rich rule is set for VRRP:

root@dca-nfs01:~# arping 172.31.5.233
60 bytes from 00:50:56:84:ac:d0 (172.31.5.233): index=39 time=1.960 usec
60 bytes from 00:50:56:84:ac:d0 (172.31.5.233): index=40 time=20.660 usec
60 bytes from 00:50:56:84:52:ed (172.31.5.233): index=41 time=24.930 usec
60 bytes from 00:50:56:84:ac:d0 (172.31.5.233): index=42 time=534.616 msec
60 bytes from 00:50:56:84:52:ed (172.31.5.233): index=43 time=534.646 msec

My keepalived config is taken from a standard tutorial template and looks as follows:

[root@dca-ngx01-al ~]# cat /etc/keepalived/keepalived.conf
global_defs {
  # Keepalived process identifier
  router_id nginx
}

# Script to check whether Nginx is running or not
vrrp_script check_nginx {
  script "/sbin/pidof nginx"
  interval 2
  weight 50
}

# Virtual interface - the priority determines which host takes over the virtual IP in a failover
vrrp_instance VI_01 {
  state MASTER
  interface ens192
  virtual_router_id 151
  priority 110

  # The virtual IP address that floats between the two Nginx web servers
  virtual_ipaddress {
    172.31.5.233
  }
  track_script {
    check_nginx
  }
  authentication {
    auth_type AH
    auth_pass secret
  }
}

Both boxes have a simple single-zone firewall, and I have added a rich rule to allow VRRP communication between the two hosts:

[root@dca-ngx01-al ~]# firewall-cmd --list-all
public (active)
  target: default
  icmp-block-inversion: no
  interfaces: ens192
  sources:
  services: dhcpv6-client http https ssh
  ports: 10050/tcp
  protocols:
  forward: no
  masquerade: no
  forward-ports:
  source-ports:
  icmp-blocks:
  rich rules:
        rule protocol value="vrrp" accept

I have also set net.ipv4.ip_forward = 1 in /etc/sysctl.conf.

When firewalld is stopped on both boxes, keepalived behaves correctly, but when it is enabled, both sides appear to lose touch with each other and just send out repeated gratuitous ARP packets:

● keepalived.service - LVS and VRRP High Availability Monitor
   Loaded: loaded (/usr/lib/systemd/system/keepalived.service; enabled; vendor preset: disabled)
   Active: active (running) since Fri 2022-03-25 12:48:25 GMT; 2h 35min ago
  Process: 7140 ExecReload=/bin/kill -HUP $MAINPID (code=exited, status=0/SUCCESS)
  Process: 12966 ExecStart=/usr/sbin/keepalived $KEEPALIVED_OPTIONS (code=exited, status=0/SUCCESS)
 Main PID: 12967 (keepalived)
    Tasks: 2 (limit: 11406)
   Memory: 1.8M
   CGroup: /system.slice/keepalived.service
           ├─12967 /usr/sbin/keepalived -D
           └─12968 /usr/sbin/keepalived -D

Mar 25 15:08:15 dca-ngx01-al.REDACTED.local Keepalived_vrrp[12968]: Sending gratuitous ARP on ens192 for 172.31.5.233
Mar 25 15:08:15 dca-ngx01-al.REDACTED.local Keepalived_vrrp[12968]: Sending gratuitous ARP on ens192 for 172.31.5.233
Mar 25 15:08:15 dca-ngx01-al.REDACTED.local Keepalived_vrrp[12968]: Sending gratuitous ARP on ens192 for 172.31.5.233
Mar 25 15:08:15 dca-ngx01-al.REDACTED.local Keepalived_vrrp[12968]: Sending gratuitous ARP on ens192 for 172.31.5.233
Mar 25 15:08:18 dca-ngx01-al.REDACTED.local Keepalived_vrrp[12968]: (VI_01) Sending/queueing gratuitous ARPs on ens192 for 1>
Mar 25 15:08:18 dca-ngx01-al.REDACTED.local Keepalived_vrrp[12968]: Sending gratuitous ARP on ens192 for 172.31.5.233
Mar 25 15:08:18 dca-ngx01-al.REDACTED.local Keepalived_vrrp[12968]: Sending gratuitous ARP on ens192 for 172.31.5.233
Mar 25 15:08:18 dca-ngx01-al.REDACTED.local Keepalived_vrrp[12968]: Sending gratuitous ARP on ens192 for 172.31.5.233
Mar 25 15:08:18 dca-ngx01-al.REDACTED.local Keepalived_vrrp[12968]: Sending gratuitous ARP on ens192 for 172.31.5.233
Mar 25 15:08:18 dca-ngx01-al.REDACTED.local Keepalived_vrrp[12968]: Sending gratuitous ARP on ens192 for 172.31.5.233

I can, however, see with tcpdump that regular VRRP packets from the other host are at least reaching the network interface while firewalld is active:

tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ens192, link-type EN10MB (Ethernet), capture size 262144 bytes
15:25:21.532300 IP dca-ngx02-al.REDACTED.local > vrrp.mcast.net: AH(spi=0xac1f05e5,seq=0x3160): VRRPv2, Advertisement, vrid 151, prio 150, authtype ah, intvl 1s, length 20
15:25:22.532419 IP dca-ngx02-al.REDACTED.local > vrrp.mcast.net: AH(spi=0xac1f05e5,seq=0x3161): VRRPv2, Advertisement, vrid 151, prio 150, authtype ah, intvl 1s, length 20
15:25:23.532476 IP dca-ngx02-al.REDACTED.local > vrrp.mcast.net: AH(spi=0xac1f05e5,seq=0x3162): VRRPv2, Advertisement, vrid 151, prio 150, authtype ah, intvl 1s, length 20
15:25:24.532544 IP dca-ngx02-al.REDACTED.local > vrrp.mcast.net: AH(spi=0xac1f05e5,seq=0x3163): VRRPv2, Advertisement, vrid 151, prio 150, authtype ah, intvl 1s, length 20

Does anybody have any ideas as to how I can further troubleshoot this issue?

Thanks in advance.

1 Answer


This morning I figured out the cause of the issue, in case this helps somebody in the future. I enabled LogDenied=all in /etc/firewalld/firewalld.conf, and was then able to identify which packets were still being dropped by firewalld using the --get-log-denied switch:

[root@dca-ngx02-al keepalived]# firewall-cmd --get-log-denied
Mar 28 08:40:04 dca-ngx01-al.REDACTED.local kernel: FINAL_REJECT: IN=ens192 OUT= MAC=01:00:5e:00:00:12:00:50:56:84:ac:d0:08:00 SRC=172.31.5.229 DST=224.0.0.18 LEN=64 TOS=0x00 PREC=0xC0 TTL=255 ID=79 PROTO=AH SPI=0xac1f05e5
Mar 28 08:40:05 dca-ngx01-al.REDACTED.local kernel: FINAL_REJECT: IN=ens192 OUT= MAC=01:00:5e:00:00:12:00:50:56:84:ac:d0:08:00 SRC=172.31.5.229 DST=224.0.0.18 LEN=64 TOS=0x00 PREC=0xC0 TTL=255 ID=80 PROTO=AH SPI=0xac1f05e5
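
The FINAL_REJECT entries above are kernel log messages, so on a busier host it can help to tally them instead of reading them one by one. A minimal sketch, assuming GNU grep and sed; `summarise_denied` is a hypothetical helper name and not part of firewalld:

```shell
# Hypothetical helper: count firewalld denial log lines per protocol, source and destination.
# Reads log lines on stdin and prints "<count> <PROTO> <SRC> <DST>".
# Typical use (assumption: the denials are in the kernel journal):
#   journalctl -k --no-pager | summarise_denied
summarise_denied() {
  # Keep only lines carrying a firewalld-style *_REJECT or *_DROP prefix,
  # pull out the PROTO, SRC and DST fields, then count identical triples.
  grep -E '_(REJECT|DROP): ' |
    sed -nE 's/.*SRC=([^ ]+) DST=([^ ]+).* PROTO=([^ ]+).*/\3 \1 \2/p' |
    sort | uniq -c
}
```

Fed the two entries shown above, it groups both under AH from 172.31.5.229 to 224.0.0.18, which makes it obvious that the rejected traffic is AH and not plain VRRP.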

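The rejected packets are AH (PROTO=AH), not VRRP: with auth_type AH, keepalived wraps each advertisement in an IPsec Authentication Header (IP protocol 51), so the existing rich rule for protocol vrrp (IP protocol 112) never matches them. I resolved the issue by adding a further rich rule that accepts AH packets: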

firewall-cmd --add-rich-rule='rule protocol value="ah" accept' --permanent
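
Note that --permanent only writes the rule to the saved configuration; the running firewall does not pick it up until it is reloaded. A rough sketch of the full sequence, to be run as root on each keepalived host. The `allow_ah` function name and the `FIREWALL_CMD` override are hypothetical, added here only so the steps can be dry-run against a stub:

```shell
# Hypothetical helper: save the AH rule, reload firewalld, then list the active rich rules.
# FIREWALL_CMD is an assumed override for dry runs; it defaults to the real firewall-cmd.
allow_ah() {
  fw=${FIREWALL_CMD:-firewall-cmd}
  # 1. write the rule to the permanent configuration
  # 2. load the permanent configuration into the running firewall
  # 3. print the rich rules now in effect so the new entry can be confirmed
  "$fw" --permanent --add-rich-rule='rule protocol value="ah" accept' &&
    "$fw" --reload &&
    "$fw" --list-rich-rules
}
```

After calling `allow_ah` on both hosts, the rich rules list should show the "ah" rule alongside the existing "vrrp" one, and only one host should answer arping for the virtual IP.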