I have two machines, each configured identically as a firewall/load balancer for a busy website. I have set them up with CARP and pfsync on both the internal and external interfaces. The internal interface is behaving as expected (primary listed as MASTER and secondary listed as BACKUP), but the external one is not.

On both machines, the network interfaces are as follows:

  • em0 - External interface
  • bge0 - Internal interface
  • bge1 - Crossover connection between both machines
  • carp0 - Shared external interface for CARP
  • carp1 - Shared internal interface for CARP

I've rewritten the IP addresses and MAC addresses below. The networks are as follows:

  • 10.0.1.0/24 - External network
  • 10.0.2.0/24 - Internal network
  • 10.0.3.0/24 - Crossover network

Here's the output from ifconfig on the primary:

em0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
    options=19b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4>
    ether [SNIP]
    inet 10.0.1.10 netmask 0xffffff00 broadcast 10.0.1.255
    media: Ethernet 100baseTX <full-duplex>
    status: active
bge0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
    options=9b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM>
    ether [SNIP]
    inet 10.0.2.10 netmask 0xffffff00 broadcast 10.0.2.255
    media: Ethernet 1000baseT <full-duplex>
    status: active
bge1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
    options=9b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM>
    ether [SNIP]
    inet 10.0.3.10 netmask 0xffffff00 broadcast 10.0.3.255
    media: Ethernet 1000baseT <full-duplex>
    status: active
lo0: flags=80c9<UP,LOOPBACK,RUNNING,NOARP,MULTICAST> metric 0 mtu 16384
    options=3<RXCSUM,TXCSUM>
    inet6 fe80::1%lo0 prefixlen 64 scopeid 0x4 
    inet6 ::1 prefixlen 128 
    inet 127.0.0.1 netmask 0xff000000 
pflog0: flags=141<UP,RUNNING,PROMISC> metric 0 mtu 33152
pfsync0: flags=0<> metric 0 mtu 1460
    pfsync: syncdev: bge1 syncpeer: 10.0.3.11 maxupd: 128
carp0: flags=49<UP,LOOPBACK,RUNNING> metric 0 mtu 1500
    inet 10.0.1.5 netmask 0xffffff00 
    carp: MASTER vhid 1 advbase 1 advskew 0
carp1: flags=49<UP,LOOPBACK,RUNNING> metric 0 mtu 1500
    inet 10.0.2.5 netmask 0xffffff00 
    carp: MASTER vhid 2 advbase 1 advskew 0

And here's the /etc/rc.conf excerpt from the primary:

defaultrouter="10.0.1.1"
network_interfaces="em0 bge0 bge1 lo0 pfsync0"
cloned_interfaces="carp0 carp1"
ifconfig_em0="inet 10.0.1.10 netmask 255.255.255.0 media 100BaseTX mediaopt full-duplex"
ifconfig_bge0="inet 10.0.2.10 netmask 255.255.255.0 media 1000BaseTX mediaopt full-duplex"
ifconfig_bge1="inet 10.0.3.10 netmask 255.255.255.0 media 1000BaseTX mediaopt full-duplex"
ifconfig_carp0="vhid 1 pass [SNIP] 10.0.1.5/24"
ifconfig_carp1="vhid 2 pass [SNIP] 10.0.2.5/24"
pfsync_enable="YES"
pfsync_syncdev="bge1"
pfsync_syncpeer="10.0.3.11"

And here's the output on the secondary:

em0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
    options=19b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4>
    ether [SNIP]
    inet 10.0.1.11 netmask 0xffffff00 broadcast 10.0.1.255
    media: Ethernet 100baseTX <full-duplex>
    status: active
bge0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
    options=9b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM>
    ether [SNIP]
    inet 10.0.2.11 netmask 0xffffff00 broadcast 10.0.2.255
    media: Ethernet 1000baseT <full-duplex>
    status: active
bge1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
    options=9b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM>
    ether [SNIP]
    inet 10.0.3.11 netmask 0xffffff00 broadcast 10.0.3.255
    media: Ethernet 1000baseT <full-duplex>
    status: active
lo0: flags=80c9<UP,LOOPBACK,RUNNING,NOARP,MULTICAST> metric 0 mtu 16384
    options=3<RXCSUM,TXCSUM>
    inet6 fe80::1%lo0 prefixlen 64 scopeid 0x4 
    inet6 ::1 prefixlen 128 
    inet 127.0.0.1 netmask 0xff000000 
pflog0: flags=141<UP,RUNNING,PROMISC> metric 0 mtu 33152
pfsync0: flags=0<> metric 0 mtu 1460
    pfsync: syncdev: bge1 syncpeer: 10.0.3.10 maxupd: 128
carp0: flags=49<UP,LOOPBACK,RUNNING> metric 0 mtu 1500
    inet 10.0.1.5 netmask 0xffffff00 
    carp: MASTER vhid 1 advbase 1 advskew 20
carp1: flags=49<UP,LOOPBACK,RUNNING> metric 0 mtu 1500
    inet 10.0.2.5 netmask 0xffffff00 
    carp: BACKUP vhid 2 advbase 1 advskew 20

And here's the /etc/rc.conf excerpt from the secondary:

defaultrouter="10.0.1.1"
network_interfaces="em0 bge0 bge1 lo0 pfsync0"
cloned_interfaces="carp0 carp1"
ifconfig_em0="inet 10.0.1.11 netmask 255.255.255.0 media 100BaseTX mediaopt full-duplex"
ifconfig_bge0="inet 10.0.2.11 netmask 255.255.255.0 media 1000BaseTX mediaopt full-duplex"
ifconfig_bge1="inet 10.0.3.11 netmask 255.255.255.0 media 1000BaseTX mediaopt full-duplex"
ifconfig_carp0="vhid 1 pass [SNIP] advskew 20 10.0.1.5/24"
ifconfig_carp1="vhid 2 pass [SNIP] advskew 20 10.0.2.5/24"
pfsync_enable="YES"
pfsync_syncdev="bge1"
pfsync_syncpeer="10.0.3.10"

What I don't understand is why carp0 reports MASTER on both machines while carp1 behaves as it should (MASTER on the primary, BACKUP on the secondary). What am I missing? Where should I be looking for clues?

Conor McDermottroe

2 Answers

Are the machines able to ping each other over the external interface? Do you by any chance have another vhid 1 on your external network?
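
One way to look for a stray vhid 1 is to watch the external interface for advertisements. CARP and VRRP both use IP protocol 112, so a capture filtered on that protocol should show every advertiser on the segment. The sketch below is only a thin wrapper around tcpdump: the function name watch_proto112 and the TCPDUMP override (handy for a dry run) are my own inventions, and em0 is the external interface from the question.

```sh
# Capture CARP/VRRP advertisements (IP protocol 112) on one interface.
# Hypothetical helper, not part of any base system.
watch_proto112() {
    # $1: interface to listen on; defaults to em0, the external NIC
    # in the question. Set TCPDUMP="echo tcpdump" to print the command
    # instead of running it.
    # -n: no name resolution, -e: show link-level (MAC) headers
    ${TCPDUMP:-tcpdump} -n -e -i "${1:-em0}" ip proto 112
}
```

Run it as root with `watch_proto112 em0`. If advertisements arrive from a source address other than 10.0.1.10 and 10.0.1.11, something else on that network is speaking protocol 112, and its virtual router ID is worth comparing against your vhid. If it does turn out to be 1, picking an unused vhid for carp0 on both hosts is probably the simplest way out.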

ryanlim
  • The two machines are plugged into our external switch along with two other non-CARPed boxes. They can ping each other on the external interface, no problem with that at all. The router (10.0.1.1 above) is provided by the datacentre and is outside of my control. I *think* it may be a pair of Cisco routers set up with VRRP. If that pair of VRRP routers had a VRID of 1, would it conflict with a CARP vhid of 1? Are the two protocols that similar? – Conor McDermottroe Apr 22 '10 at 10:40
  • That would be correct. I think I had this issue before where the datacenter used VHID 1 on their VRRP setup of my WAN side, and I attempted to use VHID 1 as well. Here's some better explanation and possibly a solution? - http://forum.pfsense.org/index.php?topic=752.0 – ryanlim Apr 22 '10 at 13:48
  • That's right, if the two machines can't talk to each other for some reason then they both assume the master role. – hookenz Mar 22 '13 at 00:38
  • Had the same problem with Arista switches that used VRRP. [CARP uses the same protocol number (112) as VRRP,](https://en.wikipedia.org/wiki/Common_Address_Redundancy_Protocol#Incompatibility_with_IANA_standards) probably on purpose. – Jose Quinteiro May 28 '18 at 20:14

The advskew values say that the primary (advskew 0) should be MASTER and the secondary (advskew 20) should be BACKUP. Both hosts claiming MASTER on carp0 suggests that the two carp0 interfaces are not hearing each other's advertisements.
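
To compare the two hosts at a glance, here is a rough sketch that pulls the CARP lines out of ifconfig output and prints the advertisement interval each host is using. It assumes the ifconfig layout shown in the question, and relies on advskew being counted in 1/256ths of a second added on top of advbase. The function name carp_states is made up for this example.

```sh
# Summarise CARP state from ifconfig output read on stdin.
# Hypothetical helper; assumes lines of the form
#   "carp: MASTER vhid 1 advbase 1 advskew 0"
# as shown in the question.
carp_states() {
    awk '
        # Remember the most recent interface header, e.g. "carp0: flags=..."
        /^[a-z0-9]+: flags=/ { iface = $1; sub(/:$/, "", iface) }
        # $2=state, $4=vhid, $6=advbase (s), $8=advskew (1/256 s)
        $1 == "carp:" {
            printf "%s %s vhid=%s interval=%.3fs\n", iface, $2, $4, $6 + $8 / 256
        }
    '
}
```

Run `ifconfig | carp_states` on each machine. With the settings in the question the primary advertises every 1.000s and the secondary every 1.078s, so a secondary that actually hears the primary should stay BACKUP.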

You may already have checked these, but here are some general diagnostic steps I use on OpenBSD:

  1. Check the configuration files
  2. Verify that the CARP protocol is allowed in and out of both machines
  3. Verify that the pfsync protocol is allowed in and out of both machines

You appear to be using FreeBSD rather than OpenBSD, so I hope my answer is clear enough for you to adapt.

--

1. Configuration Files

Do you have net.inet.carp settings similar to the ones below?

File: /etc/sysctl.conf

net.inet.carp.allow=1
net.inet.carp.preempt=1
net.inet.carp.log=1

One CARP interface working while the other doesn't suggests that the system-level configuration is correct. It doesn't hurt to confirm, though: sometimes we make a change on the command line and forget to persist it in the configuration files.

  • net.inet.carp.allow: Accept incoming CARP packets. Enabled by default, so it is not normally set in /etc/sysctl.conf.
  • net.inet.carp.preempt: Allow hosts within the group to preempt the master. Also fails over all CARP interfaces when one interface fails. Disabled by default.
  • net.inet.carp.log: Log bad CARP packets.
  • net.inet.carp.arpbalance: Load balance traffic across group hosts. Disabled by default.
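
For the "set it on the command line and forgot to persist it" case, a small sketch like the following can list which of the three settings above are absent from the configuration file. The function name carp_sysctl_missing is hypothetical, and the default path assumes the /etc/sysctl.conf location mentioned above.

```sh
# Report which net.inet.carp settings are not persisted in a sysctl file.
# Hypothetical helper for illustration only.
carp_sysctl_missing() {
    # $1: path to the sysctl configuration file
    conf=${1:-/etc/sysctl.conf}
    for key in allow preempt log; do
        # A setting counts as persisted if a line starts with "name="
        grep -q "^net\.inet\.carp\.$key=" "$conf" 2>/dev/null ||
            echo "not persisted: net.inet.carp.$key"
    done
}
```

Compare its output with the live values from `sysctl net.inet.carp`. A missing net.inet.carp.allow is usually harmless because it is enabled by default; a missing net.inet.carp.preempt is the one to look at if the running value is 1.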

2. CARP Protocol

CARP packets need to be received for the firewall to decide whether it should be MASTER or BACKUP.

Revisit your firewall configuration to make sure that proto carp is passed in and out on both of the physical interfaces that carry CARP.

For example:

pass quick on { em0 bge0 } proto carp keep state (no-sync)

You can confirm this by adding block log all at the beginning of your firewall ruleset, then running tcpdump on the pflog0 interface to see whether the CARP packets are being let through.

3. pfsync Protocol

As an additional check, make sure that pfsync packets are allowed through the firewalls so that firewall states are shared between the two hosts.

pass quick on bge1 proto pfsync keep state (no-sync)
samt