I have a problem with an active/active firewall cluster where the connection tracking state does not seem to be replicated between the firewalls.
It's active/active because I have two routers connected via different ISPs and a network range that is provided through BGP. The return path is determined by BGP, so the routing is asymmetric. The two firewalls are networked together on the inside network, and a virtual IP acts as the default route for the Windows servers.
When both firewalls are running and an inside server opens a connection, the reply comes back via the secondary firewall (the one with no record of the connection state). The reply is therefore dropped and never reaches the server that initiated the request.
I thought conntrackd would fix this but I can't seem to get it to work. Perhaps I misunderstand how it works. Can I get conntrackd to replicate iptables state at all? Does it actually work in active/active mode? Is state replicated in real time?
Here is what my conntrackd.conf file contains:
Sync {
    Mode ALARM {
        RefreshTime 15
        CacheTimeout 180
    }
    Multicast {
        IPv4_Address 225.0.0.50
        Group 3780
        IPv4_Interface 10.0.0.100
        Interface eth2
        SndSocketBuffer 1249280
        RcvSocketBuffer 1249280
        Checksum on
    }
}
General {
    Nice -20
    HashSize 32768
    HashLimit 131072
    LogFile on
    Syslog on
    LockFile /var/lock/conntrack.lock
    UNIX {
        Path /var/run/conntrackd.ctl
        Backlog 20
    }
    NetlinkBufferSize 2097152
    NetlinkBufferSizeMaxGrowth 8388608
    Filter From Userspace {
        Protocol Accept {
            TCP
        }
        Address Ignore {
            IPv4_address 127.0.0.1 # loopback
            IPv4_address 10.0.0.100 # dedicated link0
            IPv4_address 10.0.0.101 # dedicated link1
            IPv4_address x.x.x.130 # Internal ip
        }
    }
}
The conntrackd.conf on the other router is the same apart from IPv4_Interface in the Multicast section, which is 10.0.0.101, and the internal IP in the Filter section, which ends in 131.
I have set firewall rules to accept input to 225.0.0.50/32 and output to 225.0.0.50/32.
I've set mode to ALARM here but first tried FTFW. Neither seems to work.
My kernel version is: 3.11.0.
Sorry, cut and paste isn't working from the VirtualBox window, so I can't show the actual output. However, when I run sudo conntrackd -i on one router, it lists an ESTABLISHED TCP connection, which is the ssh session I opened to it.
On the other router the same command produces no output, which I think means the state was not transferred to it.
Any ideas?
Update: I ran tcpdump -i eth2 on each machine and can see UDP packets arriving from the other router, destined for the multicast address 225.0.0.50 port 3780, with a length of 68 bytes.
If I initiate an ssh connection I see immediate activity in tcpdump, and disconnecting does the same. Otherwise regular heartbeats of that message come through. So it's clear that the routers are sending the packets, but is conntrackd ignoring them? Is there some hidden debug option I can turn on?
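In case it helps anyone reproduce this, the capture can be narrowed to just the sync traffic with a filter along these lines. This is only a sketch: sync_filter is a made-up helper name of mine, and the address, port and interface are the ones from the Multicast section of my config.

```shell
# Hypothetical helper (my own name): build a pcap filter that matches only
# the conntrackd sync packets.
# $1 = multicast address, $2 = UDP port, both from the Multicast section.
sync_filter() {
    printf 'udp and dst host %s and dst port %s\n' "$1" "$2"
}

# Usage, as root on each router (eth2 is my dedicated sync link):
#   tcpdump -n -i eth2 "$(sync_filter 225.0.0.50 3780)"
```

With that filter, anything printed while the ssh session is being opened or closed should be conntrackd traffic rather than other noise on the link.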
Update2: OK, after days of googling and reading source code I have discovered that conntrackd is replicating the state, but it ends up in an external cache rather than in the kernel's connection tracking table. To commit the cached entries to the kernel you need to run conntrackd -c. Clearly conntrackd is designed to be used in an active/backup mode.
It seems an option called CacheWriteThrough was introduced at some point but was later removed. Can conntrackd do active/active or not? I can't seem to find an answer to that.
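For reference, this is roughly how I am now checking the two caches and committing by hand. The function names and the CONNTRACKD variable are just my own wrappers for illustration; the -i, -e and -c flags are the conntrackd options themselves, and everything needs to run as root.

```shell
# Path to the daemon binary; override it if conntrackd is not on the PATH.
CONNTRACKD="${CONNTRACKD:-conntrackd}"

# Hypothetical helper: dump both caches on this router.
#   -i  internal cache (flows this node tracks itself)
#   -e  external cache (flows received from the peer)
show_caches() {
    for flag in -i -e; do
        echo "== conntrackd $flag =="
        "$CONNTRACKD" "$flag"
    done
}

# Hypothetical helper: inject the external cache into the kernel
# conntrack table, which is the manual step I had been missing.
commit_external() {
    "$CONNTRACKD" -c
}
```

On my setup the replicated ssh flow shows up under -e on the peer, not under -i, which is why my earlier check looked empty.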