Simply put, I have 2 containers for a service in swarm mode. Each container receives UDP packets and sends them back to multiple clients, whose IPs and ports are stored in a DB. Load balancing: packets from one IP go to the same container.
So, here is the situation:
container-1 receives a packet from client-1 (10.255.0.2:5874 is the source IP and port of the packet as seen inside the container) and sends the response back to it (successfully).
container-2 receives a packet from client-2 (10.255.0.2:5875) and wants to send this packet back to both clients (using the addresses from the DB), but only client-2 receives the packet!
Debug information I gathered so far:
sudo nsenter --net=/run/docker/netns/ingress_sbox
root@alpom-qa:~# sudo iptables --list -t mangle
sudo: unable to resolve host alpom-qa: Resource temporarily unavailable
Chain PREROUTING (policy ACCEPT)
target prot opt source destination
MARK tcp -- anywhere anywhere tcp dpt:5070 MARK set 0x161
MARK tcp -- anywhere anywhere tcp dpt:5062 MARK set 0x163
MARK tcp -- anywhere anywhere tcp dpt:5080 MARK set 0x16b
MARK tcp -- anywhere anywhere tcp dpt:http-alt MARK set 0x18f
MARK udp -- anywhere anywhere udp dpt:sip MARK set 0x192
MARK tcp -- anywhere anywhere tcp dpt:sip-tls MARK set 0x192
Chain INPUT (policy ACCEPT)
target prot opt source destination
MARK all -- anywhere 10.255.0.7 MARK set 0x161
MARK all -- anywhere 10.255.0.9 MARK set 0x163
MARK all -- anywhere 10.255.0.5 MARK set 0x16b
MARK all -- anywhere 10.255.10.115 MARK set 0x18f
MARK all -- anywhere 10.255.0.11 MARK set 0x192
Docker uses IPVS for load-balancing:
root@alpom-qa:~# ipvsadm
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
FWM 353 rr
-> 10.255.10.245:0 Masq 1 0 0
FWM 355 rr
-> 10.255.10.246:0 Masq 1 0 0
FWM 363 rr
-> 10.255.11.61:0 Masq 1 0 0
FWM 399 rr
-> 10.255.11.91:0 Masq 1 0 13
FWM 402 rr
-> 10.255.11.93:0 Masq 1 0 1
-> 10.255.11.94:0 Masq 1 0 1
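The mangle table prints the MARK values in hex, while ipvsadm prints the firewall marks (FWM) in decimal, so the two listings are easier to match up after a conversion. A minimal Python sketch of that conversion; the dictionary and function names are my own and only for illustration, the values are copied from the output above:

```python
# MARK values copied from the mangle PREROUTING chain (printed in hex by iptables).
MARKS = {
    "tcp dpt:5070": 0x161,
    "tcp dpt:5062": 0x163,
    "tcp dpt:5080": 0x16B,
    "tcp dpt:http-alt": 0x18F,
    "udp dpt:sip": 0x192,
    "tcp dpt:sip-tls": 0x192,
}


def fwm_ids(marks):
    """Return the same rules with the mark as the decimal number ipvsadm shows."""
    return {rule: int(mark) for rule, mark in marks.items()}


if __name__ == "__main__":
    for rule, fwm in fwm_ids(MARKS).items():
        print(f"{rule:18} -> FWM {fwm}")
```

With this, the `udp dpt:sip` rule (0x192) lines up with `FWM 402`, the only virtual service that has two backends (10.255.11.93 and 10.255.11.94).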
and the connection entries (filtered to show only UDP):
root@alpom-qa:~# ipvsadm -lc
IPVS connection entries
pro expire state source virtual destination
UDP 04:59 UDP 10.10.0.1:5874 172.18.0.2:sip 10.255.11.93:sip
UDP 04:59 UDP 10.10.0.1:5875 172.18.0.2:sip 10.255.11.94:sip
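To see at a glance which backend each client is pinned to, the listing can be reduced to a source-to-destination map. This is only a sketch: it assumes the six whitespace-separated columns shown in the header above (pro, expire, state, source, virtual, destination), and the function name is hypothetical:

```python
# Sample text copied from the `ipvsadm -lc` output above.
SAMPLE = """\
IPVS connection entries
pro expire state source virtual destination
UDP 04:59 UDP 10.10.0.1:5874 172.18.0.2:sip 10.255.11.93:sip
UDP 04:59 UDP 10.10.0.1:5875 172.18.0.2:sip 10.255.11.94:sip
"""


def parse_connections(text):
    """Map each client source (ip:port) to the backend destination it is bound to."""
    mapping = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip the title and header lines; keep only 6-column UDP/TCP entries.
        if len(fields) != 6 or fields[0] not in ("UDP", "TCP"):
            continue
        _proto, _expire, _state, source, _virtual, destination = fields
        mapping[source] = destination
    return mapping


if __name__ == "__main__":
    for source, destination in parse_connections(SAMPLE).items():
        print(f"{source} -> {destination}")
```

For the entries above, this gives one backend per client source port: 5874 is bound to 10.255.11.93 and 5875 to 10.255.11.94.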
And tcpdump:
sudo tcpdump -i any src 10.255.11.93 -nn
shows only entries like 13:03:28.280287 IP 10.255.11.93.5060 > 10.255.0.2.5874: SIP
It looks like a UDP packet sent to 10.255.0.2:5875 from any source other than 10.255.11.94 (the destination shown in the connection list) somehow gets lost.
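My guess can be written down as a toy model. This is only a sketch of the hypothesis, not how IPVS or conntrack are actually implemented: it assumes a packet from a container is passed on to the client only when that container is the backend recorded for the client's source port in the connection list. The table and function names are made up for illustration:

```python
# Toy model of the hypothesis only; NOT the real IPVS/conntrack logic.
# Entries mirror the connection list: client source port -> backend that owns the flow.
FLOWS = {
    5874: "10.255.11.93",
    5875: "10.255.11.94",
}


def reply_matches_flow(backend_ip, client_port):
    """Return True if a packet from backend_ip to 10.255.0.2:client_port
    matches a recorded flow (assumed to be the condition for delivery)."""
    return FLOWS.get(client_port) == backend_ip


if __name__ == "__main__":
    for backend in ("10.255.11.93", "10.255.11.94"):
        for port in (5874, 5875):
            verdict = "delivered" if reply_matches_flow(backend, port) else "lost"
            print(f"{backend} -> 10.255.0.2:{port}: {verdict}")
```

Under this assumption each container can only reach the client whose packets it received itself, which matches what I observe: the tcpdump for 10.255.11.93 only ever shows packets to port 5874.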