I have been struggling with a very strange networking issue for the past week. In summary, my containers can't reach the internet unless I run tcpdump -i br-XXXXX on the host (which puts the bridge into promiscuous mode).

I have two containers that I'm bringing up with Compose:

version: '3'
services:
  seafile:
    build: ./seafile/build
    container_name: seafile
    restart: always
    ports:
      - 8080:80
    networks:
      seafile_net:
        ipv4_address: 192.168.0.2
    volumes:
      - /mnt/gluster/files/redacted/data:/shared
    environment:
      - DB_HOST=10.200.7.100
      - DB_PASSWD=redacted
      - TIME_ZONE=America/Chicago
    depends_on:
      - seafile-memcached
  seafile-memcached:
    image: memcached:1.5.6
    container_name: seafile-memcached
    restart: always
    networks:
      seafile_net:
        ipv4_address: 192.168.0.3
    entrypoint: memcached -m 256
networks:
  seafile_net:
    driver: bridge
    ipam:
      driver: default
      config:
        - subnet: 192.168.0.0/24
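
For what it's worth, the stack is brought up the usual way from the Compose project directory. The directory is named "docker" (per the com.docker.compose.project label further down), which is why the network shows up as docker_seafile_net:

$ cd docker/            # hypothetical path; the directory name becomes the Compose project name
$ docker-compose up -d  # builds the seafile image and starts both containers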

Containers running:

$ docker ps
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS                  NAMES
93b1b773ad4e        docker_seafile      "/sbin/my_init -- /s…"   2 minutes ago       Up 2 minutes        0.0.0.0:8080->80/tcp   seafile
1f6b124c3be4        memcached:1.5.6     "memcached -m 256"       2 minutes ago       Up 2 minutes        11211/tcp              seafile-memcached

Network information:

$ docker network ls
NETWORK ID          NAME                 DRIVER              SCOPE
f67b015c4b84        bridge               bridge              local
d21cb7ba8ee4        docker_seafile_net   bridge              local
d0eb86ca57fa        host                 host                local
01f03fcfa103        none                 null                local

$ docker inspect d21cb7ba8ee4
[
    {
        "Name": "docker_seafile_net",
        "Id": "d21cb7ba8ee4a477497a7d343ea1a5f9b109237dce878a40605a281e1a2db1e9",
        "Created": "2020-09-24T15:03:46.39761472-04:00",
        "Scope": "local",
        "Driver": "bridge",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "192.168.0.0/24"
                }
            ]
        },
        "Internal": false,
        "Attachable": true,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {
            "1f6b124c3be414040a6def3b3bc3e9f06e2af6a28afd6737823d1da65d5ab047": {
                "Name": "seafile-memcached",
                "EndpointID": "ab3e3c4aa216d158473fa3dde3f87e654422ffeca6ebb7626d072da10ba9a5cf",
                "MacAddress": "02:42:c0:a8:00:03",
                "IPv4Address": "192.168.0.3/24",
                "IPv6Address": ""
            },
            "93b1b773ad4e3685aa8ff2db2f342c617c42f1c5ab4ce693132c1238e73e705d": {
                "Name": "seafile",
                "EndpointID": "a895a417c22a4755df15b180d1c38b712c36047b01596c370815964a212f7105",
                "MacAddress": "02:42:c0:a8:00:02",
                "IPv4Address": "192.168.0.2/24",
                "IPv6Address": ""
            }
        },
        "Options": {},
        "Labels": {
            "com.docker.compose.network": "seafile_net",
            "com.docker.compose.project": "docker",
            "com.docker.compose.version": "1.27.4"
        }
    }
]

$ ip link show master br-d21cb7ba8ee4
18: veth8fd88c9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br-d21cb7ba8ee4 state UP mode DEFAULT group default
    link/ether b6:37:9e:fd:9e:da brd ff:ff:ff:ff:ff:ff
20: vetheb84e16: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br-d21cb7ba8ee4 state UP mode DEFAULT group default
    link/ether ca:90:c8:a6:2e:9b brd ff:ff:ff:ff:ff:ff

Once the containers are up, they are unable to reach the internet or any other resources on the host network. The following curl command was run from inside one of the containers; the same command works fine on the host server:

root@93b1b773ad4e:/opt/seafile# curl -viLk http://1.1.1.1
* Rebuilt URL to: http://1.1.1.1/
*   Trying 1.1.1.1...
* TCP_NODELAY set
**hangs**

Here is a tcpdump (run on the host) of the bridge WITHOUT putting it into promiscuous mode. This was captured while I was trying to run the curl command from above:

$ tcpdump --no-promiscuous-mode -lnni br-d21cb7ba8ee4
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on br-d21cb7ba8ee4, link-type EN10MB (Ethernet), capture size 262144 bytes
14:15:42.447055 ARP, Request who-has 192.168.0.2 tell 192.168.0.1, length 28
14:15:43.449058 ARP, Request who-has 192.168.0.2 tell 192.168.0.1, length 28
14:15:45.448787 ARP, Request who-has 192.168.0.2 tell 192.168.0.1, length 28
14:15:46.451049 ARP, Request who-has 192.168.0.2 tell 192.168.0.1, length 28
14:15:47.453058 ARP, Request who-has 192.168.0.2 tell 192.168.0.1, length 28
14:15:49.449789 ARP, Request who-has 192.168.0.2 tell 192.168.0.1, length 28
14:15:50.451048 ARP, Request who-has 192.168.0.2 tell 192.168.0.1, length 28

But if I let tcpdump put the bridge into promiscuous mode, things start working:

$ tcpdump -lnni br-d21cb7ba8ee4
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on br-d21cb7ba8ee4, link-type EN10MB (Ethernet), capture size 262144 bytes
14:16:05.457844 ARP, Request who-has 192.168.0.2 tell 192.168.0.1, length 28
14:16:05.457863 ARP, Reply 192.168.0.2 is-at 02:42:c0:a8:00:02, length 28
**traffic continues**
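
As a cross-check, I would expect the same effect to be reproducible without tcpdump by toggling promiscuous mode on the bridge by hand (a sketch of the test, not output captured above):

$ ip link set dev br-d21cb7ba8ee4 promisc on    # containers should now be able to reach the internet
$ ip link set dev br-d21cb7ba8ee4 promisc off   # connectivity should break again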

Docker info:

$ docker info
Client:
 Debug Mode: false
Server:
 Containers: 2
  Running: 2
  Paused: 0
  Stopped: 0
 Images: 6
 Server Version: 19.03.13
 Storage Driver: devicemapper
  Pool Name: docker-8:3-3801718-pool
  Pool Blocksize: 65.54kB
  Base Device Size: 10.74GB
  Backing Filesystem: xfs
  Udev Sync Supported: true
  Data file: /dev/loop0
  Metadata file: /dev/loop1
  Data loop file: /var/lib/docker/devicemapper/devicemapper/data
  Metadata loop file: /var/lib/docker/devicemapper/devicemapper/metadata
  Data Space Used: 1.695GB
  Data Space Total: 107.4GB
  Data Space Available: 80.76GB
  Metadata Space Used: 3.191MB
  Metadata Space Total: 2.147GB
  Metadata Space Available: 2.144GB
  Thin Pool Minimum Free Space: 10.74GB
  Deferred Removal Enabled: true
  Deferred Deletion Enabled: true
  Deferred Deleted Device Count: 0
  Library Version: 1.02.164-RHEL7 (2019-08-27)
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 8fba4e9a7d01810a393d5d25a3621dc101981175
 runc version: dc9208a3303feef5b3839f4323d9beb36df0a9dd
 init version: fec3683
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 3.10.0-123.9.2.el7.x86_64
 Operating System: CentOS Linux 7 (Core)
 OSType: linux
 Architecture: x86_64
 CPUs: 2
 Total Memory: 3.704GiB
 Name: redacted.novalocal
 ID: redacted
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false
WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled
WARNING: the devicemapper storage-driver is deprecated, and will be removed in a future release.
WARNING: devicemapper: usage of loopback devices is strongly discouraged for production use.
         Use `--storage-opt dm.thinpooldev` to specify a custom block storage device.

Host information:

$ docker --version
Docker version 19.03.13, build 4484c46d9d

$ docker-compose --version
docker-compose version 1.27.4, build 40524192

$ cat /etc/redhat-release
CentOS Linux release 7.8.2003 (Core)

$ getenforce
Disabled

$ free -h
              total        used        free      shared  buff/cache   available
Mem:           3.7G        1.1G        826M        109M        1.8G        2.2G
Swap:          1.0G        292M        731M

$ nproc
2

$ uptime
 10:39:49 up 3 days, 19:56,  1 user,  load average: 0.00, 0.01, 0.05

$ iptables-save
# Generated by iptables-save v1.4.21 on Mon Sep 28 10:41:22 2020
*filter
:INPUT ACCEPT [17098775:29231856941]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [15623889:13475217196]
:DOCKER - [0:0]
:DOCKER-ISOLATION-STAGE-1 - [0:0]
:DOCKER-ISOLATION-STAGE-2 - [0:0]
:DOCKER-USER - [0:0]
-A FORWARD -j DOCKER-USER
-A FORWARD -j DOCKER-ISOLATION-STAGE-1
-A FORWARD -o br-d21cb7ba8ee4 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -o br-d21cb7ba8ee4 -j DOCKER
-A FORWARD -i br-d21cb7ba8ee4 ! -o br-d21cb7ba8ee4 -j ACCEPT
-A FORWARD -i br-d21cb7ba8ee4 -o br-d21cb7ba8ee4 -j ACCEPT
-A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -o docker0 -j DOCKER
-A FORWARD -i docker0 ! -o docker0 -j ACCEPT
-A FORWARD -i docker0 -o docker0 -j ACCEPT
-A DOCKER -d 192.168.0.2/32 ! -i br-d21cb7ba8ee4 -o br-d21cb7ba8ee4 -p tcp -m tcp --dport 80 -j ACCEPT
-A DOCKER-ISOLATION-STAGE-1 -i br-d21cb7ba8ee4 ! -o br-d21cb7ba8ee4 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -i docker0 ! -o docker0 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -j RETURN
-A DOCKER-ISOLATION-STAGE-2 -o br-d21cb7ba8ee4 -j DROP
-A DOCKER-ISOLATION-STAGE-2 -o docker0 -j DROP
-A DOCKER-ISOLATION-STAGE-2 -j RETURN
-A DOCKER-USER -j RETURN
COMMIT
# Completed on Mon Sep 28 10:41:22 2020
# Generated by iptables-save v1.4.21 on Mon Sep 28 10:41:22 2020
*nat
:PREROUTING ACCEPT [408634:24674574]
:INPUT ACCEPT [380413:22825327]
:OUTPUT ACCEPT [520596:31263683]
:POSTROUTING ACCEPT [711734:42731963]
:DOCKER - [0:0]
-A PREROUTING -m addrtype --dst-type LOCAL -j DOCKER
-A OUTPUT ! -d 127.0.0.0/8 -m addrtype --dst-type LOCAL -j DOCKER
-A POSTROUTING -s 192.168.0.0/24 ! -o br-d21cb7ba8ee4 -j MASQUERADE
-A POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE
-A POSTROUTING -s 192.168.0.2/32 -d 192.168.0.2/32 -p tcp -m tcp --dport 80 -j MASQUERADE
-A DOCKER -i br-d21cb7ba8ee4 -j RETURN
-A DOCKER -i docker0 -j RETURN
-A DOCKER ! -i br-d21cb7ba8ee4 -p tcp -m tcp --dport 8080 -j DNAT --to-destination 192.168.0.2:80
COMMIT
# Completed on Mon Sep 28 10:41:22 2020

Comments:

  • I had something similar [once](https://stackoverflow.com/questions/42705432/kubernetes-service-ips-not-reachable#comment72646469_42760480) where the issue was caused by the same subnet configured on another bridge/interface. Maybe worth re-checking.... – grasbueschel Oct 02 '20 at 10:12
  • @grasbueschel Good suggestion, but I've just checked and verified that no subnets on this server overlap. – Andrew Paglusch Oct 02 '20 at 15:37
  • What does `iptables-save` say? Certainly your outbound traffic is not being routed correctly. I was suffering from the same thing with _kubernetes_. – Bodo Hugo Barwich Oct 02 '20 at 16:03
  • @BodoHugoBarwich I have the `iptables-save` output in the last code box in my question. The only IPTables rules in place are the ones created by Docker. – Andrew Paglusch Oct 02 '20 at 16:59
  • Just searched through our Slack channels, since this sounded so familiar. And indeed, we once had an issue where we couldn't access a webserver/container unless we were tcpdumping, i.e. we couldn't reach the container from the host normally, but it worked while tcpdumping on the host at the same time. The sad, but working, solution was: 1) remove Docker 2) run a system update (e.g. yum update) 3) reboot 4) reinstall Docker 5) redeploy the container/image. – grasbueschel Oct 02 '20 at 18:03
  • I can't see how this can be relevant, but usually Docker gets br_netfilter loaded, which causes trouble when the bridge code calls iptables. In your setup there are warnings saying it's disabled. Maybe you could try enabling it: net.bridge.bridge-nf-call-iptables, and while at it also net.bridge.bridge-nf-call-arptables, just to see if that makes any difference. I never found out why Docker gets br_netfilter loaded in the first place. – A.B Oct 02 '20 at 18:07
  • Yes, this scenario strongly reminds me of the `kubernetes` issue I had. I could only build a workaround by introducing firewall rules at runtime that would let the traffic pass. I found the right spot by adding `LOG` rules to the firewall, which showed me where the traffic was passing from one rule to the next. Unfortunately `kubernetes` continuously monitors the firewall rules and rewrites them at runtime, so you can't write rules into the `DOCKER` chains. – Bodo Hugo Barwich Oct 02 '20 at 18:48
  • Reinstalling doesn't resolve the issue because the firewall rules are created by the `docker` service. It will always rebuild them on every relaunch. – Bodo Hugo Barwich Oct 02 '20 at 18:51

1 Answer

Thanks to a comment @A.B made, I found the solution.

I believe the main issue was that the br_netfilter module was not loaded:

$ lsmod | grep br_netfilter
$

On another CentOS 7 Docker host (that does not have this problem), the module was loaded:

$ lsmod | grep br_netfilter
br_netfilter           22256  0
bridge                146976  1 br_netfilter

Loading the module by hand wasn't working for me:

$ modprobe br_netfilter
modprobe: FATAL: Module br_netfilter not found.

I read that br_netfilter was part of the bridge module until kernel version 3.18, when it was split out into a separate module.

I discovered that we were excluding the kernel from our updates (I didn't set this server up, so this was news to me).

$ grep exclude /etc/yum.conf
exclude=kernel*

Because of this exclusion, my prior yum updates had never touched the kernel. I figure the br_netfilter split hadn't yet been backported into the old RHEL kernel build we were running.

After running an update without the kernel exclusion in place (yum --disableexcludes=all update kernel) and rebooting, everything started working!

The kernel update took me from 3.10.0-123.9.2.el7.x86_64 to 3.10.0-1127.19.1.el7.
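
For reference, on a kernel that ships br_netfilter as a separate module, loading it and enabling the bridge-netfilter hooks persistently would look roughly like this (a sketch using the standard modules-load.d and sysctl.d drop-in paths; Docker normally loads the module itself at startup):

$ modprobe br_netfilter
$ sysctl -w net.bridge.bridge-nf-call-iptables=1
$ sysctl -w net.bridge.bridge-nf-call-ip6tables=1
$ echo br_netfilter > /etc/modules-load.d/br_netfilter.conf    # load the module on boot
$ printf 'net.bridge.bridge-nf-call-iptables = 1\nnet.bridge.bridge-nf-call-ip6tables = 1\n' > /etc/sysctl.d/99-bridge-nf.conf    # persist the sysctls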

Comments:

  • If it's now fully enabled, including the iptables calls from the bridge path, you should read my answer on how to solve the problems it can cause: https://serverfault.com/questions/963759/docker-breaks-libvirt-bridge-network/964491#964491 – A.B Oct 02 '20 at 20:01