Network Traffic
We run Docker Swarm with 3 manager nodes and 16 worker nodes. The network I/O between two of the three manager nodes is very high. To illustrate this, here is the output from iftop for the three manager nodes:
vm71 (10.0.0.131)
vm71 => 10.0.0.130 39.3Mb 47.5Mb 49.3Mb
<= 802Kb 1.03Mb 1.07Mb
vm70 (10.0.0.130)
vm70 => 10.0.0.131 798Kb 1.00Mb 1.00Mb
<= 40.9Mb 44.8Mb 44.8Mb
vm75 (10.0.0.135)
vm75 => 10.0.0.131 9.50Kb 10.1Kb 9.51Kb
<= 7.83Kb 8.08Kb 7.56Kb
As you can see above, the traffic between vm70 and vm71 is approximately 4,000 times as heavy as the traffic between vm75 and the other two managers. Our rules are set so that no containers run on any swarm manager; we confirmed this by running docker stats on each of them.
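For reference, the roughly 4,000-fold figure can be reproduced from the first iftop column. This is only a sketch: the helper name to_bits is made up for illustration, and it assumes iftop's Kb/Mb suffixes are decimal multiples of bits per second (with binary multiples the result shifts slightly but stays in the same range).

```python
# Hypothetical helper: convert an iftop rate string such as "39.3Mb" to bits/s.
# Assumption: suffixes are decimal multiples (Kb = 1e3, Mb = 1e6, Gb = 1e9).
def to_bits(rate: str) -> float:
    units = {"Kb": 1e3, "Mb": 1e6, "Gb": 1e9}
    for suffix, factor in units.items():
        if rate.endswith(suffix):
            return float(rate[: -len(suffix)]) * factor
    # Plain bits per second, e.g. "512b"
    return float(rate.rstrip("b"))


# Outbound rates taken from the first column of the iftop output above.
vm71_to_vm70 = to_bits("39.3Mb")
vm75_to_vm71 = to_bits("9.50Kb")

ratio = vm71_to_vm70 / vm75_to_vm71
print(f"vm71 => vm70 is about {ratio:.0f}x the vm75 => vm71 rate")
```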
Networking by Process
The next obvious question was which processes were generating this network I/O. The output of netstat -tup is below. I'm only displaying the line related to the relevant ports from iftop.
tcp6 0 46 vm71:2377 10.0.0.130:39316 ESTABLISHED 791/dockerd
Notice that this is tcp6 traffic.
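To make the fields in that line explicit, here is a small sketch that splits it into its parts. The function name parse_netstat_line is hypothetical, and the parsing assumes the seven whitespace-separated columns that netstat -tup prints for an established TCP connection. The local port, 2377, is the port Docker documents for swarm cluster management traffic, which would be consistent with this being manager-to-manager communication from dockerd.

```python
# Hypothetical helper: split one `netstat -tup` line into named fields.
# Assumption: the line has exactly seven whitespace-separated columns
# (proto, recv-q, send-q, local address, foreign address, state, pid/program).
def parse_netstat_line(line: str) -> dict:
    proto, recv_q, send_q, local, remote, state, owner = line.split()
    # rpartition keeps everything before the last ':' as the host,
    # which also copes with IPv6-style addresses containing colons.
    local_host, _, local_port = local.rpartition(":")
    remote_host, _, remote_port = remote.rpartition(":")
    pid, _, program = owner.partition("/")
    return {
        "proto": proto,
        "send_q": int(send_q),
        "local_host": local_host,
        "local_port": int(local_port),
        "remote_host": remote_host,
        "remote_port": int(remote_port),
        "state": state,
        "pid": int(pid),
        "program": program,
    }


conn = parse_netstat_line(
    "tcp6 0 46 vm71:2377 10.0.0.130:39316 ESTABLISHED 791/dockerd"
)
print(conn["program"], "is listening side on port", conn["local_port"])
```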
We're stumped. Why are we seeing so much traffic between these two manager nodes? If we demote and then re-promote the manager nodes, the traffic clears up for a while, but it eventually increases again. What might be causing this?