I have a mixed Docker Swarm cluster: Linux machines, plus Win7 machines that run Docker inside VirtualBox VMs using NAT networking (the bridged option is not allowed).

    Win7        Win7
     |           |
(port forwarding on 7946tcp/udp, 2377tcp, 4789udp)
     |           |
+----+----+  +----+----+  +-------+    +-------+
| VirtBox |  | VirtBox |  | linux |    | linux |
+----+----+  +----+----+  +---+---+    +---+---+
     |            |           |            |
+----+---+   +----+---+  +----+---+   +----+---+
| docker |   | docker |  | docker |   | docker |
+----+---+   +----+---+  +----+---+   +----+---+
     |            |           |            |
+----+------------+-----------+------------+---+
|                docker swarm                  |
+----------------------------------------------+

The swarm initializes fine (see the node list below):

docker@frankie:~$ docker node ls
ID            HOSTNAME    STATUS    AVAILABILITY   MANAGER STATUS  ENGINE VERSION
ban0an8sg *   Win1        Ready     Active         Reachable       18.05.0-ce
asdlkj328     Win2        Ready     Active         Leader          18.05.0-ce
9w05zyye6     Linux1      Ready     Active         Reachable       18.03.1-ce
slkhj2387     Linux2      Ready     Active         Reachable       18.03.1-ce

I can launch services to the swarm without issues:

docker@frankie:~$ docker service ls
ID          NAME     MODE         REPLICAS   IMAGE          PORTS
9w05zyye6   my-web   replicated   1/1        nginx:latest   *:8083->80/tcp

Unfortunately, the swarm routing mesh only works between the Linux machines. I can access the service directly on the machine it is running on (even when that machine is a Windows one), but reaching it through any other node only works when both nodes are Linux.

accessible on ► | win1 | win2 | lin1 | lin2
   running on ▼ +------+------+------+------
           win1 |   x  |      |      |
           win2 |      |   x  |      |
         linux1 |      |      |   x  |  x
         linux2 |      |      |   x  |  x

Any idea on where the problem might be?

Edit (added the port-forwarding commands for clarification):

VBoxManage.exe controlvm "win1" natpf1 "docker-swarm-cluster-management,tcp,0.0.0.0,2377,,2377"
VBoxManage.exe controlvm "win1" natpf1 "docker-swarm-cluster-comm-tcp,tcp,0.0.0.0,7946,,7946"
VBoxManage.exe controlvm "win1" natpf1 "docker-swarm-cluster-comm-udp,udp,0.0.0.0,7946,,7946"
VBoxManage.exe controlvm "win1" natpf1 "docker-swarm-cluster-traffic,udp,0.0.0.0,4789,,4789"
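
A minimal sketch of how the same four rules could be generated for every Windows-hosted VM, so the forwarding stays identical across hosts. It only prints the commands (a dry run) and does not call VirtualBox. The function name `print_swarm_natpf` is hypothetical, and the VM name `win2` is an assumption; only `win1` appears in the commands above.

```shell
#!/bin/sh
# Dry run: print the four NAT port-forwarding rules for each VM name given.
# Rule names, protocols and ports are copied from the commands above;
# the VM names passed in are assumptions and should match your VirtualBox VMs.
print_swarm_natpf() {
    for vm in "$@"; do
        for rule in \
            "docker-swarm-cluster-management,tcp,0.0.0.0,2377,,2377" \
            "docker-swarm-cluster-comm-tcp,tcp,0.0.0.0,7946,,7946" \
            "docker-swarm-cluster-comm-udp,udp,0.0.0.0,7946,,7946" \
            "docker-swarm-cluster-traffic,udp,0.0.0.0,4789,,4789"
        do
            printf 'VBoxManage.exe controlvm "%s" natpf1 "%s"\n' "$vm" "$rule"
        done
    done
}

print_swarm_natpf win1 win2
```

Printing the commands rather than executing them makes it easy to compare the rules for each VM before applying them on the Windows hosts.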
Frankie
  • I think the Docker swarm routing mesh uses Linux IPVS – c4f4t0r Jun 06 '18 at 22:32
  • What commands did you use to join the Win7 VirtualBox VMs? They will need to use options for their public IPs in the join command so the Linux machines know how to find them. – Bret Fisher Jun 08 '18 at 00:38
  • @BretFisher I used `--advertise-addr` on the Windows machines. It seems to work, as they join the cluster and get jobs delegated to them. The only thing that doesn't work is the network mesh... – Frankie Jun 08 '18 at 03:39
  • You might need to change the interface of the `--data-path-addr` in the `join`. Also, the overlay traffic is encapsulated on UDP 4789, which by your list isn't forwarded. See https://www.bretfisher.com/docker-swarm-firewall-ports/ – Bret Fisher Jun 08 '18 at 16:22
  • Hi @BretFisher, thanks for the answer and for the site. The port was actually UDP and not TCP (typo on SF, sorry). `--data-path-addr` looked very promising but unfortunately was still not the home run. I guess it was already auto-using the correct IP from `--advertise-addr`. There is also a peculiarity: with the mesh network on, when querying from another machine the service halts for 20s or so before starting to respond correctly again. I'm using mode `host` and an haproxy on every machine to work around it, but it's not an elegant solution. – Frankie Jun 11 '18 at 02:05
  • I assume you're not enabling encryption on overlay networks (which requires a different protocol than TCP/UDP)... other than that I'm out of ideas. – Bret Fisher Jun 11 '18 at 06:13
