I have a single small VPS that hosts a variety of services, such as a WordPress installation and a few web apps. For some time I have run all of these services as Docker containers. Since I have a variety of domains and subdomains pointing to this box, I use the frontend proxy Traefik to listen on the web ports and then route requests internally over a Docker network.

I start up Traefik like so:

#!/bin/bash

# Removes the restart policy from previous containers
CONTAINER_LABEL=traefik-instance
../../bin/remove-restart.sh "$CONTAINER_LABEL"

mkdir --parents /var/log/traefik
mkdir --parents /etc/letsencrypt-traefik

docker run \
    --label "$CONTAINER_LABEL" \
    --publish 80:80 \
    --publish 443:443 \
    --volume "$PWD/traefik.toml:/etc/traefik/traefik.toml" \
    --volume "$PWD/rules:/etc/traefik/rules" \
    --volume /etc/letsencrypt-traefik:/etc/letsencrypt-traefik \
    --volume /var/log/traefik:/log \
    --network dockernet \
    --detach \
    --restart always \
    traefik:1.6

This all works very nicely. I have recently discovered Docker Swarm, and would like to convert all of my containers to services, which will give me replication, rolling updates and zero-downtime deployments. However, I would like to make the change piecemeal, so Traefik needs to route to both Swarm services and ordinary (non-Swarm) containers during the transition.

So, to launch Traefik as a service, I am now doing the following. You'll notice I am using non-standard ports for the purposes of testing:

#!/bin/bash

# Using "traefik2" while I am experimenting with multiple services
mkdir --parents /var/log/traefik2
mkdir --parents /etc/letsencrypt-traefik

docker service create \
    --publish 8080:80 \
    --publish 8443:443 \
    --mount "type=bind,source=$PWD/traefik.toml,target=/etc/traefik/traefik.toml" \
    --mount "type=bind,source=$PWD/rules,target=/etc/traefik/rules" \
    --mount type=bind,source=/etc/letsencrypt-traefik,target=/etc/letsencrypt-traefik \
    --mount type=bind,source=/var/log/traefik2,target=/log \
    --network traefiknet \
    traefik:1.6

This also works when pointed at a Swarm web service on the same network.

So, I have two Docker networks (amongst the various defaults that Docker creates for itself) like so:

root@box:~/docker# docker network ls
NETWORK ID          NAME                DRIVER              SCOPE
1aa479f13faa        dockernet           bridge              local
k71hpg1n0lo9        traefiknet          overlay             swarm

This results in my having a working Traefik container that can see Docker containers, and a working Traefik service that can see Swarm services. However, they cannot see each other.

To try to fix this, I have tried to add the Docker network to the start-up of the Traefik Swarm service:

--network dockernet \

In other words, I want this service to connect to both the bridge (old) and the overlay (new) networks. Unfortunately I get this:

Error response from daemon: The network dockernet cannot be used with services. Only networks scoped to the swarm can be used, such as those created with the overlay driver.
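
The error is about the network's scope: dockernet is a local bridge, and services accept only swarm-scoped networks. As a rough sketch, a script could check this before attaching a service. It assumes the .Scope field printed by docker network inspect (the same value as the SCOPE column in the listing above); the function name network_is_swarm_scoped is made up for illustration.

```shell
#!/bin/bash

# Hypothetical helper: succeeds only if the named network has swarm scope,
# i.e. it is a network that "docker service create --network" will accept.
network_is_swarm_scoped() {
    local scope
    scope=$(docker network inspect --format '{{.Scope}}' "$1") || return 1
    [ "$scope" = "swarm" ]
}

# Report on every network named on the command line, for example:
#   ./check-scope.sh dockernet traefiknet
for net in "$@"; do
    if network_is_swarm_scoped "$net"; then
        echo "$net: swarm scope, usable with services"
    else
        echo "$net: not swarm-scoped, services will reject it"
    fi
done
```

With the two networks listed above, dockernet would be reported as not swarm-scoped and traefiknet as usable.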

Is there a way my new service can connect to the old network, or indeed is there a way my old containers can connect to the new network? I have tried searching for the error, but there do not seem to be many mentions of it at all; I wonder if this edge-case usage of Swarm has not been encountered by many folks yet.

(Of course, one solution is for me to convert all of my containers to services, but to avoid a big-bang change, I'd rather do it slowly if possible).

Trying attachable networks

I then deleted my services and tried this:

docker network rm traefiknet
docker network create --driver=overlay --attachable traefiknet

I then recreated the Traefik service, and it starts up. It is evidently still working because it routes traffic to a service that has also joined the traefiknet overlay.
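
To rule out a network that was recreated without the flag, the attachable setting can be checked before any standalone container joins. This is only a sketch: it assumes the .Attachable field reported by docker network inspect, and ensure_attachable_overlay is a hypothetical helper name.

```shell
#!/bin/bash

# Hypothetical helper: make sure an attachable overlay network exists.
# Creates it when missing; fails if it exists without --attachable, since
# that setting cannot be changed after creation.
ensure_attachable_overlay() {
    local net attachable
    net=$1
    if docker network inspect "$net" >/dev/null 2>&1; then
        attachable=$(docker network inspect --format '{{.Attachable}}' "$net")
        [ "$attachable" = "true" ] && return 0
        echo "network $net exists but is not attachable; remove and recreate it" >&2
        return 1
    fi
    docker network create --driver overlay --attachable "$net" >/dev/null
}
```

Calling ensure_attachable_overlay traefiknet before starting the Traefik service would then either confirm the network or stop with a message.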

However, I have created a non-service container and connected it exclusively to traefiknet, and the --network-alias I gave it cannot be seen by the service. Oddly, if I shell into this non-Swarm container, it can ping the Swarm Traefik container, so the network works. (I have also tried creating an Alpine shell service connected to traefiknet, and from there I can ping neither the container name of my non-Swarm container nor its --network-alias.)

Upgrading Docker

I have tried to upgrade Docker from 17.03.2-ce to 18.06.1-ce, because a phrase in the manual indicated that my old Docker version might be the cause of the problem:

Communicate between a container and a swarm service sets up communication between a standalone container and a swarm service, using an attachable overlay network. This is supported in Docker 17.06 and higher.

However, this also has not helped.
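
For reference, the 17.06 requirement quoted above can be checked mechanically. A minimal sketch, assuming GNU sort with the -V (version sort) option is available; version_at_least is a made-up helper name.

```shell
#!/bin/bash

# Hypothetical helper: succeeds if version $1 is at least version $2.
# Suffixes such as "-ce" are dropped before comparing.
version_at_least() {
    local have want lowest
    have=${1%%-*}
    want=${2%%-*}
    # Version-sort the pair; if the required version sorts first (or the
    # two are equal), the installed version is new enough.
    lowest=$(printf '%s\n%s\n' "$have" "$want" | sort -V | head -n 1)
    [ "$lowest" = "$want" ]
}
```

The installed version could be fed in with something like version_at_least "$(docker version --format '{{.Server.Version}}')" 17.06. By this check the old 17.03.2-ce fails and the upgraded 18.06.1-ce passes.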

halfer

1 Answer


I believe I have a fix for this, though there are still some things I don't understand. To set the context for this answer, the following is how I now install a Docker non-Swarm container:

#!/bin/bash

# Save pwd and then change dir to the project root
STARTDIR=$(pwd)
cd "$(dirname "$0")/../.."

# Removes the restart policy from previous containers
CONTAINER_LABEL=ilovephp-staging
NETWORK_ALIAS=${CONTAINER_LABEL}
./bin/remove-restart.sh "$CONTAINER_LABEL"

docker run \
    --network swarmnet \
    --network-alias ${NETWORK_ALIAS} \
    --env TUTORIAL_ENVIRONMENT_NAME=staging \
    --detach \
    --restart always \
    ilovephp:2018-08-19

# Go back to original dir
cd "$STARTDIR"

As you can see, as per my question update, I am now able to put a container on an attachable Swarm network.

I have discovered that the reason why this cannot be pinged from a Swarm service is that it is missing a --name. As soon as I add a --name, it is reachable. Interestingly though, if I try to ping the Swarm service or the Swarm container from the Docker container, it works:

root@server:~# docker exec -it loving_allen sh
/ # # *** Ping a specific Swarm container ***
/ # ping alpine-swarm.1.9llv2hvv5xnc1c8diuhfh5m09
PING alpine-swarm.1.9llv2hvv5xnc1c8diuhfh5m09 (10.0.1.23): 56 data bytes
64 bytes from 10.0.1.23: seq=0 ttl=64 time=0.516 ms
^C
--- alpine-swarm.1.9llv2hvv5xnc1c8diuhfh5m09 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.516/0.516/0.516 ms
/ # # *** Ping the Swarm service ***
/ # ping alpine-swarm
PING alpine-swarm (10.0.1.22): 56 data bytes
64 bytes from 10.0.1.22: seq=0 ttl=64 time=0.376 ms
^C
--- alpine-swarm ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.137/0.231/0.376 ms

Thus, my original plan was to convert my Traefik instance from a container to a Swarm service, but that won't work, because unnamed non-Swarm containers won't be reachable. My solution now is to convert all my systems from containers to services first, and then once that is all done, I can convert the Traefik instance last. That way I will always be connecting from container to service, and not the other way around.

(I cannot add --names since if the Docker host is restarted, it will try to recreate containers with the same name, and they will fail because the old containers will still have those clashing names. And I can't use --rm to fix that, because it's not compatible with --restart! More on that here).
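
One possible way around the name clash, untested against the setup described here, is to remove any earlier container with the same name just before starting the replacement. The sketch below assumes that briefly stopping the old container is acceptable; run_named is a hypothetical helper, and reusing the name as the network alias is only a convention borrowed from the script above.

```shell
#!/bin/bash

# Hypothetical helper: replace any container called $1 with a fresh one.
# The remaining arguments are passed to "docker run" unchanged.
run_named() {
    local name
    name=$1
    shift
    # Remove a previous container with this name, if there is one, so the
    # fixed name cannot clash. The error when none exists is ignored.
    docker rm --force "$name" >/dev/null 2>&1 || true
    docker run --name "$name" --network-alias "$name" "$@"
}
```

It would be called as run_named ilovephp-staging --network swarmnet --env TUTORIAL_ENVIRONMENT_NAME=staging --detach --restart always ilovephp:2018-08-19. Whether this fits alongside remove-restart.sh depends on what that script does, which is not shown here.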

I still don't understand why a Docker-generated container name is not pingable from a Swarm container, nor why a --network-alias does not help either. But, given that my solution only needs to be temporary until all my systems are Swarm services, it probably does not matter.
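
To narrow down whether this is a name-resolution problem or an attachment problem, it may help to confirm that the standalone container is actually listed on the overlay network. A rough sketch, assuming the .Containers map printed by docker network inspect (on an overlay this lists only containers on the local node, which is fine for a single host); container_on_network is a made-up name.

```shell
#!/bin/bash

# Hypothetical helper: succeeds if a container named $1 is attached to
# network $2 on this host.
container_on_network() {
    docker network inspect \
        --format '{{range .Containers}}{{println .Name}}{{end}}' "$2" |
        grep -qxF -- "$1"
}
```

For example, container_on_network loving_allen swarmnet should succeed for the container used in the ping session above. If it does, the container is attached and the remaining question is purely one of DNS names.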

halfer