DNS service discovery using Consul in Docker Swarm


I'm trying to get service discovery working using Consul DNS. I've set up a docker swarm with a consul cluster that serves both as the key/value backend required by swarm and as the service discovery backend for the other containers.

I start off with 3 bare servers with the docker engine installed. I'm provisioning the cluster using ansible.

The process for setting up this cluster so far is:

  • When installing docker, set --cluster-store=consul:// in the docker daemon opts
  • On the "primary" cluster node, start a consul server container in "-bootstrap-expect 3" mode
  • On the "secondary" cluster nodes, start a consul server container in "-join" mode
  • Start a swarm master and swarm agent container on each cluster node, pointing to the local consul server on the same host
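
The primary/secondary split in the steps above could be expressed as a small helper that produces the value passed in as {{ consul_command }}. This is only a sketch: the function name consul_command and the address 192.0.2.10 are hypothetical placeholders, and a real multi-host setup will likely need further flags (such as an advertise address) that the post does not show.

```shell
#!/bin/sh
# Hypothetical helper, not from the original post: prints the Consul agent
# arguments for a node depending on its role in the cluster.
consul_command() {
    role="$1"        # "primary" or "secondary"
    primary_ip="$2"  # address of the primary node (assumed; secondaries only)
    case "$role" in
        primary)
            # The first server waits for 3 servers before electing a leader.
            printf '%s\n' "-server -bootstrap-expect 3"
            ;;
        secondary)
            # The other servers join the cluster through the primary node.
            if [ -z "$primary_ip" ]; then
                echo "secondary role needs the primary address" >&2
                return 1
            fi
            printf '%s\n' "-server -join $primary_ip"
            ;;
        *)
            echo "unknown role: $role" >&2
            return 1
            ;;
    esac
}
```

Calling consul_command secondary 192.0.2.10 prints the arguments for a joining server, which the ansible template could then substitute into the compose file.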

When starting up the consul servers I'm mapping all of the consul ports to the host, like this:

version: '2'

services:
    consul:
        image: progrium/consul
        hostname: "{{ ansible_hostname }}"
        ports:
            # Explanation of ports needed: http://stackoverflow.com/a/30692226/1514089
            - "8300:8300" # This is used by servers to handle incoming requests from other agents
            - "8301:8301/tcp" # This is used to handle gossip in the LAN. Required by all agents
            - "8301:8301/udp" # This is used to handle gossip in the LAN. Required by all agents
            - "8302:8302/tcp" # This is used by servers to gossip over the WAN to other servers
            - "8302:8302/udp" # This is used by servers to gossip over the WAN to other servers
            - "8400:8400" # This is used by all agents to handle RPC from the CLI
            - "8500:8500" # This is used by clients to talk to the HTTP API
            - "8600:8600" # Used to resolve DNS queries
        restart: always
        command: "{{ consul_command }}"

This gets me a working docker swarm. I can log into any one of the cluster nodes and use docker compose to bring up an application, and the swarm balances things across servers transparently.

Now, though, I want to start using Consul's DNS capabilities to resolve services from inside each container in the swarm. I want to be able to do this:

$ docker run -it ubuntu dig consul.service.consul

I've tried a few things but haven't got it to work. I suspect it has something to do with the docker networks. Let me explain.

When I start the consul servers, they are attached to their own docker-compose network, since I start them with docker-compose. However, since the swarm is not yet operational at that point, there are no multi-host overlay networks set up yet, so each consul container ends up on its own host-only bridge network. I only create my docker networks after the swarm has been bootstrapped. I haven't been able to add the consul containers to an overlay network after the swarm is up. When I try, I get:

Error response from daemon: No such container: swarm-node-1/swarmconsul_consul_1

How can I get DNS based service discovery working inside my containers?

Emmet O'Grady

Posted 2016-06-02T10:48:28.913

Reputation: 41



In the end I managed to get it working. I was doing a few things wrong.

The biggest mistake was that I had not mapped port 53 on the host to port 53/udp inside the consul container. The full port mapping now looks like this:

        image: progrium/consul
        hostname: "{{ ansible_hostname }}"
        ports:
            # Explanation of ports needed: http://stackoverflow.com/a/30692226/1514089
            - "8300:8300" # This is used by servers to handle incoming requests from other agents
            - "8301:8301/tcp" # This is used to handle gossip in the LAN. Required by all agents
            - "8301:8301/udp" # This is used to handle gossip in the LAN. Required by all agents
            - "8302:8302/tcp" # This is used by servers to gossip over the WAN to other servers
            - "8302:8302/udp" # This is used by servers to gossip over the WAN to other servers
            - "8400:8400" # This is used by all agents to handle RPC from the CLI
            - "8500:8500" # This is used by clients to talk to the HTTP API
            - "8600:8600" # Used to resolve DNS queries
            - ""
        restart: always
        command: "{{ consul_command }}"

Here we are binding to port 53 on the docker0 interface. However, I learned that the IP of the docker0 bridge can change, so I pinned it by specifying --bip= in the docker daemon options, which tells docker to always use this IP for the docker0 bridge. The next step was to set the default DNS of the docker daemon to the same IP. My full docker daemon opts look like this:

ExecStart=/usr/bin/docker daemon \
    -H tcp:// \
    -H unix:///var/run/docker.sock \
    --bip= \
    --dns= \
    --dns-search=service.consul \
    --storage-driver=overlay \
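
The relationship between the --bip and --dns values described above can be sketched as follows: --bip takes the bridge address in CIDR notation, and --dns gets the same address without the prefix length. The helper name daemon_dns_opts and the address 192.0.2.1/24 (a documentation-range placeholder) are assumptions for illustration, not the values used in the original setup.

```shell
#!/bin/sh
# Hypothetical helper, not from the original post: derives the DNS-related
# docker daemon flags from a single bridge address in CIDR notation.
daemon_dns_opts() {
    bridge_cidr="$1"                # e.g. 192.0.2.1/24 (placeholder address)
    bridge_ip="${bridge_cidr%%/*}"  # strip the prefix length to get the bare IP
    # Pin the docker0 bridge address, point container DNS at it, and add
    # service.consul as the search domain.
    printf '%s\n' "--bip=$bridge_cidr --dns=$bridge_ip --dns-search=service.consul"
}
```

Keeping both flags derived from one value means the DNS address handed to containers cannot drift away from the address the consul container's port 53 is bound to.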

Now I can do this:

$ sudo docker -H :4000 run -it joffotron/docker-net-tools
/ # dig +short consul.service.consul
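
If the lookup fails, one thing worth checking is whether the daemon's DNS settings actually reached the container. A rough sketch, assuming the container runs on the default bridge network, where docker normally writes the daemon's --dns and --dns-search values into the container's /etc/resolv.conf. The function name resolv_conf_has and the address 192.0.2.1 are hypothetical.

```shell
#!/bin/sh
# Hypothetical check, not from the original post: succeeds if a
# resolv.conf-style file lists the given nameserver and search domain.
resolv_conf_has() {
    file="$1"        # path to a copy of the container's /etc/resolv.conf
    nameserver="$2"  # expected DNS server, e.g. the docker0 bridge IP
    domain="$3"      # expected search domain, e.g. service.consul
    awk -v ns="$nameserver" -v dom="$domain" '
        $1 == "nameserver" && $2 == ns { found_ns = 1 }
        $1 == "search" {
            for (i = 2; i <= NF; i++) if ($i == dom) found_dom = 1
        }
        END { exit !(found_ns && found_dom) }
    ' "$file"
}
```

For example, the output of docker run --rm ubuntu cat /etc/resolv.conf could be saved to a file and then checked with resolv_conf_has FILE 192.0.2.1 service.consul.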


Emmet O'Grady

Posted 2016-06-02T10:48:28.913

Reputation: 41