I'm trying to set up a test Elasticsearch cluster on 3 separate hosts, using the official 7.2.0 Docker image.

Each container is configured with an elasticsearch.yml which looks like this:

cluster.name: mytest
network.host: "0.0.0.0"
node.name: mytest-10.131.105.90
discovery.seed_hosts:
  - "10.131.128.252:9300"
  - "10.131.129.28:9300"
  - "10.131.105.90:9300"
cluster.initial_master_nodes:
  - mytest-10.131.128.252
  - mytest-10.131.129.28
  - mytest-10.131.105.90

Once each node has started up, it's unable to discover the other nodes, reporting this

{
  "type": "server",
  "timestamp": "2019-07-04T18:42:18,751+0000",
  "level": "WARN",
  "component": "o.e.c.c.ClusterFormationFailureHelper",
  "cluster.name": "mytest",
  "node.name": "mytest-10.131.105.90",
  "message": "master not discovered yet, this node has not previously joined a bootstrapped (v7+) cluster, and this node must discover master-eligible nodes [mytest-10.131.128.252, mytest-10.131.129.28, mytest-10.131.105.90] to bootstrap a cluster: have discovered []; discovery will continue using [10.131.128.252:9300, 10.131.129.28:9300, 10.131.105.90:9300] from hosts providers and [{mytest-10.131.105.90}{qZqV5-4RSduwKNYIOWVB9A}{_nCNwrToRoeNAiWBO1DbGg}{134.209.178.145}{134.209.178.145:9300}{ml.machine_memory=2090500096, xpack.installed=true, ml.max_open_jobs=20}] from last-known cluster state; node term 0, last-accepted version 0 in term 0"
}

Just to repeat that long error with word wrapping...

master not discovered yet, this node has not previously joined a bootstrapped (v7+) cluster, and this node must discover master-eligible nodes [mytest-10.131.128.252, mytest-10.131.129.28, mytest-10.131.105.90] to bootstrap a cluster: have discovered []; discovery will continue using [10.131.128.252:9300, 10.131.129.28:9300, 10.131.105.90:9300] from hosts providers and [{mytest-10.131.105.90}{qZqV5-4RSduwKNYIOWVB9A}{_nCNwrToRoeNAiWBO1DbGg}{134.209.178.145}{134.209.178.145:9300}{ml.machine_memory=2090500096, xpack.installed=true, ml.max_open_jobs=20}] from last-known cluster state; node term 0, last-accepted version 0 in term 0

It doesn't seem to be a networking issue. From inside the container, I can use curl to verify access to ports 9200 and 9300 on the other nodes.

I suspect it's something subtle about the node names, and I was hoping that in writing this question I'd hit upon the answer. Alas, not.

addendum - docker run

My docker run looks like this, simplified a little (${IP} is the host machine's IP address).

docker run --rm --name elasticsearch \
  -p ${IP}:9200:9200 -p ${IP}:9300:9300 \
  --network host \
  my-elasticsearch:7.2.0 \
  /usr/local/bin/start-clustered-es.sh 

Each container is running on a separate machine. start-clustered-es.sh simply writes the elasticsearch.yml file as outlined above, so each node starts with the same config. Once the file is written, it calls the base container's startup script with: exec /usr/local/bin/docker-entrypoint.sh eswrapper

I tried --network host as the config uses the IP of the host machine. From inside the containers, I can reach port 9200/9300 of the other machines, so it doesn't seem to be a network issue.

Any pointers most welcome...

Paul Dixon
  • What do your `docker run` commands look like? Are you running each instance on a separate server? What kind of networking are you using? – GregL Jul 06 '19 at 01:02
  • I've added the `docker run` commands to the question. Each instance is on a separate server, and for this test I was keeping it simple and using host networking – Paul Dixon Jul 06 '19 at 15:42
  • Can you get it working if you run all three containers on one host? Just as a diagnostic? – Mike Diehn Jul 09 '19 at 04:37

1 Answer

One idea is to limit transport.profiles.default.port (also known as transport.port) to the single port 9300, or to set -p on docker run to the full default range of 9300-9400.

According to the documentation at https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-transport.html, transport.profiles.default.port defaults to 9300-9400.

Further, the documentation for discovery.seed_hosts says that the port relates to transport.profiles.default.port: https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-discovery-settings.html
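
A minimal sketch of the first option, assuming the elasticsearch.yml from the question is otherwise left unchanged. The setting name comes from the transport documentation linked above; pinning it to 9300 is only an illustration of the idea, not a confirmed fix for this cluster:

```yaml
# elasticsearch.yml (fragment), added to the config shown in the question.
# Assumption: binding the transport layer to a single port, instead of the
# default range 9300-9400, makes it match the :9300 entries in discovery.seed_hosts.
transport.port: 9300
```

With the port pinned, each node should listen for transport traffic on exactly the port that the other nodes' seed host entries point to.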

Hope this suggestion helps. It has been some time since I last formed a cluster; that was with version 6.x, which needed some discovery.zen values with Docker.

hargut