I am trying to start etcd2 on my CoreOS node.

I have this in my cloud-config:

coreos:
  etcd2:
    discovery: https://discovery.etcd.io/new?size=1
    advertise-client-urls: http://127.0.0.1:2379,http://127.0.0.1:4001
    initial-advertise-peer-urls: http://127.0.0.1:2380
    listen-client-urls: http://0.0.0.0:2379,http://0.0.0.0:4001
    listen-peer-urls: http://127.0.0.1:2380

After the installation, when I boot the system I get the error (according to the logs):

etcdmain: invalid character 'p' after top-level value

and etcd2 fails to start.

What does that mean? I have followed the guides on https://coreos.com/os/docs/latest/cloud-config.html and https://coreos.com/os/docs/latest/cluster-discovery.html.

EDIT

Node 1

coreos:
  etcd2:
    name: coreos1
    discovery: https://discovery.etcd.io/2d585793b364cf8985ca7a983d6c52e3
    advertise-client-urls: http://127.0.0.1:2379,http://127.0.0.1:4001
    initial-advertise-peer-urls: http://127.0.0.1:2380
    listen-client-urls: http://0.0.0.0:2379,http://0.0.0.0:4001
    listen-peer-urls: http://127.0.0.1:2380

Node 2

coreos:
  etcd2:
    name: coreos2
    discovery: https://discovery.etcd.io/2d585793b364cf8985ca7a983d6c52e3
    advertise-client-urls: http://127.0.0.1:2379,http://127.0.0.1:4001
    initial-advertise-peer-urls: http://127.0.0.1:2380
    listen-client-urls: http://0.0.0.0:2379,http://0.0.0.0:4001
    listen-peer-urls: http://127.0.0.1:2380

coreos1> journalctl -u etcd2:

Sep 21 20:10:31 coreos1 etcd2[671]: 2015/09/21 20:10:31 discovery: found self e276d5b4c276a19d in the cluster
Sep 21 20:10:31 coreos1 etcd2[671]: 2015/09/21 20:10:31 discovery: found 1 peer(s), waiting for 1 more
Sep 21 20:11:31 coreos1 etcd2[671]: 2015/09/21 20:11:31 discovery: error #0: client: etcd member https://discovery.etcd.io returns server error [Gateway Timeout]
Sep 21 20:11:31 coreos1 etcd2[671]: 2015/09/21 20:11:31 discovery: waiting for other nodes: error connecting to https://discovery.etcd.io, retrying in 8m32s

coreos2> journalctl -u etcd2:

Sep 21 20:11:43 coreos2 systemd[1]: Starting etcd2...
Sep 21 20:11:43 coreos2 etcd2[1515]: 2015/09/21 20:11:43 etcdmain: etcd Version: 2.1.2
Sep 21 20:11:43 coreos2 etcd2[1515]: 2015/09/21 20:11:43 etcdmain: Git SHA: ff8d1ec
Sep 21 20:11:43 coreos2 etcd2[1515]: 2015/09/21 20:11:43 etcdmain: Go Version: go1.4.2
Sep 21 20:11:43 coreos2 etcd2[1515]: 2015/09/21 20:11:43 etcdmain: Go OS/Arch: linux/amd64
Sep 21 20:11:43 coreos2 etcd2[1515]: 2015/09/21 20:11:43 etcdmain: setting maximum number of CPUs to 1, total number of available CPUs is 1
Sep 21 20:11:43 coreos2 etcd2[1515]: 2015/09/21 20:11:43 etcdmain: listening for peers on http://127.0.0.1:2380
Sep 21 20:11:43 coreos2 etcd2[1515]: 2015/09/21 20:11:43 etcdmain: listening for client requests on http://0.0.0.0:2379
Sep 21 20:11:43 coreos2 etcd2[1515]: 2015/09/21 20:11:43 etcdmain: listening for client requests on http://0.0.0.0:4001
Sep 21 20:11:45 coreos2 etcd2[1515]: 2015/09/21 20:11:45 etcdmain: stopping listening for client requests on http://0.0.0.0:4001
Sep 21 20:11:45 coreos2 etcd2[1515]: 2015/09/21 20:11:45 etcdmain: stopping listening for client requests on http://0.0.0.0:2379
Sep 21 20:11:45 coreos2 etcd2[1515]: 2015/09/21 20:11:45 etcdmain: stopping listening for peers on http://127.0.0.1:2380
Sep 21 20:11:45 coreos2 etcd2[1515]: 2015/09/21 20:11:45 etcdmain: member "core2" has previously registered with discovery service token (https://discovery.etcd.io/2d585793b364cf8985ca7a983d6c52e3).
Sep 21 20:11:45 coreos2 etcd2[1515]: 2015/09/21 20:11:45 etcdmain: But etcd could not find vaild cluster configuration in the given data dir (/var/lib/etcd2).
Sep 21 20:11:45 coreos2 etcd2[1515]: 2015/09/21 20:11:45 etcdmain: Please check the given data dir path if the previous bootstrap succeeded
Sep 21 20:11:45 coreos2 etcd2[1515]: 2015/09/21 20:11:45 etcdmain: or use a new discovery token if the previous bootstrap failed.
Sep 21 20:11:45 coreos2 systemd[1]: etcd2.service: Main process exited, code=exited, status=1/FAILURE
Sep 21 20:11:45 coreos2 systemd[1]: etcd2.service: Unit entered failed state.
Sep 21 20:11:45 coreos2 systemd[1]: etcd2.service: Failed with result 'exit-code'.
Rox

2 Answers

Your discovery URL is incorrect. The URL https://discovery.etcd.io/new?size=1 is the endpoint that generates a fresh discovery URL; it is not itself a discovery URL you can put in your configuration. Request it once manually, e.g. with curl:

curl --silent -H "Accept: text/plain" "https://discovery.etcd.io/new?size=1"

This will return a URL like this:

https://discovery.etcd.io/a93e30ebf9375f2385fef54c83b2840d

A URL of that form is what your discovery setting should contain. Always generate a fresh discovery URL whenever you build a new cluster.
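
As a rough illustration of the difference between the two kinds of URL, here is a minimal Python sketch that tells a ready-to-use discovery URL apart from the `/new?size=N` generator endpoint. The function name `is_discovery_token_url` is hypothetical, and the assumption that a token is exactly 32 lowercase hex characters is taken only from the example URLs in this thread, not from any official specification.

```python
import re
from urllib.parse import urlparse

# Assumption: a discovery token is 32 lowercase hex characters, as in the
# example URLs in this thread. The real service may accept other forms.
TOKEN_PATH = re.compile(r"/[0-9a-f]{32}")


def is_discovery_token_url(url: str) -> bool:
    """Return True if `url` looks like a ready-to-use discovery URL."""
    parts = urlparse(url)
    if parts.scheme != "https" or not parts.netloc:
        return False
    # The generator endpoint (/new?size=N) carries a query string; it is
    # meant to be fetched once, not pasted into the cloud-config.
    if parts.query:
        return False
    return TOKEN_PATH.fullmatch(parts.path) is not None


if __name__ == "__main__":
    for candidate in (
        "https://discovery.etcd.io/new?size=1",
        "https://discovery.etcd.io/a93e30ebf9375f2385fef54c83b2840d",
    ):
        print(candidate, is_discovery_token_url(candidate))
```

A check like this could run before a cloud-config is deployed, so that the generator endpoint is caught before etcd2 ever tries to use it.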

Paul Dixon
  • Oh, now it works, at least with 1 node. But with two nodes it does not. I have the same discovery URL for the two nodes and different names. Is it because I have `http://127.0.0.1` as the URL for `advertise-client-urls`? Look at my edit above. Does it seem correct? – Rox Sep 21 '15 at 20:08
  • And to mention: when I tried with one node I generated a discovery token with `https://discovery.etcd.io/new?size=1` and when using two nodes I used `https://discovery.etcd.io/new?size=2`. – Rox Sep 21 '15 at 20:18
  • Oh, that is right! Dang, I should have paid more attention to that. In addition, I think size < 3 doesn't make sense. The size= parameter controls how many servers form the cluster. 2 nodes can't make a cluster and 1 node definitely won't. I'm not sure how etcd2 deals with that. – hookenz Sep 23 '15 at 21:37
  • 1 will work just fine for development. 2 will *work*, but if etcd fails on one node, the other one won't have a quorum and will stop - so not terribly useful. – Paul Dixon Sep 24 '15 at 09:51
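
The quorum point in the last comment can be made concrete with a little arithmetic. The sketch below assumes the usual majority rule (more than half of the members must be available); the function names `quorum` and `failures_tolerated` are hypothetical and exist only for illustration.

```python
def quorum(size: int) -> int:
    """Smallest number of members that forms a majority of `size`."""
    if size < 1:
        raise ValueError("cluster size must be at least 1")
    return size // 2 + 1


def failures_tolerated(size: int) -> int:
    """How many members can fail while a majority is still available."""
    return size - quorum(size)


if __name__ == "__main__":
    for size in (1, 2, 3, 5):
        print(size, quorum(size), failures_tolerated(size))
```

Under that rule, sizes 1 and 2 both tolerate zero failures, which matches the comment: a two-node cluster runs, but losing either node stops it.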
If that's the entire cloud-config.yml file then it's wrong.

The very first line of the file needs to be:

#cloud-config

i.e.

#cloud-config

coreos:
  etcd2:
    discovery: https://discovery.etcd.io/new?size=1
    advertise-client-urls: http://127.0.0.1:2379,http://127.0.0.1:4001
    initial-advertise-peer-urls: http://127.0.0.1:2380
    listen-client-urls: http://0.0.0.0:2379,http://0.0.0.0:4001
    listen-peer-urls: http://127.0.0.1:2380

In addition, be careful when copying cloud-configs from a website. I've had strange hidden characters end up in the resulting file, which caused parser errors and made the service fail to start!
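
One way to look for both problems mentioned here (a missing first line and hidden characters) is a small script along these lines. This is only a sketch: the function name `find_problems` is hypothetical, and treating every non-ASCII or control character as suspicious is an assumption that will produce false positives if the file legitimately contains non-ASCII text.

```python
import unicodedata
from typing import List

HEADER = "#cloud-config"


def find_problems(text: str) -> List[str]:
    """Report a missing header line and any suspicious characters."""
    problems: List[str] = []
    lines = text.split("\n")
    # Tolerate Windows line endings when comparing the first line.
    if lines[0].rstrip("\r") != HEADER:
        problems.append("line 1: expected '#cloud-config' as the first line")
    for lineno, line in enumerate(lines, start=1):
        for col, ch in enumerate(line.rstrip("\r"), start=1):
            # Assumption: anything outside printable ASCII (tab excepted)
            # deserves a manual look, e.g. non-breaking or zero-width spaces.
            if ord(ch) > 126 or (ord(ch) < 32 and ch != "\t"):
                name = unicodedata.name(ch, "unnamed control character")
                problems.append(
                    f"line {lineno}, col {col}: U+{ord(ch):04X} {name}"
                )
    return problems


if __name__ == "__main__":
    sample = "coreos:\n  etcd2:\n    name:\u00a0coreos1\n"
    for problem in find_problems(sample):
        print(problem)
```

To check a real file, read it with `open(path, encoding="utf-8").read()` and pass the result to `find_problems`; an empty list means neither issue was found.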

hookenz
  • No, that was just the `etcd` part of the cloud-config file. It is correct. But etcd still fails to start. – Rox Sep 21 '15 at 06:22
  • Ok, I suspect it's some invisibles that are invalid in your cloud config. This happened to me. – hookenz Sep 21 '15 at 06:33