I would take a multi-step approach to troubleshooting this. Pardon the extra detail and over-explanation; everyone here at CoreOS has to deal with this from me. ;)
First and foremost, make sure that the URL you are trying to download from can be retrieved from inside the cluster. I don't see any reason why it could not be, as I was able to wget it myself. (As an aside, it's generally better not to put private key material in a publicly accessible tarball. While still not optimal, it would be better to either include those assets in the user-data or, at the very least, protect the tarball with symmetric encryption.)
As cloud-init runs after the network is online, the network should already be available by the time your unit starts (the meta-data service resides at http://169.254.169.254 and thus the cloud-config cannot be retrieved until after the network is online). This means the likely culprits are either transient network issues or something unrelated to the download.
When I attempt to run through this I get the following error:
core@rbtest ~ $ journalctl -u bootstrap.service
-- Logs begin at Wed 2016-04-13 17:31:35 UTC, end at Wed 2016-04-13 17:33:09 UTC. --
Apr 13 17:31:47 rbtest.c.coreos-support.internal systemd[1]: [/etc/systemd/system/bootstrap.service:10] Executable path is not absolute, ignoring: cd /tmp/kubernetes-staging
Apr 13 17:31:47 rbtest.c.coreos-support.internal systemd[1]: Starting Bootstrap instance...
Apr 13 17:31:47 rbtest.c.coreos-support.internal sh[1074]: --2016-04-13 17:31:47-- https://storage.googleapis.com/experimentalberlin/staging.tar.gz
Apr 13 17:31:47 rbtest.c.coreos-support.internal sh[1074]: Resolving storage.googleapis.com... 209.85.200.128, 2607:f8b0:4001:c08::80
Apr 13 17:31:47 rbtest.c.coreos-support.internal sh[1074]: Connecting to storage.googleapis.com|209.85.200.128|:443... connected.
Apr 13 17:31:48 rbtest.c.coreos-support.internal sh[1074]: HTTP request sent, awaiting response... 200 OK
Apr 13 17:31:48 rbtest.c.coreos-support.internal sh[1074]: Length: 4722 (4.6K) [application/x-tar]
Apr 13 17:31:48 rbtest.c.coreos-support.internal sh[1074]: Saving to: 'staging.tar.gz'
Apr 13 17:31:48 rbtest.c.coreos-support.internal sh[1074]: 0K .... 100% 47.4M=0s
Apr 13 17:31:48 rbtest.c.coreos-support.internal sh[1074]: 2016-04-13 17:31:48 (47.4 MB/s) - 'staging.tar.gz' saved [4722/4722]
Apr 13 17:31:48 rbtest.c.coreos-support.internal systemd[1]: bootstrap.service: Main process exited, code=exited, status=203/EXEC
Apr 13 17:31:48 rbtest.c.coreos-support.internal systemd[1]: Failed to start Bootstrap instance.
Apr 13 17:31:48 rbtest.c.coreos-support.internal systemd[1]: bootstrap.service: Unit entered failed state.
Apr 13 17:31:48 rbtest.c.coreos-support.internal systemd[1]: bootstrap.service: Failed with result 'exit-code'.
The clue here is the line:
bootstrap.service: Main process exited, code=exited, status=203/EXEC
This message is telling you that there was a problem executing the script itself. On inspection this makes complete sense: systemd executes the file directly rather than through a shell, and there is no shebang at the top of that shell script to say which interpreter should run it. The script consists entirely of Bourne shell / Bourne-Again shell compatible commands, so the shebang should likely be either #!/bin/sh or #!/bin/bash. Adding a shebang should fix this issue.
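Since 203/EXEC only says that the exec step failed, it can help to check the usual causes by hand before restarting the unit. The following is a minimal sketch, not anything shipped with CoreOS: check_exec is a hypothetical helper name, and it only tests that the file exists, is executable, and starts with a shebang.

```shell
#!/bin/sh
# check_exec FILE: report the common reasons a script cannot be exec'd directly.
# (Hypothetical helper for illustration; not part of systemd or CoreOS.)
check_exec() {
    script=$1
    if [ ! -f "$script" ]; then
        echo "missing: $script"
        return 1
    fi
    if [ ! -x "$script" ]; then
        echo "not executable: $script"
        return 1
    fi
    # Read only the first line; "|| true" covers files without a trailing newline.
    first_line=
    IFS= read -r first_line < "$script" || true
    case $first_line in
        '#!'*) echo "ok: $script" ;;
        *) echo "no shebang: $script"; return 1 ;;
    esac
}
```

Assuming the tarball unpacks bootstrap.sh directly into the staging directory, you would call it as check_exec /tmp/kubernetes-staging/bootstrap.sh.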
Some other minor nits:
- When using wget, specify the download location:
wget -O /tmp/kubernetes-staging/staging.tar.gz https://storage.googleapis.com/experimentalberlin/staging.tar.gz
- When expanding your tarball, you can extract it to a specific location with -C:
tar xf /tmp/kubernetes-staging/staging.tar.gz -C /tmp/kubernetes-staging/
With explicit paths, each command can go in its own ExecStart= option, which provides additional logging.
- As most of these commands are preamble to the execution of the actual bootstrap.sh script, I would change all of the ExecStart= options (with the exception of the last) to ExecStartPre=.
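Putting these nits together, the [Service] section could look roughly like the sketch below. I have not seen your full unit, so treat the details as assumptions: the binary locations under /usr/bin, the mkdir step, Type=oneshot, and the location of bootstrap.sh inside the extracted tarball are all guesses on my part. Note that every command uses an absolute path, which also avoids the "Executable path is not absolute, ignoring: cd /tmp/kubernetes-staging" warning from the log above.

```ini
[Unit]
Description=Bootstrap instance

[Service]
# Assumption: the unit only needs to run once to completion.
Type=oneshot
# Assumption: the staging directory has to be created first.
ExecStartPre=/usr/bin/mkdir -p /tmp/kubernetes-staging
# Download to an explicit location instead of relying on cd.
ExecStartPre=/usr/bin/wget -O /tmp/kubernetes-staging/staging.tar.gz https://storage.googleapis.com/experimentalberlin/staging.tar.gz
# Extract into the staging directory with -C.
ExecStartPre=/usr/bin/tar xf /tmp/kubernetes-staging/staging.tar.gz -C /tmp/kubernetes-staging/
# Assumption: bootstrap.sh sits at the top level of the tarball and has a shebang.
ExecStart=/tmp/kubernetes-staging/bootstrap.sh
```

With this layout, a failing download or extraction shows up in journalctl -u bootstrap.service against its own ExecStartPre= line rather than being hidden inside one combined command.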