
I am deploying OpenStack (I tried Victoria and Ussuri) with kolla-ansible on three CentOS 8 nodes (node 1 = control + compute, nodes 2 and 3 = compute). The deployment works without any problems, but when I create a new VM from an Ubuntu image (focal-server-cloudimg-amd64.img) from here, it does not come up cleanly. As a result, cloud-init cannot finish, the configured SSH key is not installed inside the VM, and I am not able to log in.

ubuntu@10.20.34.137: Permission denied (publickey).

However, its IP address is pingable and all network rules (security groups and so on) are okay. Now the strange part: if I do a Hard Reboot Instance, the VM comes up cleanly and I am able to log in via SSH. Has anyone seen this issue before? I have another, older OpenStack Ussuri installation running (from about 15 months ago), and there the same Ubuntu image works well (I also verified this).

Log output of the first run (last lines only):

-----END SSH HOST KEY KEYS-----
[   49.380046] cloud-init[1387]: Cloud-init v. 21.4-0ubuntu1~20.04.1 running 'modules:final' at Wed, 05 Jan 2022 11:17:08 +0000. Up 49.23 seconds.
[   49.380186] cloud-init[1387]: ci-info: no authorized SSH keys fingerprints found for user ubuntu.
[   49.380481] cloud-init[1387]: Cloud-init v. 21.4-0ubuntu1~20.04.1 finished at Wed, 05 Jan 2022 11:17:08 +0000. Datasource DataSourceNone.  Up 49.37 seconds
[   49.380779] cloud-init[1387]: 2022-01-05 11:17:08,288 - cc_final_message.py[WARNING]: Used fallback datasource

Log output after Hard Reboot Instance (last lines only):

-----END SSH HOST KEY KEYS-----
[   15.105333] cloud-init[851]: Cloud-init v. 21.4-0ubuntu1~20.04.1 running 'modules:final' at Wed, 05 Jan 2022 11:46:21 +0000. Up 14.95 seconds.
[   15.106253] cloud-init[851]: Cloud-init v. 21.4-0ubuntu1~20.04.1 finished at Wed, 05 Jan 2022 11:46:22 +0000. Datasource DataSourceOpenStackLocal [net,ver=2].  Up 15.10 seconds
[  OK  ] Finished Execute cloud user/final scripts.
[  OK  ] Reached target Cloud-init target.

Ubuntu 20.04.3 LTS asdfff ttyS0

asdfff login:
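The decisive difference between the two console logs is the datasource reported on the "finished at" line: DataSourceNone on the first boot, DataSourceOpenStackLocal after the hard reboot. The following is a minimal Python sketch for pulling that value out of a saved console log. The function name and the regular expression are illustrative assumptions written against the two log excerpts above, not part of cloud-init.

```python
import re

# Matches the "finished at" line that cloud-init prints at the end of a run,
# as seen in the two console log excerpts above.
FINISHED_RE = re.compile(
    r"Cloud-init v\. (?P<version>\S+) finished at .*?\. "
    r"Datasource (?P<datasource>.+?)\.\s+Up (?P<uptime>[\d.]+) seconds"
)


def datasource_from_console(log_text):
    """Return (datasource, used_fallback) for the last finished run, or None.

    used_fallback is True when cloud-init reported DataSourceNone, which is
    the case in which no SSH key was installed for the default user.
    """
    matches = list(FINISHED_RE.finditer(log_text))
    if not matches:
        return None
    datasource = matches[-1].group("datasource")
    return datasource, datasource == "DataSourceNone"
```

Feeding it the console output of a freshly created instance (for example, text saved from the instance's console log view) shows at a glance whether the boot fell back to DataSourceNone.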

I also tested a Debian image (debian-10-openstack-amd64.qcow2) from here, and it worked fine on the first startup.

Has anyone else seen this behaviour, or does anyone see something I can do to work around it?

Kind regards,

Michael


2 Answers


Some additional information:

  • The problem is observed with OpenStack Ussuri and Victoria.
  • The problem is observed with an Ubuntu VM that has one interface in the external OpenStack network.
  • During creation of the VM, cloud-init cannot request the VM's metadata from the source "http://169.254.169.254/openstack". This leads to a timeout and to initialization with default keys. The startup log indicates: "WARNING not metadata source"
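To check the last point from inside an affected VM (for example, via the console), one can probe the metadata URL directly. This is a minimal sketch using only the Python standard library. The function name and the injectable opener parameter are illustrative assumptions. The URL and the 10 second timeout are taken from the cloud-init log in this answer.

```python
import urllib.error
import urllib.request

# URL and timeout as they appear in the cloud-init log.
METADATA_URL = "http://169.254.169.254/openstack"


def metadata_reachable(url=METADATA_URL, timeout=10.0,
                       opener=urllib.request.urlopen):
    """Return True if the metadata endpoint answers with HTTP 200 in time.

    The opener argument exists only so the function can be exercised
    without network access; by default the real urlopen is used.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Covers connection timeouts, unreachable routes and HTTP errors.
        return False
```

Calling metadata_reachable() on the first boot would be expected to return False if the connect timeout seen in the log is reproducible, and True after the hard reboot.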

A workaround is to perform a "hard reboot" of the VM. During the hard reboot, the log file "/var/log/cloud-init.log" shows that a route is added, which seems to fix the problem. See the log entry at 14:27:08,314 below, which is at the beginning of the hard reboot. 10.20.34.100 is an IP address within the external network that provides the metadata.

ubuntu@external-server:/var/log$ grep "169.254.169.254" cloud-init.log 
2022-01-06 14:25:33,037 - util.py[DEBUG]: Resolving URL: http://169.254.169.254 took 0.000 seconds
2022-01-06 14:25:33,037 - url_helper.py[DEBUG]: [0/1] open 'http://169.254.169.254/openstack' with {'url': 'http://169.254.169.254/openstack', 'allow_redirects': True, 'method': 'GET', 'timeout': 10.0, 'headers': {'User-Agent': 'Cloud-Init/21.4-0ubuntu1~20.04.1'}} configuration
2022-01-06 14:25:43,050 - url_helper.py[DEBUG]: Calling 'http://169.254.169.254/openstack' failed [10/-1s]: request error [HTTPConnectionPool(host='169.254.169.254', port=80): Max retries exceeded with url: /openstack (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0x7fa3b80d8a60>, 'Connection to 169.254.169.254 timed out. (connect timeout=10.0)'))]
2022-01-06 14:25:43,050 - DataSourceOpenStack.py[DEBUG]: Giving up on OpenStack md from ['http://169.254.169.254/openstack'] after 10 seconds
2022-01-06 14:25:54,772 - util.py[DEBUG]: Resolving URL: http://169.254.169.254 took 10.014 seconds
2022-01-06 14:25:54,772 - url_helper.py[DEBUG]: [0/1] open 'http://169.254.169.254/openstack' with {'url': 'http://169.254.169.254/openstack', 'allow_redirects': True, 'method': 'GET', 'timeout': 10.0, 'headers': {'User-Agent': 'Cloud-Init/21.4-0ubuntu1~20.04.1'}} configuration
2022-01-06 14:26:04,785 - url_helper.py[DEBUG]: Calling 'http://169.254.169.254/openstack' failed [10/-1s]: request error [HTTPConnectionPool(host='169.254.169.254', port=80): Max retries exceeded with url: /openstack (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0x7fdefd568700>, 'Connection to 169.254.169.254 timed out. (connect timeout=10.0)'))]
2022-01-06 14:26:04,785 - DataSourceOpenStack.py[DEBUG]: Giving up on OpenStack md from ['http://169.254.169.254/openstack'] after 10 seconds
2022-01-06 14:27:08,314 - subp.py[DEBUG]: Running command ['ip', '-4', 'route', 'add', '169.254.169.254/32', 'via', '10.20.34.100', 'dev', 'ens3'] with allowed return codes [0] (shell=False, capture=True)
2022-01-06 14:27:08,317 - util.py[DEBUG]: Resolving URL: http://169.254.169.254 took 0.000 seconds
2022-01-06 14:27:08,318 - url_helper.py[DEBUG]: [0/1] open 'http://169.254.169.254/openstack' with {'url': 'http://169.254.169.254/openstack', 'allow_redirects': True, 'method': 'GET', 'timeout': 10.0, 'headers': {'User-Agent': 'Cloud-Init/21.4-0ubuntu1~20.04.1'}} configuration
2022-01-06 14:27:08,930 - url_helper.py[DEBUG]: Read from http://169.254.169.254/openstack (200, 105b) after 1 attempts
2022-01-06 14:27:08,930 - DataSourceOpenStack.py[DEBUG]: Using metadata source: 'http://169.254.169.254'
2022-01-06 14:27:08,930 - url_helper.py[DEBUG]: [0/6] open 'http://169.254.169.254/openstack' with {'url': 'http://169.254.169.254/openstack', 'allow_redirects': True, 'method': 'GET', 'timeout': 10.0, 'headers': {'User-Agent': 'Cloud-Init/21.4-0ubuntu1~20.04.1'}} configuration

After the reboot, SSH access is possible with the configured credentials.

The problem could be that cloud-init does not add the route to the metadata server during the first initialization of the VM.
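To see this ordering at a glance, the relevant entries in cloud-init.log can be reduced to a short timeline of failed fetches, the route being added, and the first successful fetch. The following Python sketch does that. The event names and regular expressions are illustrative assumptions written against the log lines quoted above and may need adjusting for other cloud-init versions.

```python
import re

TIMESTAMP = r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})"

# One pattern per event of interest, matched against the start of a log line.
PATTERNS = {
    "fetch_failed": re.compile(
        TIMESTAMP + r" - url_helper\.py\[DEBUG\]: Calling '.*?' failed"),
    "route_added": re.compile(
        TIMESTAMP + r" - subp\.py\[DEBUG\]: Running command "
        r"\['ip', '-4', 'route', 'add', '169\.254\.169\.254/32'"),
    "fetch_ok": re.compile(
        TIMESTAMP + r" - url_helper\.py\[DEBUG\]: Read from \S+ \(200,"),
}


def metadata_events(lines):
    """Return a list of (timestamp, event_name) in log order."""
    events = []
    for line in lines:
        for name, pattern in PATTERNS.items():
            match = pattern.match(line)
            if match:
                events.append((match.group("ts"), name))
                break
    return events
```

Run over the lines of /var/log/cloud-init.log, this would list the failed fetches from the first boot, followed by the route being added and the successful fetch during the hard reboot.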

Herbert

I faced this issue with Ussuri on MicroStack. I tried the suggested 'hard reboot' approach but got the same error. I also got the same error with a Fedora 36 image.

I resolved the error by disabling and then re-enabling MicroStack:

sudo snap disable microstack

sudo snap enable microstack
tariro