Volume creation failure

Question

I want to launch an instance from an image whose size is 1TB in my OpenStack Pike environment, which uses Cinder with Ceph as the storage backend. The volume creation process for that instance fails with the following message: logs schedule allocate volume:Could not find any available weighted backend.

When I looked in cinder logs, I found this message:

...raise exc.from_response(resp, resp.content)\n', 'glanceclient.exc.HTTPUnauthorized: HTTP 401 Unauthorized: This server could not verify that you are authorized to access the document you requested. Either you supplied the wrong credentials (e.g., bad password), or your browser does not understand how to supply the credentials required...

And a few lines later, I see this:
...cinder.exception.ImageNotAuthorized: Not authorized for image...

However, I can successfully create any instance with a smaller image size.

At first I suspected token expiration in Keystone, since the volume creation takes up to 4-5 hours. I raised the token expiration time to 48 hours, but the failure remains the same.

Is it possible that Glance has a token expiration time of its own, or is the expiration set globally by Keystone?
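The token-expiration reasoning above can be sketched as a small check: a token covers a long-running operation only if it is still valid when the operation ends. This is a minimal illustration, not how Keystone, Cinder, or Glance validate tokens internally. The function name, the timestamp, and the 1-hour lifetime (chosen only as an example of a short setting) are assumptions made for illustration; the 5-hour duration and 48-hour lifetime come from the question.

```python
from datetime import datetime, timedelta, timezone


def token_covers_operation(token_issued_at: datetime,
                           token_lifetime: timedelta,
                           operation_start: datetime,
                           operation_duration: timedelta) -> bool:
    """Return True if the token is still valid when the operation ends."""
    token_expires_at = token_issued_at + token_lifetime
    operation_ends_at = operation_start + operation_duration
    return operation_ends_at <= token_expires_at


if __name__ == "__main__":
    # Hypothetical start time; the token is assumed to be issued at the same moment.
    start = datetime(2021, 7, 29, 12, 0, tzinfo=timezone.utc)
    copy_time = timedelta(hours=5)  # upper end of the 4-5 hours reported

    # Short lifetime (illustrative 1 hour): token expires mid-operation.
    print(token_covers_operation(start, timedelta(hours=1), start, copy_time))
    # Raised lifetime (48 hours, as in the question): token outlives the copy.
    print(token_covers_operation(start, timedelta(hours=48), start, copy_time))
```

If the second case holds and the 401 still appears, one thing worth verifying is whether the token in use was actually issued after the lifetime was raised, since a token's expiry is fixed when it is issued.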

Can you launch an ephemeral VM from that image so that cinder is not involved (select "No" when asked to create new volume)? An ephemeral disk will only be a copy-on-write clone of the base image and should take only a few seconds to launch the VM. Just to confirm that the image itself works. As for the authorization failure, I also suspect some form of timeout, but I haven't dealt with volumes that big yet. — eblock, Jul 30 '21 at 06:30
Well, I tried launching an ephemeral VM from the 1TB image with a flavor that has 0 root disk and 1TB ephemeral disk, selecting "No" when asked to create new volume. The process now fails almost instantly, and nova reports that no valid host was found: ```ERROR nova.conductor.manager nova.exception.NoValidHost: No valid host was found.``` I turned on debug mode for nova, cinder and glance-registry and I couldn't find the reason for that; the only clue at the moment is that message from nova. — bcantera, Jul 30 '21 at 15:39
Just one more thought: I assume the ceph cluster is not full, right? Otherwise the volume creation could be failing on the ceph side. Are you using different cinder backends? — eblock, Aug 02 '21 at 07:34
Yup, in the Ceph dashboard I see 15.2TB of raw capacity available, and this is the only backend used in cinder. — bcantera, Aug 02 '21 at 13:19
Okay, can you reproduce that just with volume creation without nova involved? Is there any limit for the volume size you encounter? — eblock, Aug 02 '21 at 13:23
Well, I tried 2 scenarios: (1) Create a 1TB blank volume: no problem with that, the volume was available almost instantly. (2) Create a 1TB volume from a 500MB debian image: no problem with that either, the volume was available in a couple of minutes. I then launched an instance from the volume of the previous step, using a flavor with a 100GB root disk; it booted almost instantly, so it was possible to have a 1TB instance up and running :/ — bcantera, Aug 02 '21 at 15:06
I didn't get any notification, sorry for the delay. Did you resolve this issue in the meantime? I was wondering whether your images are in qcow format, in which case openstack needs to convert the image on local disk before uploading it back to ceph. Do you have enough free disk space for that on the compute nodes? — eblock, Mar 28 '22 at 06:38
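The free-space question in the last comment can be checked with a short script. This is a rough sketch under stated assumptions, not an official OpenStack tool: the directory path is a placeholder for whatever location the service uses for image conversion on the node in question, the 10% headroom factor is arbitrary, and the function names are made up for illustration. The image size should be the virtual (uncompressed) size of the image, not the size of the qcow file.

```python
import shutil

GIB = 1024 ** 3


def has_room_for_conversion(free_bytes: int,
                            image_virtual_bytes: int,
                            headroom: float = 1.1) -> bool:
    """Return True if free space covers the converted image plus some headroom."""
    return free_bytes >= image_virtual_bytes * headroom


def free_bytes_at(path: str) -> int:
    """Free bytes on the filesystem that contains `path`."""
    return shutil.disk_usage(path).free


if __name__ == "__main__":
    conversion_dir = "."          # placeholder: replace with the real conversion directory
    image_size = 1024 * GIB       # 1TB image, as in the question
    print(has_room_for_conversion(free_bytes_at(conversion_dir), image_size))
```

Run it on the node that performs the conversion; a result of `False` would be consistent with the conversion step running out of local disk space.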
