
Environment:
OpenStack Rocky
Ceph Mimic
openstack-cinder-13.0.8-0

The cinder-volume service is reported as down while creating or deleting multiple volumes at the same time, either manually or during a Rally test. In this state, systemctl status shows the service as running, but the cinder service-list command returns the down state for cinder-volume, and all volume tasks, including create and delete, fail. Restarting the service resolves the issue.
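For context, as far as I understand it, the state shown by cinder service-list does not come from the process state that systemctl reports. It comes from a periodic heartbeat timestamp that each service writes, so a process can be running and still be listed as down if its heartbeat stops being updated. Below is a minimal sketch of that kind of check, not Cinder's actual code. The option names `report_interval` and `service_down_time` and the values 10 and 60 seconds are my assumption of the cinder.conf defaults, and the function name `is_up` is made up for illustration.

```python
from datetime import datetime, timedelta

# Assumed defaults for the cinder.conf options `report_interval` and
# `service_down_time`; verify the values used in your own deployment.
REPORT_INTERVAL = 10     # seconds between heartbeats written by the service
SERVICE_DOWN_TIME = 60   # seconds without a heartbeat before "down" is shown


def is_up(last_heartbeat: datetime, now: datetime,
          service_down_time: int = SERVICE_DOWN_TIME) -> bool:
    """Return True if the last heartbeat is recent enough to count as up.

    Hypothetical helper: it only compares timestamps and never looks at
    whether the process itself is still running.
    """
    return (now - last_heartbeat) <= timedelta(seconds=service_down_time)


# Illustrative timestamps only, not taken from a real cluster.
now = datetime(2021, 2, 20, 6, 0, 0)
fresh = now - timedelta(seconds=REPORT_INTERVAL)          # one interval old
stale = now - timedelta(seconds=SERVICE_DOWN_TIME + 30)   # heartbeat stalled

print(is_up(fresh, now))
print(is_up(stale, now))
```

If this is roughly how the check works, it may be worth comparing the Updated_at column of cinder service-list against the current time while the volumes are being created, to see whether the heartbeat stalls before the service is marked down.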

We searched the logs (with debug mode enabled as well) but did not find anything suspicious.

One thing to consider: this issue started happening when we added 6 new OSD hosts to our Ceph cluster, each containing 21 HDDs (OSD daemons). I don't know whether this is important, but I thought it was worth mentioning. Any ideas or help would be appreciated.

berndbausch
  • `cinder-volume` can be down because it **is** down (not in your case), or can't communicate with `cinder-api`, or can't communicate with the backend (Ceph in your case). Or some internal error/bug. I would be very surprised if `cinder-volume` is declared down without any trace in the `cinder-api` and/or `cinder-volume` logs. Perhaps Ceph logs are more explicit? – berndbausch Feb 20 '21 at 05:52
  • Have you checked the load on the Cinder node? That could be an issue. Is the Ceph cluster healthy during the Rally tests? – eblock Feb 22 '21 at 08:25
  • What do you mean by checking the load on Cinder? The Ceph cluster is healthy and being monitored all the time as it's a production cluster. – Amir H Moezzi Feb 23 '21 at 09:06
  • If you create lots of volumes in a short period of time, it could have an impact on the control node(s). Are you sure about the correlation between expanding Ceph and failing Cinder, or could it just be a coincidence? Maybe you could provide more details about your setup (Ceph and OpenStack). – eblock Feb 24 '21 at 08:17
  • Ceph has 12 OSD nodes, each with 21 OSD daemons. OpenStack is a cluster of 3 controllers and about 80 compute nodes. The controller nodes have enough RAM and CPU allocated, and all of them run on bare-metal machines. – Amir H Moezzi Feb 25 '21 at 09:23

0 Answers