I followed the Clusters from Scratch instructions (chapters 8 and 9) to configure an active-active, shared-disk cluster with DRBD and Pacemaker. The only difference from that tutorial is that I am using OCFS2 instead of GFS2.
Without Pacemaker it works fine, but when I integrate it into the cluster, the filesystem resource fails to start with this error:
# pcs status
Cluster name: datacluster
Stack: corosync
Current DC: darwin (version 2.0.1-9e909a5bdd) - partition with quorum
Last updated: Wed May 26 00:18:41 2021
Last change: Tue May 25 23:54:38 2021 by root via cibadmin on darwin
2 nodes configured
3 resources configured
Online: [ darwin humboldt ]
Full list of resources:
Clone Set: drbd-clone [drbd] (promotable)
Masters: [ darwin humboldt ]
shareddisk (ocf::heartbeat:Filesystem): Stopped
Failed Resource Actions:
* shareddisk_start_0 on darwin 'not installed' (5): call=11, status=complete, exitreason='Couldn't find device [/dev/drbd0]. Expected /dev/??? to exist',
last-rc-change='Wed May 26 00:02:45 2021', queued=0ms, exec=56ms
* shareddisk_start_0 on humboldt 'not installed' (5): call=12, status=complete, exitreason='Couldn't find device [/dev/drbd0]. Expected /dev/??? to exist',
last-rc-change='Wed May 26 00:04:28 2021', queued=0ms, exec=56ms
Daemon Status:
corosync: active/disabled
pacemaker: active/disabled
pcsd: active/enabled
I suspect that the message 'Couldn't find device [/dev/drbd0]. Expected /dev/??? to exist' appears because the resource that mounts the shared disk (shareddisk) doesn't wait until the resource drbd-clone has finished creating the device /dev/drbd0 on both nodes.
The device exists once the drbd resource is running:
root@darwin:~# cat /proc/drbd
version: 8.4.10 (api:1/proto:86-101)
srcversion: 15055BDD6F0D23278182874
0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown r-----
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
root@darwin:~# ls -l /dev/drbd0
brw-rw---- 1 root disk 147, 0 May 26 00:02 /dev/drbd0
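One way to test the timing hypothesis is to measure how long the device node takes to appear after the drbd resource starts. The following is only a minimal sketch: `wait_for_device` is a hypothetical helper name (not part of DRBD or Pacemaker), and the 60-second default timeout is an arbitrary assumption.

```shell
#!/bin/sh
# Hypothetical helper: poll once per second until PATH exists,
# or give up after TIMEOUT_SECONDS (default 60, an arbitrary choice).
# Usage: wait_for_device PATH [TIMEOUT_SECONDS]
wait_for_device() {
    dev=$1
    timeout=${2:-60}
    elapsed=0
    # -e only checks that the path exists; use -b instead to require
    # a block device such as /dev/drbd0.
    while [ ! -e "$dev" ]; do
        if [ "$elapsed" -ge "$timeout" ]; then
            echo "timed out after ${timeout}s waiting for $dev" >&2
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    echo "$dev appeared after ${elapsed}s"
    return 0
}
```

Running `wait_for_device /dev/drbd0 60` on each node right after the cluster starts would show whether the device shows up late or not at all, which helps separate an ordering problem from a DRBD problem.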
The resource shareddisk is already constrained with colocation and ordering constraints (see the configuration linked below). Is there a way to make shareddisk wait, say a minute, after drbd has started before it tries to start?
Or should I look elsewhere to solve this problem?
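For reference, ordering and colocation constraints of this kind are typically created along the following lines. This is a sketch and not a copy of the linked configuration: the function name `add_fs_constraints` and the `PCS` override variable are hypothetical, and the colocation syntax assumes pcs 0.10 (other pcs releases spell the role differently).

```shell
#!/bin/sh
# Hypothetical wrapper: tie a Filesystem resource to a promotable DRBD clone.
# Usage: add_fs_constraints DRBD_CLONE_ID FILESYSTEM_ID
# Set PCS=echo to print the pcs arguments instead of changing the cluster.
add_fs_constraints() {
    pcs_cmd=${PCS:-pcs}
    # Start the filesystem only after DRBD has been promoted.
    "$pcs_cmd" constraint order promote "$1" then start "$2" &&
    # Run the filesystem only where DRBD holds the Master role.
    "$pcs_cmd" constraint colocation add "$2" with master "$1" INFINITY
}
```

With the resource names from the status output above, the call would be `add_fs_constraints drbd-clone shareddisk`; prefixing it with `PCS=echo` previews the arguments without touching the CIB.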
Thank you in advance for any help.
Here you can find both the cib and the configuration cib.