I am using DRBD for replication, with two VMs for testing. I have noticed that if I disconnect the network interface on a node, DRBD moves to StandAlone, and after I reconnect the interface it does not go back to Connected or WFConnection. Is there a way to run `drbdadm connect r0` or an equivalent using Pacemaker?
My config:

[root@CentOS1 ~]# cat /etc/drbd.d/nfs.res 
resource r0 {

    syncer {
        c-plan-ahead 20;
        c-fill-target 50k;
        c-min-rate 25M;
        al-extents 3833;
        rate 90M;

    disk {

        fencing resource-only;
    handlers {
        fence-peer "/usr/lib/drbd/crm-fence-peer.sh --timeout 120 --dc-timeout 120";
        after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";

    net { 

        sndbuf-size 0;
        max-buffers 8000;
        max-epoch-size 8000;    
        after-sb-0pri discard-least-changes;
        after-sb-1pri consensus;
        after-sb-2pri call-pri-lost-after-sb;

    device /dev/drbd0;
    disk /dev/centos/drbd;
    meta-disk internal;

    on CentOS1 {
    on CentOS2 {

Resource status after disconnecting and reconnecting the network:

[root@CentOS1 ~]#cat /proc/drbd 
version: 8.4.10-1 (api:1/proto:86-101)
GIT-hash: a4d5de01fffd7e4cde48a080e2c686f9e8cebf4c build by mockbuild@, 2017-09-15 14:23:22
 0: cs:StandAlone ro:Secondary/Unknown ds:UpToDate/Outdated   r-----
    ns:0 nr:0 dw:0 dr:2128 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0


[root@CentOS2 ~]# cat /dev/kmsg 

6,1171,16427206172,-;e1000: enp0s9 NIC Link is Down
3,1172,16427206435,-;e1000 0000:00:09.0 enp0s9: Reset adapter
3,1173,16431488757,-;drbd r0: PingAck did not arrive in time.
6,1174,16431488808,-;drbd r0: peer( Secondary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown ) 
6,1175,16431489116,-;drbd r0: ack_receiver terminated
6,1176,16431489121,-;drbd r0: Terminating drbd_a_r0
6,1177,16431489277,-;block drbd0: new current UUID A5A2A33E423F0679:0F9B7A2A6938EE77:4471A80A92A109A4:4470A80A92A109A4
6,1178,16431489574,-;drbd r0: Connection closed
6,1179,16431489660,-;drbd r0: conn( NetworkFailure -> Unconnected ) 
6,1180,16431489667,-;drbd r0: receiver terminated
6,1181,16431489669,-;drbd r0: Restarting receiver thread
6,1182,16431489671,-;drbd r0: receiver (re)started
6,1183,16431489712,-;drbd r0: conn( Unconnected -> WFConnection ) 
6,1184,16431489801,-;drbd r0: helper command: /sbin/drbdadm fence-peer r0
4,1185,16431673341,-;drbd r0: helper command: /sbin/drbdadm fence-peer r0 exit code 5 (0x500)
6,1186,16431673351,-;drbd r0: fence-peer helper returned 5 (peer is unreachable, assumed to be dead)
6,1187,16431673374,-;drbd r0: pdsk( DUnknown -> Outdated ) 
3,1188,16450060876,-;drbd r0: bind before connect failed, err = -99
6,1189,16450060962,-;drbd r0: conn( WFConnection -> Disconnecting ) 
4,1190,16461488768,-;drbd r0: Discarding network configuration.
6,1191,16461488824,-;drbd r0: Connection closed
6,1192,16461488855,-;drbd r0: conn( Disconnecting -> StandAlone ) 
6,1193,16461488860,-;drbd r0: receiver terminated
6,1194,16461488862,-;drbd r0: Terminating drbd_r_r0

1 Answer


There is not. DRBD and Pacemaker don't really know about each other, aside from the return codes the resource agent reports for a given action.

When you pull the interface that DRBD is bound to out from under it, DRBD will go StandAlone. You simply need to reconnect the resource after doing that: `drbdadm connect <res>` or `drbdadm connect all`.
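
As a minimal sketch of that manual step, the helper below reconnects a resource only when it reports StandAlone. The function name `reconnect_if_standalone` is hypothetical, and the default resource name `r0` is taken from the question's config; it relies only on `drbdadm cstate` and `drbdadm connect`.

```shell
#!/bin/sh
# Hypothetical helper: reconnect a DRBD resource only if it is StandAlone.
# Assumption: the resource name defaults to r0, as in the question's config.
reconnect_if_standalone() {
    res="${1:-r0}"
    # "drbdadm cstate" prints the connection state,
    # e.g. Connected, WFConnection or StandAlone.
    state="$(drbdadm cstate "$res")" || return 1
    if [ "$state" = "StandAlone" ]; then
        echo "$res is StandAlone, reconnecting"
        drbdadm connect "$res"
    else
        echo "$res is $state, nothing to do"
    fi
}
```

Run `reconnect_if_standalone r0` as root once the interface is back up. If the nodes diverged while disconnected, the connect attempt may still be refused and need manual split-brain handling.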

That seems fine, considering network interfaces don't usually disappear completely and then come back without an admin getting involved. If you want to test what happens when network communication breaks (packet drops), you should use iptables rules; you'll then see that DRBD goes into WFConnection (waiting for connection) and returns to Connected when the rules are removed.
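
A rough sketch of such a test follows. The function name `drbd_link`, the peer address, and the port are assumptions: the question's `on` sections are not shown, so substitute the peer IP and port from your own `address` lines (7789 is only a commonly used example port).

```shell
#!/bin/sh
# Hypothetical helper: drop or restore DRBD replication traffic with iptables.
# Assumptions: the peer IP and port are placeholders, not values from the post.
drbd_link() {
    action="$1"; peer="$2"; port="${3:-7789}"
    case "$action" in
        block)   flag="-I" ;;   # insert DROP rules
        unblock) flag="-D" ;;   # delete the same rules again
        *) echo "usage: drbd_link block|unblock <peer-ip> [port]" >&2; return 2 ;;
    esac
    # Drop packets arriving from the peer for the local DRBD port ...
    iptables "$flag" INPUT  -s "$peer" -p tcp --dport "$port" -j DROP
    # ... and packets leaving for the peer's DRBD port.
    iptables "$flag" OUTPUT -d "$peer" -p tcp --dport "$port" -j DROP
}
```

For example, run `drbd_link block 192.0.2.2` as root, watch `/proc/drbd` until the state changes, then run `drbd_link unblock 192.0.2.2`. Because the interface and its address stay in place, DRBD can keep retrying the connection instead of discarding its network configuration.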

Matt Kereczman
  • I have already started working on a different approach: colocating DRBD with ethmonitor in Pacemaker. Every time the network cable gets disconnected, I will stop the DRBD resource. Hopefully this will resolve the issue, since bringing the DRBD resource back up solves the problem. – bakasan Feb 27 '18 at 19:08
  • I managed to solve the DRBD issue with a combination of ethmonitor and ping, and finally configured everything related to NFS and DRBD, but then Pacemaker/Corosync threw me back to square one. Now when I disconnect the master node, the secondary takes over, and after I reconnect the previous master, for some reason the new master gives it back control as DC and the resources resume on the previously disconnected node. This is so frustrating... – bakasan Mar 06 '18 at 01:11
  • Sounds to me like you either have a Pacemaker resource constraint preferring the one node, or you forgot to configure `resource-stickiness`. Perhaps open a new Stack question with your Pacemaker configuration, and we can help more there. This question seems to be resolved. – Matt Kereczman Mar 06 '18 at 15:23
  • It was an issue with DRBD: as soon as it failed to connect using any interface, it moved to StandAlone mode. As far as Pacemaker was concerned, DRBD was promoting and demoting correctly, but doing so as StandAlone on both nodes. Anyway, I added a location constraint on the drbd-clone resource with both ping and ethmonitor, which stops the DRBD resource from running if it detects network loss. After that, bringing the interface back online restarted the DRBD resource and it connected to the peer node. – bakasan Mar 11 '18 at 06:58
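
The workaround in the last comment could look roughly like the sketch below when set up with `pcs`. The thread does not show the actual Pacemaker configuration, so everything here is an assumption: the function name `add_network_guards`, the resource names `p_ethmon` and `p_ping`, the ping target, and the use of `pcs` itself. Only `drbd-clone` comes from the comment and `enp0s9` from the kernel log. It relies on the agents' default node attributes, `ethmonitor-<interface>` and `pingd`.

```shell
#!/bin/sh
# Hypothetical sketch: keep DRBD off nodes that have lost their network.
# Assumptions: resource names, ping target and pcs usage are illustrative only.
add_network_guards() {
    iface="$1"; ping_target="$2"; drbd_rsc="${3:-drbd-clone}"

    # Monitor the link state of the replication interface on every node.
    pcs resource create p_ethmon ocf:heartbeat:ethmonitor interface="$iface"
    pcs resource clone p_ethmon

    # Monitor reachability of an external address on every node.
    pcs resource create p_ping ocf:pacemaker:ping host_list="$ping_target" dampen=5s multiplier=1000
    pcs resource clone p_ping

    # Ban the DRBD resource wherever the link is down or the ping target is lost.
    pcs constraint location "$drbd_rsc" rule score=-INFINITY "ethmonitor-$iface" ne 1 or not_defined "ethmonitor-$iface"
    pcs constraint location "$drbd_rsc" rule score=-INFINITY pingd lt 1 or not_defined pingd
}
```

A call such as `add_network_guards enp0s9 192.0.2.1` would create both monitors and both rules. When the link returns, the rules stop matching, Pacemaker starts the DRBD resource again, and the start brings the network configuration back up.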