My Debian 8.9 setup with DRBD 8.4.3 has somehow got into a state where the two nodes can no longer connect over the network. They should replicate a single resource, r1, but immediately after running drbdadm down r1; drbdadm up r1 on both nodes, their /proc/drbd describes the situation as follows:
On the 1st node (the connection state is either WFConnection or StandAlone):
1: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r-----
ns:0 nr:0 dw:0 dr:912 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:20
On the 2nd node:
1: cs:StandAlone ro:Secondary/Unknown ds:UpToDate/DUnknown r-----
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:48
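To compare the two nodes at a glance, the status lines above can be reduced to their connection state (cs), roles (ro) and disk states (ds). The following is only a minimal sketch: the function name drbd_states is made up for illustration, and it assumes the DRBD 8.4 /proc/drbd layout shown above, where each resource line starts with the minor number followed by "cs:".

```shell
#!/bin/sh
# Hypothetical helper (not part of DRBD): prints "minor cs ro ds" for each
# resource line found in /proc/drbd-style input read from stdin.
drbd_states() {
    awk '
        # Resource lines look like " 1: cs:WFConnection ro:... ds:... C r-----"
        /^ *[0-9]+: cs:/ {
            minor = $1
            sub(/:$/, "", minor)          # strip the trailing colon from "1:"
            cs = ""; ro = ""; ds = ""
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^cs:/) cs = substr($i, 4)
                if ($i ~ /^ro:/) ro = substr($i, 4)
                if ($i ~ /^ds:/) ds = substr($i, 4)
            }
            print minor, cs, ro, ds
        }
    '
}

# Usage on a node (assumes the drbd module is loaded):
#   drbd_states < /proc/drbd
```

Run on both nodes, this makes the mismatch obvious: one side reports WFConnection (or StandAlone), the other StandAlone, and each sees its peer as Unknown/DUnknown.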
The two nodes can ping each other over the IP addresses specified in /etc/drbd.d/r1.res, and netstat shows that both are listening on the specified port.
How can I (further diagnose and) get out of this situation so that the two nodes can become Connected and replicate over DRBD again?
BTW, at a higher level of abstraction this problem currently manifests itself as systemctl start drbd never exiting, apparently because it gets stuck in drbdadm wait-connect all (as suggested by /lib/systemd/system/drbd.service).