6

We are implementing DRBD + Heartbeat on two servers to get a file system with failover. These servers expose an NFS service to other servers.

Currently DRBD is working just fine, but when we test switching from one server to the other, the folders mounted through NFS on the other servers just hang.

Is there any way to make this failover transparent to NFS, or do we necessarily have to re-mount those NFS-mounted folders?

EEAA
Gabriel Sosa
  • What HA mechanism are you using to expose this as a single service to the NFS clients connecting to the exported directory? – Ori Jun 16 '11 at 16:52
  • `heartbeat` + `VIP` (not sure). I think that is what I'm trying to figure out here. Any ideas? – Gabriel Sosa Jun 16 '11 at 16:59

4 Answers

6

The problem here is that you have made a redundant storage array using DRBD, but you have two disjoint NFS daemons running against the same shared data. NFS is stateful - unless you can transfer that state as well, you will have serious problems on failover. Solaris HA setups do have daemons that take care of this problem. For a Linux installation, you will have to make sure that your NFS state directory (configurable, typically /var/lib/nfs) is located on the shared disk for both servers.
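A minimal sketch of one way to do that, assuming the DRBD device is /dev/drbd0 and the replicated filesystem is mounted at /srv/nfs (both placeholders): move /var/lib/nfs onto the replicated filesystem and symlink it back on both nodes, with the NFS services stopped while you do it.

```
# On the current DRBD primary, with the replicated filesystem mounted
# and the NFS services stopped:
mount /dev/drbd0 /srv/nfs
mkdir -p /srv/nfs/var-lib-nfs
rsync -a /var/lib/nfs/ /srv/nfs/var-lib-nfs/

# Replace the local state directory with a symlink; repeat the symlink
# step on the standby node as well, so whichever node is primary sees
# the same client state after a failover.
mv /var/lib/nfs /var/lib/nfs.orig
ln -s /srv/nfs/var-lib-nfs /var/lib/nfs
```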

Stick with Heartbeat or Corosync for failure detection and failover - they generally do the Right Thing (tm) when configured with a quorum. Other failover techniques might be too focused on just providing a virtual IP (e.g. VRRP) and would not suit your needs. See http://linux-ha.org for further details and additional components for a cluster setup.
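For illustration, a minimal quorum stanza for a two-node Corosync cluster could look like the following (corosync 2.x `votequorum` syntax; with only two nodes the `two_node` flag is what keeps the surviving node quorate):

```
# /etc/corosync/corosync.conf (excerpt)
quorum {
    provider: corosync_votequorum
    two_node: 1          # special-case quorum handling for two-node clusters
}
```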

the-wabbit
  • Ok about the statefulness. Why do you think that quorum is important? From my experience I can say the quorum is always on the "wrong" side (Murphy's Law). So better set up three redundant independent heartbeat lines... – Nils Jun 29 '11 at 19:34
  • 3
    The quorum's primary design goal is to prevent split-brain syndrome, not to maintain availability (in fact, the very idea of a quorum contradicts the availability optimization). Having independent communication lines is good, but even independent lines can fail - you need to ensure that in no case more than one cluster node will assume a master role and serve clients - this is what a quorum does, at the price of an increased probability that *no* cluster node will serve clients. – the-wabbit Jun 30 '11 at 09:11
3

I recommend that you read this HOWTO on highly available NFS using NFSv4, DRBD and Pacemaker. It contains detailed instructions and explanations as well as important details on how to provide a highly available NFS service. We have put a few such HA-NFS setups in production now and they work very well.

Part of such an HA setup is to move away from the old Heartbeat system (the one that uses /etc/ha.d/haresources and /etc/ha.d/ha.cf) and use the much more capable and robust Pacemaker stack. It's a bit of a transition from the old Heartbeat and quite a learning curve, but in the end you get a cluster that is worthy of the name.
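To give an idea of what that looks like, here is a rough sketch of such a resource configuration in the `crm` shell; the resource names, DRBD resource, device, mount point, NFS init script and IP are all placeholders, and details like fencing and monitor operations depend on your distribution and the HOWTO you follow:

```
primitive p_drbd_nfs ocf:linbit:drbd \
    params drbd_resource="nfsdata" op monitor interval="15s"
ms ms_drbd_nfs p_drbd_nfs \
    meta master-max="1" clone-max="2" notify="true"
primitive p_fs_nfs ocf:heartbeat:Filesystem \
    params device="/dev/drbd0" directory="/srv/nfs" fstype="ext4"
primitive p_nfsserver lsb:nfs-kernel-server
primitive p_vip ocf:heartbeat:IPaddr2 \
    params ip="192.168.1.100" cidr_netmask="24"
group g_nfs p_fs_nfs p_nfsserver p_vip
colocation c_nfs_on_drbd inf: g_nfs ms_drbd_nfs:Master
order o_drbd_before_nfs inf: ms_drbd_nfs:promote g_nfs:start
```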

The HOWTO is written by Linbit, the company that created and maintains DRBD and contributes much to the whole Linux HA stack. Unfortunately (free) registration on their website is required to access the tech guides but they are well written and very useful.

daff
0

The best way I can think of to make this transparent is to use a virtual IP and virtual MAC address, and switches that are aware that this transition may happen and do the right thing when there's a gratuitous ARP (so you don't have to wait for an ARP cache entry to expire, which may take long enough to make your NFS mounts go stale).
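As an illustration of that last point, the new active node can announce the virtual IP itself; the Heartbeat/Pacemaker IP resource agents do this for you, but you can also test it by hand. This sketch assumes iputils `arping`, interface `eth0` and VIP `192.168.1.100` (all placeholders):

```
# Send a few unsolicited (gratuitous) ARPs for the VIP so neighbours
# update their ARP caches immediately instead of waiting for a timeout.
arping -U -I eth0 -c 3 192.168.1.100
```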

Something like CARP is probably the way to go for the IP failover - it is available on all the *BSDs, and for Linux there is a userland implementation (ucarp). Obviously give it some testing to make sure that it works the way you want (it sounds like you're currently in a testing phase, so you're in a good place).
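A rough sketch of what that could look like with `ucarp`; the interface, IPs, vhid and password are placeholders, and the up/down scripts are where you would add or remove the virtual IP on the node that wins the election:

```
# Run on each node; the node with the lowest advskew becomes master.
ucarp --interface=eth0 --srcip=10.0.0.11 --vhid=1 --pass=secret \
      --addr=10.0.0.100 \
      --upscript=/etc/ucarp/vip-up.sh --downscript=/etc/ucarp/vip-down.sh
```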

voretaq7
0

Make sure the exported filesystems are located on the same major/minor device number on both nodes (if you use the same DRBD device on both sides this should be true), since NFS file handles include the device ID and will go stale if it changes, and use a virtual IP for your NFS service.
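If you cannot guarantee identical device numbers, pinning an `fsid` on the export is a common workaround so the file handles no longer depend on the underlying device; a hedged example, where the path, network and fsid value are placeholders:

```
# /etc/exports
/srv/nfs  192.168.1.0/24(rw,sync,no_subtree_check,fsid=1)
```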

In Heartbeat, use this order of resources:

  1. the DRBD device
  2. the local mountpoint
  3. all NFS-related services, in the proper order
  4. last: the VIP

It is important to put the VIP last - otherwise your clients will lose their NFS connection instead of continuously retrying it.
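A sketch of that ordering as an old-style /etc/ha.d/haresources line (resources start left to right and stop in reverse order; the node name, DRBD resource, device, mount point, NFS init script and VIP are placeholders):

```
node1 drbddisk::nfsdata Filesystem::/dev/drbd0::/srv/nfs::ext4 nfs-kernel-server IPaddr2::192.168.1.100/24/eth0
```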

BTW: Putting an IP as a resource into Heartbeat will do a gratuitous ARP upon failover as well - so you normally don't have to take care of that yourself.

quanta
Nils