
I'm testing the Red Hat Cluster Administration tool documented here: https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Cluster_Administration/index.html

I have two virtual machines running CentOS 6.4. I've been testing failover scenarios: both restarting the primary machine and gracefully shutting it down successfully migrate the resources (DRBD, Tomcat, MySQL, Apache and so on) to the secondary machine.

However, I wanted to simulate a complete power failure, or a forced shutdown. In the XenCenter client, I forcibly shut down the primary machine and watched the logs on the secondary. In short, the resources never migrate over to the secondary, and the cluster management interface still reports the services as running on the primary.

Here's the output from the secondary machine logs: http://pastebin.com/gsi6uBct

It's complaining mostly about fencing. But I don't understand: if the primary node completely dies on its own, there's nothing to fence.

Ideas?

kubanczyk
Scott Crooks

1 Answer


Fencing is supposed to happen out of band. If you lose networking between the two hosts, there is no medium left on which to check liveness, so should the secondary host try to start services? No, because from the secondary's point of view those services may still be running on the primary host, and starting them a second time would lead to data corruption.

So fencing kicks in to make sure the primary host is really down. Only once a fence command has gone through is it assumed safe to start the services on the secondary host.
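To make that ordering concrete, here is a minimal sketch of the decision the surviving node makes. It is only an illustration and not the cluster software's actual code: `recover_services`, `fence_peer` and the returned strings are hypothetical names invented for this example.

```python
from typing import Callable


def recover_services(peer_reachable: bool, fence_peer: Callable[[], bool]) -> str:
    """Hypothetical model of what the surviving node does.

    peer_reachable: whether the other node still answers on the cluster network.
    fence_peer:     callable that asks an out-of-band fence device to power the
                    peer off and returns True only if that was confirmed.
    """
    if peer_reachable:
        # The peer is alive and owns its services; nothing to take over.
        return "no-action"
    if fence_peer():
        # The peer is now known to be off, so starting services is safe.
        return "start-services"
    # Silence alone does not prove the peer is dead, so recovery stays blocked.
    return "blocked-waiting-for-fence"


# No working fence device: the fence attempt can never succeed.
print(recover_services(False, lambda: False))
# A fence device that confirms the power-off.
print(recover_services(False, lambda: True))
```

Under these assumptions, the first call models a cluster with no working fence device: the peer has gone silent, the fence attempt never succeeds, and the services stay where they are, which resembles the behaviour described in the question.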

dyasny
  • I see. So some sort of fencing is required in this case where the primary node completely dies. What kind of fencing would you recommend for a VM then for testing? Obviously it doesn't have an iLO / iDRAC port to be used. – Scott Crooks Jul 04 '13 at 06:11
  • There are quite a few fencing agents for virtualization systems: libvirt, rhev/ovirt, vmware. Just run `rpm -ql fence-agents` to list them. Such a fence agent is basically an API call to the virtualization system requesting an immediate shutdown of a specified VM. – dyasny Jul 04 '13 at 14:17