10

I was just checking the status of my glusterfs volumes and I have one with split-brain entries that have no path:

# gluster volume heal private_uploads info
Brick server01:/var/lib/glusterfs/brick01/uploads/
<gfid:4c0edafb-0c28-427c-a162-e530280b3396> - Is in split-brain
<gfid:42d62418-1be9-4f96-96c4-268230316869> - Is in split-brain
Number of entries: 2

Brick server02:/var/lib/glusterfs/brick01/uploads/
<gfid:42d62418-1be9-4f96-96c4-268230316869> - Is in split-brain
<gfid:4c0edafb-0c28-427c-a162-e530280b3396> - Is in split-brain
Number of entries: 2

What does it mean? How do I fix it?

I'm running GlusterFS 3.5.9:

# gluster --version
glusterfs 3.5.9 built on Mar 28 2016 07:10:17
Repository revision: git://git.gluster.com/glusterfs.git
Pablo

4 Answers

8

What is Split-Brain?

As described in Red Hat's official documentation on managing split-brain, split-brain is a state of data or availability inconsistency that originates from maintaining two separate data sets with overlapping scope, either because of how the servers are arranged in a network design or because of a failure condition in which servers stop communicating and synchronizing their data with each other. The term applies to replicated configurations.

Note the wording "a failure condition based on servers not communicating and synchronizing their data to each other". This can happen for any number of reasons, and it does not necessarily mean that your nodes have lost their connection. The peers may still be in the cluster and connected.

Split-Brain Types:

There are three types of split-brain, and as far as I can see yours is an entry split-brain. The three types are:

  • Data split-brain: The contents of the file differ between the bricks of a replica pair, and automatic healing is not possible.

  • Metadata split-brain: The metadata of the file (for example, a user-defined extended attribute) differs between the bricks, and automatic healing is not possible.

  • Entry split-brain: A file has a different gfid on each brick of the replica pair.


What is GFID?

The GlusterFS internal file identifier (GFID) is a UUID that is unique to each file across the entire cluster. It is analogous to the inode number in a normal filesystem. The GFID of a file is stored in its extended attribute (xattr) named trusted.gfid. To find the path from a GFID, I highly recommend you read this official article provided by GlusterFS.
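
As a rough sketch of how a gfid maps back to something on disk: each brick keeps an entry for every gfid under `.glusterfs/<first two hex characters>/<next two>/<full gfid>`. For a regular file that entry is a hard link to the real file, so the path can usually be found by looking for other names of the same inode. The helper name `gfid_link_path` is made up for this illustration, and the brick path and gfid are the ones from the question.

```shell
#!/bin/sh
# Hypothetical helper: print where a brick stores the entry for a gfid.
# Layout: <brick>/.glusterfs/<chars 1-2>/<chars 3-4>/<full gfid>
gfid_link_path() {
    brick=${1%/}   # strip one trailing slash, if present
    gfid=$2
    a=$(printf '%s' "$gfid" | cut -c1-2)
    b=$(printf '%s' "$gfid" | cut -c3-4)
    printf '%s/.glusterfs/%s/%s/%s\n' "$brick" "$a" "$b" "$gfid"
}

# Example with a gfid from the question.
link=$(gfid_link_path /var/lib/glusterfs/brick01/uploads/ 4c0edafb-0c28-427c-a162-e530280b3396)
echo "$link"

# On the brick itself (as root, GNU find assumed), the real path can then
# be looked up with:
#   find /var/lib/glusterfs/brick01/uploads -samefile "$link" ! -path '*/.glusterfs/*'
# and the gfid stored on a file can be read back with:
#   getfattr -n trusted.gfid -e hex /path/on/brick/to/file
```

If `find` prints nothing besides the `.glusterfs` entry, the gfid has no filename left on that brick, which matches the path-less entries in the heal output.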


How to resolve entry split-brain?

There are multiple methods to prevent split-brain from occurring, but to resolve it, the corresponding gfid-link files must be removed. The gfid-link files are located in the .glusterfs directory in the top-level directory of the brick. Before deleting the gfid-links, you must make sure that there are no hard links to those files on that brick. If hard links exist, you must delete them as well. After that you can start the self-heal process with the heal commands given at the end of this answer.

To view the list of files on a volume that are in a split-brain state, you can use:
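
Before removing anything, it may help to see what actually shares an inode with a gfid-link. The following is a minimal dry-run sketch, assuming GNU `stat` and `find` on the brick host and a gfid-link that is a regular file. The function name `show_hard_links` is hypothetical, and the script only prints. It never deletes.

```shell
#!/bin/sh
# Hypothetical dry-run helper: show what shares an inode with a gfid-link.
# It only prints information; nothing is removed.
show_hard_links() {
    link=$1    # path of the gfid-link under <brick>/.glusterfs
    brick=$2   # top-level directory of the brick
    if [ ! -e "$link" ]; then
        echo "no such gfid-link: $link"
        return 1
    fi
    # Link count of the inode; 1 means only the gfid-link itself exists.
    echo "link count: $(stat -c %h "$link")"
    # Every path on the brick that points at the same inode.
    find "$brick" -samefile "$link"
}

# Usage on a brick host (paths are the ones from the question):
#   show_hard_links \
#     /var/lib/glusterfs/brick01/uploads/.glusterfs/4c/0e/4c0edafb-0c28-427c-a162-e530280b3396 \
#     /var/lib/glusterfs/brick01/uploads
```

Every path listed besides the gfid-link itself is a hard link that would have to be removed together with it on the brick you decide to discard.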

# gluster volume heal VOLNAME info split-brain

Also be aware that for replicated volumes, when a brick goes offline and comes back online, self-healing is required to resync all the replicas.

To check the healing status of volumes and files you can use:

# gluster volume heal VOLNAME info

Since you are using version 3.5, you don't have auto healing. So after completing the steps mentioned earlier, you need to trigger self-healing. To do so:

  • Only on the files which require healing:

    # gluster volume heal VOLNAME

  • On all the files:

    # gluster volume heal VOLNAME full
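
To check whether a heal actually cleared the backlog, the per-brick "Number of entries" lines from the heal info output can be added up. This is only a small sketch based on the output format shown in the question. The function name `count_heal_entries` is made up for the example.

```shell
#!/bin/sh
# Hypothetical helper: add up the "Number of entries: N" lines that
# `gluster volume heal VOLNAME info` prints, one per brick.
count_heal_entries() {
    awk -F': ' '/^Number of entries:/ { total += $2 } END { print total + 0 }'
}

# Usage on a node of the cluster (volume name taken from the question):
#   gluster volume heal private_uploads info | count_heal_entries
```

A result of 0 after running the heal means no brick reports pending entries any more.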

I hope this will help you through fixing your problem. Please read the official docs for further information. Cheers.

2

I think the documentation is clear enough; it even gives a similar example.

As for GlusterFS's healing commands, such as

gluster volume heal VOLNAME split-brain latest-mtime FILE

FILE can be either the full file name as seen from the root of the volume (or) the gfid-string representation of the file

So you don't need to worry about that.

And as the "convert GFID to path" article says:
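
For illustration, here is a sketch that turns the gfid entries from the heal info output into such commands, using the gfid-string form for FILE. It assumes a Gluster release that supports the `split-brain latest-mtime` subcommand, which may not include 3.5. The function name `print_latest_mtime_cmds` is hypothetical, and the commands are only printed, not executed.

```shell
#!/bin/sh
# Hypothetical helper: read heal-info output on stdin and print, without
# running them, one latest-mtime resolution command per distinct gfid.
print_latest_mtime_cmds() {
    vol=$1
    sed -n 's/^<gfid:\([0-9a-f-]*\)>.*Is in split-brain.*/gfid:\1/p' |
        sort -u |
        while read -r entry; do
            echo "gluster volume heal $vol split-brain latest-mtime $entry"
        done
}

# Usage on a node of the cluster (volume name taken from the question):
#   gluster volume heal private_uploads info | print_latest_mtime_cmds private_uploads
```

Review the printed commands before running any of them, since latest-mtime simply keeps the most recently modified copy.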

GlusterFS internal file identifier (GFID) is a uuid that is unique to each file across the entire cluster.

This script may tell you which filename belongs to which gfid, but since a split-brain has happened, the entry may not have a filename.

You're running 3.5 and don't have a semi-automatic heal command, so you may need to resolve the conflict manually, which normally means deciding which gfid file needs to be deleted.

pingz
  • My version of Gluster doesn't seem to have that command, otherwise, yes, that's straightforward. Also, I don't have filenames, I have uuids. – Pablo Jan 12 '17 at 09:33
2

Split-brain occurs when two nodes of a cluster are disconnected. Each node thinks the other one is not working.


To fix it, you have to understand why your two nodes are not talking to each other anymore.

bgtvfr
2

How do I fix it?

Split-brain resolution is described here. If that does not help much, the manual how-to here should do the job. I also find the article helpful for this case.

How to avoid Split-brain.

Protection against network partitions is done through a quorum voting algorithm. If a host fails, or there is a split-brain scenario where the nodes continue to run but can no longer communicate with each other, the remaining node or nodes in the cluster race to place a SCSI reservation on the witness drive. In the case of a split-brain, the witness helps to decide which of the hosts holding a copy of the data should take over control.

Some examples.

VMware VSAN allows running a 2-node cluster with the witness drive running on a 3rd host or in the cloud. Source

StarWind Virtual SAN runs in just a 2-node setup using the Microsoft Failover Cluster service, which also contains a quorum voting mechanism to avoid the split-brain issue. Source

In both, a heartbeat network is used to monitor the communication between the nodes and the quorum. To avoid split-brain, I consider redundant heartbeat channels mandatory.

Mr. Raspberry