
Context: We are using a MariaDB Galera cluster with (only) two master nodes for a web application. Last night we had a power failure, and now we can't seem to recover the data; it turns out the database is corrupt on both nodes. Our initial impression of this setup was that if one node went down, the other would quickly take over as the primary node.

My questions are:

  1. Is there a way to set up a cluster so there is always a backup node that is automatically kept in sync and takes over if one of the nodes goes down, especially after a power failure?

  2. What would be the correct way to implement the Galera cluster?

Malitta N
  • So did the power go down on both servers? For a really fail-proof cluster you should probably not be using Galera if you only have 2 nodes; with 2 nodes you should use (in my opinion) standard MySQL and Pacemaker/DRBD. For Galera you would want no fewer than 3 nodes (and always an odd number), and you would want 2 different physical locations. – Geraint Jones Jul 08 '15 at 06:10
  • @GeraintJones Yes, they are in the same location. Thanks a lot for the input. Can you please elaborate on what you suggested? – Malitta N Jul 08 '15 at 06:56
  • If the power goes out, and all your systems are in the same datacenter, it doesn't matter how fault-tolerant your design is if there's no high availability built in. – CIA Jul 10 '15 at 12:59

2 Answers


The answer to the first question, as with most issues in computing, is: yes, if you have enough resources and time. If the cluster is in some sort of datacenter environment, one would hopefully have some sort of out-of-band management interface like dedicated management NICs and/or a KVM system.

Modern datacenter management solutions like Intel's Datacenter Manager or Raritan's datacenter management systems let you set up policies to automatically reboot systems after a power failure, send notifications, and potentially even begin spinning up off-site or cloud-based failover nodes. However, significant cost and expertise can be required to set up and configure all aspects of this sort of safety net: it needs a lot of equipment, and thorough testing and preparation are difficult without some downtime.
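
As a lower-level complement to those suites, many servers let you control what happens after mains power returns directly through the BMC. A minimal sketch using ipmitool (assuming the nodes expose IPMI; not tied to any particular vendor suite) to have the machines power back on automatically after an outage:

```sh
# Show the current chassis state, including the power-restore policy
ipmitool chassis status

# Power the server back on automatically when mains power returns
ipmitool chassis policy always-on
```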

Another common node-management tool is Nagios, which allows for remote power management and control.

In addition to the in-band and out-of-band management options, setting up a configuration management server with a CM tool like Salt or Chef helps ensure that nodes are configured consistently and greatly simplifies the task of provisioning new nodes, even in strange or remote environments. The storage and database requirements as well as the network environment will help determine the appropriate cluster architecture, particularly with regard to storage, power and backups. In some instances it could be valuable to generate Kickstart profiles or some similar installation aid like AutoYaST on SUSE systems. That would allow you to quickly build clean nodes and import the necessary data from snapshots or backups in the event of a hardware failure.
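
As a rough illustration only (package, path and service names below are assumptions and vary by distribution), a minimal Salt state keeping the Galera packages, configuration and service in place on every node might look like this:

```yaml
# /srv/salt/galera/init.sls - hypothetical state for a Galera node
galera-packages:
  pkg.installed:
    - pkgs:
      - mariadb-server
      - galera            # provider library; package name varies by distro

/etc/mysql/conf.d/galera.cnf:
  file.managed:
    - source: salt://galera/files/galera.cnf   # cluster settings kept with the states
    - require:
      - pkg: galera-packages

mariadb-service:
  service.running:
    - name: mariadb
    - enable: True
    - watch:
      - file: /etc/mysql/conf.d/galera.cnf     # restart if the config changes
```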

Saving custom system images built with the KIWI image build system that import, mount or copy the necessary data is another option. Using KIWI would allow you to create images that can be deployed in a variety of scenarios, including as VMs, over PXE, or as bootable DVD/USB media. Designing the perfect custom image for your needs with KIWI or some other operating-system build tool could be quite beneficial for a variety of reasons.

Being more specific about the second question is difficult without knowing what lengths you would consider acceptable. The setup and resources required for a multi-site high-availability cluster with additional remote backups, automatic failover and recovery are drastically different from those required for a cluster where "high availability" means that as long as the building housing the cluster has power and internet, it needs to work. Hopefully some of the information here is useful.

Matt
  • Thank you for the detailed answer. I will award you the bounty since you answered my question in time and gave a lot of options for me to explore. – Malitta N Jul 17 '15 at 01:19
  • You are very welcome. Please post any more questions you might have about specific implementations of anything mentioned. You might also want to check out the [disaster-recovery tag](http://serverfault.com/tags/disaster-recovery/info) for some more info about planning for and recovering from a disaster. – Matt Jul 17 '15 at 17:30

We are using a Galera cluster with 5 nodes behind a load balancer that continuously checks all the nodes. In our configuration, only one node serves as the read and write target for connections from the load balancer, and the other nodes are hot standby. Of course Galera also supports multi-master reads and writes, so you can tune that to your liking.
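
As a hedged sketch of how such a front end could look (HAProxy is just one option here; the addresses and check user are assumptions, not our exact setup), the load balancer sends all traffic to a single node and only falls back to the standbys when the active node stops responding:

```
# /etc/haproxy/haproxy.cfg (excerpt) - hypothetical single-writer setup
listen galera
    bind 0.0.0.0:3306
    mode tcp
    option mysql-check user haproxy_check   # this user must exist on the cluster
    # Only db1 receives traffic; db2/db3 take over when the active node fails
    server db1 192.168.0.11:3306 check
    server db2 192.168.0.12:3306 check backup
    server db3 192.168.0.13:3306 check backup
```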

The minimum cluster size needs to be three, since you need an odd number of nodes to avoid a split-brain situation when the connection between the nodes goes down for any reason. (You can also use an arbitrator, but the easier setup is just to use at least 3 proper cluster nodes.) We use 5 nodes to allow for easier upgrades of the cluster and to increase resilience.
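
For reference, a minimal three-node Galera configuration could look roughly like this on each node (the cluster name, node names and addresses below are placeholders):

```ini
# /etc/mysql/conf.d/galera.cnf - hypothetical 3-node cluster
[mysqld]
binlog_format            = ROW
default_storage_engine   = InnoDB
innodb_autoinc_lock_mode = 2

wsrep_on                 = ON
wsrep_provider           = /usr/lib/galera/libgalera_smm.so
wsrep_cluster_name       = "example_cluster"
wsrep_cluster_address    = "gcomm://10.0.0.1,10.0.0.2,10.0.0.3"
wsrep_node_name          = "node1"        # unique per node
wsrep_node_address       = "10.0.0.1"     # this node's own address
wsrep_sst_method         = rsync
```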

Galera also supports clustering over a WAN, but that needs some extra tuning of the server settings so as not to wreck performance. Usually a cluster with 3+ nodes that have redundant network and power should be fine for most applications.
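
Most of that WAN tuning happens through the Galera provider options; the exact values are workload-dependent, but relaxing the group-communication timeouts along these lines is the usual starting point (the values here are only illustrative):

```ini
# Looser timeouts so brief WAN latency spikes are not treated as node failures
wsrep_provider_options = "evs.keepalive_period=PT3S; evs.suspect_timeout=PT30S; evs.inactive_timeout=PT1M; evs.install_timeout=PT1M"
```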

Something you didn't say in your question is which database engine you are using on your Galera cluster. Seeing that you got corruption, I would guess it is probably MyISAM? If that is the case you need to migrate to InnoDB, since MyISAM is effectively not supported by Galera. InnoDB also has some other benefits, such as more resilient writes that avoid data corruption even in the unlikely case that the cluster actually breaks apart and you need to restore the database.
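
A hedged way to check for and convert such tables (the table name below is just an example; run the conversion per table, ideally during a maintenance window) would be:

```sql
-- List tables still using MyISAM
SELECT table_schema, table_name
FROM information_schema.tables
WHERE engine = 'MyISAM'
  AND table_schema NOT IN ('mysql', 'information_schema', 'performance_schema');

-- Convert one table to InnoDB (repeat for each table found above)
ALTER TABLE mydb.mytable ENGINE = InnoDB;
```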

dadriel
  • Thank you for the answer. Can you also tell me how you have set up the load balancer? Also, I'd really appreciate it if you could explain why you have hot standby nodes and how they act if your master goes down (basically your first paragraph in a bit more detail). We are using InnoDB since we use transactions. – Malitta N Jul 17 '15 at 01:24
  • Both of these questions actually depend on each other. Our load balancer checks every few seconds whether it can connect to the SQL instance on the currently configured master node. If it cannot, it switches to the next node in its list that is still up and reachable. So if one node crashes, the LB notices that pretty quickly and forwards to the next node, which can take over immediately. – dadriel Jul 20 '15 at 08:02
  • So, hot standby nodes are set up as slaves? Also I was under the impression that the cluster took care of load balancing, meaning we don't need to set up a separate load balancer. – Malitta N Jul 21 '15 at 02:36
  • No, the cluster doesn't do any load balancing. Also, all the nodes in the cluster have the same status; there is no master or slaves. You CAN write to all the nodes at the same time if you set up your connections that way. You just don't want to, because that has the potential for replication conflicts if you send conflicting writes to different nodes in the cluster. – dadriel Jul 21 '15 at 08:54