How to back up a Galera cluster through garbd? SST fails

Question

We are running a MySQL Galera cluster (5.6) on 3 nodes.

To back up the cluster, I am trying to follow this example: http://galeracluster.com/documentation-webpages/backingupthecluster.html

The command I am using:

node1:~$ sudo /usr/bin/garbd --address gcomm://10.0.0.120:3307?gmcast.listen_addr=tcp://0.0.0.0:4444 --group example_cluster --donor MyNode1 --sst backup

10.0.0.120 is the local address of node1. I also tried 10.0.0.10:3306, which is the VIP of the cluster through HAProxy.

Both fail with the error:

FATAL: Exception in creating receive loop: Failed to open connection to group: 110 (Connection timed out) at garb/garb_gcs.cpp:Gcs():35

Leaving me with the following questions:

How do I specify the backup file, so that I can transfer it to an FTP server for potential recovery of the cluster?
Why does the connection fail? Do I need to configure the cluster for backup access?

Thank you in advance for any help.

I am attaching the detailed output after issuing the backup command:

2015-09-01 12:29:56.240  INFO: CRC-32C: using hardware acceleration.
2015-09-01 12:29:56.241  INFO: Read config: 
    daemon:  0
    name:    garb
    address: gcomm://10.0.0.120:3307?gmcast.listen_addr=tcp://0.0.0.0:4444
    group:   example_cluster
    sst:     backup
    donor:   MyNode1
    options: gcs.fc_limit=9999999; gcs.fc_factor=1.0; gcs.fc_master_slave=yes
    cfg:     
    log:     

2015-09-01 12:29:56.245  INFO: protonet asio version 0
2015-09-01 12:29:56.245  INFO: Using CRC-32C for message checksums.
2015-09-01 12:29:56.246  INFO: backend: asio
2015-09-01 12:29:56.246  WARN: access file(./gvwstate.dat) failed(No such file or directory)
2015-09-01 12:29:56.247  INFO: restore pc from disk failed
2015-09-01 12:29:56.248  INFO: GMCast version 0
2015-09-01 12:29:56.249  INFO: (63af44e0, 'tcp://0.0.0.0:4444') listening at tcp://0.0.0.0:4444
2015-09-01 12:29:56.249  INFO: (63af44e0, 'tcp://0.0.0.0:4444') multicast: , ttl: 1
2015-09-01 12:29:56.250  INFO: EVS version 0
2015-09-01 12:29:56.250  INFO: gcomm: connecting to group 'example_cluster', peer '10.0.0.120:3307'
2015-09-01 12:29:59.255  WARN: no nodes coming from prim view, prim not possible
2015-09-01 12:29:59.256  INFO: view(view_id(NON_PRIM,63af44e0,1) memb {
    63af44e0,0
} joined {
} left {
} partitioned {
})
2015-09-01 12:29:59.759  WARN: last inactive check more than PT1.5S ago (PT3.50916S), skipping check
2015-09-01 12:30:29.287  INFO: view((empty))
2015-09-01 12:30:29.288 ERROR: failed to open gcomm backend connection: 110: failed to reach primary view: 110 (Connection timed out)
     at gcomm/src/pc.cpp:connect():162
2015-09-01 12:30:29.289 ERROR: gcs/src/gcs_core.cpp:gcs_core_open():206: Failed to open backend connection: -110 (Connection timed out)
2015-09-01 12:30:29.290 ERROR: gcs/src/gcs.cpp:gcs_open():1379: Failed to open channel 'example_cluster' at 'gcomm://10.0.0.120:3307?gmcast.listen_addr=tcp://0.0.0.0:4444': -110 (Connection timed out)
2015-09-01 12:30:29.290 FATAL: Exception in creating receive loop: Failed to open connection to group: 110 (Connection timed out)
     at garb/garb_gcs.cpp:Gcs():35

score 1 · Answer 1 · answered May 25 '16 at 22:02

First, to solve your issue: in order to JOIN THE CLUSTER, garbd has to connect to the Galera replication port, not the MySQL port. The default Galera port is 4567.

You have 3 nodes, so you can specify all 3 addresses.
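As a rough illustration of how such an address is put together, here is a small shell helper. The function name build_gcomm_address is made up for this example, and the node IPs in the usage comment are only the ones that appear in this thread, so substitute your own.

```shell
# Hypothetical helper: build a gcomm:// URL from a port and a list of hosts.
# Pass the Galera replication port (4567 by default), not the MySQL port.
build_gcomm_address() {
    port="$1"
    shift
    list=""
    for host in "$@"; do
        # Append "host:port", comma-separated after the first entry.
        list="${list:+$list,}${host}:${port}"
    done
    printf 'gcomm://%s\n' "$list"
}

# Example use (node addresses are assumptions, adjust to your cluster):
#   addr="$(build_gcomm_address 4567 10.0.0.120 10.0.0.121)"
#   sudo garbd --address "${addr}?gmcast.listen_addr=tcp://0.0.0.0:4444" \
#        --group example_cluster --donor MyNode1 --sst backup
```

Listing more than one node just gives garbd more peers to try, so it can still join if one of them is down.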

Once garbd is connected to the cluster, it will ask the node "MyNode1" (assuming a node with that name exists) to run the script wsrep_sst_backup, because you specified "backup" as the SST method. Did you actually create this script?

node1:~$ sudo /usr/bin/garbd --address gcomm://10.0.0.120:4567,10.0.0.121:4567,10.0.0.121:4567?gmcast.listen_addr=tcp://0.0.0.0:4444 --group example_cluster --donor MyNode1 --sst backup
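The wsrep_sst_backup script is something you have to provide yourself on the donor node, and it is also where you decide where the backup file goes. Below is only a minimal sketch of what it could look like. It assumes the donor calls the script with "--name value" pairs such as --role, --socket and --gtid (as the stock wsrep_sst_* scripts are called), that mysqldump can authenticate without a password prompt (for example through a client config file), and that /var/backups/galera is an acceptable target directory. The final "done" line mimics how the stock scripts report completion; treat it as an assumption and check it against your version.

```shell
#!/bin/sh
# Hypothetical wsrep_sst_backup sketch, not a tested production script.
BACKUP_DIR="${BACKUP_DIR:-/var/backups/galera}"   # assumed location

# Pick the values we need out of the "--name value" argument pairs.
parse_sst_args() {
    role=""
    socket=""
    gtid=""
    prev=""
    for arg in "$@"; do
        case "$prev" in
            --role)   role="$arg" ;;
            --socket) socket="$arg" ;;
            --gtid)   gtid="$arg" ;;
        esac
        prev="$arg"
    done
}

# Timestamped dump file name inside BACKUP_DIR.
backup_file() {
    printf '%s/galera-%s.sql\n' "$BACKUP_DIR" "$(date +%Y%m%d-%H%M%S)"
}

main() {
    parse_sst_args "$@"
    [ "$role" = "donor" ] || { echo "expected --role donor" >&2; return 1; }
    mkdir -p "$BACKUP_DIR" || return 1
    out="$(backup_file)"
    # Credentials are assumed to come from a client config file.
    mysqldump --socket="$socket" --all-databases --single-transaction \
        --result-file="$out" || return 1
    # Assumed completion signal, modelled on the stock SST scripts.
    echo "done $gtid"
}

# Only run the backup when the script is invoked with arguments.
if [ "$#" -gt 0 ]; then
    main "$@"
fi
```

The file written under BACKUP_DIR is what you would then copy to your FTP server. If you prefer a physical backup, the mysqldump call is the part to swap out.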

Let me know if this still doesn't work.
