I have a MySQL Galera cluster, using Percona and XtraBackup. The nodes can start stand-alone, or can join the cluster if only an IST is required. However, if an SST is required, the transfer runs to completion and then fails.

The logs show that, after the xtrabackup SST has completed, the script exits with status 22 (Invalid argument), causing the SST to be rolled back and the node to fail to come up.

2018-08-09 00:43:25 860 [Note] WSREP: 0.0 (xmdadb01): State transfer to 1.0 (xmdadb02) complete.
2018-08-09 00:43:25 860 [Note] WSREP: Member 0.0 (xmdadb01) synced with group.
2018-08-09 00:43:25 860 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup-v2 --role 'joiner' --address '10.93.40.122' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --defaults-group-suffix '' --parent '860'  '' : 22 (Invalid argument)
2018-08-09 00:43:25 860 [ERROR] WSREP: Failed to read uuid:seqno from joiner script.
2018-08-09 00:43:25 860 [ERROR] WSREP: SST script aborted with error 22 (Invalid argument)
2018-08-09 00:43:25 860 [ERROR] WSREP: SST failed: 22 (Invalid argument)
2018-08-09 00:43:25 860 [ERROR] Aborting

The relevant parts of the my.cnf:

[mysqld]
wsrep_provider=/usr/lib64/galera3/libgalera_smm.so
wsrep_provider_options="gcache.size=256M;gcs.fc_factor=1.0;gcs.fc_limit=512;gcs.fc_master_slave=YES;pc.checksum=true;"
wsrep_cluster_name="galera01-xmd"
wsrep_cluster_address="gcomm://10.93.40.121:4567,10.93.40.122:4567"
wsrep_node_name=xmdadb02
wsrep_node_address="10.93.40.122"
wsrep_sst_method=xtrabackup-v2
wsrep_sst_auth=sst_user:password-goes-in-here

As the SST runs, I can see the files arriving in /var/lib/mysql/.sst, so I know the transfer itself is working. I have verified that the user and password are correct. So why is the xtrabackup-v2 script returning 22, and how can I stop it from doing so, so that the SST can complete?

Annoyingly, when this setup was first installed, SST worked without issue. I do not know what changed in the intervening time to prevent SST while still allowing IST to work.

Steve Shipway

4 Answers

I find that SST failures usually fall under one of the following: SELinux/AppArmor is enforcing, the SST user was not created on the donor node (and permissions were not updated correctly in the .cnf files), or iptables/firewall restrictions block port 4444. In most cases, correcting those allows SST to work.
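
To rule out the firewall item from the list above, a quick TCP probe from the donor towards the joiner can help. This is only a minimal sketch: the function name `port_is_reachable` and the 3-second timeout are my own choices, and the default port is the 4444 mentioned above. As far as I know the SST listener is only open on the joiner while a transfer is in progress, so a refused connection outside an SST proves nothing.

```python
import socket


def port_is_reachable(host: str, port: int = 4444, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts alike.
        return False
```

For example, `port_is_reachable("10.93.40.122")` run on the donor while the joiner is waiting for its SST. A timeout rather than an immediate refusal usually points at packets being dropped somewhere in between.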

utdrmac

Because Galera has a creative outlook on what constitutes a meaningful error message, don't expect EINVAL 22 to correspond to a syscall return code.

Take a look at some of the code around this EINVAL text in their source.

Fixing it isn't a priority.

danblack
  • So, looking at the code, I take it that the reason is likely something in the configuration files that mysql can cope with but the xtrabackup cannot? Maybe a key that appears twice, or has no value, or is for some reason unparseable? The next question, of course, is how do we find out the name of the problem key in the configuration file if this is indeed the cause... – Steve Shipway Sep 02 '18 at 20:51
  • Ah, quite right, the SST script is returning the error not the wsrep provider. – danblack Sep 02 '18 at 22:39

There are many reasons why SST and IST can fail, and some have been given by other posters. In our case, however, the problem seems to have been that the xtrabackup SST script is pickier about my.cnf than MySQL itself, and fails with this error when its parser has issues.

Specifically, some of the config directives appeared in the file more than once (though with the same value). MySQL happily accepts this, but the xtrabackup parser turned each repeated directive into a multi-valued array, which was an invalid data type, so it choked.

Removing the additional duplicate config lines solved the issue.
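
To find which key is repeated, something along these lines can scan the file. It is a rough sketch, not the parser the SST script uses: `find_duplicate_options` is a name I made up, it treats `-` and `_` in option names as equivalent (as MySQL does), and it does not follow `!include` / `!includedir` directives, so included files need to be checked separately.

```python
from collections import defaultdict


def find_duplicate_options(cnf_text: str) -> dict:
    """Map (section, option) -> line numbers, for options set more than once in one section."""
    seen = defaultdict(list)
    section = None
    for lineno, raw in enumerate(cnf_text.splitlines(), start=1):
        line = raw.strip()
        if not line or line[0] in "#;":
            continue  # blank line or comment
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip()
            continue
        if line.startswith("!"):
            continue  # !include / !includedir are directives, not options
        # Option name is everything before the first '='; normalise dashes.
        name = line.split("=", 1)[0].strip().replace("-", "_")
        seen[(section, name)].append(lineno)
    return {key: lines for key, lines in seen.items() if len(lines) > 1}
```

Usage would be something like `find_duplicate_options(open("/etc/my.cnf").read())`; an empty result means no option is repeated within any single section of that file.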

Note that this only affected the xtrabackup SST -- IST has always worked fine, and MySQL itself (plus mysqldump etc.) is quite happy.

Steve Shipway

Try opening the innobackupex log; on Debian, for example, it is located at /var/lib/mysql/innobackup.backup.log

I found that my issue on the donor was "InnoDB: Error number 24 means 'Too many open files'", so raising ulimit -n would help :-)

EDIT: I found another line in the log: "xtrabackup: open files limit requested 200000, set to 1024". As a matter of fact, I had used:

[xtrabackup]
open-files-limit = 200000

But MySQL reduces it to 1024 (or 5000), so there is another thing to tweak:

[mysqld]
open-files-limit = 100000

(and remove the one in [xtrabackup] which is useless)
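
To spot this condition without reading the whole log by eye, a small scan for that message can help. This is just a sketch based on the exact log line quoted above; the function name is hypothetical, and other xtrabackup versions may word the message differently.

```python
import re

# Matches the message quoted above, e.g.
# "xtrabackup: open files limit requested 200000, set to 1024"
LIMIT_RE = re.compile(r"open files limit requested (\d+), set to (\d+)")


def find_reduced_open_files_limits(log_text: str) -> list:
    """Return (requested, granted) pairs where the granted limit is below the requested one."""
    pairs = []
    for match in LIMIT_RE.finditer(log_text):
        requested, granted = int(match.group(1)), int(match.group(2))
        if granted < requested:
            pairs.append((requested, granted))
    return pairs
```

Feed it the contents of the innobackupex log mentioned earlier; any pair it returns means the limit that was asked for was not actually granted.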

Yvan
  • This is useful to know, though it is not the root cause of our original issue (which was my.cnf duplicate lines giving error 22). We had open file limits defined in limits.conf and a relatively small DB so were spared from this one. – Steve Shipway May 01 '19 at 00:02