Automaticaly restoring SQL replication after power outage

Question

I have two Sql databases, "MASTER" and "SLAVE", and I use Sql replication to get them synchronized. I'm using MariaDB 10 on both databases.
The MASTER database is installed on a linux PLC using Docker.
The SLAVE is on a NAS server, using Docker as well.

It works pretty fine until I cut the power on the PLC... when I restart it, the SLAVE has trouble reading the binlog and the replication process stops.

It seems to be a pretty common issue but I haven't found a solution to get the replication working again after a power outage automatically ...

Is there a way to solve this issue ?
I have seen that PerconaDB could be a solution, but my armv7 PLC isn't compatible.

Thank's

score 0 · Answer 1 · edited Jul 17 '19 at 12:59

0

There are several solutions to fix this problem, but not automatically! But @ first the error message would be helpfull to understand the cause. Typically the slave tries to reconnect to the master and continues replication by itself. So the question it, what is the current problem?

Manually solutions:

Solution #1 Skip Error Dataloss may happen!

SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 1; start slave; show slave status \G; Repeat it as long, the slave is up an running. But be aware, data may be lost.

Solution #2 SQL Dump / Backup Restore.

On Master: Do an Database Lock first.

show master status \g;
Result may look like:
 show master status \g;
+------------------+-----------+--------------+------------------+-------------------+
| File             | Position  | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+-----------+--------------+------------------+-------------------+
| mysql-bin.000063 | 176085122 |              |                  |                   |
+------------------+-----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)

Do an SQL Dump on Master an all databases. Restore the SQL Dump on the slave server. Maybe remove master.info or edit it with new File and Logposition of Master.

Release DB Lock on Master. Start Slave, replikation should run again.

Solution #3 Master and Slave Datebase must be the same version, same hardware architecture and operating system.

Instead of mysql dump you could stop the database on master and slave. Just copy or sync the database files from master to slave.

auto.conf contains a unique server id. Remove this file or edit the id on the slave system, before starting the slave server.

Furter information for "change_master_to" https://mariadb.com/kb/en/library/change-master-to/

Hopefully you know the basics to set up an master / slave replication.

However, this steps can be done by a script and initiated by crontab (@reboot). But i would never do it automatically, only i don't care about the data and theire consistency.

edited Jul 17 '19 at 12:59

Alexander Tolkachev

4,513
3
14
23

answered Jul 17 '19 at 12:22

MWE

26
2

Here is the message : `Got fatal error 1236 from master when reading data from binary log: 'Client requested master to start replication from impossible position; the first event 'master1-bin.000002' at 8090, the last event read from 'master1-bin.000002' at 4, the last byte read from 'master1-bin.000002' at 4.'` So according to you there isn't any automatic way (without using this kind of tricks) to restore replication automatically ? Like a middleware or something ..? – user3575838 Jul 17 '19 at 12:39
I did an edit. No, not realy an automatic way. Please have a look on the master. Is there a new binlog File? Have a look in the my.in for parameter like: expire_logs_days = 30 Maybe this ist to short. If not set, execute this statemtn on mysql: "show global variables like '%logs%';" – MWE Jul 17 '19 at 12:40
These operations seem quite "heavy", that's why I'm wondering if any tool has been made for it. Yeah there is a new binlog. I checked the parameter expire_logs_days, it is set on 10 – user3575838 Jul 17 '19 at 12:58
Why not reseting the binlog when the master reboots and reseting the binlog position on the slave a the same time ? https://mariadb.com/kb/en/library/reset-master/ Would it work ? – user3575838 Jul 17 '19 at 13:02
Well, the Problem still is, the time between failure and last replication state of the slave. That means, you can't be sure that there will be no transaction lost. Or you choose syncron replication with perfromance impact. You also have to detect, on master the first binlog position and binlog file after start. It would be possible but i'm wondering about the new binlog after restart. Maybe a configuration issue. – MWE Jul 17 '19 at 14:24
What if it doesn't bother me to lose some datas ? Loosing like 1 hour of data acquisition is not a problem but I really need the replication to recover automaticaly and with reliability – user3575838 Jul 17 '19 at 14:40
I will add another anser, it's better to read... :D – MWE Jul 17 '19 at 14:43

MWE · Accepted Answer · 2019-07-17T15:33:03.573

0

What if it doesn't bother me to lose some datas ? Loosing like 1 hour of data acquisition is not a problem but I really need the replication to recover automaticaly and with reliability

I don't like this way. Better find out why and stop mariadb to pruge / reset master after restart. That would be the best solution.

Automatically: Well, create an simple script and execute it in crontab @reboot.

RESULT=$(mysql -NLr -e  'show master status')
FILE=$(echo ${RESULT} | cut -d ' ' -f 1)
LOGPOS=$(echo ${RESULT} | cut -d ' ' -f 2)

and maybe execute remote: mysql -h slavedb -e "CHANGE MASTER TO MASTER_LOG_FILE=$FILE, MASTER_LOG_POS=$LOGPOS;"

And Maybe here you can find a safer way to automatically restoring the database. But don't froget to drop '/ cleanup the slave before. Unsure about your data. May take a while.

Just take it in a cronjob: Initialize a MySQL slave server located on another network than the master

mysqldump --all-databases --master-data=1 | mysql -h slavedb

All you need, connection to the slave db and a super user / root with admin privileges from master to slave. User must exist also on the master. Will be overwritten after dump / restore.

edited Jul 17 '19 at 15:33

answered Jul 17 '19 at 14:50

MWE

26
2

It seems nice ! Thank you I'm not an expert with linux CL, I get this error on the two first commands, is there a syntax error ? `bash: command substitution: line 5: syntax error near unexpected token |' bash: command substitution: line 5: mysql -NLr -e 'show master status'; | cut -d ' ' -f 1)'` – user3575838 Jul 17 '19 at 15:10
Yes, syntax error my fault; Removed ; FILE=$(mysql -NLr -e 'show master status' | cut -d ' ' -f 1) LOGPOS=$(mysql -NLr -e 'show master status' | cut -d ' ' -f 2) – MWE Jul 17 '19 at 15:12
Ok nice ! Just added the password into the command like this `FILE=$(mysql -u root -proot -NLr -e 'show master status' | cut -d ' ' -f 1)` But if I type `echo $FILE` I'm getting `master1-bin.000003 3656954`. Is it normal ? I tried with a | delimiter too, same issue – user3575838 Jul 17 '19 at 15:24
But as you said on your comment, you don't like this way to recover the replication, right ? – user3575838 Jul 17 '19 at 15:27
Mmmmh, i didn't test it. It's an tab not just a blank space. I will correct it above, as it wirked at my host. Hopefully it will work on your distribution, too. – MWE Jul 17 '19 at 15:31

Automaticaly restoring SQL replication after power outage

2 Answers2