In ordet to simplify the promotion of the Slave to a Master, I have a suggestion to keep the Slave closer sync'd to the Master.
You should use the tool mk-slave-prefetch.
Here is the beauty of this tool: On a slave, it will read the relay logs, look for all queries that have a WHERE clause, convert it to a SELECT and execute it. That way, the caches for InnoDB and MyISAM are essentially the same on the Slave as it is on the Master. The differences should be minor.
Here is something else you will need: Have the Master and the Slave setup using Circular Replication. That way, both Master and Slave have binary logging enabled.
How does this help? When the Master crashes and you failover to the Slave, the file master.info on the Master will contain the last place the Slave executed its SQL. Here is how you find out:
For this example we will have Circular Replication between M1 and M2
Run this commmand on M2 (your current Master) : SHOW SLAVE STATUS\G
You should see something like this:
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 10.64.100.253
Master_User: replicant
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000834
Read_Master_Log_Pos: 823413571
Relay_Log_File: relay-bin.002505
Relay_Log_Pos: 823391419
Relay_Master_Log_File: mysql-bin.000834
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 823391282
Relay_Log_Space: 823413708
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 1
1 row in set (0.00 sec)
Please take note of two fields:
- Relay_Master_Log_File
- Exec_Master_Log_Pos
These two fields represent the last log file and log position from a Master that was successfully execute on the Slave.
Suppose M1 crashed, you failover to M2, and you bring up M1. Your goal is to reestablish Circular Replication. With a crash, there is the possibility of replication losing its place. Here is what to do:
Step01) Run SHOW SLAVE STATUS\G
on M1
Step02) Get Relay_Master_Log_File
and Exec_Master_Log_Pos
from Step01. For this exmaple, let Relay_Master_Log_File
be mysql-bin.000834 and Exec_Master_Log_Pos
be 823391282.
Step03) Run these commands
STOP SLAVE;
CHANGE MASTER TO master_log_file='mysql-bin.000834',master_log_pos=823391282;
START SLAVE;
SELECT SLEEP(5);
SHOW SLAVE STATUS\G
Here is what to look for:
- If Seconds_Behind_Master is Numeric, Replication is Running. Just wait until it reaches zero.
- If Seconds_Behind_Master is NULL, Replication is Broken
- If Slave_IO_Running=Yes and Slave_SQL_Running=No, then an SQL error broke replication. Proceed to fix the error
- If Slave_IO_Running=No and Slave_SQL_Running=Yes, then the log file or log position does not exist.
You can fix that with
STOP SLAVE;
CHANGE MASTER TO master_log_file='mysql-bin.000835',master_log_pos=<new pos>;
START SLAVE;
SELECT SLEEP(5);
SHOW SLAVE STATUS\G
What is newpos?
- For MySQL 5.5, newpos is 107
- For MySQL 5.1, newpos is 106
- For MySQL 5.0, newpos is 98
If you implement Circular Replication and properly script these things using these principles, you will achieve the recovery of Master and Slave.