How can I deal with failures when connecting to a MySQL cluster?

Question

Our company has created two MySQL servers that are kept in synch using a master-master strategy. The two-machine cluster is managed using crmsh. This is only a high-availability cluster, meaning that one MySQL server is being sent data at a time, and the other is only kept for backup if one server fails.

To test our cluster (I haven't done this sort of thing before), I wrote a script that will write 100000 rows into a table in a database. While this script is running, I log in to the server that is receiving the data (I.E. the server that the shared IP is pointing to) and reboot it, trying to simulate a hardware failure.

In some cases, the script keeps running. In some other cases, however, I receive the infamous PDOException: SQLSTATE[HY000]: General error: 2006 MySQL server has gone away exception, or the PDOException: SQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying to get lock; try restarting transaction exception, and so on.

How can I deal with these errors? I feel like there's a simple fact about programming for a cluster of servers that I'm unaware of.

Catch the exception in your application, log it, and then deal gracefully with it, retrying your connection. — EEAA, May 05 '15 at 06:20

score -1 · Answer 1 · edited May 23 '17 at 12:41

Each error message translates disparate meaning here

For MySQL server has gone away

Mostly happens because the server timed out and connection is closed. Increase max_allowed_packet in my.cnf to a larger value, restart MySQL and check

[mysqld]
max_allowed_packet = 32M

For Deadlock

Mostly happens because multiple transactions together hold and request for locks, creating a dependency. It will be helpful if you post the query as well. Try to catch the exact reason

mysql> SHOW ENGINE INNODB STATUS\G;

This link may also prove thoughtful

How can I deal with failures when connecting to a MySQL cluster?

1 Answers1