6

We have an online system backed by a MySQL database. We need to merge an older backed-up database (the raw table files, not a dump file) with the current one, without any server downtime. Note that the system is online 24/7. How would you do it, and what issues might arise?

Thank you.

Sajal Dutta

5 Answers

7

When messing with MySQL's files on disk you have to stop the MySQL server. To avoid downtime on your live machine, use a backup/virtualised machine running the SAME VERSION of the MySQL server. While the BACKUP MySQL server is stopped, copy the table files (I assume .frm, .MYI, .MYD etc.?) into /var/lib/mysql/BACKUP_DB (the directory corresponding to the BACKUP_DB database).
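
A minimal sketch of that copy step, assuming MyISAM table files, a Debian-style /var/lib/mysql data directory, and an init-script-managed server (paths and the BACKUP_DB name are placeholders):

# On the BACKUP machine only -- never the live server.
/etc/init.d/mysql stop
mkdir -p /var/lib/mysql/BACKUP_DB
cp /path/to/backup/*.frm /path/to/backup/*.MYI /path/to/backup/*.MYD /var/lib/mysql/BACKUP_DB/
chown -R mysql:mysql /var/lib/mysql/BACKUP_DB
/etc/init.d/mysql start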

Start the BACKUP MySQL server and ensure the data has loaded correctly using scripts or CLI. When verified, mysqldump the BACKUP_DB database so it can be loaded into the live server:

mysqldump --extended-insert BACKUP_DB > /root/sql/BACKUP_DB.sql

You have now converted your raw backup data into SQL statements which can be loaded into MySQL without downtime (unlike the raw data). Move BACKUP_DB.sql to the live machine.

Import BACKUP_DB.sql into your LIVE MySQL instance as a different database:

mysql BACKUP_DB < /root/sql/BACKUP_DB.sql

You should now have the backup database loaded into MySQL as BACKUP_DB.
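
Note that the target database has to exist before the import; if BACKUP_DB isn't there yet, create it on the live server first:

mysqladmin create BACKUP_DB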

Now, depending on whether you want INSERT IGNORE or REPLACE INTO behaviour (are you overwriting old data, or only filling the blanks in your indexes?):

mysqldump --no-create-db --no-create-info --extended-insert --insert-ignore MERGE_SOURCE | mysql BACKUP_DB

Or, for REPLACE INTO action:

mysqldump --no-create-db --no-create-info --extended-insert MERGE_SOURCE | sed 's/INSERT INTO/REPLACE INTO/g' | mysql BACKUP_DB

Alternatively, instead of piping the output straight back into MySQL, send it to a file and review the SQL statements first:

mysqldump --no-create-db --no-create-info --extended-insert --insert-ignore MERGE_SOURCE > /root/sql/merge.sql
mysqldump --no-create-db --no-create-info --extended-insert MERGE_SOURCE | sed 's/INSERT INTO/REPLACE INTO/g' > /root/sql/merge.sql

Finally, to avoid downtime, dump the merged BACKUP_DB over the live database:

mysqldump BACKUP_DB | mysql CURRENT_DB

You could always lock the tables first, to prevent a row being written to (for example) table z with a foreign key pointing at table a (which has already been written over):

FLUSH TABLES WITH READ LOCK;

(perform the dump as in the previous step)

UNLOCK TABLES;

Add the FLUSH command to the start of your dump .sql file and UNLOCK to the end.

Be sure to quadruple-check your DB names in this scenario! Ask any follow-up questions you're unsure of, as this is high-risk data-mashing stuff. Perform (and note in detail) the exact steps required on a dev server if possible, or virtualise your test, or create a small-scale test. Just test. And, of course, take enough backups to cover every data-loss eventuality.

Andy
  • Definitely my answer of choice with one exception: I'm not sure you need the 'mysqldump --no-data CURRENT_DB | mysql BACKUP_DB' stage, as putting the FRM/MYI files from your backup, should already make all tables/data available. – katriel Jun 06 '09 at 14:48
  • What if there is data with the same primary key? – David Pashley Jun 08 '09 at 22:04
  • @David - with regards to schema, "We need to merge an older backed up database" implies that the old database is a version of the current. As such, the live primary keys / auto_increment must be considered final - otherwise which version of the data do you consider authoritative? I know of no automated way to distinguish on a per row basis without explicit TIMESTAMP columns and further scripting. INSERT IGNORE will skip conflicting keys, REPLACE INTO overwrites them - the former is functionally equivalent to a SELECT INSERT on a null primary key, without requisite prior knowledge of keys. – Andy Jun 09 '09 at 10:40
  • If you have restarted the database, won't it bring down the system (though it can be from 1-x seconds)? – Niko Gunadi Jun 10 '09 at 00:00
  • @Niko you're correct, the downtime is unknown; I have amended my answer to avoid a server restart – Andy Jun 10 '09 at 10:57
5

For this example, let's assume your live database is called "zig" and your backed-up database is called "zig_backup". I'll also assume that your backup database and your live database have the same schema.

I'll further assume that your live database is being updated constantly and that you cannot perform the merge "offline" and then flip a switch to make the merged copy "live".

The issues to be aware of are:

  • Potential for indexes or primary keys in old/new to conflict for some reason (usually if a mistake was made, or records were lost accidentally)
  • Deletions which have happened since the backup should not be "merged"

The process to update will require planning in terms of which tables to merge first and which ones to merge last. My general tendency is to do the heavy lifting first (large tables), then work your way down to the smaller tables. Depending on your data sizes, it may not matter.

The process to merge would then be:

  1. Restore the backup database next to the live database with a modified database name (e.g. "zig_backup")
  2. For each table in the live database, merge from the backup database
  3. For each table in the backup database, check to see if the table exists in the live database, and if not, add it.

The safest method would be to do a full-table join and just insert the rows which don't match:

for each table in the "live" database:
    INSERT INTO zig.$table
    SELECT BACKUP.* FROM zig_backup.$table AS BACKUP
    LEFT OUTER JOIN zig.$table AS LIVE ON LIVE.ID=BACKUP.ID
    WHERE LIVE.ID IS NULL;

This, of course, requires knowledge of the primary keys of each table. A straight "REPLACE" may not work, as rows in the live database may have changed, and their changed data would be overwritten by the older rows from the backup.
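
A rough shell sketch of that per-table loop, assuming every table's primary key column is literally named ID (which will almost certainly need adjusting table by table):

# Sketch only: insert rows from zig_backup that are missing from zig.
# Tables that exist only in the live database will simply error and can be skipped.
for table in $(mysql -N -B -e "SHOW TABLES" zig); do
    mysql -e "
        INSERT INTO zig.\`$table\`
        SELECT BACKUP.*
        FROM zig_backup.\`$table\` AS BACKUP
        LEFT OUTER JOIN zig.\`$table\` AS LIVE ON LIVE.ID = BACKUP.ID
        WHERE LIVE.ID IS NULL;"
done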

To merge missing tables, do:

\u zig_backup
SHOW TABLES;

to get a list of all tables. Then, to check whether a given table already exists in the live database, you can do:

\u zig
SHOW TABLES LIKE 'tablename';

Alternatively, a table in the backup database which does not exist in the "live" database can be binary copied into the live database directory.
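
For example, a sketch for a MyISAM table that exists only in the backup (table name and paths are placeholders; check file ownership, and flush afterwards so MySQL picks the table up):

cp /var/lib/mysql/zig_backup/missing_table.{frm,MYD,MYI} /var/lib/mysql/zig/
chown mysql:mysql /var/lib/mysql/zig/missing_table.*
mysql -e "FLUSH TABLES;" zig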

Finally, handling rows that were deleted from the "live" database since the backup was taken is difficult at best, especially since the very fact that you're merging means some data is missing for one reason or another.

razzed
3

Set up replication and switch the DB server to the slave when it is ready and well tested (!!!).

Unless you have some technical limitation that prevents it, this is the only valid option (meaning you should be able to test the changes before putting them into a 24/7 environment).
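
A minimal sketch of pointing a freshly restored slave at the live master (host, credentials, and binary-log coordinates below are placeholders; the master needs binary logging enabled and a replication user created first):

-- Run on the slave after restoring a consistent copy of the master's data.
CHANGE MASTER TO
    MASTER_HOST='live-db.example.com',
    MASTER_USER='repl',
    MASTER_PASSWORD='secret',
    MASTER_LOG_FILE='mysql-bin.000001',
    MASTER_LOG_POS=107;
START SLAVE;
SHOW SLAVE STATUS\G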

noonex
2

Assuming you have the disk space, I'd set up a migration database and copy/merge the data into it. Set up a copy of your web environment to connect to the migration database and test it thoroughly. Then, once you are happy, change the connection strings on your production environment to point at the new migration DB.
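
For instance, seeding the migration database with a copy of the live one before merging the backup into it (database names are placeholders; the actual merge still needs the steps from the other answers):

mysqladmin create MIGRATION_DB
mysqldump CURRENT_DB | mysql MIGRATION_DB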

Kyle
  • I think I'd introduce a couple minutes of planned downtime -- otherwise you have a small race condition between syncing the two databases and switching the connection information in the application. You'd lose any (potentially critical) changes which were made between then-and-there. – hark Jun 01 '09 at 14:48
1

You can use the -T option of mysqldump to get the tables output as tab-separated values. Then create a copy of the live DB as Kyle suggests, use mysqlimport to re-import the data into the live copy, and test it there. When you are ready to switch over from live to live-copy you will need to take some downtime.
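
Roughly, and only as a sketch (directory and database names are placeholders; the -T target directory must be writable by the mysqld process):

# Dump each table of the old backup database as .sql (schema) and .txt (tab-separated data) files.
mysqldump -T /tmp/dump BACKUP_DB
# Re-import the data files into the copy of the live database; each .txt file loads into
# the table matching its file name. --ignore skips duplicate keys (--replace overwrites them).
mysqlimport --ignore LIVE_COPY_DB /tmp/dump/*.txt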

I have not tried this myself so Caveat Emptor.

Sekenre