2

One of the biggest complaints I have heard about MySQL is that it locks up a table if you try to change its schema like adding a column or adding an index.

By "locking up the table" does it mean I can neither read nor write to the table? Sometimes for hours?

That seems a pretty severe limitations. I was going to use MySQL for my new project but this gives me pause.

Is there a workaround for this? How do you handle the task of changing the schema of your production MySQL database?

By the way someone told me Postgresql doesn't have this problem. Is that true - I can both read and write to a Postgresql table while changing its schema? Is there any performance penalty incurred?

Would love to hear your experiences.

Continuation
  • 3,050
  • 5
  • 29
  • 38

2 Answers2

2

Yes, MySQL locks the table completely whilst it does the ALTER TABLE statement. Most of this time is spent in physically copying the table, which is why it is recommended you put all your necessary changes together in the one ALTER TABLE statement.

There are several approaches to mitigating this problem on a live database if you can't get a decent maintenance window.

First of all, many environments have timeouts of several minutes waiting for a table to become available for their query and don't know (and it's really difficult to check) just why the table is locked. I've used this quirk of a live website to do changes to tables. On a site I used to look after, I figured we had an allowance of about 7 minutes before someone started noticing. :-) It helps to make sure your boss is on your side.

Another way to do it is to do the select-insert-rename trick. This works well if the table either has fairly low UPDATE frequency, or is purely the target of INSERTs. The basic steps are to copy the table's schema, make the necessary changes, craft up a statement to do a INSERT...SELECT from the old to the new, and rename the tables (do the rename in one statement). You also need prepare ahead of time a statement to copy over any "new" records that got added or updated between the SELECT and the RENAME. I also did this a few times in a past job.

However, there are come caveats:

  • You will almost certainly have problems if the source table is MyISAM, unless perhaps the table is almost never written to. This is because of the way MyISAM tables lock. It works better with InnoDB tables because it can still be read whilst your big INSERT...SELECT is running.
  • You need a foolproof way of figuring out records that have been added or changed between the SELECT and the RENAME. For tables that are only used for INSERT, use the auto_increment column. For tables that get UPDATEs, you will need a reliable last-modified column.

Other ways to tackle this problem involve modifying slaves and failing the application over. This is more closely tied up with how your databases replicate. I also haven't done this one myself, so I can't describe exact steps.

Finally, there are a dozen server settings you can twiddle and several more that are much harder to change that will affect how long it takes to copy a table. The sort buffer is one, but also how much memory MySQL is allowed to use is another. (Remember that you can set a lot of those per connection, too, rather than setting some of them high globally.) When dealing with a lot of data, MySQL has a 'tipping point' effect where things are fairly linear up to a certain size, and then go to hell suddenly. It often comes up with complex queries working with a lot of data and is related to internal temporary table sizes and how much memory it is allowed to use, but it can come up with table alterations because they involve re-indexing the data. That is one reason why giving a database more memory is almost always a good thing.

staticsan
  • 1,529
  • 1
  • 11
  • 14
1

That's the biggest complaint you've heard about MySQL? Geez, I've got a bucket full of bigger ones than that... (story for another day, perhaps)

Yes, MySQL completely locks a table when you're running an ALTER TABLE on it; no reads or writes for the duration, and queries which attempt to do so get paused until it's complete. On the odd occasion I have to modify the schema of a large MySQL table (small ones complete their alteration quickly enough that it doesn't cause a noticable problem) I usually just schedule a maintenance window and do it then. It's not hard to benchmark exactly how long such a change will take for a given table on the replica.

If you've got one of those idiot managements that refuses to let you have reasonable maintenance windows (and if so, run like hell to another job) then I've heard of people doing things like making a copy of a table's schema, modifying the empty table, then doing a copy-select into the new table (with a write lock in place to prevent changes) then renaming the tables. Sounds like far too much risk for my taste. Maatkit's mk-table-sync has a mode to do this, if you'd prefer to be able to blame someone else when it dies.

Whilst PostgreSQL doesn't have the "giant lock of doom", you still incur a significant performance hit to modify the schema of a table in PgSQL -- it's a lot of disk IO. I can't imagine how any RDBMS is going to be able to avoid that, though.

womble
  • 95,029
  • 29
  • 173
  • 228