1

We run a large forum with lots of reads and writes, particularly to the posts and topics tables which are both innodb.

Last week I started doing 12 hourly backups with innobackupex because mysqldump just takes forever (7+ million rows in posts table.) It seems that something doesn't like these backups because I have a recurring problem every other day.

The symptoms;

The front page of the site starts throwing errors
The logs start showing errors like Error: 126 - Incorrect key file for table '/tmp/mysql/#sql_4e87_14.MYI'; try to repair it
The /tmp/ dir fills up and we start getting Error: 1030 - Got error 28 from storage engine in the logs.

The only way to fix is to optimize table on each of the posts and topics tables.

I'm trying all I can to stop MySQL using disks for temp tables, but I'd have more problems than this if it used all my memory also.

My my.cnf is here; https://gist.github.com/cbiggins/0aa26f6defb7a14541d7

The box has 32GB memory and I don't come near that usually. Currently at 15GB use.

Thanks in advance.

Update 1: Despite the conf looking like there is replication, there isn't. This is a stand alone instance.

Update 2: Having now not done backups for more than 24hrs, the problem has just occurred again. So this is not the result of the backups.

Update 3: I have given MySQL 20gb of temp space now using tmpfs. Instructions here. Going to watch for the next little while and see how it goes.

Update 4: I found a killer query! 13 seconds and 2.3 million rows examined. Do this 20 times concurrently and I was filling up my new 20GB temp dir pretty quickly. I've disabled the block that was using this query and provided some feedback to the maintainer.

I've decided to get a super cheap dedicated server to replicate to to run backups from. Hopefully we can see my uptime climb again. :)

Christian
  • 779
  • 1
  • 13
  • 31
  • 2
    You should not be performing your backups on the primary server. Create a slave. That way, you can periodically pause replication to take a backup. By doing this, performance and/or data integrity of your master data will never be affected by your backup routine. – EEAA Jun 13 '14 at 01:30
  • I appreciate that, there is actually no replication. I've updated the question. – Christian Jun 13 '14 at 01:32
  • I understand that. With a site this large, *not* having a slave is crazy, for plenty of reasons other than just being able to back up things properly. – EEAA Jun 13 '14 at 01:33
  • Oh I see, I agree with that. I've always been concerned about the integrity of a backup from a slave due to MySQL replication being finicky sometimes. – Christian Jun 13 '14 at 01:34
  • MySQL replication, when configured properly and when monitoring in place, is no less finicky than other DBMSes. There are plenty of tools available to help manage and ensure proper replication. That's off-topic for this question, though. – EEAA Jun 13 '14 at 01:36
  • Ok, that makes sense. Are you able to shed light on what is happening during my backups that causes this though? – Christian Jun 13 '14 at 01:38
  • 1
    Subquery execution may generate indexes for faster processing and these are likely ending up in /tmp/. The backup might be filling /tmp/ up leading to these indexes getting corrupt / not fully generated. – Brian Jun 13 '14 at 02:07

1 Answers1

2

The problem is /tmp/ filling up and MySQL putting some files there.

One of the choices MySQL can make when dealing with a subquery:

Ref: MySQL 5.6 - Subquery Optimization

Materialize the subquery into a temporary table with an index and use the temporary table to perform a join. The index is used to remove duplicates. The index might also be used later for lookups when joining the temporary table with the outer tables; if not, the table is scanned.

You can turn this (subquery_materialization_cost_based) off and it will use a different strategy.

Ref: MySQL 5.6 - Controlling Switchable Optimizations

The other option is prevent /tmp/ from filling up by adding more space or having MySQL put it's temporary files elsewhere.

Brian
  • 3,386
  • 17
  • 16