35

I used MySQLTuner which pointed out some tables were fragmented. I used

mysqlcheck --optimize -A

to optimize all tables. It fixed some tables but MySQLTuner still finds 19 tables fragmented. how can I see which tables are in need of defragmenting? Maybe OPTIMIZE TABLE will work where mysqlcheck didn't? Or what else should I try?

dabest1
  • 252
  • 1
  • 2
  • 9
curiouscat
  • 351
  • 1
  • 3
  • 5
  • 1
    I have a similar problem. I'm setting up a new DB with MySQL 5.5 and certain InnoDB tables never unfragment. I'm wondering if the Data_free check (shown in KayakJim's answer) is incorrect with InnoDB tables. – docwhat Mar 16 '11 at 20:50

5 Answers5

50

the short answer:

select  ENGINE, TABLE_NAME,Round( DATA_LENGTH/1024/1024) as data_length , round(INDEX_LENGTH/1024/1024) as index_length, round(DATA_FREE/ 1024/1024) as data_free from information_schema.tables  where  DATA_FREE > 0;

The "You must know" answer

first at all you must understand that Mysql tables get fragmented when a row is updated, so it's a normal situation. When a table is created, lets say imported using a dump with data, all rows are stored with no fragmentation in many fixed size pages. When you update a variable length row, the page containing this row is divided in two or more pages to store the changes, and these new two (or more) pages contains blank spaces filling the unused space.

This does not impact performance, unless of course the fragmentation grows too much. What is too much fragmentation, well let's see the query you're looking for:

  select  ENGINE, TABLE_NAME,Round( DATA_LENGTH/1024/1024) as data_length , round(INDEX_LENGTH/1024/1024) as index_length, round(DATA_FREE/ 1024/1024) as data_free from information_schema.tables  where  DATA_FREE > 0;

The DATA_LENGTH and INDEX_LENGTH are the space your data and indexes are using, and DATA_FREE is the total amount of bytes unused in all the table pages (fragmentation).

Here's an example of a real production table

| ENGINE | TABLE_NAME               | data_length | index_length | data_free |
| InnoDB | comments                 |         896 |          316 |         5 |

In this case we have a Table using (896 + 316) = 1212 MB, and have data a free space of 5 MB. This means a "ratio of fragmentation" of:

5/1212 = 0.0041

...Which is a really low "fragmentation ratio".

I've been working with tables with a ratio near 0.2 (meaning 20% of blank spaces) and never notice a slow down on queries, even if I optimize the table, the performance is the same. But apply a optimize table on a 800MB table takes a lot of time and blocks the table for several minutes, which is impracticable on production.

So, if you consider what you win in performance and the time wasted in optimize a table, I prefer NOT OPTIMIZE.

If you think it's better for storage, see your ratio and see how much space can you save when optimize. It's usually not too much, so I prefer NOT OPTIMIZE.

And if you optimize, the next update will create blank spaces by splitting a page in two or more. But it's faster to update a fragmented table than a not fragmented one, because if the table is fragmented an update on a row not necessarily will split a page.

I hope this helps you.

riffraff
  • 125
  • 6
Felipe Rojas
  • 639
  • 5
  • 7
  • 3
    Although this is an answer from several years ago, I thought I'd point out that the data_free is a stat for the whole tablespace, not for the respective table. If you store multiple tables together in one tablespace, the data_free can mislead you to believe the table needs defragmenting, when it just means there are free extents in the tablespace. Running optimize table won't reduce the free extents. Defragmenting the table may even *increase* the free extents. – Bill Karwin May 22 '17 at 17:55
21

Just to add to the answer from Felipe-Rojas you can calculate the fragment ratio as part of the query:

select ENGINE,
  concat(TABLE_SCHEMA, '.', TABLE_NAME) as table_name,
  round(DATA_LENGTH/1024/1024, 2) as data_length,
  round(INDEX_LENGTH/1024/1024, 2) as index_length,
  round(DATA_FREE/1024/1024, 2) as data_free,
  (data_free/(index_length+data_length)) as frag_ratio
FROM information_schema.tables
WHERE DATA_FREE > 0
ORDER BY frag_ratio DESC;

If a table is fragmented a small percentage (less than 5%?) then you can probably leave it alone.

Anything larger and you will need to assess based on your db usage, locking tables etc as to how important it is to defragment the table.

dland
  • 123
  • 6
sysadmiral
  • 311
  • 2
  • 3
2

Optimize Table will indeed resolve the issue you are having.

If you only have a few databases, then you could go use PHPMyAdmin to go through all of your databases. Select the tables with overhead and then select to optimize.

If you have a lot of databases then another method would likely be preferable.

I use the following PHP script setup in cron to run every hour.

$DB = new mysqli ('localhost', 'DbUser', 'DbPassword');
$results = $DB->query('show databases');
$allDbs = array();
while ($row = $results->fetch_array(MYSQLI_NUM))
{
    $allDbs[] = $row[0];
}
$results->close();
foreach ($allDbs as $dbName)
{
    if ($dbName != 'information_schema' && $dbName != 'mysql')
    {
        $DB->select_db($dbName);
        $results = $DB->query('SHOW TABLE STATUS WHERE Data_free > 0');
        if ($results->num_rows > 0)
        {
            while ($row = $results->fetch_assoc())
            {
                $DB->query('optimize table ' . $row['Name']);
            }
        }
        $results->close();
    }
}
$DB->close();
daemonofchaos
  • 1,201
  • 1
  • 8
  • 10
2

I came across this page and found the queries by Felipe-Rojas and sysadmiral to be very helpful. But in my case, I was running the query in WHM's phpMyAdmin and getting only TABLE_NAME was not as helpful since the database was not listed, and several databases have the same table names. So, simply adding TABLE_SCHEMA will provide that column as well.

select  ENGINE, TABLE_SCHEMA, TABLE_NAME, Round( DATA_LENGTH/1024/1024) as data_length , round(INDEX_LENGTH/1024/1024) as index_length, round(DATA_FREE/ 1024/1024) as data_free, (data_free/(index_length+data_length)) as frag_ratio from information_schema.tables  where  DATA_FREE > 0 order by frag_ratio desc

Shows DB

ENGINE  | TABLE_SCHEMA  | TABLE_NAME    | data_length   | index_length  | data_free | frag_ratio

InnoDB  | db_name       | db_table      | 0             | 0             | 8         | 170.6667

To "fix" I used the Defragment table link in phpMyAdmin for each of the tables that resulted in high "frag_ratio" for which phpMyAdmin executes:

ALTER TABLE `table_name` ENGINE = InnoDB;
Chris
  • 121
  • 3
1

A table using MySQL's InnoDB Engine essentially never needs to be OPTIMIZEd.

The value of Data_free from either information_schema.tables or SHOW TABLE STATUS is very often non-zero, even when you think you have done all you can do defrag your table(s). Furthermore, that metric is only one of several fragmentations that can and do occur. (Also, wasted space in blocks, undo lists, index BTrees vs data BTrees, etc, etc.

And innodb_file_per_table complicates the use of Data_free. If the table is in ibdata1, then Data_free refers to the entire tablespace; a rather useless number. If the table is in its own .ibd file, it is likely to be a few MB or a few percent of the table size, whichever is larger.

Only if you have deleted lots of rows and don't intend to refill the table, might it be worth running OPTIMIZE TABLE.

PARTITIONs also show a disturbing amount of Data_free, since each partition typically shows 4-7MB "free". And this will not go away.

Why Defragment?

  • To return space to the OS? Well, you might achieve this briefly if you had innodb_file_per_table=1. But as you add rows, you will take it back from the OS.
  • To speed up access? Forget it. The layout of blocks on disk is relatively random, and has been for the last few decades. Half a century ago, it was somewhat important to rearrange blocks.
  • To rebalance BTrees? So? They will promptly become unbalanced again. The steady-state for BTrees that are randomly inserted into is 69%. And that is not even factored into Data_free.
  • MySQLTuner says to? That product needs to chill.

A history note. When I was helping DBAs with mostly MyISAM tables, I discovered perhaps 2 out of 1000 tables that were helped by a monthly OPTIMIZE. Since then, I have worked with thousands of InnoDB tables, have yet find a performance problem that was likely to be helped by OPTIMIZE. (Sure, there have been disk space problems for which OPTIMIZE might help, but that gets tricky -- usually the DBA does not have enough disk space to run OPTIMIZE!)

Rick James
  • 2,058
  • 5
  • 11