99% disk IO spikes with Percona

Question

So we have a server that has seemingly random spikes in disk I/O, going up to 99.x% at random times and for no obvious reason, staying high for a while, and then going back down. This didn't use to be an issue, but recently the disk I/O has been staying at 99% for extended periods of time, in some cases up to 16 hours.

The server is a dedicated server, with 4 CPU cores and 4 GBs of RAM. It is running Ubuntu Server 14.04.2, running percona-server 5.6 and nothing else major. It is being monitored for downtime and we have a screen permanently showing CPU/RAM/Disk I/O for the servers we deal with. The server is also being regularly patched and maintained.

This server is the 3rd in a chain of replicas, and is there as a fail-over machine. The MySQL data flow is as follows.

Master --> Master/Slave --> Problem Server

All 3 machines have identical specs and are hosted with the same company. The problem server is in a different datacentre to the first & second.

The 'iotop' tool shows us that the disk I/O is being caused by the 'jbd2/sda7-8' process. From what we know this handles the filesystem journalling, and flushing things to disk. Our 'sda7' partition is '/var' and our sda8 partition is /home. Nothing should be reading/writing to /home on a regular basis. Stopping the mysql service results in the disk I/O immediately dropping back down to a normal level so we are fairly certain it is percona that is causing the issue, and this would match with it being the /var partition as this is where our MySQL data directory resides (/var/lib/mysql).

We use NewRelic to monitor all of our servers, and when the disk I/O spikes, we can not see anything that could be causing it. The load average sits at ~2. The CPU usage hovers at ~25%, which NewRelic says is being caused by 'IO Wait' rather than a particular process.

Our mysql configuration file was generated through a combination of the Percona configuration wizard and some settings that are required for our customers app, but is nothing particularly fancy.

MySQL config - http://pastebin.com/5iev4eNa

We have tried the following things to try and resolve the issue:

Ran mysqltuner.pl to see if there was anything obviously wrong. Results look very similar to the results from the same tool on the other 2 database servers, and does not change much between uses.
Used vmstat, iotop, iostat, pt-diskstats, fatrace, lsof, pt-stalk and probably a few more, but nothing obvious has jumped out.
Tweaked the 'innodb_flush_log_at_trx_commit' variable. Have tried setting it to 0, 1 & 2, but none seemed to have any effect. This should have changed how often MySQL was flushing transactions to the logs files.
A mysql 'show full processlist' is very un-interesting when disk-I/O is high, it just shows the slave reading from the master.

Some of the outputs from tools are obviously quite long so I'll give pastebin links, and I couldn't manage to copy-paste the output of iotop, so I've provided a screen capture instead.

iotop

IOTop

pt-diskstats: http://pastebin.com/ZYdSkCsL

When the disk I/O is high, “vmstat 2” shows us the things being written are mostly because of “bo” (buffer out), which correlates with the disk journalling (flushing buffers/RAM to disk)

http://pastebin.com/E3LWzwjj

“lsof -p mysql-pid” (list open files of a process) shows us that the files being written to are mostly .MYI and .MYD files in the /var/lib/mysql directory, and the master.info and relay-bin and relay-log files. Even without specifying the mysql process(so any file being written on the entire server), the output is very similar (mostly MySQL files, not much of anything else) This confirms for me that it is definitely being caused by Percona.

When disk I/O is high, the “seconds_behind_master” increases. I’m not sure which way round they happen as of yet. “seconds_behind_master” also jumps from normal values to arbitrarily large values temporarily, and then returns to normal pretty much straight away, some people have suggested this could be caused by network issues.

'show slave status' - http://pastebin.com/Wj0tFina

The RAID controller (3ware 8006) does not have any caching abilities; someone also suggested poor caching performance could be causing the issue. The controller has identical firmware, version, revision etc. as cards on other servers for the same customer (albeit webservers), so I’m fairly certain it is not at fault. I have also ran verifications of the array, which came back fine. We also have the RAID checking script which would have alerted us to any changes.

The network speeds are terrible compared to those on the second database server, so I’m thinking that maybe this is a network issue. This also correlates with spikes in bandwidth just before the disk I/O goes high. However, even when the network does “spike”, it doesn’t spike to a high amount of traffic, just relatively high compared to the average.

Network/Disk IO

Network speeds (generated using iPerf to an AWS instance)

Problem server - 0.0-11.3 sec 2.25 MBytes 1.67 Mbits/sec Second server - 0.0-10.0 sec 438 MBytes 366 Mbits/sec

Apart from being slow, the network seems to be fine. No packet loss, but some slow hops between servers MTR

Will be happy to also provide output of any relevant commands but I can only add 2 links to this post as I'm a new user :(

EDIT We got in touch with our hosting provider regarding this issue, and they were kind enough to swap the hard disks for SSDs of the same size. We rebuilt the RAID onto these SSDs, but unfortunately the problem persists.

If its handling exactly the same workload as other servers which are not exhibiting the problem then this rather points to a problem at the hardware level. Is there a caching disk controller? With a flat battery? Is hdparm reporting the expecting config (or paste it here)? — symcbean, Apr 30 '15 at 16:21
@symcbean It is indeed handling the exact same workload. Yes there is a disk controller, we have a 3ware 8006 running two disks in a RAID 1. I'll see if I can get some useful output from it. What 'hdparm' command would you advise using? Thanks. — , May 01 '15 at 09:06
You should be checking pretty much every option - lookahead, 32 bit IO mode, DMA mode, multiple IO count, accoustic mgmt mode. Also check for errors with smartctl. But I'd start with checking the disk controller. — symcbean, May 01 '15 at 09:54
@symcbean we are running a RAID verification now. S.M.A.R.T stats here: disk 1 http://pastebin.com/rdi4SNsv disk 2 http://pastebin.com/EpGcqR65. Thanks :) — , May 01 '15 at 11:31
4 errors on disk 2 - but such a low number is nothing to worry about - certainly nothing to do with the problem described in the question. — symcbean, May 01 '15 at 14:05

score 1 · Answer 1 · answered May 02 '15 at 00:08

1

Which version of MySQL server do you use? After 5.5 you can use the performance_schema to get realtime statistics from the database. I'd start querying the

 table_io_waits_summary_by_table
 table_io_waits_summary_by_table
 table_lock_waits_summary_by_table

to see what is happening exactly.

An another solution would be if you check the buffer pool usage, it is not possible, that there are cold pages which needed to move to memory?

answered May 02 '15 at 00:08

banyek

379
1
10

We are using version 5.6.23. I've queried the above, but we have over 2000 tables, so figuring out which one (if any) is causing the problem manually isn't plausible. Do you have any suggestions for sorting through the results? What sort of values would you expect to see? – May 05 '15 at 09:21
Here are our innodb variables - http://pastebin.com/hCXgMNSr Look okay? – May 05 '15 at 13:11
First I see you have 2Gb of buffer pool used, but the server has 4GB RAM. You said it is a dedicated machine, so I'd say you can increase the 'innodb_buffer_pool_size" to 3 Gb first, maybe that could help. Second, maybe you should check the slow queries as loggin slow queries, and run on the logfile the pt-query-digest – banyek May 05 '15 at 18:49
Or you can try this: mysql> select OBJECT_SCHEMA, OBJECT_NAME, AVG_TIMER_WAIT from performance_schema.table_io_waits_summary_by_table order by AVG_TIMER_WAIT desc limit 20; do you see a 'bump' in the data? – banyek May 05 '15 at 18:54
Thank you for those suggestions. For the rows returned from the second suggestion, there aren't any that are wildly higher than the other. There are a few that are in a less common range though. – May 06 '15 at 12:47

score 1 · Answer 2 · answered May 10 '15 at 18:30

The best way to attack it is to look at http://www.brendangregg.com/linuxperf.html and follow Brendan's advice.

Specifically you want his iosnoop tool that will tell you who access the storage the most. But you'll do yourself a big favor if you read through it to learn his thought process and methodologies as that would benefit you a lot in the long term.

99% disk IO spikes with Percona

2 Answers2