
I have a very strange IO issue with Ubuntu 12.04 and MySQL.

Currently the machine is only a replication slave with an occasional read query hitting it. The disk utilization spikes randomly, seemingly unrelated to MySQL usage. The machine is only running MySQL and has no other services.

Originally the machine was using ext4, which suffers from IO issues with MySQL, so I wiped it and reinstalled with ext3. After replication resumed, the disk utilization again spiked randomly, remained high for a few hours and dropped off again.

The MySQL usage follows the same pattern every day, but the disk utilization has no pattern: it spikes randomly and can remain high for a number of hours or just a few minutes. There is a consistent nightly spike at 1am, which is when our MySQL backup (mysqldump) runs.
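For reference, the exact backup invocation isn't shown in the question; a minimal sketch of what a typical 1am mysqldump cron entry for such a job could look like (the paths and options here are assumptions, not our actual script):

```
# Hypothetical 1am backup job; the real script and options are not shown in the question.
# --single-transaction avoids locking InnoDB tables; gzip reduces the write volume.
0 1 * * * mysqldump --all-databases --single-transaction | gzip > /var/backups/mysql-$(date +\%F).sql.gz
```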

My next step is to downgrade to Ubuntu 10.04; the machine was running Debian 5 previously without any issues. We have a second identical machine with the same issue, which in my mind rules out a hardware fault on a single machine.

Disk utilization graph (Disk0):

The initial spike at 5pm is replication catching up after the reinstall; the spike at 1am is our backup. The issue pops up at 4am and remains high until just after 12, where it drops off dramatically.

MySQL weekly usage graph:

This is our average MySQL usage over the week. The same pattern repeats every day: busiest from 9am to 11pm, quiet from then until 9am again, with the lowest point each day around 4am.

Iostat output while the issue is happening:

http://pastebin.ca/2336462

/proc/mounts:

http://pastebin.ca/2336464

df -h:

http://pastebin.ca/2336465

mloc123
  • Can you provide `iostat -xdk 1 100` output and put it on pastebin? Also, show your /proc/mounts and df -h. Try to take the iostat at the problematic time, otherwise it is of no use. I usually take a full day of iostat for this kind of issue (a capture sketch follows these comments), but going through an entire day of iostat is a job that an outsider won't like to do ;) – Soham Chakraborty Mar 14 '13 at 17:16
  • Added extra detail now. – mloc123 Mar 20 '13 at 10:37
  • Just in case, did you have a look at the `/var/log` files? Assuming you have *ssh* and are connected to the Net (even if no domain points to it), there are some random *ssh* (...) attacks that show this kind of behavior (file access would be the log). For *ssh* the `auth.log` file is a good indicator – Déjà vu Mar 20 '13 at 10:56
  • The server sits behind a firewall with no direct SSH access. I have reinstalled with 10.04 now, checked all RAID settings while doing so, and found no issues with the card or drives. – mloc123 Mar 21 '13 at 14:49
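Following the suggestion in the first comment, a minimal sketch of how the iostat capture might be taken (the output paths here are arbitrary assumptions):

```
# One-off capture during a spike: extended per-device stats in KB, one sample per second, 100 samples.
iostat -xdk 1 100 > /tmp/iostat-spike-$(date +%F-%H%M).log

# Full-day capture as suggested in the comment: one sample per minute for 24 hours, run in the background.
nohup iostat -xdk 60 1440 > /tmp/iostat-fullday-$(date +%F).log 2>&1 &
```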

1 Answer


It sounds like what you want to know is which process is pegging disk IO. Fortunately, Ubuntu has iotop available via `apt-get install iotop` since Lucid/10.04. Since your IO spikes can last for minutes or hours, it should be relatively easy to detect the next IO spike, start up `iotop`, and identify the culprit process.
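For instance, a minimal sketch of what that might look like during the next spike (the flags below are standard iotop options; adjust as needed):

```
# Install iotop (packaged in Ubuntu since 10.04/Lucid)
sudo apt-get install iotop

# During a spike: show only processes currently doing IO (-o),
# and accumulate totals since iotop started (-a) to spot the worst offender.
sudo iotop -o -a
```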

EdwardTeach