4

Our server started getting slow, so I ran iostat on it.

iostat -dx 5

Device:         rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00    89.60 108.40  5.60   880.00   763.20    14.41     2.61   22.87   8.70  99.20
sdb               0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00

So I see that the one disk sda is totally saturated. How do I find which exact processes are causing this? (or is it swapping to that disk?)

Artem
  • On CentOS 5, the kernel is the older one, and so tools like iotop do not work! See my poor man's solution to this below. – Artem Aug 05 '10 at 15:35

7 Answers

6

I also like iotop

Richard Salts
1

collectl may be what you are after. It reports I/O statistics by process, among other things. Use collectl --top io to print a top-like listing sorted by I/O usage, or collectl -sZ for collectl's native output for the process subsystem. Adding the --procopts t switch will show threads too.

As Richard Salts mentioned, iotop gives you a top-like interface with more detailed I/O stats; it runs in a terminal, so all it needs is Python. In either case, though, both programs depend on per-process I/O accounting in the kernel (2.6.20 or later is a safe bet), and if your kernel doesn't support it then neither program will work.
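A quick way to check whether the running kernel exposes per-process I/O counters is to look for /proc/self/io, which is present when task I/O accounting is enabled. This is only a minimal sketch: the function name has_io_accounting is made up for illustration, and the presence of the file is a proxy for what iotop and collectl need, not a guarantee that they will run.

```shell
# Return success if the kernel exposes per-process I/O counters.
# The optional path argument exists only so the check can be pointed
# at another file; by default it looks at /proc/self/io.
has_io_accounting() {
    [ -r "${1:-/proc/self/io}" ]
}

if has_io_accounting; then
    echo "per-process I/O accounting looks available"
else
    echo "no /proc/self/io on kernel $(uname -r); iotop and collectl --top io will probably fail"
fi
```

Checking for the file rather than comparing version numbers is deliberate, since distribution kernels sometimes backport features to an older version string.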

  • As you anticipated, sadly, I get "Error: you cannot use --top and IO options with this kernel type 'collectl -h' for help" I am on CentOS 5. What are my options in this case? – Artem Aug 03 '10 at 14:36
0

It would be nice to know what distro you're on, but here goes:
You can see which disk your swap partition is on by looking for "Linux swap / Solaris" in the output of "fdisk -l /dev/sda". That will show you whether there is a swap partition on that disk.

Then, you can watch swap usage with vmstat to see if your server is doing a lot of swapping.
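As a rough sketch of what to look for in vmstat, the si and so columns report memory swapped in from and out to disk per second. The helper below (swap_activity is a hypothetical name, not part of vmstat) finds those columns by their header names and prints only the samples where either is non-zero.

```shell
# Read vmstat output on stdin and print the samples with swap traffic.
swap_activity() {
    awk '
        # The second header line names the columns; locate si and so by name.
        $1 == "r" {
            for (i = 1; i <= NF; i++) {
                if ($i == "si") si = i
                if ($i == "so") so = i
            }
            next
        }
        # Data rows start with a number; report rows with non-zero swap traffic.
        si && $1 ~ /^[0-9]+$/ && ($si > 0 || $so > 0) {
            print "swap-in=" $si " swap-out=" $so
        }
    '
}

# Three samples, one second apart. The first row is an average since boot.
if command -v vmstat >/dev/null 2>&1; then
    vmstat 1 3 | swap_activity
fi
```

No output means no swap traffic was seen in those samples. Sustained non-zero values while sda is busy would point at swapping rather than a particular application's I/O.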

Sweet
0

So sadly none of the iotop and related packages work on CentOS 5. But I was able to find the culprit slow process by using:

ps auxf | grep ' D'

This shows all the processes in uninterruptible sleep (state D in the STAT column), which is usually caused by waiting on I/O, so these are likely to be the processes doing a lot of I/O.

This was thanks to this ServerFault answer: wa (Waiting for I/O) from top command is big
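A slightly stricter variant, as a sketch: instead of grepping the whole line, ask ps for the STAT column first and match only on that, so a stray " D" elsewhere in the command line does not produce false hits. The function name filter_d_state and the chosen columns are illustrative choices, not something from the linked answer.

```shell
# Keep the header line plus any process whose STAT field starts with D
# (uninterruptible sleep). Expects STAT in the first column.
filter_d_state() {
    awk 'NR == 1 || $1 ~ /^D/'
}

# Take a snapshot now; guarded so the snippet is harmless where ps is missing.
if command -v ps >/dev/null 2>&1; then
    ps -eo stat,pid,user,wchan:32,args | filter_d_state
fi
```

Processes usually sit in state D only briefly, so a single snapshot can come back empty. Running it several times and noting which names keep reappearing gives a better picture.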

Also, for those wondering if the I/O is slow because of swapping, take a look at your top output and see what the sum of the free and cached figures says: if that sum is small and swap is in use, the machine is short on memory and is probably swapping. Or, better, use htop, which shows this in a less confusing way.
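The same figures can be read straight from /proc/meminfo, which avoids adding up columns by eye. This is a minimal sketch; mem_summary is a hypothetical helper name, and the optional file argument is there only so the function can be run against a saved copy.

```shell
# Print free + cached memory and the amount of swap in use, in kB.
mem_summary() {
    awk '
        $1 == "MemFree:"   { free = $2 }
        $1 == "Cached:"    { cached = $2 }
        $1 == "SwapTotal:" { swap_total = $2 }
        $1 == "SwapFree:"  { swap_free = $2 }
        END {
            printf "free+cached: %d kB\n", free + cached
            printf "swap used: %d kB\n", swap_total - swap_free
        }
    ' "${1:-/proc/meminfo}"
}

# Only run against the live system where /proc/meminfo exists.
if [ -r /proc/meminfo ]; then
    mem_summary
fi
```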

Artem
0

One option that might work for you, if the disk is only getting saturated in bursts, is to use collectl to record disk and process stats. Then look at the data to see when the disk is being saturated, and run 'collectl -sZ -p filename' to play back your collected process data and see which processes are in the RUN state during those times. Might work, might not... -mark

0

Try the btrace command (or blktrace).

Janne Pikkarainen
0

iotop works fine on CentOS 5.7 (2.6.18-274). If you're running an older kernel, try dstat instead.

Install rpmforge repo:

rpm -ivh rpmforge-release-0.5.2-2.el5.rf.x86_64.rpm

then install dstat from rpmforge-extras repo to get a newer version (0.7.2):

yum --enablerepo=rpmforge-extras install dstat

To view top I/O-using processes, you might use:

dstat -d --top-io --top-bio
quanta