
I have a server that exports home directories over NFS. They are on software RAID1 (/dev/sdb and /dev/sdc), and the OS is on /dev/sda. I noticed that the %iowait reported by top and sar is relatively high compared to the rest of our servers: here it ranges between 5-10%, while on the other servers (which are more loaded than this one) it is 0-1%. The user experience degrades once %iowait climbs above 12%; that is when we see latency.

There are no drive errors in the logs, and I would like to avoid troubleshooting the drives by trial and error.

How can I find out which device (/dev/sda, /dev/sdb or /dev/sdc) is the bottleneck?

Thanks!

Edit: I use Ubuntu 9.10 and already have iostat installed. I am not interested in NFS-related issues so much as in finding which device slows the system down. NFS is not loaded; I have 32 threads available. This is the result of

grep th /proc/net/rpc/nfsd
th 32 0 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000

Edit2: Here is part of iostat -x 1 output (I hope I'm not violating some rules here):

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          45.21    0.00    0.12    4.09    0.00   50.58

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00     0.00   21.00    0.00   368.00     0.00    17.52     0.17    8.10   6.67  14.00
sdb               0.00     6.00    0.00    6.00     0.00    96.00    16.00     0.00    0.00   0.00   0.00
sdc               0.00     6.00    0.00    6.00     0.00    96.00    16.00     0.00    0.00   0.00   0.00
dm-0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-1              0.00     0.00   21.00    0.00   368.00     0.00    17.52     0.17    8.10   6.67  14.00
dm-2              0.00     0.00    0.00   12.00     0.00    96.00     8.00     0.00    0.00   0.00   0.00
drbd2             0.00     0.00    0.00   12.00     0.00    96.00     8.00     5.23   99.17  65.83  79.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          45.53    0.00    0.24    6.56    0.00   47.68

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00     1.00   23.00    2.00   424.00    24.00    17.92     0.23    9.20   8.80  22.00
sdb               0.00    32.00    0.00   10.00     0.00   336.00    33.60     0.01    1.00   1.00   1.00
sdc               0.00    32.00    0.00   10.00     0.00   336.00    33.60     0.01    1.00   1.00   1.00
dm-0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-1              0.00     0.00   23.00    0.00   424.00     0.00    18.43     0.20    8.70   8.70  20.00
dm-2              0.00     0.00    0.00   44.00     0.00   352.00     8.00     0.30    6.82   0.45   2.00
drbd2             0.00     0.00    0.00   44.00     0.00   352.00     8.00    12.72   80.68  22.73 100.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          44.11    0.00    1.19   10.46    0.00   44.23

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00   637.00   19.00   16.00   432.00  5208.00   161.14     0.34    9.71   6.29  22.00
sdb               0.00    31.00    0.00   13.00     0.00   352.00    27.08     0.00    0.00   0.00   0.00
sdc               0.00    31.00    0.00   13.00     0.00   352.00    27.08     0.00    0.00   0.00   0.00
dm-0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-1              0.00     0.00   20.00  651.00   456.00  5208.00     8.44    13.14   19.58   0.33  22.00
dm-2              0.00     0.00    0.00   42.00     0.00   336.00     8.00     0.01    0.24   0.24   1.00
drbd2             0.00     0.00    0.00   42.00     0.00   336.00     8.00     4.73   73.57  18.57  78.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          46.80    0.00    0.12    1.81    0.00   51.27

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00     0.00   16.00    0.00   240.00     0.00    15.00     0.14    8.75   8.12  13.00
sdb               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdc               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00

What are the most relevant columns to look into? What values are considered unhealthy? I suppose await and %util are the ones I am looking for. In my opinion dm-1 is the bottleneck (this is the DRBD resource metadata).
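
For the record, this is how I watch just those two columns across samples; a minimal awk sketch, assuming the column order shown in the header above (it may differ on other sysstat versions):

# print device, await and %util for each sample
# (field numbers $10/$12 match the header above; adjust for your iostat version)
iostat -x 1 | awk '/^(sd|dm-|drbd)/ { print $1, $10, $12 }'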

Double thanks!

Edit3: Here is my setup:

sda = OS, no RAID. Devices dm-0 and dm-1 are on it; the latter is the metadata device for the DRBD resource (see below). Both dm-0 and dm-1 are LVM volumes.

drbd2 = dm-2 = sdb + sdc -> this is the RAID1 device, which serves the user home directories over NFS. I don't think this one is the bottleneck. There is no LVM volume here.
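
If anyone wants to double-check my mapping of the dm-N names to volumes, either of these should show it (standard device-mapper tools, nothing specific to my setup):

ls -l /dev/mapper
# or query the device-mapper directly; the minor number is the N in dm-N
dmsetup ls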

  • Could you add some information about your DRBD configuration? – sciurus Mar 30 '11 at 14:40
  • I have only 1 DRBD resource, which is on top of RAID1, made of sdb + sdc. It has external metadata, which is on LVM on sda. – grs Mar 30 '11 at 15:18

4 Answers


iostat -x 1?

I am told I must expand this answer further, but as yet I don't know what to add. You don't say which distro you're using, so I can't point you to a method for installing iostat if you don't already have it. But I think it's what you're asking for.
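
For the record, on Debian/Ubuntu it comes with the sysstat package, so installing it is just:

sudo apt-get install sysstat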

Edit: glad to see some iostat output! At the moment, the sd[bc] devices have near-identical figures, which they should in RAID-1, and neither is saturated; nor is sda. drbd2, however, is; what is it used for, and how might it affect server performance as a whole?

Edit 2: I don't really know what to suggest. You admit that drbd2 "serves the user home directories over NFS" and you say that you have an NFS server latency problem. You produce iostat output that pretty convincingly says that drbd2 is the bottlenecked device. You then say that "In my opinion dm-1 is the bottleneck" and "I don't think [drbd2] is the bottleneck". It's not clear to me what evidence you have that contradicts the hypothesis that drbd2 is the bottleneck, but it would be nice to see it.

MadHatter
  • Please see my Edit3 above. Thank you! – grs Mar 30 '11 at 13:36
  • If I understood the man page correctly, `await`, `svctm` and `%util` are the 3 columns I have to watch carefully, with `%util` being the most important: it effectively shows which device the CPU waits on the most. That matches your suggestion of drbd2 being the bottleneck. Is this generally correct? Thanks! – grs Mar 31 '11 at 14:55
  • Yes, that is how I was thinking; as the man page says, `%util` measures the "percentage of CPU time during which I/O requests were issued to the device (bandwidth utilization for the device). Device saturation occurs when this value is close to 100%.". `await` is a linked statistic and worryingly high, too. – MadHatter Mar 31 '11 at 15:41

Is this a heavily used NFS server? A good way to find out whether NFS is the bottleneck is to check how busy the NFS server threads are and whether requests ever have to wait for a free thread.

grep th /proc/net/rpc/nfsd

th 128 239329954 363444.325 111999.649 51847.080 12906.574 38391.554 25029.724 24115.236 24502.647 0.000 520794.933

The first number is the number of threads available for servicing requests, and the second number is the number of times that all threads have been needed. The remaining 10 numbers are a histogram showing how many seconds a certain fraction of the threads have been busy, starting with less than 10% of the threads and ending with more than 90% of them. If the last few numbers have accumulated a significant amount of time, then your server probably needs more threads.
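
If you would rather not eyeball the line, a quick sketch along these lines sums the last two buckets (field positions assume the layout just described):

# sum the two busiest buckets (80-90% and 90%+ of threads busy)
awk '/^th/ { print "time with most threads busy:", $12 + $13, "seconds" }' /proc/net/rpc/nfsd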

Increase the number of threads used by the server to 16 by setting RPCNFSDCOUNT=16 in /etc/rc.d/init.d/nfs.
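
That path is the Red Hat-style init script; on Debian/Ubuntu the same knob lives in /etc/default/nfs-kernel-server, roughly like this:

# /etc/default/nfs-kernel-server
RPCNFSDCOUNT=16

# then restart the NFS server so the new thread count takes effect
sudo /etc/init.d/nfs-kernel-server restart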

You can read more at http://billharlan.com/pub/papers/NFS_for_clusters.html under the "Server Threads" heading.

coderwhiz

Both your /dev/sdb and /dev/sdc have very close "await" numbers. /dev/sda has somewhat bigger ones, but how could it affect your RAID performance when it is not part of the array? By the way, you do use LVM for mirroring, don't you?

poige

Reading iostat will help you narrow down which drive(s) are having I/O issues, but I have found that tracking down the application causing those issues is far more helpful in actually improving the situation. For that, iotop is awesome:

http://guichaz.free.fr/iotop/
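
A handy starting invocation (these are stock iotop flags, nothing site-specific):

# -o  only show processes actually doing I/O
# -P  aggregate per process instead of per thread
# -a  show accumulated I/O since iotop started, not per-sample bandwidth
sudo iotop -o -P -a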

n8whnp