We have a RHEL 5.6 server here with 4 active paths to a single LUN. We suspect that it's not able to cram enough I/Os down the pipeline to the XIV on the other end. Here's the multipath topology:
mpath0 (XXXXXXXXXXXXXXX) dm-9 IBM,2810XIV
[size=1.6T][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=4][active]
\_ 2:0:1:2 sdaa 65:160 [active][ready]
\_ 1:0:0:2 sdc 8:32 [active][ready]
\_ 1:0:1:2 sdk 8:160 [active][ready]
\_ 2:0:0:2 sds 65:32 [active][ready]
And the extended device statistics from iostat -x:
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util
sdc 0.00 108.18 49.30 273.65 795.21 1527.35 14.38 0.49 1.51 1.16 37.50
sdk 0.00 101.00 49.70 280.44 1700.60 1525.75 19.55 0.55 1.67 1.15 38.06
sds 0.20 110.58 50.10 270.26 1287.82 1523.35 17.55 0.51 1.58 1.17 37.47
sdaa 0.00 99.60 46.31 285.23 781.64 1539.32 14.00 0.56 1.68 1.23 40.74
dm-9 0.00 0.00 195.61 1528.94 4565.27 6115.77 12.39 2.52 1.46 0.58 99.54
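As a quick sanity check on these numbers (a sketch, with the values copied from the iostat output above): the dm device submits requests before the elevator merges them at the sd level, so the dm-9 r/s and w/s should roughly equal the per-path rates plus the per-path merge rates (rrqm/s, wrqm/s). They do:

```python
# dm-9 issues requests that later get merged at the sd layer, so
# dm-9 r/s and w/s ~= sum over paths of (r/s + rrqm/s) and (w/s + wrqm/s).
# Values copied from the iostat -x output above.
paths = {
    "sdc":  {"rrqm": 0.00, "wrqm": 108.18, "r": 49.30, "w": 273.65},
    "sdk":  {"rrqm": 0.00, "wrqm": 101.00, "r": 49.70, "w": 280.44},
    "sds":  {"rrqm": 0.20, "wrqm": 110.58, "r": 50.10, "w": 270.26},
    "sdaa": {"rrqm": 0.00, "wrqm":  99.60, "r": 46.31, "w": 285.23},
}

r_total = sum(p["r"] + p["rrqm"] for p in paths.values())
w_total = sum(p["w"] + p["wrqm"] for p in paths.values())
print(f"reads:  {r_total:.2f}/s (dm-9 reports 195.61)")
print(f"writes: {w_total:.2f}/s (dm-9 reports 1528.94)")
```

So the request counts themselves are consistent between the map and its paths.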
It looks as though RHEL ought to be able to send far more IOPS down each individual path (which is desirable on the XIV storage subsystem), yet %util on the dm-9 device (the multipath map) is sitting at roughly 100%.
Does this mean RHEL is unable to cram any more IOPS into the multipath device, and that the bottleneck is therefore RHEL itself? How should I be interpreting this?
And how does dm-9 get to 99.54% util when the individual disks are only at 37.50, 38.06, 37.47 and 40.74?
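For what it's worth, on this era of sysstat %util is approximately (r/s + w/s) × svctm / 10, i.e. the percentage of wall-clock time the device had at least one request in flight. A rough check against the table above (values copied from the iostat output; this is an approximation of sysstat's accounting, not its exact formula):

```python
# %util ~= IOPS * svctm(ms) / 1000 * 100 = IOPS * svctm / 10
# Values copied from the iostat -x output above.
devices = {
    "sdc":  (49.30 + 273.65, 1.16),   # (r/s + w/s, svctm in ms)
    "sdk":  (49.70 + 280.44, 1.15),
    "sds":  (50.10 + 270.26, 1.17),
    "sdaa": (46.31 + 285.23, 1.23),
    "dm-9": (195.61 + 1528.94, 0.58),
}
for name, (iops, svctm_ms) in devices.items():
    util = iops * svctm_ms / 10.0   # percent of time busy
    # dm-9 works out to ~100% even though each path sits near 38-41%,
    # matching the reported figures.
    print(f"{name:5s} ~{util:6.2f}% util")
```

So the numbers are internally consistent: dm-9 carries the combined IOPS of all four paths, which is enough to keep the map "busy" nearly all the time even though each underlying path is mostly idle.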