When is iowait considered to be high?

iostat -x

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           2,89    0,01    5,45   49,83    0,00   41,83

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
fd0               0,00     0,00    0,00    0,00     0,00     0,00     8,00     0,00   40,00  40,00   0,00
sda               0,18     0,86    2,82    0,60   181,20    21,92    59,35     0,03   10,22   5,02   1,72
sdb               3,96    39,67    6,27   20,94  2243,04   564,37   103,16     0,24    8,83   6,57  17,89
sdc              69,17     0,02   77,40   21,92 37365,20  1578,42   392,10     1,53   15,44   7,46  74,14
sdd               0,64     0,01    1,60    0,09   402,20    87,67   289,93     0,14   80,82  10,63   1,80
dm-0              0,00     0,00    0,85    0,14    28,07     1,09    29,54     0,01    8,18   2,27   0,22
dm-1              0,00     0,00    0,00    0,00     0,02     0,03     8,00     0,00   15,00   2,78   0,00
dm-2              0,00     0,00    2,24    0,10   402,20    87,67   209,87     0,15   65,29   7,69   1,80
dm-3              0,00     0,00  155,10   80,57 39493,87  2121,22   176,58     3,07   13,02   3,33  78,39
dm-4              0,00     0,00    0,34    0,06    34,97     0,47    89,11     0,01   25,57  10,23   0,41
dm-5              0,00     0,00    0,95    1,49    59,89    16,74    31,36     0,03   14,02   1,97   0,48
dm-6              0,00     0,00    0,42    0,43    19,50     4,36    28,10     0,01   16,69   5,00   0,42
dm-7              0,00     0,00    0,96    0,27    28,18     2,20    24,61     0,02   19,65   4,89   0,60
dm-8              0,00     0,00    0,83    0,71    66,32    10,89    50,17     0,02   16,16   4,50   0,69
dm-9              0,00     0,00    0,21    0,29    48,34     7,13   112,85     0,01   20,98   5,36   0,26
dm-10             0,00     0,00    0,06    0,01     2,08     0,12    29,66     0,00   16,39   7,85   0,06
dm-11             0,00     0,00    0,04    0,03     2,83     0,44    44,07     0,00   18,02   6,38   0,05
dm-12             0,00     0,00    0,00    0,00     0,01     0,00     2,40     0,00   46,53   4,53   0,00
dm-13             0,00     0,00    0,03    0,00     4,88     0,00   176,75     0,00   26,46  10,75   0,03
– Gabriel Sousa

1 Answer

The best answer I can give you is "iowait is too high when it's affecting performance."
Your "50% of the CPU's time is spent in iowait" situation may be fine if you have lots of I/O and very little other work to do as long as the data is getting written out to disk "fast enough". Conversely it could be catastrophic if the server is doing a high amount of disk I/O and is noticeably slow to the point where users are wasting "wall-clock time" waiting for operations to complete.

To determine what affects performance you need to throw some benchmarks (stress tests) at the system and see what level of I/O throughput the system can handle before it starts to noticeably bog down.
(The number you get from those benchmarks is largely an academic one - it's nice to know where the choke will happen, but when it happens the "solution" is to reduce the amount of time I/O operations take, or the number of them waiting in the queue. You can do that by switching to faster disks, installing better controllers with more cache, putting in SSDs, splitting the workload to multiple servers, etc.)
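
As a rough example of such a stress test (a sketch, not a definitive recipe: fio must be installed, and the file name, size, and runtime here are arbitrary):

fio --name=iowait-probe --filename=/tmp/fio-test.dat --size=1G \
    --rw=randwrite --bs=4k --ioengine=libaio --direct=1 \
    --runtime=60 --time_based
# while it runs, watch iowait and per-device utilization in another terminal:
iostat -x 5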

– voretaq7

  • (For what it's worth I subscribe to the user-centered view of performance tuning: A computer can be *horribly overloaded* by the numbers, but *working beautifully* from a user perspective, and users are the ones who open support tickets and complain about things, so it's their opinion that matters.) – voretaq7 Sep 16 '15 at 19:53
  • In addition to everything said above, you can check the CPU load average to get a feel for how many processes are waiting for I/O (see the vmstat sketch after these comments). – kofemann Sep 16 '15 at 19:59
  • @kofemann Load Average (RunQueue depth) is a useful metric, but it too can be misleading. I've had systems operating with load averages of 10-20 but the users had no issues with performance (lots of processes waiting, but they only need a couple of microseconds and then they give the CPU back). User-perceived performance is always the great and final arbiter. – voretaq7 Sep 16 '15 at 20:46
  • @voretaq7 Load averages of 10-20 on how many cores / CPUs? – Gabriel Sousa Sep 19 '15 at 00:46
  • @GabrielSousa I've seen it on a variety of systems ranging from 1-CPU 486s to 8-core Sun machines. On modern multicore systems with the same kinds of workloads you could probably sustain even higher load averages as long as the OS scheduler and the disk/memory subsystems can deal with it. Load average (like iowait, %CPU busy, etc.) is not always directly correlated to performance. It is *one* component in a *system* which ultimately determines performance. – voretaq7 Sep 19 '15 at 04:03
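
Following up on kofemann's suggestion, a minimal sketch for checking how many processes are actually blocked on I/O rather than merely queued for the CPU (the 5-second interval is arbitrary):

vmstat 5
# "r": processes on the run queue (waiting for CPU)
# "b": processes blocked in uninterruptible sleep, which usually means disk I/O;
# a persistently nonzero "b" alongside a high "wa" (iowait) column means the
# iowait is backed by real processes stuck waiting on the disks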