3

We have a busy server that is choking under a high I/O load, or at least that's the feeling I have. Output from iostat -xz looks like this:

             extended device statistics                 
device    r/s    w/s   kr/s   kw/s wait actv  svc_t  %w  %b 
sd5     224.8  157.8 10701.8 6114.7  0.0  9.5   24.7   0 100 
sd5     243.2  110.4 11565.3 4065.0  0.0  9.7   27.5   0 100

It's obvious that the disk subsystem is overloaded, since a 25 ms service time is unacceptable for a 6-drive SATA array, and 100% busy also means we're choked on disk I/O.

But - why is wait always 0.0? And why is %w also 0? %w sometimes goes to 1 and quickly returns to 0. Doesn't this mean that no process is waiting for I/O?

Does the RAID controller somehow cause this result / mask the wait times?

Can someone explain this behavior?

shlomoid
  • 289
  • 3
  • 14

2 Answers

2

The svc_t column measures, in milliseconds, the "round trip":

"bottom" of the operating system - disk subsystem - "bottom" of the operating system

It is not completely correct that "100% busy means we're choked on disk I/O". It means that the disk was busy 100% of the time doing something, not necessarily that it cannot do more, nor that it is failing to serve requests in time (a subtle difference).

Usually the symptoms of overloaded disks are high values in the %w column and in actv (steadily over 200).

Could it be a latency problem? Does the system issue lots of random operations, so that the controller spends its time seeking across the 6 drives for each chunk of data? One rough way to check the pattern is sketched below.
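
If DTrace is available on that box, something like the one-liner below shows the distribution of physical I/O sizes per device (lots of small, scattered requests usually point at a random workload). This is only a sketch using the standard io provider; adapt it as needed:

    # Distribution of physical I/O sizes, per device; run until Ctrl-C.
    dtrace -n 'io:::start { @sz[args[1]->dev_statname] = quantize(args[0]->b_bcount); }'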

marcoc
  • 738
  • 4
  • 10
  • This is a database server performing many random IOPS; that is the main/only access pattern on the server. While I understand that, in theory, 100% busy means the controller is just always doing something and never idle, if it sits at a sustained 100% busy for a long time without any wait larger than 0, I suspect something is wrong. – shlomoid Apr 13 '11 at 15:10
  • Also, actv doesn't get much higher here, since the writes come from a small number of MySQL slave threads, and because slaves are serial in nature I don't expect to see much higher numbers in the actv column. What I see is that when busy reaches 100%, latency grows into the 25-30 ms area; when it's around 95%, latency is around 5 ms. – shlomoid Apr 13 '11 at 15:13
  • 1
    There is a difference between being fully utilized and being saturated. Your disks seem to be on the verge of becoming saturated with more I/O requests than they can handle. However, the fact that 'wait' is always zero or 1 means that either they are keeping up with the load (albeit at increased latencies) or something above them is trying to play nice (the I/O scheduler?) – Giovanni Tirloni Apr 14 '11 at 03:42
  • What do you mean by trying to play nice? Like you said, I just think that something in the I/O subsystem accepts more requests than it can handle (the controller?) instead of letting processes wait. The question is: what, why, how do I control it, and how do I really know whether the disks are saturated? – shlomoid Apr 14 '11 at 08:48
  • This kind of "threshold" behaviour is sometimes tied to cache saturation, particularly for "write" operations: some RAID controllers stop using the cache and switch to direct disk access until cache utilization drops back below a certain level. – marcoc Apr 22 '11 at 09:37
1

Yes, I think you're correct that the RAID controller is messing up the numbers. If it tells the driver the operation has started as soon as it's requested, the driver won't know the request is still waiting for the disk hardware inside the RAID controller. Can you pull stats off the RAID controller directly?
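
If not, one indirect check (assuming DTrace is available on the box, and bearing in mind this still measures latency as seen above the controller rather than at the physical disks) is to look at the distribution of I/O completion times instead of only the average. A controller whose write cache is absorbing writes while reads queue behind the spindles tends to show a clearly bimodal distribution. A rough sketch:

    # Quantize per-I/O latency (in microseconds) between io:::start and io:::done.
    dtrace -n '
    io:::start { ts[arg0] = timestamp; }
    io:::done /ts[arg0]/ {
        @["I/O latency (us)"] = quantize((timestamp - ts[arg0]) / 1000);
        ts[arg0] = 0;
    }'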

JOTN
  • 1,727
  • 1
  • 10
  • 12
  • I wish :) It's a DELL H700, and they are not supported in Solaris... Do you have an idea that I could test this theory with? Or an alternative way to measure the real disk condition? – shlomoid Apr 15 '11 at 09:07