
What does it mean if iostat shows 100% busy wait but the CPU is completely idle? My application runs for a while, then goes into this state periodically for about 10-20 seconds.

It is a transaction-processing C++ app on Solaris 10.

iostat output:
                  extended device statistics                       cpu
device      r/s    w/s   kr/s   kw/s wait actv  svc_t  %w  %b  us sy wt id
c0          0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0   0  0  0 100
sd1         0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0 
sd2         0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0 
sd3         0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0 
sd4         0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0 
c1          0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0 
sd0         0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0 
c6          0.0    0.0    0.0    0.0  0.0  1.0    0.0   0 100 
sd19        0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0 
sd19.fp2    0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0 
sd19.fp4    0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0 
sd20        0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0 
sd21        0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0 
sd22        0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0 
sd23        0.0    0.0    0.0    0.0  0.0  1.0    0.0   0 100 
sd24        0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0 
sd25        0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0 
sd26        0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0 
sd27        0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0 
nfs1        0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0 

vmstat output:

kthr      memory            page            disk          faults      cpu 
r b w   swap  free  re  mf pi po fr de sr s0 s1 s2 s3   in   sy   cs us sy id 
0 0 0 10842364 33093436 30 188 0 0 0 0 65 -0  2  3 -0 1327  843  709  0  1 99 
0 0 0 3406728 28181464 71 3601 0 0 0 0  0  0  0  0  0 1372 23009 1584 4  0 96 
0 0 0 2702996 28030080 0 740 0 0  0  0  0  0  0  0  0 1414 15002 2065 6  0 93 
0 0 0 2699448 28016628 0 3  0  0  0  0  0  0  0  0  0 1747 3012 2193  9  1 90 
0 0 0 2691728 28009844 0 1  0  0  0  0  0  0 10 10  0 2315 1300 2877  2  0 97 
0 1 0 2679788 27957836 0 5033 0 0 0  0  0  0  1  1  0 1895 1945 2658 10  0 90 
0 2 0 2654188 27907196 0 0  0  0  0  0  0  0  1  1  0 3566 3788 5495  2  0 98 
sean riley
    can you please post a few lines of vmstat 5 – Dave Cheney Jun 16 '09 at 04:32
  • Funny thing you asked about this, as I'm experiencing the exact same problem on a Linux guest in a VMware environment. It turned out that ext3 was not at all happy about writing small amounts of data quickly to a RAID 5 array on a SAN. I reformatted to XFS and all my problems went away. – pauska Jun 16 '09 at 23:38

3 Answers


It means that the load is due to IO wait, not CPU contention: accessing a hard drive, an NFS share, or swap space (and hence, usually, a local hard drive). I'm not sure whether pure network access contributes to this, but my gut says no; NFS adds to it only because it goes through the filesystem layer. "top" usually has a "wait" or "iowait" percentage that would show this.

jj33

iostat's %b shouldn't be 100%. Under a heavy amount of disk IO it may be high, above 50%, but a constant 100% usually means something is wrong with your IO subsystem. This has happened to me when I've had a hard drive in the early stages of failure: the disk took longer and longer to respond to requests, but still responded.

Or it could just be a very badly written application. A simple DTrace script should tell you which it is.
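As a starting point for such a script, here is a minimal sketch using the standard DTrace io provider to show per-device IO service times, which helps separate a slow device from an application issuing pathological IO. The probes and fields (io:::start, io:::done, dev_statname) are documented parts of the io provider on Solaris 10; the aggregation name is arbitrary:

```d
#!/usr/sbin/dtrace -s

/* Record when each block IO request is issued, keyed by the buf pointer. */
io:::start
{
    start[arg0] = timestamp;
}

/* On completion, aggregate the service time (ns) per device. */
io:::done
/start[arg0]/
{
    @svc[args[1]->dev_statname] = quantize(timestamp - start[arg0]);
    start[arg0] = 0;
}
```

Run it while the app is in its stalled state and press Ctrl-C; a device whose latency distribution has a long tail points at the IO subsystem, while no IO events at all points back at the application.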

Craig Lewis
  • Every time vmstat shows large page-in (pi) activity, your free memory drops, and swap is not being released as fast as free memory is being consumed. Are you by any chance mmap()ing large files? I'm going pretty far out on a limb, but I would guess that you're mmap()ing and munmap()ing files frequently. Try to mmap() them once, and hold on to them as long as you can. – Craig Lewis Jun 18 '09 at 23:38

One thing that can drive IO load very high is paging. Is your application consuming all the physical memory and causing the machine to page hard?

vmstat 5

If the swap-in and swap-out columns (si/so with vmstat -S, or the pi/po page columns in the output above) show anything other than 0, then your machine is paging (possibly a lot).

Dave Cheney