On an 8-way Amazon EC2 instance (running Linux 2.6.21) with 8 EBS volumes and a lot of disk traffic, we see a high %wa in top (30-40%) and a high load average (8-9). My understanding is that processes waiting on I/O from the EBS volumes are counted in the load average: ps shows several processes in the D state, about as many as the load average.
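To make the comparison between D-state processes and the load average repeatable, here is a rough sketch. It assumes a procps-style ps that accepts `-eo stat=` (the trailing `=` suppresses the header) and a readable /proc/loadavg. The function names are my own, purely for illustration.

```python
import subprocess
from collections import Counter


def count_states(ps_stat_output):
    """Count processes by the first letter of their ps STAT field (R, S, D, ...)."""
    return Counter(
        line.strip()[0] for line in ps_stat_output.splitlines() if line.strip()
    )


def runnable_or_blocked(counts):
    """Tasks Linux includes in the load average: runnable (R) plus uninterruptible sleep (D)."""
    return counts["R"] + counts["D"]


def snapshot():
    """Hypothetical helper: one live sample of (state counts, 1-minute load average)."""
    ps_out = subprocess.run(
        ["ps", "-eo", "stat="], capture_output=True, text=True, check=True
    ).stdout
    with open("/proc/loadavg") as f:
        load_1min = float(f.read().split()[0])
    return count_states(ps_out), load_1min
```

Calling `snapshot()` a few times under load and comparing `runnable_or_blocked(counts)` with the returned load average should show whether the D-state processes account for most of it. A single sample is only an instantaneous count, while the load average is smoothed over time, so the two will not match exactly.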
However, it's not clear what %wa means. Is a CPU actually occupied while waiting for a response from the EBS volume, or does the kernel schedule another process on it? I would expect another process to be scheduled, but then I don't understand why iowait time is expressed as a percentage of total CPU time (unless the percentages add up to more than 100%).
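For reference, as far as I can tell top derives these figures from the cumulative tick counters on the cpu lines of /proc/stat. Below is a minimal sketch of that calculation, assuming the field order user, nice, system, idle, iowait, irq, softirq, steal described in proc(5). The function names are mine, and this is not top's actual code.

```python
# Assumed field order of a "cpu" line in /proc/stat, per proc(5).
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Turn a line such as 'cpu  100 0 50 800 50 0 0 0' into a dict of tick counters."""
    parts = line.split()
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    return dict(zip(FIELDS, values))


def percentages(before, after):
    """Share of elapsed ticks spent in each state between two samples."""
    deltas = {name: after[name] - before[name] for name in before}
    total = sum(deltas.values())
    if total == 0:
        return {name: 0.0 for name in deltas}
    return {name: 100.0 * d / total for name, d in deltas.items()}
```

Feeding it the first line of /proc/stat sampled twice, a few seconds apart, gives the aggregate figures. The per-CPU lines (cpu0, cpu1, ...) can be parsed the same way. Computed like this, each tick lands in exactly one counter, so the shares sum to 100% with iowait as one of the buckets. That still doesn't tell me whether a CPU in that bucket is available for other work.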
As long as we don't max out the I/O capacity of the EBS volumes, I'm not concerned. But if the CPUs get tied up waiting for I/O, I think our machine will run out of CPU capacity before it runs out of I/O capacity.