
We have a virtual server running on VMware. This virtual server has a number of iSCSI drives attached to it (E, F & G drives). It has 2 virtual NICs - 1 for normal network traffic and 1 for iSCSI. Both NICs are connected at 1Gbps. We are running Windows 2012.

Every half hour we run a process that is quite intensive on one drive (say the F drive). In PerfMon we see % Disk Read Time at 100% while this process runs. In Task Manager, the throughput on the iSCSI connection rarely exceeds 500Mbps and normally sits around 200-300Mbps while this process runs.

The graph of the disk activity in the SAN and on the switch between the host and SAN also shows that they are running well under capacity.

I saw this question, however I don't understand the answer and it doesn't seem relevant to me (I could be wrong): iSCSI SAN - Network adapter bottleneck

What should I look at to see why Windows thinks the disk is running at 100%, yet the hardware is running under 50%?

I'm a DBA/programmer, not a network guy, so it's quite possible I'm missing something simple.

Greg
  • Your iSCSI connection bandwidth used should not be directly compared to disk load. They are not really directly related. What kind of SAN? How are your vswitches set up? What kind of switches are being used? Are the switches dedicated to iSCSI traffic? What kind of drives, and how many, make up the volume you are looking at? What RAID level is used for the volume in question? What throughput are you getting in PerfMon? What are your IOPS showing in PerfMon? – Rex Dec 30 '13 at 03:05
  • Usually only sequential reads and writes can saturate a storage link's bandwidth. If your workload has a random access pattern, then IOPS and latency are the deciding factors. – Veniamin Dec 30 '13 at 04:59
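To put rough numbers on the point about random access, here is a minimal sketch that converts a link throughput into the IOPS it would represent at a given I/O size. The 8 KiB and 64 KiB sizes are assumptions chosen for illustration, not values taken from the question; the 250 Mbps figure is the middle of the 200-300Mbps range reported above.

```python
def iops_from_throughput(mbps: float, io_size_kib: float) -> float:
    """Estimate I/O operations per second for a given link throughput.

    mbps        -- observed throughput in megabits per second (decimal mega)
    io_size_kib -- assumed size of each I/O in KiB (hypothetical, workload-dependent)
    """
    bytes_per_second = mbps * 1_000_000 / 8
    return bytes_per_second / (io_size_kib * 1024)


# 250 Mbps is the midpoint of the range seen in Task Manager.
for io_size in (8, 64):  # assumed I/O sizes, not measured
    estimate = iops_from_throughput(250, io_size)
    print(f"{io_size:>2} KiB I/Os at 250 Mbps ~ {estimate:,.0f} IOPS")
```

If the real I/O size is small, a link that looks only a quarter full can still correspond to several thousand operations per second, which is why bandwidth alone says little about how busy the disks are.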

1 Answer


% Disk Read Time doesn't mean what you think it means. It doesn't mean that your disk is at 100% utilization, which makes the rest of your question somewhat irrelevant, since it's based on a misunderstanding of this counter.

From the linked article:

The “% Disk Time” counter is nothing more than the “Avg. Disk Queue Length” counter multiplied by 100. It is the same value displayed in a different scale. If the Avg. Disk queue length is equal to 1, the %Disk Time will equal 100. If the Avg. Disk Queue Length is 0.37, then the %Disk Time will be 37. This is the reason why you can see the % Disk Time being greater than 100%, all it takes is the Avg. Disk Queue length value being greater than 1. The same logic applies to the % Disk Read Time and the % Disk Write Time. Their data comes from the Avg. Disk Read Queue Length and Avg. Disk Write Queue Length, respectively.
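The relation described in that quote can be sketched in a few lines. This only restates the quoted arithmetic; the function name is made up for illustration and is not part of any PerfMon API.

```python
def pct_disk_time(avg_queue_length: float) -> float:
    """Apply the quoted relation: % Disk Time = Avg. Disk Queue Length * 100.

    The same scaling is said to apply to the read and write variants,
    using their respective queue-length counters.
    """
    return avg_queue_length * 100.0


# Sample queue lengths: the two from the quote plus one above 1.
for queue_length in (0.37, 1.0, 2.5):
    print(f"Avg. Disk Queue Length {queue_length} "
          f"-> % Disk Time {pct_disk_time(queue_length):.0f}")
```

A reading of 100 therefore corresponds to an average of one outstanding request, not to a device that has run out of capacity.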

MDMarra
  • Does % Idle Time work the same way? We're seeing the % Idle time drop to 0 at the same time as % Read Time = 100 – Greg Dec 30 '13 at 03:44
  • % Disk Read Time, % Disk Write Time, % Disk Idle Time, and % Disk Time are useless when you're talking about a SAN volume. You should measure these volumes using your SAN's tools and not Windows if you want numbers that matter. Seriously, do a little research here. This is basic stuff that's all over the Internet. Your queue length values are also only what Windows sees and won't show you what your array has queued, which is what really matters. Maybe you should reach out and ask your storage admins for help? – MDMarra Dec 30 '13 at 03:48