1. The Logical/Physical Disk % Disk Time counters look wrong. What gives?
2. I see a value of 2.63 in the Avg. Disk Queue Length Counter field. How do I interpret this value?
3. How was the problem with the % Disk Time Counter fixed in Windows 2000?
4. Why are the Logical Disk counters zero?
5. In Windows NT 4.0, when should I issue the diskperf -ye command instead?
6. I am concerned about the overhead of the diskperf measurements. What does this feature cost?
7. I do not see a drive letter for some of my logical disks. What is that all about?
1. The Logical/Physical Disk % Disk Time counters look wrong. When I add the % Disk Read Time and % Disk Write Time counters together, they do not add up to % Disk Time. What gives?
The % Disk Time counters are capped at 100% in System Monitor because it would be confusing to report disk utilization greater than 100%. Values above 100% can occur because the % Disk Time counters do not actually measure disk utilization. The Explain text, which implies that they do, is very misleading. Once capping takes effect, the Read and Write counters no longer add up to the total.
What the Logical Disk and Physical Disk % Disk Time counters actually do measure is a little complicated to explain.
The % Disk Time Counter is not measured directly. It is a value derived by the diskperf filter driver that provides disk performance statistics. diskperf is a layer of software sitting in the disk driver stack. As I/O Request Packets (IRPs) pass through this layer, diskperf keeps track of the time I/Os start and the time they finish. On the way to the device, diskperf records a timestamp for the IRP. On the way back from the device, the completion time is recorded. The difference is the duration of the I/O request. Averaged over the collection interval, this becomes the Avg. Disk sec/Transfer, a direct measure of disk response time from the point of view of the device driver. diskperf also maintains byte counts and separate Counters for Reads and Writes, at both the Logical and Physical Disk level. (This allows Avg. Disk sec/Transfer to be broken out into Reads and Writes.)
The Avg. Disk sec/Transfer measurement reported is based on the complete round trip time of a request. Strictly speaking, it is a direct measure of disk response time – which means it includes queue time. Queue time is time spent waiting for the device because it is busy with another request or waiting for the SCSI bus to the device because it is busy.
% Disk Time is a value derived by diskperf from the sum of all IRP round trip times divided by the duration of the interval. That is equivalent to Avg. Disk sec/Transfer times Disk Transfers/sec, expressed as a percentage. Essentially:
% Disk Time = Avg. Disk sec/Transfer * Disk Transfers/sec
which is a calculation (subject to capping when it exceeds 100%!) that you can verify easily enough for yourself.
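As a rough way to check that calculation, here is a minimal Python sketch. The function name and the sample values are hypothetical, chosen only for illustration, and the factor of 100 is just the conversion of the product to a percentage.

```python
def pct_disk_time(avg_sec_per_transfer: float, transfers_per_sec: float) -> float:
    """Approximate % Disk Time as displayed: the product of response time
    and I/O rate, expressed as a percent and capped at 100."""
    uncapped = avg_sec_per_transfer * transfers_per_sec * 100.0
    return min(uncapped, 100.0)


# Hypothetical interval: 9 ms average round trip at 87 transfers/sec.
light = pct_disk_time(0.009, 87.0)

# Hypothetical interval: 20 ms average round trip at 131.5 transfers/sec.
# The uncapped product is about 263%, so the displayed value is capped.
heavy = pct_disk_time(0.020, 131.5)

print(f"light load: {light:.1f}%")
print(f"heavy load: {heavy:.1f}%")
```

Plugging in the Avg. Disk sec/Transfer and Disk Transfers/sec values from the same measurement interval should reproduce the % Disk Time you see, up to rounding.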
Because the Avg. Disk sec/Transfer that diskperf measures includes disk queuing, % Disk Time can grow greater than 100% if there is significant disk queuing (at either the Physical or Logical Disk level). The Explain text in the official documentation suggests that this product of Avg. Disk sec/Transfer and Disk Transfers/sec measures % Disk busy. If (big if) IRP round trip time represented service time only, then the % Disk Time calculation would correspond to disk utilization. But Avg. Disk sec/Transfer includes queue time, so the formula used really calculates something entirely different.
The formula used in the calculation to derive % Disk Time corresponds to Little's Law, a well-known equivalence relation that shows the number of requests in the system as a function of the arrival rate and response time. According to Little's Law, Avg. Disk sec/Transfer * Disk Transfers/sec properly yields the average number of requests in the system, more formally known as the average Queue length. The average Queue length value calculated in this fashion includes both IRPs queued for service and actually in service.
A direct measure of disk response time like Avg. Disk sec/Transfer is a useful metric. Since people tend to buy disk hardware based on a service time expectation, it is unfortunate that there is no way to break out disk service time and the queue time separately in NT 4.0. (The situation is greatly improved in Windows 2000, however.) Given the way diskperf hooks into the I/O driver stack, the software RAID functions associated with Ftdisk, and SCSI disks that support command tag queuing, one could argue this is the only feasible way to do things in the Windows 2000 architecture. The problem of interpretation arises because of the misleading Explain text and the arbitrary (and surprising) use of capping.
Microsoft's fix to the problem beginning in NT 4.0 is a different version of the Counter that is not capped. This is Avg. Disk Queue Length. Basically, this is the same field as % Disk Time without capping and without being printed as a percent.
For example, if % Disk Time is 78.3%, Avg. Disk Queue Length is 0.783. When % Disk Time is equal to 100%, then Avg. Disk Queue Length shows the actual value before capping. We recently had a customer reporting values like 2.63 in this field. That's a busy disk! The interpretation of this Counter is the average number of disk requests that are active and queued, that is, the average Queue Length.
2. I see a value of 2.63 in the Avg. Disk Queue Length Counter field. How should I interpret this value?
The Avg. Disk Queue Length Counter is derived from the product of Avg. Disk sec/Transfer times Disk Transfers/sec, the average response time of the device times the I/O rate. This corresponds to a well-known theorem of Queuing Theory called Little’s Law, which states:
N = λ * Sr
where N is the number of outstanding requests in the system, λ is the arrival rate of requests, and Sr is the response time. So the Avg. Disk Queue Length Counter is an estimate of the number of outstanding requests to the (Logical or Physical) disk. This includes any requests that are currently in service at the device, plus any requests that are waiting for service. If requests are currently waiting for the device inside the SCSI device driver layer of software below the diskperf filter driver, the Current Disk Queue Length Counter will have a value greater than 0. If requests are queued in the hardware, which is usual for SCSI disks and RAID controllers, the Current Disk Queue Length Counter will show a value of 0, even though requests are queued.
Since the Avg. Disk Queue Length Counter value is a derived value, not a direct measurement, you do need to be careful how you interpret it. Little’s Law is a very general result that is often used in the field of computer measurement to derive a third result when the other two values are measured directly. However, Little’s Law does require an equilibrium assumption in order for it to be valid. The equilibrium assumption is that the arrival rate equals the completion rate over the measurement interval. Otherwise, the calculation is meaningless. In practice, this means you should ignore the Avg. Disk Queue Length Counter value for any interval where the Current Disk Queue Length Counter is not equal to the value of Current Disk Queue Length for the previous measurement interval.
Suppose, for example, the Avg. Disk Queue Length Counter reads 10.3, and the Current Disk Queue Length Counter shows 4 requests in the disk queue at the end of the measurement interval. If the previous value of Current Disk Queue Length was 0, the equilibrium assumption necessary for Little’s Law does not hold. Since the number of arrivals is evidently greater than the number of completions during the interval, there is no valid interpretation for the value in the Avg. Disk Queue Length Counter, and you should ignore the Counter value. However, if both the present measurement of the Current Disk Queue Length Counter and the previous value are equal, then it is safe to interpret the Avg. Disk Queue Length Counter as the average number of outstanding I/O requests to the disk over the interval, including both requests currently in service and requests queued for service.
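The equilibrium rule of thumb described above can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and the sample response time and I/O rate are made-up values chosen to give a queue length of 10.3.

```python
from typing import Optional


def avg_queue_length(
    avg_sec_per_transfer: float,
    transfers_per_sec: float,
    current_queue_now: int,
    current_queue_prev: int,
) -> Optional[float]:
    """Apply Little's Law (N = arrival rate * response time), but only when
    the queue length is the same at both ends of the interval.

    Returns None when the equilibrium assumption does not hold, meaning the
    derived value should be ignored for that interval.
    """
    if current_queue_now != current_queue_prev:
        return None
    return avg_sec_per_transfer * transfers_per_sec


# Queue grew from 0 to 4 during the interval: no valid interpretation.
unbalanced = avg_queue_length(0.0515, 200.0, current_queue_now=4, current_queue_prev=0)

# Queue was 4 at both ends: about 10.3 outstanding requests on average.
balanced = avg_queue_length(0.0515, 200.0, current_queue_now=4, current_queue_prev=4)

print(unbalanced, balanced)
```

Requiring exact equality follows the rule as stated. In practice you might choose to tolerate a small difference between the two samples, but that would be a judgment call of your own.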
You also need to understand the ramifications of having a total disk round trip time measurement instead of a simple disk service time measure. Assuming M/M/1, a disk that is 50% busy has one request in the system on average (counting the request in service), and disk response time is 2 * service time. In other words, if M/M/1 holds, an Avg. Disk Queue Length value of 1.00 is expected at 50% busy. That means that any disk with an Avg. Disk Queue Length value greater than 0.70 probably has a substantial amount of queue time associated with it. The exception, of course, is when M/M/1 does not hold, such as during a backup operation when there is only a single user of the disk. A single user of the disk can drive a disk to near 100% utilization without a queue!
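For readers who want to see where those numbers come from, here is a small Python sketch of the standard M/M/1 formulas. The function names are hypothetical, and the 5 ms service time is an assumed sample value. For M/M/1, the average number of requests at the disk (in service plus waiting) is u / (1 - u) and response time is service time / (1 - u), where u is utilization.

```python
def mm1_requests_in_system(utilization: float) -> float:
    """Average number of requests in service plus waiting, for M/M/1."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in the range [0, 1)")
    return utilization / (1.0 - utilization)


def mm1_response_time(service_time: float, utilization: float) -> float:
    """Average response time (service time plus queue time), for M/M/1."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in the range [0, 1)")
    return service_time / (1.0 - utilization)


def mm1_utilization_from_queue_length(avg_queue_length: float) -> float:
    """Invert N = u / (1 - u) to estimate utilization from queue length."""
    return avg_queue_length / (1.0 + avg_queue_length)


n_at_half = mm1_requests_in_system(0.5)             # one request on average
resp_at_half = mm1_response_time(0.005, 0.5)        # twice the 5 ms service time
u_at_070 = mm1_utilization_from_queue_length(0.70)  # a bit over 41% busy

print(n_at_half, resp_at_half, round(u_at_070, 3))
```

Under these assumptions, a queue length of 0.70 corresponds to a disk that is roughly 41% busy with response time about 1.7 times the service time, which is why queue time is already noticeable at that level.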
3. How was the problem with the % Disk Time Counter fixed in Windows 2000?
Maybe not fixed exactly, but ultimately, this problem is addressed quite nicely in Win2K (although it would arguably be better had the older % Disk Time Counters, now obsolete, not been retained).
Windows 2000 adds a new Counter to the Logical and Physical Disk Objects called % Idle Time. Disk idle time accumulates in diskperf when there are no outstanding requests for a volume.
Having a measure of disk idle time permits you to calculate % Disk busy = 100 - % Idle Time, which is a valid measure of disk utilization.
Then you can calculate Disk Service time = % Disk busy / Disk Transfers/sec, with % Disk busy expressed as a fraction between 0 and 1. This is an application of the Utilization Law, namely:
u = service time * arrival rate
Finally, calculate Disk Queue time = Avg. Disk sec/Transfer - Disk service time, which follows from the definition of response time = service time + queue time.
So measuring Logical and Physical Disk % Idle Time solves a lot of problems. It allows us to calculate disk utilization and derive both disk service time and queue time measurements for disks in Windows 2000.
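The three calculations above can be put together in a short Python sketch. The function name and the sample Counter values are hypothetical. The only inputs are % Idle Time, Disk Transfers/sec, and Avg. Disk sec/Transfer for the same interval.

```python
from typing import Tuple


def disk_breakdown(
    pct_idle_time: float,
    transfers_per_sec: float,
    avg_sec_per_transfer: float,
) -> Tuple[float, float, float]:
    """Return (utilization, service_time_sec, queue_time_sec) for one interval."""
    # % Disk busy = 100 - % Idle Time, expressed here as a fraction.
    utilization = (100.0 - pct_idle_time) / 100.0
    # Utilization Law: u = service time * arrival rate.
    if transfers_per_sec > 0:
        service_time = utilization / transfers_per_sec
    else:
        service_time = 0.0
    # Response time = service time + queue time. Clamp at zero in case
    # measurement noise makes the difference slightly negative.
    queue_time = max(avg_sec_per_transfer - service_time, 0.0)
    return utilization, service_time, queue_time


# Hypothetical interval: 40% idle, 100 transfers/sec, 15 ms average response.
u, service, queue = disk_breakdown(40.0, 100.0, 0.015)
print(f"busy {u:.0%}, service {service * 1000:.1f} ms, queue {queue * 1000:.1f} ms")
```

In this made-up interval the disk is 60% busy, so each request takes about 6 ms of service time and the remaining 9 ms of the 15 ms response time is queue time.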
4. Why are the Logical Disk counters zero?
This will occur on Windows 2000 if you have never issued the diskperf -yv command to enable the Logical Disk measurements. When diskperf is not active, the corresponding Counters in System Monitor are zero. In Windows 2000, only the Physical Disk counters are enabled by default (this is equivalent to issuing the diskperf -yd command).
In Windows NT, neither Logical nor Physical Disk counters are enabled by default. To enable both sets of Disk counters, issue the diskperf -y command in NT 4.0. You must reboot the system in both Windows 2000 and NT 4.0 in order to activate the new diskperf settings.
5. In Windows NT 4.0, when should I issue the diskperf -ye command instead?
Almost never. We recommend that you use the diskperf -ye option only if you are using the software RAID functions (these include creating extendable volume sets and establishing disk striping, disk mirroring and RAID 5 logical volumes) in the Disk Administrator. Setting diskperf -ye allows you to collect accurate Physical Disk statistics when you are using software RAID functions in NT 4.0.
The diskperf -ye command loads the diskperf.sys filter driver beneath ftdisk.sys, the optional fault tolerant disk driver that provides software RAID functions in Windows NT 4.0. When striped, mirrored or RAID 5 logical disks are defined using Disk Administrator functions, the ftdisk.sys module that is responsible for remapping logical disk I/O requests to the appropriate physical disk is loaded in the I/O driver stack below the NTFS file system driver and before the SCSI physical disk driver. When the normal diskperf -y command is issued, diskperf.sys is loaded in front of ftdisk.sys. This allows diskperf to capture information about Logical Disk requests accurately. But because Logical Disk requests are transformed by the ftdisk.sys layer immediately below it, the Physical Disk statistics reported are inaccurate. To see accurate Physical Disk statistics, issue the diskperf -ye command to load diskperf.sys below ftdisk.sys.
Creating extendable volume sets is by far the most common use of the software RAID functions in the NT 4.0 Disk Administrator. You may prefer loading diskperf above ftdisk.sys (using the normal diskperf -y command) to obtain accurate Logical Disk statistics for a volume set.
This problem is addressed in Windows 2000 by allowing diskperf to be loaded twice, once above ftdisk.sys to collect Logical Disk statistics and once below it to collect Physical Disk stats. In Win2K, diskperf is loaded below ftdisk.sys by default. To load it a second time, issue the diskperf -yv command to activate the Logical Disk measurements.
6. I am concerned about the overhead of the diskperf measurements. What does this feature cost?
Not much. We strongly recommend that you enable all disk performance data collection on any system where you care about performance.
Even if you don’t care that much about performance, you should turn on Logical Disk reporting at a minimum. The Logical Disk Object contains two Counters, Free Megabytes and % Free Space, that will alert you in advance to potential out-of-disk space conditions.
The diskperf measurement layer does add some code to the I/O Manager stack, so there is added latency associated with each I/O request that accesses a physical disk when measurement is turned on. However, the overhead of running the diskperf measurement layer, even twice on Windows 2000 machines, is trivial. In a benchmark environment where a 550 MHz 4-way Windows 2000 Server was handling 40,000 I/Os per second, enabling the diskperf measurements reduced its I/O capacity by about 5% to 38,000 I/Os per second. In that environment, we estimated that the diskperf measurement layer added about 3-4 microseconds to the I/O Manager path length for each I/O operation. (On a faster processor, the delay is proportionally less.) For a disk I/O request that you would normally expect to require a minimum of 3-5 milliseconds, this additional latency is hardly noticeable.
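To put those benchmark figures in proportion, here is a back-of-the-envelope Python sketch. The function names are hypothetical, and the 3.5 microsecond value is simply an assumed midpoint of the 3-4 microsecond estimate quoted above.

```python
def capacity_loss_pct(baseline_iops: float, measured_iops: float) -> float:
    """Percent reduction in I/O capacity with measurement enabled."""
    return (baseline_iops - measured_iops) / baseline_iops * 100.0


def relative_latency_pct(added_sec: float, io_sec: float) -> float:
    """Added latency as a percent of the time a single disk I/O takes."""
    return added_sec / io_sec * 100.0


loss = capacity_loss_pct(40_000, 38_000)

ADDED_SEC = 3.5e-6  # assumption: midpoint of the 3-4 microsecond estimate
against_3ms = relative_latency_pct(ADDED_SEC, 0.003)
against_5ms = relative_latency_pct(ADDED_SEC, 0.005)

print(f"capacity loss: {loss:.1f}%")
print(f"added latency vs a 3 ms I/O: {against_3ms:.3f}%")
print(f"added latency vs a 5 ms I/O: {against_5ms:.3f}%")
```

Under that assumption, the added delay works out to roughly a tenth of a percent of a typical 3-5 millisecond disk request.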
Besides, if you do not have disk performance statistics enabled and a performance problem occurs that happens to be disk-related (and many are), you won’t be able to gather data about the problem because loading the diskperf measurement layer requires a reboot!
In our view, you can only justify turning off the disk performance stats in a benchmark environment where you are attempting to wring out the absolute highest performance level from your hardware configuration. Of course, you will need to have the diskperf measurements enabled initially to figure out how to optimize the configuration in the first place. It is standard practice to disable disk performance monitoring prior to making your final measurement runs.
7. I do not see a drive letter for some of my logical disks. Instead, I see something that looks like HarddiskDmVolumes\systemnameDg0\Volume1. What is that all about?
Logical disk information containing "HarddiskVolume..." usually indicates an unformatted partition. Knowledge Base article Q260834 describes the 'HarddiskVolume' label as a volume that has been mounted, but not assigned a drive letter. See http://support.microsoft.com/default.aspx?scid=kb;EN-US;q260834
There is another MS KB entry that specifically discusses "HarddiskDmVolumes" names. See http://support.microsoft.com/?kbid=274311. This KB article explains that after you convert a hard disk from Basic to Dynamic, the volumes on that hard disk are not identified by their drive letter in System Monitor. Instead, the volumes are displayed in System Monitor in a form similar to the following:
HarddiskDmVolumes\systemnameDg0\Volume1
You need to re-assign drive letters to dynamic disks after you convert them so that the drive letters are reported properly.