by eugene

Auditing IPEAK SPT


See also How We Test Drives

Testbed3: Auditing IPEAK SPT

The cornerstone of SR's new third-generation testbed is Intel's IPEAK SPT, a program that opens the way to unparalleled hard disk performance evaluation. The results delivered by IPEAK's tools are a bit startling; indeed, many readers refuse to accept them because they don't align with long-held conceptions. More specifically, folks are used to the idea that SCSI hard drives, though highly optimized for random, non-localized, multi-user situations, always outperform ATA drives even in desktop and single-user situations. As a result, when a manufacturer such as Western Digital pushes the envelope and delivers SCSI-like performance in an ATA drive with an 8 MB buffer, and IPEAK results place it among the top SCSI drives for desktop performance, reception within the enthusiast community, ostensibly the target market for such a unit, is mixed. While many folks recognize that WD's latest drive truly is special, others imply that this showing "proves" that benchmarks can never really be accurate. It's time to let these misconceptions go! Such an argument derives its premises from a preset conclusion, and that is irrational.

StorageReview.com's Desktop DriveMarks consist of trace files recorded with IPEAK's WinTrace32. These files can then be analyzed using AnalyzeTrace and exactingly replayed through RankDisk. The WinTrace32-RankDisk recording-playback combination gives us the amazing ability to recreate the disk accesses generated by any kind of system activity on a multitude of test drives. WinTrace32 intercepts all requests headed to the controller driver while RankDisk commences playback of these requests at the driver level. As a result, the CPU, RAM, OS, cache, and all other ancillary hardware and software influence the trace once and only once, at the time of recording, resulting in perfect recreation of high-level variables. Observe the following chain of a disk read or write:

Application
     |
     |
     |
     V
Operating System
     |
     |
     |
     V
OS Disk Cache
     |
     |
     |
     V
OS File System
     |<-------- WinTrace32 records all that reaches this point ---> no overlap between
     |                                                              Wintrace32 & Rankdisk
     |<-------- RankDisk plays back all from this point onwards -------^
     V
OS Disk Host Adapter Driver
     |
     |
     |
     V
Disk Host Adapter
     |
     |
     |
     V
Hard Drive Buffer
     |
     |
     |
     V
Hard Drive Platter

The perfect, non-overlapping system formed by WinTrace32 and RankDisk in theory offers the best conceivable combination of controlled trials and relevant access patterns. The million-dollar question: Are we sure that WinTrace32 and RankDisk do exactly what they claim?


WinTrace32

WinTrace32 is a resident background program that intercepts all requests sent to a host adapter's driver. For each intercepted request, it records a starting timestamp, a starting sector number, the length of the request in sectors, and an associated OP code (Read, Write, Done). The timestamps allow RankDisk to preserve interarrival times between requests and thus allow any advantages of a driver or device (tagged queuing, for example) to be felt.
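To make the record layout concrete, here is a minimal sketch of one trace entry and of how interarrival times fall out of the timestamps. The class name, field names, microsecond unit, and single-letter OP codes are all illustrative assumptions; the actual WinTrace32 file format is not described here.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TraceRecord:
    """One intercepted request (hypothetical layout, not the real file format)."""
    timestamp_us: int    # assumed unit: microseconds since the capture began
    start_sector: int    # first sector touched by the request
    length_sectors: int  # request length in 512-byte sectors
    op: str              # "R" (Read), "W" (Write) or "D" (Done)


def interarrival_times(records: List[TraceRecord]) -> List[int]:
    """Gaps between consecutive issued requests; Done records are skipped."""
    issued = [r for r in records if r.op in ("R", "W")]
    return [b.timestamp_us - a.timestamp_us for a, b in zip(issued, issued[1:])]


# Made-up sample data for illustration only.
sample = [
    TraceRecord(0, 1000, 8, "R"),
    TraceRecord(150, 52000, 1, "W"),
    TraceRecord(9000, 1000, 8, "D"),
    TraceRecord(9400, 7300, 16, "R"),
]
gaps = interarrival_times(sample)
print(gaps)  # [150, 9250]
```

A replay tool that honors these gaps issues each request no earlier than its recorded offset from the previous one, which is what lets queuing behavior in the driver or drive show up in the results.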

Does WinTrace32 accurately record disk requests?

Another program from Intel, IOMeter, allows users to define access patterns through settings such as block sizes, reads vs. writes, sequential accesses vs. random accesses, etc. Accordingly, if we take a given access pattern with known settings and run through it while recording the accesses using WinTrace32, AnalyzeTrace should then be able to examine the file and note the similarities and/or differences between what IOMeter claims to be sending to the drive and what WinTrace32 claims to see going to the drive.

Let's take a closer look at the File Server access pattern, included in the default wrkloads.icf file that comes bundled with IOMeter:

Size, % Access, % Read, and % Random are the most important settings. % Access determines how much of the total pattern will consist of the other three settings. In other words, for the first entry, 10% (% Access) of the access pattern will consist of 512 byte requests (Size) of which 80% will be reads and 20% will be writes (% Read). 100% of these requests will be random. The same explanation applies to all the entries that follow.
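As a rough illustration, the pattern can be written down as plain data and used to predict the average request size a faithful capture should show. Only the 512-byte/10% and 4 KB/60% rows and the 80% read, 100% random settings are confirmed by this article; the remaining rows are an assumption based on the commonly cited File Server specification and should be checked against your own wrkloads.icf.

```python
SECTOR_BYTES = 512

# (size_bytes, pct_access, pct_read, pct_random)
# Rows other than 512 B and 4 KB are ASSUMED, not taken from this article.
FILE_SERVER_SPEC = [
    (512,   10, 80, 100),
    (1024,   5, 80, 100),
    (2048,   5, 80, 100),
    (4096,  60, 80, 100),
    (8192,   2, 80, 100),
    (16384,  4, 80, 100),
    (32768,  4, 80, 100),
    (65536, 10, 80, 100),
]


def expected_mean_sectors(spec):
    """Access-weighted mean request size, in sectors."""
    total_pct = sum(pct for _, pct, _, _ in spec)
    if total_pct != 100:
        raise ValueError(f"% Access column sums to {total_pct}, expected 100")
    weighted = sum((size // SECTOR_BYTES) * pct for size, pct, _, _ in spec)
    return weighted / 100


mean_sectors = expected_mean_sectors(FILE_SERVER_SPEC)
print(f"{mean_sectors:.2f} sectors")  # 22.16 sectors
```

Under these assumed rows the expected mean is 22.16 sectors, which sits close to the 22.0845-sector average that AnalyzeTrace reports for the capture later in this article; a small gap is what one would expect from a randomly generated request mix.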

Through these settings, Intel attempts to synthetically reproduce the disk accesses of a typical file server. One significant setting, the number of outstanding I/Os (queue depth), is set on another screen:

We went ahead and ran this pattern on an ATA drive (WD's Caviar WD1200BB) for 10 minutes with WinTrace32 running in the background to capture the accesses. WinTrace32's facilities must be manually started and stopped. We attempted to start up the IOMeter accesses immediately after beginning the recording and to stop recording immediately after IOMeter completed the 10-minute trial. Let's see how IOMeter's input variables and WinTrace32's captures stack up:

The first thing that jumps out in this AnalyzeTrace screen capture is the preponderance of 8-sector (512 bytes per sector, thus 4 KB) accesses: 60%. Sure enough, the File Server Access Pattern Spec calls for 60% of accesses to be 4 KB in length. 10% of accesses should be one sector (512 bytes) in length. Again, this is exactly what AnalyzeTrace reports. The exercise checks out perfectly: all of the reported transfer sizes in this trace file match up exactly with what IOMeter requested.

Throughout this IOMeter run, queue depth was kept at a constant 4. And, unsurprisingly, this is exactly what AnalyzeTrace reports.
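One way a queue-depth distribution could be derived from a trace is to walk the OP codes in order, counting a request as outstanding from the moment it is issued until its Done record appears. This is only a sketch of that idea under assumed single-letter OP codes; how AnalyzeTrace actually computes its chart is not documented here.

```python
from collections import Counter


def queue_depth_histogram(ops):
    """Count how often each queue depth is seen at the moment a request is issued.

    `ops` is the sequence of OP codes in trace order:
    "R"/"W" issue a request, "D" marks a completion (assumed encoding).
    """
    outstanding = 0
    histogram = Counter()
    for op in ops:
        if op in ("R", "W"):
            outstanding += 1
            histogram[outstanding] += 1
        elif op == "D":
            outstanding -= 1
        else:
            raise ValueError(f"unknown OP code: {op!r}")
    return histogram


# Made-up sequence: ramp up to 4 outstanding, then one new issue per completion.
ops = ["R", "R", "W", "R", "D", "R", "D", "W", "D", "R"]
hist = queue_depth_histogram(ops)
print(dict(hist))  # {1: 1, 2: 1, 3: 1, 4: 4}
```

In a 10-minute run the brief ramp-up at the start is negligible, so a load generator that refills the queue after every completion would show essentially all requests at a depth of 4.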

According to the IOMeter File Server settings, 80% of all requests should be reads while 20% should be writes. AnalyzeTrace reports that of 48605 total requests, 39043 were reads... or 80.3%. The remaining 9562 requests (19.7%) are writes.
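The percentages follow directly from the two counts AnalyzeTrace reports; a quick check, with variable names of our own choosing:

```python
total_requests = 48605  # total requests reported by AnalyzeTrace
reads = 39043           # read requests reported by AnalyzeTrace

writes = total_requests - reads
read_pct = 100 * reads / total_requests
write_pct = 100 * writes / total_requests

print(writes)               # 9562
print(round(read_pct, 1))   # 80.3
print(round(write_pct, 1))  # 19.7
```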

Since the File Server access pattern calls for 100% random accesses, there should be no requests that were 0 sectors away (and if there are any, it's pure coincidence!). As expected, AnalyzeTrace reports no back-to-back requests. Interestingly, on this chart's scale, seek distances aren't even noticeable until they're at least 256 MB away from the previous access... and only a tiny number meet that minimum! This is a vast contrast to our captured desktop access patterns, in which half the seeks occur within 8 KB of each other. This problem, lack of localization, plagued our now-defunct Workstation access pattern and caused it to miss the mark when assessing desktop performance.
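For readers who want to reason about localization in their own captures, here is a small sketch of a seek-distance calculation. It assumes a seek distance is measured from the sector just past the end of one request to the start of the next, so that a perfectly sequential follow-up request scores 0; AnalyzeTrace's exact definition may differ.

```python
SECTOR_BYTES = 512


def seek_distances(requests):
    """Distance in sectors between the end of each request and the start of the next.

    `requests` is a list of (start_sector, length_sectors) in issue order.
    """
    distances = []
    for (prev_start, prev_len), (next_start, _) in zip(requests, requests[1:]):
        distances.append(abs(next_start - (prev_start + prev_len)))
    return distances


def fraction_within(distances, limit_bytes):
    """Share of seeks that land within `limit_bytes` of the previous access."""
    limit_sectors = limit_bytes // SECTOR_BYTES
    return sum(d <= limit_sectors for d in distances) / len(distances)


# Made-up requests: two nearby accesses followed by one far jump.
requests = [(1000, 8), (1008, 8), (1020, 1), (500000, 8)]
distances = seek_distances(requests)
print(distances)  # [0, 4, 498979]
print(fraction_within(distances, 8 * 1024))
```

A localized desktop trace would push the second figure toward the "half within 8 KB" mark described above, while a fully random pattern would leave it near zero.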

IOMeter's csv output file lists the WD1200BB's average response time in this trial at 49.48 milliseconds. The average value derived from the trace's service time graph displayed above is 49.66 milliseconds, for all intents and purposes identical.

Finally, let's take a quick look at the accesses generated during this trial plotted along the drive's capacity with respect to time. As one would expect, AnalyzeTrace reports an even, random distribution. Incidentally, contrasting this graph with that of the StorageReview.com Office DriveMark 2002 (a trace drawn from actual computer use by yours truly) reveals the localization of single-user accesses.

From the above it's clear that WinTrace32 delivers on its end of the bargain: it succinctly and precisely captures exactly what is sent to it. Combined with AnalyzeTrace, WinTrace32 offers us an excellent view of exactly what kind of disk accesses occur with any given arbitrary system use. But what about RankDisk? Does it replay exactly what it should? Let's take a look at the other half!


RankDisk

Does RankDisk exactingly play back all accesses within a trace file?

There's a surprisingly easy way to test RankDisk's ability to play back exactly that which is sent to it: record RankDisk's replay of a trace file and compare the resulting second-generation trace to the original. To that end, we played back the File Server capture outlined above while WinTrace32 captured the playback, and then compared the resulting 2nd-generation capture against the original trace file.
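The comparison itself can be sketched in a few lines. The idea is to ignore timestamps (which will always differ slightly between runs) and check that the two captures contain exactly the same set of requests. The tuple layout and function names below are hypothetical; in practice we relied on AnalyzeTrace's summaries rather than on a script like this.

```python
from collections import Counter


def request_multiset(trace):
    """Count each (op, start_sector, length) request, ignoring timestamps and Done records."""
    return Counter(
        (op, start, length)
        for (_timestamp, op, start, length) in trace
        if op in ("R", "W")
    )


def diff_traces(original, replay):
    """Return (requests missing from the replay, stray requests only in the replay)."""
    a, b = request_multiset(original), request_multiset(replay)
    return a - b, b - a


# Made-up traces: same requests, slightly different timing.
original = [(0, "R", 1000, 8), (150, "W", 52000, 1), (9000, "D", 1000, 8)]
replay = [(5, "R", 1000, 8), (170, "W", 52000, 1), (9100, "D", 1000, 8)]

missing, stray = diff_traces(original, replay)
print(not missing and not stray)  # True
```

Using a multiset rather than a plain set matters here: a request that legitimately occurs twice in the original must also occur twice in the replay.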

Perhaps most telling are AnalyzeTrace's summaries for the two files:

File Server Capture, Original (left) and 2nd Generation Capture (right)

Both trace files feature identical amounts of reads and writes when it comes to both the number of requests as well as the amount of data transferred. In the "capture of the capture" there is neither a stray access that wasn't in the original nor an access in the original file that did not subsequently occur in the replay.

Let's take a brief blow-by-blow look at some other relevant AnalyzeTrace screens:

Distribution of Transfer Sizes
File Server Capture, Original: Average Value = 22.0845 sectors
2nd Generation Capture: Average Value = 22.0845 sectors

Both captures reflect an exacting reproduction of IOMeter's set parameters in the File Server access pattern.

Distribution of Queue Lengths
File Server Capture, Original (left) and 2nd Generation Capture (right)

In the original trial, IOMeter did its best to keep queue depths at 4. The chart to the left proves that WinTrace32 could record this constant queue depth. The chart to the right proves that RankDisk can play it back perfectly.

Distribution of Seek Distances
File Server Capture, Original (left) and 2nd Generation Capture (right)

No surprises here. Since both the original trace as well as the trace of the original trace feature the exact same sector accesses, both seek distance graphs are identical.

Distribution of Service Times
File Server Capture, Original: Average Value = 49.6615 milliseconds
2nd Generation Capture: Average Value = 49.6423 milliseconds

It's not unreasonable to expect that a given physical actuator will minutely vary in the time it takes to fulfill an access that is logically identical. One cause may be heat: the platters (or actuator) may be slightly expanded, contracted, etc. Even so, the "head-to-tail" (i.e., the amount of time that passes between the start of a given request and its completion) service times turned in by both the first and second-generation captures remain remarkably close to IOMeter's original reported score of 49.48 milliseconds.

Disk Access Sequence
File Server Capture, Original (left) and 2nd Generation Capture (right)

This plot of the location of all requests within the access pattern is, by nature, identical in both traces. Enough said.

RankDisk itself, after completing the playback of a trace, turns out the average service time recorded... though not in a "head-to-tail" fashion. Rather, RankDisk's output is "head-to-head," the average time measured between the start of one request and the start of the next.

Replayed on the same drive that the trace was captured from (the WD1200BB), this File Server pattern yielded a RankDisk score of 12.37 milliseconds. IOMeter, for its part, does not turn in a "head-to-head" response time in milliseconds. It does, however, report the number of I/O operations per second. A look at the IOMeter output file reveals that the WD1200BB completed an average of 80.83501 I/Os per second in the original trial. There are, of course, 1000 milliseconds within one second. One can then convert RankDisk's output to I/Os per second to yield a comparable score (this, in fact, is done in the SR DriveMarks... results are reported in I/Os per second rather than RankDisk's native average milliseconds per request). The result? RankDisk reports that in the File Server access pattern, at a queue depth of 4 I/Os, the WD1200BB performs (1000/12.37) 80.84 I/Os per second... this being the replay of a trace of an original trial in which the WD performed 80.83 I/Os per second! Absolutely amazing!
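The conversion is simple enough to check by hand, but a short sketch makes the comparison explicit. The last step is an extra sanity check that this article does not itself perform: with a constant number of outstanding requests, Little's law ties throughput to the "head-to-tail" response time, so IOMeter's 49.48 ms at a queue depth of 4 should imply roughly the same I/O rate. Function and variable names are our own.

```python
def ios_per_second(ms_per_request):
    """Convert a 'head-to-head' average (ms between request starts) to I/Os per second."""
    return 1000.0 / ms_per_request


rankdisk_ms = 12.37        # RankDisk's reported average, in milliseconds
iometer_iops = 80.83501    # IOMeter's reported I/Os per second
iometer_response_ms = 49.48
queue_depth = 4

replay_iops = ios_per_second(rankdisk_ms)
relative_gap = abs(replay_iops - iometer_iops) / iometer_iops

# Little's law: outstanding requests = throughput x response time.
littles_iops = queue_depth / (iometer_response_ms / 1000.0)

print(round(replay_iops, 2))   # 80.84
print(round(littles_iops, 2))  # 80.84
print(f"{relative_gap:.4%}")
```

The relative gap between the replayed and original I/O rates comes out below one hundredth of a percent.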

RankDisk delivers. It plays back any given captured trace file with incredible precision and reliability.


Conclusion

WinTrace32 records to a capture file only the accesses that make it to the OS host adapter driver. RankDisk perfectly plays back every request found in the capture file. Together, these tools allow for systematic, precise, and accurate playback of any given workload. How precise? Look at these three intercomparable figures:

WD1200BB - IOMeter's File Server Access Pattern, 4 Outstanding I/Os
IOMeter Average Response Time: 49.48 ms
Average Service Time, Trace File of the IOMeter Trial: 49.66 ms
Average Service Time, Trace File of a Trace File of the IOMeter Trial: 49.64 ms

It is plain to see that all attributes of the original disk accesses as generated by IOMeter (a program that allows us to literally define what kind of accesses are to take place) are preserved in both the recording and the playback of the trace file. It is reasonable to assume that if IPEAK can perfectly reproduce a given program's (IOMeter) disk accesses, it can reproduce the disk accesses generated by any other program or set of programs. The StorageReview.com Desktop DriveMarks are just that: exacting RankDisk playback of precise WinTrace32 captures of typical desktop usage.

The ramifications may be startling, but they are indisputable: For desktop usage, an ATA drive such as Western Digital's Caviar WD1000BB-SE delivers performance comparable to or even exceeding that of today's 10,000 RPM SCSI drives. Thus, if you're out to purchase a drive for your non-server machine and you settle on a 10k SCSI drive just because "it's 10,000 RPM" or "it's SCSI," you are cheating yourself out of a good deal of capacity, a good deal of performance, and/or a good deal of saved money!

We recently ran the following poll:

Which new StorageReview.com feature is more significant?
Drive Reliability Database: 76%
Hard Drive Testbed3: 23%

The majority's prevailing theme is that the world already had performance comparisons pre-Testbed3 but that it never had reliability figures. While we're also very proud of the reliability database and while the proprietary filtering and analysis methods used make it sounder than its critics realize, Testbed3's desktop performance assessment improvements should not be overlooked.

The community may have had access to benchmark results on hard drives from both SR as well as many other sites, but we maintain that these results have never come close to the accuracy delivered by IPEAK and Testbed3. In the past, naysayers have been able to (legitimately or otherwise) discard benchmark results that they didn't like as not being applicable to "real world" performance. How? They would raise questions like these:

We don't know which benchmark is most relevant.
We don't know what disk usage looks like.
We don't know what the benchmark is really doing.

And so on...

With Testbed3:

We do know!

In conclusion, IPEAK SPT's precision and accuracy are beyond dispute. If one is to contest the SR Desktop DriveMarks, one must do so on the grounds that the application usage selected for recording is not representative of the majority of users, another topic entirely. What we would bet, however, is that the majority of diehards who refuse to accept that ATA drives have a place in the world, even discounting the cost factor, would be hard-pressed to capture a non-multi-user trace that places SCSI drives as far above ATA drives as they'd like to believe.