I consider the external data transfer rate of the interface to be one of the most
overrated hard disk performance specifications ever. The reason is that there are no hard
disks that can read or write data fast enough to max out modern interfaces. They can only
send at the interface's maximum speed for short bursts.
It doesn't really matter much if the drive is running on an Ultra ATA/100 interface
capable of 100 MB/s theoretical throughput if the drive can only stream from the platters
at 40 MB/s. The only data sent anywhere near 100 MB/s will be the small amounts that
happen to be in the drive's cache. The same applies to a single drive on a high-speed SCSI
bus.
However, when using large numbers of drives with RAID, things become a bit
different. Well, under IDE/ATA they aren't different, because an IDE/ATA channel can only
handle transfers from one drive at a time. With SCSI though, it's a very different story. On
the SCSI bus, multiple drives can be transferring data simultaneously. This means that if
you put a bunch of drives on a single SCSI channel and want to run them all at their
maximum potential, you do have to be careful to watch the maximum throughput of the bus.
As usual, an example is much easier than trying to explain it without one. Let's take
the Quantum Atlas 10K II. This drive has a maximum sustained
transfer rate of 40 MB/s. For ideal performance, we want to make sure that the
interface can supply at least that much bandwidth. Now if we put a single one of these
drives on an Ultra160 SCSI bus, we obviously have no problems; the theoretical maximum
speed of the bus is 160 MB/s (though actual will be below that due to overhead
considerations). But what if we want to make a four-drive RAID 0 array for high-speed
multimedia editing? In that case we do care about the speed of the bus: we're
going to use all of it when dealing with large files, because all four drives will be
streaming data simultaneously, and 4 x 40 MB/s is 160 MB/s! In fact, we'll be slightly below theoretical maximum
potential speed because of overhead, but it's probably close enough, especially since that
STR figure is only that high for the outermost
tracks of the drives, where the number of sectors per track is at its maximum.
But what if we decide we want to create a larger array, say, an eight-drive
array? Then we have a problem. Even if we use the average STR figure of those
drives, 32 MB/s, we need 8 x 32 = 256 MB/s, far in excess of what Ultra160 can provide. To avoid
this problem, higher-end SCSI RAID controllers provide support for multiple channels.
Essentially, the RAID controller has not one SCSI bus with which to communicate with the
drives in the array, but two or more. For example, some cards have four channels. Each of
these is capable of handling 160 MB/s in theory, yielding a whopping theoretical bandwidth
of 640 MB/s. That's obviously more than enough to handle our eight-drive array; we just
put two drives on each of the four channels and we are in business; we even have room to
substantially expand the array in the future, or add a second array if need be. The use of
multiple channels also improves performance by cutting down on contention for the SCSI
bus. Of course you don't get four channels on a RAID controller for free; these
multi-channel controller cards aren't cheap.
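The arithmetic in this example can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function names are made up for this sketch, overhead and bus contention are ignored, and every drive is assumed to stream at the same rate at the same time.

```python
import math

def array_demand_mb_s(drive_count, str_per_drive_mb_s):
    """Aggregate streaming demand of the array in MB/s (overhead ignored)."""
    return drive_count * str_per_drive_mb_s

def channels_needed(demand_mb_s, channel_mb_s=160):
    """Smallest number of channels whose theoretical bandwidth covers the demand.

    The default of 160 MB/s is the Ultra160 theoretical maximum used in the text.
    """
    return math.ceil(demand_mb_s / channel_mb_s)

# Four drives at their 40 MB/s maximum STR, and eight drives at the 32 MB/s average.
four_drive = array_demand_mb_s(4, 40)
eight_drive = array_demand_mb_s(8, 32)

print(four_drive, channels_needed(four_drive))
print(eight_drive, channels_needed(eight_drive))
```

By this count the eight-drive array needs at least two Ultra160 channels on paper. Spreading the drives over four channels, as described above, leaves headroom for overhead and for future expansion.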
Another issue when dealing with very high transfer rates is the bandwidth of the system
bus that the controller card plugs into. The standard PCI bus as implemented in regular
PCs--which seemed to have so much bandwidth five years ago--is 32 bits wide and runs at
33 MHz, providing a total maximum theoretical bandwidth
of about 127 MB/s, not nearly enough to handle multiple-channel SCSI RAID. For this
reason, high-end cards with multiple channels often use the enhanced 64-bit, 66 MHz PCI
bus. This version of PCI has a theoretical bandwidth of over 500 MB/s, but it of course
requires a server motherboard that has a matching high-speed PCI slot. Again, not cheap.
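For readers who want to see where the PCI numbers come from, here is a small Python sketch of the calculation. It assumes the nominal PCI clocks of 33.33 and 66.66 MHz and one transfer per clock, and it reports the result in binary megabytes (2^20 bytes), which is the unit that gives the 127 MB/s figure. The function names are invented for this sketch.

```python
def bus_bytes_per_second(width_bits, clock_mhz):
    """Peak theoretical transfer rate: bus width in bytes times clock in Hz."""
    return (width_bits / 8) * clock_mhz * 1_000_000

def to_binary_mb(byte_count):
    """Convert bytes to binary megabytes (1 MB = 2**20 bytes)."""
    return byte_count / 2**20

# Nominal PCI clocks are 33.33 MHz and 66.66 MHz (assumed here as 100/3 and 200/3).
standard_pci = to_binary_mb(bus_bytes_per_second(32, 100 / 3))
enhanced_pci = to_binary_mb(bus_bytes_per_second(64, 200 / 3))

# Theoretical total for a four-channel Ultra160 controller, from the text.
four_channel_scsi = 4 * 160

print(round(standard_pci), round(enhanced_pci), four_channel_scsi)
```

The standard slot works out to roughly 127 MB/s and the 64-bit, 66 MHz slot to a little over 500 MB/s. Note that even the faster slot falls short of the 640 MB/s that four Ultra160 channels could move in theory, so the PCI bus can still be the limiting factor for a fully loaded card.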