Caching is a technique used to buffer differences in performance between
hardware components. It is used throughout the PC: cache is found
within system processors, on motherboards, within hard disks themselves, and in many other places. In
every case, the goal of the cache is the same: to provide a temporary storage area that
allows a faster device to run without having to wait for a slower one.
Most advanced RAID controllers include on-board cache, which in many ways works just
like the cache within a hard disk:
it improves performance to some extent by storing information that was recently used, or
that the controller predicts will be used in the future, so that it can be supplied to the
system at high speed when requested, without having to read from the slow hard disk platters.
Since a RAID controller turns an
array of hard disks into one "virtual hard disk", putting cache on the
controller is a natural enhancement. Typically this is implemented as a slot on the
controller that takes a standard PC memory module; some controllers can take an amount of
cache exceeding the total system memory of most regular PCs! While caching does improve
performance, don't overestimate the benefit of increasing the size of the cache,
just as with cache size in hard disks.
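As a rough illustration of the "recently used" half of that idea, here is a minimal Python sketch of a read cache that keeps the most recently used blocks. The class name `ReadCache`, the dictionary standing in for the array, and the least-recently-used eviction policy are all assumptions made for illustration; actual controller firmware is not described here and may behave quite differently (for example, by also prefetching data).

```python
from collections import OrderedDict


class ReadCache:
    """Toy least-recently-used read cache in front of a slow block device."""

    def __init__(self, backing, capacity):
        self.backing = backing        # dict: block number -> data (stands in for the array)
        self.capacity = capacity      # maximum number of blocks held in cache
        self.blocks = OrderedDict()   # cached blocks, oldest first
        self.hits = 0
        self.misses = 0

    def read(self, block):
        if block in self.blocks:
            # Fast path: serve from cache and mark as most recently used.
            self.hits += 1
            self.blocks.move_to_end(block)
            return self.blocks[block]
        # Slow path: fetch from the (simulated) platters, then cache it.
        self.misses += 1
        data = self.backing[block]
        self.blocks[block] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)   # evict the least recently used block
        return data


array = {n: f"block-{n}" for n in range(8)}
read_cache = ReadCache(array, capacity=2)
for n in (0, 1, 0, 2, 1):
    read_cache.read(n)
print(read_cache.hits, read_cache.misses)
```

In this access pattern only the repeated read of block 0 is served from cache; the others go to the simulated disks. Whether a larger cache helps depends on how often the workload re-reads the same blocks.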
One area where caching can affect performance significantly is write caching,
sometimes also called write-back caching. When it is enabled, the
controller tells the system that a write is complete as soon as the data enters the
controller's cache; the controller then "writes back" the data to the drives at
a later time. As described in detail on the page on
write caching, this improves performance but introduces the risk of data loss or
inconsistency if, for example, the power to the system is cut off before the data in the
cache can be "written back" to the disk platters.
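The behavior just described, and the risk that comes with it, can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the class `WriteBackCache`, its method names, and the `power_loss` helper are hypothetical and exist only for this illustration; a dictionary stands in for the drives.

```python
class WriteBackCache:
    """Toy write-back cache: acknowledge writes first, reach the disks later."""

    def __init__(self, disk):
        self.disk = disk    # dict standing in for the drives
        self.dirty = {}     # block -> data acknowledged but not yet on disk

    def write(self, block, data):
        # The write only enters the cache, yet the caller is told it is done.
        self.dirty[block] = data
        return True

    def flush(self):
        # "Write back" everything still pending; return how many blocks moved.
        written = len(self.dirty)
        self.disk.update(self.dirty)
        self.dirty.clear()
        return written

    def power_loss(self):
        # Without a battery, unflushed data vanishes; report which blocks were lost.
        lost = sorted(self.dirty)
        self.dirty.clear()
        return lost


drives = {}
wb_cache = WriteBackCache(drives)
wb_cache.write(7, "payroll")     # acknowledged immediately
wb_cache.flush()                 # now actually on the drives
wb_cache.write(8, "invoice")     # acknowledged, still only in cache
lost_blocks = wb_cache.power_loss()
print(drives, lost_blocks)
```

Block 8 was reported to the system as written but never reached the drives, which is exactly the inconsistency the text warns about.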
Tip: To avoid potential
problems with power failures and write-back caching, some controllers actually incorporate
a built-in backup battery! This battery will allow any unwritten data in the cache to be
retained for a period of time until the power is restored to the system. A very neat
feature, though if the server is connected to a UPS, as it should be anyway, its value is
debatable.
Write-back caching is so important with RAID because, while writes are only
slightly slower than reads on a regular hard disk, for many RAID levels they are much slower.
The cache insulates the system from
the slowdowns inherent in writing to arrays such as those using RAID 5, and this can
improve performance substantially. The bigger the gap between read and write performance
for a given RAID level, the more that array will benefit from write caching. It is
recommended for high-performance applications using striping with parity (though it will
improve performance somewhat for all RAID levels).
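To see why writes to a parity array cost more than reads, consider the commonly described read-modify-write approach to updating a single block in a RAID 5 stripe: the old data and old parity are read, the new parity is computed with XOR, and then both the new data and new parity are written. The text above does not spell out this procedure, so treat the following Python sketch as an assumption-laden illustration; the function names and the in-memory lists standing in for drives are hypothetical.

```python
from functools import reduce


def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def small_write(data_blocks, parity, index, new_data):
    """Update one data block using read-modify-write.

    Returns (new_parity, io_count), where io_count is the number of
    physical disk operations needed for this one logical write.
    """
    io_count = 0
    old_data = data_blocks[index]
    io_count += 1                 # read old data
    old_parity = parity
    io_count += 1                 # read old parity
    # new parity = old parity XOR old data XOR new data
    new_parity = xor_bytes(xor_bytes(old_parity, old_data), new_data)
    data_blocks[index] = new_data
    io_count += 1                 # write new data
    io_count += 1                 # write new parity
    return new_parity, io_count


stripe = [b"\x01\x02", b"\x10\x20", b"\x0f\x0f"]
stripe_parity = reduce(xor_bytes, stripe)
stripe_parity, ios = small_write(stripe, stripe_parity, 1, b"\xaa\xbb")
print(stripe_parity.hex(), ios)
```

One logical write turns into four disk operations in this model, whereas a read of the same block needs only one. A write-back cache lets the controller acknowledge the write before doing that extra work.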