The file allocation table (FAT) is used to keep track of
which clusters are assigned to each file. The operating system (and hence any software
applications) can determine where a file's data is located by using the directory entry
for the file and file allocation table entries. Similarly, the FAT also keeps track of
which clusters are open and available for use. When an application needs to create (or
extend) a file, it requests more clusters from the operating system, which finds them in
the file allocation table.
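
As a rough illustration of that request, the sketch below extends a file by one cluster. It is not how any particular operating system does it: the list-based table, the `extend_file` helper, and the codes 0 (unused) and 65,535 (last cluster), written here as 0x0000 and 0xFFFF, are assumptions chosen to match a 16-bit FAT.

```python
FREE = 0x0000          # assumed code for an unused cluster
END_OF_CHAIN = 0xFFFF  # assumed code for the last cluster of a file

def extend_file(fat, last_cluster):
    """Link one free cluster onto the end of a file and return its number."""
    # Entries 0 and 1 are reserved on FAT volumes, so the scan starts at 2.
    for cluster in range(2, len(fat)):
        if fat[cluster] == FREE:
            fat[cluster] = END_OF_CHAIN   # the new cluster becomes the end of the file
            fat[last_cluster] = cluster   # the old last cluster now points to it
            return cluster
    raise OSError("no free clusters left on the volume")

# Toy 8-entry table: one file occupies clusters 2 -> 3, cluster 4 belongs to another file.
fat = [0xFFF8, 0xFFFF, 3, END_OF_CHAIN, END_OF_CHAIN, FREE, FREE, FREE]
new_cluster = extend_file(fat, last_cluster=3)
print(new_cluster, fat[3], hex(fat[new_cluster]))
```

A real implementation would usually remember where it last found free space instead of rescanning from the start each time; the linear scan is only the simplest thing that shows the idea.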
There is an entry in the file allocation table for every cluster on the disk volume,
whether or not the cluster is currently in use. Each entry contains a value that represents
how the cluster is being used, with a different code for each status a cluster can have.
For every cluster that is in use by a file, the cluster's FAT entry holds the number of
the next cluster that the file is using. That next cluster's entry, in turn, holds the
number of the cluster after it, and so on. The entry for the last cluster used
by the file holds a special code that tells the system that it is the last
cluster of the file; with 16-bit FAT entries this is often a number like 65,535 (16 ones
in binary format). Since the clusters are linked one to the next in this manner, they are
said to be chained.
Every file (that uses more than one cluster) is chained in this manner. See the example
that follows for more clarification.
In addition to a cluster number or an end-of-file marker, a cluster's entry can contain
other special codes to indicate its status. A special code, usually zero, is put in the
FAT entry of every open (unused) cluster. This tells the operating system which clusters
are available for assignment to files that need more storage space. Another code is used
to indicate "bad" clusters. These are clusters where a disk utility (or the
user) has previously detected one or more unreliable sectors, due to disk defects. These
clusters are marked as bad so that no future attempts will be made to use them.
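
To make these codes concrete, here is a small sketch that sorts a FAT entry into the categories just described. It assumes the value ranges conventionally used with 16-bit FAT entries; the function name `classify_fat16_entry` is made up for this illustration.

```python
def classify_fat16_entry(value):
    """Describe what a 16-bit FAT entry says about its cluster (assumed FAT16 ranges)."""
    if value == 0x0000:
        return "free"            # open cluster, available for assignment
    if value == 0xFFF7:
        return "bad"             # marked unreliable, never to be used
    if 0xFFF8 <= value <= 0xFFFF:
        return "end of chain"    # last cluster of a file (65,535 is 0xFFFF)
    if 0x0002 <= value <= 0xFFEF:
        return "next cluster"    # number of the next cluster in the file
    return "reserved"            # remaining values are not used for chaining

for value in (0, 12721, 0xFFF7, 65535):
    print(value, "->", classify_fat16_entry(value))
```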
Accessing the entire length of a file is done by using a combination of the file's
directory entry and its cluster entries in the FAT. This is confusing to describe, so
let's look at an example. Let's consider a disk volume that uses 4,096 byte clusters, and
a file in the C:\DATA directory called "PCGUIDE.html" that is 20,000 bytes in
size. This file is going to require 5 clusters of storage (because 20,000 divided by 4,096
is around 4.88, and a partly filled cluster still has to be allocated whole).
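
The arithmetic can be checked in a few lines. The variable names are only for illustration; the numbers are the ones from the example.

```python
import math

cluster_size = 4096   # bytes per cluster on the example volume
file_size = 20000     # size of PCGUIDE.html in bytes

# A file always occupies a whole number of clusters, so round up.
clusters_needed = math.ceil(file_size / cluster_size)
allocated_bytes = clusters_needed * cluster_size
unused_bytes = allocated_bytes - file_size   # space left over in the last cluster

print(clusters_needed, allocated_bytes, unused_bytes)
```

So the file occupies 20,480 bytes on the disk, and the last 480 bytes of its fifth cluster go unused.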
OK, so we have this file on the disk, and let's say we want to open it up to edit it.
We open our editor and ask for the file to be opened. To find the cluster on the disk
containing the first part of the file, the system just looks at the file's directory entry
to find the starting cluster number for the file; let's suppose it goes there and sees the
number 12,720. The system then knows to go to cluster number 12,720 on the disk to load the
first part of the file.
To find the second cluster used by this file, the system looks at the FAT entry for
cluster 12,720. There, it will find another number, which is the next cluster used by the
file. Let's say this is 12,721. So the next part of the file is loaded from cluster
12,721, and the FAT entry for 12,721 is examined to find the next cluster used by the
file. This continues until the last cluster used by the file is reached. When the system
checks that cluster's FAT entry for the number of the next cluster, instead of finding a
valid cluster number it finds a special number like 65,535 (special because it is the
largest number you can store in 16 bits). This is the signal to the system that
"there are no more clusters in this file". Then it knows it has retrieved the
entire file.
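
The walk just described can be sketched as a short loop. Only clusters 12,720 and 12,721 come from the example; the other three cluster numbers, the dictionary standing in for the FAT, and the `cluster_chain` helper are hypothetical, and 65,535 is assumed to be the only end-of-file marker.

```python
END_OF_CHAIN = 65535  # assumed end-of-file marker, as in the example

def cluster_chain(fat, start_cluster):
    """Return the list of clusters a file uses, in order, starting from its first one."""
    chain = []
    cluster = start_cluster
    while True:
        chain.append(cluster)
        if len(chain) > len(fat):
            # A damaged table could link a chain back onto itself.
            raise ValueError("cluster chain is longer than the table; FAT looks corrupt")
        next_entry = fat[cluster]      # look up this cluster's FAT entry
        if next_entry == END_OF_CHAIN:
            return chain               # no more clusters in this file
        cluster = next_entry

# The starting cluster (12,720) would come from the file's directory entry.
fat = {12720: 12721, 12721: 12722, 12722: 12723, 12723: 12724, 12724: END_OF_CHAIN}
chain = cluster_chain(fat, 12720)
print(chain)
```

The result has five clusters, matching the five that the 20,000 byte file was calculated to need.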
Since every cluster is chained to the next one using a number, it isn't necessary for
the entire file to be stored in one continuous block on the disk. In fact, pieces of the
file can be located anywhere on the disk, and can even be moved after the file has been
created. Following these chains of clusters on the disk is done invisibly by the operating
system so that to the user, each file appears to be in one continuous chunk of disk space.
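
One way to picture this is to ask where a given byte of the file actually lives. The sketch below assumes the same 4,096 byte clusters and a deliberately scattered, hypothetical chain for the example file (12720, 12721, 300, 301, 45000); the `locate_byte` helper is made up for illustration.

```python
CLUSTER_SIZE = 4096
END_OF_CHAIN = 65535  # assumed end-of-file marker

def locate_byte(fat, start_cluster, offset):
    """Return (cluster number, offset inside that cluster) for a byte offset in a file."""
    cluster = start_cluster
    # Each whole cluster that precedes the offset means following one more link.
    for _ in range(offset // CLUSTER_SIZE):
        cluster = fat[cluster]
        if cluster == END_OF_CHAIN:
            raise ValueError("offset lies beyond the clusters allocated to the file")
    return cluster, offset % CLUSTER_SIZE

# Hypothetical fragmented layout: the file's pieces sit in three separate places.
fat = {12720: 12721, 12721: 300, 300: 301, 301: 45000, 45000: END_OF_CHAIN}
print(locate_byte(fat, 12720, 10000))
print(locate_byte(fat, 12720, 19999))
```

The caller only ever supplies a position within the file. Which cluster that position maps to depends entirely on the chain, which is why the pieces can sit anywhere on the disk without the application noticing. Note that this sketch checks only the chain, not the file size recorded in the directory entry.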