Long File Names
Until the release of Windows 95, all file names under DOS or Windows 3.x were limited to the standard eight-character file name plus three-character file extension. This restriction forced users to create incredibly cryptic names, and having the situation still like this 15 years after the PC was invented seemed laughable, especially with Microsoft wanting to compare its ease of use to that of the Macintosh. Users want to name their files "Mega Corporation - fourth quarter results.DOC", not "MGCQ4RST.DOC", because the second name will mean zippo to the user a few months after they create it.
Microsoft was determined to bring long file names (LFNs) to Windows 95 much as it had to Windows NT. The latter, however, has a new file system designed from the ground up to allow long file names. Microsoft had a big problem on its hands with Windows 95: it wanted to maintain compatibility with existing disk structures, older versions of DOS and Windows, and older applications. It couldn't just "toss out" everything that came before and start fresh, because doing this would have meant no older programs could read any files that used the new long file names: the existing directory structures on the disk restricted file names to the standard "8.3" size.
What Microsoft needed was a way to implement long file names so that the following goals were all met:
- Windows 95 and applications written for Windows 95 could use file names much longer than 11 total characters.
- The new long file names could be stored on existing DOS volumes using standard directory structures, for compatibility.
- Older pre-Windows-95 software would still be able to access the files that use these new file names, somehow.
The VFAT file system accomplishes these goals, for the most part, as follows. Any file can be given a long file name of up to 255 characters, either under Windows 95 itself or by any program written for Windows 95 (although file names under 100 characters are recommended so that they don't get cumbersome to use). Support for these long file names is also provided by the version of DOS (7.x) that comes with Windows 95. File extensions are maintained, to preserve the way that they are used by software. The long file name is limited to the same characters as standard file names are, except that the following additional characters are allowed: + , ; = [ ].
To allow access by older software, each file that uses a long file name also has a standard file name alias that is automatically assigned to it. This is done by truncating and modifying the file name as follows:
- The long file name's extension (up to three characters after a ".") is transferred to the extension of the alias file name.
- The first six non-space characters of the long file name are analyzed. Any characters that are valid in long file names but not in standard file names (+ , ; = [ and ]) are replaced by underscores. All lower-case letters are converted to upper case. These six characters are stored as the first six characters of the file name.
- The last two characters of the file name are assigned as "~1". If that would cause a conflict because there is already a file with this alias in the directory, then it tries "~2", and so on until it finds a unique alias.
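The three steps above can be sketched in a few lines of Python. This is only an illustration of the rules as described here, not the actual Windows 95 code: the function name `short_alias` and the set `existing_aliases` are made up for the example, and two details are assumptions that the text does not cover (periods inside the name are dropped along with spaces, and the stem is shortened if the numeric tail grows past one digit, so the alias always fits in eight characters).

```python
# Characters legal in long file names but not in 8.3 names (from the text).
LFN_ONLY_CHARS = set("+,;=[]")


def short_alias(long_name, existing_aliases):
    """Return an 8.3-style alias for long_name that is not already taken.

    existing_aliases is a set of aliases already used in the directory
    (a hypothetical stand-in for a real directory scan).
    """
    # Step 1: carry over up to three characters of the extension.
    base, dot, ext = long_name.rpartition(".")
    if not dot:  # no "." at all, so the whole name is the base
        base, ext = long_name, ""
    ext = ext.replace(" ", "")[:3].upper()

    # Step 2: take the non-space characters, replace LFN-only characters
    # with "_" and convert to upper case.
    # Assumption: embedded periods are dropped as well.
    stem = "".join(
        "_" if c in LFN_ONLY_CHARS else c.upper()
        for c in base
        if c not in " ."
    )

    # Step 3: append ~1, ~2, ... until the alias is unique.
    n = 1
    while True:
        tail = f"~{n}"
        # Six stem characters for ~1..~9; fewer for longer tails (assumption).
        name = stem[: 8 - len(tail)] + tail
        alias = f"{name}.{ext}" if ext else name
        if alias not in existing_aliases:
            return alias
        n += 1


taken = set()
first = short_alias("Mega Corporation - third quarter results.DOC", taken)
taken.add(first)
second = short_alias("Mega Corporation - fourth quarter results.DOC", taken)
print(first, second)  # MEGACO~1.DOC MEGACO~2.DOC
```

Passing the set of aliases already in use makes it explicit that the result depends on the directory contents, not just on the long name.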
So to take our example from before, "Mega Corporation - fourth quarter results.DOC" would be stored as shown, but also under the alias "MEGACO~1.DOC". If you had previously saved a file called "Mega Corporation - third quarter results.DOC" in the same directory, then that file would be "MEGACO~1.DOC" and the new one would be "MEGACO~2.DOC". Any older software can reference the file using this short name. Note that using spaces in long file names really doesn't cause any problems, because Windows 95 applications are designed knowing that spaces will be commonly used, and because the short file name alias has the spaces removed.
Long file names are stored in regular directories using the standard directory entries, but using a couple of tricks. The Windows 95 file system creates a standard directory entry for the file, in which it puts the short file name alias. Then, it uses several additional directory entries to hold the rest of the long file name. A single long file name can use many directory entries (since each entry is only 32 bytes in length), and for this reason it is recommended that long file names not be placed in the root directory, where the total number of directory entries is limited.
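To get a feel for how quickly long names eat directory entries, here is a rough calculation in Python. It relies on one detail not stated above: each extra VFAT directory entry holds 13 characters of the long name. The function name `entries_needed` is made up for this sketch.

```python
import math

ENTRY_SIZE = 32           # bytes per directory entry (from the text)
CHARS_PER_LFN_ENTRY = 13  # long-name characters per extra entry (not stated above)


def entries_needed(long_name):
    """Directory entries used by one file: extra LFN entries plus the alias entry."""
    lfn_entries = math.ceil(len(long_name) / CHARS_PER_LFN_ENTRY)
    return lfn_entries + 1  # +1 for the standard entry holding the 8.3 alias


name = "Mega Corporation - fourth quarter results.DOC"
count = entries_needed(name)
print(count, "entries,", count * ENTRY_SIZE, "bytes")  # 5 entries, 160 bytes

# A maximum-length 255-character name needs 20 extra entries plus the alias entry.
print(entries_needed("x" * 255))  # 21
```

So a single file with a descriptive name can occupy five or more slots, which is why a fixed-size root directory fills up much faster than the file count alone would suggest.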
In order to make sure that older versions of DOS don't get confused by this non-standard usage, each of the extra directory entries used to hold long file name information is tagged with the following odd combination of file attributes: read-only, hidden, system and volume label. The objective here is to make sure that no older versions of DOS try to do anything with these long file name entries, and also to make sure they don't try to overwrite these entries because they think they aren't in use. That combination of file attributes causes older software to basically ignore the extra directory entries being used by VFAT.
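As an illustration, this is roughly how a long-file-name-aware program could recognize the extra entries. The bit values and the position of the attribute byte (offset 11 in the 32-byte entry) come from the standard FAT directory entry layout rather than from the text above, and the helper name `is_lfn_entry` is made up for this sketch.

```python
# Standard FAT attribute bits (values from the FAT directory entry layout).
ATTR_READ_ONLY = 0x01
ATTR_HIDDEN = 0x02
ATTR_SYSTEM = 0x04
ATTR_VOLUME_LABEL = 0x08

# The "odd combination" that tags an extra long-file-name entry.
ATTR_LFN = ATTR_READ_ONLY | ATTR_HIDDEN | ATTR_SYSTEM | ATTR_VOLUME_LABEL  # 0x0F

ATTR_OFFSET = 11  # position of the attribute byte within a 32-byte entry


def is_lfn_entry(entry):
    """Return True if a raw 32-byte directory entry is tagged as an LFN entry."""
    if len(entry) != 32:
        raise ValueError("a FAT directory entry is exactly 32 bytes")
    return entry[ATTR_OFFSET] == ATTR_LFN


# Two hand-built entries: a normal file with only the archive bit (0x20) set,
# and an entry carrying the LFN attribute combination.
normal_entry = b"MEGACO~1DOC" + bytes([0x20]) + bytes(20)
lfn_entry = bytes(11) + bytes([ATTR_LFN]) + bytes(20)

print(is_lfn_entry(normal_entry))  # False
print(is_lfn_entry(lfn_entry))     # True
```

No ordinary file or real volume label would carry all four bits at once, which is what makes the combination usable as a marker.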
While long file names are a great idea and improve the usability of Windows 95, Microsoft's streeeeeetch to keep them compatible with old software kind of shows. Basically, the implementation is a hack built on top of the standard FAT file system, and there are numerous problems that you should be aware of when using LFNs:
- Compatibility Problems with Older Utilities: While marking its extra entries as read-only, hidden, system and volume label will trick standard applications into leaving these entries alone, a disk utility program like Norton Disk Doctor will not be fooled. If you use a DOS version of such a utility that is not aware of long file names, it will detect these entries as errors on your disk and happily "correct" them for you, and that's that for your long file names. Utilities run under Windows 95 must be aware of long file names to work properly.
- "Loss" of Long File Names with Older Software: While older apps will work with the long file names using the short name alias, they have no ability to access the long file name at all. It is easy for one of these applications to "drop" the long file name by accident. A common cause of frustration is that if you use older DOS backup software that doesn't know about long file names, it will save only the alias, and if you have a crash and need to restore, the long file names will be lost.
- Problems with Conflicting Alias File Names: There are two problems with the long file name aliasing scheme. The first is that the alias is not permanently linked to the long file name, and can change. Let's suppose we take our "Mega Corporation - fourth quarter results.DOC" and save it in a new empty directory. It will be assigned the alias "MEGACO~1.DOC". Now let's say we copy it to a directory that already has the file "Mega Corporation - Water Cooler Policy.DOC" in it, which is using the same "MEGACO~1.DOC" alias. When we do this, the alias for the fourth quarter results file will magically change (well, the operating system does it) to "MEGACO~2.DOC". This can confuse some people who refer to the file both by the long name and the alias.
The second problem is more serious. Let's replay this same scenario using an older file copy application that isn't aware of long file names. Since all it sees is "MEGACO~1.DOC" in two different places, it thinks they are the same file, and will overwrite the water cooler memo with the fourth quarter results when you do the copy! If you are lucky it will ask "Are you sure?" first; otherwise...
- Problems with Short File Name Realiasing: Copying a file with a long file name from one partition to another, or restoring one from a backup, can cause the short file name alias associated with the long file name to be changed. This can cause spurious behavior when hard-coded references to the short file name no longer resolve to the correct target. Note that despite the fact that Windows NT's NTFS file system was rebuilt from the ground up, it has this problem as well because the system aliases long file names for limited backward compatibility.
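The overwrite scenario from the alias conflict item can be mimicked with a toy model in Python. Each directory is just a dictionary mapping an alias to a description of the file's contents, and `naive_copy` is a made-up stand-in for an older copy program that sees only the 8.3 names; none of this reflects how any real tool is written.

```python
def naive_copy(src_dir, dst_dir, alias):
    """Copy one file the way an LFN-unaware tool would: match on the alias only.

    Returns whatever was overwritten in the destination, or None.
    """
    overwritten = dst_dir.get(alias)
    dst_dir[alias] = src_dir[alias]
    return overwritten


# Two different files that happen to share an alias in separate directories.
source = {"MEGACO~1.DOC": "fourth quarter results"}
target = {"MEGACO~1.DOC": "water cooler policy"}

lost = naive_copy(source, target, "MEGACO~1.DOC")
print("overwritten:", lost)                        # overwritten: water cooler policy
print("now in target:", target["MEGACO~1.DOC"])    # now in target: fourth quarter results
```

A copy routine that is aware of long file names would compare the long names, see two different files, and give the incoming file a fresh alias instead.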
Overall, long file names are a useful advance in the usability of the FAT file system, but you need to be aware of problems when using them, especially with older software.