8.3 filename

An 8.3 filename [1] (also called a short filename or SFN) is a filename convention used by old versions of DOS and versions of Microsoft Windows prior to Windows 95 and Windows NT 3.5. It is also used in modern Microsoft operating systems as an alternate filename to the long filename, to provide compatibility with legacy programs. The filename convention is limited by the FAT file system. Similar 8.3 file naming schemes have also existed on earlier CP/M, TRS-80, Atari, and some Data General and Digital Equipment Corporation minicomputer operating systems.

Overview

8.3 filenames are limited to at most eight characters (after any directory specifier), followed optionally by a filename extension consisting of a period . and at most three further characters. For systems that only support 8.3 filenames, excess characters are ignored. If a file name has no extension, a trailing . has no significance (that is, myfile and myfile. are equivalent). Furthermore, file and directory names are uppercase in this system, even though systems that use the 8.3 standard are usually case-insensitive (making CamelCap.tpu equivalent to the name CAMELCAP.TPU). However, on non-8.3 operating systems (such as almost any modern operating system) accessing 8.3 file systems (including DOS-formatted floppy disks, but also including some modern memory cards and networked file systems), the underlying system may alter filenames internally to preserve case and avoid truncating letters in the names, for example in the case of VFAT.
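
The folding described above can be illustrated with a minimal Python sketch. The function name normalize_83 is hypothetical, and the sketch assumes the input contains at most one period and only characters that are legal in an 8.3 name; it illustrates the rules rather than the behavior of any particular DOS version.

```python
def normalize_83(name: str) -> str:
    """Fold a name the way a pure 8.3 system is described as treating it.

    Assumption: `name` has at most one period and no illegal characters.
    """
    base, _, ext = name.partition(".")
    base = base[:8].upper()   # excess basename characters are ignored
    ext = ext[:3].upper()     # excess extension characters are ignored
    # A trailing period with no extension has no significance.
    return f"{base}.{ext}" if ext else base


print(normalize_83("CamelCap.tpu"))       # CAMELCAP.TPU
print(normalize_83("myfile."))            # MYFILE
print(normalize_83("longfilename.text"))  # LONGFILE.TEX
```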

VFAT and computer-generated 8.3 filenames

VFAT, a variant of FAT with an extended directory format, was introduced in Windows 95 and Windows NT 3.5. It allowed mixed-case Unicode long filenames (LFNs) in addition to classic 8.3 names by using multiple 32-byte directory entry records for long filenames (in such a way that old 8.3 system software will only recognize one as the valid directory entry).

To maintain backward compatibility with legacy applications (on DOS and Windows 3.1), on FAT and VFAT filesystems the operating system automatically generates an 8.3 filename for every LFN, through which the file can also be opened, renamed, or deleted, although the generated name (e.g. OVI3KV~N) may show little similarity to the original. The generation of 8.3 filenames on NTFS filesystems can be turned off system-wide (with risks). [2] The 8.3 filename can be obtained using the Kernel32.dll function GetShortPathName. [3] [4]
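
As a rough illustration, GetShortPathName can be called from Python through ctypes. The wrapper name short_path_name is hypothetical, the sketch uses the wide-character variant GetShortPathNameW, and it returns None on non-Windows systems or when the call fails. On volumes where short-name generation is disabled, the returned path may simply be the long path.

```python
import ctypes
import sys
from typing import Optional


def short_path_name(long_path: str) -> Optional[str]:
    """Return the short (8.3) form of an existing path, or None if unavailable."""
    if sys.platform != "win32":
        return None  # GetShortPathNameW is part of the Windows API only
    get_short = ctypes.windll.kernel32.GetShortPathNameW
    get_short.argtypes = [ctypes.c_wchar_p, ctypes.c_wchar_p, ctypes.c_uint32]
    get_short.restype = ctypes.c_uint32
    size = 260
    while True:
        buffer = ctypes.create_unicode_buffer(size)
        result = get_short(long_path, buffer, size)
        if result == 0:
            return None          # the call failed, e.g. the path does not exist
        if result < size:
            return buffer.value  # success: result is the length without the NUL
        size = result            # buffer too small: result is the required size


if __name__ == "__main__":
    print(short_path_name(r"C:\Program Files"))
```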

Although there is no compulsory algorithm for creating the 8.3 name from an LFN, Windows uses the following convention: [5]

  1. If the LFN is 8.3 uppercase, no LFN will be stored on disk at all.
    • Example: TEXTFILE.TXT
  2. If the LFN is 8.3 mixed case, the LFN will store the mixed-case name, while the 8.3 name will be an uppercase version of it.
    • Example: TextFile.Txt becomes TEXTFILE.TXT.
  3. If the filename contains characters not allowed in an 8.3 name (including space which was disallowed by convention though not by the APIs) or either part is too long, the name is stripped of invalid characters such as spaces and extra periods. If the name begins with periods . the leading periods are removed. Other characters such as + are changed to the underscore _, and letters are put in uppercase. The stripped name is then truncated to the first 6 letters of its basename, followed by a tilde, followed by a single digit, followed by a period ., followed by the first 3 characters of the extension.
    • Example: TextFile.Mine.txt becomes TEXTFI~1.TXT (or TEXTFI~2.TXT, should TEXTFI~1.TXT already exist). ver +1.2.text becomes VER_12~1.TEX. .bashrc.swp becomes BASHRC~1.SWP.
  4. On all NT versions including Windows 2000 and later, if at least 4 files or folders already exist with the same extension and first 6 characters in their short names, the stripped LFN is instead truncated to the first 2 letters of the basename (or 1 if the basename has only 1 letter), followed by 4 hexadecimal digits derived from an undocumented hash of the filename, followed by a tilde, followed by a single digit, followed by a period ., followed by the first 3 characters of the extension. [6]
    • Example: TextFile.Mine.txt becomes TE021F~1.TXT.
  5. On Windows 95, 98 and ME, if more than 9 files or folders already exist with the same extension and first 6 characters in their short names (so that the ~1 through ~9 suffixes are not enough to resolve the collision), the name is further truncated to 5 letters, followed by a tilde, followed by two digits starting from 10, followed by a period . and the first 3 characters of the extension.
    • Example: TextFile.Mine.txt becomes TEXTF~10.TXT if TEXTFI~1.TXT through TEXTFI~9.TXT all exist already.
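
The convention above can be sketched in Python as follows. This is a simplified illustration, not Windows' actual implementation: the function name make_short_name and the existing parameter (the set of short names already present in the directory) are hypothetical, the set of characters replaced by underscores is an assumption limited to + , ; = [ ], and the hash-based scheme of rule 4 is not modeled because the hash is undocumented. Collisions are resolved with the numeric tails of rules 3 and 5.

```python
from typing import AbstractSet

# Assumption: characters that are turned into underscores.
REPLACED = set("+,;=[]")


def _clean(part: str) -> str:
    """Drop periods, replace disallowed characters, and uppercase."""
    part = part.replace(".", "")
    return "".join("_" if c in REPLACED else c for c in part).upper()


def make_short_name(long_name: str, existing: AbstractSet[str] = frozenset()) -> str:
    """Derive an 8.3 name from a long filename (rules 1-3 and 5 only)."""
    stripped = long_name.replace(" ", "").lstrip(".")
    if "." in stripped:
        base, ext = stripped.rsplit(".", 1)   # only the last period counts
    else:
        base, ext = stripped, ""
    base, ext = _clean(base), _clean(ext)

    candidate = f"{base}.{ext}" if ext else base
    fits = len(base) <= 8 and len(ext) <= 3
    if fits and candidate == long_name.upper():
        return candidate                       # rules 1 and 2: only case differs

    suffix = f".{ext[:3]}" if ext else ""
    for n in range(1, 100):
        tail = f"~{n}"
        # ~1..~9 leave 6 basename characters, ~10..~99 leave 5.
        short = base[: 8 - len(tail)] + tail + suffix
        if short not in existing:
            return short
    raise ValueError("no free short name in this sketch's range")


print(make_short_name("TextFile.Mine.txt"))  # TEXTFI~1.TXT
print(make_short_name("ver +1.2.text"))      # VER_12~1.TEX
print(make_short_name(".bashrc.swp"))        # BASHRC~1.SWP
```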

During the 2001 Microsoft antitrust case, the names MICROS~1 and MICROS~2 were humorously used to refer to the companies that might exist after a proposed split of Microsoft. [7]

NTFS, a file system used by the Windows NT family, supports LFNs natively, but 8.3 names are still available for legacy applications. Their creation can optionally be disabled to improve performance in situations where large numbers of similarly named files exist in the same folder. [2] 8.3 filename creation can be disabled system-wide and/or per-volume, and existing 8.3 filenames can be stripped using fsutil or a Registry key. [8]

The ISO 9660 file system (mainly used on compact discs) has similar limitations at the most basic Level 1, with the additional restriction that directory names cannot contain extensions and that some characters (notably hyphens) are not allowed in filenames. Level 2 allows filenames of up to 31 characters, more compatible with classic AmigaOS and classic Mac OS filenames.

On VFAT and NTFS file systems, 8.3 filenames are stored as ANSI encoding, for backward-compatibility.

The ReFS file system no longer supports 8.3 filenames.

Compatibility

This legacy technology is used in a wide range of products and devices as a standard for interchanging information, for example on the CompactFlash cards used in cameras. VFAT long filenames as introduced by Windows 95/98/ME retained this compatibility, but the VFAT implementation on NT-based systems (Windows NT/2000/XP) uses a modified 8.3 short name.

If a filename contains only lowercase letters, or combines a lowercase basename with an uppercase extension or vice versa, has no special characters, and fits within the 8.3 limits, Windows NT and later versions such as XP do not create a VFAT long filename entry. Instead, two bits in byte 0x0c of the directory entry indicate that the filename should be considered entirely or partially lowercase. Specifically, bit 4 means lowercase extension and bit 3 means lowercase basename, which allows for combinations such as example.TXT or HELLO.txt but not Mixed.txt. Few other operating systems support this. It creates a backward-compatibility filename mangling problem with older Windows versions (95, 98, ME), which see all-uppercase filenames if this extension has been used and can therefore change the capitalization of a file when it is transported, for example on a USB flash drive. This can cause problems for operating systems that, unlike DOS and Windows, treat filenames as case-sensitive. Linux will recognize this extension when reading; [9] the mount option shortname determines whether this feature is used when writing. [10] Under MS-DOS, Henrik Haftmann's DOSLFN can be used. [11]
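
The two case bits can be illustrated with a short Python sketch. The names LOWER_BASE, LOWER_EXT, display_name and case_flags_for are hypothetical; the bit positions (bit 3 for the basename, bit 4 for the extension) are taken from the description above.

```python
from typing import Optional

LOWER_BASE = 0x08  # bit 3 of byte 0x0c: basename is lowercase
LOWER_EXT = 0x10   # bit 4 of byte 0x0c: extension is lowercase


def display_name(base: str, ext: str, case_flags: int) -> str:
    """Apply the case bits to an uppercase 8.3 name stored on disk."""
    if case_flags & LOWER_BASE:
        base = base.lower()
    if case_flags & LOWER_EXT:
        ext = ext.lower()
    return f"{base}.{ext}" if ext else base


def case_flags_for(name: str) -> Optional[int]:
    """Return the flags that encode the name's case, or None if an LFN is needed."""
    base, _, ext = name.partition(".")
    flags = 0
    for part, bit in ((base, LOWER_BASE), (ext, LOWER_EXT)):
        if part == part.upper():
            continue            # already uppercase: no flag required
        if part == part.lower():
            flags |= bit        # uniformly lowercase: one bit suffices
        else:
            return None         # mixed case within a part cannot be encoded
    return flags


print(display_name("EXAMPLE", "TXT", LOWER_BASE))  # example.TXT
print(case_flags_for("HELLO.txt") == LOWER_EXT)    # True
print(case_flags_for("Mixed.txt"))                 # None
```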

Directory table

A directory table is a special type of file that represents a directory. Each file or directory stored within it is represented by a 32-byte entry in the table. Each entry records the name, extension, attributes (archive, directory, hidden, read-only, system and volume), the date and time of creation, the address of the first cluster of the file/directory's data and finally the size of the file/directory.

Legal characters for DOS filenames include the following:

This excludes the following ASCII characters:

DOS filenames are stored in the OEM character set. The value 0xE5 as the first byte (see below) causes trouble when non-ASCII characters are used.

Directory entries, both in the Root Directory Region and in subdirectories, are of the following format:

Byte offset   Length   Description
0x00          8        DOS filename (padded with spaces); see the special first-byte values below
0x08          3        DOS file extension (padded with spaces, may be empty)
0x0b          1        File attributes; see the attribute bits below
0x0c          1        Reserved; two bits are used by NT and later versions to encode case information
0x0d          1        Create time, fine resolution: 10 ms units, values from 0 to 199
0x0e          2        Create time; see the time encoding below
0x10          2        Create date; see the date encoding below
0x12          2        Last access date; see offset 0x10 for description
0x14          2        EA-Index (used by OS/2 and NT) in FAT12 and FAT16; high 2 bytes of first cluster number in FAT32
0x16          2        Last modified time; see offset 0x0e for description
0x18          2        Last modified date; see offset 0x10 for description
0x1a          2        First cluster in FAT12 and FAT16; low 2 bytes of first cluster in FAT32
0x1c          4        File size

The first byte of the filename (offset 0x00) can have the following special values:

Value   Description
0x00    Entry is available and no subsequent entry is in use
0x05    Initial character is actually 0xE5
0x2E    Dot entry: either . or ..
0xE5    Entry has been previously erased. File undelete utilities must replace this character with a regular character as part of the undeletion process.

The file attributes byte (offset 0x0b) is a bit field:

Bit   Mask   Description
0     0x01   Read Only
1     0x02   Hidden
2     0x04   System
3     0x08   Volume Label
4     0x10   Subdirectory
5     0x20   Archive
6     0x40   Device (internal use only, never found on disk)
7     0x80   Unused

An attribute value of 0x0F is used to designate a long filename entry.

The create time (offset 0x0e) encodes the hour, minute and second according to the following bitmap:

Bits    Description
15–11   Hours (0–23)
10–5    Minutes (0–59)
4–0     Seconds/2 (0–29)

Note that the seconds are recorded only to a 2-second resolution. Finer resolution for file creation is found at offset 0x0d.

The create date (offset 0x10) encodes the year, month, and day according to the following bitmap:

Bits   Description
15–9   Year (0 = 1980, 127 = 2107)
8–5    Month (1 = January, 12 = December)
4–0    Day (1–31)
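
The 32-byte layout above can be decoded with Python's struct module. The following is a minimal sketch: the names parse_entry, decode_timestamp and ATTRIBUTE_BITS are hypothetical, the OEM character set is assumed to be code page 437, and long filename entries are only recognized, not reassembled. Entries with zeroed date fields would need extra handling, since they do not form a valid calendar date.

```python
import struct
from datetime import datetime

ATTRIBUTE_BITS = {
    0x01: "read-only", 0x02: "hidden", 0x04: "system",
    0x08: "volume label", 0x10: "subdirectory", 0x20: "archive",
}


def decode_timestamp(date: int, time: int, fine: int = 0) -> datetime:
    """Combine the 16-bit date and time words; fine is in 10 ms units (0-199)."""
    return datetime(
        1980 + (date >> 9),               # bits 15-9: years since 1980
        (date >> 5) & 0x0F,               # bits 8-5: month
        date & 0x1F,                      # bits 4-0: day
        time >> 11,                       # bits 15-11: hours
        (time >> 5) & 0x3F,               # bits 10-5: minutes
        (time & 0x1F) * 2 + fine // 100,  # bits 4-0 hold seconds / 2
        (fine % 100) * 10_000,            # leftover 10 ms units as microseconds
    )


def parse_entry(raw: bytes, fat32: bool = False, oem: str = "cp437") -> dict:
    """Decode one 32-byte directory entry (oem code page is an assumption)."""
    (name, ext, attr, _reserved, c_fine, c_time, c_date, _a_date,
     high, m_time, m_date, low, size) = struct.unpack("<8s3sBBBHHHHHHHI", raw)
    if name[0] == 0x00:
        return {"kind": "free"}
    if name[0] == 0xE5:
        return {"kind": "deleted"}
    if attr == 0x0F:
        return {"kind": "long filename"}
    if name[0] == 0x05:
        name = b"\xE5" + name[1:]         # 0x05 stands in for a real 0xE5
    base = name.decode(oem).rstrip(" ")
    suffix = ext.decode(oem).rstrip(" ")
    return {
        "kind": "entry",
        "name": f"{base}.{suffix}" if suffix else base,
        "attributes": [text for mask, text in ATTRIBUTE_BITS.items() if attr & mask],
        "created": decode_timestamp(c_date, c_time, c_fine),
        "modified": decode_timestamp(m_date, m_time),
        # Offset 0x14 holds the high cluster word only on FAT32.
        "first_cluster": (high << 16 | low) if fat32 else low,
        "size": size,
    }
```

A caller would typically read a directory's clusters, slice them into 32-byte records, and stop at the first record reported as free.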

Working with short filenames in a command prompt

Sometimes it may be desirable to convert a long filename to a short filename, for example when working with the command prompt. A few simple rules can be followed to attain the correct 8.3 filename.

  1. An SFN can have at most 8 characters before the dot. If the long name has more than that, the first 6 must be written, then a tilde ~ as the seventh character and a number (usually 1) as the eighth. The number distinguishes it from other files with both the same first six letters and the same extension.
  2. Dots are important and must be used even for folder names (if there is a dot in the folder name). If there are multiple dots in the long file/directory name, only the last one is used. The preceding dots should be ignored. If there are more characters than three after the final dot, only the first three are used.
  3. Generally:
    • Any spaces in the filenames should be ignored when converting to SFN.
    • Ignore all periods except the last one. Do not include any other periods, just like the spaces. Use the last period if any, and the next characters (up to 3). For instance, for .manifest, .man only would be used.
    • Commas, square brackets, semicolons, = signs and + signs are changed to underscores.
    • Case is not important; upper case and lower case characters are treated equally.

To find out for certain the short (8.3) names of the files in a directory, use one of the following commands:

    • dir /x shows the short name of each file, if it has one, alongside the long name.
    • dir /-n shows only the short names, in the original DIR listing format.

In Windows NT-based operating systems, the command prompt applet (cmd.exe) accepts long filenames with wildcard characters (question mark ? and asterisk *); long filenames with spaces in them need to be escaped (i.e. enclosed in single or double quotes). [13]

Starting with Windows Vista, console commands and PowerShell applets perform limited pattern matching by allowing wildcards in filename and each subdirectory in the file path and silently substituting the first matching directory entry (for example, C:\>CD \prog*\inter* will change the current directory to C:\Program Files\Internet Explorer\).
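
The first-match substitution described above can be modeled in a few lines of Python. This is only an illustration of the idea, not of how cmd.exe or PowerShell are implemented: the function expand_first_match and the nested-dictionary directory tree are hypothetical, and fnmatch treats square brackets as pattern characters, which the Windows shells may not.

```python
from fnmatch import fnmatchcase
from typing import Dict, List, Optional

Tree = Dict[str, "Tree"]


def expand_first_match(patterns: List[str], tree: Tree) -> Optional[List[str]]:
    """Resolve each wildcard component to the first matching entry, in order."""
    resolved = []
    node = tree
    for pattern in patterns:
        match = next(
            (name for name in node
             if fnmatchcase(name.lower(), pattern.lower())),  # case-insensitive
            None,
        )
        if match is None:
            return None         # no entry matches this component
        resolved.append(match)
        node = node[match]      # descend into the matched directory
    return resolved


# Hypothetical directory tree; dictionary order stands in for directory order.
drive = {
    "Program Files": {"Internet Explorer": {}, "Windows Mail": {}},
    "ProgramData": {},
}
print(expand_first_match(["prog*", "inter*"], drive))
# ['Program Files', 'Internet Explorer']
```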


References

  1. "Naming a File". Microsoft Developer Network. Archived from the original on 2008-10-15. Retrieved 2007-03-22.
  2. "How to Disable the 8.3 Name Creation on NTFS Partitions". Microsoft. Retrieved 2021-02-26.
  3. "GetShortPathName Function". MSDN. Archived from the original on 2015-10-01. Retrieved 2014-09-15.
  4. "How to Get a Short Filename from a Long Filename". Microsoft. Retrieved 2021-02-26.
  5. "How Windows Generates 8.3 File Names from Long File Names". Microsoft.
  6. Galvin, Thomas (9 June 2015). "A Tale of Two File Names". tomgalvin.uk. Archived from the original on 2023-08-25. Retrieved 17 October 2022.
  7. "Microsoft One-liner Digest".
  8. "Under cmd.exe, how can "del *.tmp" be prevented from deleting *.tmpl files?". superuser.
  9. "dir.c\fat\fs - kernel/git/torvalds/linux.git - Linux kernel source tree". git.kernel.org. Retrieved 2018-06-25.
  10. "mount(8): mount filesystem – Linux man page".
  11. "DOSLFN".
  12. Andries Brouwer (2007-12-26). "Directory Entry". The FAT filesystem. Retrieved 2013-07-30.
  13. "Using Long File Names". 11 September 2008.