Filename mangling

In computing, filename mangling is the translation of a file name into a form that is compatible with the operating system accessing it. It occurs when a filename on a filesystem appears in a form that the accessing operating system cannot handle. Such mangling occurs, for example, on computer networks when a Windows machine attempts to access a file on a Unix server and that file has a filename which includes characters not valid in Windows.

FAT Derivative Filesystem

Legacy support under VFAT

A common example of name mangling occurs on VFAT file systems on versions of Windows from Windows 95 onwards. VFAT allows Long File Names (LFNs). For backwards compatibility with MS-DOS and older Windows software, which recognize only filenames in 8.3 format (at most 11 characters: up to eight for the name and up to three for the extension, written with a separating dot, such as autoexec.bat), each file with an LFN also gets an 8.3 name on disk (longfilename.txt becoming LONGFI~1.TXT), with the long file name stored in additional directory entries alongside it.

When a compatible Windows program reads the disk through the standard Windows file APIs, the I/O subsystem returns the long filename to the program. However, if an old DOS application or an old Windows application tries to address the file, it will use the older, 8.3-only APIs, or work at a lower level and perform its own disk access, and it receives an 8.3 filename instead. In this case, the filename is mangled by taking the first six non-space characters of the filename and appending a tilde (~) and then a number that ensures the 8.3 filename is unique within its directory. This mangling scheme can turn (for example) Program Files into PROGRA~1. The technique is still visible today when people use DOSBox to play classic DOS games, or run Windows 3.1 inside DOSBox to play Win16 games on 64-bit Windows.
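The scheme described above can be sketched in a few lines of Python. This is a simplified illustration rather than the algorithm Windows actually implements: the function name mangle_8_3 and the existing parameter are hypothetical, the sketch always mangles (even names that already fit 8.3), it also drops extra dots from the name part, and it gives up after a single-digit suffix.

```python
def mangle_8_3(long_name, existing=()):
    """Return a simplified 8.3 alias for long_name that is not in `existing`.

    Hypothetical helper: it only mimics the "first six characters, tilde,
    number" rule and ignores the other cases real implementations handle.
    """
    # Split the extension off at the last dot, if there is one.
    stem, dot, ext = long_name.rpartition(".")
    if not dot:
        stem, ext = long_name, ""

    # Drop spaces (and stray dots in the name part), then upper-case.
    stem = stem.replace(" ", "").replace(".", "").upper()
    ext = ext.replace(" ", "").upper()[:3]

    taken = {name.upper() for name in existing}
    for n in range(1, 10):
        # First six characters of the name, a tilde, then a number.
        alias = f"{stem[:6]}~{n}"
        if ext:
            alias += "." + ext
        if alias not in taken:
            return alias
    raise ValueError("no free alias with a single-digit suffix")


print(mangle_8_3("Program Files"))                            # PROGRA~1
print(mangle_8_3("longfilename.txt"))                         # LONGFI~1.TXT
print(mangle_8_3("Program Data", existing=["PROGRA~1"]))      # PROGRA~2
```

The third call shows why the number is needed: two long names that share their first six characters would otherwise collide on the same short name.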

Unix Filesystems

Unix file names can contain colons or backslashes, whereas Windows reserves both characters: the backslash is its path separator and the colon follows a drive letter. Software could therefore mangle the Unix file "Notes: 11\04\03" as "Notes_ 11-04-03" to enable Windows software to remotely access the file. Other software, such as Samba running on Unix, uses its own mangling schemes to map long filenames to DOS-compatible filenames (although Samba administrators can configure this behavior in its configuration file). [1]
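A minimal sketch of the substitution in the "Notes" example, assuming a fixed character mapping (colon to underscore, backslash to hyphen). The function name mangle_for_windows and the mapping table are illustrative assumptions, not the behavior of any particular product, and the sketch ignores the other characters that Windows rejects.

```python
# Hypothetical mapping taken from the example in the text:
# ":" becomes "_" and "\" becomes "-".
WINDOWS_UNSAFE = str.maketrans({":": "_", "\\": "-"})


def mangle_for_windows(unix_name):
    """Replace characters from a Unix file name that Windows reserves."""
    return unix_name.translate(WINDOWS_UNSAFE)


print(mangle_for_windows("Notes: 11\\04\\03"))  # Notes_ 11-04-03
```

A mapping like this is not reversible on its own: "Notes_ 11-04-03" could equally be the original Unix name, so real software has to keep track of which names it mangled.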

Mac OS

macOS's Finder displays each ":" in a file or directory name as a "/". This is because the classic Mac OS used the ":" character internally as a path separator. Listing the same files or directories in a terminal emulator, however, shows the ":" rather than the "/".
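The display rule described above amounts to swapping one character for the other. The sketch below is only a model of that rule, under the assumption that the name seen in the terminal contains ":" wherever the Finder shows "/"; the function names and the sample file name are hypothetical.

```python
def finder_display_name(terminal_name):
    """Name as the Finder would show it: every ':' appears as '/'."""
    return terminal_name.replace(":", "/")


def terminal_name(finder_name):
    """Inverse view: a '/' shown in the Finder appears as ':' in a terminal."""
    return finder_name.replace("/", ":")


print(finder_display_name("Notes 11:04:03"))  # Notes 11/04/03
print(terminal_name("Notes 11/04/03"))        # Notes 11:04:03
```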

References

  1. Eckstein, Robert; David Collier-Brown; Peter Kelly (November 1999). "5.4 Name Mangling and Case". Using Samba (1st ed.). O'Reilly & Associates, Inc. Retrieved 2009-10-23.