Fork (file system)

In a computer file system, a fork is a set of data associated with a file-system object. File systems without forks allow only a single set of data for a file's contents, while file systems with forks allow multiple such sets. Every non-empty file must have at least one fork, often of a default type, and depending on the file system, a file may have one or more other associated forks, which in turn may contain primary data integral to the file or just metadata.

Unlike extended attributes, a similar file system feature that is typically limited in size, forks can be of variable size, possibly even larger than the file's primary data fork. The size of a file is the sum of the sizes of its forks.

Popular file systems that can use forks include Apple's HFS+ and Microsoft's NTFS.

Alternatives

On file systems without forks, one may instead use multiple separate files that are associated with each other, particularly sidecar files for metadata. However, the connection between these files is not automatically preserved by the file system, and must instead be handled by each program that works on files. Another alternative is a container file, which stores additional data within a given file format, or an archive file, which allows storing several files and metadata within a file (within a single fork). This requires that programs process the container file or archive file, rather than the file system handling forks. These alternatives require additional work by programs using the data, but benefit from portability to file systems that do not support forks.
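
The sidecar approach can be sketched as follows. This is a minimal illustration, assuming a hypothetical naming convention in which a sidecar carries the main file's full name plus an extra suffix; the suffix list and the function names are invented for the example. It shows that the copying program, not the file system, has to keep the files together.

```python
import shutil
import tempfile
from pathlib import Path

# Hypothetical convention: a sidecar is "<main file name><suffix>".
SIDECAR_SUFFIXES = (".xmp", ".meta")


def sidecars_of(main: Path) -> list:
    """Return the existing sidecar files of `main` under the assumed convention."""
    found = []
    for suffix in SIDECAR_SUFFIXES:
        candidate = main.with_name(main.name + suffix)
        if candidate.exists():
            found.append(candidate)
    return found


def copy_with_sidecars(main: Path, dest_dir: Path) -> list:
    """Copy `main` and its sidecars; the file system does not do this for us."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in [main, *sidecars_of(main)]:
        target = dest_dir / src.name
        shutil.copy2(src, target)
        copied.append(target)
    return copied


def demo() -> list:
    """Create a file with one sidecar in a temp directory and copy both."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        photo = root / "photo.jpg"
        photo.write_bytes(b"image data")
        (root / "photo.jpg.xmp").write_text("<rating>5</rating>")
        copied = copy_with_sidecars(photo, root / "backup")
        return sorted(p.name for p in copied)


if __name__ == "__main__":
    print(demo())
```

A program that copies only `photo.jpg` would silently leave the metadata behind, which is the trade-off described above.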

Implementations

Apple

File system forks are associated with Apple's Hierarchical File System (HFS). [1] HFS, and the original Apple Macintosh file system MFS, allowed a file system object to have two kinds of forks: a data fork and a resource fork.

The resource fork was designed to store non-compiled data that would be used by the system's graphical user interface (GUI), such as localizable text strings, a file's icon to be used by the Finder, or the menus and dialog boxes associated with an application. [2] However, the feature was very flexible, so additional uses were found, such as splitting a word processing document into content and presentation, then storing each part in separate resources. As compiled software code was also stored in a resource, applications would often consist of just a resource fork and no data fork.

One of HFS+'s most obscure features is that a file may have an arbitrary number of custom "named forks" in addition to the traditional data and resource forks. This feature has gone largely unused, as Apple never added support for it in Mac OS 8.1 through Mac OS X 10.3.9. Beginning with Mac OS X 10.4, a partial implementation was made to support Apple's extended inline attributes. [3]

In versions of Mac OS X before v10.4, users running Unix command-line utilities such as tar risked data loss, because the utilities had not been updated to handle the resource forks of files. [4]
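
The data loss can be illustrated with a toy in-memory model. This is only a sketch: `ForkedFile`, `naive_copy` and `fork_aware_copy` are hypothetical names invented here and do not correspond to any Apple API. The model treats a file as a mapping from fork names to bytes and compares a fork-unaware copy with a fork-aware one.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ForkedFile:
    """Toy model of a file with named forks (not a real file-system API)."""
    forks: Dict[str, bytes] = field(default_factory=dict)

    @property
    def size(self) -> int:
        # The file's size is the sum of the sizes of its forks.
        return sum(len(data) for data in self.forks.values())


def naive_copy(src: ForkedFile) -> ForkedFile:
    """Copy as a fork-unaware utility would: only the data fork survives."""
    return ForkedFile(forks={"data": src.forks.get("data", b"")})


def fork_aware_copy(src: ForkedFile) -> ForkedFile:
    """Copy every fork, preserving the whole file."""
    return ForkedFile(forks=dict(src.forks))


# An application stored entirely in its resource fork, with an empty data fork.
app = ForkedFile(forks={"data": b"", "resource": b"CODE+ICON+MENU"})

if __name__ == "__main__":
    print("original:", app.size, "bytes")
    print("naive copy:", naive_copy(app).size, "bytes")
    print("fork-aware copy:", fork_aware_copy(app).size, "bytes")
```

In this model the naive copy of an application that lives entirely in its resource fork is an empty file, while the fork-aware copy keeps every byte.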

Novell

Starting in 1985, Novell NetWare File System (NWFS), and its successor Novell Storage Services (NSS), were designed from the ground up to use a variety of methods to store a file's metadata. Some metadata resides in Novell Directory Services (NDS), some is stored in the directory structure on the disk, and some is stored in, as Novell terms it, 'multiple data streams' with the file itself. Multiple data streams also allow Macintosh clients to attach to and use NetWare servers.

Microsoft

NTFS, the file system introduced with Windows NT 3.1, supports file system forks known as alternate data streams (ADS). [5] ReFS, a new file system introduced with Windows Server 2012, originally did not support ADS, [6] [7] [8] but in Windows 8.1 64-bit and Server 2012 R2, support for ADS, with lengths of up to 128K, was added to ReFS. [9]

ADS was originally intended to add compatibility with existing operating systems that support forks.[citation needed] A computer program may be directed to open an ADS by specifying the name of the stream after a colon (:) following the file path. [10] Despite this support, most programs, including Windows Explorer and the dir command (before Windows Vista), ignore ADS. Windows Explorer copies ADS and warns when the target file system does not support them, but it only calculates the main stream's size and does not list a file's or folder's streams. Since Windows Vista, the dir command supports showing ADS. [11] Windows PowerShell v3.0 and later supports manipulating ADS. [12]
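
The colon syntax can be sketched in Python. The helper names `stream_path` and `write_stream` are hypothetical, invented for this illustration; the sketch assumes a Windows host and a volume that supports ADS, where passing a `path:stream` name to the ordinary `open` call addresses the named stream. On other platforms only the path-building helper is meaningful.

```python
import sys


def stream_path(file_path: str, stream_name: str) -> str:
    """Build the 'path:stream' form that addresses a named stream."""
    if not stream_name or any(ch in stream_name for ch in ':/\\'):
        raise ValueError(f"invalid stream name: {stream_name!r}")
    return f"{file_path}:{stream_name}"


def write_stream(file_path: str, stream_name: str, data: bytes) -> None:
    """Write `data` to a named stream (assumes Windows and an ADS-capable volume)."""
    if sys.platform != "win32":
        raise OSError("alternate data streams require Windows")
    with open(stream_path(file_path, stream_name), "wb") as stream:
        stream.write(data)


if __name__ == "__main__":
    print(stream_path("report.txt", "notes"))
    if sys.platform == "win32":
        # Hypothetical file and stream names, used only for illustration.
        write_stream("report.txt", "notes", b"reviewed")
```

Writing to the named stream leaves the main stream untouched, which is why tools that inspect only the main stream do not notice the extra data.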

Uses

Windows 2000 uses ADS to store thumbnails in image files, and to store summary information (such as title and author) in any file, without changing the main stream. [13] [14] With Windows XP, Microsoft realized that ADS are susceptible to loss when the files containing them are moved off NTFS volumes; thus Windows XP stores the summary information in the main stream whenever the file format supports it. [13] Windows Vista discontinued support for adding summary information altogether, as Microsoft decided that the information is too sensitive for ADS to handle. [15] The use of ADS for other purposes did not stop, however. Service Pack 2 for Windows XP introduced the Attachment Execution Service, which stores details on the origin of downloaded files in an ADS called Zone.Identifier, in an effort to protect users from downloaded files that may present a risk. [16] Internet Explorer and Windows 8 extended this function through SmartScreen. [17] Internet Explorer also uses ADS to store favicons in Internet shortcut files. [10]
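
The contents of the zone identifier stream can be read with an INI parser. The sketch below assumes the commonly observed layout of the `Zone.Identifier` stream, a `[ZoneTransfer]` section with a `ZoneId` key; that layout and the zone names are not described in this article and are stated here as assumptions. The function name `parse_zone_identifier` is hypothetical, and the sample text stands in for data that would be read from `file:Zone.Identifier` on Windows.

```python
import configparser
from typing import Optional

# Assumed mapping of zone numbers to the usual Windows security zones.
ZONE_NAMES = {
    0: "Local machine",
    1: "Local intranet",
    2: "Trusted sites",
    3: "Internet",
    4: "Restricted sites",
}


def parse_zone_identifier(text: str) -> Optional[int]:
    """Return the ZoneId stored in a Zone.Identifier stream, or None if absent."""
    # Interpolation is disabled so that '%' in any URL-valued keys is harmless.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    if not parser.has_option("ZoneTransfer", "ZoneId"):
        return None
    return parser.getint("ZoneTransfer", "ZoneId")


# Sample stream contents for a file downloaded from the Internet zone.
sample = "[ZoneTransfer]\nZoneId=3\n"

if __name__ == "__main__":
    zone = parse_zone_identifier(sample)
    print(zone, ZONE_NAMES.get(zone, "unknown"))
```

Because the origin is kept in a separate stream, it is lost when the file is copied to a file system without ADS support.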

Sun

Solaris version 9 and later allow files to have forks. Forks are called extended attributes in Solaris, although they do not fall within the usual meaning of "extended attribute". The maximum size of a Solaris-type extended attribute is the same as the maximum size of a file, and they are read and written in the same fashion as files. Internally, they are stored and accessed like normal files, so their ownership and permissions can differ from those of the parent file. Sub-directories are administratively[clarification needed] disabled, so their names cannot contain "/" characters.

Extended attributes in Network File System Version 4 are similar to Solaris-style extended attributes.

Possible security and data loss risks

When a file system supports different forks, applications should be aware of them, or security risks can arise. Allowing legacy software to access data without appropriate shims in place is the primary culprit for such problems.[citation needed]

If system utilities (disk explorers, antivirus software, archivers, and so on) are not aware of the different forks, security and data loss problems can arise.

References

  1. Apple (1996-07-02). "File Forks". Apple. Archived from the original on 2008-07-24. Retrieved 2006-11-18.
  2. Bruce Horn. "The Grand Unified Model (1) - Resources". Folklore.org. Retrieved 2017-10-03.
  3. Siracusa, John (28 April 2005). "Mac OS X 10.4 Tiger". Ars Technica.
  4. "Command-line Backup Solutions on Mac OS X". Apple. 2005-10-29. Archived from the original on February 25, 2008. Retrieved 2006-11-18.
  5. "Files and Clusters". Microsoft. 7 January 2021. Retrieved 2023-08-15.
  6. Verma, Surendra (16 January 2012). Sinofsky, Steven (ed.). "Building the next generation file system for Windows: ReFS". Building Windows 8. Microsoft. MSDN blogs. Archived from the original on 16 February 2013. Retrieved 20 January 2013.
  7. Foley, Mary Jo (16 January 2012). "Microsoft goes public with plans for its new Windows 8 file system". ZDNet. CBS Interactive. Retrieved 21 January 2013.
  8. Martin Lucas. "Windows Server 2012: Does ReFS replace NTFS? When should I use it?". TechNet. Archived from the original on 23 January 2013.
  9. "Resilient File System Overview". Microsoft Docs. Microsoft. 13 January 2017. Retrieved 15 August 2023.
  10. Law, Eric (7 September 2013). "Fun with Favicons". Microsoft Docs. Microsoft. Retrieved 15 August 2023.
  11. Bart De Smet (2006-07-13). "Use Vista's DIR command to display alternate data streams". B# .NET Blog. Archived from the original on 2007-09-27. Retrieved 2007-07-07.
  12. "FileSystem Provider (Windows PowerShell 3.0)". TechNet. Microsoft. 9 August 2012. Archived from the original on 23 January 2015.
  13. Chen, Raymond (27 May 2011). "Why are custom properties created on Windows 2000 lost when I view the file from newer versions of Windows?". The Old New Thing. Microsoft. Retrieved 10 June 2020.
  14. Microsoft (2006-10-27). "Indexing service adds data streams to image files". Microsoft. Retrieved 2006-11-18.
  15. Chen, Raymond (1 May 2012). "What happened to the Summary information created on Windows 2000 and Windows XP?". The Old New Thing. Microsoft. Retrieved 10 June 2020.
  16. Bart De Smet (2005-08-19). "Demo of "Attachment Execution Service internals" in Windows XP SP2 and Windows Server 2003 SP1". B# .NET Blog. Archived from the original on 2007-02-23. Retrieved 2006-11-18.
  17. Chen, Raymond (4 November 2013). "Manipulating the zone identifier to specify where a file was download from". The Old New Thing. Microsoft.