Extent (file systems)

Last updated

In computing, an extent is a contiguous area of storage reserved for a file in a file system, represented as a range of block numbers, or tracks on count key data devices. A file can consist of zero or more extents; one file fragment requires one extent. The direct benefit is in storing each range compactly as two numbers, instead of canonically storing every block number in the range. [1] Also, extent allocation results in less file fragmentation.

Contents

Extent-based file systems can also eliminate most of the metadata overhead of large files that would traditionally be taken up by the block-allocation tree. But because the savings are small compared to the amount of stored data (for all file sizes in general) but make up a large portion of the metadata (for large files), the overall benefits in storage efficiency and performance are slight. [2]

In order to resist fragmentation, several extent-based file systems do allocate-on-flush. Many modern fault-tolerant file systems also do copy-on-write, although that increases fragmentation. As a similar design, the CP/M file system uses extents as well, but those do not correspond to the definition given above. CP/M's extents appear contiguously as a single block in the combined directory/allocation table, and they do not necessarily correspond to a contiguous data area on disk.

IBM OS/360 and successors allocate files in multiples of disk tracks or cylinders. Files could originally have up to 16 extents, but this restriction has since been lifted. The initial allocation size, and the size of additional extents to be allocated if required, are specified by the user via Job Control Language. The system attempts to allocate the initial size as a contiguous area, although this may be split if contiguous space is not available.

Adoption

The systems supporting file system extents include the following:

Adoption outside of file systems include the following:

See also

Related Research Articles

XFS is a high-performance 64-bit journaling file system created by Silicon Graphics, Inc (SGI) in 1993. It was the default file system in SGI's IRIX operating system starting with its version 5.3. XFS was ported to the Linux kernel in 2001; as of June 2014, XFS is supported by most Linux distributions; Red Hat Enterprise Linux uses it as its default file system.

NT File System (NTFS) is a proprietary journaling file system developed by Microsoft in the 1990s.

ext3, or third extended filesystem, is a journaled file system that is commonly used with the Linux kernel. It used to be the default file system for many popular Linux distributions but generally has been supplanted by its successor version ext4. The main advantage of ext3 over its predecessor, ext2, is journaling, which improves reliability and eliminates the need to check the file system after an improper, a.k.a. unclean, shutdown.

Hierarchical File System (HFS) is a proprietary file system developed by Apple Inc. for use in computer systems running Mac OS. Originally designed for use on floppy and hard disks, it can also be found on read-only media such as CD-ROMs. HFS is also referred to as Mac OS Standard, while its successor, HFS Plus, is also called Mac OS Extended.

HPFS is a file system created specifically for the OS/2 operating system to improve upon the limitations of the FAT file system. It was written by Gordon Letwin and others at Microsoft and added to OS/2 version 1.2, at that time still a joint undertaking of Microsoft and IBM, and released in 1988.

In the maintenance of file systems, defragmentation is a process that reduces the degree of fragmentation. It does this by physically organizing the contents of the mass storage device used to store files into the smallest number of contiguous regions. It also attempts to create larger regions of free space using compaction to impede the return of fragmentation. Some defragmentation utilities try to keep smaller files within a single directory together, as they are often accessed in sequence.

HFS Plus or HFS+ is a journaling file system developed by Apple Inc. It replaced the Hierarchical File System (HFS) as the primary file system of Apple computers with the 1998 release of Mac OS 8.1. HFS+ continued as the primary Mac OS X file system until it was itself replaced with the Apple File System (APFS), released with macOS High Sierra in 2017. HFS+ is also one of the formats supported by the iPod digital music player.

<span class="mw-page-title-main">File system</span> Computer filing system

In computing, a file system or filesystem governs file organization and access. A local file system is a capability of an operating system that services the applications running on the same computer. A distributed file system is a protocol that provides file access between networked computers.

Allocate-on-flush is a file system feature implemented in HFS+, XFS, Reiser4, ZFS, Btrfs, and ext4. The feature also closely resembles an older technique that Berkeley's UFS called "block reallocation".

In Linux, Logical Volume Manager (LVM) is a device mapper framework that provides logical volume management for the Linux kernel. Most modern Linux distributions are LVM-aware to the point of being able to have their root file systems on a logical volume.

<span class="mw-page-title-main">Sparse file</span> Type of computer file

In computer science, a sparse file is a type of computer file that attempts to use file system space more efficiently when the file itself is partially empty. This is achieved by writing brief information (metadata) representing the empty blocks to the data storage media instead of the actual "empty" space which makes up the block, thus consuming less storage space. The full block is written to the media as the actual size only when the block contains "real" (non-empty) data.

The following tables compare general and technical information for a number of file systems.

ext4 is a journaling file system for Linux, developed as the successor to ext3.

<span class="mw-page-title-main">The Sleuth Kit</span>

The Sleuth Kit (TSK) is a library and collection of Unix- and Windows-based utilities for extracting data from disk drives and other storage so as to facilitate the forensic analysis of computer systems. It forms the foundation for Autopsy, a better known tool that is essentially a graphical user interface to the command line utilities bundled with The Sleuth Kit.

exFAT is a file system optimized for flash memory such as USB flash drives and SD cards, that was introduced by Microsoft in 2006. exFAT was proprietary until 28 August 2019, when Microsoft published its specification. Microsoft owns patents on several elements of its design.

<span class="mw-page-title-main">File system fragmentation</span> Condition where a segmented file system is used inefficiently

In computing, file system fragmentation, sometimes called file system aging, is the tendency of a file system to lay out the contents of files non-continuously to allow in-place modification of their contents. It is a special case of data fragmentation. File system fragmentation negatively impacts seek time in spinning storage media, which is known to hinder throughput. Fragmentation can be remedied by re-organizing files and free space back into contiguous areas, a process called defragmentation.

Btrfs is a computer storage format that combines a file system based on the copy-on-write (COW) principle with a logical volume manager, developed together. It was created by Chris Mason in 2007 for use in Linux, and since November 2013, the file system's on-disk format has been declared stable in the Linux kernel.

A journaling file system is a file system that keeps track of changes not yet committed to the file system's main part by recording the goal of such changes in a data structure known as a "journal", which is usually a circular log. In the event of a system crash or power failure, such file systems can be brought back online more quickly with a lower likelihood of becoming corrupted.

Resilient File System (ReFS), codenamed "Protogon", is a Microsoft proprietary file system introduced with Windows Server 2012 with the intent of becoming the "next generation" file system after NTFS.

References

  1. "Understanding Ext4 (part1): Extents". 2010-12-20. Archived from the original on 2015-02-03. Retrieved 2015-02-02. What's really a departure for EXT4 however, is the use of extents rather than the old, inefficient indirect block mechanism used by earlier Unix file systems (e.g. EXT2 and EXT3) for tracking file content. Extents are similar to cluster runs in the NTFS file system; essentially, they specify an initial block address and the number of blocks that make up the extent. A file that is fragmented will have multiple extents, but EXT4 tries very hard to keep files contiguous.
  2. "Ext4 Disk Layout". 2015-01-26. Retrieved 2015-02-02. If flex_bg is enabled, it is possible to allocate very large files with a single extent, at a considerable reduction in metadata block use, and some improvement in disk efficiency.
  3. "Pages and Extents Architecture Guide - SQL Server". learn.microsoft.com. Microsoft. 12 June 2024. Retrieved 18 December 2024.
  4. "Oracle Database 23ai Technical Architecture". docs.oracle.com. Oracle Corporation. Retrieved 18 December 2024.