Free-space bitmap

Last updated

Free-space bitmaps are one method used to track allocated sectors by some file systems. While the most simplistic design is highly inefficient, advanced or hybrid implementations of free-space bitmaps are used by some modern file systems[ which? ].

Contents

Example

The simplest form of free-space bitmap is a bit array, i.e. a block of bits. In this example, a zero would indicate a free sector, while a one indicates a sector in use. Each sector would be of fixed size. For explanatory purposes, we will use a 4  GiB hard drive with 4096-byte sectors and assume that the bitmap itself is stored elsewhere. The example disk would require 1,048,576 bits, one for each sector, or 128  KiB. Increasing the size of the drive will proportionately increase the size of the bitmap, while multiplying the sector size will produce a proportionate reduction.

When the operating system (OS) needs to write a file, it will scan the bitmap until it finds enough free locations to fit the file. If a 12 KiB file were stored on the example drive, three zero bits would be found, changed to ones, and the data would be written across the three sectors represented by those bits. If the file were subsequently truncated down to 8 KiB, the final sector's bit would be set back to zero, indicating that it is again available for use.

Advantages

Disadvantages

Advanced techniques

As the drive size grows, the amount of time needed to scan for free space can become unreasonable. To address this, real-world implementations of free-space bitmaps will find ways to centralize information on free space. One approach is to split the bitmap into many chunks. A separate array then stores the number of free sectors in each chunk, so chunks with insufficient space can be easily skipped over, and the total amount of free space is easier to compute. Finding free space now entails searching the summary array first, then searching the associated bitmap chunk for the exact sectors available. [1]

This approach drastically reduces the cost of finding free space, but it doesn't help with the process of freeing space. If the combined size of the summary array and bitmap is greater than can readily be stored in memory, and a large number of files with scattered sectors are freed, an enormous amount of disk access is necessary to find all the sectors, decrement the summary counter and flip the bits back to zero. This greatly reduces the benefits of the bitmap, as it is no longer performing its function of summarizing the free space rapidly without reading from the disk.

See also

Related Research Articles

In computer science, a B-tree is a self-balancing tree data structure that maintains sorted data and allows searches, sequential access, insertions, and deletions in logarithmic time. The B-tree generalizes the binary search tree, allowing for nodes with more than two children. Unlike other self-balancing binary search trees, the B-tree is well suited for storage systems that read and write relatively large blocks of data, such as databases and file systems.

The megabyte is a multiple of the unit byte for digital information. Its recommended unit symbol is MB. The unit prefix mega is a multiplier of 1000000 (106) in the International System of Units (SI). Therefore, one megabyte is one million bytes of information. This definition has been incorporated into the International System of Quantities.

ext2, or second extended file system, is a file system for the Linux kernel. It was initially designed by French software developer Rémy Card as a replacement for the extended file system (ext). Having been designed according to the same principles as the Berkeley Fast File System from BSD, it was the first commercial-grade filesystem for Linux.

Hierarchical File System (HFS) is a proprietary file system developed by Apple Inc. for use in computer systems running Mac OS. Originally designed for use on floppy and hard disks, it can also be found on read-only media such as CD-ROMs. HFS is also referred to as Mac OS Standard, while its successor, HFS Plus, is also called Mac OS Extended.

bzip2 File compression software

bzip2 is a free and open-source file compression program that uses the Burrows–Wheeler algorithm. It only compresses single files and is not a file archiver. It relies on separate external utilities for tasks such as handling multiple files, encryption, and archive-splitting.

Disk formatting is the process of preparing a data storage device such as a hard disk drive, solid-state drive, floppy disk, memory card or USB flash drive for initial use. In some cases, the formatting operation may also create one or more new file systems. The first part of the formatting process that performs basic medium preparation is often referred to as "low-level formatting". Partitioning is the common term for the second part of the process, dividing the device into several sub-devices and, in some cases, writing information to the device allowing an operating system to be booted from it. The third part of the process, usually termed "high-level formatting" most often refers to the process of generating a new file system. In some operating systems all or parts of these three processes can be combined or repeated at different levels and the term "format" is understood to mean an operation in which a new disk medium is fully prepared to store files. Some formatting utilities allow distinguishing between a quick format, which does not erase all existing data and a long option that does erase all existing data.

HFS Plus or HFS+ is a journaling file system developed by Apple Inc. It replaced the Hierarchical File System (HFS) as the primary file system of Apple computers with the 1998 release of Mac OS 8.1. HFS+ continued as the primary Mac OS X file system until it was itself replaced with the Apple File System (APFS), released with macOS High Sierra in 2017. HFS+ is also one of the formats supported by the iPod digital music player.

A bit array is an array data structure that compactly stores bits. It can be used to implement a simple set data structure. A bit array is effective at exploiting bit-level parallelism in hardware to perform operations quickly. A typical bit array stores kw bits, where w is the number of bits in the unit of storage, such as a byte or word, and k is some nonnegative integer. If w does not divide the number of bits to be stored, some space is wasted due to internal fragmentation.

<span class="mw-page-title-main">Cylinder-head-sector</span> Historical method for giving addresses to physical data blocks on hard disk drives

Cylinder-head-sector (CHS) is an early method for giving addresses to each physical block of data on a hard disk drive.

Data structure alignment is the way data is arranged and accessed in computer memory. It consists of three separate but related issues: data alignment, data structure padding, and packing.

<span class="mw-page-title-main">GUID Partition Table</span> Computer disk partitioning standard

The GUID Partition Table (GPT) is a standard for the layout of partition tables of a physical computer storage device, such as a hard disk drive or solid-state drive, using universally unique identifiers, which are also known as globally unique identifiers (GUIDs). Forming a part of the Unified Extensible Firmware Interface (UEFI) standard, it is nevertheless also used for some BIOSs, because of the limitations of master boot record (MBR) partition tables, which use 32 bits for logical block addressing (LBA) of traditional 512-byte disk sectors.

File size is a measure of how much data a computer file contains or, alternately, how much storage it consumes. Typically, file size is expressed in units of measurement based on the byte. By convention, file size units use either a metric prefix or a binary prefix.

In computer storage, fragmentation is a phenomenon in which storage space, main storage or secondary storage, is used inefficiently, reducing capacity or performance and often both. The exact consequences of fragmentation depend on the specific system of storage allocation in use and the particular form of fragmentation. In many cases, fragmentation leads to storage space being "wasted", and in that case the term also refers to the wasted space itself.

The following tables compare general and technical information for a number of file systems.

<span class="mw-page-title-main">Disk sector</span> Logical or physical division of storage media

In computer disk storage, a sector is a subdivision of a track on a magnetic disk or optical disc. For most disks, each sector stores a fixed amount of user-accessible data, traditionally 512 bytes for hard disk drives (HDDs) and 2048 bytes for CD-ROMs and DVD-ROMs. Newer HDDs and SSDs use 4096-byte (4 KiB) sectors, which are known as the Advanced Format (AF).

Btrfs is a computer storage format that combines a file system based on the copy-on-write (COW) principle with a logical volume manager, developed together. It was founded by Chris Mason in 2007 for use in Linux, and since November 2013, the file system's on-disk format has been declared stable in the Linux kernel.

This timeline of binary prefixes lists events in the history of the evolution, development, and use of units of measure which are germane to the definition of the binary prefixes by the International Electrotechnical Commission (IEC) in 1998, used primarily with units of information such as the bit and the byte.

<span class="mw-page-title-main">Advanced Format</span> Disk format and access using sector sizes larger than 512 bytes

Advanced Format (AF) is any disk sector format used to store data on magnetic disks in hard disk drives (HDDs) that exceeds 528 bytes per sector, frequently 4096, 4112, 4160, or 4224-byte (4 KB) sectors. Larger sectors of an Advanced Format Drive (AFD) enable the integration of stronger error correction algorithms to maintain data integrity at higher storage densities.

In computer file systems, a block availability map (BAM) is a data structure used to track disk blocks that are considered free. It is used along with a directory to manage files on a disk.

The FAT file system is a file system used on MS-DOS and Windows 9x family of operating systems. It continues to be used on mobile devices and embedded systems, and thus is a well suited file system for data exchange between computers and devices of almost any type and age from 1981 through the present.

References

  1. 1 2 3 Bonwick, Jeff (2007-09-14). "Space Maps". Archived from the original on April 1, 2009. Retrieved 2009-10-02.