Disk sector

Figure 1: Disk structures:
  (A) Track
  (B) Geometrical sector
  (C) Disk sector
  (D) Cluster

In computer disk storage, a sector is a subdivision of a track on a magnetic disk or optical disc. For most disks, each sector stores a fixed amount of user-accessible data, traditionally 512 bytes for hard disk drives (HDDs) and 2048 bytes for CD-ROMs and DVD-ROMs. Newer HDDs and SSDs use 4096-byte (4 KiB) sectors, a format known as Advanced Format (AF).

The sector is the minimum storage unit of a hard drive. [1] Most disk partitioning schemes are designed to have files occupy an integral number of sectors regardless of the file's actual size. Files that do not fill a whole sector will have the remainder of their last sector filled with zeroes. In practice, operating systems typically operate on blocks of data, which may span multiple sectors. [2]
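As a rough illustration of the whole-sector rule described above, the following Python sketch computes how many sectors a file of a given size occupies and how many padding bytes remain in its last sector. The function name `sectors_for_file` and the 512-byte default are illustrative assumptions, not part of any particular file system.

```python
def sectors_for_file(file_size: int, sector_size: int = 512) -> tuple[int, int]:
    """Return (sectors_used, padding_bytes) for a file of file_size bytes."""
    if file_size < 0 or sector_size <= 0:
        raise ValueError("file_size must be >= 0 and sector_size must be > 0")
    # Ceiling division: a partially filled last sector still counts as a whole sector.
    sectors_used = -(-file_size // sector_size)
    # Bytes left over in the last sector, which would be filled with zeroes.
    padding_bytes = sectors_used * sector_size - file_size
    return sectors_used, padding_bytes


if __name__ == "__main__":
    print(sectors_for_file(1000))        # 512-byte sectors
    print(sectors_for_file(1000, 4096))  # 4096-byte sectors
```

With 512-byte sectors, a 1000-byte file occupies two sectors and leaves 24 bytes of padding; with 4096-byte sectors, the same file occupies one sector and leaves 3096 bytes.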

Geometrically, the word sector means a portion of a disk between a center, two radii and a corresponding arc (see Figure 1, item B), which is shaped like a slice of a pie. Thus, the disk sector (Figure 1, item C) refers to the intersection of a track and geometrical sector.

In modern disk drives, each physical sector is made up of two basic parts, the sector header area (typically called "ID") and the data area. The sector header contains information used by the drive and controller; this information includes sync bytes, address identification, flaw flag and error detection and correction information. The header may also include an alternate address to be used if the data area is undependable. The address identification is used to ensure that the mechanics of the drive have positioned the read/write head over the correct location. The data area contains the sync bytes, user data and an error-correcting code (ECC) that is used to check and possibly correct errors that may have been introduced into the data.

History

The first disk drive, the 1957 IBM 350 disk storage, had ten 100 character sectors per track; each character was six bits and included a parity bit. The number of sectors per track was identical on all recording surfaces. There was no recorded identifier field (ID) associated with each sector. [3]

The 1961 IBM 1301 disk storage introduced variable-length sectors, termed records or physical records by IBM, and added to each record a record address field separate from the data in a record. [4] [5] All modern disk drives have sector address fields, called ID fields, separate from the data in a sector.

Also in 1961, Bryant, with its 4000 series, introduced the concept of zoned recording (ZBR), which allowed the number of sectors per track to vary as a function of the track's diameter: there are more sectors on an outer track than on an inner track. [6] In the late 1980s, ZBR was again used in disk drives then announced by Imprimis and Quantum, [7] and by 1997 its industry usage was ubiquitous. [8]

The disk drives and other DASDs announced with the IBM System/360 in 1964 used self-formatting variable-length sectors, termed records or physical records by IBM. They detected errors in all fields of their records with a cyclic redundancy check (CRC), replacing the per-character parity detection of prior generations. These IBM physical records have three basic parts: a Count field, which acts as an ID field; an optional Key field to aid in searching for data; and a Data field. In practice, most records had no Key field, indicated by a key length of zero. The structure of these three fields is called the CKD track format for a record.

The 1970 IBM 3330 disk storage replaced the CRC on the data field of each record with an error-correcting code (ECC) to improve data integrity by detecting most errors and allowing correction of many errors. [9] Ultimately, all fields of disk sectors had ECCs.

Prior to the 1980s, there was little standardization of sector sizes; disk drives had a maximum number of bits per track, and various system manufacturers subdivided the track into different sector sizes to suit their OSes and applications. The popularity of the PC beginning in the 1980s and the advent of the IDE interface in the late 1980s led to the 512-byte sector becoming an industry-standard sector size for HDDs and similar storage devices. [10]

In the 1970s, IBM added fixed-block architecture Direct Access Storage Devices (FBA DASDs) to its line of CKD DASDs. CKD DASDs supported multiple variable-length sectors, while the IBM FBA DASDs supported sector sizes of 512, 1024, 2048, or 4096 bytes.

In 2000, the industry trade organization International Disk Drive Equipment and Materials Association (IDEMA) started work to define the implementation and standards that would govern sector size formats exceeding 512 bytes to accommodate future increases in data storage capacities. [10] By the end of 2007, in anticipation of a future IDEMA standard, Samsung and Toshiba began shipments of 1.8-inch hard disk drives with 4096-byte sectors. In 2010, IDEMA completed the Advanced Format standard for drives with 4096-byte sectors, [10] setting the date for the transition from 512-byte to 4096-byte sectors as January 2011 for all manufacturers, [11] and Advanced Format drives soon became prevalent.

Sectors versus blocks

While sector specifically means the physical disk area, the term block has been used loosely to refer to a small chunk of data. Block has multiple meanings depending on the context. In the context of data storage, a filesystem block is an abstraction over disk sectors possibly encompassing multiple sectors. In other contexts, it may be a unit of a data stream or a unit of operation for a utility. [12] For example, the Unix program dd allows one to set the block size to be used during execution with the parameter bs=bytes. This specifies the size of the chunks of data as delivered by dd, and is unrelated to sectors or filesystem blocks.

In Linux, disk sector size can be determined with sudo fdisk -l | grep "Sector size" and block size can be determined with sudo blockdev --getbsz /dev/sda. [13]
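On Linux, sector sizes are also exposed through sysfs. The sketch below assumes the kernel's `/sys/block/<device>/queue/logical_block_size` and `physical_block_size` attributes are available on the system in question; the helper name `read_sector_sizes` and the device name `sda` are illustrative only.

```python
from pathlib import Path


def read_sector_sizes(device: str, sysfs_root: str = "/sys/block") -> dict[str, int]:
    """Read the logical and physical sector sizes (in bytes) of a block device."""
    queue = Path(sysfs_root) / device / "queue"
    return {
        "logical": int((queue / "logical_block_size").read_text().strip()),
        "physical": int((queue / "physical_block_size").read_text().strip()),
    }


if __name__ == "__main__":
    try:
        print(read_sector_sizes("sda"))
    except OSError as exc:
        # The device does not exist, or the system does not provide sysfs.
        print(f"could not read sector sizes: {exc}")
```

Unlike the commands above, reading these files does not normally require root privileges. The `sysfs_root` parameter exists only so the function can be pointed at a different directory, for example in tests.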

Sectors versus clusters

In computer file systems, a cluster (sometimes also called allocation unit or block) is a unit of disk space allocation for files and directories. To reduce the overhead of managing on-disk data structures, the filesystem does not allocate individual disk sectors by default, but contiguous groups of sectors, called clusters.

On a disk that uses 512-byte sectors, a 512-byte cluster contains one sector, whereas a 4-kibibyte (KiB) cluster contains eight sectors.

A cluster is the smallest logical amount of disk space that can be allocated to hold a file. Storing small files on a filesystem with large clusters will therefore waste disk space; such wasted disk space is called slack space. For cluster sizes that are small relative to the average file size, the wasted space per file will be statistically about half of the cluster size; for cluster sizes that are large relative to the average file size, the wasted space per file can approach the full cluster size. However, a larger cluster size reduces bookkeeping overhead and fragmentation, which may improve reading and writing speed overall. Typical cluster sizes range from 1 sector (512 B) to 128 sectors (64 KiB).
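The slack-space arithmetic can be sketched in a few lines of Python. The function name `slack_space` and the sample file size are assumptions chosen for illustration; real file systems may store very small files differently.

```python
def slack_space(file_size: int, cluster_size: int) -> int:
    """Bytes allocated to a file but not occupied by its contents."""
    if file_size < 0 or cluster_size <= 0:
        raise ValueError("file_size must be >= 0 and cluster_size must be > 0")
    # A file always occupies a whole number of clusters (ceiling division).
    clusters = -(-file_size // cluster_size)
    return clusters * cluster_size - file_size


if __name__ == "__main__":
    sector_size = 512
    file_size = 1000  # hypothetical small file
    for cluster_size in (512, 4096, 65536):
        sectors_per_cluster = cluster_size // sector_size
        print(cluster_size, sectors_per_cluster, slack_space(file_size, cluster_size))
```

For the 1000-byte file, the slack grows from 24 bytes with 512-byte clusters to 3096 bytes with 4 KiB clusters and 64536 bytes with 64 KiB clusters.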

A cluster need not be physically contiguous on the disk; it may span more than one track or, if sector interleaving is used, may even be discontiguous within a track. This should not be confused with fragmentation, as the sectors are still logically contiguous.

A "lost cluster" occurs when a file is deleted from the directory listing, but the File Allocation Table (FAT) still shows the clusters allocated to the file. [14]

The term cluster was changed to allocation unit in DOS 4.0. However, the term cluster is still widely used. [15]

Zone bit recording

If a sector is defined as the intersection between a radius and a track, as was the case with early hard drives and most floppy disks, the sectors towards the outside of the disk are physically longer than those nearer the spindle. Because each sector still contains the same number of bytes, the outer sectors have lower bit density than the inner ones, which is an inefficient use of the magnetic surface. The solution is zone bit recording (also called zoned bit recording), wherein the disk is divided into zones, each encompassing a small number of contiguous tracks. Each zone is then divided into sectors such that each sector has a similar physical size. Because outer zones have a greater circumference than inner zones, they are allocated more sectors. [16]

A consequence of zone bit recording is that contiguous reads and writes are noticeably faster on outer tracks (corresponding to lower block addresses) than on inner tracks, as more bits pass under the head with each rotation; this difference can be 25% or more.
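The effect on throughput follows directly from the number of sectors passing under the head per revolution. The sketch below uses made-up zone layouts (1000 and 800 sectors per track at 7200 rpm) purely for illustration, and it ignores seek time, head switches and other overheads.

```python
def sustained_rate(sectors_per_track: int, sector_size: int, rpm: float) -> float:
    """Approximate sequential transfer rate, in bytes per second, within one track."""
    revolutions_per_second = rpm / 60.0
    return sectors_per_track * sector_size * revolutions_per_second


if __name__ == "__main__":
    # Hypothetical zone layouts, not taken from any real drive.
    outer = sustained_rate(sectors_per_track=1000, sector_size=512, rpm=7200)
    inner = sustained_rate(sectors_per_track=800, sector_size=512, rpm=7200)
    print(f"outer zone: {outer / 1e6:.1f} MB/s")
    print(f"inner zone: {inner / 1e6:.1f} MB/s")
    print(f"outer zone is {outer / inner - 1:.0%} faster")
```

Because the rotation speed is constant, the ratio of the two rates equals the ratio of sectors per track, so in this example the outer zone is 25% faster.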

Advanced Format

In 1998, the traditional 512-byte sector size was identified as one impediment to increasing capacity, which at that time was growing at a rate exceeding Moore's Law. Increasing the length of the data field through the implementation of Advanced Format using 4096-byte sectors removed this impediment; it increased the efficiency of the data surface area by five to thirteen percent while increasing the strength of the ECC, which in turn allowed higher capacity. The format was standardized by an industry consortium in 2005 and by 2011 was incorporated in all new products of all hard drive manufacturers.

References

  1. Hamington, Suzie (2004-01-01). Computer Science. Lotus Press. p. 42. ISBN 9788189093242.
  2. Tucker, Allen B. (2004-06-28). Computer Science Handbook, Second Edition. CRC Press. p. 86. ISBN 9780203494455.
  3. 305 RAMAC Random Access Method of Accounting and Control Manual of Operation (PDF). IBM. 1957.
  4. IBM 1301, Models 1 and 2, Disk Storage and IBM 1302, Models 1 and 2, Disk Storage with IBM 7090, 7094, and 7094 II Data Processing Systems (PDF). IBM. A22-6785.
  5. IBM 1301, Models 1 and 2, Disk Storage and IBM 1302, Models 1 and 2, Disk Storage with IBM 1410 and 7010 Data Processing Systems (PDF). IBM. A22-6788.
  6. Technical Data - Series 4000 Disk File (PDF). Bryant Computer Products. 1963.
  7. Porter, James (October 1988). "Rigid Magnetic Disk Drive Specifications". 1988 DISK/TREND REPORT, RIGID DISK DRIVES. DISK/TREND, Inc. pp. 63, 122.
  8. Porter, James (June 1997). "Rigid Magnetic Disk Drive Specifications". 1997 DISK/TREND REPORT, RIGID DISK DRIVES. DISK/TREND, Inc.
  9. Reference Manual for IBM 3330 Series Disk Storage (PDF). IBM. March 1974. GA26-1615-3.
  10. "The Advent of Advanced Format". IDEMA. Retrieved 2013-11-18.
  11. Skinner, Heather (29 June 2010). "IDEMA launches 'Are you ready?' campaign to prepare industry for Hard Disk Drive sector format change" (PDF). www.idema.org. Archived from the original on 14 December 2020. Retrieved 14 December 2020.
  12. "Difference between block size and cluster size". unix.stackexchange.com. Retrieved 2015-12-13.
  13. "Disk Sector and Block Allocation For File". stackoverflow.com. Retrieved 2015-12-13.
  14. "Errors Caused by Cross-Linked Files or Lost Clusters". Archived from the original on 2015-03-06. Retrieved 2020-08-03.
  15. Mueller, Scott (2002). Upgrading and Repairing PCs. p. 1354. ISBN 0-7897-2745-5.
  16. Kern Wong (January 1989), DP8459 Zoned Bit Recording (PDF), National Semiconductor, archived from the original (PDF) on 2011-06-15, retrieved 2010-03-10