Bad sector

Last updated December 13, 2024

In computing, a bad sector is a sector on a disk storage unit that can no longer be read. Once a sector is damaged, all information stored on it is lost. When a bad sector is found and marked, the operating system (for example Windows or Linux) skips it in the future. Bad sectors are also a threat to information security in the sense of data remanence.

Details

Bad sectors can be "soft" (logical) or "hard" (physical), depending on what makes the sector inaccessible. After a power loss, bit rot (more likely on floppy disks), or a firmware issue, the data on disk can be corrupted beyond what the error-correcting code (ECC) can repair. This is a "soft" bad sector: the medium itself is intact, so writing over the corruption would succeed.^[1]

Physically damaged sectors, on the other hand, cannot be restored: writing to them fails, which forces a remap. A new drive may ship with a few innocuous bad sectors caused by manufacturing flaws. Larger patches appear during use, due to head crashes, wear and tear, physical shock, or dust intrusion.^[1]^[2]

On solid-state drives, flash wear or flash controller error may also cause bad sectors.^[3]

Handling

Operating system

Bad sectors may be detected by the operating system or the disk controller. Most file systems contain provisions for sectors to be marked as bad, so that the operating system avoids them in the future. Disk diagnostic utilities, such as CHKDSK (on Microsoft Windows), Disk Utility (on macOS), or badblocks (on Linux), can actively look for bad sectors upon user request.
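As a rough illustration of what such a read-only scan involves, the following Python sketch reads a device or image file one sector at a time and records the sectors for which the operating system reports an I/O error. The function name `find_unreadable_sectors` and the 512-byte sector size are assumptions made for this example; real utilities such as badblocks are considerably more careful (for instance, they bypass caches and can run write tests).

```python
import os

SECTOR_SIZE = 512  # assumed logical sector size; many drives use 4096


def find_unreadable_sectors(path, sector_size=SECTOR_SIZE):
    """Return the indices of sectors that raise an I/O error when read.

    Hypothetical helper for illustration only. It never writes, so it
    reports bad sectors but does not trigger any remapping.
    """
    bad = []
    fd = os.open(path, os.O_RDONLY)
    try:
        # Seeking to the end yields the size of a file or block device.
        total = os.lseek(fd, 0, os.SEEK_END) // sector_size
        for index in range(total):
            try:
                os.lseek(fd, index * sector_size, os.SEEK_SET)
                os.read(fd, sector_size)
            except OSError:
                # The kernel could not obtain this sector from the drive.
                bad.append(index)
    finally:
        os.close(fd)
    return bad
```

Run against an ordinary file or a healthy disk image, the function returns an empty list. Reading a raw device usually requires administrator privileges.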

With the advent of SMART-enabled disk controllers (see below), the burden of avoiding bad sectors more commonly falls to the disk.^[4] Some newer file systems such as Btrfs and ZFS do not have a bad-block avoidance feature at all.^[5] Software tools that look for bad blocks still have a use case: by issuing writes at detected bad sectors, one can expedite the remapping process, avoiding further attempts at reading the bad sector.^[6]

Disk controller

When the firmware of a modern (post-1990) disk controller finds a sector to be bad or unstable, it remaps the logical sector to a different physical sector. Typically, automatic remapping only happens when a sector is written to; sectors that merely failed a read remain marked "pending". In the normal operation of a hard drive, the detection and remapping of bad sectors should be transparent to the rest of the system and should take place before data is lost. There are two types of remapping by disk hardware: P-LIST (mapping during factory production tests) and G-LIST (mapping during consumer usage by disk microcode).^[4]
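The pending-then-remap behaviour can be pictured with a toy model. The Python sketch below is an illustration only, not how any real firmware is implemented; the class `ToyDrive` and all of its attributes are invented for this example. A failed read marks the logical sector as pending, and the next write to that sector moves it to a spare.

```python
class ToyDrive:
    """Toy model of controller remapping (hypothetical, for illustration)."""

    def __init__(self, sectors, spares):
        self.data = {}         # physical sector -> stored bytes
        self.remap = {}        # logical sector -> spare physical sector
        self.pending = set()   # logical sectors that failed a read
        self.damaged = set()   # physical sectors that are unusable
        # Spare sectors live past the end of the user-visible area.
        self.free_spares = list(range(sectors, sectors + spares))

    def _physical(self, lba):
        return self.remap.get(lba, lba)

    def read(self, lba):
        phys = self._physical(lba)
        if phys in self.damaged:
            # A failed read does not remap; the sector is only flagged.
            self.pending.add(lba)
            raise IOError(f"unrecoverable read error at sector {lba}")
        return self.data.get(phys, b"\x00")

    def write(self, lba, payload):
        phys = self._physical(lba)
        if phys in self.damaged:
            if not self.free_spares:
                raise IOError("no spare sectors left")
            # The write fails on the damaged sector, forcing a remap.
            phys = self.free_spares.pop(0)
            self.remap[lba] = phys
        self.pending.discard(lba)
        self.data[phys] = payload
```

In this model, marking physical sector 3 as damaged makes `read(3)` fail and adds 3 to `pending`; a subsequent `write(3, ...)` stores the data in the first spare, after which reads of logical sector 3 succeed again.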

Utilities can read the Self-Monitoring, Analysis, and Reporting Technology (SMART) information to tell how many sectors have been reallocated, and how many spare sectors the drive may still have.^[7] Because reads and writes to sectors on the G-LIST are automatically redirected (remapped) to spare sectors, access to the drive slows down even if the data on it is defragmented. Once the G-LIST is full, the storage unit must be replaced.^[8]^[9]
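As a sketch of how a utility might extract these counters, the Python example below parses an attribute table in the layout printed by `smartctl -A` (from smartmontools) for ATA drives. The sample text is invented for illustration and does not come from a real drive. Attribute IDs 5 and 197 are the IDs commonly used for reallocated and pending sectors, but SMART attributes are vendor-specific, so treating them as universal is an assumption; the names `WATCHED` and `parse_smart_attributes` are also hypothetical.

```python
# Commonly used attribute IDs (assumption: the vendor follows this convention).
WATCHED = {5: "reallocated", 197: "pending"}

# Invented sample in the column layout of `smartctl -A`; not real drive data.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       8
  9 Power_On_Hours          0x0032   095   095   000    Old_age   Always       -       4312
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       2
"""


def parse_smart_attributes(text):
    """Return raw counts for the watched attributes found in the table."""
    counts = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have ten columns.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        attr_id = int(fields[0])
        if attr_id in WATCHED and fields[9].isdigit():
            counts[WATCHED[attr_id]] = int(fields[9])
    return counts


print(parse_smart_attributes(SAMPLE))
```

A rising reallocated count, or a pending count that does not return to zero, is the kind of signal such a utility would surface to the user.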

Compared to ATA, the SCSI command set allows finer-grained management of bad sectors. Users can read the G-LIST, control whether automatic remap is performed, and use a dedicated command REASSIGN BLOCKS to manually remap if needed. The command set also provides a way to perform low-level format with FORMAT UNIT.^[10]

Manipulation methods

The Windows program ATATool can be used to create deliberate "soft" bad sectors by manipulating the error correction code (ECC) data of a sector. This can be used to verify bad sector support in disk utilities and forensic tools. For instance, to make sector 10 bad:

ATATOOL /BADECC:10 \\.\PhysicalDrive1

hdparm has a --make-bad-sector command that works similarly. For newer drives, it can alternatively use WRITE_UNCORRECTABLE_EXT to "flag" a sector as bad: the drive controller would not attempt to read, but fail immediately.^[6]

Frequency

In a 2007 study, researchers from the University of Wisconsin–Madison and NetApp observed 1.53 million hard drives from 30 models over 32 months and analyzed the drive read errors returned. They noted that 3.5% of drives developed a "latent read error" (i.e. an unreadable bad sector), and that a disk with one bad sector is more likely to develop more. Bad sectors cluster both spatially (within a 10 MB neighborhood) and temporally. Errors recovered by ECC, which are reported by enterprise drives (using the SCSI command set), also suggest a higher chance of a bad sector in the future.^[11]
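To put these figures in perspective, the short Python calculation below converts them into absolute numbers. It assumes decimal megabytes and 512-byte sectors, neither of which is stated above, and the helper `same_neighborhood` is a hypothetical name used only for this sketch.

```python
DRIVES = 1_530_000             # drives observed in the study
AFFECTED_FRACTION = 0.035      # share that developed a latent read error
NEIGHBORHOOD_BYTES = 10 * 10**6  # 10 MB, assuming decimal megabytes
SECTOR_SIZE = 512              # assumed sector size in bytes

# Approximate number of drives with at least one unreadable sector.
affected = round(DRIVES * AFFECTED_FRACTION)

# Size of the clustering neighborhood expressed in sectors.
neighborhood_sectors = NEIGHBORHOOD_BYTES // SECTOR_SIZE


def same_neighborhood(lba_a, lba_b):
    """True if two sector addresses lie within one 10 MB neighborhood."""
    return abs(lba_a - lba_b) <= neighborhood_sectors


print(affected, neighborhood_sectors)
```

Under these assumptions, roughly 53,550 of the observed drives developed a bad sector, and the neighborhood in which further bad sectors tend to appear spans about 19,531 sectors.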


References

[1] Zhang (2 March 2018). "Hard vs Soft Bad Sectors in HDD: Different Causes and Solutions". Data Recovery Blog.
[2] Chris Hoffman (5 July 2017). "Bad Sectors Explained: Why Hard Drives Get Bad Sectors and What You Can Do About It". How-To Geek.
[3] "Question - should i rma my 980 pro". Tom's Hardware Forum. 14 February 2023. Retrieved 22 July 2024.
[4] "Bad Sector Remapping". mjm.co.uk. Archived from the original on 10 March 2018. Retrieved 9 March 2018.
[5] "badblocks - Can btrfs track / avoid bad blocks?". Unix & Linux Stack Exchange.
[6] hdparm(8) – Linux Programmer's Manual – Administration and Privileged Commands. "--make-bad-sector Deliberately create a bad sector (aka. "media error") on the disk. [...] Note also that the --repair-sector option can be used to restore (any) bad sectors when they are no longer needed, including sectors that were genuinely bad (the drive will likely remap those to a fresh area on the media). --write-sector: This can be used to force a drive to repair a bad sector (media error)."
[7] "Monitoring Hard Disks with SMART". Linux Journal, 2004.
[8] "Encyclopedia". PCMag.com. Ziff Davis.
[9] Stephens, Curtis E, ed. (11 December 2006), Information technology - AT Attachment 8 - ATA/ATAPI Command Set (ATA8-ACS), working draft revision 3f (PDF), ANSI INCITS, pp. 198–213, 327–344, archived from the original (PDF) on 30 July 2007.
[10] "INCITS 506-202x - Information technology - SCSI Block Commands - 4 (SBC-4) draft revision 22". 15 September 2020. Retrieved 22 May 2023.
[11] Lakshmi N. Bairavasundaram; Garth R. Goodson; Shankar Pasupathy; Jiri Schindler (June 2007). "An analysis of latent sector errors in disk drives". Proceedings of the 2007 ACM SIGMETRICS international conference on Measurement and modeling of computer systems. San Diego, California, United States: ACM. pp. 289–300. CiteSeerX 10.1.1.63.1412. doi:10.1145/1254882.1254917. ISBN 9781595936394. S2CID 14164251. Retrieved 9 June 2012.

External links

Bad Blocks Definition

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
