Error recovery control

In computing, error recovery control (ERC) (Western Digital: time-limited error recovery (TLER), Samsung/Hitachi: command completion time limit (CCTL)) is a feature of hard disks that allows a system administrator to configure the amount of time a drive's firmware is allowed to spend recovering from a read or write error. Limiting the recovery time allows for improved error handling in hardware or software RAID environments. In some cases the hard drive and the RAID implementation conflict over which of them should handle an error, which leads to drives being marked as unusable and to significant performance degradation that could otherwise have been avoided.

Overview

Modern hard drives feature an ability to recover from some read/write errors by internally remapping sectors and performing other forms of self-test and recovery. The process for this can sometimes take several seconds or (under heavy usage) minutes, during which time the drive is unresponsive. Hardware RAID controllers and software RAID implementations are designed to recognise a drive which does not respond within a few seconds, and mark it as unreliable, indicating that it should be withdrawn from use and the array rebuilt from parity data. This is a long process, degrades performance, and if more drives fail under the resulting additional workload, it may be catastrophic.

If the drive itself is inherently reliable but has some bad sectors, then TLER and similar features prevent a disk from being unnecessarily marked as 'failed' by limiting the time spent on correcting detected errors before advising the array controller of a failed operation. The array controller can then handle the data recovery for the limited amount involved, rather than marking the entire drive as faulty.

Typical defaults

Effectively, TLER and similar features cap the time spent on on-drive error handling, so that a hardware RAID controller or software RAID implementation can handle the error itself when on-drive recovery takes too long.

Generally, Western Digital enterprise drives such as Raptor, Caviar RE2 and RE2-GP (RAID Edition) come with TLER Read "Enabled" (7 seconds) and TLER Write "Enabled" (7 seconds) while desktop drives such as Caviar SE, SE16, and GP come with TLER Read and Write Disabled (configured as 0 seconds, to disable).

Standalone vs. RAID considerations

It is best for TLER to be "enabled" when a drive is in a RAID array, to prevent the recovery time from a disk read or write error from exceeding the RAID implementation's timeout threshold. If a drive times out, the hard disk will need to be manually re-added to the array, requiring a rebuild and re-synchronization of the hard disk. Enabling TLER seeks to prevent this by interrupting error correction before the timeout, so that failures are reported only for the affected data segments. The result is increased reliability in a RAID array.

In a stand-alone configuration TLER should be disabled. As the drive is not redundant, reporting segments as failed will only increase manual intervention. Without a hardware RAID controller or a software RAID implementation to drop the disk, normal (no TLER) recovery ability is most stable.

In a software RAID configuration, whether TLER is helpful depends on the operating system. For example, in FreeBSD the ATA/CAM stack controls the timeouts, and is set to progressively increase the timeouts as they occur. Thus, if a desktop disk without TLER starts delaying a response to a sector read, FreeBSD will retry the read with successively longer timeouts to prevent prematurely dropping the disk out of the array.

| Model | TLER default (read / write) | Stand-alone recommendation | RAID recommendation |
| --- | --- | --- | --- |
| Caviar, SE, SE16, GP, Raptor | Disabled (0s / 0s) | Default | Enabled (if possible) |
| Caviar RE2, RE2-GP, Red | Enabled (7s / 7s) | Disabled | Default |

ZFS

The ZFS filesystem was written to immediately write data to a sector that reports as bad or takes an excessively long time to read (such as non-TLER drives); this will usually force an immediate sector remap on a weak sector in most drives.[ citation needed ]

RAID controllers

Disconnect timeout values vary between hardware RAID controller vendors, so TLER should be configured to trigger before the controller times out the drive. For example, the 3ware 9650SE uses a 20-second timeout, [1] while the LSI Logic controller used in IBM x-series servers uses 10 seconds. [2]
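The rule in this section can be expressed as a simple comparison. The following is a minimal sketch, not any vendor's logic: the function name `erc_fits_controller` and the dictionary are hypothetical, and the two timeout values are the ones quoted in this section.

```python
# Disconnect timeouts quoted in this section, in seconds.
CONTROLLER_TIMEOUTS_S = {
    "3ware 9650SE": 20.0,
    "LSI Logic (IBM x-series)": 10.0,
}


def erc_fits_controller(erc_limit_s: float, controller_timeout_s: float) -> bool:
    """Return True if the drive abandons recovery before the controller
    would drop it.

    A limit of 0 is treated as "ERC disabled" (unbounded recovery time),
    which can never be guaranteed to fit inside a controller timeout.
    """
    if erc_limit_s <= 0:
        return False
    return erc_limit_s < controller_timeout_s


# Check the common 7-second setting against each controller listed above.
for name, timeout_s in CONTROLLER_TIMEOUTS_S.items():
    print(name, erc_fits_controller(7.0, timeout_s))
```

With a 7-second limit the check passes for both controllers, while a drive with ERC disabled fails it for any timeout.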

Widely available Intel Matrix RAID / Intel Rapid Storage Technology, embedded in Intel server motherboards and modern desktop motherboards, is a pseudo-hardware controller, not a true hardware RAID controller.

Software RAID

Linux mdadm simply holds and lets the drive complete its recovery – however, the default command timeout for the SCSI Disk layer (/sys/block/sd?/device/timeout) is 30 seconds, [3] after which it will attempt to reset the drive, and if that fails, put the drive offline. [4]
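As an illustration, the per-device command timeout can be read from the sysfs path named above. This is a sketch that assumes a Linux system; the helper name `read_command_timeout` and its `sys_block` parameter (which exists only so the function can be pointed at a test directory) are hypothetical.

```python
from pathlib import Path


def read_command_timeout(device: str, sys_block: Path = Path("/sys/block")) -> int:
    """Return the SCSI disk layer command timeout, in seconds, for a block
    device name such as 'sda', by reading <sys_block>/<device>/device/timeout."""
    timeout_file = sys_block / device / "device" / "timeout"
    return int(timeout_file.read_text().strip())


if __name__ == "__main__":
    # List the timeout of every sd* device; prints nothing if none exist.
    for dev in sorted(Path("/sys/block").glob("sd*")):
        try:
            print(dev.name, read_command_timeout(dev.name), "s")
        except (OSError, ValueError):
            print(dev.name, "timeout not readable")
```

A drive whose error recovery can run longer than the value returned here risks being reset or taken offline before it answers.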

Changing ERC

ATA-8 standard

The 2006 ATA-8 standard defines an SCT Error Recovery Control command. [5] For hard drives that implement this interface, the smartctl utility (part of the smartmontools package) can be used to change the error-recovery timeout via -l scterc. [6] In 2018, ACS-4 added functionality for the setting to persist across reboots; it is now supported by smartctl. [7]
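A minimal sketch of how such a smartctl invocation could be assembled is given here. It only builds the argument list and does not run anything. The helper name `scterc_command` and the device path `/dev/sdX` are placeholders; the conversion relies on smartctl taking the read and write limits in deciseconds (units of 100 ms), with 0 disabling the feature, as described in its manual page.

```python
from typing import List


def scterc_command(device: str, read_s: float, write_s: float) -> List[str]:
    """Build a smartctl argument list that sets the SCT ERC read and write
    limits. smartctl expects deciseconds, so seconds are multiplied by 10."""
    read_ds = round(read_s * 10)
    write_ds = round(write_s * 10)
    if read_ds < 0 or write_ds < 0:
        raise ValueError("time limits must be non-negative")
    return ["smartctl", "-l", f"scterc,{read_ds},{write_ds}", device]


# A 7-second limit for both reads and writes on a placeholder device.
print(" ".join(scterc_command("/dev/sdX", 7.0, 7.0)))
# prints: smartctl -l scterc,70,70 /dev/sdX
```

The resulting list can be passed to `subprocess.run` by a caller with sufficient privileges; whether the drive accepts the setting depends on its firmware.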

Controlling the timeout behavior through the smartctl utility may not work on all hard disk drives because some manufacturers have changed their desktop drives not to include the support for the ERC parameter, [8] [9] purportedly to force sales of their more expensive RAID/enterprise models.[ citation needed ] Richard Gregory, who wrote the original ERC patch for smartctl, reports that Western Digital retracted ERC support by releasing a new model without notice. [6]

On Windows, the HDAT2 program is available in addition to smartctl (which is cross-platform). [6]

SCSI standard

SBC-4 describes a RECOVERY TIME LIMIT field in the Read-Write Error Recovery mode page used to define how the drive performs error recovery. [10] The sdparm program can change this setting with --set=RTL. [11]
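The corresponding sdparm invocation can be sketched the same way. The helper name `rtl_command` and the device path are placeholders, and two details are assumptions to verify against SBC-4 and the sdparm manual page: that the field is expressed in milliseconds, and that it is 16 bits wide (hence the range check).

```python
from typing import List

# Assumption: RECOVERY TIME LIMIT is a 16-bit field, so 65535 is the maximum.
RTL_MAX = 0xFFFF


def rtl_command(device: str, limit_ms: int) -> List[str]:
    """Build an sdparm argument list that sets the RTL field.

    limit_ms is assumed to be in milliseconds. Nothing is executed here.
    """
    if not 0 <= limit_ms <= RTL_MAX:
        raise ValueError("limit_ms must be between 0 and 65535")
    return ["sdparm", f"--set=RTL={limit_ms}", device]


# A 7-second limit on a placeholder device.
print(" ".join(rtl_command("/dev/sdX", 7000)))
# prints: sdparm --set=RTL=7000 /dev/sdX
```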

Vendor utilities

Western Digital

A WDTLER.EXE utility allows the enabling or disabling of the TLER parameter on Western Digital hard drives. This utility is written for DOS. The utility works on and makes changes to all compatible Western Digital hard disk drives connected to the computer. The change survives power-cycling. [6] Western Digital used to mention the tool in an FAQ. [12]

The utility comes with three batch files: TLERSCAN.BAT to get the current state of the TLER setting on all the hard drives, TLER-ON.BAT to enable TLER, and TLER-OFF.BAT to disable TLER. The included TLER-ON.BAT sets the read and write TLER time to seven seconds. It is also possible to use the WDTLER.EXE utility directly with the -r# -w# parameters for a custom timeout.

Western Digital claims that using the WDTLER.EXE utility on newer drives can damage the firmware and make the disk unusable. The utility is no longer available from Western Digital, and new drives will not be able to have the TLER setting changed. RE disks are only suitable for RAID arrays and Caviar disks are only suitable for non-RAID use. The utility still works for older drives[ which? ].

Hitachi

Hitachi customer service stated in 2009 that there is a Feature Tool for changing ERC (referred to as CCTL). [12]

References

  1. "User Guide for 9650SE 9690SA from 9.5.2 Complete Codeset". lsi.com. Archived from the original on 3 February 2012. Retrieved 10 June 2015.
  2. Available in BIOS Raid Config Utility > Advanced Device Properties
  3. "linux/sd.h at master · torvalds/linux · GitHub". GitHub.
  4. "Linux SCSI Subsystem: SCSI EH". kernel.org.
  5. ATA/ATAPI Command Set (ATA8-ACS)
  6. Richard Gregory. "Author's description of the original patch to smartctl that implemented that feature". Archived from the original on 2013-09-10. Retrieved 2013-02-15.
  7. "#1427 (Add support for SCT Error Recovery Timer features added in ACS-4) – smartmontools". www.smartmontools.org.
  8. "Re: md RAID with enterprise-class SATA or SAS drives". spinics.net.
  9. "Seagate FAQ: What is Error Recovery Control?". seagate.com.
  10. "INCITS 506-202x - Information technology - SCSI Block Commands - 4 (SBC-4) draft revision 22". 15 September 2020. Retrieved 22 May 2023.
  11. sdparm(8)    Linux Programmer's Manual – Administration and Privileged Commands
  12. "TLER / CCTL / ERC thread". [H]ard|Forum. 16 November 2010.