Self-Monitoring, Analysis and Reporting Technology

An example of software that shows the health of the drive and its smart attributes. This 8TB Toshiba Hard Drive appears to be in perfect condition.
Another example of software that shows the health of the drive and its smart attributes. This Intel 120GB SSD also appears to be in perfect condition.

Self-Monitoring, Analysis, and Reporting Technology (S.M.A.R.T. or SMART) is a monitoring system included in computer hard disk drives (HDDs) and solid-state drives (SSDs). [3] Its primary function is to detect and report various indicators of drive reliability, with the intent of anticipating imminent hardware failures. [4] [5]

When S.M.A.R.T. data indicates a possible imminent drive failure, software running on the host system may notify the user so that action can be taken to prevent data loss and the failing drive can be replaced. [6]

Background

Hard disk and other storage drives are subject to failures (see hard disk drive failure) which can be classified into two basic classes:

Mechanical failures account for about 60% of all drive failures. [7] While the eventual failure may be catastrophic, most mechanical failures result from gradual wear and there are usually certain indications that failure is imminent. These may include increased heat output, increased noise level, problems with reading and writing of data, or an increase in the number of damaged disk sectors.

PCTechGuide's page on S.M.A.R.T. (2003) comments that the technology has gone through three phases: [8]

In its original incarnation, S.M.A.R.T. provided failure prediction by monitoring certain online hard drive activities. A subsequent version of the standard improved failure prediction by adding an automatic off-line read scan to monitor additional operations. Online attributes are updated continuously, while offline attributes are updated when the drive is idle. If the offline attributes must be updated immediately, the drive slows down while the update takes place. The latest S.M.A.R.T. technology not only monitors hard drive activities but also adds failure prevention by attempting to detect and repair sector errors. In addition, while earlier versions of the technology only monitored hard drive activity for data that was retrieved by the operating system, the latest version tests all data and all sectors of a drive by using "off-line data collection" to confirm the drive's health during periods of inactivity.

Accuracy

A field study at Google [9] covering over 100,000 consumer-grade drives from December 2005 to August 2006 found correlations between certain S.M.A.R.T. information and annualized failure rates:

History and predecessors

An early hard disk monitoring technology was introduced by IBM in 1992 in its IBM 9337 Disk Arrays for AS/400 servers using IBM 0662 SCSI-2 disk drives. [11] It was later named Predictive Failure Analysis (PFA) technology. It measured several key device health parameters and evaluated them within the drive firmware. Communications between the physical unit and the monitoring software were limited to a binary result: either "device is OK" or "drive is likely to fail soon".

Later, another variant, which was named IntelliSafe, was created by computer manufacturer Compaq and disk drive manufacturers Seagate, Quantum, and Conner. [12] The disk drives would measure the disk's "health parameters", and the values would be transferred to the operating system and user-space monitoring software. Each disk drive vendor was free to decide which parameters were to be included for monitoring, and what their thresholds should be. The unification was at the protocol level with the host.

Compaq submitted IntelliSafe to the Small Form Factor (SFF) committee for standardization in early 1995. [13] It was supported by IBM, by Compaq's development partners Seagate, Quantum, and Conner, and by Western Digital, which did not have a failure prediction system at the time. The Committee chose IntelliSafe's approach, as it provided more flexibility. Compaq placed IntelliSafe into the public domain on 12 May 1995. [14] The resulting jointly developed standard was named S.M.A.R.T.

That SFF standard described a communication protocol for an ATA host to use and control monitoring and analysis in a hard disk drive, but did not specify any particular metrics or analysis methods. Later, "S.M.A.R.T." came to be understood (though without any formal specification) to refer to a variety of specific metrics and methods and to apply to protocols unrelated to ATA for communicating the same kinds of things.

Provided information

The technical documentation for S.M.A.R.T. is in the AT Attachment (ATA) standard. First introduced in 1994, [15] the ATA standard has gone through multiple revisions. Some parts of the original S.M.A.R.T. specification by the Small Form Factor (SFF) Committee were added to ATA-3, [16] published in 1997. In 1998, ATA-4 dropped the requirement for drives to maintain an internal attribute table and instead required only that an "OK" or "NOT OK" value be returned. [16] However, manufacturers have kept the capability to retrieve the attributes' values. The most recent ATA standard, ATA-8, was published in 2004. [17] It has undergone regular revisions, [18] the latest being in 2011. [19] Standardization of similar features on SCSI is scarcer, and the SCSI standards do not use the S.M.A.R.T. name, although vendors and consumers alike refer to these similar features as S.M.A.R.T. too. [20]

The most basic information that S.M.A.R.T. provides is the S.M.A.R.T. status. It provides only two values: "threshold not exceeded" and "threshold exceeded". Often, these are represented as "drive OK" or "drive fail" respectively. A "threshold exceeded" value is intended to indicate that there is a relatively high probability that the drive will not be able to honor its specification in the future: that is, the drive is "about to fail". The predicted failure may be catastrophic or may be something as subtle as the inability to write to certain sectors, or perhaps slower performance than the manufacturer's declared minimum.

The S.M.A.R.T. status does not necessarily indicate the drive's past or present reliability. If a drive has already failed catastrophically, the S.M.A.R.T. status may be inaccessible. Alternatively, if a drive has experienced problems in the past, but the sensors no longer detect such problems, the S.M.A.R.T. status may, depending on the manufacturer's programming, suggest that the drive is now healthy.

The inability to read some sectors is not always an indication that a drive is about to fail. One way that unreadable sectors may be created, even when the drive is functioning within specification, is through a sudden power failure while the drive is writing. Also, even if the physical disk is damaged at one location, such that a certain sector is unreadable, the disk may be able to use spare space to replace the bad area, so that the sector can be overwritten. [21]

More detail on the health of the drive may be obtained by examining the S.M.A.R.T. attributes. S.M.A.R.T. attributes were included in some drafts of the ATA standard, but were removed before the standard became final. The meaning and interpretation of the attributes vary between manufacturers, and are sometimes considered a trade secret by one manufacturer or another. Attributes are further discussed below. [22]

Drives with S.M.A.R.T. may optionally maintain a number of 'logs'. The error log records information about the most recent errors that the drive has reported back to the host computer. Examining this log may help one to determine whether computer problems are disk-related or caused by something else (error log timestamps may "wrap" after 2^32 ms = 49.71 days [23]).
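The wrap-around interval quoted above follows from simple arithmetic. The sketch below assumes the error log timestamp behaves as an unsigned 32-bit millisecond counter, which is what the 49.71-day figure implies; the function names are illustrative and not part of any S.M.A.R.T. utility.

```python
MS_PER_DAY = 24 * 60 * 60 * 1000  # milliseconds in one day


def wrap_period_days(bits: int = 32) -> float:
    """Days until a millisecond counter of the given width wraps to zero."""
    return (2 ** bits) / MS_PER_DAY


def elapsed_ms(earlier: int, later: int, bits: int = 32) -> int:
    """Elapsed milliseconds between two timestamps, tolerating one wrap.

    Assumption: fewer than 2**bits milliseconds passed between readings.
    """
    return (later - earlier) % (2 ** bits)
```

Under these assumptions, `round(wrap_period_days(), 2)` gives 49.71, and two log entries stamped 4294967000 and 704 are 1000 ms apart even though the second number is smaller.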

A drive that implements S.M.A.R.T. may optionally implement a number of self-test or maintenance routines, and the results of the tests are kept in the self-test log. The self-test routines may be used to detect any unreadable sectors on the disk, so that they may be restored from back-up sources (for example, from other disks in a RAID). This helps to reduce the risk of incurring permanent loss of data.

Standards and implementation

Lack of common interpretation

Many motherboards display a warning message upon boot when a disk drive is approaching failure. Although an industry standard exists among most major hard drive manufacturers, [8] issues remain because some attributes are intentionally left undocumented to the public in order to differentiate models between manufacturers. [24] [22] From a legal perspective, the term "S.M.A.R.T." refers only to a signaling method between internal disk drive electromechanical sensors and the host computer. Because of this, the details of a S.M.A.R.T. implementation are largely up to the vendor: while many attributes have been standardized between drive vendors, others remain vendor-specific. Implementations still differ and in some cases may lack "common" or expected features such as a temperature sensor, or may include only a few select attributes, while still allowing the manufacturer to advertise the product as "S.M.A.R.T. compatible." [22]

Visibility to host systems

Depending on the type of interface being used, some S.M.A.R.T.-enabled motherboards and related software may not communicate with certain S.M.A.R.T.-capable drives. For example, few external drives connected via USB and FireWire correctly send S.M.A.R.T. data over those interfaces. With so many ways to connect a hard drive (SCSI, Fibre Channel, ATA, SATA, SAS, SSA, NVMe and so on), it is difficult to predict whether S.M.A.R.T. reports will function correctly in a given system.

Even with a hard drive and interface that implements the specification, the computer's operating system may not see the S.M.A.R.T. information because the drive and interface are encapsulated in a lower layer. For example, they may be part of a RAID subsystem in which the RAID controller sees the S.M.A.R.T.-capable drive, but the host computer sees only a logical volume generated by the RAID controller.

On the Windows platform, many programs designed to monitor and report S.M.A.R.T. information will function only under an administrator account.

BIOS and Windows (Windows Vista and later) may detect S.M.A.R.T. status of hard disk drives and solid state drives, and give a prompt if the S.M.A.R.T. status is bad. [25]

In ATA

ATA S.M.A.R.T. attributes

Each drive manufacturer defines a set of attributes, [26] [27] and sets threshold values beyond which attributes should not pass under normal operation. [22] Each attribute has: [28]

In practice, however, the full "vendor-specific" field is not used as-is. Instead, one of the following occurs: [29]

In any case, the vendor field, also commonly called a "raw value", may be displayed as a decimal or hexadecimal number; its meaning is entirely up to the drive manufacturer (but often corresponds to counts or a physical unit, such as degrees Celsius or seconds). [30]

If one or more attributes have the "prefailure" flag, and the "current value" of such a prefailure attribute is smaller than or equal to its "threshold value" (unless the "threshold value" is 0), that will be reported as a "drive failure". In addition, utility software can send the SMART RETURN STATUS command to the ATA drive, which may report one of three statuses: "drive OK", "drive warning" or "drive failure".
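The failure rule described above can be sketched as follows. This is a minimal illustration rather than a drive or utility implementation: the `Attribute` class and its field names are hypothetical, and only the "drive failure" rule is modeled (the "drive warning" status is left out).

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Attribute:
    """Hypothetical container for one S.M.A.R.T. attribute entry."""
    attr_id: int
    prefailure: bool  # True if the attribute carries the "prefailure" flag
    current: int      # normalized "current value"
    threshold: int    # "threshold value"; 0 disables the check


def attribute_failed(attr: Attribute) -> bool:
    """Apply the rule: prefailure flag set, threshold non-zero, current <= threshold."""
    return attr.prefailure and attr.threshold != 0 and attr.current <= attr.threshold


def drive_status(attrs: Iterable[Attribute]) -> str:
    """Report "drive failure" if any attribute trips the rule, else "drive OK"."""
    return "drive failure" if any(attribute_failed(a) for a in attrs) else "drive OK"
```

For example, a prefailure attribute with current value 36 and threshold 36 trips the rule, while the same attribute with a threshold of 0 never does.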

Manufacturers that have implemented at least one S.M.A.R.T. attribute in various products include Samsung, Seagate, IBM (Hitachi), Fujitsu, Maxtor, Toshiba, Intel, sTec, Inc., Western Digital and ExcelStor Technology.

Known ATA S.M.A.R.T. attributes

The following chart lists some S.M.A.R.T. attributes and the typical meaning of their raw values. Normalized values are usually mapped so that higher values are better (exceptions include drive temperature and the number of head load/unload cycles [31]), but higher raw attribute values may be better or worse depending on the attribute and manufacturer. For example, the "Reallocated Sectors Count" attribute's normalized value decreases as the count of reallocated sectors increases. In this case, the attribute's raw value will often indicate the actual count of sectors that were reallocated, although vendors are in no way required to adhere to this convention.

As manufacturers do not necessarily agree on precise attribute definitions and measurement units, the following list of attributes is a general guide only.

A given drive does not support every attribute code (sometimes abbreviated as "ID", for "identifier", in tables). Some codes are specific to particular drive types (magnetic platter, flash, SSD). Drives may use different codes for the same parameter, e.g., see codes 193 and 225.

Legend
ID: the attribute code in decimal and hexadecimal notation, for example 193 (0xC1).
Ideal: "High" means a higher raw value is better; "Low" means a lower raw value is better.
Critical: denotes a critical attribute. Specific values may predict drive failure.

Each entry below lists the ID, the attribute name, the ideal direction and critical flag where applicable, and a description.
01 (0x01): Read Error Rate
Ideal: Low. Critical.
(Vendor specific raw value.) Stores data related to the rate of hardware read errors that occurred when reading data from a disk surface. The raw value has different structure for different vendors and is often not meaningful as a decimal number. For some drives, this number may increase during normal operation without necessarily signifying errors. [32] [33] [34]
02 (0x02): Throughput Performance
Ideal: High.
Overall (general) throughput performance of a hard disk drive. If the value of this attribute is decreasing there is a high probability that there is a problem with the disk.
03 (0x03): Spin-Up Time
Ideal: Low.
Average time of spindle spin up (from zero RPM to fully operational [milliseconds]).
04 (0x04): Start/Stop Count
A tally of spindle start/stop cycles. The spindle turns on, and hence the count is increased, both when the hard disk is turned on after having been turned entirely off (disconnected from power source) and when the hard disk returns from sleep mode. [35]
05 (0x05): Reallocated Sectors Count
Ideal: Low. Critical.
Count of reallocated sectors. The raw value represents a count of the bad sectors that have been found and remapped. [39] Thus, the higher the raw value, the more sectors the drive has had to reallocate. This value is primarily used as a metric of the life expectancy of the drive; a drive which has had any reallocations at all is significantly more likely to fail in the following months. [36] [40] If the raw value of attribute 0x05 is higher than its threshold value, that will be reported as "drive warning". [41]
06 (0x06): Read Channel Margin
Margin of a channel while reading data. The function of this attribute is not specified.
07 (0x07): Seek Error Rate
Ideal: Varies.
(Vendor specific raw value.) Rate of seek errors of the magnetic heads. If there is a partial failure in the mechanical positioning system, then seek errors will arise. Such a failure may be due to numerous factors, such as damage to a servo, or thermal widening of the hard disk. The raw value has different structure for different vendors and is often not meaningful as a decimal number. For some drives, this number may increase during normal operation without necessarily signifying errors. [32]
08 (0x08): Seek Time Performance
Ideal: High.
Average performance of seek operations of the magnetic heads. If this attribute is decreasing, it is a sign of problems in the mechanical subsystem.
09 (0x09): Power-On Hours
Count of hours in power-on state. The raw value of this attribute shows total count of hours (or minutes, or seconds, depending on manufacturer) in power-on state. [42]

"By default, the total expected lifetime of a hard disk in perfect condition is defined as 5 years (running every day and night on all days). This is equal to 1825 days in 24/7 mode or 43800 hours." [43]

On some pre-2005 drives, this raw value may advance erratically and/or "wrap around" (reset to zero periodically). [44] For some HDDs it might be stored as an unsigned 16-bit integer [citation needed], which would cause it to wrap around after 65535.

10 (0x0A): Spin Retry Count
Ideal: Low. Critical.
Count of retry of spin start attempts. This attribute stores a total count of the spin start attempts to reach the fully operational speed (under the condition that the first attempt was unsuccessful). An increase of this attribute value is a sign of problems in the hard disk mechanical subsystem.
11 (0x0B): Recalibration Retries or Calibration Retry Count
Ideal: Low.
This attribute indicates the number of times recalibration was requested (under the condition that the first attempt was unsuccessful). An increase of this attribute value is a sign of problems in the hard disk mechanical subsystem.
12 (0x0C): Power Cycle Count
This attribute indicates the count of full hard disk power on/off cycles.
13 (0x0D): Soft Read Error Rate
Ideal: Low.
Uncorrected read errors reported to the operating system.
22 (0x16): Current Helium Level
Ideal: High.
Specific to He8 drives from HGST. This value measures the helium inside of the drive specific to this manufacturer. It is a pre-fail attribute that trips once the drive detects that the internal environment is out of specification. [46]
23 (0x17): Helium Condition Lower
24 (0x18): Helium Condition Upper
Specific to MG07+ drives from Toshiba. These values measure the level of helium inside the drive, specific to this manufacturer. Each is a pre-fail attribute that trips once the drive detects that the internal environment is out of specification. [47]
170 (0xAA): Available Reserved Space
See attribute 232 (0xE8). [48]
171 (0xAB): SSD Program Fail Count
(Kingston) The total number of flash program operation failures since the drive was deployed. [49] Identical to attribute 181.
172 (0xAC): SSD Erase Fail Count
(Kingston) The total number of flash erase operation failures since the drive was deployed. Identical to attribute 182.
173 (0xAD): SSD Wear Leveling Count
Counts the maximum worst erase count on any block.
174 (0xAE): Unexpected Power Loss Count
Also known as "Power-off Retract Count" per conventional HDD terminology. Raw value reports the number of unclean shutdowns, cumulative over the life of an SSD, where an "unclean shutdown" is the removal of power without STANDBY IMMEDIATE as the last command (regardless of PLI activity using capacitor power). Normalized value is always 100. [50]
175 (0xAF): Power Loss Protection Failure
Last test result as microseconds to discharge cap, saturated at its maximum value. Also logs minutes since last test and lifetime number of tests. Raw value contains the following data:
  • Bytes 0-1: Last test result as microseconds to discharge cap, saturates at max value. Test result expected in range 25 <= result <= 5000000, lower indicates specific error code.
  • Bytes 2-3: Minutes since last test, saturates at max value.
  • Bytes 4-5: Lifetime number of tests, not incremented on power cycle, saturates at max value.

Normalized value is set to one on test failure or 11 if the capacitor has been tested in an excessive temperature condition, otherwise 100. [50]

176 (0xB0): Erase Fail Count
Indicates the number of flash erase command failures. [51]
177 (0xB1): Wear Range Delta
Delta between the most-worn and least-worn flash blocks. It gives a more technical indication of how well the SSD's wear leveling is working.
178 (0xB2): Used Reserved Block Count
"Pre-Fail" attribute used at least in Samsung devices.
179 (0xB3): Used Reserved Block Count Total
"Pre-Fail" attribute used at least in Samsung devices. [52]
180 (0xB4): Unused Reserved Block Count Total
"Pre-Fail" attribute used at least in HP devices.

If the value drops to 0 the device may become read-only to allow the user to retrieve stored data. [53]

181 (0xB5): Program Fail Count Total or Non-4K Aligned Access Count
Ideal: Low.
(Flash Memory) Total number of Flash program operation failures since the drive was deployed (indicating old age). [54]

(HDD, Advanced Format) Number of user data accesses (both reads and writes) where LBAs are not 4 KiB aligned (LBA % 8 != 0) or where size is not modulus 4 KiB (block count != 8), assuming logical block size (LBS)=512 B (indicating bad software configuration). [55]

182 (0xB6): Erase Fail Count
"Pre-Fail" attribute used at least in Samsung devices.
183 (0xB7): SATA Downshift Error Count or Runtime Bad Block
Ideal: Low.
Western Digital, Samsung or Seagate attribute: Either the number of downshifts of link speed (e.g. from 6Gbit/s to 3Gbit/s) or the total number of data blocks with detected, uncorrectable errors encountered during normal operation. [56] Although degradation of this parameter can be an indicator of drive aging and/or potential electromechanical problems, it does not directly indicate imminent drive failure. [57]
184 (0xB8): End-to-End error / IOEDC
Ideal: Low. Critical.
This attribute is a part of Hewlett-Packard's SMART IV technology, as well as part of other vendors' IO Error Detection and Correction schemas, and it contains a count of parity errors which occur in the data path to the media via the drive's cache RAM. [59]
185 (0xB9): Head Stability
Western Digital attribute.
186 (0xBA): Induced Op-Vibration Detection
Western Digital attribute.
187 (0xBB): Reported Uncorrectable Errors
Ideal: Low. Critical.
The count of errors that could not be recovered using hardware ECC (see attribute 195). [60]
188 (0xBC): Command Timeout
Ideal: Low. Critical.
The count of aborted operations due to HDD timeout. Normally this attribute value should be equal to zero. [61]
189 (0xBD): High Fly Writes
Ideal: Low.
HDD manufacturers implement a flying height sensor that attempts to provide additional protections for write operations by detecting when a recording head is flying outside its normal operating range. If an unsafe fly height condition is encountered, the write process is stopped, and the information is rewritten or reallocated to a safe region of the hard drive. This attribute indicates the count of these errors detected over the lifetime of the drive.

This feature is implemented in most modern Seagate drives [7] and some of Western Digital's drives, beginning with the WD Enterprise WDE18300 and WDE9180 Ultra2 SCSI hard drives, and will be included on all future WD Enterprise products. [62]

190 (0xBE): Temperature Difference or Airflow Temperature
Ideal: Varies.
Value is equal to (100 - temperature in °C), allowing the manufacturer to set a minimum threshold which corresponds to a maximum temperature. This also follows the convention of 100 being a best-case value and lower values being undesirable. However, some older drives may instead report raw Temperature (identical to 0xC2) or Temperature minus 50 here.
191 (0xBF): G-sense Error Rate
Ideal: Low.
The count of errors resulting from externally induced shock and vibration.
192 (0xC0): Power-off Retract Count, Emergency Retract Cycle Count (Fujitsu), [63] or Unsafe Shutdown Count
Ideal: Low.
Number of power-off or emergency retract cycles. [22] [64]
193 (0xC1): Load Cycle Count or Load/Unload Cycle Count (Fujitsu)
Ideal: Low.
Count of load/unload cycles into head landing zone position. [63] Some drives use 225 (0xE1) for Load Cycle Count instead.

Western Digital rates their VelociRaptor drives for 600,000 load/unload cycles, [65] and WD Green drives for 300,000 cycles; [66] the latter ones are designed to unload heads often to conserve power. On the other hand, the WD3000GLFS (a desktop drive) is specified for only 50,000 load/unload cycles. [67]

Some laptop drives and "green power" desktop drives are programmed to unload the heads whenever there has not been any activity for a short period, to save power. [68] [69] Operating systems often access the file system a few times a minute in the background, [70] causing 100 or more load cycles per hour if the heads unload: the load cycle rating may be exceeded in less than a year. [71] There are programs for most operating systems that disable the Advanced Power Management (APM) and Automatic acoustic management (AAM) features causing frequent load cycles. [72] [73]

194 (0xC2): Temperature or Temperature Celsius
Ideal: Low.
Indicates the device temperature, if the appropriate sensor is fitted. Lowest byte of the raw value contains the exact temperature value (Celsius degrees). [74]
195 (0xC3): Hardware ECC Recovered
Ideal: Varies.
(Vendor-specific raw value.) The raw value has different structure for different vendors and is often not meaningful as a decimal number. For some drives, this number may increase during normal operation without necessarily signifying errors. [32]
196 (0xC4): Reallocation Event Count [63]
Ideal: Low. Critical. [9]
Count of remap operations. The raw value of this attribute shows the total count of attempts to transfer data from reallocated sectors to a spare area. Both successful and unsuccessful attempts are counted. [75]
197 (0xC5): Current Pending Sector Count [63]
Ideal: Low. Critical.
Count of "unstable" sectors (waiting to be remapped, because of unrecoverable read errors). If an unstable sector is subsequently read successfully, the sector is remapped and this value is decreased. Read errors on a sector will not remap the sector immediately (since the correct value cannot be read and so the value to remap is not known, and also it might become readable later); instead, the drive firmware remembers that the sector needs to be remapped, and will remap it the next time it has been successfully read. [76]

However, some drives will not immediately remap such sectors when successfully read; instead the drive will first attempt to write to the problem sector, and if the write operation is successful the sector will then be marked as good (in this case, the "Reallocation Event Count" (0xC4) will not be increased). This is a serious shortcoming, for if such a drive contains marginal sectors that consistently fail only after some time has passed following a successful write operation, then the drive will never remap these problem sectors. If the raw value of attribute 0xC5 is higher than its threshold value, that will be reported as "drive warning". [77] [78]

198 (0xC6): (Offline) Uncorrectable Sector Count [63]
Ideal: Low. Critical.
The total count of uncorrectable errors when reading/writing a sector. A rise in the value of this attribute indicates defects of the disk surface and/or problems in the mechanical subsystem. [9] [61] [58]
199 (0xC7): UltraDMA CRC Error Count
Ideal: Low.
The count of errors in data transfer via the interface cable as determined by ICRC (Interface Cyclic Redundancy Check).
200 (0xC8): Multi-Zone Error Rate [79]
Ideal: Low.
The count of errors found when writing a sector. The higher the value, the worse the disk's mechanical condition is.
200 (0xC8): Write Error Rate (Fujitsu)
Ideal: Low.
The total count of errors when writing a sector. [80]
201 (0xC9): Soft Read Error Rate or TA Counter Detected
Ideal: Low. Critical.
Count indicates the number of uncorrectable software read errors. [81]
202 (0xCA): Data Address Mark errors or TA Counter Increased
Ideal: Low.
Count of Data Address Mark errors (or vendor-specific). [22]
203 (0xCB): Run Out Cancel
Ideal: Low.
The number of errors caused by incorrect checksum during the error correction.
204 (0xCC): Soft ECC Correction
Ideal: Low.
Count of errors corrected by the internal error correction software. [22]
205 (0xCD): Thermal Asperity Rate
Ideal: Low.
Count of errors due to high temperature. [82]
206 (0xCE): Flying Height
Height of heads above the disk surface. If too low, head crash is more likely; if too high, read/write errors are more likely. [22] [83]
207 (0xCF): Spin High Current
Ideal: Low.
Amount of surge current used to spin up the drive. [82]
208 (0xD0): Spin Buzz
Count of buzz routines needed to spin up the drive due to insufficient power. [82]
209 (0xD1): Offline Seek Performance
Drive's seek performance during its internal tests. [82]
210 (0xD2): Vibration During Write
Found in Maxtor 6B200M0 200GB and 2R015H1 15GB disks.
211 (0xD3): Vibration During Write
A recording of a vibration encountered during write operations. [84]
212 (0xD4): Shock During Write
A recording of shock encountered during write operations. [49] [85]
220 (0xDC): Disk Shift
Ideal: Low.
Distance the disk has shifted relative to the spindle (usually due to shock or temperature). Unit of measure is unknown. [49]
221 (0xDD): G-Sense Error Rate
Ideal: Low.
The count of errors resulting from externally induced shock and vibration. More typically reported at 0xBF.
222 (0xDE): Loaded Hours
Time spent operating under data load (movement of magnetic head armature). [49]
223 (0xDF): Load/Unload Retry Count
Count of times head changes position. [49]
224 (0xE0): Load Friction
Ideal: Low.
Resistance caused by friction in mechanical parts while operating. [49]
225 (0xE1): Load/Unload Cycle Count
Ideal: Low.
Total count of load cycles. [49] Some drives use 193 (0xC1) for Load Cycle Count instead. See the description of 193 for the significance of this number.
226 (0xE2): Load 'In'-time
Total time of loading on the magnetic heads actuator (time not spent in parking area). [49]
227 (0xE3): Torque Amplification Count
Ideal: Low.
Count of attempts to compensate for platter speed variations. [86]
228 (0xE4): Power-Off Retract Cycle
Ideal: Low.
The number of power-off cycles which are counted whenever there is a "retract event" and the heads are loaded off of the media such as when the machine is powered down, put to sleep, or is idle. [22] [64]
230 (0xE6): GMR Head Amplitude (magnetic HDDs), Drive Life Protection Status (SSDs)
Amplitude of "thrashing" (repetitive head moving motions between operations). [22] [87]

In solid-state drives, indicates whether the usage trajectory is outpacing the expected life curve. [88]

231 (0xE7): Life Left (SSDs) or Temperature
Indicates the approximate SSD life left, in terms of program/erase cycles or available reserved blocks. [88] A normalized value of 100 represents a new drive, with a threshold value at 10 indicating a need for replacement. A value of 0 may mean that the drive is operating in read-only mode to allow data recovery. [89]

Previously (pre-2010) occasionally used for Drive Temperature (more typically reported at 0xC2).

232 (0xE8): Endurance Remaining or Available Reserved Space
Number of physical erase cycles completed on the SSD as a percentage of the maximum physical erase cycles the drive is designed to endure.

Intel SSDs report the available reserved space as a percentage of the initial reserved space.

233 (0xE9): Media Wearout Indicator (SSDs) or Power-On Hours
Intel SSDs report a normalized value from 100, a new drive, to a minimum of 1. It decreases while the NAND erase cycles increase from 0 to the maximum-rated cycles.

Previously (pre-2010) occasionally used for Power-On Hours (more typically reported in 0x09).

234
0xEA
Average erase count AND Maximum Erase Count
Decoded as: byte 0-1-2=average erase count (big endian) and byte 3-4-5=max erase count (big endian). [90]
235
0xEB
Good Block Count AND System(Free) Block Count
Decoded as: byte 0-1-2=good block count (big endian) and byte 3-4=system (free) block count.
240
0xF0
Head Flying Hours or 'Transfer Error Rate' (Fujitsu)
Time spent during the positioning of the drive heads. [22] [91] Some Fujitsu drives report the count of link resets during a data transfer. [92]
241
0xF1
Total LBAs Written or Total Host Writes
Total count of LBAs written. Some SSDs (for example, those manufactured by Western Digital and Seagate) use 1 GiB as the unit of this attribute.
242
0xF2
Total LBAs Read or Total Host Reads
Total count of LBAs read.
Some S.M.A.R.T. utilities will report a negative number for the raw value because it actually has 48 bits rather than 32. Some SSDs (for example, those manufactured by Western Digital and Seagate) use 1 GiB as the unit of this attribute.
243
0xF3
Total LBAs Written Expanded or Total Host Writes Expanded
The upper 5 bytes of the 12-byte total number of LBAs written to the device. The lower 7-byte value is located at attribute 0xF1. [93]
244
0xF4
Total LBAs Read Expanded or Total Host Reads Expanded
The upper 5 bytes of the 12-byte total number of LBAs read from the device. The lower 7-byte value is located at attribute 0xF2. [94]
245
0xF5
Remaining Rated Write Endurance
High
Certified Dell SSDs use this for write endurance.
246
0xF6
Cumulative host sectors written
(Micron) LBA written due to computer request. [95]
247
0xF7
Host program page count
(Micron) NAND pages written due to computer request. [96]
248
0xF8
Background program page count
(Micron) NAND pages written due to background operations (e.g. garbage collection). [96]
249
0xF9
NAND Writes (1GiB)
Total NAND Writes. Raw value reports the number of writes to NAND in 1 GB increments. [97]
250
0xFA
Read Error Retry Rate
Low
Count of errors while reading from a disk. [49]
251
0xFB
Minimum Spares Remaining
Number of remaining spare blocks as a percentage of the total number of spare blocks available. [98]
252
0xFC
Newly Added Bad Flash Block
Total number of bad flash blocks the drive detected since it was first initialized in manufacturing. [98]
254
0xFE
Free Fall Protection
Low
Count of "Free Fall Events" detected. [99]
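
Several of the raw values described above pack more than one number into the raw field, or are wider than some utilities assume. The following is a minimal sketch of how such values might be decoded, not a reference implementation. The function names are hypothetical, and the assumption that "byte 0" is the first byte of the 6-byte raw field as handed to the function is an illustration choice; real drives and tools may number the bytes differently.

```python
from typing import Tuple


def split_packed_counters(raw6: bytes, first_len: int = 3,
                          second_len: int = 3) -> Tuple[int, int]:
    """Split a 6-byte raw value into two big-endian counters.

    Mirrors the layout described for attributes 234 and 235:
    the first counter starts at byte 0, the second follows it.
    """
    if len(raw6) != 6:
        raise ValueError("expected a 6-byte raw value")
    if first_len + second_len > 6:
        raise ValueError("counters do not fit in 6 bytes")
    first = int.from_bytes(raw6[:first_len], "big")
    second = int.from_bytes(raw6[first_len:first_len + second_len], "big")
    return first, second


def as_signed32(raw48: int) -> int:
    """Show what a utility prints if it treats a 48-bit raw value
    as a signed 32-bit integer (the misreading noted for attribute 242)."""
    low = raw48 & 0xFFFFFFFF
    return low - (1 << 32) if low & 0x80000000 else low


def combine_expanded(lower7: int, upper5: int) -> int:
    """Join the lower 7 bytes (attribute 0xF1 or 0xF2) with the upper
    5 bytes (attribute 0xF3 or 0xF4) into one 12-byte total."""
    if not 0 <= lower7 < (1 << 56) or not 0 <= upper5 < (1 << 40):
        raise ValueError("value out of range for its byte width")
    return (upper5 << 56) | lower7
```

For attribute 235, the second counter is only two bytes wide, so it would be called as `split_packed_counters(raw, 3, 2)`.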

Logs

GP Log 0x04: Device Statistics
| Page | Offset | Description |
|---|---|---|
| 0x01 | 0x08 | Lifetime Power-On Resets |
| 0x01 | 0x10 | Power-on Hours |
| 0x01 | 0x18 | Logical Sectors Written |
| 0x01 | 0x28 | Logical Sectors Read |
| 0x05 | 0x08 | Current Temperature |
| 0x05 | 0x20 | Highest Temperature |
| 0x05 | 0x28 | Lowest Temperature |
| 0x05 | 0x58 | Specified Maximum Operating Temperature |
| 0x05 | 0x68 | Specified Minimum Operating Temperature |
| 0x07 | 0x08 | Percentage Used Endurance Indicator |
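
A sketch of how a host tool might read one of these statistics from a page that has already been fetched. The table above gives only page numbers and offsets; the layout assumed here (each statistic is an 8-byte little-endian field whose top byte carries flags, with 0x80 meaning "supported" and 0x40 meaning "valid value") is an assumption based on how ATA device statistics are commonly decoded and should be checked against the ACS specification. `read_statistic` is a hypothetical helper.

```python
from typing import Optional

# (page, byte offset) -> statistic name, copied from the table above.
DEVICE_STATISTICS = {
    (0x01, 0x08): "Lifetime Power-On Resets",
    (0x01, 0x10): "Power-on Hours",
    (0x01, 0x18): "Logical Sectors Written",
    (0x01, 0x28): "Logical Sectors Read",
    (0x05, 0x08): "Current Temperature",
    (0x05, 0x20): "Highest Temperature",
    (0x05, 0x28): "Lowest Temperature",
    (0x05, 0x58): "Specified Maximum Operating Temperature",
    (0x05, 0x68): "Specified Minimum Operating Temperature",
    (0x07, 0x08): "Percentage Used Endurance Indicator",
}

FLAG_SUPPORTED = 0x80  # assumed meaning of the top flag bit
FLAG_VALID = 0x40      # assumed meaning of the next flag bit


def read_statistic(page: bytes, offset: int) -> Optional[int]:
    """Return the unsigned payload of one statistic, or None if the
    drive marks it unsupported or not valid.

    Signed fields such as temperatures would need extra handling.
    """
    field = page[offset:offset + 8]
    if len(field) != 8:
        raise ValueError("offset lies outside the page")
    flags = field[7]
    if not (flags & FLAG_SUPPORTED) or not (flags & FLAG_VALID):
        return None
    return int.from_bytes(field[:7], "little")
```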

SMART Command Transport

Threshold Exceeds Condition

Threshold Exceeds Condition (TEC) is an estimated date when a critical drive statistic attribute will reach its threshold value. When Drive Health software reports a "Nearest T.E.C.", it should be regarded as a "Failure date". Sometimes, no date is given and the drive can be expected to work without errors. [100]

To predict the date, the drive tracks the rate at which the attribute changes. Note that TEC dates are only estimates; hard drives can and do fail much sooner or much later than the TEC date. [101]
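
The prediction described above can be illustrated with a simple linear extrapolation. This is only a sketch under stated assumptions: the text does not say which model monitoring software actually uses, so the straight-line fit between two samples and the function name `estimate_tec` are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional


def estimate_tec(t0: datetime, v0: float, t1: datetime, v1: float,
                 threshold: float) -> Optional[datetime]:
    """Extrapolate a falling normalized value to its threshold.

    (t0, v0) is an earlier sample and (t1, v1) a later one.
    Returns None when the value is not falling, which corresponds
    to the case where no T.E.C. date is reported.
    """
    elapsed = (t1 - t0).total_seconds()
    if elapsed <= 0:
        raise ValueError("t1 must be later than t0")
    if v1 <= threshold:
        return t1  # threshold already reached
    drop = v0 - v1
    if drop <= 0:
        return None  # stable or improving: no predicted date
    seconds_left = (v1 - threshold) * elapsed / drop
    return t1 + timedelta(seconds=seconds_left)
```

With a value that fell from 100 to 90 over ten days and a threshold of 50, the sketch projects the threshold being reached forty days after the second sample. As the text notes, such a date is only an estimate.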

In NVMe

The NVMe specification defines a unified set of S.M.A.R.T. attributes that is shared across drive manufacturers. The data is presented in a log page 512 bytes long. [102]

Known NVMe S.M.A.R.T. attributes

| Offset | Length (bytes) | Attribute name | Description |
|---|---|---|---|
| 0 (0x00) | 1 | Critical Warning | Critical warnings or drive failures of the controller. Bit definition (each condition applies when the bit is 1): bit 0, available spare is below threshold; bit 1, temperature is over threshold; bit 2, drive reliability is degraded; bit 3, drive is in read-only mode; bit 4, volatile memory backup device failed (this usually means the power-loss protection capacitor). |
| 1 (0x01) | 2 | Composite Temperature | Temperature in kelvins representing the current composite temperature of the controller and its namespace(s). |
| 3 (0x03) | 1 | Available Spare | Percentage of available spare space (spare space for bad block mapping). |
| 4 (0x04) | 1 | Available Spare Threshold | Percentage of available spare space threshold. |
| 5 (0x05) | 1 | Percentage Used | Percentage of drive life used. This drive life may be estimated as terabytes written. |
| 7 (0x07) | 25 | Reserved | |
| 32 (0x20) | 16 | Data Units Read | Number of 512-byte data units the host has read from the controller. This value does not include metadata. This value is reported in thousands (i.e., a value of 1 corresponds to 1000 units of 512 bytes read) and is rounded up. |
| 48 (0x30) | 16 | Data Units Written | Number of 512-byte data units the host has written to the controller. This value does not include metadata. This value is reported in thousands (i.e., a value of 1 corresponds to 1000 units of 512 bytes written) and is rounded up. |
| 64 (0x40) | 16 | Host Read Commands | Number of read commands completed by the controller. |
| 80 (0x50) | 16 | Host Write Commands | Number of write commands completed by the controller. |
| 96 (0x60) | 16 | Controller Busy Time | Amount of time the controller is busy with I/O commands. |
| 112 (0x70) | 16 | Power Cycles | Number of power cycles. |
| 128 (0x80) | 16 | Power On Hours | Number of power-on hours, excluding time powered on in non-operational power state. |
| 144 (0x90) | 16 | Unsafe Shutdowns | Number of unsafe shutdowns. Incremented when a Shutdown Notification is not received prior to loss of power. |
| 160 (0xA0) | 16 | Media Errors | Number of occurrences where the controller detected an unrecovered data integrity error, including uncorrectable ECC, CRC checksum failure, or LBA tag mismatch. |
| 176 (0xB0) | 16 | Number of Error Information Log Entries | Number of Error Information log entries over the life of the controller. |
| 192 (0xC0) | 4 | Warning Composite Temperature Time | The amount of time in minutes that the controller is operational and the Composite Temperature is greater than or equal to the Warning Composite Temperature Threshold and less than the Critical Composite Temperature Threshold. |
| 196 (0xC4) | 4 | Critical Composite Temperature Time | The amount of time in minutes that the controller is operational and the Composite Temperature is greater than or equal to the Critical Composite Temperature Threshold. |
| 200 (0xC8) | 2×8 | Temperature Sensor 1–8 | |
| 216 (0xD8) | 4×2 | Thermal Management Temperature 1/2 Transition Count | |
| 224 (0xE0) | 4×2 | Total Time For Thermal Management Temperature 1/2 | |
| 232 (0xE8) | 280 | Reserved | |
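
The layout above can be turned into a small parser. The following is a sketch that decodes a few fields from a 512-byte page already obtained from the drive; how the page is fetched is out of scope. It assumes multi-byte fields are little-endian, as is usual for NVMe data structures, and converts kelvins to degrees Celsius by subtracting 273. The names `NvmeHealth` and `parse_nvme_health` are hypothetical.

```python
from dataclasses import dataclass
from typing import List

# Bit number -> meaning, taken from the Critical Warning row above.
CRITICAL_WARNING_BITS = {
    0: "available spare below threshold",
    1: "temperature over threshold",
    2: "reliability degraded",
    3: "read-only mode",
    4: "volatile memory backup failed",
}


@dataclass
class NvmeHealth:
    warnings: List[str]
    temperature_c: int
    available_spare_pct: int
    spare_threshold_pct: int
    percentage_used: int
    bytes_read: int
    bytes_written: int
    power_on_hours: int
    unsafe_shutdowns: int
    media_errors: int


def _le(page: bytes, offset: int, length: int) -> int:
    """Read an unsigned little-endian integer (assumed byte order)."""
    return int.from_bytes(page[offset:offset + length], "little")


def parse_nvme_health(page: bytes) -> NvmeHealth:
    if len(page) != 512:
        raise ValueError("expected a 512-byte log page")
    critical = page[0]
    return NvmeHealth(
        warnings=[name for bit, name in CRITICAL_WARNING_BITS.items()
                  if critical & (1 << bit)],
        temperature_c=_le(page, 1, 2) - 273,
        available_spare_pct=page[3],
        spare_threshold_pct=page[4],
        percentage_used=page[5],
        # Data units are counted in thousands of 512-byte units.
        bytes_read=_le(page, 32, 16) * 1000 * 512,
        bytes_written=_le(page, 48, 16) * 1000 * 512,
        power_on_hours=_le(page, 128, 16),
        unsafe_shutdowns=_le(page, 144, 16),
        media_errors=_le(page, 160, 16),
    )
```

Because data units are rounded up by the drive, the byte totals derived this way are upper bounds rather than exact counts.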

In SCSI

The SCSI standard does not mention the term "S.M.A.R.T." except in one place, but the equivalent logging/failure-prediction functionality is available in standard log pages prescribed by SPC-4. [20] There is also space for vendor-specific log pages. Log pages are variable-length. [103]

List of SCSI log pages [103]
| Code | Name | Description |
|---|---|---|
| 00h | Supported log pages | |
| 01h | Buffer Over-Run/Under-Run | |
| 02h | Write Error Counter | |
| 03h | Read Error Counter | |
| 04h | Read Reverse Error Counter | |
| 05h | Verify Error Counter | |
| 06h | Non-Medium Error | |
| 07h | Last n Error Events | |
| 0Bh | Last n Deferred Errors or Asynchronous Event | |
| 0Dh | Temperature | |
| 0Eh | Start-Stop Cycle Counter | |
| 0Fh | Application Client | |
| 10h | Self-Test Results | |
| 15h | Background Scan Results | |
| 18h | Protocol Specific Port | |
| 2Fh | Informational Exceptions | Provides two kinds of warnings: impending failure warning based on a vendor-defined threshold (similar to ATA SMART normalized values) and temperature warnings. |
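
Because log pages are variable-length, a reader has to take the length from the page header before interpreting the rest. The sketch below assumes the common SCSI log page header layout (page code in the low six bits of byte 0, a big-endian page length in bytes 2 and 3), which is not spelled out in the text above and should be checked against SPC-4. It lists the pages a drive advertises in page 00h. The function names are hypothetical.

```python
from typing import List, Tuple

# Page code -> name, copied from the table above.
LOG_PAGE_NAMES = {
    0x00: "Supported log pages",
    0x01: "Buffer Over-Run/Under-Run",
    0x02: "Write Error Counter",
    0x03: "Read Error Counter",
    0x04: "Read Reverse Error Counter",
    0x05: "Verify Error Counter",
    0x06: "Non-Medium Error",
    0x07: "Last n Error Events",
    0x0B: "Last n Deferred Errors or Asynchronous Event",
    0x0D: "Temperature",
    0x0E: "Start-Stop Cycle Counter",
    0x0F: "Application Client",
    0x10: "Self-Test Results",
    0x15: "Background Scan Results",
    0x18: "Protocol Specific Port",
    0x2F: "Informational Exceptions",
}


def parse_log_page(response: bytes) -> Tuple[int, bytes]:
    """Split a log page into (page code, payload) using its header."""
    if len(response) < 4:
        raise ValueError("response shorter than the 4-byte header")
    page_code = response[0] & 0x3F               # assumed header layout
    length = int.from_bytes(response[2:4], "big")
    payload = response[4:4 + length]
    if len(payload) != length:
        raise ValueError("response is truncated")
    return page_code, payload


def supported_pages(response: bytes) -> List[str]:
    """Name every page code listed in a 'Supported log pages' response."""
    code, payload = parse_log_page(response)
    if code != 0x00:
        raise ValueError("not a supported-log-pages response")
    return [LOG_PAGE_NAMES.get(c, f"unknown or vendor-specific ({c:02X}h)")
            for c in payload]
```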

SCSI has a specialized set of S.M.A.R.T. features for tape drives known as TapeAlert defined in SMC-2. [20] SCSI offers self-testing, similar to ATA. [20]

Information related to the reallocation of bad sectors is provided not via a log page, but via the READ DEFECT DATA command. In addition to a grand total, this command provides information about which specific sectors were reallocated and why. [103]

Self-tests

S.M.A.R.T. drives may offer a number of self-tests: [104] [105] [106]

Short
Checks the electrical and mechanical performance as well as the read performance of the disk. Electrical tests might include a test of buffer RAM, a read/write circuitry test, or a test of the read/write head elements. Mechanical tests include seeking and servo on data tracks. Scans small parts of the drive's surface (the area is vendor-specific and there is a time limit on the test) and checks the list of pending sectors that may have read errors. The test usually takes under two minutes.
Long/extended
A longer and more thorough version of the short self-test, scanning the entire disk surface with no time limit. This test usually takes several hours, depending on the read/write speed of the drive and its size. It is possible for the long test to pass even if the short test fails. [107]
Conveyance
Intended as a quick test to identify damage incurred while transporting the device from the drive manufacturer to the computer manufacturer. [108] Only available on ATA drives; it usually takes several minutes.
Selective
Some ATA drives allow selective self-tests of just a part of the surface. [109] There is a dedicated log for selective test results, separate from the main self-test log.
Background scan
SCSI drives have the ability to schedule periodic full-surface self-tests known as a background media scan (BMS). The sdparm program may be used to adjust whether to run the scans (EN_BMS) and various parameters for the scan (e.g. period, idle time before run). The drive remains operable during the test. [110] There is a dedicated log for background scan results, separate from the self-test log.
Offline data collection
ATA drives may support a periodic short operation called "offline data collection". Although this feature is marked "obsolete", many modern hard drives retain it. The drive remains operable during collection and any result is reflected only in SMART attributes (some attributes only update when "offline"). [111]

Drives remain operable during self-test, unless a "captive" option (ATA only) is requested. [104]
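
As an illustration, the self-test types above map onto options of the smartctl utility cited in the references. The sketch below only builds the command line and does not run it. The helper name `selftest_command` and the device path in the usage note are hypothetical, and available tests depend on the drive.

```python
from typing import List, Optional, Tuple


def selftest_command(device: str, kind: str,
                     span: Optional[Tuple[int, int]] = None,
                     captive: bool = False) -> List[str]:
    """Build a smartctl invocation that starts a self-test.

    kind is "short", "long", "conveyance", or "select"; a selective
    test needs span=(first_lba, last_lba).
    """
    if kind in ("short", "long", "conveyance"):
        test = kind
    elif kind == "select":
        if span is None:
            raise ValueError("a selective test needs an LBA span")
        test = f"select,{span[0]}-{span[1]}"
    else:
        raise ValueError(f"unknown self-test type: {kind}")
    cmd = ["smartctl", "-t", test]
    if captive:
        cmd.append("-C")  # captive mode: the drive is busy until done
    cmd.append(device)
    return cmd
```

The returned list could be passed to `subprocess.run`, for example `subprocess.run(selftest_command("/dev/sda", "short"))`, typically with administrator privileges. Results can afterwards be read with `smartctl -l selftest`.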

The self-test logs for SCSI and ATA drives are slightly different.

The ATA drive's self-test log can contain up to 21 read-only entries. When the log is filled, old entries are removed. [112]
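
The "oldest entries are removed" behavior is that of a fixed-size ring buffer. The following sketch models only that behavior from the host's point of view; it says nothing about how a drive stores the log internally, and the class name `SelfTestLog` is hypothetical.

```python
from collections import deque
from typing import Deque, List


class SelfTestLog:
    """A bounded log that keeps the most recent entries, newest first."""

    def __init__(self, capacity: int = 21) -> None:
        # A deque with maxlen silently drops items from the far end
        # once it is full, which here means the oldest entry.
        self._entries: Deque[str] = deque(maxlen=capacity)

    def record(self, entry: str) -> None:
        self._entries.appendleft(entry)

    def entries(self) -> List[str]:
        return list(self._entries)
```

After 25 results are recorded, only the latest 21 remain, with the most recent one first.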


References

  1. "CrystalDiskInfo". SourceForge. 2024-06-22. Retrieved 2024-07-30.
  2. "Hard Disk Sentinel - HDD health and temperature monitoring". www.hdsentinel.com. Retrieved 2024-10-29.
  3. "Communicating With Your SSD: Understanding SMART Attributes | Samsung SSD". Samsung.com. Archived from the original on 2015-03-10. Retrieved 2014-01-18.
  4. Kubico. "What is Self-Monitoring, Analysis and Reporting Technology? – Kubico". Retrieved 2024-09-13.
  5. "SMART and SSDs". Crucial. Retrieved 2024-09-13.
  6. "Self-Monitoring, Analysis, and Reporting Technology (S.M.A.R.T.)". Dolphin Data Lab. Retrieved 2024-09-13.
  7. "Enhanced Smart attributes" (PDF). Seagate. Archived from the original (PDF) on 2006-03-28.
  8. "SMART". PCTechGuide. 2003.
  9. Pinheiro, Eduardo; Weber, Wolf-Dietrich; Barroso, Luís André (2007). Failure Trends in a Large Disk Drive Population (PDF). 5th USENIX Conference on File and Storage Technologies (FAST 2007). Mountain View, Calif. pp. 8–9.
  10. Pinheiro, Eduardo; Weber, Wolf-Dietrich; Barroso, Luís André. Failure Trends in a Large Disk Drive Population: Conclusion (PDF). 5th USENIX Conference on File and Storage Technologies (FAST 2007). Mountain View, Calif.
  11. "No. ZG92-0289" (announcement letter). IBM. September 1, 1992.
  12. Ottem, Eric; Plummer, Judy (1995). Playing it S.M.A.R.T.: The emergence of reliability prediction technology (Report). Seagate Technology Paper.
  13. Compaq. IntelliSafe. Technical Report SSF-8035 (Report). Small Form Committee. January 1995.
  14. Seagate Product Marketing (July 1999). Get S.M.A.R.T. for Reliability (PDF) (Report). Technology Paper. Scotts Valley, California: Seagate Technology. TP-67D. Archived from the original (PDF) on 12 June 2001. Compaq placed IntelliSafe in the public domain by presenting its specification for the ATA environment, SFF-8035, to the Small Form Factor Committee on May 12, 1995.
  15. Mueller, Scott (2013). "The ATA/IDE Interface". Upgrading and repairing PCs (21st ed.). Indianapolis, Ind.: Que Pub. ISBN 978-0-7897-5000-6. OCLC 816159579.
  16. Allen, Bruce (2004-01-01). "Monitoring Hard Disks with SMART | Linux Journal". www.linuxjournal.com. Retrieved 2021-04-13.
  17. "AT Attachment 8 – ATA/ATAPI Command Set (ATA8-ACS): SMART (Self-monitoring, analysis, and reporting technology) feature set" (PDF). ANSI INCITS. September 6, 2008. Archived from the original (PDF) on October 10, 2014. Retrieved March 23, 2020.
  18. Stephens 2006, pp. 44–126, 198–213, 327–44, Sections 4.19: "SMART (Self-monitoring, analysis, and reporting technology) feature set", 7.52: "SMART", Annex A: "Log Page Definitions"
  19. "ATA/ATAPI Command Set - 2 (ACS-2)" (PDF). ATA Command Set 2 (working draft) (7 ed.). ANSI INCITS. June 22, 2011. Archived from the original (PDF) on July 1, 2016. Retrieved June 8, 2017.
  20. Gilbert, Douglas. "Smartmontools for SCSI devices".
  21. "Hitachi Travelstar 80GN" (PDF) (2.0 ed.). Hitachi Data Systems. 19 September 2003. Hitachi Document Part Number S13K-1055-20. Archived from the original (PDF) on 18 July 2011.
  22. Hatfield, Jim (30 September 2005). "SMART Attribute Annex" (PDF). Technical Committee T13. Seagate Technology. pp. 1–5. Archived from the original (PDF) on 2007-02-03. Retrieved 12 July 2016. Note: this source usefully lists a set of attributes in use. However, its description of "worst value" deviates from SFF-8035, which is closer to reality.
  23. "Smartmontools". Source forge.
  24. Bruno Sonnino (31 October 2005). "What is S.M.A.R.T.?". PC Mag. Ziff Davis. p. 1. Retrieved 12 July 2016.
  25. "Hard drive acting up? It could be hardware issues — here's how to find out". Windows Central. 2019-08-27. Retrieved 2021-03-04.
  26. Stephens 2006, p. 207. Of the 512 octets listed in table 42 on page 207: "Device SMART data structure" a total of 489 are marked as "Vendor specific".
  27. Ottem, Eric; Plummer, Judy (1995). Playing it S.M.A.R.T.: The emergence of reliability prediction technology (Report). Seagate. Though attributes are drive-specific, a variety of typical characteristics can be identified: [...] The attributes listed above illustrate typical kinds of reliability indicators. Ultimately, the disc drive design determines which attributes the manufacturer will choose. Attributes are therefore considered proprietary, since they depend on drive design.
  28. SFF Committee (April 1, 1996). "Specification for Self-Monitoring, Analysis and Reporting Technology (S.M.A.R.T.) SFF-8035i Revision 2.0" (PDF). Archived from the original (PDF) on 2014-04-23.
  29. "smartmontools/smartmontools: atacmds.cpp, ata_get_attr_raw_value()". Smartmontools.org. 30 May 2023.
  30. "smartmontools/smartmontools: atacmds.h: struct ata_smart_attribute, enum ata_attr_raw_format". Smartmontools.org. 30 May 2023.
  31. "Smartmontools". Source forge. Attribute 194 (Temperature Celsius) behaves strangely on my Seagate disk
  32. fzabkar. "Seagate SER, RRER & HEC". www.users.on.net. Archived from the original on July 29, 2023. Retrieved June 14, 2024.
  33. Seagate Technology. "Seagate SMART Attribute Specification" (PDF). t1.daumcdn.net.
  34. "Seagate SMART parsing for smartmontools · Issue #108 · Seagate/OpenSeaChest". GitHub.
  35. "Self-Monitoring, Analysis and Reporting Technology (SMART)" (article). Smart Linux. 2009-03-10.
  36. "Failure Trends in a Large Disk Drive Population" (PDF). Proceedings of the 5th USENIX Conference on File and Storage Technologies (FAST'07). 2007. We find that the group of drives with scan errors are ten times more likely to fail than the group with no errors. This effect is also noticed when we further break down the groups by disk model. From Figure 8 we see a drastic and quick decrease in survival probability after the first scan error (left graph). A little over 70% of the drives survive the first 8 months after their first scan error.
  37. "What SMART Hard Disk Errors Actually Tell Us". Backblaze Blog | Cloud Storage & Cloud Backup. 2016-10-06. Retrieved 2017-03-19.
  38. "S.M.A.R.T. Attribute: Reallocated Sectors Count | Knowledge Base". kb.acronis.com. Retrieved 2017-03-19.
  39. "ATA Command Set 4 (ACS-4) - Working Draft" (PDF). October 14, 2016. Archived from the original (PDF) on September 21, 2020. Retrieved March 19, 2017.
  40. "What SMART Stats Tell Us About Hard Drives". 2016-10-06. Comment by "Mark". There was a direct link between Reallocated Sectors Count and how quickly the drive would fail [...] Even one Uncorrectable sector count would lead to most drives being unusable within 3months
  41. https://download.semiconductor.samsung.com/resources/others/SSD_Application_Note_SMART_final.pdf
  42. "Knowledge Base: 9109: S.M.A.R.T. Attribute: Power-On Hours (POH)". Acronis.
  43. "Power on time". hdsentinel.com. Retrieved 2014-07-14.
  44. "FAQ". Smartmontools. Sourceforge. Retrieved 2013-01-15.
  45. "S.M.A.R.T. Attribute: Spin Retry Count Knowledge Base". kb.acronis.com. Retrieved 2017-03-19.
  46. "SMART Hard Drive Attributes: SMART 22 is a Gas Gas Gas". Backblaze Blog - The Life of a Cloud Backup Company. 2015-04-16.
  47. "Hard Drive Stats for Q2 2018". Backblaze Blog - The Life of a Cloud Backup Company. 2018-08-27. Retrieved 2022-08-07.
  48. "Intel Solid-state Drive DC S3700 Series Product Specification" (PDF). Intel. March 2014.
  49. "S.M.A.R.T." Acronis. 9 March 2010. Retrieved 1 April 2016. Samsung, Seagate, IBM (Hitachi), Fujitsu (not all models), Maxtor, Western Digital (not all models)
  50. "Intel Solid-state Drive DC S3700 Series Product Specification" (PDF) (product manual). Intel. March 2014.
  51. "9184: S.M.A.R.T. Attribute: Erase Fail Count (chip)". March 9, 2010.
  52. Over-Provisioning Benefits for Samsung Data Center SSDs (PDF) (Report). Samsung. March 2019.
  53. "SSDs and SMART Data | Crucial.com". Crucial.com. Retrieved 5 March 2024.
  54. "SMART Attribute Details" (PDF). Kingston Technology Corporation. 2013. p. 4. Archived from the original (PDF) on 2013-05-07. Retrieved 3 August 2013.
  55. "The SMART Command Feature Set" (PDF). Micron Technology, Inc. August 2010. p. 11. Archived from the original (PDF) on 2013-02-01. Retrieved 3 August 2013.
  56. "HDD Guardian". CodePlex. Retrieved 21 January 2015.
  57. "S.M.A.R.T. Attribute: SATA Downshift Error Count | Knowledge Base". kb.acronis.com. Retrieved 2017-03-19.
  58. "Acronis Drive Monitor: Disk Health Calculation Knowledge Base". kb.acronis.com. Retrieved 2017-03-19.
  59. "SMART IV Technology on HP Business Desktop Hard Drives" (PDF). Hewlett-Packard. Retrieved 20 August 2021.
  60. "BackBlaze SMART blog". 12 November 2014. Retrieved 20 July 2015.
  61. "What SMART Hard Disk Errors Actually Tell Us". Backblaze Blog | Cloud Storage & Cloud Backup. 2016-10-06. Retrieved 2017-03-19.
  62. "Fly Height Monitor Improves Hard Drive Reliability" (PDF). Western Digital. April 1999.
  63. "MHT2080AT, MHT2060AT, MHT2040AT, MHT2030AT, MHG2020AT Disk Drives" (PDF). Fujitsu. 2003-07-04. Archived from the original (PDF) on 2017-03-12.
  64. "9127: S.M.A.R.T. Attribute: Power-off Retract Count". Acronis Knowledge Base. Acronis International. Retrieved 12 July 2016.
  65. "WD VelociRaptor Spec Sheet" (PDF). WD.
  66. "WD Green Spec Sheet" (PDF). WD.
  67. "WD VelociRaptor SATA Hard Drives" (PDF). wdc.com. 2008. Retrieved 2014-03-31.
  68. "Problem with hard drive clicking". ThinkWiki.
  69. "hdparm(8) - Linux manual page". man7.org. November 2012. Retrieved 2014-03-31. Get/set the Western Digital (WD) Green Drive's "idle3" timeout value. This timeout controls how often the drive parks its heads and enters a low power consumption state. The factory default is eight (8) seconds, which is a very poor choice for use with Linux. Leaving it at the default will result in hundreds of thousands of head load/unload cycles in a very short period of time.
  70. "discussion list". Arch Linux. If linux tends to write to /var/log/* every 30s, then the heads can park/unpark every 30s.
  71. "How to Reduce Power Consumption". ThinkWiki. Hard drives. The files access time update, while mandated by POSIX, is causing lots of disks access; even accessing files on disk cache may wake the ATA or USB bus.
  72. "Mac OS X is beating your hard drives to death. Here's the fix". Kg4cyx.net. 11 November 2014. Retrieved 3 April 2016.
  73. "quietHDD". quiethdd. 13 December 2009. Archived from the original on 8 July 2016. Retrieved 3 April 2016.
  74. "S.M.A.R.T. basics".
  75. "S.M.A.R.T.-Attribute: Reallocation Event Count". Acronis.
  76. "S.M.A.R.T. Attribute: Current Pending Sector Count". Acronis.
  77. "CrystalDiskInfo Caution - Storage Devices - Linus Tech Tips". 15 July 2017.
  78. https://documents.westerndigital.com/content/dam/doc-library/en_us/assets/public/western-digital/product/data-center-drives/ultrastar-sas-series/product-manual-ultrastar-dc-ha210.pdf
  79. Cabla, Lubomir (2009-08-06). "HDAT2 v4.6 User's Manual" (PDF) (1.1 ed.).
  80. "Attributes". SMART Linux project. Source forge.
  81. "S.M.A.R.T. Attribute: Soft Read Error Rate / Off Track Errors (Maxtor) | Knowledge Base". kb.acronis.com. Retrieved 2017-03-19.
  82. "S.M.A.R.T. attribute list (ATA)". HD sentinel.
  83. "9142: S.M.A.R.T. Attribute: Flying Height". Acronis Knowledge Base. Acronis International. Retrieved 12 July 2016.
  84. "9146: S.M.A.R.T. Attribute: Vibration During Write". Acronis Knowledge Base. Acronis International. Retrieved 12 July 2016.
  85. "9147: S.M.A.R.T. Attribute: Shock During Write". Acronis Knowledge Base. Acronis International. Retrieved 12 July 2016.
  86. "9154: S.M.A.R.T. Attribute: Torque Amplification Count". Acronis Knowledge Base. Acronis International. Retrieved 12 July 2016.
  87. "9156: S.M.A.R.T. Attribute: GMR Head Amplitude". Acronis Knowledge Base. Acronis International.
  88. "SMART Attribute Details" (PDF). Kingston.
  89. "S.M.A.R.T. Monitoring Tools / Mailing Lists". sourceforge.net. Retrieved 2017-03-19.
  90. "Ticket 171". Smartmontools (log). Source forge.
  91. "9157: S.M.A.R.T. Attribute: Head Flying Hours / Transfer Error Rate (Fujitsu)". Acronis Knowledge Base. Acronis International.
  92. "MHY2250BH, MHY2200BH, MHY2160BH, MHY2120BH, MHY2100BH, MHY2080BH, MHY2060BH, MHY2040BH Disk Drives, Product/Maintenance Manual" (PDF). Fujitsu Limited. Archived from the original (PDF) on 2021-02-25. Retrieved 2015-09-18.
  93. "SlimSATA SSD Mini-SATA Embedded Flash Module" (PDF) (Engineering Specification). Delkin Devices. 2013. Archived from the original (PDF) on 2015-11-17. Retrieved 2015-11-16.
  94. "SlimSATA SSD Mini-SATA Embedded Flash Module" (PDF) (Engineering Specification). Delkin Devices. 2013. Archived from the original (PDF) on 2015-11-17. Retrieved 2015-11-16.
  95. "TN-FD-22: Client SATA SSD SMART Attribute Reference" (PDF). 2013. Retrieved 16 May 2023.
  96. "TN-FD-23: Calculating Write Amplification Factor" (PDF). Micron. 2014. Archived from the original (PDF) on 6 June 2023. Retrieved 16 May 2023.
  97. "Intel Solid-state Drive 520 Series Product Specification" (PDF) (product manual). Intel. February 2012.
  98. "SMART Modular Technologies S.M.A.R.T. attributes - A new Windows interface for smartctl". google.com.
  99. "Momentus 7200.2 SATA" (PDF) (product manual) (D ed.). Seagate. September 2007. Hitachi Document Part Number S13K-1055-20.
  100. "FAQ". Drive health. Archived from the original on September 26, 2011. Retrieved October 4, 2011.
  101. "The interpretation of the TEC and the SMART". Altrix soft. Archived from the original on January 13, 2012. Retrieved October 4, 2011.
  102. NVM Express Base Specification Revision 2.0a (PDF). July 23, 2021. pp. 181–3.
  103. Seagate (December 2010). "SCSI Commands Reference Manual (100293068 Rev. D)" (PDF).
  104. "smartctl - Control and Monitor Utility for SMART Disks".
  105. "HDDScan". – free HDD test utility with USB flash and RAID support.
  106. Evans, Mark (26 April 1999). "Hard Drive Self-tests" (PDF). Milpitas, Calif., US: T10.
  107. "HDD fails S.M.A.R.T. short test, but passes long test?". Hardware Canucks. Archived from the original on 2013-07-29. Retrieved 2013-01-15.
  108. Bulik, Darrin (Sep 24, 2001). "Proposal for Extensions To Drive Self Test" (PDF). Lake Forest, Calif.: T10. Archived from the original (PDF) on 2011-09-28.
  109. McLean, Pete (23 October 2001). "Proposal for a Selective Self-test" (PDF). Longmont, Colo.: T10. Archived from the original (PDF) on 28 September 2011.
  110. "PSA: If you have SAS drives, check the Background Media Scan function. It's very useful and not necessarily on by default". r/DataHoarder. 4 December 2019.
  111. "test_offline – smartmontools". www.smartmontools.org.
  112. Smartmontools mailing lists

Further reading

Projects
Articles
Specifications