Native Command Queuing

NCQ allows the drive itself to determine the optimal order in which to retrieve outstanding requests. This may, as here, allow the drive to fulfill all requests in fewer rotations and thus less time.

In computing, Native Command Queuing (NCQ) is an extension of the Serial ATA protocol allowing hard disk drives to internally optimize the order in which received read and write commands are executed. This can reduce the amount of unnecessary drive head movement, resulting in increased performance (and slightly decreased wear of the drive) for workloads where multiple simultaneous read/write requests are outstanding, most often occurring in server-type applications.
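As a rough illustration of the idea (not any vendor's actual algorithm, which is proprietary), the sketch below greedily services whichever outstanding request is closest to the current head position, a shortest-seek-time-first policy. Real firmware also weighs rotational position, which this simplified model ignores.

    /* Hedged sketch: shortest-seek-time-first reordering of a queue of
       outstanding requests. LBA distance stands in for seek time; real NCQ
       firmware also accounts for rotational position. */
    #include <stdbool.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(void) {
        long pending[] = {7200, 150, 9800, 300, 5000}; /* queued target LBAs */
        size_t n = sizeof pending / sizeof pending[0];
        bool done[32] = {false};                       /* NCQ queues at most 32 */
        long head = 0;                                 /* current head position */

        for (size_t served = 0; served < n; served++) {
            size_t best = 0;
            long best_dist = -1;
            for (size_t i = 0; i < n; i++) {
                if (done[i])
                    continue;
                long d = labs(pending[i] - head);
                if (best_dist < 0 || d < best_dist) {
                    best_dist = d;
                    best = i;
                }
            }
            printf("service LBA %ld (seek distance %ld)\n", pending[best], best_dist);
            head = pending[best];
            done[best] = true;
        }
        return 0;
    }

On this toy queue the greedy ordering services LBAs 150, 300, 5000, 7200, 9800 for a total seek distance of 9,800, versus 38,100 if the requests were serviced in arrival order.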


History

Native Command Queuing was preceded by Parallel ATA's version of Tagged Command Queuing (TCQ). ATA's attempt at integrating TCQ was constrained by the requirement that ATA host bus adapters use ISA bus device protocols to interact with the operating system. The resulting high CPU overhead and negligible performance gain contributed to a lack of market acceptance for ATA TCQ.

NCQ differs from TCQ in that, with NCQ, each command is of equal importance; however, NCQ's host bus adapter programs its own first-party DMA engine with the CPU-supplied DMA parameters during its command sequence, whereas TCQ interrupts the CPU during command queries and requires it to modulate the ATA host bus adapter's third-party DMA engine. NCQ's approach is preferable because the drive has more accurate knowledge of its own performance characteristics and can account for its rotational position. Both NCQ and TCQ have a maximum queue length of 32 outstanding commands. [1] [2] Because ATA TCQ is rarely used, Parallel ATA (and the IDE mode of some chipsets) usually supports only one outstanding command per port.
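To make the 32-command limit concrete, here is a hedged sketch of how a host-side driver might track NCQ tags with a 32-bit bitmap, similar in spirit to the 32 command slots an AHCI controller exposes; the function names are invented for the example.

    /* Illustrative tag bookkeeping: each outstanding NCQ command owns one
       tag in 0..31 until the drive reports it complete. */
    #include <stdint.h>
    #include <stdio.h>

    static uint32_t tags_in_use; /* bit i set means tag i is outstanding */

    /* Returns a free tag in 0..31, or -1 when all 32 are outstanding. */
    int alloc_tag(void) {
        for (int i = 0; i < 32; i++) {
            if (!(tags_in_use & (UINT32_C(1) << i))) {
                tags_in_use |= UINT32_C(1) << i;
                return i;
            }
        }
        return -1; /* queue full: the host must wait for a completion */
    }

    void free_tag(int tag) {
        tags_in_use &= ~(UINT32_C(1) << tag);
    }

    int main(void) {
        int t1 = alloc_tag(), t2 = alloc_tag();
        printf("issued commands with tags %d and %d\n", t1, t2);
        free_tag(t1);
        free_tag(t2);
        return 0;
    }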

For NCQ to be enabled, it must be supported and enabled in the SATA host bus adapter and in the hard drive itself. The appropriate driver must be loaded into the operating system to enable NCQ on the host bus adapter. [3]
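On Linux, one way to check whether NCQ is actually in effect is to read the queue depth the kernel negotiated for the device: a value of 1 generally means NCQ is disabled or unsupported, while NCQ-capable SATA drives typically report around 31 or 32. A minimal sketch, assuming a Linux system and a drive named sda:

    /* Linux-only sketch: print the negotiated queue depth for /dev/sda.
       The device name is an assumption; substitute your own. */
    #include <stdio.h>

    int main(void) {
        FILE *f = fopen("/sys/block/sda/device/queue_depth", "r");
        if (f == NULL) {
            perror("queue_depth");
            return 1;
        }
        int depth = 0;
        if (fscanf(f, "%d", &depth) == 1)
            printf("sda queue depth: %d\n", depth);
        fclose(f);
        return 0;
    }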

Many newer chipsets support the Advanced Host Controller Interface (AHCI), which allows operating systems to control them in a uniform way and to enable NCQ. DragonFly BSD has supported AHCI with NCQ since 2.3 in 2009. [4] [5] Linux kernels have supported AHCI natively since version 2.6.19, and FreeBSD has fully supported AHCI since version 8.0. Windows Vista and Windows 7 also natively support AHCI, but their AHCI support (via the msahci service) must be enabled manually via registry editing if controller support was not present during their initial installation. Windows 7's AHCI enables not only NCQ but also TRIM support on SSDs (with supporting firmware). Older operating systems such as Windows XP require the installation of a vendor-specific driver (similar to installing a RAID or SCSI controller) even if AHCI is present on the host bus adapter, which makes initial setup more tedious and conversion of existing installations relatively difficult, as most controllers cannot operate their ports in mixed AHCI–SATA/IDE/legacy mode.

Hard disk drives

Performance

A 2004 test with a first-generation NCQ drive (Seagate 7200.7 NCQ) found that while NCQ increased IOMeter performance, desktop application performance decreased. [6] One review in 2010 found improvements on the order of 9% on average with NCQ enabled in a series of Windows multitasking tests. [7]

NCQ can negatively interfere with the operating system's I/O scheduler, decreasing performance; [8] this has been observed in practice on Linux with RAID-5. [9] NCQ gives the host no mechanism to specify any sort of deadline for an I/O, such as how many times a request can be passed over in favor of others. In theory, a queued request can be delayed by the drive for an arbitrarily long time while it serves other (possibly newer) requests under I/O pressure. [8] Since the algorithms used inside drives' firmware for NCQ dispatch ordering are generally not publicly known, this introduces another level of uncertainty for hardware/firmware performance. Tests at Google around 2008 showed that NCQ can delay an I/O for up to 1–2 seconds. A proposed workaround is for the operating system to artificially starve the NCQ queue sooner in order to satisfy low-latency applications in a timely manner, as sketched below. [10]
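A hedged sketch of that workaround: a host-side policy that stops feeding the drive new commands once any outstanding request has exceeded a deadline, so the drive is eventually forced to serve the old request. The names and the threshold here are invented for illustration.

    /* Sketch of deadline-based queue starvation: once any request is older
       than DEADLINE_MS, refuse to issue more commands until it completes. */
    #include <stdbool.h>
    #include <stdio.h>
    #include <time.h>

    #define DEADLINE_MS 500 /* illustrative latency bound */

    struct request {
        struct timespec issued; /* when the command entered the drive queue */
    };

    static long age_ms(const struct request *r, const struct timespec *now) {
        return (now->tv_sec - r->issued.tv_sec) * 1000
             + (now->tv_nsec - r->issued.tv_nsec) / 1000000;
    }

    /* May the host push another command into the drive's NCQ queue? */
    bool may_issue(const struct request *outstanding, int count) {
        struct timespec now;
        clock_gettime(CLOCK_MONOTONIC, &now);
        for (int i = 0; i < count; i++) {
            if (age_ms(&outstanding[i], &now) > DEADLINE_MS)
                return false; /* drain the queue so the drive must serve it */
        }
        return true;
    }

    int main(void) {
        struct request r;
        clock_gettime(CLOCK_MONOTONIC, &r.issued); /* just issued */
        printf("may issue more: %s\n", may_issue(&r, 1) ? "yes" : "no");
        return 0;
    }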

On some drives' firmware, such as the WD Raptor circa 2007, read-ahead is disabled when NCQ is enabled, resulting in slower sequential performance. [11]

SATA solid-state drives benefit significantly from being able to queue multiple commands for parallel workloads. PCIe-based NVMe SSDs take queuing much further: NVMe supports up to 65,535 queues with up to 65,535 commands each.

Safety (FUA)

One lesser-known feature of NCQ is that, unlike its ATA TCQ predecessor, it allows the host to specify whether it wants to be notified when the data reaches the disk's platters, or when it reaches the disk's buffer (on-board cache). Assuming a correct hardware implementation, this feature allows data consistency to be guaranteed when the disk's on-board cache is used in conjunction with system calls like fsync. [12] The associated write flag, which is also borrowed from SCSI, is called Force Unit Access (FUA). [13] [14] [15]
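To illustrate the durability contract this enables, here is a minimal POSIX sketch: once fsync() returns successfully, the written record is expected to be on stable storage, which the kernel's block layer implements underneath with cache flushes and/or FUA writes when the drive supports them. The filename is arbitrary.

    /* Minimal durability sketch: write a record, then fsync() so the kernel
       forces it past the drive's volatile write cache. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    int main(void) {
        int fd = open("journal.log", O_WRONLY | O_CREAT | O_APPEND, 0644);
        if (fd < 0) {
            perror("open");
            return 1;
        }
        const char record[] = "commit\n";
        if (write(fd, record, strlen(record)) != (ssize_t)strlen(record)) {
            perror("write"); /* data may still be only in the page cache */
            close(fd);
            return 1;
        }
        if (fsync(fd) != 0) { /* request durability: flush / FUA underneath */
            perror("fsync");
            close(fd);
            return 1;
        }
        /* After a successful fsync, the record should survive power loss. */
        close(fd);
        return 0;
    }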

Solid-state drives

NCQ is also used in newer solid-state drives, where it is often the drive that waits on the host rather than the other way around. For example, Intel's X25-E Extreme solid-state drive uses NCQ to ensure that the drive has commands to process while the host system is busy with CPU tasks. [16]

NCQ also enables the SSD controller to complete commands concurrently (or partly concurrently, for example using pipelines) where the internal organisation of the device enables such processing.

The NVM Express (NVMe) standard also supports command queuing, in a form optimized for SSDs. [17] NVMe allows multiple queues for a single controller and device, allowing at the same time much higher depths for each queue, which more closely matches how the underlying SSD hardware works. [18]
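The sketch below is a simplified structural model of that design (not the actual NVMe register layout): each CPU can own its own submission/completion queue pair, so cores issue I/O without contending for a single shared 32-slot queue as with AHCI/NCQ. Sizes and field choices are illustrative.

    /* Simplified model of NVMe's per-CPU queue pairs; real entries are
       64-byte submission and 16-byte completion structures. */
    #include <stdint.h>

    #define QUEUE_ENTRIES 1024 /* per-queue depth; NVMe allows up to 65,535 */
    #define NUM_CPUS      64

    struct sq_entry { uint8_t opcode; uint16_t command_id; uint64_t lba; };
    struct cq_entry { uint16_t command_id; uint16_t status; };

    struct queue_pair {
        struct sq_entry sq[QUEUE_ENTRIES]; /* host writes commands here */
        struct cq_entry cq[QUEUE_ENTRIES]; /* controller posts completions */
        uint16_t sq_tail;                  /* host-owned producer index */
        uint16_t cq_head;                  /* host-owned consumer index */
    };

    /* One pair per CPU: submissions on different cores never contend. */
    static struct queue_pair per_cpu_queues[NUM_CPUS];

    int main(void) {
        (void)per_cpu_queues;
        return 0;
    }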

References

  1. PDF white paper on NCQ from Intel and Seagate
  2. Volume 1 of the final draft of the ATA-7 standard
  3. "SATA II Native Command Queuing Overview", Intel Whitepaper, April 2003.
  4. Matthew Dillon (2009-06-04). "Re: DragonFly-2.3.1.165.g25822 master sys/dev/disk/ahci Makefile TODO ahci.c ahci.h ahci_attach.c ahci_cam.c ahci_dragonfly.c ahci_dragonfly.h atascsi.h".
  5. Matthew Dillon (2009). "ahci(4) — Advanced Host Controller Interface for Serial ATA". BSD Cross Reference. DragonFly BSD.
  6. "Seagate's Barracuda 7200.7 NCQ hard drive - The Tech Report - Page 13". The Tech Report. 17 December 2004. Retrieved 2014-01-11.
  7. "Multitasking with Native Command Queuing - The Tech Report - Page 5". The Tech Report. 3 August 2005. Retrieved 2014-01-11.
  8. Yu, Y. J.; Shin, D. I.; Eom, H.; Yeom, H. Y. (2010). "NCQ vs. I/O scheduler". ACM Transactions on Storage. 6: 1–37. doi:10.1145/1714454.1714456. S2CID 14414608.
  9. "hard drive - Poor Linux software RAID 5 performance with NCQ". Server Fault. Retrieved 2014-01-11.
  10. Gwendal Grignou, "NCQ Emulation", FLS'08 talk summary (p. 109) and slides.
  11. "Mark Lord: Re: Lower HD transfer rate with NCQ enabled?". LKML. 2007-04-03. Retrieved 2014-01-11.
  12. Marshall Kirk McKusick. "Disks from the Perspective of a File System - ACM Queue". Queue.acm.org. Retrieved 2014-01-11.
  13. Gregory Smith (2010). PostgreSQL 9.0: High Performance. Packt Publishing Ltd. p. 78. ISBN 978-1-84951-031-8.
  14. http://www.seagate.com/docs/pdf/whitepaper/D2c_tech_paper_intc-stx_sata_ncq.pdf
  15. Jonathan Corbet (2010-08-18). "The end of block barriers". LWN.net. Retrieved 2015-06-27.
  16. Gasior, Geoff (November 23, 2008). "Intel's X25-E Extreme solid-state drive - Now with single-level cell flash memory". Tech Report.
  17. Dave Landsman (2013-08-09). "AHCI and NVMe as Interfaces for SATA Express Devices – Overview" (PDF). SATA-IO. Retrieved 2013-10-02.
  18. "NVM Express Overview". nvmexpress.org. Retrieved 2014-11-26.