Nested RAID levels

Nested RAID levels, also known as hybrid RAID, combine two or more of the standard RAID levels to gain performance, additional redundancy or both, as a result of combining properties of different standard RAID layouts. [1] [2]

Nested RAID levels are usually designated by a sequence of digits, and the most commonly used levels use two. The first (leftmost) digit denotes the lowest RAID level in the "stack", while the rightmost one denotes the highest layered RAID level; for example, RAID 50 layers the data striping of RAID 0 on top of the distributed parity of RAID 5. Nested RAID levels include RAID 01, RAID 10, RAID 100, RAID 50 and RAID 60, which all combine data striping with other RAID techniques; as a result of the layering scheme, RAID 01 and RAID 10 represent significantly different nested RAID levels. [3]
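The naming rule above can be illustrated with a small Python sketch. The function name `layers` is hypothetical, and the sketch assumes each digit of the designation names one standard RAID level, with an optional "+" between digits.

```python
def layers(designation: str) -> list[str]:
    """Split a nested RAID designation into its layers, lowest layer first.

    Assumption: every digit names one standard RAID level, so "RAID 50"
    means RAID 5 at the bottom with RAID 0 layered on top.
    """
    digits = designation.upper().removeprefix("RAID").strip().replace("+", "")
    if not digits.isdigit():
        raise ValueError(f"not a numeric RAID designation: {designation!r}")
    return [f"RAID {d}" for d in digits]


for name in ("RAID 01", "RAID 10", "RAID 50", "RAID 10+0"):
    print(name, "->", " under ".join(layers(name)))
```

Read left to right, each printed list goes from the bottom of the stack to the top, which is why RAID 01 and RAID 10 come out in opposite orders.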

RAID 01 (RAID 0+1)

A nested RAID 01 configuration

RAID 01, also called RAID 0+1, is a RAID level using a mirror of stripes, achieving both replication and sharing of data between disks. [3] The usable capacity of a RAID 01 array is the same as in a RAID 1 array made of the same drives, in which one half of the drives is used to mirror the other half: $(N/2) \cdot S_{\min}$, where $N$ is the total number of drives and $S_{\min}$ is the capacity of the smallest drive in the array. [4]
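As a rough illustration of the capacity rule, the following Python sketch takes half the drive count times the smallest drive size. The function name and the requirement of an even drive count of at least four are assumptions made for this example only.

```python
def raid01_usable_capacity(drive_sizes_gb: list[float]) -> float:
    """Usable capacity of a RAID 01 array: half the drives hold mirror copies,
    and every drive contributes only as much as the smallest one."""
    n = len(drive_sizes_gb)
    # Assumption for this sketch: an even number of drives, at least four.
    if n < 4 or n % 2:
        raise ValueError("expected an even number of drives, at least four")
    return (n / 2) * min(drive_sizes_gb)


print(raid01_usable_capacity([500, 500, 500, 500]))  # four equal drives
print(raid01_usable_capacity([500, 400, 500, 500]))  # limited by the 400 GB drive
```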

At least four disks are required in a standard RAID 01 configuration, but larger arrays are also used.

RAID 03 (RAID 0+3)

A typical RAID 03 configuration

RAID 03, also called RAID 0+3 and sometimes RAID 53, is similar to RAID 01 with the exception that byte-level striping with dedicated parity is used instead of mirroring. [5]

RAID 10 (RAID 1+0)

A typical RAID 10 configuration

RAID 10, also called RAID 1+0 and sometimes RAID 1&0, is similar to RAID 01, with the exception that the two standard RAID levels are layered in the opposite order; thus, RAID 10 is a stripe of mirrors. [3]

RAID 10, as recognized by the Storage Networking Industry Association (SNIA) and as generally implemented by RAID controllers, is a RAID 0 array of mirrors, which may be two- or three-way mirrors, [6] and requires a minimum of four drives. However, a nonstandard definition of "RAID 10" was created for the Linux MD driver; Linux "RAID 10" can be implemented with as few as two disks. Implementations that support two disks, such as Linux RAID 10, offer a choice of layouts. [7] Arrays of more than four disks are also possible.
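To make the difference between a stripe of mirrors (RAID 10) and a mirror of stripes (RAID 01) concrete, the following Python sketch enumerates every two-drive failure in a four-drive array. It assumes a simplified model in which a RAID 0 stripe set becomes unusable as soon as any one of its drives fails; the function names and drive numbering are hypothetical.

```python
from itertools import combinations


def survives_raid10(failed: set[int], mirrors: list[set[int]]) -> bool:
    # Stripe of mirrors: every mirror must keep at least one working drive.
    return all(not mirror <= failed for mirror in mirrors)


def survives_raid01(failed: set[int], stripe_sets: list[set[int]]) -> bool:
    # Mirror of stripes: at least one stripe set must be completely intact
    # (simplifying assumption: one failed drive disables its whole stripe set).
    return any(not (stripe & failed) for stripe in stripe_sets)


# Drives 0-1 and 2-3 form the two inner groups in both layouts.
groups = [{0, 1}, {2, 3}]
pairs = [set(p) for p in combinations(range(4), 2)]

raid10_ok = sum(survives_raid10(f, groups) for f in pairs)
raid01_ok = sum(survives_raid01(f, groups) for f in pairs)
print(f"RAID 10 survives {raid10_ok} of {len(pairs)} two-drive failures")
print(f"RAID 01 survives {raid01_ok} of {len(pairs)} two-drive failures")
```

Under these assumptions the same four drives tolerate four of the six possible two-drive failures as RAID 10, but only two of six as RAID 01.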

According to manufacturer specifications and official independent benchmarks, in most cases RAID 10 [8] provides better throughput and latency than all other RAID levels [9] except RAID 0 (which wins in throughput). [10] Thus, it is the preferable RAID level for I/O-intensive applications such as database, email, and web servers, as well as for any other use requiring high disk performance. [11]

RAID 50 (RAID 5+0)

A typical RAID 50 configuration. A1, B1, etc. each represent one data block; each column represents one disk; Ap, Bp, etc. each represent parity information for each distinct RAID 5 and may represent different values across the RAID 5 (that is, Ap for A1 and A2 can differ from Ap for A3 and A4).

RAID 50, also called RAID 5+0, combines the straight block-level striping of RAID 0 with the distributed parity of RAID 5. [3] As a RAID 0 array striped across RAID 5 elements, a minimal RAID 50 configuration requires six drives. On the right is an example where three RAID 5 sets, each made of three 120 GB drives, are striped together to make 720 GB of total storage space.
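The 720 GB figure can be checked with a short Python sketch. It assumes equal-sized drives and equal-sized RAID 5 sets, each of which gives up one drive's worth of space to parity; the function name is hypothetical.

```python
def raid50_usable_capacity(sets: int, drives_per_set: int, drive_gb: float) -> float:
    """Usable capacity of RAID 50 with equal drives: each RAID 5 set loses
    one drive's worth of space to parity, and RAID 0 adds the sets together."""
    if sets < 2 or drives_per_set < 3:
        raise ValueError("RAID 50 needs at least two RAID 5 sets of three drives")
    return sets * (drives_per_set - 1) * drive_gb


print(raid50_usable_capacity(sets=3, drives_per_set=3, drive_gb=120))  # the example above
print(raid50_usable_capacity(sets=2, drives_per_set=3, drive_gb=120))  # minimal six-drive array
```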

One drive from each of the RAID 5 sets could fail without loss of data; for example, a RAID 50 configuration with three RAID 5 sets can tolerate at most three simultaneous drive failures, and only if each failure falls in a different RAID 5 set. Because the reliability of the system depends on quick replacement of the failed drive so that the array can rebuild, it is common to include hot spares that can immediately start rebuilding the array upon failure. However, this does not address the problem that the array is put under maximum strain, reading every bit to rebuild, at the time when it is most vulnerable. [12] [13]
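The failure rule (at most one failed drive per RAID 5 set) can be expressed as a small check. This is a sketch under an assumed drive numbering, in which drive `i` belongs to set `i // drives_per_set`; the function name is hypothetical.

```python
def raid50_survives(failed: set[int], sets: int, drives_per_set: int) -> bool:
    """Return True if no RAID 5 set has lost more than one drive.

    Assumed numbering for this sketch: drive i sits in set i // drives_per_set.
    """
    failures_per_set = [0] * sets
    for drive in failed:
        if not 0 <= drive < sets * drives_per_set:
            raise ValueError(f"no such drive: {drive}")
        failures_per_set[drive // drives_per_set] += 1
    return all(count <= 1 for count in failures_per_set)


# Three RAID 5 sets of three drives each (drives 0-2, 3-5 and 6-8).
print(raid50_survives({0, 3, 6}, sets=3, drives_per_set=3))  # one failure in each set
print(raid50_survives({0, 1}, sets=3, drives_per_set=3))     # two failures in the first set
```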

RAID 50 improves upon the performance of RAID 5, particularly during writes, and provides better fault tolerance than a single RAID level does. This level is recommended for applications that require high fault tolerance, capacity and random access performance. As the number of drives in a RAID set and the capacity of the drives increase, the fault-recovery time grows correspondingly, because rebuilding the RAID set takes longer. [12] [13]

RAID 60 (RAID 6+0)

A typical RAID 60 configuration consisting of two sets of four drives each

RAID 60, also called RAID 6+0, combines the straight block-level striping of RAID 0 with the distributed double parity of RAID 6, resulting in a RAID 0 array striped across RAID 6 elements. It requires at least eight disks. [14]
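The eight-disk minimum follows from striping across at least two RAID 6 sets, each of which needs at least four drives. The Python sketch below captures that arithmetic for equal-sized drives; the class name, field names and the 1000 GB default drive size are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Raid60:
    sets: int = 2             # RAID 6 sets striped together by RAID 0
    drives_per_set: int = 4   # a RAID 6 set needs at least four drives
    drive_gb: float = 1000.0  # hypothetical drive size

    def __post_init__(self) -> None:
        if self.sets < 2 or self.drives_per_set < 4:
            raise ValueError("RAID 60 needs at least two RAID 6 sets of four drives")

    @property
    def total_drives(self) -> int:
        return self.sets * self.drives_per_set

    @property
    def usable_gb(self) -> float:
        # Each RAID 6 set spends two drives' worth of space on parity.
        return self.sets * (self.drives_per_set - 2) * self.drive_gb

    @property
    def best_case_failures(self) -> int:
        # Two failures per set, provided they are spread evenly over the sets.
        return 2 * self.sets


minimal = Raid60()
print(minimal.total_drives, minimal.usable_gb, minimal.best_case_failures)
```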

RAID 100 (RAID 10+0)

A typical RAID 100 configuration

RAID 100, sometimes also called RAID 10+0, is a stripe of RAID 10s. This is logically equivalent to a wider RAID 10 array, but is generally implemented using software RAID 0 over hardware RAID 10. Being "striped two ways", RAID 100 is described as a "plaid RAID". [15]
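The claim that RAID 100 is logically equivalent to a wider RAID 10 can be illustrated by tracing which mirror pair receives each logical block. The sketch below assumes round-robin striping with a chunk size of one block at both levels, which real implementations need not use; the function name and parameters are hypothetical.

```python
def mirror_pair_for_block(block: int, arrays: int, pairs_per_array: int) -> int:
    """Return the global index of the mirror pair that stores a logical block."""
    array = block % arrays                # outer RAID 0 picks a RAID 10 array
    inner_block = block // arrays         # block number inside that array
    pair = inner_block % pairs_per_array  # inner RAID 0 picks a mirror pair
    return array * pairs_per_array + pair


arrays, pairs_per_array = 2, 2
touched = [
    mirror_pair_for_block(b, arrays, pairs_per_array)
    for b in range(arrays * pairs_per_array)
]
print(touched)  # [0, 2, 1, 3]
```

Every run of `arrays * pairs_per_array` consecutive blocks lands on each mirror pair exactly once, as it would in a single RAID 10 with that many mirror pairs; only the visiting order differs.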

Comparison

The following table provides an overview of some considerations for nested RAID levels. In each expression, n is the total number of drives in the array.

| Level | Description | Minimum number of drives [a] | Space efficiency | Fault tolerance (min) | Fault tolerance (max) |
|---|---|---|---|---|---|
| RAID 01 | Block-level striping, and mirroring without parity | 4 | 1 / strips per stripe | strips per stripe − 1 | n − n / strips per stripe |
| RAID 03 | Block-level striping, and byte-level striping with dedicated parity | 6 | 1 − 1 / strips per stripe | 1 | n / strips per stripe |
| RAID 10 [b] | Mirroring without parity, and block-level striping | 4 | 1 / strips per stripe | strips per stripe − 1 | (strips per stripe − 1) × strips per stripe |
| RAID 1+6 | Mirroring without parity, and block-level striping with double distributed parity | 8 | (1 − 2 / strips per stripe) / 2 | 2 × strips per stripe | 2 × strips per stripe + (n / strips per stripe) − 2 |
| RAID 50 | Block-level striping with distributed parity, and block-level striping | 6 | 1 − (1 / strips per stripe) | 1 | n / strips per stripe |
| RAID 60 | Block-level striping with double distributed parity, and block-level striping | 8 | 1 − (2 / strips per stripe) | 2 | 2 × (n / strips per stripe) |
| RAID 100 | Mirroring without parity, and two levels of block-level striping | 8 | 1 / strips per stripe | strips per stripe − 1 | (strips per stripe − 1) × (strips per stripe) |


Explanatory notes

  a. Assumes a non-degenerate minimum number of drives.
  b. Theoretical maximum read performance can be represented as n×. However, this may be as low as (n / spans)× in practice, depending on configuration and implementation; theoretical maximum write performance can be represented as (n / spans)×, which is close to observed values in practice; see "Performance comparison" section above for explanation of n.

References

  1. Delmar, Michael Graves (2003). "Data Recovery and Fault Tolerance". The Complete Guide to Networking and Network+. Cengage Learning. p. 448. ISBN 1-4018-3339-X.
  2. Mishra, S. K.; Vemulapalli, S. K.; Mohapatra, P. (1995). "Dual-Crosshatch Disk Array: A Highly Reliable Hybrid-RAID Architecture". Proceedings of the 1995 International Conference on Parallel Processing: Volume 1. CRC Press. pp. I-146ff. ISBN 0-8493-2615-X.
  3. Layton, Jeffrey B. (2011-01-06). "Intro to Nested-RAID: RAID-01 and RAID-10". Linux-Mag.com. Linux Magazine. Archived from the original on January 10, 2011. Retrieved 2015-02-01.
  4. Kozierok, Charles (17 August 2018). "RAID Levels 0+1 (01) and 1+0 (10)". The PC Guide. Retrieved May 28, 2019.
  5. Kozierok, Charles (5 September 2018). "RAID Levels 0+3 (03 or 53) and 3+0 (30)". The PC Guide. Retrieved May 28, 2019.
  6. Dawkins, Bill; Jones, Arnold (2006-07-28). "Common RAID Disk Data Format Specification" (PDF). SNIA.org (1.2 ed.). Storage Networking Industry Association. Archived from the original (PDF) on 2009-08-24. Retrieved 2015-01-31.
  7. Brown, Neil (27 August 2004). "RAID10 in Linux MD driver". Archived from the original on 12 September 2013. Retrieved 17 April 2009.
  8. "Intel Rapid Storage Technology: What is RAID 10?". Intel. 16 November 2009.
  9. "IBM and HP 6-Gbps SAS RAID Controller Performance" (PDF). Demartek. October 2009. Archived from the original (PDF) on 2011-06-05.
  10. Kozierok, Charles (15 August 2018). "Summary Comparison of RAID Levels". The PC Guide. Retrieved May 28, 2019.
  11. Gupta, Meeta (2002). Storage Area Network Fundamentals. Cisco Press. p. 268. ISBN 1-58705-065-X.
  12. "Cisco UCS Servers RAID Guide, Chapter 1: RAID Overview" (PDF). Cisco.com. Cisco Systems. pp. 1–14, 1–15. Retrieved 2015-02-01.
  13. Lowe, Scott (2010-07-09). "RAID 50 offers a balance of performance, storage capacity, and data integrity". TechRepublic.com. Retrieved 2015-02-01.
  14. "Which RAID Level is Right for Me: RAID 60 (Striping and striping with dual party)". Adaptec.com. Adaptec. Archived from the original on 2015-07-10. Retrieved 2015-02-03.
  15. McKinstry, Jim. "Server Management: Questions and Answers". SAMag.com. Archived from the original on 19 January 2008.
