IBM SAN Volume Controller

The IBM SAN Volume Controller (SVC) is a block storage virtualization appliance that belongs to the IBM System Storage product family. SVC implements an indirection, or "virtualization", layer in a Fibre Channel storage area network (SAN).

Architecture

The IBM 2145 SAN Volume Controller (SVC) is an inline virtualization or "gateway" device. It logically sits between hosts and storage arrays, presenting itself to hosts as the storage provider (target) and presenting itself to storage arrays as one big host. SVC is physically attached to one or several SAN fabrics.

The virtualization approach allows for non-disruptive replacements of any part in the storage infrastructure, including the SVC devices themselves. It also aims at simplifying compatibility requirements in strongly heterogeneous server and storage landscapes. All advanced functions are therefore implemented in the virtualization layer, which allows switching storage array vendors without impact. Finally, spreading an SVC installation across two or more sites (stretched clustering) enables basic disaster protection paired with continuous availability.

SVC nodes are always clustered, with a minimum of 2 and a maximum of 8 nodes, and linear scalability. Nodes are rack-mounted appliances derived from IBM System x servers, protected by redundant power supplies and integrated batteries. Earlier models featured external battery-backed power supplies. Each node has Fibre Channel ports simultaneously used for incoming, outgoing, and intracluster data traffic. Hosts may also be attached via FCoE and iSCSI Gbit Ethernet ports. Intracluster communication includes maintaining read/write cache integrity, sharing status information, and forwarding reads and writes to any port. These ports must be zoned together.

Write cache is protected by mirroring within a pair of SVC nodes, called an I/O group. Virtualized resources (storage volumes presented to hosts) are distributed across I/O groups to improve performance. Volumes can also be moved nondisruptively between I/O groups, e.g., when new node pairs are added or older technology is removed. Node pairs are always active, meaning both members accept simultaneous writes for each volume. In addition, all other cluster nodes accept and forward read and write requests, which are internally handled by the appropriate I/O group. Path or board failures are compensated for by non-disruptive failover within each I/O group, or optionally across dispersed I/O groups. Hosts must have multipath drivers installed, such as IBM Subsystem Device Driver (SDD) [1] or standard MPIO drivers.
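The write-cache mirroring within an I/O group can be illustrated with a minimal sketch. This is not SVC code: the class names, the return strings, and the write-through fallback when the peer node is offline are assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One cluster node with a simple in-memory write cache (hypothetical model)."""
    name: str
    online: bool = True
    cache: dict = field(default_factory=dict)  # (volume, lba) -> data


@dataclass
class IOGroup:
    """A node pair that mirrors cached writes between its two members."""
    node_a: Node
    node_b: Node
    backend: dict = field(default_factory=dict)  # stands in for the storage array

    def write(self, receiving: Node, volume: str, lba: int, data: bytes) -> str:
        """Accept a write on either node and protect it before acknowledging."""
        if not receiving.online:
            raise RuntimeError(f"{receiving.name} is offline")
        peer = self.node_b if receiving is self.node_a else self.node_a
        key = (volume, lba)
        if peer.online:
            receiving.cache[key] = data
            peer.cache[key] = data  # mirrored copy protects the cached write
            return "ack-cached"
        # Assumption: without a peer to mirror to, write straight to the backend.
        self.backend[key] = data
        return "ack-written-through"

    def destage(self) -> int:
        """Flush cached writes to the backend and clear both caches."""
        dirty = {}
        for node in (self.node_a, self.node_b):
            dirty.update(node.cache)
            node.cache.clear()
        self.backend.update(dirty)
        return len(dirty)


group = IOGroup(Node("node1"), Node("node2"))
print(group.write(group.node_a, "vol0", 0, b"data"))  # acknowledged from cache
print(group.destage())                                # number of blocks flushed
```

Either node of the pair can be passed as `receiving`, which mirrors the statement that both members accept writes for each volume.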

SVC is based on COMmodity PArts Storage System (Compass) architecture, developed at the IBM Almaden Research Center. [1] The majority of the software has been developed at the IBM Hursley Labs in the UK.

Terminology

SVC node models
| Type-model | Cache [GB] | FC speed [Gb/s] | iSCSI speed [Gb/s] | Based upon | Announced |
| 2145-4F2 | 4 | 2 | n/a | x335 | 2 June 2003 |
| 2145-8F2 | 8 | 2 | 1 | x336 | 25 October 2005 |
| 2145-8F4 | 8 | 4 | 1 | x336 | 23 May 2006 |
| 2145-8G4 | 8 | 4 | 1 | x3550 | 22 May 2007 |
| 2145-8A4 | 8 | 4 | 1 | x3250M2 | 28 October 2008 |
| 2145-CF8 | 24 | 8 | 1 | x3550M2 | 20 October 2009 |
| 2145-CG8 | 24 | 8 | 1 (10 Gbit/s optional) | x3550M3 | 9 May 2011 |
| 2145-DH8 | 32 | 8 & 16 | 1 (10 Gbit/s optional) | x3650M4 | 6 May 2014 |
| 2145-SV1 | 64...256 | 16 | 10 Gbit/s | Xeon E5 v4 | 23 August 2016 |
| 2147-SV1 | 64...256 | 16 | 10 Gbit/s | Xeon E5 v4 | 23 August 2016 |

Timeline

Timeline for IBM SAN Volume Controller until August 2019

The different SAN Volume Controller models became available for purchase shortly after their announcement dates. The light green bars show the period during which each model could be ordered, while the light blue bars show how long standard service continued after withdrawal from marketing. The displayed information is current as of August 2019. There are differences in service conditions between 2145 and 2147, but not in hardware.

Performance

Release 4.3 of the SVC held the Storage Performance Council (SPC) world record for SPC-1 performance benchmarks, returning nearly 275K (274,997.58) IOPS. There was no faster storage subsystem benchmarked by the SPC at that time (October 2008). [2] The SPC-2 benchmark also returned a world-leading measurement of over 7 GB/s throughput.

Release 5.1 achieved new records with 4-node and 6-node cluster benchmarks, using DS8700 as the back-end storage device. In March 2010, SVC broke its own record of 274,997.58 SPC-1 IOPS with 315,043.59 for the 4-node cluster and 380,489.30 for the 6-node cluster, records that stood until October 2011.

Release 6.2 of the SVC held the Storage Performance Council (SPC) world record for SPC-1 performance benchmarks, returning over 500K (520,043.99) IOPS (I/Os per second) using 8 SVC nodes and Storwize V7000 as the backend disk. There was no faster storage subsystem benchmarked by the SPC at that time (January 2012). [3] The full results and executive summaries can be reviewed at the SPC website referenced above. [note 1]

Release 7.x provides multiple enhancements including support for additional CPUs, cache and adapters. The streamlined cache operates at 100 μs fall-through latency [4] and 60 μs cache-hit latency, enabling SVC as a front-end to IBM FlashSystem solid-state storage without significant performance penalty. (See also: FlashSystem V9000).

Included Features (7.x)

Indirection or mapping from virtual LUN to physical LUN
Servers access SVC as if it were a storage controller. The SCSI LUNs they see represent virtual disks (volumes) allocated in SVC from a pool of storage made up of one or more managed disks (MDisks). A managed disk is simply a storage LUN provided by one of the storage controllers that SVC is virtualizing. The virtual capacity can be larger than the managed physical capacity, with a current maximum of 32 PB, depending on management granularity (extent size).
Data migration and pooling
SVC can move volumes from one capacity pool (MDisk group) to another whilst maintaining I/O access to the data. Write and read caching remain active. Pools can be shrunk or expanded by removing or adding hardware capacity, while maintaining I/O access to the data. Both features can be used for seamless hardware migration. Migration from an old SVC model to the most recent model is also seamless and implies no copying of data.
Importing and exporting existing LUNs via Image Mode
"Image mode" is a non-virtualized pass-through representation of an MDisk (managed LUN) that contains existing client data; such an MDisk can be seamlessly imported into or removed from an SVC cluster.
Fast-write cache
Writes from hosts are acknowledged once they have been committed into the SVC mirrored cache, but prior to being destaged to the underlying storage controllers. Data is protected by replication to the peer node in an I/O group (cluster node pair). Cache size is dependent on the SVC hardware model and installed options. Fast-write cache is especially useful to increase performance in midrange storage configurations.
Auto tiering (Easy Tier)
SVC automatically selects the best storage hardware for each chunk of data, according to its access patterns. Cache-unfriendly "hot" data is dynamically moved to solid-state drives (SSDs), whereas cache-friendly data as well as "cold" data is moved to economical spinning disks. Easy Tier also monitors and optimizes spindle-only workloads if no solid-state storage is attached. Likewise, Easy Tier automatically optimizes solid-state workloads between Enterprise and Read Intensive flash media.
Solid state drive (SSD) capability
SVC can use any supported external SSD storage device or provide its own internal SSD slots, up to 32 per cluster. These can be used to boost aging spinning disk pools: Easy Tier is automatically active in hybrid mixed-media capacity pools.
Thin Provisioning
Physical capacity is consumed only when new data is written to a LUN. Data blocks containing only zeros are not physically allocated, unless they overwrite existing non-zero data. During import or during internal migrations, all-zero data blocks are discarded (thick-to-thin migration).
Thin provisioning is also integrated into the FlashCopy features detailed below to provide space-efficient snapshots.
Virtual Disk Mirroring
Provides the ability to maintain two redundant copies of a LUN, implicitly on different storage controllers.
Site protection with Stretched Cluster
A geographically distributed, highly available clustered storage setup leveraging the virtual disk mirroring feature across datacenters within 300 km distance. Stretched Clusters can span 2, 3 or 4 datacenters (chain or ring topology, a 4-site cluster requiring 8 cluster nodes). Cluster consistency is ensured by a majority voting set.
From two storage devices in two datacenters, SVC presents one common logical instance. User-side operations like Snapshot or LUN Resizing apply at the logical instance level. Hardware-oriented operations like real-time compression or live hardware migration occur at the physical instance level.
Unlike in classical mirroring, logical LUNs are readable and writable on both sides (tandem) at the same time, removing the need for failover, role switch, or site switch as found in Site Recovery managing products. The feature can be combined with Live Partition Mobility or VMotion to avoid bulk data transport during a metro-distance virtual server motion.
Geographical crossover access
All SVC cluster nodes in a Stretched Cluster have read/write access to storage hardware in the mirror location, removing the need for site-resynchronization in case of single node failures. This feature is mutually exclusive with Enhanced Stretched Cluster, and only recommended for single stretched node pairs.
Hot Standby Nodes
Powered nodes that can take over the role of failed nodes in a stretched or local cluster at very short notice.
Enhanced Stretched Cluster
A functionality optimizing data paths within a metro- or geo-distance Stretched Cluster (see above), helpful when bandwidth between sites is scarce and cross-site traffic must be minimized. SVC will attempt to use the shortest path for reads and writes. For instance, cache write destaging to storage devices is always performed by the most nearby cache copy, unless its peer cache copy is down. Two node pairs are the recommended minimum for an Enhanced Stretched Cluster.
Stretched Cluster with golden copy (3-site DR)
A Stretched Cluster that maintains an additional synchronous or asynchronous data copy on an independent Stretched Cluster or SVC or Storwize device at geo distances. The golden copy is a disaster protection against metro-scale outages impacting the Stretched Cluster as a whole. It relies on licensed Metro- or Global Mirror functionality.
Hyperswap
The ability to seamlessly fail over data access between geographically dispersed I/O groups or clusters. As with Stretched Cluster, both sides accept simultaneous writes, but write cache data is mirrored locally in both sites because I/O groups are kept together. Hyperswap can be combined with Live Partition Mobility or VMotion for maximized application availability. On the server side, Hyperswap works with most native multipath drivers with ALUA support. Hyperswap relies on Metro Mirror functionality and requires a Metro Mirror license, as well as a minimum of two node pairs.
Transparent Cloud Tiering
Swift- and S3-compatible object datastores can be used as a cold tier for incremental volume snapshots and volume archives without live production access. This allows keeping hourly time-machine copies or archiving VM images, including attached volumes, at a price point somewhat closer to tape media. On-premises datastore support is provided via OpenStack Swift. Off-premises datastore support is provided by Amazon S3 or Softlayer. Off-premises Transparent Cloud Tiering uses AES encryption by default, which is a licensed feature.
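The indirection from virtual LUN to physical LUN described at the start of this feature list, together with nondisruptive data migration, can be sketched as an extent lookup table. This is a simplified illustration, not the actual SVC data structure: the `Extent` and `Volume` classes, the MDisk names, and the 16 MiB extent size are assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Extent:
    mdisk: str  # name of the managed disk (back-end LUN); hypothetical names
    index: int  # extent number within that MDisk


class Volume:
    """Virtual disk built from fixed-size extents taken from a storage pool."""

    def __init__(self, extent_size: int, extents: List[Extent]):
        self.extent_size = extent_size
        self.extents = list(extents)

    def locate(self, offset: int) -> Tuple[str, int]:
        """Translate a virtual byte offset to (MDisk name, byte offset on MDisk)."""
        if not 0 <= offset < self.extent_size * len(self.extents):
            raise ValueError("offset outside volume")
        slot, within = divmod(offset, self.extent_size)
        extent = self.extents[slot]
        return extent.mdisk, extent.index * self.extent_size + within

    def migrate_extent(self, slot: int, target: Extent) -> None:
        """Repoint one virtual extent; the host-visible offset does not change."""
        # A real system would copy the extent's data before switching the mapping.
        self.extents[slot] = target


MiB = 1024 * 1024
vol = Volume(16 * MiB, [Extent("mdisk0", 5), Extent("mdisk1", 0)])
print(vol.locate(16 * MiB + 4096))        # served from mdisk1
vol.migrate_extent(1, Extent("mdisk2", 3))
print(vol.locate(16 * MiB + 4096))        # same virtual offset, now on mdisk2
```

Because hosts only ever address the virtual offset, repointing an extent changes where the data lives without changing what the host sees.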

Optional features

Some optional features are licensed separately, e.g. per TB: [1]

Real-Time Compression
This in-flight data reduction technology offers a footprint reduction of 50% (guaranteed) or up to 80% (found in Oracle databases). Leveraging dedicated compression hardware, it generally has no performance impact and is usable for heavy-duty databases. The temporal locality of the algorithm can even increase read performance on suitable data patterns such as SQL databases stored on spinning disks. The compression efficiency is equal to "zip" (Lempel–Ziv–Welch) with a very large dictionary, and can be predicted accurately across petabytes using the Comprestimator tool.
Real-Time Compression can be combined with Easy Tier, Thin Provisioning and Virtual Disk Mirroring. It was originally invented by the acquired startup Storwize Inc., [5] whose name was also adopted for the SVC-derived IBM storage systems family.
FlashCopy (Snapshot)
This is used to create a disk snapshot for backup/rollback or application testing of a single volume. Snapshots require only the "delta" capacity unless created with full-provisioned target volumes. FlashCopy comes in three flavours: Snapshot, Backup volume, and Clone, which is automatically unlinked from its source. All are based on optimized copy-on-write technology.
One source volume can have up to 256 simultaneous targets. Targets can be made incremental, and cascaded, tree-like dependency structures can be constructed. Targets can be re-applied to their source or to any other appropriate volume, including one of a different size (e.g. resetting any changes from a resize command).
Copy-on-write is based on a bitmap with a configurable grain size, as opposed to a journal. [1]
FlashCopy rollback (time machine)
Provides a time-machine inspired rollback capability using selectively granular consistency points. The consistency mechanism can cover many LUNs at a time. Rollback requires a FlashCopy license and the Spectrum Control Snapshot software.
Metro Mirror - synchronous remote replication
This allows a remote disaster recovery site at a distance of up to about 300 km. [6]
Global Mirror - asynchronous remote replication
This allows a remote disaster recovery site at a distance of thousands of kilometres. Each Global Mirror relationship can be configured for high latency / low bandwidth or for high latency / high bandwidth connectivity, the latter allowing a consistent recovery point objective (RPO) below 1 second.
Global Mirror over IP - remote replication over the Internet
Uses SANslide technology integrated into the SVC firmware to send mirroring data traffic across a TCP/IP link, while maximizing the bandwidth efficiency of that link. This may result in a 100x data transfer acceleration over long distances. [7]
Encryption of data at rest
SVC and other Spectrum Virtualize-based devices can transparently encrypt data on any local media, virtualized attached storage, or cloud tier (by default). The encryption mechanism is 256-bit AES-XTS. Keys are either generated locally and stored on removable thumb drives or obtained from a key lifecycle management service. The two options are mutually exclusive.
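The bitmap-based copy-on-write scheme mentioned under FlashCopy above can be illustrated with a small sketch. It is a generic copy-on-write model under stated assumptions, not IBM's implementation: the `Snapshot` class, its method names, and the tiny 4-byte grain size are hypothetical and chosen only to keep the example readable.

```python
class Snapshot:
    """Point-in-time copy tracked by one bitmap flag per grain (illustrative)."""

    def __init__(self, source: bytearray, grain_size: int):
        if grain_size <= 0 or len(source) % grain_size:
            raise ValueError("source length must be a multiple of grain_size")
        self.source = source
        self.grain_size = grain_size
        self.copied = [False] * (len(source) // grain_size)  # the bitmap
        self.target = {}  # grain number -> preserved original bytes

    def _preserve(self, grain: int) -> None:
        """Copy a grain to the target the first time it is about to change."""
        if not self.copied[grain]:
            start = grain * self.grain_size
            self.target[grain] = bytes(self.source[start:start + self.grain_size])
            self.copied[grain] = True

    def write_source(self, offset: int, data: bytes) -> None:
        """Host write to the source volume: preserve affected grains first."""
        if offset < 0 or offset + len(data) > len(self.source):
            raise ValueError("write outside volume")
        if not data:
            return
        first = offset // self.grain_size
        last = (offset + len(data) - 1) // self.grain_size
        for grain in range(first, last + 1):
            self._preserve(grain)
        self.source[offset:offset + len(data)] = data

    def read_snapshot(self, offset: int, length: int) -> bytes:
        """Read the point-in-time image: copied grains come from the target."""
        out = bytearray()
        for pos in range(offset, offset + length):
            grain, within = divmod(pos, self.grain_size)
            if self.copied[grain]:
                out.append(self.target[grain][within])
            else:
                out.append(self.source[pos])
        return bytes(out)


volume = bytearray(b"AAAABBBBCCCC")
snap = Snapshot(volume, grain_size=4)
snap.write_source(5, b"zz")          # touches only the second grain
print(bytes(volume))                 # current data
print(snap.read_snapshot(0, 12))     # data as of snapshot creation
```

Only grains that have been overwritten consume target capacity, which corresponds to the "delta" capacity noted above. In this model, a smaller grain copies less data per write at the cost of a larger bitmap.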

Other products running SVC code

On 7 October 2010, IBM announced the IBM Storwize V7000, the first member of the Storwize family. [8] Storwize uses the SAN Volume Controller code base with internal storage to provide a mid-price storage subsystem. [9] The IBM Storwize V5000, V3700 and V3500 are smaller, compatible models with less cache, fewer CPUs and adapters, and a reduced set of features.

The IBM FlashSystem V9000 leverages the SVC firmware integrated with IBM FlashSystem solid-state drawers.

In 2015, IBM re-badged the virtualization functionality as Spectrum Virtualize, in order to align it with the IBM software-defined storage naming conventions and to highlight the interoperability aspect.

Non-IBM products running SVC code

The Actifio Protection and Availability Storage (PAS) appliance includes elements of SVC code to achieve wide interoperability. [10] The PAS platform spans backup, disaster recovery, and business continuity among other functions.

See also

Footnotes

  1. "Cache hit" or "bandwidth" performance numbers are usually much higher, e.g. "20 GBPS", but are relatively meaningless as they cannot be achieved in real-world scenarios.

References

  1. "IBM System Storage SAN Volume Controller", IBM Redbook SG24-6423-05, p. 12.
  2. "SVC Rel 4.3 SPC results". Archived from the original on 2007-02-06. Retrieved 2007-02-13.
  3. "SVC Rel 6.2 SPC results" (PDF). Archived from the original (PDF) on 2012-11-19. Retrieved 2012-01-31.
  4. Implementing FlashSystem 840 with SAN Volume Controller | IBM Redbooks. 30 September 2016.
  5. "IBM News room - 2010-07-29 IBM Acquires Storage Company Storwize for Data Compression Capabilities - United States". 03.ibm.com. 2010-07-29. Retrieved 2012-11-07.
  6. "DS8000 Information Center". Publib.boulder.ibm.com. Retrieved 2012-11-07.
  7. "WAN Optimization Products | SANSlide from 4BridgeWorks". Archived from the original on 2013-12-09.
  8. "IBM Storwize V7000 and Storwize V7000 Unified Disk Systems". 03.ibm.com. Retrieved 2012-11-07.
  9. "IBM Storwize V7000 and Storwize V7000 Unified Disk Systems". 03.ibm.com. Retrieved 2012-11-07.
  10. "Actifio, IBM Partner On Virtualized Storage, Target MSPs". www.mspmentor.net. Archived from the original on 2012-11-04. Retrieved 2013-01-10.