In-situ processing

In-situ processing, also known as in-storage processing (ISP), is a computer science term that refers to processing data where it resides. In situ means "situated in the original, natural, or existing place or position." An in-situ process operates on data where it is stored, such as in solid-state drives (SSDs) or memory devices like NVDIMM, rather than sending the data to a computer's central processing unit (CPU).

The technology uses processing engines embedded in storage devices to run user applications in place, so data does not need to leave the device to be processed. The technology is not new, but modern SSD architecture, together with the availability of powerful embedded processors, makes running user applications in place more appealing. [1] SSDs deliver higher data throughput than hard disk drives (HDDs). Additionally, in contrast to HDDs, SSDs can handle multiple I/O commands at the same time.

SSDs contain a considerable amount of processing power for managing the flash memory array and providing a high-speed interface to host machines. These processing capabilities can provide an environment for running user applications in place. The term computational storage device (CSD) refers to an SSD that is capable of running user applications in place. In an efficient CSD architecture, the embedded in-storage processing subsystem accesses the data stored in the flash memory array through a low-power, high-speed link. Deploying such CSDs in clusters can increase the overall performance and efficiency of big data and high-performance computing (HPC) applications. [1]
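The difference between moving data to the host and processing it in place can be illustrated with a toy model. The sketch below is not a real device API: `SimulatedCSD`, `read_all`, `run_in_place` and the sample log records are all hypothetical names and data invented for illustration, and the "device" is simply a Python object that counts how many bytes would cross the host interface.

```python
from dataclasses import dataclass


@dataclass
class SimulatedCSD:
    """Hypothetical stand-in for a computational storage device."""
    records: list[bytes]
    bytes_sent_to_host: int = 0

    def read_all(self) -> list[bytes]:
        """Conventional path: every record crosses the host interface."""
        self.bytes_sent_to_host += sum(len(r) for r in self.records)
        return list(self.records)

    def run_in_place(self, predicate) -> int:
        """In-situ path: the count is computed 'inside' the device and only
        the result (modelled here as an 8-byte integer) goes to the host."""
        count = sum(1 for r in self.records if predicate(r))
        self.bytes_sent_to_host += 8
        return count


def contains_error(record: bytes) -> bool:
    """Example user application: flag log records that mention an error."""
    return b"ERROR" in record


# Made-up sample data: 4,000 short log records.
records = [b"INFO boot ok", b"ERROR disk 3", b"INFO idle", b"ERROR fan"] * 1000

# Host-side processing: read everything, then filter on the host CPU.
host_side = SimulatedCSD(records)
host_count = sum(1 for r in host_side.read_all() if contains_error(r))

# In-situ processing: ship the predicate to the device, receive only the count.
in_situ = SimulatedCSD(records)
device_count = in_situ.run_in_place(contains_error)

print("host-side:", host_count, "matches,", host_side.bytes_sent_to_host, "bytes moved")
print("in-situ:  ", device_count, "matches,", in_situ.bytes_sent_to_host, "bytes moved")
```

Both paths return the same answer; only the volume of data crossing the simulated host interface differs. A real CSD would additionally have to account for the speed of its embedded processor, which this model ignores.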

Reducing data transfer bottlenecks

Webscale data center designers have been trying to develop storage architectures that favor high-capacity hosts. The following figure (from [1]) shows such a storage system, in which 64 SSDs are attached to a host. For simplicity, the details of only one SSD are shown. Modern SSDs usually contain 16 or more flash memory channels, which can be used concurrently for I/O operations on the flash memory array. Assuming 512 MB/s of bandwidth per channel, the internal bandwidth of an SSD with 16 flash memory channels is about 8 GB/s. This bandwidth drops to about 1 GB/s at the host interface because of the complexity of the host interface software and hardware architecture. In other words, the accumulated bandwidth of all internal channels of the 64 SSDs is the product of the number of SSDs, the number of channels per SSD, and the bandwidth of each channel: 64 × 16 × 512 MB/s = 512 GB/s. The accumulated bandwidth of the SSDs' external interfaces, by contrast, is 64 × 1 GB/s (the host interface bandwidth of each SSD) = 64 GB/s. However, to communicate with the host, all SSDs must be connected to a PCIe switch. Hence, the bandwidth available to the host is limited to 32 GB/s.

Modern SSD architecture and the data transfer bottleneck in storage servers

Overall, there is a 16-fold gap between the accumulated internal bandwidth of all SSDs (512 GB/s) and the bandwidth available to the host (32 GB/s). In other words, to read 32 TB of data the host needs roughly 17 minutes, while the internal components of the SSDs can read the same amount of data in about 1 minute. Additionally, in such storage systems, data must continuously move through the complex hardware and software stack between hosts and storage units, which consumes a considerable amount of energy and dramatically decreases the energy efficiency of large data centers. Hence, storage architects need techniques that reduce data movement, and ISP technology has been introduced to address these challenges by moving the processing to the data.
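The bandwidth figures in this section can be reproduced with a few lines of arithmetic. The sketch below uses only the numbers quoted in the text (64 SSDs, 16 channels, 512 MB/s per channel, 1 GB/s per host interface, 32 GB/s at the PCIe switch, a 32 TB dataset). The variable names are illustrative, and the use of binary units (1 GB = 1024 MB) is an assumption chosen because it matches the text's "16 × 512 MB/s ≈ 8 GB/s".

```python
# Inputs quoted in the text.
NUM_SSDS = 64
CHANNELS_PER_SSD = 16
CHANNEL_MB_PER_S = 512
HOST_IF_GB_PER_S_PER_SSD = 1
PCIE_SWITCH_GB_PER_S = 32
DATASET_TB = 32

# Assumption: binary units, so that 16 x 512 MB/s is exactly 8 GB/s.
MB_PER_GB = 1024
GB_PER_TB = 1024

# Internal bandwidth of one SSD and of all SSDs together.
per_ssd_internal = CHANNELS_PER_SSD * CHANNEL_MB_PER_S / MB_PER_GB
total_internal = NUM_SSDS * per_ssd_internal

# External bandwidth: sum of host interfaces, capped by the PCIe switch.
total_external = NUM_SSDS * HOST_IF_GB_PER_S_PER_SSD
host_available = min(total_external, PCIE_SWITCH_GB_PER_S)

# Ratio between what the flash can deliver and what the host can receive.
gap = total_internal / host_available

# Time to read the whole dataset through each path, in minutes.
dataset_gb = DATASET_TB * GB_PER_TB
host_read_minutes = dataset_gb / host_available / 60
internal_read_minutes = dataset_gb / total_internal / 60

print(f"internal bandwidth per SSD : {per_ssd_internal:.0f} GB/s")
print(f"accumulated internal       : {total_internal:.0f} GB/s")
print(f"accumulated host interfaces: {total_external} GB/s")
print(f"available to host          : {host_available} GB/s")
print(f"gap                        : {gap:.0f}x")
print(f"host read time             : {host_read_minutes:.1f} min")
print(f"internal read time         : {internal_read_minutes:.1f} min")
```

Under these assumptions the host-side read takes a little over 1,000 seconds and the internal read a little over 60 seconds, which is where the 16-fold gap comes from. With decimal units the individual times shift slightly but the ratio is unchanged.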

Efficiency and utilization

Computational storage technology minimizes data movement in a cluster and also increases the cluster's processing power by adding power-efficient processing engines to the whole system. The technology can potentially be applied to both HDDs and SSDs; however, modern SSD architecture provides better tools for developing it. SSDs that can run user applications in place are called computational storage devices (CSDs). These storage units are augmentable processing resources: they are not designed to replace the high-end processors of modern servers. Instead, they collaborate with the host's CPU and add their power-efficient processing capacity to the system. The article "Computational storage: an efficient and scalable platform for big data and HPC applications", [1] published in the Journal of Big Data under an open access policy, shows the benefits of using CSDs in clusters.

Examples of in-situ processing can be found in fields such as visualization, [2] biology [3] and chemistry. In these cases, processing data in place produces results more efficiently than moving the data first, whatever kind of data is involved. The following figures (from [1]) show how CSDs can be used in an Apache Hadoop cluster and in a Message Passing Interface (MPI)-based distributed environment.

CSD-equipped Hadoop cluster
CSD-equipped MPI-based cluster

Industry

In the storage industry, implementations are now available from several companies, including NGD Systems, [4] ScaleFlux [5] and Eideticom. [6] Other companies have attempted similar work in the past, including Micron Technology [7] and Samsung. All of these approaches head in the same direction: managing or processing data where it resides.

NGD Systems was the first company to create in-situ processing storage and has produced two versions of the device since 2017. The Catalina-1 was a standalone SSD that offered 24 TB of flash along with processing. [4] A second product, Newport, was released in 2018 and offered up to 32 TB of flash memory. [8] [9]

ScaleFlux offers the CSS-1000, an NVMe device that relies on host resources and kernel changes to address the device and to manage up to 6.4 TB of flash on the device, or base SSD. [10] Eideticom offers a device called No-Load, a DRAM-only NVMe accelerator with no flash storage for persistent data. [11] Micron presented its version, called 'Scale In', at the Flash Memory Summit (FMS) in 2013; it was based on a production SATA SSD, but Micron was never able to productize it. [7] Samsung has worked on various devices, among them KV Store. [12]

References

  1. Torabzadehkashi, Mahdi; Rezaei, Siavash; HeydariGorji, Ali; Bobarshad, Hossein; Alves, Vladimir; Bagherzadeh, Nader (15 November 2019). "Computational storage: an efficient and scalable platform for big data and HPC applications". Journal of Big Data. 6 (100). doi:10.1186/s40537-019-0265-5.
  2. Raffin, Bruno (December 2014). "In-Situ_2014" (PDF).
  3. "In situ Structural Biology". Utrecht University. 2016-03-17. Retrieved 2018-06-04.
  4. "Computational storage takes spotlight in new NGD Systems SSD". Tech Target. 2020-02-13. Retrieved 2019-03-07.
  5. "What if I told you that flash drives could do their own processing?". The Register. 2020-02-13. Retrieved 2018-02-13.
  6. "IDC Innovators: Computational Storage, 2019". IDC. 2020-02-13. Retrieved 2019-08-01.
  7. Doller, Ed (14 August 2013). "Micron Scale In Keynote - 2013 FMS" (PDF). www.FlashMemorySummit.com.
  8. "NGD Systems Releases First 16TB NVMe Computational U.2 SSD". Storage Review. 2020-02-13. Retrieved 2018-10-31.
  9. "$20M for Upstart Storage Device Firm NGD". Orange County Business Journal. 2020-02-13. Retrieved 2020-02-10.
  10. "Data-Driven Computational Storage Server Solution (Compute and Storage Acceleration Solution) : Inspur". xeonscalable.inspursystems.com. Retrieved 2018-06-04.
  11. "Modern Storage Technologies in 2020: What You Need to Know". Bigstep. 2020-02-13. Retrieved 2020-01-10.
  12. Do, Jaeyoung; Kee, Yang-Suk; Patel, Jignesh M.; Park, Chanik; Park, Kwanghyun; DeWitt, David J. (2013-06-22). "Query processing on smart SSDs". Query processing on smart SSDs: opportunities and challenges. ACM. pp. 1221–1230. doi:10.1145/2463676.2465295. ISBN 9781450320375. S2CID 12496095.