Memory virtualization

In computer science, memory virtualization decouples volatile random access memory (RAM) resources from individual systems in the data centre, and then aggregates those resources into a virtualized memory pool available to any computer in the cluster. The memory pool is accessed by the operating system or by applications running on top of the operating system. The distributed memory pool can then be utilized as a high-speed cache, a messaging layer, or a large, shared memory resource for CPU or GPU applications.

Description

Memory virtualization allows networked, and therefore distributed, servers to share a pool of memory to overcome physical memory limitations, a common bottleneck in software performance. With this capability integrated into the network, applications can take advantage of a very large amount of memory to improve overall performance and system utilization, increase memory usage efficiency, and enable new use cases. Software on the memory pool nodes (servers) allows each node to connect to the memory pool, contribute memory, and store and retrieve data. Management software, together with memory overcommitment techniques, manages the shared memory; the insertion, eviction, and provisioning policies; and the assignment of data to contributing nodes, and handles requests from client nodes.

The memory pool may be accessed at the application level or at the operating system level. At the application level, the pool is accessed through an API or as a networked file system to create a high-speed shared memory cache. At the operating system level, a page cache can utilize the pool as a very large memory resource that is much faster than local or networked storage.
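The eviction policies mentioned above determine which data is removed from the pool when capacity runs low. The following is a minimal sketch, assuming a simple least-recently-used (LRU) policy: entries sit in a doubly linked list ordered by recency of access, and the manager evicts from the tail. The structures and function names (pool_entry, pool_touch, pool_evict_lru) are illustrative, not drawn from any particular product.

```c
#include <stdlib.h>

/* One cached object in the pool, linked into an LRU list:
 * the head is the most recently used entry, the tail the least. */
struct pool_entry {
    char key[64];
    void *data;
    size_t size;
    struct pool_entry *prev, *next;
};

struct pool {
    struct pool_entry *head, *tail;
    size_t used, capacity;
};

/* Move an entry to the head of the list on every access,
 * so the tail always holds the least recently used entry. */
static void pool_touch(struct pool *p, struct pool_entry *e) {
    if (p->head == e) return;
    /* unlink e from its current position */
    if (e->prev) e->prev->next = e->next;
    if (e->next) e->next->prev = e->prev;
    if (p->tail == e) p->tail = e->prev;
    /* relink e at the head */
    e->prev = NULL;
    e->next = p->head;
    if (p->head) p->head->prev = e;
    p->head = e;
    if (!p->tail) p->tail = e;
}

/* Evict entries from the tail until the requested number of bytes fits. */
static void pool_evict_lru(struct pool *p, size_t needed) {
    while (p->tail && p->capacity - p->used < needed) {
        struct pool_entry *victim = p->tail;
        p->tail = victim->prev;
        if (p->tail) p->tail->next = NULL;
        else p->head = NULL;
        p->used -= victim->size;
        free(victim->data);
        free(victim);
    }
}
```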

Memory virtualization implementations are distinguished from shared memory systems: shared memory systems do not permit the abstraction of memory resources, and thus require implementation within a single operating system instance (i.e. not within a clustered application environment).

Memory virtualization is also different from storage based on flash memory, such as solid-state drives (SSDs): SSDs and similar technologies replace hard drives (networked or otherwise), while memory virtualization replaces or complements traditional RAM.

Implementations

Application level integration

In this case, applications running on the cluster's computers connect to the memory pool directly, through an API or the file system.

Cluster implementing memory virtualization at the application level. Contributors 1...n contribute memory to the pool. Applications read and write data to the pool using Java or C APIs, or a file system API.
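As a sketch of what application-level access could look like, the C fragment below drives a hypothetical client library; mempool_connect, mempool_put, mempool_get, and the address pool.example.local:7000 are invented for illustration and do not reflect any specific product's API.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical client API for a networked memory pool; these
 * declarations stand in for whatever header a real product ships. */
typedef struct mempool mempool_t;
mempool_t *mempool_connect(const char *cluster_address);
int  mempool_put(mempool_t *pool, const char *key, const void *buf, size_t len);
long mempool_get(mempool_t *pool, const char *key, void *buf, size_t maxlen);
void mempool_disconnect(mempool_t *pool);

int main(void) {
    /* Join the pool; contributors elsewhere in the cluster supply the RAM. */
    mempool_t *pool = mempool_connect("pool.example.local:7000");
    if (!pool) return 1;

    /* Store an object under a key, then read it back. */
    const char *value = "session-state";
    mempool_put(pool, "user:42", value, strlen(value) + 1);

    char buf[128];
    if (mempool_get(pool, "user:42", buf, sizeof buf) > 0)
        printf("read back: %s\n", buf);

    mempool_disconnect(pool);
    return 0;
}
```

A production library would also need to address error handling, timeouts, and consistency options; the sketch shows only the key-value shape of the interface.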

Operating system level integration

In this case, the operating system connects to the memory pool, and makes pooled memory available to applications.

Cluster implementing memory virtualization at the operating system level. Contributors 1...n contribute memory to the pool. The operating system connects to the memory pool through the page cache system. Applications consume pooled memory via the operating system.
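From the application's perspective this integration is transparent: ordinary file I/O goes through the page cache, which in turn may be backed by pooled memory. The POSIX fragment below is standard mmap code; only the mount point /mnt/mempool is assumed, standing in for a file system backed by the memory pool.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void) {
    /* Open a file on a (hypothetical) pool-backed mount. Nothing in this
     * code is specific to memory virtualization -- that is the point:
     * the operating system serves the pages, wherever they live. */
    int fd = open("/mnt/mempool/data", O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }

    /* Map the file; reads fault pages in through the page cache. */
    char *p = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    /* Touch the first byte: on a pool-backed file system this may be
     * satisfied from remote pooled RAM rather than local disk. */
    printf("first byte: %d\n", p[0]);

    munmap(p, st.st_size);
    close(fd);
    return 0;
}
```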

Background

Memory virtualization technology follows from memory management architectures and virtual memory techniques. In both fields, the path of innovation has moved from tightly coupled relationships between logical and physical resources to more flexible, abstracted relationships where physical resources are allocated as needed.

Virtual memory systems introduce a layer of abstraction between physical RAM and virtual addresses, assigning virtual memory addresses both to physical RAM and to disk-based storage; this expands addressable memory, but at the cost of speed. NUMA and SMP architectures optimize memory allocation within multi-processor systems. While these technologies dynamically manage memory within individual computers, memory virtualization manages the aggregated memory of multiple networked computers as a single memory pool.
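To make the virtual-to-physical abstraction concrete, the toy C function below models a single-level page table; a real MMU uses multi-level tables, hardware translation lookaside buffers, and permission bits, so this is an illustration of the mapping only.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12                   /* 4 KiB pages */
#define PAGE_SIZE  (1u << PAGE_SHIFT)
#define NUM_PAGES  1024                 /* toy 4 MiB virtual address space */

/* One page-table entry: a physical frame number plus a present bit.
 * When present is false, an access faults and the OS can fetch the
 * page from disk -- or, with memory virtualization, from the pool. */
struct pte { uint32_t frame; bool present; };

static struct pte page_table[NUM_PAGES];

/* Translate a virtual address to a physical one, or return false
 * to signal a page fault. */
static bool translate(uint32_t vaddr, uint32_t *paddr) {
    uint32_t vpn = vaddr >> PAGE_SHIFT;     /* virtual page number */
    uint32_t off = vaddr & (PAGE_SIZE - 1); /* offset within page  */
    if (vpn >= NUM_PAGES || !page_table[vpn].present)
        return false;                       /* page fault */
    *paddr = (page_table[vpn].frame << PAGE_SHIFT) | off;
    return true;
}
```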

In tandem with memory management innovations, a number of virtualization techniques have arisen to make the best use of available hardware resources. Application virtualization was first demonstrated on mainframe systems. The next wave was storage virtualization, as servers connected to storage systems such as NAS or SAN in addition to, or instead of, on-board hard disk drives. Server virtualization, or full virtualization, partitions a single physical server into multiple virtual machines, consolidating multiple instances of operating systems onto the same machine for efficiency and flexibility. In both storage and server virtualization, applications are unaware that the resources they use are virtual rather than physical, so efficiency and flexibility are achieved without application changes. In the same way, memory virtualization allocates the memory of an entire networked cluster of servers among the computers in that cluster.

Related Research Articles

In computer science, distributed shared memory (DSM) is a form of memory architecture where physically separated memories can be addressed as a single shared address space. The term "shared" does not mean that there is a single centralized memory, but that the address space is shared—i.e., the same physical address on two processors refers to the same location in memory. Distributed global address space (DGAS) is a similar term for a wide class of software and hardware implementations, in which each node of a cluster has access to shared memory in addition to each node's private memory.

NetApp, Inc. is an American hybrid cloud data services and data management company headquartered in San Jose, California. It ranked in the Fortune 500 from 2012 to 2021. Founded in 1992, with an IPO in 1995, NetApp offers cloud data services for the management of applications and data both online and physically.

MySQL Cluster is a technology providing shared-nothing clustering and auto-sharding for the MySQL database management system. It is designed to provide high availability and high throughput with low latency, while allowing for near linear scalability. MySQL Cluster is implemented through the NDB or NDBCLUSTER storage engine for MySQL.

A diskless node is a workstation or personal computer without disk drives, which employs network booting to load its operating system from a server.

A virtual storage area network (VSAN) is a logical representation of a physical storage area network (SAN). A VSAN abstracts storage-related operations from the physical storage layer and provides shared storage access to applications and virtual machines by combining the servers' local storage over a network into one or more storage pools.

High-availability clusters are groups of computers that support server applications that can be reliably utilized with a minimum of downtime. They operate by using high-availability software to harness redundant computers in groups or clusters that provide continued service when system components fail. Without clustering, if a server running a particular application crashes, the application is unavailable until the crashed server is fixed. HA clustering remedies this situation by detecting hardware/software faults and immediately restarting the application on another system without requiring administrative intervention, a process known as failover. As part of this process, clustering software may configure the node before starting the application on it. For example, appropriate file systems may need to be imported and mounted, network hardware may have to be configured, and some supporting applications may need to be running as well.

The IBM SAN Volume Controller (SVC) is a block storage virtualization appliance that belongs to the IBM System Storage product family. SVC implements an indirection, or "virtualization", layer in a Fibre Channel storage area network (SAN).

In database computing, Oracle Real Application Clusters (RAC) — an option for the Oracle Database software produced by Oracle Corporation and introduced in 2001 with Oracle9i — provides software for clustering and high availability in Oracle database environments. Oracle Corporation includes RAC with the Enterprise Edition, provided the nodes are clustered using Oracle Clusterware.

A clustered file system is a file system which is shared by being simultaneously mounted on multiple servers. There are several approaches to clustering, most of which do not employ a clustered file system. Clustered file systems can provide features like location-independent addressing and redundancy which improve reliability or reduce the complexity of the other parts of the cluster. Parallel file systems are a type of clustered file system that spread data across multiple storage nodes, usually for redundancy or performance.

In computing, virtualization or virtualisation is the act of creating a virtual version of something at the same abstraction level, including virtual computer hardware platforms, storage devices, and computer network resources.

A computer cluster is a set of computers that work together so that they can be viewed as a single system. Unlike grid computers, computer clusters have each node set to perform the same task, controlled and scheduled by software.

Exalogic is a computer appliance made by Oracle Corporation, commercially available since 2010. It is a cluster of x86-64 servers running preinstalled Oracle Linux or Solaris.

Converged storage is a storage architecture that combines storage and computing resources into a single entity. This can result in platforms for server-centric, storage-centric, or hybrid workloads, where applications and data come together to improve application performance and delivery. The combination of storage and compute differs from the traditional IT model, in which computation and storage take place in separate or siloed computer equipment. The traditional model requires discrete provisioning changes, such as upgrades and planned migrations, in response to server load changes, which are increasingly dynamic with virtualization; converged storage instead scales the supply of resources in parallel with new VM demands.

Oracle NoSQL Database is a NoSQL-type distributed key-value database from Oracle Corporation. It provides transactional semantics for data manipulation, horizontal scalability, and simple administration and monitoring.

In an enterprise server, a Caching SAN Adapter is a host bus adapter (HBA) for storage area network (SAN) connectivity which accelerates performance by transparently storing duplicate data such that future requests for that data can be serviced faster compared to retrieving the data from the source. A caching SAN adapter is used to accelerate the performance of applications across multiple clustered or virtualized servers and uses DRAM, NAND Flash or other memory technologies as the cache. The key requirement for the memory technology is that it is faster than the media storing the original copy of the data to ensure performance acceleration is achieved.

In computer science, shared memory is memory that may be simultaneously accessed by multiple programs with an intent to provide communication among them or avoid redundant copies. Shared memory is an efficient means of passing data between programs. Depending on context, programs may run on a single processor or on multiple separate processors.
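On POSIX systems, for example, two processes on one machine can share a region of memory through shm_open and mmap. The sketch below creates and writes such a region; the name /demo_region is arbitrary, and a second process mapping the same name would see the same bytes.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void) {
    /* Create (or open) a named shared memory object and size it. */
    int fd = shm_open("/demo_region", O_CREAT | O_RDWR, 0600);
    if (fd < 0) { perror("shm_open"); return 1; }
    if (ftruncate(fd, 4096) < 0) { perror("ftruncate"); return 1; }

    /* Map it; any other process mapping "/demo_region" sees the same bytes. */
    char *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    strcpy(p, "hello from process A");

    munmap(p, 4096);
    close(fd);
    /* shm_unlink("/demo_region") would remove the object when done. */
    return 0;
}
```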

Apache Ignite is a distributed database management system for high-performance computing.

ONTAP (also Data ONTAP, Clustered Data ONTAP (cDOT), or Data ONTAP 7-Mode) is NetApp's proprietary operating system used in storage disk arrays such as NetApp FAS and AFF, ONTAP Select, and Cloud Volumes ONTAP. With the release of version 9.0, NetApp simplified the name by removing the word "Data" and dropping the 7-Mode image; ONTAP 9 is therefore the successor of Clustered Data ONTAP 8.

Database scalability is the ability of a database to handle changing demands by adding/removing resources. Databases use a host of techniques to cope.
