Memory virtualization

In computer science, memory virtualization decouples volatile random access memory (RAM) resources from individual systems in the data centre, and then aggregates those resources into a virtualized memory pool available to any computer in the cluster. The memory pool is accessed by the operating system or by applications running on top of the operating system. The distributed memory pool can then be utilized as a high-speed cache, a messaging layer, or a large, shared memory resource for CPU or GPU applications.

Description

Memory virtualization allows networked, and therefore distributed, servers to share a pool of memory to overcome physical memory limitations, a common bottleneck in software performance. With this capability integrated into the network, applications can take advantage of a very large amount of memory to improve overall performance and system utilization, increase memory usage efficiency, and enable new use cases. Software on the memory pool nodes (servers) allows each node to connect to the memory pool, contribute memory, and store and retrieve data. Management software, together with memory overcommitment techniques, manages the shared memory; the insertion, eviction, and provisioning policies; and the assignment of data to contributing nodes, and handles requests from client nodes.

The memory pool may be accessed at the application level or at the operating system level. At the application level, the pool is accessed through an API or as a networked file system to create a high-speed shared memory cache. At the operating system level, a page cache can utilize the pool as a very large memory resource that is much faster than local or networked storage.
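The eviction policies mentioned above determine which data is removed from the pool when capacity runs low. The following is a minimal sketch, assuming a simple least-recently-used (LRU) policy: entries sit in a doubly linked list ordered by recency of access, and the manager evicts from the tail. The structures and function names (pool_entry, pool_touch, pool_evict_lru) are illustrative, not drawn from any particular product.

```c
#include <stdlib.h>

/* One cached object in the pool, linked into an LRU list:
 * the head is the most recently used entry, the tail the least. */
struct pool_entry {
    char key[64];
    void *data;
    size_t size;
    struct pool_entry *prev, *next;
};

struct pool {
    struct pool_entry *head, *tail;
    size_t used, capacity;
};

/* Move an entry to the head of the list on every access,
 * so the tail always holds the least recently used entry. */
static void pool_touch(struct pool *p, struct pool_entry *e) {
    if (p->head == e) return;
    /* unlink e from its current position */
    if (e->prev) e->prev->next = e->next;
    if (e->next) e->next->prev = e->prev;
    if (p->tail == e) p->tail = e->prev;
    /* relink e at the head */
    e->prev = NULL;
    e->next = p->head;
    if (p->head) p->head->prev = e;
    p->head = e;
    if (!p->tail) p->tail = e;
}

/* Evict entries from the tail until the requested number of bytes fits. */
static void pool_evict_lru(struct pool *p, size_t needed) {
    while (p->tail && p->capacity - p->used < needed) {
        struct pool_entry *victim = p->tail;
        p->tail = victim->prev;
        if (p->tail) p->tail->next = NULL;
        else p->head = NULL;
        p->used -= victim->size;
        free(victim->data);
        free(victim);
    }
}
```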

Memory virtualization implementations are distinguished from shared memory systems: shared memory systems do not permit the abstraction of memory resources, and thus require implementation within a single operating system instance (i.e. not within a clustered application environment).

Memory virtualization is also different from storage based on flash memory, such as solid-state drives (SSDs): SSDs and similar technologies replace hard drives (networked or otherwise), while memory virtualization replaces or complements traditional RAM.

Implementations

Application level integration

In this case, applications running on the cluster's computers connect to the memory pool directly, through an API or the file system.

Cluster implementing memory virtualization at the application level. Contributors 1...n contribute memory to the pool. Applications read and write data to the pool using Java or C APIs, or a file system API.
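As a sketch of what application-level access could look like, the C fragment below drives a hypothetical client library; mempool_connect, mempool_put, mempool_get, and the address pool.example.local:7000 are invented for illustration and do not reflect any specific product's API.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical client API for a networked memory pool; these
 * declarations stand in for whatever header a real product ships. */
typedef struct mempool mempool_t;
mempool_t *mempool_connect(const char *cluster_address);
int  mempool_put(mempool_t *pool, const char *key, const void *buf, size_t len);
long mempool_get(mempool_t *pool, const char *key, void *buf, size_t maxlen);
void mempool_disconnect(mempool_t *pool);

int main(void) {
    /* Join the pool; contributors elsewhere in the cluster supply the RAM. */
    mempool_t *pool = mempool_connect("pool.example.local:7000");
    if (!pool) return 1;

    /* Store an object under a key, then read it back. */
    const char *value = "session-state";
    mempool_put(pool, "user:42", value, strlen(value) + 1);

    char buf[128];
    if (mempool_get(pool, "user:42", buf, sizeof buf) > 0)
        printf("read back: %s\n", buf);

    mempool_disconnect(pool);
    return 0;
}
```

A production library would also need to address error handling, timeouts, and consistency options; the sketch shows only the key-value shape of the interface.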

Operating system level integration

In this case, the operating system connects to the memory pool, and makes pooled memory available to applications.

Cluster implementing memory virtualization at the operating system level. Contributors 1...n contribute memory to the pool. The operating system connects to the memory pool through the page cache system. Applications consume pooled memory via the operating system.
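From the application's perspective this integration is transparent: ordinary file I/O goes through the page cache, which in turn may be backed by pooled memory. The POSIX fragment below is standard mmap code; only the mount point /mnt/mempool is assumed, standing in for a file system backed by the memory pool.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void) {
    /* Open a file on a (hypothetical) pool-backed mount. Nothing in this
     * code is specific to memory virtualization -- that is the point:
     * the operating system serves the pages, wherever they live. */
    int fd = open("/mnt/mempool/data", O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }

    /* Map the file; reads fault pages in through the page cache. */
    char *p = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    /* Touch the first byte: on a pool-backed file system this may be
     * satisfied from remote pooled RAM rather than local disk. */
    printf("first byte: %d\n", p[0]);

    munmap(p, st.st_size);
    close(fd);
    return 0;
}
```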

Background

Memory virtualization technology follows from memory management architectures and virtual memory techniques. In both fields, the path of innovation has moved from tightly coupled relationships between logical and physical resources to more flexible, abstracted relationships where physical resources are allocated as needed.

Virtual memory systems introduce a layer of abstraction between physical RAM and virtual addresses, assigning virtual memory addresses both to physical RAM and to disk-based storage; this expands addressable memory, but at the cost of speed. NUMA and SMP architectures optimize memory allocation within multi-processor systems. While these technologies dynamically manage memory within individual computers, memory virtualization manages the aggregated memory of multiple networked computers as a single memory pool.
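To make the virtual-to-physical abstraction concrete, the toy C function below models a single-level page table; a real MMU uses multi-level tables, hardware translation lookaside buffers, and permission bits, so this is an illustration of the mapping only.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12                   /* 4 KiB pages */
#define PAGE_SIZE  (1u << PAGE_SHIFT)
#define NUM_PAGES  1024                 /* toy 4 MiB virtual address space */

/* One page-table entry: a physical frame number plus a present bit.
 * When present is false, an access faults and the OS can fetch the
 * page from disk -- or, with memory virtualization, from the pool. */
struct pte { uint32_t frame; bool present; };

static struct pte page_table[NUM_PAGES];

/* Translate a virtual address to a physical one, or return false
 * to signal a page fault. */
static bool translate(uint32_t vaddr, uint32_t *paddr) {
    uint32_t vpn = vaddr >> PAGE_SHIFT;     /* virtual page number */
    uint32_t off = vaddr & (PAGE_SIZE - 1); /* offset within page  */
    if (vpn >= NUM_PAGES || !page_table[vpn].present)
        return false;                       /* page fault */
    *paddr = (page_table[vpn].frame << PAGE_SHIFT) | off;
    return true;
}
```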

In tandem with memory management innovations, a number of virtualization techniques have arisen to make the best use of available hardware resources. Application virtualization was first demonstrated on mainframe systems. The next wave was storage virtualization, as servers connected to storage systems such as NAS or SAN in addition to, or instead of, on-board hard disk drives. Server virtualization, or full virtualization, partitions a single physical server into multiple virtual machines, consolidating multiple instances of operating systems onto the same machine for efficiency and flexibility. In both storage and server virtualization, applications are unaware that the resources they use are virtual rather than physical, so efficiency and flexibility are achieved without application changes. In the same way, memory virtualization allocates the memory of an entire networked cluster of servers among the computers in that cluster.

Related Research Articles

In computer science, distributed shared memory (DSM) is a form of memory architecture where physically separated memories can be addressed as a single shared address space. The term "shared" does not mean that there is a single centralized memory, but that the address space is shared—i.e., the same physical address on two processors refers to the same location in memory. Distributed global address space (DGAS) is a similar term for a wide class of software and hardware implementations, in which each node of a cluster has access to shared memory in addition to each node's private memory.

NetApp, Inc. is an American hybrid cloud data services and data management company headquartered in San Jose, California. It ranked in the Fortune 500 from 2012 to 2021. Founded in 1992, with an IPO in 1995, NetApp offers cloud data services for the management of applications and data both online and physically.

MySQL Cluster is a technology providing shared-nothing clustering and auto-sharding for the MySQL database management system. It is designed to provide high availability and high throughput with low latency, while allowing for near linear scalability. MySQL Cluster is implemented through the NDB or NDBCLUSTER storage engine for MySQL.

A diskless node is a workstation or personal computer without disk drives, which employs network booting to load its operating system from a server.

A virtual storage area network (VSAN) is a logical representation of a physical storage area network (SAN). A VSAN abstracts storage-related operations from the physical storage layer and provides shared storage access to applications and virtual machines by combining the servers' local storage over a network into one or more storage pools.

High-availability clusters are groups of computers that support server applications that can be reliably utilized with a minimum of downtime. They operate by using high-availability software to harness redundant computers in groups or clusters that provide continued service when system components fail. Without clustering, if a server running a particular application crashes, the application is unavailable until the crashed server is fixed. HA clustering remedies this situation by detecting hardware/software faults and immediately restarting the application on another system without requiring administrative intervention, a process known as failover. As part of this process, clustering software may configure the node before starting the application on it. For example, appropriate file systems may need to be imported and mounted, network hardware may have to be configured, and some supporting applications may need to be running as well.

The IBM SAN Volume Controller (SVC) is a block storage virtualization appliance that belongs to the IBM System Storage product family. SVC implements an indirection, or "virtualization", layer in a Fibre Channel storage area network (SAN).

In database computing, Oracle Real Application Clusters (RAC) — an option for the Oracle Database software produced by Oracle Corporation and introduced in 2001 with Oracle9i — provides software for clustering and high availability in Oracle database environments. Oracle Corporation includes RAC with the Enterprise Edition, provided the nodes are clustered using Oracle Clusterware.

A clustered file system is a file system which is shared by being simultaneously mounted on multiple servers. There are several approaches to clustering, most of which do not employ a clustered file system. Clustered file systems can provide features like location-independent addressing and redundancy which improve reliability or reduce the complexity of the other parts of the cluster. Parallel file systems are a type of clustered file system that spread data across multiple storage nodes, usually for redundancy or performance.

In computing, virtualization or virtualisation is the act of creating a virtual version of something at the same abstraction level, including virtual computer hardware platforms, storage devices, and computer network resources.

A computer cluster is a set of computers that work together so that they can be viewed as a single system. Unlike grid computers, computer clusters have each node set to perform the same task, controlled and scheduled by software.

Exalogic is a computer appliance made by Oracle Corporation, commercially available since 2010. It is a cluster of x86-64 servers running preinstalled Oracle Linux or Solaris.

Converged storage is a storage architecture that combines storage and computing resources into a single entity. This can result in platforms for server-centric, storage-centric, or hybrid workloads, where applications and data come together to improve application performance and delivery. The combination of storage and compute differs from the traditional IT model, in which computation and storage take place in separate or siloed computer equipment. The traditional model requires discrete provisioning changes, such as upgrades and planned migrations, in response to server load changes, which are increasingly dynamic with virtualization; converged storage instead scales the supply of resources in parallel with new VM demands.

Oracle NoSQL Database is a NoSQL-type distributed key-value database from Oracle Corporation. It provides transactional semantics for data manipulation, horizontal scalability, and simple administration and monitoring.

In an enterprise server, a Caching SAN Adapter is a host bus adapter (HBA) for storage area network (SAN) connectivity which accelerates performance by transparently storing duplicate data such that future requests for that data can be serviced faster compared to retrieving the data from the source. A caching SAN adapter is used to accelerate the performance of applications across multiple clustered or virtualized servers and uses DRAM, NAND Flash or other memory technologies as the cache. The key requirement for the memory technology is that it is faster than the media storing the original copy of the data to ensure performance acceleration is achieved.

In computer science, shared memory is memory that may be simultaneously accessed by multiple programs with an intent to provide communication among them or avoid redundant copies. Shared memory is an efficient means of passing data between programs. Depending on context, programs may run on a single processor or on multiple separate processors.
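On POSIX systems, for example, two processes on one machine can share a region of memory through shm_open and mmap. The sketch below creates and writes such a region; the name /demo_region is arbitrary, and a second process mapping the same name would see the same bytes.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void) {
    /* Create (or open) a named shared memory object and size it. */
    int fd = shm_open("/demo_region", O_CREAT | O_RDWR, 0600);
    if (fd < 0) { perror("shm_open"); return 1; }
    if (ftruncate(fd, 4096) < 0) { perror("ftruncate"); return 1; }

    /* Map it; any other process mapping "/demo_region" sees the same bytes. */
    char *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    strcpy(p, "hello from process A");

    munmap(p, 4096);
    close(fd);
    /* shm_unlink("/demo_region") would remove the object when done. */
    return 0;
}
```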

Apache Ignite is a distributed database management system for high-performance computing.

ONTAP (also Data ONTAP, Clustered Data ONTAP (cDOT), or Data ONTAP 7-Mode) is NetApp's proprietary operating system used in storage disk arrays such as NetApp FAS and AFF, ONTAP Select, and Cloud Volumes ONTAP. With the release of version 9.0, NetApp simplified the name by removing the word "Data" and dropping the 7-Mode image; ONTAP 9 is therefore the successor of Clustered Data ONTAP 8.

Database scalability is the ability of a database to handle changing demands by adding/removing resources. Databases use a host of techniques to cope.
