Kernel same-page merging

Last updated

In computing, kernel same-page merging (KSM), also known as kernel shared memory, memory merging, memory deduplication, and page deduplication is a kernel feature that makes it possible for a hypervisor system to share memory pages that have identical contents between multiple processes or virtualized guests. While not directly linked, Kernel-based Virtual Machine (KVM) can use KSM to merge memory pages occupied by virtual machines.

Contents

Deduplication

KSM performs memory deduplication by scanning through main memory for physical pages that have identical content, and identifies the virtual pages that are mapped to those physical pages. It leaves one page unchanged, and re-maps each duplicate page to point to the same physical page, after which it releases the extra physical pages for re-use. It also marks both virtual pages as "copy-on-write" (COW), so that kernel will automatically remap a virtual page back to having its own separate physical page as soon as any process begins to write to it. [1]

KSM was originally intended to run more virtual machines on one host by sharing memory between processes as well as virtual machines. An experimental implementation of KSM by Red Hat found that 52 virtual instances of Windows XP with 1 GB of memory, could run on a host computer that had only 16 GB of RAM. [2] Upon its implementation, users found KSM to also be useful for non-virtualized environments in which memory is at a premium. [3] [4]

KSM was merged into the Linux kernel mainline in version 2.6.32, which was released on December 3, 2009. [3] [2] To be effective, the operating system kernel must find identical memory pages held by different processes. The kernel also needs to predict whether the pages are likely to update infrequently enough that the merging would be an efficient use of processor resources. [4] A concern is that although memory usage is reduced, CPU usage is increased, thus negating potential increases in performance. [1]

Security risks

Security is also a concern:

See also

Related Research Articles

In computing, a virtual machine (VM) is the virtualization or emulation of a computer system. Virtual machines are based on computer architectures and provide the functionality of a physical computer. Their implementations may involve specialized hardware, software, or a combination of the two. Virtual machines differ and are organized by their function, shown here:

In computer operating systems, memory paging is a memory management scheme by which a computer stores and retrieves data from secondary storage for use in main memory. In this scheme, the operating system retrieves data from secondary storage in same-size blocks called pages. Paging is an important part of virtual memory implementations in modern operating systems, using secondary storage to let programs exceed the size of available physical memory.

Copy-on-write (COW), sometimes referred to as implicit sharing or shadowing, is a resource-management technique used in computer programming to efficiently implement a "duplicate" or "copy" operation on modifiable resources.

In computing, Physical Address Extension (PAE), sometimes referred to as Page Address Extension, is a memory management feature for the x86 architecture. PAE was first introduced by Intel in the Pentium Pro, and later by AMD in the Athlon processor. It defines a page table hierarchy of three levels (instead of two), with table entries of 64 bits each instead of 32, allowing these CPUs to directly access a physical address space larger than 4 gigabytes (232 bytes).

Address space layout randomization (ASLR) is a computer security technique involved in preventing exploitation of memory corruption vulnerabilities. In order to prevent an attacker from reliably redirecting code execution to, for example, a particular exploited function in memory, ASLR randomly arranges the address space positions of key data areas of a process, including the base of the executable and the positions of the stack, heap and libraries.

A hypervisor, also known as a virtual machine monitor (VMM) or virtualizer, is a type of computer software, firmware or hardware that creates and runs virtual machines. A computer on which a hypervisor runs one or more virtual machines is called a host machine, and each virtual machine is called a guest machine. The hypervisor presents the guest operating systems with a virtual operating platform and manages the execution of the guest operating systems. Unlike an emulator, the guest executes most instructions on the native hardware. Multiple instances of a variety of operating systems may share the virtualized hardware resources: for example, Linux, Windows, and macOS instances can all run on a single physical x86 machine. This contrasts with operating-system–level virtualization, where all instances must share a single kernel, though the guest operating systems can differ in user space, such as different Linux distributions with the same kernel.

PowerLinux is the combination of a Linux-based operating system (OS) running on PowerPC- or Power ISA-based computers from IBM. It is often used in reference along with Linux on Power, and is also the name of several Linux-only IBM Power Systems.

<span class="mw-page-title-main">Out of memory</span> State of computer operation where no additional memory can be allocated

Out of memory (OOM) is an often undesired state of computer operation where no additional memory can be allocated for use by programs or the operating system. Such a system will be unable to load any additional programs, and since many programs may load additional data into memory during execution, these will cease to function correctly. This usually occurs because all available memory, including disk swap space, has been allocated.

In computer security, executable-space protection marks memory regions as non-executable, such that an attempt to execute machine code in these regions will cause an exception. It makes use of hardware features such as the NX bit, or in some cases software emulation of those features. However, technologies that emulate or supply an NX bit will usually impose a measurable overhead while using a hardware-supplied NX bit imposes no measurable overhead.

<span class="mw-page-title-main">OpenVZ</span> Operating-system level virtualization technology

OpenVZ is an operating-system-level virtualization technology for Linux. It allows a physical server to run multiple isolated operating system instances, called containers, virtual private servers (VPSs), or virtual environments (VEs). OpenVZ is similar to Solaris Containers and LXC.

The following is a timeline of virtualization development. In computing, virtualization is the use of a computer to simulate another computer. Through virtualization, a host simulates a guest by exposing virtual hardware devices, which may be done through software or by allowing access to a physical device connected to the machine.

Address Windowing Extensions (AWE) is a Microsoft Windows application programming interface that allows a 32-bit software application to access more physical memory than it has virtual address space, even in excess of the 4 GB limit. The process of mapping an application's virtual address space to physical memory under AWE is known as "windowing", and is similar to the overlay concept of other environments. AWE is beneficial to certain data-intensive applications, such as database management systems and scientific and engineering software, that need to manipulate very large data sets while minimizing paging.

<span class="mw-page-title-main">Kernel-based Virtual Machine</span> Virtualization module in the Linux kernel

Kernel-based Virtual Machine (KVM) is a free and open-source virtualization module in the Linux kernel that allows the kernel to function as a hypervisor. It was merged into the mainline Linux kernel in version 2.6.20, which was released on February 5, 2007. KVM requires a processor with hardware virtualization extensions, such as Intel VT or AMD-V. KVM has also been ported to other operating systems such as FreeBSD and illumos in the form of loadable kernel modules.

Hardware virtualization is the virtualization of computers as complete hardware platforms, certain logical abstractions of their componentry, or only the functionality required to run various operating systems. Virtualization hides the physical characteristics of a computing platform from the users, presenting instead an abstract computing platform. At its origins, the software that controlled virtualization was called a "control program", but the terms "hypervisor" or "virtual machine monitor" became preferred over time.

In computing, data deduplication is a technique for eliminating duplicate copies of repeating data. Successful implementation of the technique can improve storage utilization, which may in turn lower capital expenditure by reducing the overall amount of storage media required to meet storage capacity needs. It can also be applied to network data transfers to reduce the number of bytes that must be sent.

cgroups is a Linux kernel feature that limits, accounts for, and isolates the resource usage of a collection of processes.

Second Level Address Translation (SLAT), also known as nested paging, is a hardware-assisted virtualization technology which makes it possible to avoid the overhead associated with software-managed shadow page tables.

Linux on IBM Z or Linux on zSystems is the collective term for the Linux operating system compiled to run on IBM mainframes, especially IBM Z / IBM zSystems and IBM LinuxONE servers. Similar terms which imply the same meaning are Linux/390, Linux/390x, etc. The three Linux distributions certified for usage on the IBM Z hardware platform are Red Hat Enterprise Linux, SUSE Linux Enterprise Server, and Ubuntu.

In computing, a system virtual machine is a virtual machine (VM) that provides a complete system platform and supports the execution of a complete operating system (OS). These usually emulate an existing architecture, and are built with the purpose of either providing a platform to run programs where the real hardware is not available for use, or of having multiple instances of virtual machines leading to more efficient use of computing resources, both in terms of energy consumption and cost effectiveness, or both. A VM was originally defined by Popek and Goldberg as "an efficient, isolated duplicate of a real machine".

<span class="mw-page-title-main">Meltdown (security vulnerability)</span> Microprocessor security vulnerability

Meltdown is one of the two original transient execution CPU vulnerabilities. Meltdown affects Intel x86 microprocessors, IBM POWER processors, and some ARM-based microprocessors. It allows a rogue process to read all memory, even when it is not authorized to do so.

References

  1. 1 2 "KSM tries again". lwn.net. kernel.org. Retrieved August 21, 2010.
  2. 1 2 "Linux kernel 2.6.32, Section 1.3. Kernel Samepage Merging (memory deduplication)". KernelNewbies.org. December 3, 2009. Retrieved August 12, 2015.
  3. 1 2 "Anatomy of Linux Kernel Shared Memory". IBM Developer. IBM. Archived from the original on July 9, 2016. Retrieved August 21, 2010.{{cite web}}: CS1 maint: unfit URL (link)
  4. 1 2 "Increasing memory density by using KSM" (PDF). Kernel.org. Retrieved August 21, 2010.
  5. Kuniyasu Suzaki; Kengo Iijima; Toshiki Yagi; Cyrille Artho. "Memory Deduplication as a Threat to the Guest OS" (PDF). Archived from the original (PDF) on January 5, 2013. Retrieved September 16, 2013.
  6. Barresi, Antonio; Razavi, Kaveh; Payer, Mathias; Gross, Thomas R. (August 2015). "CAIN: Silently Breaking ASLR in the Cloud" (PDF). USENIX . Retrieved August 12, 2015.
  7. Red Hat; Polyakov, Andy. "It's all a question of time - AES timing attacks on OpenSSL". access.redhat.com. Red Hat. Retrieved August 4, 2016.
  8. Schwarzl, Martin; Kraft, Erik; Lipp, Moritz; Gruss, Daniel (2021). "Remote Memory-Deduplication Attacks". Network and Distributed System Security Symposium. Retrieved April 2, 2022.
  9. "New FFS Rowhammer Attack Hijacks Linux VMs". August 14, 2016. Retrieved August 17, 2016.