Zswap

zswap
Developer(s): Seth Jennings and others
Initial release: September 2, 2013 (Linux 3.11)
Written in: C
Operating system: Linux
Type: Linux kernel feature
License: GNU GPL
Website: kernel.org

zswap is a Linux kernel feature that provides a compressed write-back cache for swapped pages, as a form of virtual memory compression. Instead of moving memory pages to a swap device when they are to be swapped out, zswap compresses them and stores them in a memory pool dynamically allocated in system RAM. Writeback to the actual swap device is deferred or even avoided entirely, which significantly reduces I/O for Linux systems that require swapping; the tradeoff is the additional CPU cycles needed to perform the compression. [1] [2] [3]

Because it reduces I/O, zswap benefits devices that use flash-based storage, including embedded devices, netbooks and similar low-end hardware, as well as other devices that use solid-state drives (SSDs) for storage. Flash memory tolerates only a limited number of write cycles, so keeping swap traffic away from it slows its wear. [4]

Internals

zswap is integrated into the rest of the Linux kernel's virtual memory subsystem through the API provided by frontswap, a kernel mechanism that abstracts the various types of storage that can be used as swap space. [5] zswap thus operates as a backend driver for frontswap, providing what is internally visible as a pseudo-RAM device. In other words, the frontswap API lets zswap intercept memory pages while they are being swapped out, and intercept page faults for pages that have already been swapped; access to those two paths is what allows zswap to act as a compressed write-back cache for swapped pages. [1] [6]

Internally, zswap uses compression modules provided by the Linux kernel's crypto API, which makes it possible, for example, to offload compression from the main CPU to any of the hardware compression accelerators supported by the Linux kernel. The compression module is selected at boot time through the kernel boot parameter zswap.compressor; if none is specified, Lempel–Ziv–Oberhumer (LZO) compression is used. As of version 3.13 of the Linux kernel, zswap also needs to be explicitly enabled by setting the kernel boot parameter zswap.enabled to 1. [1] [2] [4]
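As an illustration of the two boot parameters described above, the following Python sketch extracts zswap.* options from a kernel command line string. The function name and the example command line are hypothetical, the defaults are the ones stated in this section (they may differ on newer kernels or distribution builds), and quoted arguments containing spaces are not handled.

```python
def zswap_boot_params(cmdline):
    """Extract zswap.* options from a kernel command line string."""
    # Defaults as described in the text above (assumption: a 3.13-era kernel).
    params = {"enabled": "0", "compressor": "lzo"}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep and key.startswith("zswap."):
            # Strip the "zswap." prefix so keys match the parameter names.
            params[key[len("zswap."):]] = value
    return params


if __name__ == "__main__":
    # Hypothetical command line used only for demonstration.
    example = "root=/dev/sda1 quiet zswap.enabled=1 zswap.compressor=lz4"
    print(zswap_boot_params(example))
```

On a running system, the contents of /proc/cmdline could be passed to the function instead of the example string.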

The maximum size of the memory pool used by zswap is configurable through the sysfs parameter max_pool_percent, which specifies the maximum percentage of total system RAM that the pool can occupy. The memory pool is not preallocated to its configured maximum size; instead, it grows and shrinks as required. When the configured maximum pool size is reached as a result of swapping, or when the pool cannot grow because of an out-of-memory condition, swapped pages are evicted from the memory pool to a swap device on a least recently used (LRU) basis. This approach makes zswap a true swap cache, as the oldest cached pages are evicted to a swap device once the cache is full, making room for newer swapped pages to be compressed and cached. [1] [4] [7]
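The pool behaviour described above can be modelled in a few lines of Python. This is a toy sketch, not the kernel's implementation: the class and method names are hypothetical, zlib stands in for the kernel's compressors, a dictionary stands in for the swap device, and "least recently used" is approximated by the order in which pages were stored.

```python
import zlib
from collections import OrderedDict


class ToyCompressedSwapCache:
    """Toy model: compressed pool with a byte limit and oldest-first write-back."""

    def __init__(self, total_ram_bytes, max_pool_percent):
        # The pool limit is a percentage of total RAM, as with max_pool_percent.
        self.limit = total_ram_bytes * max_pool_percent // 100
        self.pool = OrderedDict()  # swap offset -> compressed bytes, oldest first
        self.used = 0              # bytes currently held in the pool
        self.backing = {}          # stands in for the real swap device

    def store(self, offset, page):
        """Swap-out path: compress the page, evicting old entries if needed."""
        blob = zlib.compress(page)
        if offset in self.pool:    # drop a stale compressed copy
            self.used -= len(self.pool.pop(offset))
        self.backing.pop(offset, None)
        if len(blob) > self.limit:
            # The page can never fit in the pool; send it straight to swap.
            self.backing[offset] = page
            return
        while self.used + len(blob) > self.limit:
            self._evict_oldest()
        self.pool[offset] = blob
        self.used += len(blob)

    def _evict_oldest(self):
        """Write the oldest cached page back to the (simulated) swap device."""
        old_offset, old_blob = self.pool.popitem(last=False)
        self.used -= len(old_blob)
        self.backing[old_offset] = zlib.decompress(old_blob)

    def load(self, offset):
        """Swap-in path: serve from the pool if present, else from swap."""
        if offset in self.pool:
            blob = self.pool.pop(offset)
            self.used -= len(blob)
            return zlib.decompress(blob)
        return self.backing.pop(offset)


if __name__ == "__main__":
    cache = ToyCompressedSwapCache(total_ram_bytes=4096, max_pool_percent=2)
    for i in range(1, 6):
        cache.store(i, bytes([i]) * 4096)
    print("in pool:", list(cache.pool), "written back:", sorted(cache.backing))
```

The pool starts empty and only grows as pages are stored, which mirrors the point that the memory is not preallocated to its configured maximum.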

zbud is a special-purpose memory allocator used internally by zswap for storing compressed pages, implemented as a rewrite of the zbud allocator used by Oracle's zcache, [8] which is another virtual memory compression implementation for the Linux kernel. Internally, zbud works by storing up to two compressed pages ("buddies", hence the allocator name) per physical memory page, which brings both advantages, due to easy coalescing and reuse of freed space, and disadvantages, due to possibly lower memory utilization. However, as a result of its design, zbud cannot allocate more memory space than would originally be occupied by the uncompressed pages. [3] [9]
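The "two buddies per page" idea can be sketched as a simple first-fit packing in Python. This is a simplified illustration under stated assumptions: a 4096-byte page, hypothetical function names, and no per-page header or chunk granularity, all of which a real allocator would have to account for.

```python
PAGE_SIZE = 4096  # assumed physical page size in bytes


def pack_buddies(compressed_sizes, page_size=PAGE_SIZE):
    """Place each compressed page into a host page holding at most two entries.

    Returns a list of host pages, each a list of one or two compressed sizes.
    """
    hosts = []
    for size in compressed_sizes:
        if size > page_size:
            raise ValueError("compressed page larger than a physical page")
        # First fit: find a host page with a single buddy and enough free space.
        for host in hosts:
            if len(host) == 1 and host[0] + size <= page_size:
                host.append(size)
                break
        else:
            # No suitable partner found; start a new host page.
            hosts.append([size])
    return hosts


def utilization(hosts, page_size=PAGE_SIZE):
    """Fraction of the host pages' bytes that hold compressed data."""
    return sum(map(sum, hosts)) / (len(hosts) * page_size)


if __name__ == "__main__":
    hosts = pack_buddies([1000, 3500, 2000, 500, 3000])
    print(hosts, round(utilization(hosts), 3))
```

Because every host page holds at least one compressed page, the number of host pages never exceeds the number of pages stored, which is the bound mentioned above; in the best case it is half that number.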

History

Both zswap and zbud were created by Seth Jennings. The first public announcement was made in December 2012, and development continued until May 2013, by which point the codebase had matured, although it still had the status of an experimental kernel feature. [10] [11]

zswap (together with zbud) was merged into the Linux kernel mainline in kernel version 3.11, which was released on September 2, 2013. [4] [12]

Since version 3.15 of the Linux kernel, which was released on June 8, 2014, zswap properly supports multiple swap devices. [13] [14]

Since version 6.8 of the Linux kernel, which was released on March 11, 2024, zswap supports disabling writeback for specific cgroups. [15]

Alternatives

One alternative to zswap is zram, which provides the Linux kernel with a similar but distinct "swap compressed pages to RAM" mechanism.

The main difference is that zram provides a compressed block device using RAM for storing data, which acts as a regular and separate swap device.

In comparison, zswap acts as a RAM-based cache for swap devices. This gives zswap an eviction mechanism for less-used swapped pages, which zram lacked until the introduction of CONFIG_ZRAM_WRITEBACK in kernel version 4.14. However, as a result of its design, zswap requires at least one existing swap device in order to be used. [16]
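One visible consequence of this difference is that a zram device is listed as a swap device in its own right, while zswap, being a cache in front of swap devices, is not. The Python sketch below parses text in the format of /proc/swaps and separates zram devices from other swap areas. The function names and sample device names are hypothetical, and identifying zram by a /dev/zram prefix is an assumption based on the usual device naming.

```python
def parse_swaps(text):
    """Parse /proc/swaps-style text into a list of dicts (sizes in KiB)."""
    entries = []
    for line in text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 5:
            continue  # ignore blank or malformed lines
        name, kind, size, used, priority = fields[:5]
        entries.append({
            "name": name,
            "type": kind,
            "size_kib": int(size),
            "used_kib": int(used),
            "priority": int(priority),
            # Assumption: zram swap devices are named /dev/zramN.
            "is_zram": name.startswith("/dev/zram"),
        })
    return entries


def has_swap_device(entries):
    """zswap needs at least one active swap device to cache for."""
    return len(entries) > 0


if __name__ == "__main__":
    # Hypothetical sample; on a real system read /proc/swaps instead.
    sample = (
        "Filename\t\t\t\tType\t\tSize\t\tUsed\t\tPriority\n"
        "/dev/sda2\t\t\t\tpartition\t8388604\t\t0\t\t-2\n"
        "/dev/zram0\t\t\t\tpartition\t4194300\t\t1024\t\t100\n"
    )
    for entry in parse_swaps(sample):
        print(entry["name"], "zram" if entry["is_zram"] else "regular swap")
```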

References

  1. Seth Jennings (February 12, 2013). "The zswap compressed swap cache". LWN.net. Retrieved January 22, 2014.
  2. Jenifer Hopper (December 11, 2012). "New Linux zswap compression functionality". IBM. Retrieved January 31, 2014.
  3. Michael Larabel (July 11, 2013). "Zswap Merged Into The Linux 3.11 Kernel". Phoronix. Retrieved February 5, 2014.
  4. "Linux kernel documentation: Documentation/vm/zswap.txt". kernel.org. November 22, 2013. Retrieved January 22, 2014.
  5. Dan Magenheimer (April 22, 2010). "Frontswap [PATCH 0/4] (was Transcendent Memory): Overview". gmane.org. Retrieved December 23, 2014.
  6. Jonathan Corbet (May 4, 2010). "Cleancache and Frontswap". LWN.net. Retrieved March 26, 2014.
  7. "Linux kernel source tree: kernel/git/torvalds/linux.git: zswap: add to mm/". kernel.org. July 11, 2013. Retrieved February 5, 2014.
  8. Dan Magenheimer (March 29, 2012). "Zcache and RAMster (oh, and frontswap too): Overview and some benchmarking" (PDF). oss.oracle.com. p. 12. Retrieved August 19, 2015.
  9. "Linux kernel source tree: kernel/git/torvalds/linux.git: zbud: add to mm/". kernel.org. July 11, 2013. Retrieved February 5, 2014.
  10. "[PATCH 0/8] zswap: compressed swap caching". gmane.org. December 11, 2012. Retrieved January 5, 2014.
  11. "[PATCHv10 0/4] zswap: compressed swap caching". gmane.org. May 8, 2013. Retrieved January 5, 2014.
  12. "Linux kernel 3.11, Section 9. Zswap: A compressed swap cache". kernelnewbies.org. September 2, 2013. Retrieved January 22, 2014.
  13. "Linux kernel 3.15, Section 4. Memory management". kernelnewbies.org. June 8, 2014. Retrieved June 15, 2014.
  14. "Linux kernel source tree: kernel/git/torvalds/linux.git: mm/zswap: support multiple swap devices". kernel.org. April 7, 2014. Retrieved June 15, 2014.
  15. https://github.com/torvalds/linux/commit/501a06fe8e4c185bbda371b8cedbdf1b23a633d8
  16. Dan Magenheimer (April 3, 2013). "In-kernel memory compression". LWN.net. Retrieved March 8, 2014.