Adaptive replacement cache

Adaptive Replacement Cache (ARC) is a page replacement algorithm with better performance [1] than LRU (least recently used). This is accomplished by keeping track of both frequently used and recently used pages plus a recent eviction history for both. The algorithm was developed [2] at the IBM Almaden Research Center. In 2006, IBM was granted a patent for the adaptive replacement cache policy.

Summary

Basic LRU maintains an ordered list (the cache directory) of resource entries in the cache, with the sort order based on the time of most recent access. New entries are added at the top of the list, after the bottom entry has been evicted. Cache hits move to the top, pushing all other entries down.
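The following is a minimal sketch of such an LRU directory in Python, assuming an OrderedDict-backed implementation; the class and method names (LRUCache, get, put) are illustrative rather than taken from any particular library:

    from collections import OrderedDict

    class LRUCache:
        """Minimal LRU: the most recently used entry sits at the top (front) of the directory."""
        def __init__(self, capacity):
            self.capacity = capacity
            self.directory = OrderedDict()  # key -> value, ordered by recency of access

        def get(self, key):
            if key not in self.directory:
                return None                               # cache miss
            self.directory.move_to_end(key, last=False)   # cache hit: move entry to the top
            return self.directory[key]

        def put(self, key, value):
            if key not in self.directory and len(self.directory) >= self.capacity:
                self.directory.popitem(last=True)         # evict the bottom (least recently used) entry
            self.directory[key] = value
            self.directory.move_to_end(key, last=False)   # new or updated entries go to the top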

ARC improves the basic LRU strategy by splitting the cache directory into two lists, T1 and T2, for recently and frequently referenced entries. In turn, each of these is extended with a ghost list (B1 or B2), which is attached to the bottom of the two lists. These ghost lists act as scorecards by keeping track of the history of recently evicted cache entries, and the algorithm uses ghost hits to adapt to recent changes in resource usage. Note that the ghost lists only contain metadata (keys for the entries) and not the resource data itself, i.e. as an entry is evicted into a ghost list, its data is discarded. The combined cache directory is organised in four LRU lists:

  1. T1, for recent cache entries.
  2. T2, for frequent entries, referenced at least twice.
  3. B1, ghost entries recently evicted from the T1 cache, but still tracked.
  4. B2, similar ghost entries, but evicted from T2.

T1 and B1 together are referred to as L1, a combined history of recent single references. Similarly, L2 is the combination of T2 and B2.
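As a rough illustration of how these four lists might be represented, the following Python sketch holds each list as an ordered dictionary; the class name ARCDirectory and its field names are assumptions for illustration, and B1/B2 hold keys only, as noted above:

    from collections import OrderedDict

    class ARCDirectory:
        """The four LRU lists of the ARC cache directory (a sketch, not a full implementation)."""
        def __init__(self, capacity):
            self.c = capacity          # fixed cache size (c)
            self.p = 0                 # target size for T1 (the ^ marker in the diagram below)
            self.t1 = OrderedDict()    # recent entries, seen exactly once (key -> data)
            self.t2 = OrderedDict()    # frequent entries, seen at least twice (key -> data)
            self.b1 = OrderedDict()    # ghost entries evicted from T1 (keys only, data discarded)
            self.b2 = OrderedDict()    # ghost entries evicted from T2 (keys only, data discarded)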

The whole cache directory can be visualised in a single line:

. . . [   B1  <-[     T1    <-!->      T2   ]->  B2   ] . .
      [ . . . . [ . . . . . . ! . .^. . . . ] . . . . ]
                [   fixed cache size (c)    ]

The inner [ ] brackets indicate the actual cache, which, although fixed in size, can move freely across the B1 and B2 history.

L1 is now displayed from right to left, starting at the top, indicated by the ! marker. ^ indicates the target size for T1, and may be equal to, smaller than, or larger than the actual size (as indicated by !).

Replacement

Entries (re-)entering the cache (T1, T2) will cause ! to move towards the target marker ^. If no free space exists in the cache, this marker also determines whether either T1 or T2 will evict an entry.
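A sketch of how this adaptation and replacement step can be expressed in code, following the published description of ARC by Megiddo and Modha; it reuses the ARCDirectory fields from the sketch above, and the function names (request, replace, load) are illustrative rather than the paper's exact pseudocode:

    def request(d, key, load):
        """Handle one reference to `key` against ARCDirectory `d`; `load` fetches the data on a miss."""
        if key in d.t1:                       # hit in T1: now seen at least twice, promote to T2
            d.t2[key] = d.t1.pop(key)
            return d.t2[key]
        if key in d.t2:                       # hit in T2: refresh recency (MRU is the end of the dict)
            d.t2.move_to_end(key)
            return d.t2[key]

        if key in d.b1:                       # ghost hit in B1: recency is paying off, grow T1's target
            d.p = min(d.c, d.p + max(1, len(d.b2) // max(1, len(d.b1))))
            replace(d, key)
            d.b1.pop(key)
            d.t2[key] = load(key)
            return d.t2[key]
        if key in d.b2:                       # ghost hit in B2: frequency is paying off, shrink T1's target
            d.p = max(0, d.p - max(1, len(d.b1) // max(1, len(d.b2))))
            replace(d, key)
            d.b2.pop(key)
            d.t2[key] = load(key)
            return d.t2[key]

        # complete miss: make room, trimming the ghost lists when L1 or the whole directory is full
        if len(d.t1) + len(d.b1) == d.c:
            if len(d.t1) < d.c:
                d.b1.popitem(last=False)      # drop LRU ghost of B1
                replace(d, key)
            else:
                d.t1.popitem(last=False)      # B1 is empty: drop LRU of T1 outright
        elif len(d.t1) + len(d.t2) + len(d.b1) + len(d.b2) >= d.c:
            if len(d.t1) + len(d.t2) + len(d.b1) + len(d.b2) == 2 * d.c:
                d.b2.popitem(last=False)      # directory at 2c: drop LRU ghost of B2
            replace(d, key)
        d.t1[key] = load(key)                 # new entry enters at the MRU position of T1
        return d.t1[key]

    def replace(d, key):
        """Evict one entry from T1 or T2 into its ghost list, steering the split toward the target p."""
        if d.t1 and (len(d.t1) > d.p or (key in d.b2 and len(d.t1) == d.p)):
            old, _ = d.t1.popitem(last=False)  # evict LRU of T1, keep its key as a B1 ghost
            d.b1[old] = None
        elif d.t2:
            old, _ = d.t2.popitem(last=False)  # evict LRU of T2, keep its key as a B2 ghost
            d.b2[old] = None

In this sketch, a ghost hit in B1 enlarges the target for T1 (favouring recency), while a ghost hit in B2 shrinks it (favouring frequency); this is how the actual cache "moves" across the B1 and B2 history in the diagram above.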

Deployment

ARC is currently deployed in IBM's DS6000/DS8000 storage controllers.

Sun Microsystems's scalable file system ZFS uses a variant [3] of ARC as an alternative to the traditional Solaris filesystem page cache in virtual memory. It has been modified to allow for locked pages that are currently in use and cannot be vacated.

PostgreSQL used ARC in its buffer manager for a brief time (version 8.0.0), but quickly replaced it with another algorithm, citing concerns over an IBM patent on ARC. [4]

VMware's vSAN (formerly Virtual SAN) is a hyper-converged, software-defined storage (SDS) product developed by VMware. It uses a variant of ARC in its caching algorithm. [5]

OpenZFS supports using ARC and L2ARC as read caches in a multi-level cache. In OpenZFS, disk reads first hit the first-level disk cache in RAM, which is managed with ARC. If an SSD is set up to hold the second-level disk cache, it is called an L2ARC. L2ARC uses the same ARC algorithm, but stores the cached data on a fast SSD instead of in RAM. [6] [7] [8] [9] [10] [11]


References

  1. One Up on LRU, Usenix :login; August 2003
  2. Nimrod Megiddo and Dharmendra Modha, 2010-03-09 archive of the ARC home page, with pointers to several articles
  3. comments in Solaris ZFS arc.c source file explains differences with original work
  4. Article in Postgresql General Bits, "The Saga of the ARC Algorithm and Patent", published 6 February 2005
  5. Reference document, "VMware vSAN Caching Algorithms" [ permanent dead link ]
  6. "ZFS Caching".
  7. "ZFS Primer".
  8. Jim Salter. "Persistent L2ARC might be coming to ZFS on Linux". 2020.
  9. "Cache: L2ARC accesses".
  10. Brendan Gregg. "ZFS L2ARC".
  11. Ranvir Singh. "Adaptive Replacement Cache (ARC) and L2ARC".