Thrashing (computer science)

Last updated

In computer science, thrashing occurs in a system with virtual memory when a computer's real storage resources are overcommitted, leading to a constant state of paging and page faults, slowing most application-level processing. [1] This causes the performance of the computer to degrade or collapse. The situation can continue indefinitely until the user closes some running applications or the active processes free up additional virtual memory resources.

Contents

After initialization, most programs operate on a small number of code and data pages compared to the total memory the program requires. The pages most frequently accessed at any point are called the working set, which may change over time.

When the working set is not significantly greater than the system's total number of real storage page frames, virtual memory systems work most efficiently, and an insignificant amount of computing is spent resolving page faults. As the total of the working sets grows, resolving page faults remains manageable until the growth reaches a critical point at which the number of faults increases dramatically and the time spent resolving them overwhelms the time spent on the computing the program was written to do. This condition is referred to as thrashing. Thrashing may occur on a program that randomly accesses huge data structures, as its large working set causes continual page faults that drastically slow down the system. Satisfying page faults may require freeing pages that will soon have to be re-read from disk.

The term is also used for various similar phenomena, particularly movement between other levels of the memory hierarchy, wherein a process progresses slowly because significant time is being spent acquiring resources.

"Thrashing" is also used in contexts other than virtual memory systems—for example, to describe cache issues in computing or silly window syndrome in networking.

Overview

Virtual memory works by treating a portion of secondary storage such as a computer hard disk as an additional layer of the cache hierarchy. Virtual memory allows processes to use more memory than is physically present in main memory. Operating systems supporting virtual memory assign processes a virtual address space and each process refers to addresses in its execution context by a so-called virtual address. To access data such as code or variables at that address, the process must translate the address to a physical address in a process known as virtual address translation. In effect, physical main memory becomes a cache for virtual memory, which is in general stored on disk in memory pages.

Programs are allocated a certain number of pages as needed by the operating system. Active memory pages exist in both RAM and on disk. Inactive pages are removed from the cache and written to disk when the main memory becomes full.

If processes are utilizing all main memory and need additional memory pages, a cascade of severe cache misses known as page faults will occur, often leading to a noticeable lag in the operating system responsiveness. This process together with the futile, repetitive page swapping that occurs is known as "thrashing". This frequently leads to high, runaway CPU utilization that can grind the system to a halt. In modern computers, thrashing may occur in the paging system (if there is not sufficient physical memory or the disk access time is overly long), or in the I/O communications subsystem (especially in conflicts over internal bus access), etc.

Depending on the configuration and algorithms involved, the throughput and latency of a system may degrade by multiple orders of magnitude. Thrashing is when the CPU performs 'productive' work less and 'swapping' work more. The overall memory access time may increase since the higher level memory is only as fast as the next lower level in the memory hierarchy. [2] The CPU is busy swapping pages so much that it cannot respond to users' programs and interrupts as much as required. Thrashing occurs when there are too many pages in memory, and each page refers to another page. Real memory reduces its capacity to contain all the pages, so it uses 'virtual memory'. When each page in execution demands that page that is not currently in real memory (RAM) it places some pages on virtual memory and adjusts the required page on RAM. If the CPU is too busy doing this task, thrashing occurs.

Causes

In virtual memory systems, thrashing may be caused by programs or workloads that present insufficient locality of reference: if the working set of a program or a workload cannot be effectively held within physical memory, then constant data swapping, i.e., thrashing, may occur. The term was first used during the tape operating system days to describe the sound the tapes made when data was being rapidly written to and read. A worst-case scenario of this sort on the IBM System/370 series mainframe computer could be an execute instruction crossing a page boundary that points to a move instruction itself also crossing a page boundary, itself pointing to a source and a target that each cross page boundaries. The total number of pages thus involved in this particular instruction is eight, and all eight pages must be simultaneously present in memory. If any one of the eight pages cannot be swapped in (for example to make room for any of the other pages), the instruction will fault, and every attempt to restart it will fail until all eight pages can be swapped in.

A system thrashing is often a result of a sudden spike in page demand from a small number of running programs. Swap-token [3] is a lightweight and dynamic thrashing protection mechanism. The basic idea is to set a token in the system, which is randomly given to a process that has page faults when thrashing happens. The process that has the token is given a privilege to allocate more physical memory pages to build its working set, which is expected to quickly finish its execution and release the memory pages to other processes. A timestamp is used to hand over the tokens one by one. The first version of swap-token is implemented in Linux. The second version is called preempt swap-token. In this updated swap-token implementation, a priority counter is set for each process to track the number of swap-out pages. The token is always given to the process with a high priority, which has a high number of swap-out pages. The length of the time stamp is not a constant but is determined by the priority: the higher the number of swap-out pages of a process, the longer the time stamp for it will be.

Other uses

Thrashing is best known in the context of memory and storage, but analogous phenomena occur for other resources, including:

Cache thrashing

Where main memory is accessed in a pattern that leads to multiple main memory locations competing for the same cache lines, resulting in excessive cache misses. This is most problematic for caches that have low associativity.

TLB thrashing

Where the translation lookaside buffer (TLB) acting as a cache for the memory management unit (MMU) which translates virtual addresses to physical addresses is too small for the working set of pages. TLB thrashing can occur even if instruction cache or data cache thrashing is not occurring because these are cached in different sizes. Instructions and data are cached in small blocks (cache lines), not entire pages, but address lookup is done at the page level. Thus even if the code and data working sets fit into the cache, if the working sets are fragmented across many pages, the virtual address working set may not fit into TLB, causing TLB thrashing.

Heap thrashing

Frequent garbage collection, due to failure to allocate memory for an object, due to insufficient free memory or insufficient contiguous free memory due to memory fragmentation is referred to as heap thrashing. [4]

Process thrashing

A similar phenomenon occurs for processes: when the process working set cannot be coscheduled – so not all interacting processes are scheduled to run at the same time – they experience "process thrashing" due to being repeatedly scheduled and unscheduled, progressing only slowly. [5]

See also

Related Research Articles

In computing, a context switch is the process of storing the state of a process or thread, so that it can be restored and resume execution at a later point, and then restoring a different, previously saved, state. This allows multiple processes to share a single central processing unit (CPU), and is an essential feature of a multiprogramming or multitasking operating system. In a traditional CPU, each process - a program in execution - utilizes the various CPU registers to store data and hold the current state of the running process. However, in a multitasking operating system, the operating system switches between processes or threads to allow the execution of multiple processes simultaneously. For every switch, the operating system must save the state of the currently running process, followed by loading the next process state, which will run on the CPU. This sequence of operations that stores the state of the running process and the loading of the following running process is called a context switch.

<span class="mw-page-title-main">Virtual memory</span> Computer memory management technique

In computing, virtual memory, or virtual storage, is a memory management technique that provides an "idealized abstraction of the storage resources that are actually available on a given machine" which "creates the illusion to users of a very large (main) memory".

<span class="mw-page-title-main">Process (computing)</span> Particular execution of a computer program

In computing, a process is the instance of a computer program that is being executed by one or many threads. There are many different process models, some of which are light weight, but almost all processes are rooted in an operating system (OS) process which comprises the program code, assigned system resources, physical and logical access permissions, and data structures to initiate, control and coordinate execution activity. Depending on the OS, a process may be made up of multiple threads of execution that execute instructions concurrently.

In computer science, locality of reference, also known as the principle of locality, is the tendency of a processor to access the same set of memory locations repetitively over a short period of time. There are two basic types of reference locality – temporal and spatial locality. Temporal locality refers to the reuse of specific data and/or resources within a relatively small time duration. Spatial locality refers to the use of data elements within relatively close storage locations. Sequential locality, a special case of spatial locality, occurs when data elements are arranged and accessed linearly, such as traversing the elements in a one-dimensional array.

<span class="mw-page-title-main">Memory management unit</span> Hardware translating virtual addresses to physical address

A memory management unit (MMU), sometimes called paged memory management unit (PMMU), is a computer hardware unit that examines all memory references on the memory bus, translating these requests, known as virtual memory addresses, into physical addresses in main memory.

In computing, a bus error is a fault raised by hardware, notifying an operating system (OS) that a process is trying to access memory that the CPU cannot physically address: an invalid address for the address bus, hence the name. In modern use on most architectures these are much rarer than segmentation faults, which occur primarily due to memory access violations: problems in the logical address or permissions.

In computer operating systems, memory paging is a memory management scheme by which a computer stores and retrieves data from secondary storage for use in main memory. In this scheme, the operating system retrieves data from secondary storage in same-size blocks called pages. Paging is an important part of virtual memory implementations in modern operating systems, using secondary storage to let programs exceed the size of available physical memory.

Memory protection is a way to control memory access rights on a computer, and is a part of most modern instruction set architectures and operating systems. The main purpose of memory protection is to prevent a process from accessing memory that has not been allocated to it. This prevents a bug or malware within a process from affecting other processes, or the operating system itself. Protection may encompass all accesses to a specified area of memory, write accesses, or attempts to execute the contents of the area. An attempt to access unauthorized memory results in a hardware fault, e.g., a segmentation fault, storage violation exception, generally causing abnormal termination of the offending process. Memory protection for computer security includes additional techniques such as address space layout randomization and executable-space protection.

A translation lookaside buffer (TLB) is a memory cache that stores the recent translations of virtual memory to physical memory. It is used to reduce the time taken to access a user memory location. It can be called an address-translation cache. It is a part of the chip's memory-management unit (MMU). A TLB may reside between the CPU and the CPU cache, between CPU cache and the main memory or between the different levels of the multi-level cache. The majority of desktop, laptop, and server processors include one or more TLBs in the memory-management hardware, and it is nearly always present in any processor that utilizes paged or segmented virtual memory.

<span class="mw-page-title-main">Page table</span> Data structure that maps virtual addresses with physical addresses

A page table is a data structure used by a virtual memory system in a computer to store mappings between virtual addresses and physical addresses. Virtual addresses are used by the program executed by the accessing process, while physical addresses are used by the hardware, or more specifically, by the random-access memory (RAM) subsystem. The page table is a key component of virtual address translation that is necessary to access data in memory. The page table is set up by the computer's operating system, and may be read and written during the virtual address translation process by the memory management unit or by low-level system software or firmware.

In a computer operating system that uses paging for virtual memory management, page replacement algorithms decide which memory pages to page out, sometimes called swap out, or write to disk, when a page of memory needs to be allocated. Page replacement happens when a requested page is not in memory and a free page cannot be used to satisfy the allocation, either because there are none, or because the number of free pages is lower than some threshold.

A CPU cache is a hardware cache used by the central processing unit (CPU) of a computer to reduce the average cost to access data from the main memory. A cache is a smaller, faster memory, located closer to a processor core, which stores copies of the data from frequently used main memory locations. Most CPUs have a hierarchy of multiple cache levels, with different instruction-specific and data-specific caches at level 1. The cache memory is typically implemented with static random-access memory (SRAM), in modern CPUs by far the largest part of them by chip area, but SRAM is not always used for all levels, or even any level, sometimes some latter or all levels are implemented with eDRAM.

In computing, a page fault is an exception that the memory management unit (MMU) raises when a process accesses a memory page without proper preparations. Accessing the page requires a mapping to be added to the process's virtual address space. Besides, the actual page contents may need to be loaded from a backing store, such as a disk. The MMU detects the page fault, but the operating system's kernel handles the exception by making the required page accessible in the physical memory or denying an illegal memory access.

In computer operating systems, demand paging is a method of virtual memory management. In a system that uses demand paging, the operating system copies a disk page into physical memory only if an attempt is made to access it and that page is not already in memory. It follows that a process begins execution with none of its pages in physical memory, and many page faults will occur until most of a process's working set of pages are located in physical memory. This is an example of a lazy loading technique.

Memory segmentation is an operating system memory management technique of dividing a computer's primary memory into segments or sections. In a computer system using segmentation, a reference to a memory location includes a value that identifies a segment and an offset within that segment. Segments or sections are also used in object files of compiled programs when they are linked together into a program image and when the image is loaded into memory.

In computing, a system resource, or simple resource, is any physical or virtual component of limited availability within a computer system. All connected devices and internal system components are resources. Virtual system resources include files, network connections, and memory areas.

Working set is a concept in computer science which defines the amount of memory that a process requires in a given time interval.

<span class="mw-page-title-main">Multithreading (computer architecture)</span> Ability of a CPU to provide multiple threads of execution concurrently

In computer architecture, multithreading is the ability of a central processing unit (CPU) to provide multiple threads of execution concurrently, supported by the operating system. This approach differs from multiprocessing. In a multithreaded application, the threads share the resources of a single or multiple cores, which include the computing units, the CPU caches, and the translation lookaside buffer (TLB).

This glossary of computer hardware terms is a list of definitions of terms and concepts related to computer hardware, i.e. the physical and structural components of computers, architectural issues, and peripheral devices.

Virtual memory compression is a memory management technique that utilizes data compression to reduce the size or number of paging requests to and from the auxiliary storage. In a virtual memory compression system, pages to be paged out of virtual memory are compressed and stored in physical memory, which is usually random-access memory (RAM), or sent as compressed to auxiliary storage such as a hard disk drive (HDD) or solid-state drive (SSD). In both cases the virtual memory range, whose contents has been compressed, is marked inaccessible so that attempts to access compressed pages can trigger page faults and reversal of the process. The footprint of the data being paged is reduced by the compression process; in the first instance, the freed RAM is returned to the available physical memory pool, while the compressed portion is kept in RAM. In the second instance, the compressed data is sent to auxiliary storage but the resulting I/O operation is smaller and therefore takes less time.

References

  1. Denning, Peter J. (1968). "Thrashing: Its causes and prevention" (PDF). Proceedings AFIPS, Fall Joint Computer Conference. 33: 915–922. Retrieved 2012-02-15.
  2. L., Hennessy, John (2012). Computer architecture: a quantitative approach. Patterson, David A., Asanović, Krste. (5th ed.). Waltham, MA: Morgan Kaufmann. ISBN   9780123838728. OCLC   755102367.{{cite book}}: CS1 maint: multiple names: authors list (link)
  3. Song Jiang, and Xiaodong Zhang (2005). "Token-ordered LRU: an effective page replacement policy and its implementation in Linux systems". Performance Evaluation. pp. 5–29. doi:10.1016/j.peva.2004.10.002.
  4. Performance Optimization and Tuning Techniques for IBM Processors, including IBM POWER8, "heap+thrashing" p. 170
  5. Ousterhout, J. K. (1982). "Scheduling Techniques for Concurrent Systems" (PDF). Proceedings of Third International Conference on Distributed Computing Systems. pp. 22–30.