Circular buffer

Last updated
A ring showing, conceptually, a circular buffer. This visually shows that the buffer has no real end and it can loop around the buffer. However, since memory is never physically created as a ring, a linear representation is generally used as is done below. Circular buffer.svg
A ring showing, conceptually, a circular buffer. This visually shows that the buffer has no real end and it can loop around the buffer. However, since memory is never physically created as a ring, a linear representation is generally used as is done below.

In computer science, a circular buffer, circular queue, cyclic buffer or ring buffer is a data structure that uses a single, fixed-size buffer as if it were connected end-to-end. This structure lends itself easily to buffering data streams. [1] There were early circular buffer implementations in hardware. [2] [3]

Contents

Overview

A 24-byte keyboard circular buffer. When the write pointer is about to reach the read pointer--because the microprocessor is not responding--the buffer stops recording keystrokes. On some computers a beep would be played. Circular Buffer Animation.gif
A 24-byte keyboard circular buffer. When the write pointer is about to reach the read pointerbecause the microprocessor is not respondingthe buffer stops recording keystrokes. On some computers a beep would be played.

A circular buffer first starts out empty and has a set length. In the diagram below is a 7-element buffer:

Circular buffer - empty.svg

Assume that 1 is written in the center of a circular buffer (the exact starting location is not important in a circular buffer):

Circular buffer - XX1XXXX.svg

Then assume that two more elements are added to the circular buffer 2 & 3 which get put after 1:

Circular buffer - XX123XX.svg

If two elements are removed, the two oldest values inside of the circular buffer would be removed. Circular buffers use FIFO ( first in, first out ) logic. In the example, 1 & 2 were the first to enter the circular buffer, they are the first to be removed, leaving 3 inside of the buffer.

Circular buffer - XXXX3XX.svg

If the buffer has 7 elements, then it is completely full:

Circular buffer - 6789345.svg

A property of the circular buffer is that when it is full and a subsequent write is performed, then it starts overwriting the oldest data. In the current example, two more elements A & B are added and they overwrite the 3 & 4:

Circular buffer - 6789AB5.svg

Alternatively, the routines that manage the buffer could prevent overwriting the data and return an error or raise an exception. Whether or not data is overwritten is up to the semantics of the buffer routines or the application using the circular buffer.

Finally, if two elements are now removed then what would be removed is not 3 & 4 (or rather now A & B) but 5 & 6 because 5 & 6 are now the oldest elements, yielding the buffer with:

Circular buffer - X789ABX.svg

Uses

The useful property of a circular buffer is that it does not need to have its elements shuffled around when one is consumed. (If a non-circular buffer were used then it would be necessary to shift all elements when one is consumed.) In other words, the circular buffer is well-suited as a FIFO (first in, first out) buffer while a standard, non-circular buffer is well suited as a LIFO (last in, first out) buffer.

Circular buffering makes a good implementation strategy for a queue that has fixed maximum size. Should a maximum size be adopted for a queue, then a circular buffer is a completely ideal implementation; all queue operations are constant time. However, expanding a circular buffer requires shifting memory, which is comparatively costly. For arbitrarily expanding queues, a linked list approach may be preferred instead.

In some situations, overwriting circular buffer can be used, e.g. in multimedia. If the buffer is used as the bounded buffer in the producer–consumer problem then it is probably desired for the producer (e.g., an audio generator) to overwrite old data if the consumer (e.g., the sound card) is unable to momentarily keep up. Also, the LZ77 family of lossless data compression algorithms operates on the assumption that strings seen more recently in a data stream are more likely to occur soon in the stream. Implementations store the most recent data in a circular buffer.

Circular buffer mechanics

Circular buffer implementation in hardware, US patent 3979733, fig4 Hardware circular buffer implementation patent us3979733 fig4.png
Circular buffer implementation in hardware, US patent 3979733, fig4

A circular buffer can be implemented using a pointer and three integers: [4]

This image shows a partially full buffer with Length = 7:

Circular buffer - XX123XX with pointers.svg

This image shows a full buffer with four elements (numbers 1 through 4) having been overwritten:

Circular buffer - 6789AB5 with pointers.svg

In the beginning the indexes end and start are set to 0. The circular buffer write operation writes an element to the end index position and the end index is incremented to the next buffer position. The circular buffer read operation reads an element from the start index position and the start index is incremented to the next buffer position.

The start and end indexes alone are not enough to distinguish between buffer full or empty state while also utilizing all buffer slots, [5] but can be if the buffer only has a maximum in-use size of Length − 1. [6] In this case, the buffer is empty if the start and end indexes are equal and full when the in-use size is Length − 1. Another solution is to have another integer count that is incremented at a write operation and decremented at a read operation. Then checking for emptiness means testing count equals 0 and checking for fullness means testing count equals Length. [7]

The following source code is a C implementation together with a minimal test. Function put() puts an item in the buffer, function get() gets an item from the buffer. Both functions take care about the capacity of the buffer :

#include<stdio.h>enum{N=10};// size of circular bufferintbuffer[N];// note: only (N - 1) elements can be stored at a given timeintwriteIndx=0;intreadIndx=0;intput(intitem){if((writeIndx+1)%N==readIndx){// buffer is full, avoid overflowreturn0;}buffer[writeIndx]=item;writeIndx=(writeIndx+1)%N;return1;}intget(int*value){if(readIndx==writeIndx){// buffer is emptyreturn0;}*value=buffer[readIndx];readIndx=(readIndx+1)%N;return1;}intmain(){// test circular bufferintvalue=1001;while(put(value++));while(get(&value))printf("read %d\n",value);return0;}

Optimization

A circular-buffer implementation may be optimized by mapping the underlying buffer to two contiguous regions of virtual memory. [8] [ disputed discuss ] (Naturally, the underlying buffer‘s length must then equal some multiple of the system’s page size.) Reading from and writing to the circular buffer may then be carried out with greater efficiency by means of direct memory access; those accesses which fall beyond the end of the first virtual-memory region will automatically wrap around to the beginning of the underlying buffer. When the read offset is advanced into the second virtual-memory region, both offsets—read and write—are decremented by the length of the underlying buffer.

Fixed-length-element and contiguous-block circular buffer

Perhaps the most common version of the circular buffer uses 8-bit bytes as elements.

Some implementations of the circular buffer use fixed-length elements that are bigger than 8-bit bytes—16-bit integers for audio buffers, 53-byte ATM cells for telecom buffers, etc. Each item is contiguous and has the correct data alignment, so software reading and writing these values can be faster than software that handles non-contiguous and non-aligned values.

Ping-pong buffering can be considered a very specialized circular buffer with exactly two large fixed-length elements.

The bip buffer (bipartite buffer) is very similar to a circular buffer, except it always returns contiguous blocks which can be variable length. This offers nearly all the efficiency advantages of a circular buffer while maintaining the ability for the buffer to be used in APIs that only accept contiguous blocks. [9]

Fixed-sized compressed circular buffers use an alternative indexing strategy based on elementary number theory to maintain a fixed-sized compressed representation of the entire data sequence. [10]

Related Research Articles

In computer science, an array is a data structure consisting of a collection of elements, of same memory size, each identified by at least one array index or key. An array is stored such that the position of each element can be computed from its index tuple by a mathematical formula. The simplest type of data structure is a linear array, also called a one-dimensional array.

<span class="mw-page-title-main">FIFO (computing and electronics)</span> Scheduling algorithm, the first piece of data inserted into a queue is processed first

In computing and in systems theory, first in, first out, acronymized as FIFO, is a method for organizing the manipulation of a data structure where the oldest (first) entry, or "head" of the queue, is processed first.

In computer science, a linked list is a linear collection of data elements whose order is not given by their physical placement in memory. Instead, each element points to the next. It is a data structure consisting of a collection of nodes which together represent a sequence. In its most basic form, each node contains data, and a reference to the next node in the sequence. This structure allows for efficient insertion or removal of elements from any position in the sequence during iteration. More complex variants add additional links, allowing more efficient insertion or removal of nodes at arbitrary positions. A drawback of linked lists is that data access time is linear in respect to the number of nodes in the list. Because nodes are serially linked, accessing any node requires that the prior node be accessed beforehand. Faster access, such as random access, is not feasible. Arrays have better cache locality compared to linked lists.

<span class="mw-page-title-main">Virtual memory</span> Computer memory management technique

In computing, virtual memory, or virtual storage, is a memory management technique that provides an "idealized abstraction of the storage resources that are actually available on a given machine" which "creates the illusion to users of a very large (main) memory".

<span class="mw-page-title-main">Round-robin scheduling</span> Algorithm employed by process and network schedulers in computing

Round-robin (RR) is one of the algorithms employed by process and network schedulers in computing. As the term is generally used, time slices are assigned to each process in equal portions and in circular order, handling all processes without priority. Round-robin scheduling is simple, easy to implement, and starvation-free. Round-robin scheduling can be applied to other scheduling problems, such as data packet scheduling in computer networks. It is an operating system concept.

A log-structured filesystem is a file system in which data and metadata are written sequentially to a circular buffer, called a log. The design was first proposed in 1988 by John K. Ousterhout and Fred Douglis and first implemented in 1992 by Ousterhout and Mendel Rosenblum for the Unix-like Sprite distributed operating system.

A re-order buffer (ROB) is a hardware unit used in an extension to the Tomasulo algorithm to support out-of-order and speculative instruction execution. The extension forces instructions to be committed in-order.

In computer science, the test-and-set instruction is an instruction used to write (set) 1 to a memory location and return its old value as a single atomic operation. The caller can then "test" the result to see if the state was changed by the call. If multiple processes may access the same memory location, and if a process is currently performing a test-and-set, no other process may begin another test-and-set until the first process's test-and-set is finished. A central processing unit (CPU) may use a test-and-set instruction offered by another electronic component, such as dual-port RAM; a CPU itself may also offer a test-and-set instruction.

In computer architecture, register renaming is a technique that abstracts logical registers from physical registers. Every logical register has a set of physical registers associated with it. When a machine language instruction refers to a particular logical register, the processor transposes this name to one specific physical register on the fly. The physical registers are opaque and cannot be referenced directly but only via the canonical names.

<span class="mw-page-title-main">Pointer (computer programming)</span> Object which stores memory addresses in a computer program

In computer science, a pointer is an object in many programming languages that stores a memory address. This can be that of another value located in computer memory, or in some cases, that of memory-mapped computer hardware. A pointer references a location in memory, and obtaining the value stored at that location is known as dereferencing the pointer. As an analogy, a page number in a book's index could be considered a pointer to the corresponding page; dereferencing such a pointer would be done by flipping to the page with the given page number and reading the text found on that page. The actual format and content of a pointer variable is dependent on the underlying computer architecture.

<span class="mw-page-title-main">Page table</span> Data structure that maps virtual addresses with physical addresses

A page table is a data structure used by a virtual memory system in a computer to store mappings between virtual addresses and physical addresses. Virtual addresses are used by the program executed by the accessing process, while physical addresses are used by the hardware, or more specifically, by the random-access memory (RAM) subsystem. The page table is a key component of virtual address translation that is necessary to access data in memory. The page table is set up by the computer's operating system, and may be read and written during the virtual address translation process by the memory management unit or by low-level system software or firmware.

<span class="mw-page-title-main">File system</span> Computer filing system

In computing, a file system or filesystem governs file organization and access. A local file system is a capability of an operating system that services the applications running on the same computer. A distributed file system is a protocol that provides file access between networked computers.

Memory segmentation is an operating system memory management technique of dividing a computer's primary memory into segments or sections. In a computer system using segmentation, a reference to a memory location includes a value that identifies a segment and an offset within that segment. Segments or sections are also used in object files of compiled programs when they are linked together into a program image and when the image is loaded into memory.

<span class="mw-page-title-main">Dynamic array</span> List data structure to which elements can be added/removed

In computer science, a dynamic array, growable array, resizable array, dynamic table, mutable array, or array list is a random access, variable-size list data structure that allows elements to be added or removed. It is supplied with standard libraries in many modern mainstream programming languages. Dynamic arrays overcome a limit of static arrays, which have a fixed capacity that needs to be specified at allocation.

In computer science, a data buffer is a region of memory used to store data temporarily while it is being moved from one place to another. Typically, the data is stored in a buffer as it is retrieved from an input device or just before it is sent to an output device ; however, a buffer may be used when data is moved between processes within a computer, comparable to buffers in telecommunication. Buffers can be implemented in a fixed memory location in hardware or by using a virtual data buffer in software that points at a location in the physical memory.

In computing, the producer-consumer problem is a family of problems described by Edsger W. Dijkstra since 1965.

In software, a stack buffer overflow or stack buffer overrun occurs when a program writes to a memory address on the program's call stack outside of the intended data structure, which is usually a fixed-length buffer. Stack buffer overflow bugs are caused when a program writes more data to a buffer located on the stack than what is actually allocated for that buffer. This almost always results in corruption of adjacent data on the stack, and in cases where the overflow was triggered by mistake, will often cause the program to crash or operate incorrectly. Stack buffer overflow is a type of the more general programming malfunction known as buffer overflow. Overfilling a buffer on the stack is more likely to derail program execution than overfilling a buffer on the heap because the stack contains the return addresses for all active function calls.

A journaling file system is a file system that keeps track of changes not yet committed to the file system's main part by recording the goal of such changes in a data structure known as a "journal", which is usually a circular log. In the event of a system crash or power failure, such file systems can be brought back online more quickly with a lower likelihood of becoming corrupted.

<span class="mw-page-title-main">Log-structured merge-tree</span> Data structure

In computer science, the log-structured merge-tree is a data structure with performance characteristics that make it attractive for providing indexed access to files with high insert volume, such as transactional log data. LSM trees, like other search trees, maintain key-value pairs. LSM trees maintain data in two or more separate structures, each of which is optimized for its respective underlying storage medium; data is synchronized between the two structures efficiently, in batches.

References

  1. Arpaci-Dusseau, Remzi H.; Arpaci-Dusseau, Andrea C. (2014), Operating Systems: Three Easy Pieces [Chapter: Condition Variables, figure 30.13] (PDF), Arpaci-Dusseau Books
  2. Hartl, Johann (17 October 2011). "Impulswiederholer - Telephone Exchange (video)". Youtube. Retrieved 15 December 2021.
  3. Fraser, Alexander Gibson. "US patent 3979733 Digital data communications system packet switch". US States Patent. Retrieved 15 December 2021.
  4. Liu, Z.; Wu, F.; Das, S.K. (2021). Wireless Algorithms, Systems, and Applications: 16th International Conference, WASA 2021, Nanjing, China, June 25–27, 2021, Proceedings, Part II. Lecture Notes in Computer Science. Springer International Publishing. p. 117. ISBN   978-3-030-86130-8 . Retrieved 2023-09-04.
  5. Chandrasekaran, Siddharth (2014-05-16). "Implementing Circular/Ring Buffer in Embedded C". Embed Journal. EmbedJournal Team. Archived from the original on 11 February 2017. Retrieved 14 August 2017.
  6. Circular buffers kernel.org
  7. Morin, Pat. "ArrayQueue: An Array-Based Queue". Open Data Structures (in pseudocode). Archived from the original on 31 August 2015. Retrieved 7 November 2015.
  8. Mike Ash (2012-02-17). "mikeash.com: Friday Q&A 2012-02-17: Ring Buffers and Mirrored Memory: Part II". mikeash.com. Archived from the original on 2019-01-11. Retrieved 2019-01-10.
  9. Simon Cooke (2003), "The Bip Buffer - The Circular Buffer with a Twist"
  10. Gunther, John C. (March 2014). "Algorithm 938: Compressing circular buffers". ACM Transactions on Mathematical Software. 40 (2): 1–12. doi:10.1145/2559995. S2CID   14682572.