Double-ended queue

Last updated

In computer science, a double-ended queue (abbreviated to deque, pronounced deck, like "cheque" [1] ) is an abstract data type that generalizes a queue, for which elements can be added to or removed from either the front (head) or back (tail). [2] It is also often called a head-tail linked list, though properly this refers to a specific data structure implementation of a deque (see below).

Contents

Naming conventions

Deque is sometimes written dequeue, but this use is generally deprecated in technical literature or technical writing because dequeue is also a verb meaning "to remove from a queue". Nevertheless, several libraries and some writers, such as Aho, Hopcroft, and Ullman in their textbook Data Structures and Algorithms, spell it dequeue. John Mitchell, author of Concepts in Programming Languages, also uses this terminology.

Distinctions and sub-types

This differs from the queue abstract data type or first in first out list (FIFO), where elements can only be added to one end and removed from the other. This general data class has some possible sub-types:

Both the basic and most common list types in computing, queues and stacks can be considered specializations of deques, and can be implemented using deques.

Operations

UML class diagram of a double-ended queue UML deque.svg
UML class diagram of a double-ended queue

The basic operations on a deque are enqueue and dequeue on either end. Also generally implemented are peek operations, which return the value at that end without dequeuing it.

Names vary between languages; major implementations include:

operationcommon name(s) Ada C++ Java Perl PHP Python Ruby Rust JavaScript
insert element at backinject, snoc, pushAppendpush_backofferLastpusharray_pushappendpushpush_backpush
insert element at frontpush, consPrependpush_frontofferFirstunshiftarray_unshiftappendleftunshiftpush_frontunshift
remove last elementejectDelete_Lastpop_backpollLastpoparray_poppoppoppop_backpop
remove first elementpopDelete_Firstpop_frontpollFirstshiftarray_shiftpopleftshiftpop_frontshift
examine last elementpeekLast_ElementbackpeekLast$array[-1]end<obj>[-1]lastback<obj>.at(-1)
examine first elementFirst_ElementfrontpeekFirst$array[0]reset<obj>[0]firstfront<obj>[0]

Implementations

There are at least two common ways to efficiently implement a deque: with a modified dynamic array or with a doubly linked list.

The dynamic array approach uses a variant of a dynamic array that can grow from both ends, sometimes called array deques. These array deques have all the properties of a dynamic array, such as constant-time random access, good locality of reference, and inefficient insertion/removal in the middle, with the addition of amortized constant-time insertion/removal at both ends, instead of just one end. Three common implementations include:

Purely functional implementation

Double-ended queues can also be implemented as a purely functional data structure. [3] :115 Two versions of the implementation exist. The first one, called 'real-time deque, is presented below. It allows the queue to be persistent with operations in O(1) worst-case time, but requires lazy lists with memoization. The second one, with no lazy lists nor memoization is presented at the end of the sections. Its amortized time is O(1) if the persistency is not used; but the worst-time complexity of an operation is O(n) where n is the number of elements in the double-ended queue.

Let us recall that, for a list l, |l| denotes its length, that NIL represents an empty list and CONS(h, t) represents the list whose head is h and whose tail is t. The functions drop(i, l) and take(i, l) return the list l without its first i elements, and the first i elements of l, respectively. Or, if |l| < i, they return the empty list and l respectively.

Real-time deques via lazy rebuilding and scheduling

A double-ended queue is represented as a sextuple (len_front, front, tail_front, len_rear, rear, tail_rear) where front is a linked list which contains the front of the queue of length len_front. Similarly, rear is a linked list which represents the reverse of the rear of the queue, of length len_rear. Furthermore, it is assured that |front| ≤ 2|rear|+1 and |rear| ≤ 2|front|+1 - intuitively, it means that both the front and the rear contains between a third minus one and two thirds plus one of the elements. Finally, tail_front and tail_rear are tails of front and of rear, they allow scheduling the moment where some lazy operations are forced. Note that, when a double-ended queue contains n elements in the front list and n elements in the rear list, then the inequality invariant remains satisfied after i insertions and d deletions when (i+d) n/2. That is, at most n/2 operations can happen between each rebalancing.

Let us first give an implementation of the various operations that affect the front of the deque - cons, head and tail. Those implementations do not necessarily respect the invariant. In a second time we'll explain how to modify a deque which does not satisfy the invariant into one which satisfies it. However, they use the invariant, in that if the front is empty then the rear has at most one element. The operations affecting the rear of the list are defined similarly by symmetry.

empty=(0,NIL,NIL,0,NIL,NIL)funinsert'(x,(len_front,front,tail_front,len_rear,rear,tail_rear))=(len_front+1,CONS(x,front),drop(2,tail_front),len_rear,rear,drop(2,tail_rear))funhead((_,CONS(h,_),_,_,_,_))=hfunhead((_,NIL,_,_,CONS(h,NIL),_))=hfuntail'((len_front,CONS(head_front,front),tail_front,len_rear,rear,tail_rear))=(len_front-1,front,drop(2,tail_front),len_rear,rear,drop(2,tail_rear))funtail'((_,NIL,_,_,CONS(h,NIL),_))=empty

It remains to explain how to define a method balance that rebalance the deque if insert' or tail broke the invariant. The method insert and tail can be defined by first applying insert' and tail' and then applying balance.

funbalance(qas(len_front,front,tail_front,len_rear,rear,tail_rear))=letfloor_half_len=(len_front+len_rear)/2inletceil_half_len=len_front+len_rear-floor_half_leniniflen_front>2*len_rear+1thenletvalfront'=take(ceil_half_len,front)valrear'=rotateDrop(rear,floor_half_len,front)in(ceil_half_len,front',front',floor_half_len,rear',rear')elseiflen_front>2*len_rear+1thenletvalrear'=take(floor_half_len,rear)valfront'=rotateDrop(front,ceil_half_len,rear)in(ceil_half_len,front',front',floor_half_len,rear',rear')elseq

where rotateDrop(front, i, rear)) return the concatenation of front and of drop(i, rear). That isfront' = rotateDrop(front, ceil_half_len, rear) put into front' the content of front and the content of rear that is not already in rear'. Since dropping n elements takes time, we use laziness to ensure that elements are dropped two by two, with two drops being done during each tail' and each insert' operation.

funrotateDrop(front,i,rear)=ifi<2thenrotateRev(front,drop(i,rear),NIL)elseletCONS(x,front')=frontinCONS(x,rotateDrop(front',j-2,drop(2,rear)))

where rotateRev(front, middle, rear) is a function that returns the front, followed by the middle reversed, followed by the rear. This function is also defined using laziness to ensure that it can be computed step by step, with one step executed during each insert' and tail' and taking a constant time. This function uses the invariant that |rear|-2|front| is 2 or 3.

funrotateRev(NIL,rear,a)=reverse(rear)++afunrotateRev(CONS(x,front),rear,a)=CONS(x,rotateRev(front,drop(2,rear),reverse(take(2,rear))++a))

where ++ is the function concatenating two lists.

Implementation without laziness

Note that, without the lazy part of the implementation, this would be a non-persistent implementation of queue in O(1) amortized time. In this case, the lists tail_front and tail_rear could be removed from the representation of the double-ended queue.

Language support

Ada's containers provides the generic packages Ada.Containers.Vectors and Ada.Containers.Doubly_Linked_Lists, for the dynamic array and linked list implementations, respectively.

C++'s Standard Template Library provides the class templates std::deque and std::list, for the multiple array and linked list implementations, respectively.

As of Java 6, Java's Collections Framework provides a new Deque interface that provides the functionality of insertion and removal at both ends. It is implemented by classes such as ArrayDeque (also new in Java 6) and LinkedList , providing the dynamic array and linked list implementations, respectively. However, the ArrayDeque, contrary to its name, does not support random access.

Javascript's Array prototype & Perl's arrays have native support for both removing (shift and pop) and adding (unshift and push) elements on both ends.

Python 2.4 introduced the collections module with support for deque objects. It is implemented using a doubly linked list of fixed-length subarrays.

As of PHP 5.3, PHP's SPL extension contains the 'SplDoublyLinkedList' class that can be used to implement Deque datastructures. Previously to make a Deque structure the array functions array_shift/unshift/pop/push had to be used instead.

GHC's Data.Sequence module implements an efficient, functional deque structure in Haskell. The implementation uses 2–3 finger trees annotated with sizes. There are other (fast) possibilities to implement purely functional (thus also persistent) double queues (most using heavily lazy evaluation). [3] [4] Kaplan and Tarjan were the first to implement optimal confluently persistent catenable deques. [5] Their implementation was strictly purely functional in the sense that it did not use lazy evaluation. Okasaki simplified the data structure by using lazy evaluation with a bootstrapped data structure and degrading the performance bounds from worst-case to amortized. [6] Kaplan, Okasaki, and Tarjan produced a simpler, non-bootstrapped, amortized version that can be implemented either using lazy evaluation or more efficiently using mutation in a broader but still restricted fashion. [7] Mihaescu and Tarjan created a simpler (but still highly complex) strictly purely functional implementation of catenable deques, and also a much simpler implementation of strictly purely functional non-catenable deques, both of which have optimal worst-case bounds. [8]

Rust's std::collections includes VecDeque which implements a double-ended queue using a growable ring buffer.

Complexity

Applications

A double-ended queue can be used to store the browsing history: new websites are added to the end of the queue, while the oldest entries will be deleted when the history is too large. When a user asks to clear the browsing history for the past hour, the most recently added entries are removed. Back button history (Ecosia browser - Google Pixel 4a) - Google Android 12 (cropped).png
A double-ended queue can be used to store the browsing history: new websites are added to the end of the queue, while the oldest entries will be deleted when the history is too large. When a user asks to clear the browsing history for the past hour, the most recently added entries are removed.

One example where a deque can be used is the work stealing algorithm. [9] This algorithm implements task scheduling for several processors. A separate deque with threads to be executed is maintained for each processor. To execute the next thread, the processor gets the first element from the deque (using the "remove first element" deque operation). If the current thread forks, it is put back to the front of the deque ("insert element at front") and a new thread is executed. When one of the processors finishes execution of its own threads (i.e. its deque is empty), it can "steal" a thread from another processor: it gets the last element from the deque of another processor ("remove last element") and executes it. The work stealing algorithm is used by Intel's Threading Building Blocks (TBB) library for parallel programming.

See also

Related Research Articles

<span class="mw-page-title-main">Heap (data structure)</span> Computer science data structure

In computer science, a heap is a tree-based data structure that satisfies the heap property: In a max heap, for any given node C, if P is a parent node of C, then the key of P is greater than or equal to the key of C. In a min heap, the key of P is less than or equal to the key of C. The node at the "top" of the heap is called the root node.

In computer science, a linked list is a linear collection of data elements whose order is not given by their physical placement in memory. Instead, each element points to the next. It is a data structure consisting of a collection of nodes which together represent a sequence. In its most basic form, each node contains data, and a reference to the next node in the sequence. This structure allows for efficient insertion or removal of elements from any position in the sequence during iteration. More complex variants add additional links, allowing more efficient insertion or removal of nodes at arbitrary positions. A drawback of linked lists is that data access time is linear in respect to the number of nodes in the list. Because nodes are serially linked, accessing any node requires that the prior node be accessed beforehand. Faster access, such as random access, is not feasible. Arrays have better cache locality compared to linked lists.

In computer science, a priority queue is an abstract data-type similar to a regular queue or stack data structure. Each element in a priority queue has an associated priority. In a priority queue, elements with high priority are served before elements with low priority. In some implementations, if two elements have the same priority, they are served in the same order in which they were enqueued. In other implementations, the order of elements with the same priority is undefined.

<span class="mw-page-title-main">Queue (abstract data type)</span> Abstract data type

In computer science, a queue is a collection of entities that are maintained in a sequence and can be modified by the addition of entities at one end of the sequence and the removal of entities from the other end of the sequence. By convention, the end of the sequence at which elements are added is called the back, tail, or rear of the queue, and the end at which elements are removed is called the head or front of the queue, analogously to the words used when people line up to wait for goods or services.

A splay tree is a binary search tree with the additional property that recently accessed elements are quick to access again. Like self-balancing binary search trees, a splay tree performs basic operations such as insertion, look-up and removal in O(log n) amortized time. For random access patterns drawn from a non-uniform random distribution, their amortized time can be faster than logarithmic, proportional to the entropy of the access pattern. For many patterns of non-random operations, also, splay trees can take better than logarithmic time, without requiring advance knowledge of the pattern. According to the unproven dynamic optimality conjecture, their performance on all access patterns is within a constant factor of the best possible performance that could be achieved by any other self-adjusting binary search tree, even one selected to fit that pattern. The splay tree was invented by Daniel Sleator and Robert Tarjan in 1985.

<span class="mw-page-title-main">Binary heap</span> Variant of heap data structure

A binary heap is a heap data structure that takes the form of a binary tree. Binary heaps are a common way of implementing priority queues. The binary heap was introduced by J. W. J. Williams in 1964, as a data structure for heapsort.

In computer science, a list or sequence is an abstract data type that represents a finite number of ordered values, where the same value may occur more than once. An instance of a list is a computer representation of the mathematical concept of a tuple or finite sequence; the (potentially) infinite analog of a list is a stream. Lists are a basic example of containers, as they contain other values. If the same value occurs multiple times, each occurrence is considered a distinct item.

In computer science, amortized analysis is a method for analyzing a given algorithm's complexity, or how much of a resource, especially time or memory, it takes to execute. The motivation for amortized analysis is that looking at the worst-case run time can be too pessimistic. Instead, amortized analysis averages the running times of operations in a sequence over that sequence. As a conclusion: "Amortized analysis is a useful tool that complements other techniques such as worst-case and average-case analysis."

In computer science, a Fibonacci heap is a data structure for priority queue operations, consisting of a collection of heap-ordered trees. It has a better amortized running time than many other priority queue data structures including the binary heap and binomial heap. Michael L. Fredman and Robert E. Tarjan developed Fibonacci heaps in 1984 and published them in a scientific journal in 1987. Fibonacci heaps are named after the Fibonacci numbers, which are used in their running time analysis.

In computing, a persistent data structure or not ephemeral data structure is a data structure that always preserves the previous version of itself when it is modified. Such data structures are effectively immutable, as their operations do not (visibly) update the structure in-place, but instead always yield a new updated structure. The term was introduced in Driscoll, Sarnak, Sleator, and Tarjan's 1986 article.

<span class="mw-page-title-main">Dynamic array</span> List data structure to which elements can be added/removed

In computer science, a dynamic array, growable array, resizable array, dynamic table, mutable array, or array list is a random access, variable-size list data structure that allows elements to be added or removed. It is supplied with standard libraries in many modern mainstream programming languages. Dynamic arrays overcome a limit of static arrays, which have a fixed capacity that needs to be specified at allocation.

A pairing heap is a type of heap data structure with relatively simple implementation and excellent practical amortized performance, introduced by Michael Fredman, Robert Sedgewick, Daniel Sleator, and Robert Tarjan in 1986. Pairing heaps are heap-ordered multiway tree structures, and can be considered simplified Fibonacci heaps. They are considered a "robust choice" for implementing such algorithms as Prim's MST algorithm, and support the following operations :

<span class="mw-page-title-main">Java collections framework</span> Collections in Java

The Java collections framework is a set of classes and interfaces that implement commonly reusable collection data structures.

In computer science, a purely functional data structure is a data structure that can be directly implemented in a purely functional language. The main difference between an arbitrary data structure and a purely functional one is that the latter is (strongly) immutable. This restriction ensures the data structure possesses the advantages of immutable objects: (full) persistency, quick copy of objects, and thread safety. Efficient purely functional data structures may require the use of lazy evaluation and memoization.

In computer science, a finger tree is a purely functional data structure that can be used to efficiently implement other functional data structures. A finger tree gives amortized constant time access to the "fingers" (leaves) of the tree, which is where data is stored, and concatenation and splitting logarithmic time in the size of the smaller piece. It also stores in each internal node the result of applying some associative operation to its descendants. This "summary" data stored in the internal nodes can be used to provide the functionality of data structures other than trees.

In computer science, the Brodal queue is a heap/priority queue structure with very low worst case time bounds: for insertion, find-minimum, meld and decrease-key and for delete-minimum and general deletion. They are the first heap variant to achieve these bounds without resorting to amortization of operational costs. Brodal queues are named after their inventor Gerth Stølting Brodal.

In computing, sequence containers refer to a group of container class templates in the standard library of the C++ programming language that implement storage of data elements. Being templates, they can be used to store arbitrary elements, such as integers or custom classes. One common property of all sequential containers is that the elements can be accessed sequentially. Like all other standard library components, they reside in namespace std.

In computer science, peek is an operation on certain abstract data types, specifically sequential collections such as stacks and queues, which returns the value of the top ("front") of the collection without removing the element from the collection. It thus returns the same value as operations such as "pop" or "dequeue", but does not modify the data.

In computer science, Iacono's working set structure is a comparison based dictionary. It supports insertion, deletion and access operation to maintain a dynamic set of elements. The working set of an item is the set of elements that have been accessed in the structure since the last time that was accessed . Inserting and deleting in the working set structure takes time while accessing an element takes . Here, represents the size of the working set of .

This is a comparison of the performance of notable data structures, as measured by the complexity of their logical operations. For a more comprehensive listing of data structures, see List of data structures.

References

  1. Jesse Liberty; Siddhartha Rao; Bradley Jones. C++ in One Hour a Day, Sams Teach Yourself, Sixth Edition. Sams Publishing, 2009. ISBN   0-672-32941-7. Lesson 18: STL Dynamic Array Classes, pp. 486.
  2. Donald Knuth. The Art of Computer Programming , Volume 1: Fundamental Algorithms, Third Edition. Addison-Wesley, 1997. ISBN   0-201-89683-4. Section 2.2.1: Stacks, Queues, and Deques, pp. 238243.
  3. 1 2 Okasaki, Chris (September 1996). Purely Functional Data Structures (PDF) (Ph.D. thesis). Carnegie Mellon University. CMU-CS-96-177.
  4. Adam L. Buchsbaum and Robert E. Tarjan. Confluently persistent deques via data structural bootstrapping. Journal of Algorithms, 18(3):513–547, May 1995. (pp. 58, 101, 125)
  5. Haim Kaplan and Robert E. Tarjan. Purely functional representations of catenable sorted lists. In ACM Symposium on Theory of Computing, pages 202–211, May 1996. (pp. 4, 82, 84, 124)
  6. Chris Okasaki (Aug. 1997), Catenable double-ended queues, ACM SIGPLAN Notices Volume 32 Issue 8
  7. Haim Kaplan, Chris Okasaki, and Robert E. Tarjan (2000), Simple Confluently Persistent Catenable Lists, SIAM Journal on Computing Vol. 30, Iss. 3
  8. Radu Mihaescu and Robert Tarjan (Aug. 2003), Notes on Catenable Deques in Pure Lisp, Princetown University, COS 528, Fall 03
  9. Blumofe, Robert D.; Leiserson, Charles E. (1999). "Scheduling multithreaded computations by work stealing" (PDF). J ACM. 46 (5): 720–748. doi:10.1145/324133.324234. S2CID   5428476.