Pairing heap

Last updated

A pairing heap is a type of heap data structure with relatively simple implementation and excellent practical amortized performance, introduced by Michael Fredman, Robert Sedgewick, Daniel Sleator, and Robert Tarjan in 1986. [1] Pairing heaps are heap-ordered multiway tree structures, and can be considered simplified Fibonacci heaps. They are considered a "robust choice" for implementing such algorithms as Prim's MST algorithm, [2] and support the following operations (assuming a min-heap):

Contents

The analysis of pairing heaps' time complexity was initially inspired by that of splay trees. [1] The amortized time per delete-min is O(log n), and the operations find-min, meld, and insert run in O(1) time. [3]

When a decrease-key operation is added as well, determining the precise asymptotic running time of pairing heaps has turned out to be difficult. Initially, the time complexity of this operation was conjectured on empirical grounds to be O(1), [4] but Fredman proved that the amortized time per decrease-key is at least for some sequences of operations. [5] Using a different amortization argument, Pettie then proved that insert, meld, and decrease-key all run in amortized time, which is . [6] Elmasry later introduced elaborations of pairing heaps (lazy, consolidate) for which decrease-key runs in amortized time and other operations have optimal amortized bounds, [7] [8] but no tight bound is known for the original data structure. [3] [6]

Although the asymptotic performance of pairing heaps is worse than other priority queue algorithms such as Fibonacci heaps, which perform decrease-key in amortized time, the performance in practice is excellent. Jones [9] and Larkin, Sen, and Tarjan [10] conducted experiments on pairing heaps and other heap data structures. They concluded that d-ary heaps such as binary heaps are faster than all other heap implementations when the decrease-key operation is not needed (and hence there is no need to externally track the location of nodes in the heap), but that when decrease-key is needed pairing heaps are often faster than d-ary heaps and almost always faster than other pointer-based heaps, including data structures like Fibonacci heaps that are theoretically more efficient. Chen et al. [11] examined priority queues specifically for use with Dijkstra's algorithm and concluded that in normal cases using a d-ary heap without decrease-key (instead duplicating nodes on the heap and ignoring redundant instances) resulted in better performance, despite the inferior theoretical performance guarantees.

Structure

A pairing heap is either an empty heap, or a pairing tree consisting of a root element and a possibly empty list of pairing trees. The heap ordering property requires that parent of any node is no greater than the node itself. The following description assumes a purely functional heap that does not support the decrease-key operation.

type PairingTree[Elem] = Heap(elem: Elem, subheaps: List[PairingTree[Elem]]) type PairingHeap[Elem] = Empty | PairingTree[Elem]

A pointer-based implementation for RAM machines, supporting decrease-key, can be achieved using three pointers per node, by representing the children of a node by a doubly-linked list: a pointer to the node's first child, one to its next sibling, and one to its previous sibling (or, for the leftmost sibling, to its parent). It can also be viewed as a variant of a Left-child right-sibling binary tree with an additional pointer to a node's parent (which represents its previous sibling or actual parent for the leftmost sibling). Alternatively, the previous-pointer can be omitted by letting the last child point back to the parent, if a single boolean flag is added to indicate "end of list". This achieves a more compact structure at the expense of a constant overhead factor per operation. [1]

Operations

Meld

Melding with an empty heap returns the other heap, otherwise a new heap is returned that has the minimum of the two root elements as its root element and just adds the heap with the larger root to the list of subheaps:

function meld(heap1, heap2: PairingHeap[Elem]) -> PairingHeap[Elem]     if heap1 is Empty         return heap2     elsif heap2 is Empty         return heap1     elsif heap1.elem < heap2.elem         return Heap(heap1.elem, heap2 :: heap1.subheaps)     elsereturn Heap(heap2.elem, heap1 :: heap2.subheaps)

Insert

The easiest way to insert an element into a heap is to meld the heap with a new heap containing just this element and an empty list of subheaps:

function insert(elem: Elem, heap: PairingHeap[Elem]) -> PairingHeap[Elem]     return meld(Heap(elem, []), heap)

Find-min

The function find-min simply returns the root element of the heap:

function find-min(heap: PairingHeap[Elem]) -> Elem     if heap is Empty         errorelsereturn heap.elem

Delete-min

The only non-trivial fundamental operation is the deletion of the minimum element from the heap. This requires performing repeated melds of its children until only one tree remains. The standard strategy first melds the subheaps in pairs (this is the step that gave this data structure its name) from left to right and then melds the resulting list of heaps from right to left:

function delete-min(heap: PairingHeap[Elem]) -> PairingHeap[Elem]     if heap is Empty         errorelsereturn merge-pairs(heap.subheaps)

This uses the auxiliary function merge-pairs:

function merge-pairs(list: List[PairingTree[Elem]]) -> PairingHeap[Elem]     if length(list) == 0         return Empty     elsif length(list) == 1         return list[0]     elsereturn meld(meld(list[0], list[1]), merge-pairs(list[2..]))

That this does indeed implement the described two-pass left-to-right then right-to-left merging strategy can be seen from this reduction:

   merge-pairs([H1, H2, H3, H4, H5, H6, H7]) => meld(meld(H1, H2), merge-pairs([H3, H4, H5, H6, H7]))      # meld H1 and H2 to H12, then the rest of the list => meld(H12, meld(meld(H3, H4), merge-pairs([H5, H6, H7])))      # meld H3 and H4 to H34, then the rest of the list => meld(H12, meld(H34, meld(meld(H5, H6), merge-pairs([H7]))))      # meld H5 and H6 to H56, then the rest of the list => meld(H12, meld(H34, meld(H56, H7)))      # switch direction, meld the last two resulting heaps, giving H567 => meld(H12, meld(H34, H567))      # meld the last two resulting heaps, giving H34567 => meld(H12, H34567)       # finally, meld the first pair with the result of merging the rest => H1234567

Summary of running times

Here are time complexities [12] of various heap data structures. Function names assume a min-heap. For the meaning of "O(f)" and "Θ(f)" see Big O notation.

Operationfind-mindelete-mininsertdecrease-keymeld
Binary [12] Θ(1)Θ(log n)O(log n)O(log n)Θ(n)
Leftist Θ(1)Θ(log n)Θ(log n)O(log n)Θ(log n)
Binomial [12] [13] Θ(1)Θ(log n)Θ(1) [lower-alpha 1] Θ(log n)O(log n)
Skew binomial [14] Θ(1)Θ(log n)Θ(1)Θ(log n)O(log n) [lower-alpha 2]
Pairing [3] Θ(1)O(log n) [lower-alpha 1] Θ(1)o(log n) [lower-alpha 1] [lower-alpha 3] Θ(1)
Rank-pairing [17] Θ(1)O(log n) [lower-alpha 1] Θ(1)Θ(1) [lower-alpha 1] Θ(1)
Fibonacci [12] [18] Θ(1)O(log n) [lower-alpha 1] Θ(1)Θ(1) [lower-alpha 1] Θ(1)
Strict Fibonacci [19] Θ(1)O(log n)Θ(1)Θ(1)Θ(1)
Brodal [20] [lower-alpha 4] Θ(1)O(log n)Θ(1)Θ(1)Θ(1)
2–3 heap [22] O(log n)O(log n) [lower-alpha 1] O(log n) [lower-alpha 1] Θ(1)?
  1. 1 2 3 4 5 6 7 8 9 Amortized time.
  2. Brodal and Okasaki describe a technique to reduce the worst-case complexity of meld to Θ(1); this technique applies to any heap datastructure that has insert in Θ(1) and find-min, delete-min, meld in O(log n).
  3. Lower bound of [15] upper bound of [16]
  4. Brodal and Okasaki later describe a persistent variant with the same bounds except for decrease-key, which is not supported. Heaps with n elements can be constructed bottom-up in O(n). [21]

Related Research Articles

<span class="mw-page-title-main">Heap (data structure)</span> Computer science data structure

In computer science, a heap is a tree-based data structure that satisfies the heap property: In a max heap, for any given node C, if P is a parent node of C, then the key of P is greater than or equal to the key of C. In a min heap, the key of P is less than or equal to the key of C. The node at the "top" of the heap is called the root node.

In computer science, a priority queue is an abstract data-type similar to a regular queue or stack data structure. Each element in a priority queue has an associated priority. In a priority queue, elements with high priority are served before elements with low priority. In some implementations, if two elements have the same priority, they are served in the same order in which they were enqueued. In other implementations, the order of elements with the same priority is undefined.

A splay tree is a binary search tree with the additional property that recently accessed elements are quick to access again. Like self-balancing binary search trees, a splay tree performs basic operations such as insertion, look-up and removal in O(log n) amortized time. For random access patterns drawn from a non-uniform random distribution, their amortized time can be faster than logarithmic, proportional to the entropy of the access pattern. For many patterns of non-random operations, also, splay trees can take better than logarithmic time, without requiring advance knowledge of the pattern. According to the unproven dynamic optimality conjecture, their performance on all access patterns is within a constant factor of the best possible performance that could be achieved by any other self-adjusting binary search tree, even one selected to fit that pattern. The splay tree was invented by Daniel Sleator and Robert Tarjan in 1985.

<span class="mw-page-title-main">Dijkstra's algorithm</span> Graph search algorithm

Dijkstra's algorithm is an algorithm for finding the shortest paths between nodes in a weighted graph, which may represent, for example, road networks. It was conceived by computer scientist Edsger W. Dijkstra in 1956 and published three years later.

<span class="mw-page-title-main">Binary heap</span> Variant of heap data structure

A binary heap is a heap data structure that takes the form of a binary tree. Binary heaps are a common way of implementing priority queues. The binary heap was introduced by J. W. J. Williams in 1964, as a data structure for heapsort.

In computer science, a binomial heap is a data structure that acts as a priority queue but also allows pairs of heaps to be merged. It is important as an implementation of the mergeable heap abstract data type, which is a priority queue supporting merge operation. It is implemented as a heap similar to a binary heap but using a special tree structure that is different from the complete binary trees used by binary heaps. Binomial heaps were invented in 1978 by Jean Vuillemin.

In computer science, a Fibonacci heap is a data structure for priority queue operations, consisting of a collection of heap-ordered trees. Fibonacci heaps were originally explained to be an extension of binomial queues. It has a better amortized running time than many other priority queue data structures including the binary heap and binomial heap. Michael L. Fredman and Robert E. Tarjan developed Fibonacci heaps in 1984 and published them in a scientific journal in 1987. Fibonacci heaps are named after the Fibonacci numbers, which are used in their running time analysis.

In computer science, a soft heap is a variant on the simple heap data structure that has constant amortized time complexity for 5 types of operations. This is achieved by carefully "corrupting" (increasing) the keys of at most a constant number of values in the heap.

In computer science, a 2–3 heap is a data structure, a variation on the heap, designed by Tadao Takaoka in 1999. The structure is similar to the Fibonacci heap, and borrows from the 2–3 tree.

In computer science, a disjoint-set data structure, also called a union–find data structure or merge–find set, is a data structure that stores a collection of disjoint (non-overlapping) sets. Equivalently, it stores a partition of a set into disjoint subsets. It provides operations for adding new sets, merging sets, and finding a representative member of a set. The last operation makes it possible to find out efficiently if any two elements are in the same or different sets.

In computer science, a leftist tree or leftist heap is a priority queue implemented with a variant of a binary heap. Every node x has an s-value which is the distance to the nearest leaf in subtree rooted at x. In contrast to a binary heap, a leftist tree attempts to be very unbalanced. In addition to the heap property, leftist trees are maintained so the right descendant of each node has the lower s-value.

A skew heap is a heap data structure implemented as a binary tree. Skew heaps are advantageous because of their ability to merge more quickly than binary heaps. In contrast with binary heaps, there are no structural constraints, so there is no guarantee that the height of the tree is logarithmic. Only two conditions must be satisfied:

A link/cut tree is a data structure for representing a forest, a set of rooted trees, and offers the following operations:

The d-ary heap or d-heap is a priority queue data structure, a generalization of the binary heap in which the nodes have d children instead of 2. Thus, a binary heap is a 2-heap, and a ternary heap is a 3-heap. According to Tarjan and Jensen et al., d-ary heaps were invented by Donald B. Johnson in 1975.

<span class="mw-page-title-main">Cartesian tree</span> Binary tree derived from a sequence of numbers

In computer science, a Cartesian tree is a binary tree derived from a sequence of distinct numbers. To construct the Cartesian tree, set its root to be the minimum number in the sequence, and recursively construct its left and right subtrees from the subsequences before and after this number. It is uniquely defined as a min-heap whose symmetric (in-order) traversal returns the original sequence.

In computer science, a skew binomial heap is a variant of the binomial heap that supports constant-time insertion operations in the worst case, rather than the logarithmic worst case and constant amortized time of ordinary binomial heaps. Just as binomial heaps are based on the binary number system, skew binary heaps are based on the skew binary number system.

In computer science, the Brodal queue is a heap/priority queue structure with very low worst case time bounds: for insertion, find-minimum, meld and decrease-key and for delete-minimum and general deletion. They are the first heap variant to achieve these bounds without resorting to amortization of operational costs. Brodal queues are named after their inventor Gerth Stølting Brodal.

In computer science, a randomized meldable heap is a priority queue based data structure in which the underlying structure is also a heap-ordered binary tree. However, there are no restrictions on the shape of the underlying binary tree.

This is a comparison of the performance of notable data structures, as measured by the complexity of their logical operations. For a more comprehensive listing of data structures, see List of data structures.

In computer science, a strict Fibonacci heap is a priority queue data structure with worst case time bounds equal to the amortized bounds of the Fibonacci heap. Strict Fibonacci heaps belong to a class of asymptotically optimal data structures for priority queues. All operations on strict Fibonacci heaps run in worst case constant time except delete-min, which is necessarily logarithmic. This is optimal, because any priority queue can be used to sort a list of elements by performing insertions and delete-min operations. Strict Fibonacci heaps were invented in 2012 by Gerth Stølting Brodal, George Lagogiannis, and Robert E. Tarjan.

References

  1. 1 2 3 Fredman, Michael L.; Sedgewick, Robert; Sleator, Daniel D.; Tarjan, Robert E. (1986). "The pairing heap: a new form of self-adjusting heap" (PDF). Algorithmica. 1 (1–4): 111–129. doi:10.1007/BF01840439. S2CID   23664143.
  2. Mehlhorn, Kurt; Sanders, Peter (2008). Algorithms and Data Structures: The Basic Toolbox (PDF). Springer. p. 231.
  3. 1 2 3 Iacono, John (2000), "Improved upper bounds for pairing heaps", Proc. 7th Scandinavian Workshop on Algorithm Theory (PDF), Lecture Notes in Computer Science, vol. 1851, Springer-Verlag, pp. 63–77, arXiv: 1110.4428 , CiteSeerX   10.1.1.748.7812 , doi:10.1007/3-540-44985-X_5, ISBN   3-540-67690-2
  4. Stasko, John T.; Vitter, Jeffrey S. (1987), "Pairing heaps: experiments and analysis" (PDF), Communications of the ACM, 30 (3): 234–249, CiteSeerX   10.1.1.106.2988 , doi:10.1145/214748.214759, S2CID   17811811
  5. Fredman, Michael L. (1999). "On the efficiency of pairing heaps and related data structures" (PDF). Journal of the ACM. 46 (4): 473–501. doi:10.1145/320211.320214. S2CID   16115266. Archived from the original (PDF) on 2011-07-21. Retrieved 2011-05-03.
  6. 1 2 Pettie, Seth (2005), "Towards a final analysis of pairing heaps" (PDF), Proc. 46th Annual IEEE Symposium on Foundations of Computer Science (PDF), pp. 174–183, doi:10.1109/SFCS.2005.75, ISBN   0-7695-2468-0, S2CID   2717102
  7. Elmasry, Amr (2009), "Pairing heaps with O(log log n) decrease cost" (PDF), Proc. 20th Annual ACM-SIAM Symposium on Discrete Algorithms , pp. 471–476, CiteSeerX   10.1.1.502.6706 , doi:10.1137/1.9781611973068.52
  8. Elmasry, Amr (November 2017). "Toward Optimal Self-Adjusting Heaps". ACM Transactions on Algorithms. 13 (4): 1–14. doi:10.1145/3147138. S2CID   1182235.
  9. Jones, Douglas W. (1986). "An empirical comparison of priority-queue and event-set implementations". Communications of the ACM. 29 (4): 300–311. CiteSeerX   10.1.1.117.9380 . doi:10.1145/5684.5686. S2CID   43650389.
  10. Larkin, Daniel H.; Sen, Siddhartha; Tarjan, Robert E. (2014), "A back-to-basics empirical study of priority queues", Proceedings of the 16th Workshop on Algorithm Engineering and Experiments, pp. 61–72, arXiv: 1403.0252 , doi:10.1137/1.9781611973198.7, ISBN   978-1-61197-319-8, S2CID   15216766
  11. Chen, Mo; Chowdhury, Rezaul Alam; Ramachandran, Vijaya; Roche, David Lan; Tong, Lingling (12 October 2007). Priority Queues and Dijkstra's Algorithm (PDF) (Technical report). University of Texas. TR-07-54.{{cite tech report}}: CS1 maint: date and year (link)
  12. 1 2 3 4 Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L. (1990). Introduction to Algorithms (1st ed.). MIT Press and McGraw-Hill. ISBN   0-262-03141-8.
  13. "Binomial Heap | Brilliant Math & Science Wiki". brilliant.org. Retrieved 2019-09-30.
  14. Brodal, Gerth Stølting; Okasaki, Chris (November 1996), "Optimal purely functional priority queues", Journal of Functional Programming, 6 (6): 839–857, doi: 10.1017/s095679680000201x
  15. Fredman, Michael Lawrence (July 1999). "On the Efficiency of Pairing Heaps and Related Data Structures" (PDF). Journal of the Association for Computing Machinery . 46 (4): 473–501. doi:10.1145/320211.320214.
  16. Pettie, Seth (2005). Towards a Final Analysis of Pairing Heaps (PDF). FOCS '05 Proceedings of the 46th Annual IEEE Symposium on Foundations of Computer Science. pp. 174–183. CiteSeerX   10.1.1.549.471 . doi:10.1109/SFCS.2005.75. ISBN   0-7695-2468-0.
  17. Haeupler, Bernhard; Sen, Siddhartha; Tarjan, Robert E. (November 2011). "Rank-pairing heaps" (PDF). SIAM J. Computing. 40 (6): 1463–1485. doi:10.1137/100785351.
  18. Fredman, Michael Lawrence; Tarjan, Robert E. (July 1987). "Fibonacci heaps and their uses in improved network optimization algorithms" (PDF). Journal of the Association for Computing Machinery . 34 (3): 596–615. CiteSeerX   10.1.1.309.8927 . doi:10.1145/28869.28874.
  19. Brodal, Gerth Stølting; Lagogiannis, George; Tarjan, Robert E. (2012). Strict Fibonacci heaps (PDF). Proceedings of the 44th symposium on Theory of Computing - STOC '12. pp. 1177–1184. CiteSeerX   10.1.1.233.1740 . doi:10.1145/2213977.2214082. ISBN   978-1-4503-1245-5.
  20. Brodal, Gerth S. (1996), "Worst-Case Efficient Priority Queues" (PDF), Proc. 7th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 52–58
  21. Goodrich, Michael T.; Tamassia, Roberto (2004). "7.3.6. Bottom-Up Heap Construction". Data Structures and Algorithms in Java (3rd ed.). pp. 338–341. ISBN   0-471-46983-1.
  22. Takaoka, Tadao (1999), Theory of 2–3 Heaps (PDF), p. 12