Merge algorithm

Last updated November 15, 2024

Merge algorithms are a family of algorithms that take multiple sorted lists as input and produce a single list as output, containing all the elements of the inputs lists in sorted order. These algorithms are used as subroutines in various sorting algorithms, most famously merge sort.

Application

The merge algorithm plays a critical role in the merge sort algorithm, a comparison-based sorting algorithm. Conceptually, the merge sort algorithm consists of two steps:

Recursively divide the list into sublists of (roughly) equal length, until each sublist contains only one element, or in the case of iterative (bottom up) merge sort, consider a list of n elements as n sub-lists of size 1. A list containing a single element is, by definition, sorted.
Repeatedly merge sublists to create a new sorted sublist until the single list contains all elements. The single list is the sorted list.

The merge algorithm is used repeatedly in the merge sort algorithm.

An example merge sort is given in the illustration. It starts with an unsorted array of 7 integers. The array is divided into 7 partitions; each partition contains 1 element and is sorted. The sorted partitions are then merged to produce larger, sorted, partitions, until 1 partition, the sorted array, is left.

Merging two lists

Merging two sorted lists into one can be done in linear time and linear or constant space (depending on the data access model). The following pseudocode demonstrates an algorithm that merges input lists (either linked lists or arrays) $A$ and $B$ into a new list $C$ .^[1]^[2]^: 104 The function head yields the first element of a list; "dropping" an element means removing it from its list, typically by incrementing a pointer or index.

algorithm merge(A, B) isinputs A, B : list     returns list      C := new empty list     while A is not empty and B is not empty doif head(A) ≤ head(B) then             append head(A) to C             drop the head of A         else             append head(B) to C             drop the head of B      // By now, either A or B is empty. It remains to empty the other input list.while A is not empty do         append head(A) to C         drop the head of A     while B is not empty do         append head(B) to C         drop the head of B      return C

When the inputs are linked lists, this algorithm can be implemented to use only a constant amount of working space; the pointers in the lists' nodes can be reused for bookkeeping and for constructing the final merged list.

In the merge sort algorithm, this subroutine is typically used to merge two sub-arrays A[lo..mid], A[mid+1..hi] of a single array A. This can be done by copying the sub-arrays into a temporary array, then applying the merge algorithm above.^[1] The allocation of a temporary array can be avoided, but at the expense of speed and programming ease. Various in-place merge algorithms have been devised,^[3] sometimes sacrificing the linear-time bound to produce an $O (n log n)$ algorithm;^[4] see Merge sort § Variants for discussion.

K-way merging

$k$ -way merging generalizes binary merging to an arbitrary number $k$ of sorted input lists. Applications of $k$ -way merging arise in various sorting algorithms, including patience sorting ^[5] and an external sorting algorithm that divides its input into $k = .mw-parser-output .sfrac{white-space:nowrap}.mw-parser-output .sfrac.tion,.mw-parser-output .sfrac .tion{display:inline-block;vertical-align:-0.5em;font-size:85%;text-align:center}.mw-parser-output .sfrac .num{display:block;line-height:1em;margin:0.0em 0.1em;border-bottom:1px solid}.mw-parser-output .sfrac .den{display:block;line-height:1em;margin:0.1em 0.1em}.mw-parser-output .sr-only{border:0;clip:rect(0,0,0,0);clip-path:polygon(0px 0px,0px 0px,0px 0px);height:1px;margin:-1px;overflow:hidden;padding:0;position:absolute;width:1px}⁠1/M⁠ − 1$ blocks that fit in memory, sorts these one by one, then merges these blocks.^[2]^{: 119–120}

Several solutions to this problem exist. A naive solution is to do a loop over the $k$ lists to pick off the minimum element each time, and repeat this loop until all lists are empty:

Input: a list of $k$ lists.
While any of the lists is non-empty:
- Loop over the lists to find the one with the minimum first element.
- Output the minimum element and remove it from its list.

In the worst case, this algorithm performs $(k -1)(n - ⁠ k / 2 ⁠)$ element comparisons to perform its work if there are a total of $n$ elements in the lists.^[6] It can be improved by storing the lists in a priority queue (min-heap) keyed by their first element:

Build a min-heap $h$ of the $k$ lists, using the first element as the key.
While any of the lists is non-empty:
- Let $i = find-min(h)$ .
- Output the first element of list $i$ and remove it from its list.
- Re-heapify $h$ .

Searching for the next smallest element to be output (find-min) and restoring heap order can now be done in $O (log k)$ time (more specifically, $2⌊log k ⌋$ comparisons^[6]), and the full problem can be solved in $O (n log k)$ time (approximately $2 n ⌊log k ⌋$ comparisons).^[6]^[2]^{: 119–120}

A third algorithm for the problem is a divide and conquer solution that builds on the binary merge algorithm:

If $k = 1$ , output the single input list.
If $k = 2$ , perform a binary merge.
Else, recursively merge the first $⌊ k /2⌋$ lists and the final $⌈ k /2⌉$ lists, then binary merge these.

When the input lists to this algorithm are ordered by length, shortest first, it requires fewer than $n ⌈log k ⌉$ comparisons, i.e., less than half the number used by the heap-based algorithm; in practice, it may be about as fast or slow as the heap-based algorithm.^[6]

Parallel merge

A parallel version of the binary merge algorithm can serve as a building block of a parallel merge sort. The following pseudocode demonstrates this algorithm in a parallel divide-and-conquer style (adapted from Cormen et al.^[7]^: 800). It operates on two sorted arrays $A$ and $B$ and writes the sorted output to array $C$ . The notation A[i...j] denotes the part of $A$ from index $i$ through $j$ , exclusive.

algorithm merge(A[i...j], B[k...ℓ], C[p...q]) isinputs A, B, C : array            i, j, k, ℓ, p, q : indices      let m = j - i,         n = ℓ - k      if m < n then         swap A and B  // ensure that A is the larger array: i, j still belong to A; k, ℓ to B         swap m and n      if m ≤ 0 thenreturn// base case, nothing to mergelet r = ⌊(i + j)/2⌋     let s = binary-search(A[r], B[k...ℓ])     let t = p + (r - i) + (s - k)     C[t] = A[r]      in parallel do         merge(A[i...r], B[k...s], C[p...t])         merge(A[r+1...j], B[s...ℓ], C[t+1...q])

The algorithm operates by splitting either $A$ or $B$ , whichever is larger, into (nearly) equal halves. It then splits the other array into a part with values smaller than the midpoint of the first, and a part with larger or equal values. (The binary search subroutine returns the index in $B$ where $A [r]$ would be, if it were in $B$ ; that this always a number between $k$ and $ℓ$ .) Finally, each pair of halves is merged recursively, and since the recursive calls are independent of each other, they can be done in parallel. Hybrid approach, where serial algorithm is used for recursion base case has been shown to perform well in practice ^[8]

The work performed by the algorithm for two arrays holding a total of $n$ elements, i.e., the running time of a serial version of it, is $O (n)$ . This is optimal since $n$ elements need to be copied into $C$ . To calculate the span of the algorithm, it is necessary to derive a Recurrence relation. Since the two recursive calls of merge are in parallel, only the costlier of the two calls needs to be considered. In the worst case, the maximum number of elements in one of the recursive calls is at most ${\textstyle {\frac {3}{4}}n}$ since the array with more elements is perfectly split in half. Adding the $\Theta \left(\log(n)\right)$ cost of the Binary Search, we obtain this recurrence as an upper bound:

$T_{\infty }^{\text{merge}}(n)=T_{\infty }^{\text{merge}}\left({\frac {3}{4}}n\right)+\Theta \left(\log(n)\right)$

The solution is $T_{\infty }^{\text{merge}}(n)=\Theta \left(\log(n)^{2}\right)$ , meaning that it takes that much time on an ideal machine with an unbounded number of processors.^[7]^{: 801–802}

Note: The routine is not stable: if equal items are separated by splitting $A$ and $B$ , they will become interleaved in $C$ ; also swapping $A$ and $B$ will destroy the order, if equal items are spread among both input arrays. As a result, when used for sorting, this algorithm produces a sort that is not stable.

Parallel merge of two lists

There are also algorithms that introduce parallelism within a single instance of merging of two sorted lists. These can be used in field-programmable gate arrays (FPGAs), specialized sorting circuits, as well as in modern processors with single-instruction multiple-data (SIMD) instructions.

Existing parallel algorithms are based on modifications of the merge part of either the bitonic sorter or odd-even mergesort.^[9] In 2018, Saitoh M. et al. introduced MMS ^[10] for FPGAs, which focused on removing a multi-cycle feedback datapath that prevented efficient pipelining in hardware. Also in 2018, Papaphilippou P. et al. introduced FLiMS ^[9] that improved the hardware utilization and performance by only requiring $\log _{2}(P)+1$ pipeline stages of $P/2$ compare-and-swap units to merge with a parallelism of $P$ elements per FPGA cycle.

Language support

Some computer languages provide built-in or library support for merging sorted collections.

C++

The C++'s Standard Template Library has the function std::merge, which merges two sorted ranges of iterators, and std::inplace_merge, which merges two consecutive sorted ranges in-place. In addition, the std::list (linked list) class has its own merge method which merges another list into itself. The type of the elements merged must support the less-than (<) operator, or it must be provided with a custom comparator.

C++17 allows for differing execution policies, namely sequential, parallel, and parallel-unsequenced.^[11]

Python

Python's standard library (since 2.6) also has a merge function in the heapq module, that takes multiple sorted iterables, and merges them into a single iterator.^[12]

Related Research Articles

<span class="mw-page-title-main">Heapsort</span> A sorting algorithm which uses the heap data structure

In computer science, heapsort is a comparison-based sorting algorithm which can be thought of as "an implementation of selection sort using the right data structure." Like selection sort, heapsort divides its input into a sorted and an unsorted region, and it iteratively shrinks the unsorted region by extracting the largest element from it and inserting it into the sorted region. Unlike selection sort, heapsort does not waste time with a linear-time scan of the unsorted region; rather, heap sort maintains the unsorted region in a heap data structure to efficiently find the largest element in each step.

<span class="mw-page-title-main">Heap (data structure)</span> Computer science data structure

In computer science, a heap is a tree-based data structure that satisfies the heap property: In a max heap, for any given node C, if P is a parent node of C, then the key of P is greater than or equal to the key of C. In a min heap, the key of P is less than or equal to the key of C. The node at the "top" of the heap is called the root node.

<span class="mw-page-title-main">Insertion sort</span> Sorting algorithm

Insertion sort is a simple sorting algorithm that builds the final sorted array (or list) one item at a time by comparisons. It is much less efficient on large lists than more advanced algorithms such as quicksort, heapsort, or merge sort. However, insertion sort provides several advantages:

<span class="mw-page-title-main">Merge sort</span> Divide and conquer sorting algorithm

In computer science, merge sort is an efficient, general-purpose, and comparison-based sorting algorithm. Most implementations produce a stable sort, which means that the relative order of equal elements is the same in the input and output. Merge sort is a divide-and-conquer algorithm that was invented by John von Neumann in 1945. A detailed description and analysis of bottom-up merge sort appeared in a report by Goldstine and von Neumann as early as 1948.

In computer science, a priority queue is an abstract data type similar to a regular queue or stack abstract data type. Each element in a priority queue has an associated priority. In a priority queue, elements with high priority are served before elements with low priority. In some implementations, if two elements have the same priority, they are served in the same order in which they were enqueued. In other implementations, the order of elements with the same priority is undefined.

In computer science, radix sort is a non-comparative sorting algorithm. It avoids comparison by creating and distributing elements into buckets according to their radix. For elements with more than one significant digit, this bucketing process is repeated for each digit, while preserving the ordering of the prior step, until all digits have been considered. For this reason, radix sort has also been called bucket sort and digital sort.

<span class="mw-page-title-main">Sorting algorithm</span> Algorithm that arranges lists in order

In computer science, a sorting algorithm is an algorithm that puts elements of a list into an order. The most frequently used orders are numerical order and lexicographical order, and either ascending or descending. Efficient sorting is important for optimizing the efficiency of other algorithms that require input data to be in sorted lists. Sorting is also often useful for canonicalizing data and for producing human-readable output.

In computer science, selection sort is an in-place comparison sorting algorithm. It has an O(n²) time complexity, which makes it inefficient on large lists, and generally performs worse than the similar insertion sort. Selection sort is noted for its simplicity and has performance advantages over more complicated algorithms in certain situations, particularly where auxiliary memory is limited.

Bucket sort, or bin sort, is a sorting algorithm that works by distributing the elements of an array into a number of buckets. Each bucket is then sorted individually, either using a different sorting algorithm, or by recursively applying the bucket sorting algorithm. It is a distribution sort, a generalization of pigeonhole sort that allows multiple keys per bucket, and is a cousin of radix sort in the most-to-least significant digit flavor. Bucket sort can be implemented with comparisons and therefore can also be considered a comparison sort algorithm. The computational complexity depends on the algorithm used to sort each bucket, the number of buckets to use, and whether the input is uniformly distributed.

<span class="mw-page-title-main">Smoothsort</span> Comparison-based sorting algorithm

In computer science, smoothsort is a comparison-based sorting algorithm. A variant of heapsort, it was invented and published by Edsger Dijkstra in 1981. Like heapsort, smoothsort is an in-place algorithm with an upper bound of $O (n log n)$ operations (see big O notation), but it is not a stable sort. The advantage of smoothsort is that it comes closer to $O (n)$ time if the input is already sorted to some degree, whereas heapsort averages $O (n log n)$ regardless of the initial sorted state.

In computer science, a selection algorithm is an algorithm for finding the $th smallest value in a collection of ordered values, such as numbers. The value that it finds is called the th order statistic. Selection includes as special cases the problems of finding the minimum, median, and maximum element in the collection. Selection algorithms include quickselect, and the median of medians algorithm. When applied to a collection of values, these algorithms take linear time, as expressed using big O notation. For data that is already structured, faster algorithms may be possible; as an extreme case, selection in an already-sorted array takes time .$

<span class="mw-page-title-main">Quicksort</span> Divide and conquer sorting algorithm

Quicksort is an efficient, general-purpose sorting algorithm. Quicksort was developed by British computer scientist Tony Hoare in 1959 and published in 1961. It is still a commonly used algorithm for sorting. Overall, it is slightly faster than merge sort and heapsort for randomized data, particularly on larger distributions.

sort is a generic function in the C++ Standard Library for doing comparison sorting. The function originated in the Standard Template Library (STL).

Spreadsort is a sorting algorithm invented by Steven J. Ross in 2002. It combines concepts from distribution-based sorts, such as radix sort and bucket sort, with partitioning concepts from comparison sorts such as quicksort and mergesort. In experimental results it was shown to be highly efficient, often outperforming traditional algorithms such as quicksort, particularly on distributions exhibiting structure and string sorting. There is an open-source implementation with performance analysis and benchmarks, and HTML documentation .

In computer science, a Cartesian tree is a binary tree derived from a sequence of distinct numbers. To construct the Cartesian tree, set its root to be the minimum number in the sequence, and recursively construct its left and right subtrees from the subsequences before and after this number. It is uniquely defined as a min-heap whose symmetric (in-order) traversal returns the original sequence.

Samplesort is a sorting algorithm that is a divide and conquer algorithm often used in parallel processing systems. Conventional divide and conquer sorting algorithms partitions the array into sub-intervals or buckets. The buckets are then sorted individually and then concatenated together. However, if the array is non-uniformly distributed, the performance of these sorting algorithms can be significantly throttled. Samplesort addresses this issue by selecting a sample of size $s$ from the $n$ -element sequence, and determining the range of the buckets by sorting the sample and choosing $p -1 < s$ elements from the result. These elements then divide the array into $p$ approximately equal-sized buckets. Samplesort is described in the 1970 paper, "Samplesort: A Sampling Approach to Minimal Storage Tree Sorting", by W. D. Frazer and A. C. McKellar.

In computer science, integer sorting is the algorithmic problem of sorting a collection of data values by integer keys. Algorithms designed for integer sorting may also often be applied to sorting problems in which the keys are floating point numbers, rational numbers, or text strings. The ability to perform integer arithmetic on the keys allows integer sorting algorithms to be faster than comparison sorting algorithms in many cases, depending on the details of which operations are allowed in the model of computing and how large the integers to be sorted are.

In computer science, partial sorting is a relaxed variant of the sorting problem. Total sorting is the problem of returning a list of items such that its elements all appear in order, while partial sorting is returning a list of the k smallest elements in order. The other elements may also be sorted, as in an in-place partial sort, or may be discarded, which is common in streaming partial sorts. A common practical example of partial sorting is computing the "Top 100" of some list.

In computer science, k-way merge algorithms or multiway merges are a specific type of sequence merge algorithms that specialize in taking in k sorted lists and merging them into a single sorted list. These merge algorithms generally refer to merge algorithms that take in a number of sorted lists greater than two. Two-way merges are also referred to as binary merges. The k-way merge is also an external sorting algorithm.

<span class="mw-page-title-main">Merge-insertion sort</span> Type of comparison sorting algorithm

In computer science, merge-insertion sort or the Ford–Johnson algorithm is a comparison sorting algorithm published in 1959 by L. R. Ford Jr. and Selmer M. Johnson. It uses fewer comparisons in the worst case than the best previously known algorithms, binary insertion sort and merge sort, and for 20 years it was the sorting algorithm with the fewest known comparisons. Although not of practical significance, it remains of theoretical interest in connection with the problem of sorting with a minimum number of comparisons. The same algorithm may have also been independently discovered by Stanisław Trybuła and Czen Ping.

References

1 2 Skiena, Steven (2010). The Algorithm Design Manual (2nd ed.). Springer Science+Business Media. p. 123. ISBN 978-1-849-96720-4.
1 2 3 Kurt Mehlhorn; Peter Sanders (2008). Algorithms and Data Structures: The Basic Toolbox. Springer. ISBN 978-3-540-77978-0.
↑ Katajainen, Jyrki; Pasanen, Tomi; Teuhola, Jukka (1996). "Practical in-place mergesort". Nordic J. Computing. 3 (1): 27–40. CiteSeerX 10.1.1.22.8523 .
↑ Kim, Pok-Son; Kutzner, Arne (2004). Stable Minimum Storage Merging by Symmetric Comparisons. European Symp. Algorithms. Lecture Notes in Computer Science. Vol. 3221. pp. 714–723. CiteSeerX 10.1.1.102.4612 . doi:10.1007/978-3-540-30140-0_63. ISBN 978-3-540-23025-0.
↑ Chandramouli, Badrish; Goldstein, Jonathan (2014). Patience is a Virtue: Revisiting Merge and Sort on Modern Processors. SIGMOD/PODS.
1 2 3 4 Greene, William A. (1993). k-way Merging and k-ary Sorts (PDF). Proc. 31-st Annual ACM Southeast Conf. pp. 127–135.
1 2 Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2009) [1990]. Introduction to Algorithms (3rd ed.). MIT Press and McGraw-Hill. ISBN 0-262-03384-4.
↑ Victor J. Duvanenko (2011), "Parallel Merge", Dr. Dobb's Journal
1 2 Papaphilippou, Philippos; Luk, Wayne; Brooks, Chris (2022). "FLiMS: a Fast Lightweight 2-way Merger for Sorting". IEEE Transactions on Computers: 1–12. doi:10.1109/TC.2022.3146509. hdl: 10044/1/95271 . S2CID 245669103.
↑ Saitoh, Makoto; Elsayed, Elsayed A.; Chu, Thiem Van; Mashimo, Susumu; Kise, Kenji (April 2018). "A High-Performance and Cost-Effective Hardware Merge Sorter without Feedback Datapath". 2018 IEEE 26th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM). pp. 197–204. doi:10.1109/FCCM.2018.00038. ISBN 978-1-5386-5522-1. S2CID 52195866.
↑ "std:merge". cppreference.com. 2018-01-08. Retrieved 2018-04-28.
↑ "heapq — Heap queue algorithm — Python 3.10.1 documentation".

External links

High Performance Implementation of Parallel and Serial Merge in C# with source in GitHub and in C++ GitHub

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

[skiena-1] 1 2 Skiena, Steven (2010). The Algorithm Design Manual (2nd ed.). Springer Science+Business Media. p. 123. ISBN 978-1-849-96720-4.

[toolbox-2] 1 2 3 Kurt Mehlhorn; Peter Sanders (2008). Algorithms and Data Structures: The Basic Toolbox. Springer. ISBN 978-3-540-77978-0.

[3] Katajainen, Jyrki; Pasanen, Tomi; Teuhola, Jukka (1996). "Practical in-place mergesort". Nordic J. Computing. 3 (1): 27–40. CiteSeerX 10.1.1.22.8523 .

[4] Kim, Pok-Son; Kutzner, Arne (2004). Stable Minimum Storage Merging by Symmetric Comparisons. European Symp. Algorithms. Lecture Notes in Computer Science. Vol. 3221. pp. 714–723. CiteSeerX 10.1.1.102.4612 . doi:10.1007/978-3-540-30140-0_63. ISBN 978-3-540-23025-0.

[Chandramouli-5] Chandramouli, Badrish; Goldstein, Jonathan (2014). Patience is a Virtue: Revisiting Merge and Sort on Modern Processors. SIGMOD/PODS.

[greene-6] 1 2 3 4 Greene, William A. (1993). k-way Merging and k-ary Sorts (PDF). Proc. 31-st Annual ACM Southeast Conf. pp. 127–135.

[clrs-7] 1 2 Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2009) [1990]. Introduction to Algorithms (3rd ed.). MIT Press and McGraw-Hill. ISBN 0-262-03384-4.

[vjd-8] Victor J. Duvanenko (2011), "Parallel Merge", Dr. Dobb's Journal

[flimsj-9] 1 2 Papaphilippou, Philippos; Luk, Wayne; Brooks, Chris (2022). "FLiMS: a Fast Lightweight 2-way Merger for Sorting". IEEE Transactions on Computers: 1–12. doi:10.1109/TC.2022.3146509. hdl: 10044/1/95271 . S2CID 245669103.

[10] Saitoh, Makoto; Elsayed, Elsayed A.; Chu, Thiem Van; Mashimo, Susumu; Kise, Kenji (April 2018). "A High-Performance and Cost-Effective Hardware Merge Sorter without Feedback Datapath". 2018 IEEE 26th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM). pp. 197–204. doi:10.1109/FCCM.2018.00038. ISBN 978-1-5386-5522-1. S2CID 52195866.

[11] "std:merge". cppreference.com. 2018-01-08. Retrieved 2018-04-28.

[12] "heapq — Heap queue algorithm — Python 3.10.1 documentation".

[1]

[2]

[3]

[4]

[5]

[6]

[7]

[8]

[9]

[10]

[11]

[12]

v t e Sorting algorithms
Theory	Computational complexity theory Big O notation Total order Lists Inplacement Stability Comparison sort Adaptive sort Sorting network Integer sorting X + Y sorting Transdichotomous model Quantum sort
Exchange sorts	Bubble sort Cocktail shaker sort Odd–even sort Comb sort Gnome sort Proportion extend sort Quicksort
Selection sorts	Selection sort Heapsort Smoothsort Cartesian tree sort Tournament sort Cycle sort Weak-heap sort
Insertion sorts	Insertion sort Shellsort Splaysort Tree sort Library sort Patience sorting
Merge sorts	Merge sort Cascade merge sort Oscillating merge sort Polyphase merge sort
Distribution sorts	American flag sort Bead sort Bucket sort Burstsort Counting sort Interpolation sort Pigeonhole sort Proxmap sort Radix sort Flashsort
Concurrent sorts	Bitonic sorter Batcher odd–even mergesort Pairwise sorting network Samplesort
Hybrid sorts	Block merge sort Kirkpatrick–Reisch sort Timsort Introsort Spreadsort Merge-insertion sort
Other	Topological sorting Pre-topological order Pancake sorting Spaghetti sort
Impractical sorts	Stooge sort Slowsort Bogosort