Burstsort

Burstsort
Class	Sorting algorithm
Data structure	Trie
Worst-case performance	O(wn)
Worst-case space complexity	O(wn)
Optimal	?

Last updated May 01, 2025

Burstsort and its variants are cache-efficient algorithms for sorting strings. They are variants of the traditional radix sort but faster for large data sets of common strings, first published in 2003, with some optimizing versions published in later years.^[1]

Burstsort algorithms use a trie to store prefixes of strings, with growable arrays of pointers as end nodes containing sorted, unique, suffixes (referred to as buckets). Some variants copy the string tails into the buckets. As the buckets grow beyond a predetermined threshold, the buckets are "burst" into tries, giving the sort its name. A more recent variant uses a bucket index with smaller sub-buckets to reduce memory usage. Most implementations delegate to multikey quicksort, an extension of three-way radix quicksort, to sort the contents of the buckets. By dividing the input into buckets with common prefixes, the sorting can be done in a cache-efficient manner.

Burstsort was introduced as a sort that is similar to MSD radix sort,^[1] but is faster due to being aware of caching and related radixes being stored closer to each other due to specifics of trie structure. It exploits specifics of strings that are usually encountered in real world. And although asymptotically it is the same as radix sort, with time complexity of $O (wn)$ (w – word length and n – number of strings to be sorted), but due to better memory distribution it tends to be twice as fast on big data sets of strings. It has been billed as the "fastest known algorithm to sort large sets of strings".^[2]

Psuedocode

The Pseudocode below implements Burst Sort recursively .

function merge(List arr1, List arr2) isList arr3 ← new List     i, j ← 0     while i < length(arr1) AND j < length(arr2) doif arr1[i] < arr2[j] THEN             arr3.add(arr1[i])      i ← i + 1  else       arr3.add(arr2[j])      j ← j + 1     while i < length(arr1) do  arr3.add(arr1[i])  i ← i + 1     while j < length(arr2) do   arr3.add(arr2[j])  j ← j + 1     return arr3   function split(List arr) isList l1 ← new List     List l2 ← new List     mid ← length(arr) / 2      for I = 0 to mid do  l1.add(arr[i])       for I = mid to i < length(arr) do   l2.add(arr[i])     // Return the split lists.      return (l1, l2)     function burstSort(List arr) isiflength(arr) == 1 THENreturn arr      List subarr ← split(arr)     // Recursively sort each partition returned by split     List arr1 ← burstSort(subarr[0])     List arr2 ← burstSort(subarr[1])     return merge(arr1, arr2)

References

1 2 Sinha, R.; Zobel, J. (2005). "Cache-conscious sorting of large sets of strings with dynamic tries" (PDF). Journal of Experimental Algorithmics. 9: 1.5. CiteSeerX 10.1.1.599.861 . doi:10.1145/1005813.1041517. S2CID 10807318.
↑ "Burstsort: Fastest known algorithm to sort large set of strings | Hacker News".

A burstsort derivative (C-burstsort), faster than burstsort: Sinha, Ranjan; Zobel, Justin; Ring, David (January 2006). "Cache-Efficient String Sorting Using Copying" (PDF). Journal of Experimental Algorithmics. 11 (1.2): 1.2. CiteSeerX 10.1.1.85.3498 . doi:10.1145/1187436.1187439. S2CID 3184411. Archived from the original (PDF) on 2007-10-01. Retrieved 2007-05-31.
The data type used in burstsort: Heinz, Steffen; Zobel, Justin; Williams, Hugh E. (April 2002). "Burst Tries: A Fast, Efficient Data Structure for String Keys" (PDF). ACM Transactions on Information Systems. 20 (2): 192–223. CiteSeerX 10.1.1.18.3499 . doi:10.1145/506309.506312. S2CID 14122377. Archived from the original (PDF) on 2013-12-05. Retrieved 2007-09-25.
Sinha, Ranjan; Zobel, Justin (2003). "Efficient Trie-Based Sorting of Large Sets of Strings" (PDF). Proceedings of the 26th Australasian Computer Science Conference. Vol. 16. Australian Computer Society. pp. 11–18. CiteSeerX 10.1.1.12.2757 . ISBN 978-0-909-92594-9. Archived from the original (PDF) on 2012-02-08. Retrieved 2007-09-25.
Sinha, Ranjan; Wirth, Anthony (March 2010). "Engineering Burstsort: Towards Fast In-Place String Sorting" (PDF). ACM Journal of Experimental Algorithmics. 15 (2.5): 1–24. doi:10.1145/1671970.1671978. S2CID 16410080.

External links

A burstsort implementation in Java: burstsort4j
Judy arrays are a type of copy burstsort: C implementation

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

[ref1-1] 1 2 Sinha, R.; Zobel, J. (2005). "Cache-conscious sorting of large sets of strings with dynamic tries" (PDF). Journal of Experimental Algorithmics. 9: 1.5. CiteSeerX 10.1.1.599.861 . doi:10.1145/1005813.1041517. S2CID 10807318.

[2] "Burstsort: Fastest known algorithm to sort large set of strings | Hacker News".

[1]

[2]

v t e Sorting algorithms
Theory	Computational complexity theory Big O notation Total order Lists Inplacement Stability Comparison sort Adaptive sort Sorting network Integer sorting X + Y sorting Transdichotomous model Quantum sort
Exchange sorts	Bubble sort Cocktail shaker sort Odd–even sort Comb sort Gnome sort Proportion extend sort Quicksort
Selection sorts	Selection sort Heapsort Smoothsort Cartesian tree sort Tournament sort Cycle sort Weak-heap sort
Insertion sorts	Insertion sort Shellsort Splaysort Tree sort Library sort Patience sorting
Merge sorts	Merge sort Cascade merge sort Oscillating merge sort Polyphase merge sort
Distribution sorts	American flag sort Bead sort Bucket sort Burstsort Counting sort Interpolation sort Pigeonhole sort Proxmap sort Radix sort Flashsort
Concurrent sorts	Bitonic sorter Batcher odd–even mergesort Pairwise sorting network Samplesort
Hybrid sorts	Block merge sort Kirkpatrick–Reisch sort Timsort Introsort Spreadsort Merge-insertion sort
Other	Topological sorting Pre-topological order Pancake sorting Spaghetti sort
Impractical sorts	Stooge sort Slowsort Bogosort

Burstsort

Contents

Psuedocode

References

External links