Bead sort

Last updated

Bead sort, also called gravity sort, is a natural sorting algorithm, developed by Joshua J. Arulanandham, Cristian S. Calude and Michael J. Dinneen in 2002, and published in The Bulletin of the European Association for Theoretical Computer Science . [1] Both digital and analog hardware implementations of bead sort can achieve a sorting time of O (n); however, the implementation of this algorithm tends to be significantly slower in software and can only be used to sort lists of positive integers. Also, it would seem that even in the best case, the algorithm requires O (n2) space.

Contents

Algorithm overview

Step 1: Suspended beads on vertical poles. BeadSort-Figure1.svg
Step 1: Suspended beads on vertical poles.
Step 2: The beads have been allowed to fall. BeadSort-Figure2.svg
Step 2: The beads have been allowed to fall.

The bead sort operation can be compared to the manner in which beads slide on parallel poles, such as on an abacus. However, each pole may have a distinct number of beads. Initially, it may be helpful to imagine the beads suspended on vertical poles. In Step 1, such an arrangement is displayed using n=5 rows of beads on m=4 vertical poles. The numbers to the right of each row indicate the number that the row in question represents; rows 1 and 2 are representing the positive integer 3 (because they each contain three beads) while the top row represents the positive integer 2 (as it only contains two beads). [notes 1]

If we then allow the beads to fall, the rows now represent the same integers in sorted order. Row 1 contains the largest number in the set, while row n contains the smallest. If the above-mentioned convention of rows containing a series of beads on poles 1..k and leaving poles k+1..m empty has been followed, it will continue to be the case here.

The action of allowing the beads to "fall" in our physical example has allowed the larger values from the higher rows to propagate to the lower rows. If the value represented by row a is smaller than the value contained in row a+1, some of the beads from row a+1 will fall into row a; this is certain to happen, as row a does not contain beads in those positions to stop the beads from row a+1 from falling.

The mechanism underlying bead sort is similar to that behind counting sort; the number of beads on each pole corresponds to the number of elements with value equal or greater than the index of that pole.

Complexity

Bead sort can be implemented with four general levels of complexity, among others:

Like the Pigeonhole sort, bead sort is unusual in that in worst case it can perform faster than O (n log n), the fastest performance possible for a comparison sort in worst case. This is possible because the key for a bead sort is always a positive integer and bead sort exploits its structure.

Implementation

This implementation is written in Python; it is assumed that the input_list will be a sequence of integers. The function returns a new list rather than mutating the one passed in, but it can be trivially modified to operate in place efficiently.

defbeadsort(input_list):"""Bead sort."""return_list=[]# Initialize a 'transposed list' to contain as many elements as# the maximum value of the input -- in effect, taking the 'tallest'# column of input beads and laying it out flattransposed_list=[0]*max(input_list)fornumininput_list:# For each element (each 'column of beads') of the input list,# 'lay the beads flat' by incrementing as many elements of the# transposed list as the column is tall.# These will accumulate atop previous additions.transposed_list[:num]=[n+1fornintransposed_list[:num]]# We've now dropped the beads. To de-transpose, we count the# 'bottommost row' of dropped beads, then mimic removing this# row by subtracting 1 from each 'column' of the transposed list.# When a column does not reach high enough for the current row,# its value in transposed_list will be <= 0.foriinrange(len(input_list)):# Counting values > i is how we tell how many beads are in the# current 'bottommost row'. Note that Python's bools can be# evaluated as integers; True == 1 and False == 0.return_list.append(sum(n>ifornintransposed_list))# The resulting list is sorted in descending orderreturnreturn_list

We can also implement the algorithm using Java. [2]

publicstaticvoidbeadSort(int[]a){// Find the maximum elementintmax=a[0];for(inti=1;i<a.length;i++){if(a[i]>max){max=a[i];}}// allocating memoryint[][]beads=newint[a.length][max];// mark the beadsfor(inti=0;i<a.length;i++){for(intj=0;j<a[i];j++){beads[i][j]=1;}}// move down the beadsfor(intj=0;j<max;j++){intsum=0;for(inti=0;i<a.length;i++){sum+=beads[i][j];beads[i][j]=0;}for(inti=a.length-1;i>=a.length-sum;i--){a[i]=j+1;}}}

Notes

  1. By convention, a row representing the positive integer k should have beads on poles 1..k and poles k+1..m should be empty. This is not a strict requirement, but will most likely simplify implementation.

Related Research Articles

<span class="mw-page-title-main">Heapsort</span> A sorting algorithm which uses the heap data structure

In computer science, heapsort is a comparison-based sorting algorithm which can be thought of as "an implementation of selection sort using the right data structure." Like selection sort, heapsort divides its input into a sorted and an unsorted region, and it iteratively shrinks the unsorted region by extracting the largest element from it and inserting it into the sorted region. Unlike selection sort, heapsort does not waste time with a linear-time scan of the unsorted region; rather, heap sort maintains the unsorted region in a heap data structure to efficiently find the largest element in each step.

<span class="mw-page-title-main">Insertion sort</span> Sorting algorithm

Insertion sort is a simple sorting algorithm that builds the final sorted array (or list) one item at a time by comparisons. It is much less efficient on large lists than more advanced algorithms such as quicksort, heapsort, or merge sort. However, insertion sort provides several advantages:

In computer science, radix sort is a non-comparative sorting algorithm. It avoids comparison by creating and distributing elements into buckets according to their radix. For elements with more than one significant digit, this bucketing process is repeated for each digit, while preserving the ordering of the prior step, until all digits have been considered. For this reason, radix sort has also been called bucket sort and digital sort.

In computer science, selection sort is an in-place comparison sorting algorithm. It has an O(n2) time complexity, which makes it inefficient on large lists, and generally performs worse than the similar insertion sort. Selection sort is noted for its simplicity and has performance advantages over more complicated algorithms in certain situations, particularly where auxiliary memory is limited.

The subset sum problem (SSP) is a decision problem in computer science. In its most general formulation, there is a multiset of integers and a target-sum , and the question is to decide whether any subset of the integers sum to precisely . The problem is known to be NP-complete. Moreover, some restricted variants of it are NP-completene too, for example:

<span class="mw-page-title-main">Bucket sort</span> Sorting algorithm

Bucket sort, or bin sort, is a sorting algorithm that works by distributing the elements of an array into a number of buckets. Each bucket is then sorted individually, either using a different sorting algorithm, or by recursively applying the bucket sorting algorithm. It is a distribution sort, a generalization of pigeonhole sort that allows multiple keys per bucket, and is a cousin of radix sort in the most-to-least significant digit flavor. Bucket sort can be implemented with comparisons and therefore can also be considered a comparison sort algorithm. The computational complexity depends on the algorithm used to sort each bucket, the number of buckets to use, and whether the input is uniformly distributed.

In computer science, counting sort is an algorithm for sorting a collection of objects according to keys that are small positive integers; that is, it is an integer sorting algorithm. It operates by counting the number of objects that possess distinct key values, and applying prefix sum on those counts to determine the positions of each key value in the output sequence. Its running time is linear in the number of items and the difference between the maximum key value and the minimum key value, so it is only suitable for direct use in situations where the variation in keys is not significantly greater than the number of items. It is often used as a subroutine in radix sort, another sorting algorithm, which can handle larger keys more efficiently.

In computer science, bogosort is a sorting algorithm based on the generate and test paradigm. The function successively generates permutations of its input until it finds one that is sorted. It is not considered useful for sorting, but may be used for educational purposes, to contrast it with more efficient algorithms.

The Cooley–Tukey algorithm, named after J. W. Cooley and John Tukey, is the most common fast Fourier transform (FFT) algorithm. It re-expresses the discrete Fourier transform (DFT) of an arbitrary composite size in terms of N1 smaller DFTs of sizes N2, recursively, to reduce the computation time to O(N log N) for highly composite N (smooth numbers). Because of the algorithm's importance, specific variants and implementation styles have become known by their own names, as described below.

<span class="mw-page-title-main">Z-order curve</span> Mapping function that preserves data point locality

In mathematical analysis and computer science, functions which are Z-order, Lebesgue curve, Morton space-filling curve, Morton order or Morton code map multidimensional data to one dimension while preserving locality of the data points. It is named in France after Henri Lebesgue, who studied it in 1904, and named in the United States after Guy Macdonald Morton, who first applied the order to file sequencing in 1966. The z-value of a point in multidimensions is simply calculated by interleaving the binary representations of its coordinate values. Once the data are sorted into this ordering, any one-dimensional data structure can be used, such as simple one dimensional arrays, binary search trees, B-trees, skip lists or hash tables. The resulting ordering can equivalently be described as the order one would get from a depth-first traversal of a quadtree or octree.

Lamport's bakery algorithm is a computer algorithm devised by computer scientist Leslie Lamport, as part of his long study of the formal correctness of concurrent systems, which is intended to improve the safety in the usage of shared resources among multiple threads by means of mutual exclusion.

<span class="mw-page-title-main">Quicksort</span> Divide and conquer sorting algorithm

Quicksort is an efficient, general-purpose sorting algorithm. Quicksort was developed by British computer scientist Tony Hoare in 1959 and published in 1961. It is still a commonly used algorithm for sorting. Overall, it is slightly faster than merge sort and heapsort for randomized data, particularly on larger distributions.

In information theory and computer science, the Damerau–Levenshtein distance is a string metric for measuring the edit distance between two sequences. Informally, the Damerau–Levenshtein distance between two words is the minimum number of operations required to change one word into the other.

<span class="mw-page-title-main">Recursion (computer science)</span> Use of functions that call themselves

In computer science, recursion is a method of solving a computational problem where the solution depends on solutions to smaller instances of the same problem. Recursion solves such recursive problems by using functions that call themselves from within their own code. The approach can be applied to many types of problems, and recursion is one of the central ideas of computer science.

The power of recursion evidently lies in the possibility of defining an infinite set of objects by a finite statement. In the same manner, an infinite number of computations can be described by a finite recursive program, even if this program contains no explicit repetitions.

In computer science, adaptive heap sort is a comparison-based sorting algorithm of the adaptive sort family. It is a variant of heap sort that performs better when the data contains existing order. Published by Christos Levcopoulos and Ola Petersson in 1992, the algorithm utilizes a new measure of presortedness, Osc, as the number of oscillations. Instead of putting all the data into the heap as the traditional heap sort did, adaptive heap sort only take part of the data into the heap so that the run time will reduce significantly when the presortedness of the data is high.

In numerical analysis and linear algebra, lower–upper (LU) decomposition or factorization factors a matrix as the product of a lower triangular matrix and an upper triangular matrix. The product sometimes includes a permutation matrix as well. LU decomposition can be viewed as the matrix form of Gaussian elimination. Computers usually solve square systems of linear equations using LU decomposition, and it is also a key step when inverting a matrix or computing the determinant of a matrix. The LU decomposition was introduced by the Polish astronomer Tadeusz Banachiewicz in 1938. To quote: "It appears that Gauss and Doolittle applied the method [of elimination] only to symmetric equations. More recent authors, for example, Aitken, Banachiewicz, Dwyer, and Crout … have emphasized the use of the method, or variations of it, in connection with non-symmetric problems … Banachiewicz … saw the point … that the basic problem is really one of matrix factorization, or “decomposition” as he called it." It is also sometimes referred to as LR decomposition.

The Luhn mod N algorithm is an extension to the Luhn algorithm that allows it to work with sequences of values in any even-numbered base. This can be useful when a check digit is required to validate an identification string composed of letters, a combination of letters and digits or any arbitrary set of N characters where N is divisible by 2.

Spreadsort is a sorting algorithm invented by Steven J. Ross in 2002. It combines concepts from distribution-based sorts, such as radix sort and bucket sort, with partitioning concepts from comparison sorts such as quicksort and mergesort. In experimental results it was shown to be highly efficient, often outperforming traditional algorithms such as quicksort, particularly on distributions exhibiting structure and string sorting. There is an open-source implementation with performance analysis and benchmarks, and HTML documentation .

A negative base may be used to construct a non-standard positional numeral system. Like other place-value systems, each position holds multiples of the appropriate power of the system's base; but that base is negative—that is to say, the base b is equal to −r for some natural number r.

In computer science, integer sorting is the algorithmic problem of sorting a collection of data values by integer keys. Algorithms designed for integer sorting may also often be applied to sorting problems in which the keys are floating point numbers, rational numbers, or text strings. The ability to perform integer arithmetic on the keys allows integer sorting algorithms to be faster than comparison sorting algorithms in many cases, depending on the details of which operations are allowed in the model of computing and how large the integers to be sorted are.

References

  1. Arulanandham, Joshua J.; Calude, Cristian S.; Dinneen, Michael J. (January 2002). "Bead-Sort: A Natural Sorting Algorithm" (PDF). Department of Computer Science, University of Auckland. Retrieved 2021-05-14.
  2. Nigam, Palash. "Bead Sort - A Natural Sorting Algorithm". GeeksForGeeks.