Sorting number

Last updated

In mathematics and computer science, the sorting numbers are a sequence of numbers introduced in 1950 by Hugo Steinhaus for the analysis of comparison sort algorithms. These numbers give the worst-case number of comparisons used by both binary insertion sort and merge sort. However, there are other algorithms that use fewer comparisons.

Contents

Formula and examples

The th sorting number is given by the formula [1]

where

The sequence of numbers given by this formula (starting with ) is

0, 1, 3, 5, 8, 11, 14, 17, 21, 25, 29, 33, 37, 41, ... (sequence A001855 in the OEIS ).

The same sequence of numbers can also be obtained from the recurrence relation [2] ,

or closed form

It is an example of a 2-regular sequence. [2]

Asymptotically, the value of the th sorting number fluctuates between approximately and depending on the ratio between and the nearest power of two. [1]

Application to sorting

In 1950, Hugo Steinhaus observed that these numbers count the number of comparisons used by binary insertion sort, and conjectured (incorrectly) that they give the minimum number of comparisons needed to sort items using any comparison sort. The conjecture was disproved in 1959 by L. R. Ford Jr. and Selmer M. Johnson, who found a different sorting algorithm, the Ford–Johnson merge-insert sort, using fewer comparisons. [1]

The same sequence of sorting numbers also gives the worst-case number of comparisons used by merge sort to sort items. [2]

Other applications

The sorting numbers (shifted by one position) also give the sizes of the shortest possible superpatterns for the layered permutations. [3]

Related Research Articles

<span class="mw-page-title-main">Binary search algorithm</span> Search algorithm finding the position of a target value within a sorted array

In computer science, binary search, also known as half-interval search, logarithmic search, or binary chop, is a search algorithm that finds the position of a target value within a sorted array. Binary search compares the target value to the middle element of the array. If they are not equal, the half in which the target cannot lie is eliminated and the search continues on the remaining half, again taking the middle element to compare to the target value, and repeating this until the target value is found. If the search ends with the remaining half being empty, the target is not in the array.

<span class="mw-page-title-main">Pigeonhole principle</span> If there are more items than boxes holding them, one box must contain at least two items

In mathematics, the pigeonhole principle states that if n items are put into m containers, with n > m, then at least one container must contain more than one item. For example, of three gloves, at least two must be right-handed or at least two must be left-handed, because there are three objects but only two categories of handedness to put them into. This seemingly obvious statement, a type of counting argument, can be used to demonstrate possibly unexpected results. For example, given that the population of London is greater than the maximum number of hairs that can be on a human's head, the principle requires that there must be at least two people in London who have the same number of hairs on their heads.

<span class="mw-page-title-main">Floor and ceiling functions</span> Nearest integers from a number

In mathematics and computer science, the floor function is the function that takes as input a real number x, and gives as output the greatest integer less than or equal to x, denoted x or floor(x). Similarly, the ceiling function maps x to the smallest integer greater than or equal to x, denoted x or ceil(x).

<span class="mw-page-title-main">Shellsort</span> Sorting algorithm which uses multiple comparison intervals

Shellsort, also known as Shell sort or Shell's method, is an in-place comparison sort. It can be seen as either a generalization of sorting by exchange or sorting by insertion. The method starts by sorting pairs of elements far apart from each other, then progressively reducing the gap between elements to be compared. By starting with far apart elements, it can move some out-of-place elements into position faster than a simple nearest neighbor exchange. Donald Shell published the first version of this sort in 1959. The running time of Shellsort is heavily dependent on the gap sequence it uses. For many practical variants, determining their time complexity remains an open problem.

Golomb coding is a lossless data compression method using a family of data compression codes invented by Solomon W. Golomb in the 1960s. Alphabets following a geometric distribution will have a Golomb code as an optimal prefix code, making Golomb coding highly suitable for situations in which the occurrence of small values in the input stream is significantly more likely than large values.

<span class="mw-page-title-main">Rounding</span> Replacing a number with a simpler value

Rounding means replacing a number with an approximate value that has a shorter, simpler, or more explicit representation. For example, replacing $23.4476 with $23.45, the fraction 312/937 with 1/3, or the expression 2 with 1.414.

In computer science, the Akra–Bazzi method, or Akra–Bazzi theorem, is used to analyze the asymptotic behavior of the mathematical recurrences that appear in the analysis of divide and conquer algorithms where the sub-problems have substantially different sizes. It is a generalization of the master theorem for divide-and-conquer recurrences, which assumes that the sub-problems have equal size. It is named after mathematicians Mohamad Akra and Louay Bazzi.

<span class="mw-page-title-main">Self-balancing binary search tree</span> Any node-based binary search tree that automatically keeps its height the same

In computer science, a self-balancing binary search tree (BST) is any node-based binary search tree that automatically keeps its height small in the face of arbitrary item insertions and deletions. These operations when designed for a self-balancing binary search tree, contain precautionary measures against boundlessly increasing tree height, so that these abstract data structures receive the attribute "self-balancing".

In computer science, a selection algorithm is an algorithm for finding the th smallest value in a collection of ordered values, such as numbers. The value that it finds is called the th order statistic. Selection includes as special cases the problems of finding the minimum, median, and maximum element in the collection. Selection algorithms include quickselect, and the median of medians algorithm. When applied to a collection of values, these algorithms take linear time, as expressed using big O notation. For data that is already structured, faster algorithms may be possible; as an extreme case, selection in an already-sorted array takes time .

<span class="mw-page-title-main">Bitonic sorter</span> Parallel sorting algorithm

Bitonic mergesort is a parallel algorithm for sorting. It is also used as a construction method for building a sorting network. The algorithm was devised by Ken Batcher. The resulting sorting networks consist of comparators and have a delay of , where is the number of items to be sorted. This makes it a popular choice for sorting large numbers of elements on an architecture which itself contains a large number of parallel execution units running in lockstep, such as a typical GPU.

The Engel expansion of a positive real number x is the unique non-decreasing sequence of positive integers such that

<span class="mw-page-title-main">Comparison sort</span> Type of sorting algorithm that works by comparing pairs of elements

A comparison sort is a type of sorting algorithm that only reads the list elements through a single abstract comparison operation that determines which of two elements should occur first in the final sorted list. The only requirement is that the operator forms a total preorder over the data, with:

  1. if ab and bc then ac (transitivity)
  2. for all a and b, ab or ba (connexity).

In mathematics, a Beatty sequence is the sequence of integers found by taking the floor of the positive multiples of a positive irrational number. Beatty sequences are named after Samuel Beatty, who wrote about them in 1926.

In mathematics, Abel's summation formula, introduced by Niels Henrik Abel, is intensively used in analytic number theory and the study of special functions to compute series.

The Lambek–Moser theorem is a mathematical description of partitions of the natural numbers into two complementary sets. For instance, it applies to the partition of numbers into even and odd, or into prime and non-prime. There are two parts to the Lambek–Moser theorem. One part states that any two non-decreasing integer functions that are inverse, in a certain sense, can be used to split the natural numbers into two complementary subsets, and the other part states that every complementary partition can be constructed in this way. When a formula is known for the th natural number in a set, the Lambek–Moser theorem can be used to obtain a formula for the th number not in the set.

Square packing is a packing problem where the objective is to determine how many congruent squares can be packed into some larger shape, often a square or circle.

In computer science, an exponential search is an algorithm, created by Jon Bentley and Andrew Chi-Chih Yao in 1976, for searching sorted, unbounded/infinite lists. There are numerous ways to implement this, with the most common being to determine a range that the search key resides in and performing a binary search within that range. This takes O(log i) where i is the position of the search key in the list, if the search key is in the list, or the position where the search key should be, if the search key is not in the list.

In the mathematics of permutations, a layered permutation is a permutation that reverses contiguous blocks of elements. Equivalently, it is the direct sum of decreasing permutations.

In computer science, merge-insertion sort or the Ford–Johnson algorithm is a comparison sorting algorithm published in 1959 by L. R. Ford Jr. and Selmer M. Johnson. It uses fewer comparisons in the worst case than the best previously known algorithms, binary insertion sort and merge sort, and for 20 years it was the sorting algorithm with the fewest known comparisons. Although not of practical significance, it remains of theoretical interest in connection with the problem of sorting with a minimum number of comparisons. The same algorithm may have also been independently discovered by Stanisław Trybuła and Czen Ping.

An agreeable subset is a subset of items that is considered, by all people in a certain group, to be at least as good as its complement. Finding a small agreeable subset is a problem in computational social choice.

References

  1. 1 2 3 Ford, Lester R. Jr.; Johnson, Selmer M. (1959), "A tournament problem", American Mathematical Monthly , 66 (5): 387–389, doi:10.2307/2308750, JSTOR   2308750, MR   0103159
  2. 1 2 3 Allouche, Jean-Paul; Shallit, Jeffrey (1992), "The ring of -regular sequences", Theoretical Computer Science , 98 (2): 163–197, doi: 10.1016/0304-3975(92)90001-V , MR   1166363 . See Example 28, p. 192.
  3. Albert, Michael; Engen, Michael; Pantone, Jay; Vatter, Vincent (2018), "Universal layered permutations", Electronic Journal of Combinatorics , 25 (3): P23:1–P23:5, arXiv: 1710.04240 , doi: 10.37236/7386 , S2CID   52100342