Random permutation

Last updated

A random permutation is a random ordering of a set of objects, that is, a permutation-valued random variable. The use of random permutations is common in games of chance and in randomized algorithms in coding theory, cryptography, and simulation. A good example of a random permutation is the fair shuffling of a standard deck of cards: this is ideally a random permutation of the 52 cards.

Contents

Computation of random permutations

Entry-by-entry methods

One algorithm for generating a random permutation of a set of size n uniformly at random, i.e., such that each of the n! permutations is equally likely to appear, is to generate a sequence by uniformly randomly selecting an integer between 1 and n (inclusive), sequentially and without replacement n times, and then to interpret this sequence (x1, ..., xn) as the permutation

shown here in two-line notation.

An inefficient brute-force method for sampling without replacement could select from the numbers between 1 and n at every step, retrying the selection whenever the random number picked is a repeat of a number already selected until selecting a number that has not yet been selected. The expected number of retries per step in such cases will scale with the inverse of the fraction of numbers already selected, and the overall number of retries as the sum of those inverses, making this an inefficient approach.

Such retries can be avoided using an algorithm where, on each ith step when x1, ..., xi 1 have already been chosen, one chooses a uniformly random number j from between 1 and ni + 1 (inclusive) and sets xi equal to the jth largest of the numbers that have not yet been selected. This selects uniformly randomly among the remaining numbers at every step without retries.

Fisher-Yates shuffles

A simple algorithm to generate a permutation of n items uniformly at random without retries, known as the Fisher–Yates shuffle, is to start with any permutation (for example, the identity permutation), and then go through the positions 0 through n − 2 (we use a convention where the first element has index 0, and the last element has index n − 1), and for each position iswap the element currently there with a randomly chosen element from positions i through n − 1 (the end), inclusive. Any permutation of n elements will be produced by this algorithm with probability exactly 1/n!, thus yielding a uniform distribution of the permutations.

unsigneduniform(unsignedm);/* Returns a random integer 0 <= uniform(m) <= m-1 with uniform distribution */voidinitialize_and_permute(unsignedpermutation[],unsignedn){unsignedi;for(i=0;i<=n-2;i++){unsignedj=i+uniform(n-i);/* A random integer such that i ≤ j < n */swap(permutation[i],permutation[j]);/* Swap the randomly picked element with permutation[i] */}}

If the uniform() function is implemented simply as random() % (m) then there will be a bias in the distribution of permutations if the number of return values of random() is not a multiple of m. However, this effect is small if the number of return values of random() is orders of magnitude greater than m.

Randomness testing

As with all computational implementations of random processes, the quality of the distribution generated by an implementation of a randomized algorithm such as the Fisher-Yates shuffle, i.e., how close the actually generated distribution is to the desired distribution, will depend on the quality of underlying sources of randomness in the implementation such as pseudorandom number generators or hardware random number generators. There are many randomness tests for random permutations, such as the "overlapping permutations" test of the Diehard tests. A typical form of such tests is to take some permutation statistic for which the distribution is theoretically known and then test whether the distribution of that statistic on a set of randomly generated permutations from an implementation closely approximates the distribution of that statistic from the true distribution.

Statistics on random permutations

Fixed points

The probability distribution for the number of fixed points of a uniformly distributed random permutation of n elements approaches a Poisson distribution with expected value 1 as n grows. [1] The first n moments of this distribution are exactly those of the Poisson distribution. In particular, the probability that a random permutation has no fixed points (i.e., that the permutation is a derangement) approaches 1/e as n increases.

See also

Related Research Articles

<span class="mw-page-title-main">Hash function</span> Mapping arbitrary data to fixed-size values

A hash function is any function that can be used to map data of arbitrary size to fixed-size values, though there are some hash functions that support variable-length output. The values returned by a hash function are called hash values, hash codes, hash digests, digests, or simply hashes. The values are usually used to index a fixed-size table called a hash table. Use of a hash function to index a hash table is called hashing or scatter-storage addressing.

In cryptography, RC4 is a stream cipher. While it is remarkable for its simplicity and speed in software, multiple vulnerabilities have been discovered in RC4, rendering it insecure. It is especially vulnerable when the beginning of the output keystream is not discarded, or when nonrandom or related keys are used. Particularly problematic uses of RC4 have led to very insecure protocols such as WEP.

<span class="mw-page-title-main">Permutation</span> Mathematical version of an order change

In mathematics, a permutation of a set can mean one of two different things:

A pseudorandom number generator (PRNG), also known as a deterministic random bit generator (DRBG), is an algorithm for generating a sequence of numbers whose properties approximate the properties of sequences of random numbers. The PRNG-generated sequence is not truly random, because it is completely determined by an initial value, called the PRNG's seed. Although sequences that are closer to truly random can be generated using hardware random number generators, pseudorandom number generators are important in practice for their speed in number generation and their reproducibility.

The Mersenne Twister is a general-purpose pseudorandom number generator (PRNG) developed in 1997 by Makoto Matsumoto and Takuji Nishimura. Its name derives from the choice of a Mersenne prime as its period length.

In computer science, bogosort is a sorting algorithm based on the generate and test paradigm. The function successively generates permutations of its input until it finds one that is sorted. It is not considered useful for sorting, but may be used for educational purposes, to contrast it with more efficient algorithms.

<span class="mw-page-title-main">XOR swap algorithm</span> Binary arithmetic algorithm

In computer programming, the exclusive or swap is an algorithm that uses the exclusive or bitwise operation to swap the values of two variables without using the temporary variable which is normally required.

In computer science, a one-way function is a function that is easy to compute on every input, but hard to invert given the image of a random input. Here, "easy" and "hard" are to be understood in the sense of computational complexity theory, specifically the theory of polynomial time problems. Not being one-to-one is not considered sufficient for a function to be called one-way.

In mathematics, and in particular in group theory, a cyclic permutation is a permutation consisting of a single cycle. In some cases, cyclic permutations are referred to as cycles; if a cyclic permutation has k elements, it may be called a k-cycle. Some authors widen this definition to include permutations with fixed points in addition to at most one non-trivial cycle. In cycle notation, cyclic permutations are denoted by the list of their elements enclosed with parentheses, in the order to which they are permuted.

<span class="mw-page-title-main">Perlin noise</span> Type of gradient noise in computer graphics

Perlin noise is a type of gradient noise developed by Ken Perlin in 1983. It has many uses, including but not limited to: procedurally generating terrain, applying pseudo-random changes to a variable, and assisting in the creation of image textures. It is most commonly implemented in two, three, or four dimensions, but can be defined for any number of dimensions.

<span class="mw-page-title-main">Random number generation</span> Producing a sequence that cannot be predicted better than by random chance

Random number generation is a process by which, often by means of a random number generator (RNG), a sequence of numbers or symbols that cannot be reasonably predicted better than by random chance is generated. This means that the particular outcome sequence will contain some patterns detectable in hindsight but impossible to foresee. True random number generators can be hardware random-number generators (HRNGs), wherein each generation is a function of the current value of a physical environment's attribute that is constantly changing in a manner that is practically impossible to model. This would be in contrast to so-called "random number generations" done by pseudorandom number generators (PRNGs), which generate numbers that only look random but are in fact predetermined—these generations can be reproduced simply by knowing the state of the PRNG.

In computational complexity theory and cryptography, the existence of pseudorandom generators is related to the existence of one-way functions through a number of theorems, collectively referred to as the pseudorandom generator theorem.

<span class="mw-page-title-main">Quicksort</span> Divide and conquer sorting algorithm

Quicksort is an efficient, general-purpose sorting algorithm. Quicksort was developed by British computer scientist Tony Hoare in 1959 and published in 1961. It is still a commonly used algorithm for sorting. Overall, it is slightly faster than merge sort and heapsort for randomized data, particularly on larger distributions.

In cryptography, a pseudorandom permutation (PRP) is a function that cannot be distinguished from a random permutation (that is, a permutation selected at random with uniform probability, from the family of all permutations on the function's domain) with practical effort.

<span class="mw-page-title-main">Ziggurat algorithm</span> Algorithm for pseudo-random number sampling

The ziggurat algorithm is an algorithm for pseudo-random number sampling. Belonging to the class of rejection sampling algorithms, it relies on an underlying source of uniformly-distributed random numbers, typically from a pseudo-random number generator, as well as precomputed tables. The algorithm is used to generate values from a monotonically decreasing probability distribution. It can also be applied to symmetric unimodal distributions, such as the normal distribution, by choosing a value from one half of the distribution and then randomly choosing which half the value is considered to have been drawn from. It was developed by George Marsaglia and others in the 1960s.

<span class="mw-page-title-main">Fisher–Yates shuffle</span> Algorithm for generating a random permutation of a finite set

The Fisher–Yates shuffle is an algorithm for shuffling a finite sequence. The algorithm takes a list of all the elements of the sequence, and continually determines the next element in the shuffled sequence by randomly drawing an element from the list until no elements remain. The algorithm produces an unbiased permutation: every permutation is equally likely. The modern version of the algorithm takes time proportional to the number of items being shuffled and shuffles them in place.

Reservoir sampling is a family of randomized algorithms for choosing a simple random sample, without replacement, of k items from a population of unknown size n in a single pass over the items. The size of the population n is not known to the algorithm and is typically too large for all n items to fit into main memory. The population is revealed to the algorithm over time, and the algorithm cannot look back at previous items. At any point, the current state of the algorithm must permit extraction of a simple random sample without replacement of size k over the part of the population seen so far.

In the mathematics of shuffling playing cards, the Gilbert–Shannon–Reeds model is a probability distribution on riffle shuffle permutations. It forms the basis for a recommendation that a deck of cards should be riffled seven times in order to thoroughly randomize it. It is named after the work of Edgar Gilbert, Claude Shannon, and J. Reeds, reported in a 1955 technical report by Gilbert and in a 1981 unpublished manuscript of Reeds.

<span class="mw-page-title-main">Heap's algorithm</span> Method of generating all permutations of n objects

Heap's algorithm generates all possible permutations of n objects. It was first proposed by B. R. Heap in 1963. The algorithm minimizes movement: it generates each permutation from the previous one by interchanging a single pair of elements; the other n−2 elements are not disturbed. In a 1977 review of permutation-generating algorithms, Robert Sedgewick concluded that it was at that time the most effective algorithm for generating permutations by computer.

Birkhoff's algorithm is an algorithm for decomposing a bistochastic matrix into a convex combination of permutation matrices. It was published by Garrett Birkhoff in 1946. It has many applications. One such application is for the problem of fair random assignment: given a randomized allocation of items, Birkhoff's algorithm can decompose it into a lottery on deterministic allocations.

References

  1. Durstenfeld, Richard (1964-07-01). "Algorithm 235: Random permutation". Communications of the ACM. 7 (7): 420. doi: 10.1145/364520.364540 .