This article possibly contains original research .(July 2014) |
The statistics of random permutations, such as the cycle structure of a random permutation are of fundamental importance in the analysis of algorithms, especially of sorting algorithms, which operate on random permutations. Suppose, for example, that we are using quickselect (a cousin of quicksort) to select a random element of a random permutation. Quickselect will perform a partial sort on the array, as it partitions the array according to the pivot. Hence a permutation will be less disordered after quickselect has been performed. The amount of disorder that remains may be analysed with generating functions. These generating functions depend in a fundamental way on the generating functions of random permutation statistics. Hence it is of vital importance to compute these generating functions.
The article on random permutations contains an introduction to random permutations.
Permutations are sets of labelled cycles. Using the labelled case of the Flajolet–Sedgewick fundamental theorem and writing for the set of permutations and for the singleton set, we have
Translating into exponential generating functions (EGFs), we have
where we have used the fact that the EGF of the combinatorial species of permutations (there are n! permutations of n elements) is
This one equation allows one to derive a large number of permutation statistics. Firstly, by dropping terms from , i.e. exp, we may constrain the number of cycles that a permutation contains, e.g. by restricting the EGF to we obtain permutations containing two cycles. Secondly, note that the EGF of labelled cycles, i.e. of , is
because there are k! / k labelled cycles. This means that by dropping terms from this generating function, we may constrain the size of the cycles that occur in a permutation and obtain an EGF of the permutations containing only cycles of a given size.
Instead of removing and selecting cycles, one can also put different weights on different size cycles. If is a weight function that depends only on the size k of the cycle and for brevity we write
defining the value of b for a permutation to be the sum of its values on the cycles, then we may mark cycles of length k with ub(k) and obtain a two-variable generating function
This is a "mixed" generating function: it is an exponential generating function in z and an ordinary generating function in the secondary parameter u. Differentiating and evaluating at u = 1, we have
This is the probability generating function of the expectation of b. In other words, the coefficient of in this power series is the expected value of b on permutations in , given that each permutation is chosen with the same probability .
This article uses the coefficient extraction operator [zn], documented on the page for formal power series.
An involution is a permutation σ so that σ2 = 1 under permutation composition. It follows that σ may only contain cycles of length one or two, i.e. the exponential generating function g(z) of these permutations is [1]
This gives the explicit formula for the total number of involutions among the permutations σ ∈ Sn: [1]
Dividing by n! yields the probability that a random permutation is an involution. These numbers are known as telephone numbers.
This generalizes the concept of an involution. An mth root of unity is a permutation σ so that σm = 1 under permutation composition. Now every time we apply σ we move one step in parallel along all of its cycles. A cycle of length d applied d times produces the identity permutation on d elements (d fixed points) and d is the smallest value to do so. Hence m must be a multiple of all cycle sizes d, i.e. the only possible cycles are those whose length d is a divisor of m. It follows that the EGF g(x) of these permutations is
When m = p, where p is prime, this simplifies to
This one can be done by Möbius inversion. Working with the same concept as in the previous entry we note that the combinatorial species of permutations whose order divides k is given by
Translation to exponential generating functions we obtain the EGF of permutations whose order divides k, which is
Now we can use this generating function to count permutations of order exactly k. Let be the number of permutations on n whose order is exactly d and the number of permutations on n the permutation count whose order divides k. Then we have
It follows by Möbius inversion that
Therefore, we have the EGF
The desired count is then given by
This formula produces e.g. for k = 6 the EGF
with the sequence of values starting at n = 5
For k = 8 we get the EGF
with the sequence of values starting at n = 8
Finally for k = 12 we get the EGF
with the sequence of values starting at n = 7
Suppose there are n people at a party, each of whom brought an umbrella. At the end of the party everyone picks an umbrella out of the stack of umbrellas and leaves. What is the probability that no one left with his/her own umbrella? This problem is equivalent to counting permutations with no fixed points (called derangements), and hence the EGF, where we subtract out fixed points (cycles of length 1) by removing the term z from the fundamental relation is
Multiplication by sums the coefficients of , so , the total number of derangements, is given by:
Hence there are about derangements and the probability that a random permutation is a derangement is
This result may also be proved by inclusion–exclusion. Using the sets where to denote the set of permutations that fix p, we have
This formula counts the number of permutations that have at least one fixed point. The cardinalities are as follows:
Hence the number of permutations with no fixed point is
or
and we have the claim.
There is a generalization of these numbers, which is known as rencontres numbers, i.e. the number of permutations of containing m fixed points. The corresponding EGF is obtained by marking cycles of size one with the variable u, i.e. choosing b(k) equal to one for and zero otherwise, which yields the generating function of the set of permutations by the number of fixed points:
It follows that
and hence
This immediately implies that
for n large, m fixed.
If P is a permutation, the order of P is the smallest positive integer n for which is the identity permutation. This is the least common multiple of the lengths of the cycles of P.
A theorem of Goh and Schmutz [2] states that if is the expected order of a random permutation of size n, then
where the constant c is
We can use the same construction as in the previous section to compute the number of derangements containing an even number of cycles and the number containing an odd number of cycles. To do this we need to mark all cycles and subtract fixed points, giving
Now some very basic reasoning shows that the EGF of is given by
We thus have
which is
Subtracting from , we find
The difference of these two ( and ) is
A prison warden wants to make room in his prison and is considering liberating one hundred prisoners, thereby freeing one hundred cells. He therefore assembles one hundred prisoners and asks them to play the following game: he lines up one hundred urns in a row, each containing the name of one prisoner, where every prisoner's name occurs exactly once. The game is played as follows: every prisoner is allowed to look inside fifty urns. If he or she does not find his or her name in one of the fifty urns, all prisoners will immediately be executed, otherwise the game continues. The prisoners have a few moments to decide on a strategy, knowing that once the game has begun, they will not be able to communicate with each other, mark the urns in any way or move the urns or the names inside them. Choosing urns at random, their chances of survival are almost zero, but there is a strategy giving them a 30% chance of survival, assuming that the names are assigned to urns randomly – what is it?
First of all, the survival probability using random choices is
so this is definitely not a practical strategy.
The 30% survival strategy is to consider the contents of the urns to be a permutation of the prisoners, and traverse cycles. To keep the notation simple, assign a number to each prisoner, for example by sorting their names alphabetically. The urns may thereafter be considered to contain numbers rather than names. Now clearly the contents of the urns define a permutation. The first prisoner opens the first urn. If he finds his name, he has finished and survives. Otherwise he opens the urn with the number he found in the first urn. The process repeats: the prisoner opens an urn and survives if he finds his name, otherwise he opens the urn with the number just retrieved, up to a limit of fifty urns. The second prisoner starts with urn number two, the third with urn number three, and so on. This strategy is precisely equivalent to a traversal of the cycles of the permutation represented by the urns. Every prisoner starts with the urn bearing his number and keeps on traversing his cycle up to a limit of fifty urns. The number of the urn that contains his number is the pre-image of that number under the permutation. Hence the prisoners survive if all cycles of the permutation contain at most fifty elements. We have to show that this probability is at least 30%.
Note that this assumes that the warden chooses the permutation randomly; if the warden anticipates this strategy, he can simply choose a permutation with a cycle of length 51. To overcome this, the prisoners may agree in advance on a random permutation of their names.
We consider the general case of prisoners and urns being opened. We first calculate the complementary probability, i.e. that there is a cycle of more than elements. With this in mind, we introduce
or
so that the desired probability is
because the cycle of more than elements will necessarily be unique. Using the fact that , we find that
which yields
Finally, using an integral estimate such as Euler–Maclaurin summation, or the asymptotic expansion of the nth harmonic number, we obtain
so that
or at least 30%, as claimed.
A related result is that asymptotically, the expected length of the longest cycle is λn, where λ is the Golomb–Dickman constant, approximately 0.62.
This example is due to Anna Gál and Peter Bro Miltersen; consult the paper by Peter Winkler for more information, and see the discussion on Les-Mathematiques.net. Consult the references on 100 prisoners for links to these references.
The above computation may be performed in a more simple and direct way, as follows: first note that a permutation of elements contains at most one cycle of length strictly greater than . Thus, if we denote
then
For , the number of permutations that contain a cycle of length exactly is
Explanation: is the number of ways of choosing the elements that comprise the cycle; is the number of ways of arranging items in a cycle; and is the number of ways to permute the remaining elements. There is no double counting here because there is at most one cycle of length when . Thus,
We conclude that
There is a closely related problem that fits the method presented here quite nicely. Say you have n ordered boxes. Every box contains a key to some other box or possibly itself giving a permutation of the keys. You are allowed to select k of these n boxes all at once and break them open simultaneously, gaining access to k keys. What is the probability that using these keys you can open all n boxes, where you use a found key to open the box it belongs to and repeat.
The mathematical statement of this problem is as follows: pick a random permutation on n elements and k values from the range 1 to n, also at random, call these marks. What is the probability that there is at least one mark on every cycle of the permutation? The claim is this probability is k/n.
The species of permutations by cycles with some non-empty subset of every cycle being marked has the specification
The index in the inner sum starts at one because we must have at least one mark on every cycle.
Translating the specification to generating functions we obtain the bivariate generating function
This simplifies to
or
In order to extract coefficients from this re-write like so
It now follows that
and hence
Divide by to obtain
We do not need to divide by n! because is exponential in z.
Applying the Flajolet–Sedgewick fundamental theorem, i.e. the labelled enumeration theorem with , to the set
we obtain the generating function
The term
yields the signed Stirling numbers of the first kind, and is the EGF of the unsigned Stirling numbers of the first kind, i.e.
We can compute the OGF of the signed Stirling numbers for n fixed, i.e.
Start with
which yields
Summing this, we obtain
Using the formula involving the logarithm for on the left, the definition of on the right, and the binomial theorem, we obtain
Comparing the coefficients of , and using the definition of the binomial coefficient, we finally have
a falling factorial. The computation of the OGF of the unsigned Stirling numbers of the first kind works in a similar way.
In this problem we use a bivariate generating function g(z, u) as described in the introduction. The value of b for a cycle not of size m is zero, and one for a cycle of size m. We have
or
This means that the expected number of cycles of size m in a permutation of length n less than m is zero (obviously). A random permutation of length at least m contains on average 1/m cycles of length m. In particular, a random permutation contains about one fixed point.
The OGF of the expected number of cycles of length less than or equal to m is therefore
where Hm is the mth harmonic number. Hence the expected number of cycles of length at most m in a random permutation is about ln m.
The mixed GF of the set of permutations by the number of fixed points is
Let the random variable X be the number of fixed points of a random permutation. Using Stirling numbers of the second kind, we have the following formula for the mth moment of X:
where is a falling factorial. Using , we have
which is zero when , and one otherwise. Hence only terms with contribute to the sum. This yields
Suppose you pick a random permutation and raise it to some power , with a positive integer and ask about the expected number of fixed points in the result. Denote this value by .
For every divisor of a cycle of length splits into fixed points when raised to the power Hence we need to mark these cycles with To illustrate this consider
We get
which is
Once more continuing as described in the introduction, we find
which is
The conclusion is that for and there are four fixed points on average.
The general procedure is
Once more continuing as before, we find
We have shown that the value of is equal to (the number of divisors of ) as soon as It starts out at for and increases by one every time hits a divisor of up to and including itself.
We construct the bivariate generating function using , where is one for all cycles (every cycle contributes one to the total number of cycles).
Note that has the closed form
and generates the unsigned Stirling numbers of the first kind.
We have
Hence the expected number of cycles is the harmonic number , or about .
(Note that Section One hundred prisoners contains exactly the same problem with a very similar calculation, plus also a simpler elementary proof.)
Once more, start with the exponential generating function , this time of the class of permutations according to size where cycles of length more than are marked with the variable :
There can only be one cycle of length more than , hence the answer to the question is given by
or
which is
The exponent of in the term being raised to the power is larger than and hence no value for can possibly contribute to
It follows that the answer is
The sum has an alternate representation that one encounters e.g. in the OEIS OEIS: A024167 .
finally giving
We can use the disjoint cycle decomposition of a permutation to factorize it as a product of transpositions by replacing a cycle of length k by k − 1 transpositions. E.g. the cycle factors as . The function for cycles is equal to and we obtain
and
Hence the expected number of transpositions is
where is the Harmonic number. We could also have obtained this formula by noting that the number of transpositions is obtained by adding the lengths of all cycles (which gives n) and subtracting one for every cycle (which gives by the previous section).
Note that again generates the unsigned Stirling numbers of the first kind, but in reverse order. More precisely, we have
To see this, note that the above is equivalent to
and that
which we saw to be the EGF of the unsigned Stirling numbers of the first kind in the section on permutations consisting of precisely m cycles.
We select a random element q of a random permutation and ask about the expected size of the cycle that contains q. Here the function is equal to , because a cycle of length k contributes k elements that are on cycles of length k. Note that unlike the previous computations, we need to average out this parameter after we extract it from the generating function (divide by n). We have
Hence the expected length of the cycle that contains q is
This average parameter represents the probability that if we again select a random element of of a random permutation, the element lies on a cycle of size m. The function is equal to for and zero otherwise, because only cycles of length m contribute, namely m elements that lie on a cycle of length m. We have
It follows that the probability that a random element lies on a cycle of length m is
Select a random subset Q of [n] containing m elements and a random permutation, and ask about the probability that all elements of Q lie on the same cycle. This is another average parameter. The function b(k) is equal to , because a cycle of length k contributes subsets of size m, where for k < m. This yields
Averaging out we obtain that the probability of the elements of Q being on the same cycle is
or
In particular, the probability that two elements p < q are on the same cycle is 1/2.
We may use the Flajolet–Sedgewick fundamental theorem directly and compute more advanced permutation statistics. (Check that page for an explanation of how the operators we will use are computed.) For example, the set of permutations containing an even number of even cycles is given by
Translating to exponential generating functions (EGFs), we obtain
or
This simplifies to
or
This says that there is one permutation of size zero containing an even number of even cycles (the empty permutation, which contains zero cycles of even length), one such permutation of size one (the fixed point, which also contains zero cycles of even length), and that for , there are such permutations.
Consider what happens when we square a permutation. Fixed points are mapped to fixed points. Odd cycles are mapped to odd cycles in a one-to-one correspondence, e.g. turns into . Even cycles split in two and produce a pair of cycles of half the size of the original cycle, e.g. turns into . Hence permutations that are squares may contain any number of odd cycles, and an even number of cycles of size two, an even number of cycles of size four etc., and are given by
which yields the EGF
The types of permutations presented in the preceding two sections, i.e. permutations containing an even number of even cycles and permutations that are squares, are examples of so-called odd cycle invariants, studied by Sung and Zhang (see external links). The term odd cycle invariant simply means that membership in the respective combinatorial class is independent of the size and number of odd cycles occurring in the permutation. In fact we can prove that all odd cycle invariants obey a simple recurrence, which we will derive. First, here are some more examples of odd cycle invariants.
This class has the specification
and the generating function
The first few values are
This class has the specification
and the generating function
There is a semantic nuance here. We could consider permutations containing no even cycles as belonging to this class, since zero is even. The first few values are
This class has the specification
and the generating function
The first few values are
Observe carefully how the specifications of the even cycle component are constructed. It is best to think of them in terms of parse trees. These trees have three levels. The nodes at the lowest level represent sums of products of even-length cycles of the singleton . The nodes at the middle level represent restrictions of the set operator. Finally the node at the top level sums products of contributions from the middle level. Note that restrictions of the set operator, when applied to a generating function that is even, will preserve this feature, i.e. produce another even generating function. But all the inputs to the set operators are even since they arise from even-length cycles. The result is that all generating functions involved have the form
where is an even function. This means that
is even, too, and hence
Letting and extracting coefficients, we find that
which yields the recurrence
A link to the Putnam competition website appears in the section External links. The problem asks for a proof that
where the sum is over all permutations of , is the sign of , i.e. if is even and if is odd, and is the number of fixed points of .
Now the sign of is given by
where the product is over all cycles c of , as explained e.g. on the page on even and odd permutations.
Hence we consider the combinatorial class
where marks one minus the length of a contributing cycle, and marks fixed points. Translating to generating functions, we obtain
or
Now we have
and hence the desired quantity is given by
Doing the computation, we obtain
or
Extracting coefficients, we find that the coefficient of is zero. The constant is one, which does not agree with the formula (should be zero). For positive, however, we obtain
or
which is the desired result.
As an interesting aside, we observe that may be used to evaluate the following determinant of an matrix:
where . Recall the formula for the determinant:
Now the value of the product on the right for a permutation is , where f is the number of fixed points of . Hence
which yields
and finally
Here we seek to show that this difference is given by
Recall that the sign of a permutation is given by
where the product ranges over the cycles c from the disjoint cycle composition of .
It follows that the combinatorial species that reflects the signs and the cycle count of the set of permutations is given by
where we have used to mark signs and for the cycle count.
Translating to generating functions we have
This simplifies to
which is
Now the two generating functions and of even and odd permutations by cycle count are given by
and
We require the quantity
which is
Finally, extracting coefficients from this generating function, we obtain
which is
which is in turn
This concludes the proof.
Similar statistics are available for random endomorphisms on a finite set. [3] [4]
In complex analysis, an entire function, also called an integral function, is a complex-valued function that is holomorphic on the whole complex plane. Typical examples of entire functions are polynomials and the exponential function, and any finite sums, products and compositions of these, such as the trigonometric functions sine and cosine and their hyperbolic counterparts sinh and cosh, as well as derivatives and integrals of entire functions such as the error function. If an entire function has a root at , then , taking the limit value at , is an entire function. On the other hand, the natural logarithm, the reciprocal function, and the square root are all not entire functions, nor can they be continued analytically to an entire function.
In probability theory and statistics, the exponential distribution or negative exponential distribution is the probability distribution of the time between events in a Poisson point process, i.e., a process in which events occur continuously and independently at a constant average rate. It is a particular case of the gamma distribution. It is the continuous analogue of the geometric distribution, and it has the key property of being memoryless. In addition to being used for the analysis of Poisson point processes it is found in various other contexts.
In complex analysis, a branch of mathematics, analytic continuation is a technique to extend the domain of definition of a given analytic function. Analytic continuation often succeeds in defining further values of a function, for example in a new region where the infinite series representation which initially defined the function becomes divergent.
In mathematics, a Dirichlet series is any series of the form
In mathematics, the exponential integral Ei is a special function on the complex plane.
In probability theory, a Chernoff bound is an exponentially decreasing upper bound on the tail of a random variable based on its moment generating function. The minimum of all such exponential bounds forms the Chernoff or Chernoff-Cramér bound, which may decay faster than exponential. It is especially useful for sums of independent random variables, such as sums of Bernoulli random variables.
In combinatorics, the symbolic method is a technique for counting combinatorial objects. It uses the internal structure of the objects to derive formulas for their generating functions. The method is mostly associated with Philippe Flajolet and is detailed in Part A of his book with Robert Sedgewick, Analytic Combinatorics, while the rest of the book explains how to use complex analysis in order to get asymptotic and probabilistic results on the corresponding generating functions.
In mathematical statistics, the Kullback–Leibler divergence, denoted , is a type of statistical distance: a measure of how one probability distribution P is different from a second, reference probability distribution Q. A simple interpretation of the KL divergence of P from Q is the expected excess surprise from using Q as a model when the actual distribution is P. While it is a distance, it is not a metric, the most familiar type of distance: it is not symmetric in the two distributions, and does not satisfy the triangle inequality. Instead, in terms of information geometry, it is a type of divergence, a generalization of squared distance, and for certain classes of distributions, it satisfies a generalized Pythagorean theorem.
Variational Bayesian methods are a family of techniques for approximating intractable integrals arising in Bayesian inference and machine learning. They are typically used in complex statistical models consisting of observed variables as well as unknown parameters and latent variables, with various sorts of relationships among the three types of random variables, as might be described by a graphical model. As typical in Bayesian inference, the parameters and latent variables are grouped together as "unobserved variables". Variational Bayesian methods are primarily used for two purposes:
In statistics and information theory, a maximum entropy probability distribution has entropy that is at least as great as that of all other members of a specified class of probability distributions. According to the principle of maximum entropy, if nothing is known about a distribution except that it belongs to a certain class, then the distribution with the largest entropy should be chosen as the least-informative default. The motivation is twofold: first, maximizing entropy minimizes the amount of prior information built into the distribution; second, many physical systems tend to move towards maximal entropy configurations over time.
In mathematics, especially in combinatorics, Stirling numbers of the first kind arise in the study of permutations. In particular, the Stirling numbers of the first kind count permutations according to their number of cycles.
Expected shortfall (ES) is a risk measure—a concept used in the field of financial risk measurement to evaluate the market risk or credit risk of a portfolio. The "expected shortfall at q% level" is the expected return on the portfolio in the worst of cases. ES is an alternative to value at risk that is more sensitive to the shape of the tail of the loss distribution.
The use of exponential generating functions (EGFs) to study the properties of Stirling numbers is a classical exercise in combinatorial mathematics and possibly the canonical example of how symbolic combinatorics is used. It also illustrates the parallels in the construction of these two types of numbers, lending support to the binomial-style notation that is used for them.
Discrepancy of hypergraphs is an area of discrepancy theory.
The exponential mechanism is a technique for designing differentially private algorithms. It was developed by Frank McSherry and Kunal Talwar in 2007. Their work was recognized as a co-winner of the 2009 PET Award for Outstanding Research in Privacy Enhancing Technologies.
A product distribution is a probability distribution constructed as the distribution of the product of random variables having two other known distributions. Given two statistically independent random variables X and Y, the distribution of the random variable Z that is formed as the product is a product distribution.
In probability theory, concentration inequalities provide bounds on how a random variable deviates from some value. The law of large numbers of classical probability theory states that sums of independent random variables are, under very mild conditions, close to their expectation with a large probability. Such sums are the most basic examples of random variables concentrated around their mean. Recent results show that such behavior is shared by other functions of independent random variables.
For certain applications in linear algebra, it is useful to know properties of the probability distribution of the largest eigenvalue of a finite sum of random matrices. Suppose is a finite sequence of random matrices. Analogous to the well-known Chernoff bound for sums of scalars, a bound on the following is sought for a given parameter t:
In mathematics, a transformation of a sequence's generating function provides a method of converting the generating function for one sequence into a generating function enumerating another. These transformations typically involve integral formulas applied to a sequence generating function or weighted sums over the higher-order derivatives of these functions.
In number theory, the prime omega functions and count the number of prime factors of a natural number Thereby counts each distinct prime factor, whereas the related function counts the total number of prime factors of honoring their multiplicity. That is, if we have a prime factorization of of the form for distinct primes , then the respective prime omega functions are given by and . These prime factor counting functions have many important number theoretic relations.