Permanent (mathematics)


In linear algebra, the permanent of a square matrix is a function of the matrix similar to the determinant. The permanent, as well as the determinant, is a polynomial in the entries of the matrix. [1] Both are special cases of a more general function of a matrix called the immanant.


Definition

The permanent of an n×n matrix A = (a_{i,j}) is defined as

$$\operatorname{perm}(A) = \sum_{\sigma \in S_n} \prod_{i=1}^{n} a_{i,\sigma(i)}.$$

The sum here extends over all elements σ of the symmetric group S_n; i.e. over all permutations of the numbers 1, 2, ..., n.

For example,

$$\operatorname{perm}\begin{pmatrix} a & b \\ c & d \end{pmatrix} = ad + bc,$$

and

$$\operatorname{perm}\begin{pmatrix} a & b & c \\ d & e & f \\ g & h & i \end{pmatrix} = aei + bfg + cdh + ceg + bdi + afh.$$

The definition of the permanent of A differs from that of the determinant of A in that the signatures of the permutations are not taken into account.
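
To make the definition concrete, here is a minimal Python sketch (the helper name perm_naive is ours, not a standard library routine) that evaluates the permanent by brute force over all n! permutations; it is practical only for small n.

```python
from itertools import permutations
from math import prod

def perm_naive(A):
    """Permanent of a square matrix A (list of rows), straight from the definition."""
    n = len(A)
    return sum(prod(A[i][s[i]] for i in range(n)) for s in permutations(range(n)))

print(perm_naive([[1, 2], [3, 4]]))  # ad + bc = 1*4 + 2*3 = 10
```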

The permanent of a matrix A is denoted per A, perm A, or Per A, sometimes with parentheses around the argument. Minc uses Per(A) for the permanent of rectangular matrices, and per(A) when A is a square matrix. [2] Muir and Metzler use the notation $\overset{+}{\vert}\ \ \overset{+}{\vert}$. [3]

The word "permanent" originated with Cauchy in 1812 as "fonctions symétriques permanentes" for a related type of function, [4] and was used by Muir and Metzler [5] in the modern, more specific sense. [6]

Properties

If one views the permanent as a map that takes n vectors as arguments, then it is a multilinear map and it is symmetric (meaning that any order of the vectors results in the same permanent). Furthermore, given a square matrix $A = (a_{ij})$ of order n: [7]

- perm(A) is invariant under arbitrary permutations of the rows and columns of A; symbolically, perm(A) = perm(PAQ) for any permutation matrices P and Q of order n,
- multiplying any single row or column of A by a scalar s changes perm(A) to s·perm(A),
- perm(A) is invariant under transposition, that is, perm(A) = perm(A^T).

Relation to determinants

Laplace's expansion by minors for computing the determinant along a row, column or diagonal extends to the permanent by ignoring all signs. [9]

For every $i \in \{1, 2, \dots, n\}$,

$$\operatorname{perm}(B) = \sum_{j=1}^{n} b_{i,j} M_{i,j},$$

where $b_{i,j}$ is the entry of the ith row and the jth column of B, and $M_{i,j}$ is the permanent of the submatrix obtained by removing the ith row and the jth column of B.

For example, expanding the 3×3 permanent above along the first column,

$$\operatorname{perm}(A) = a(ei + fh) + d(bi + ch) + g(bf + ce),$$

while expanding along the last row gives

$$\operatorname{perm}(A) = g(bf + ce) + h(af + cd) + i(ae + bd);$$

both agree with the six-term expansion given earlier.
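
The sign-free expansion translates directly into a short recursive routine; this is an illustrative sketch (perm_expand is a name of ours), not an efficient method, since it still does on the order of n! work.

```python
def perm_expand(A):
    """Permanent via sign-free expansion by minors along the first row."""
    n = len(A)
    if n == 1:
        return A[0][0]
    # Sum a_{1j} times the permanent of the minor with row 1 and column j removed.
    return sum(A[0][j] * perm_expand([row[:j] + row[j+1:] for row in A[1:]])
               for j in range(n))

print(perm_expand([[1, 2], [3, 4]]))  # 10, matching the definition
```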

On the other hand, the basic multiplicative property of determinants is not valid for permanents. [10] A simple example shows that this is so: take $A = B = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}$; then $\operatorname{perm}(A)\operatorname{perm}(B) = 2 \cdot 2 = 4$, while $\operatorname{perm}(AB) = \operatorname{perm}\begin{pmatrix} 2 & 2 \\ 2 & 2 \end{pmatrix} = 8$.

Unlike the determinant, the permanent has no easy geometrical interpretation; it is mainly used in combinatorics, in treating boson Green's functions in quantum field theory, and in determining state probabilities of boson sampling systems. [11] However, it has two graph-theoretic interpretations: as the sum of weights of cycle covers of a directed graph, and as the sum of weights of perfect matchings in a bipartite graph.

Applications

Symmetric tensors

The permanent arises naturally in the study of the symmetric tensor power of Hilbert spaces. [12] In particular, for a Hilbert space $H$, let $\vee^k H$ denote the kth symmetric tensor power of $H$, which is the space of symmetric tensors. Note in particular that $\vee^k H$ is spanned by the symmetric products of elements in $H$. For $x_1, x_2, \dots, x_k \in H$, we define the symmetric product of these elements by

$$x_1 \vee x_2 \vee \cdots \vee x_k = (k!)^{-1/2} \sum_{\sigma \in S_k} x_{\sigma(1)} \otimes x_{\sigma(2)} \otimes \cdots \otimes x_{\sigma(k)}.$$

If we consider $\vee^k H$ (as a subspace of $\otimes^k H$, the kth tensor power of $H$) and define the inner product on $\vee^k H$ accordingly, we find that for $x_j, y_j \in H$

$$\langle x_1 \vee x_2 \vee \cdots \vee x_k, y_1 \vee y_2 \vee \cdots \vee y_k \rangle = \operatorname{perm}\left[\langle x_i, y_j \rangle\right]_{i,j=1}^{k}.$$

Applying the Cauchy–Schwarz inequality, we find that $\operatorname{perm}\left[\langle x_i, x_j \rangle\right]_{i,j=1}^{k} \geq 0$, and that

$$\left|\operatorname{perm}\left[\langle x_i, y_j \rangle\right]_{i,j=1}^{k}\right|^2 \leq \operatorname{perm}\left[\langle x_i, x_j \rangle\right]_{i,j=1}^{k} \cdot \operatorname{perm}\left[\langle y_i, y_j \rangle\right]_{i,j=1}^{k}.$$

Cycle covers

Any square matrix $A = (a_{ij})$ can be viewed as the adjacency matrix of a weighted directed graph on vertex set $V = \{1, 2, \dots, n\}$, with $a_{ij}$ representing the weight of the arc from vertex i to vertex j. A cycle cover of a weighted directed graph is a collection of vertex-disjoint directed cycles in the digraph that covers all vertices in the graph. Thus, each vertex i in the digraph has a unique "successor" $\sigma(i)$ in the cycle cover, and so $\sigma$ represents a permutation on V. Conversely, any permutation $\sigma$ on V corresponds to a cycle cover with arcs from each vertex i to vertex $\sigma(i)$.

If the weight of a cycle cover is defined to be the product of the weights of the arcs in each cycle, then

$$\operatorname{weight}(\sigma) = \prod_{i=1}^{n} a_{i,\sigma(i)},$$

implying that

$$\operatorname{perm}(A) = \sum_{\sigma} \operatorname{weight}(\sigma).$$

Thus the permanent of A is equal to the sum of the weights of all cycle covers of the digraph.
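
The permutation/cycle-cover correspondence can be made explicit in a few lines; a small sketch (cycles is our own helper) that lists each cycle cover of a 2-vertex digraph together with its weight:

```python
from itertools import permutations
from math import prod

def cycles(s):
    """Decompose the permutation i -> s[i] into its cycles."""
    seen, out = set(), []
    for start in range(len(s)):
        if start not in seen:
            cyc, i = [], start
            while i not in seen:
                seen.add(i)
                cyc.append(i)
                i = s[i]
            out.append(tuple(cyc))
    return out

A = [[1, 2], [3, 4]]  # arc weights a_ij of a 2-vertex digraph (loops allowed)
for s in permutations(range(2)):
    weight = prod(A[i][s[i]] for i in range(2))
    print(cycles(s), weight)  # two self-loops (weight 4), one 2-cycle (weight 6); total 10 = perm(A)
```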

Perfect matchings

A square matrix $A = (a_{ij})$ can also be viewed as the adjacency matrix of a bipartite graph which has vertices $x_1, x_2, \dots, x_n$ on one side and $y_1, y_2, \dots, y_n$ on the other side, with $a_{ij}$ representing the weight of the edge from vertex $x_i$ to vertex $y_j$. If the weight of a perfect matching $\sigma$ that matches $x_i$ to $y_{\sigma(i)}$ is defined to be the product of the weights of the edges in the matching, then

$$\operatorname{perm}(A) = \sum_{\sigma} \operatorname{weight}(\sigma).$$

Thus the permanent of A is equal to the sum of the weights of all perfect matchings of the graph.
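
For a (0, 1) biadjacency matrix the weights are all 1, so the permanent simply counts perfect matchings. A small self-contained sketch (perm is the same brute-force helper as above), using the complete bipartite graph K_{2,2}, which has exactly two perfect matchings:

```python
from itertools import permutations
from math import prod

def perm(A):
    n = len(A)
    return sum(prod(A[i][s[i]] for i in range(n)) for s in permutations(range(n)))

K22 = [[1, 1],
       [1, 1]]  # every x_i adjacent to every y_j
print(perm(K22))  # 2 perfect matchings: {x1y1, x2y2} and {x1y2, x2y1}
```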

Permanents of (0, 1) matrices

Enumeration

The answers to many counting questions can be computed as permanents of matrices that only have 0 and 1 as entries.

Let Ω(n,k) be the class of all (0, 1)-matrices of order n with each row and column sum equal to k. Every matrix A in this class has perm(A) > 0. [13] The incidence matrices of projective planes are in the class Ω(n^2 + n + 1, n + 1) for n an integer > 1. The permanents corresponding to the smallest projective planes have been calculated. For n = 2, 3, and 4 the values are 24, 3852 and 18,534,400 respectively. [13] Let Z be the incidence matrix of the projective plane with n = 2, the Fano plane. Remarkably, perm(Z) = 24 = |det(Z)|, the absolute value of the determinant of Z. This is a consequence of Z being a circulant matrix and the theorem: [14]

If A is a circulant matrix in the class Ω(n,k) then if k > 3, perm(A) > |det(A)| and if k = 3, perm(A) = |det(A)|. Furthermore, when k = 3, by permuting rows and columns, A can be put into the form of a direct sum of e copies of the matrix Z and consequently, n = 7e and perm(A) = 24^e.
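
The quoted value for the Fano plane is easy to check numerically. Below is a sketch using one standard labeling of the plane's seven lines (rows) and seven points (columns); the labeling is our choice, but any incidence matrix of the plane gives the same permanent, since the permanent is invariant under row and column permutations.

```python
from itertools import permutations
from math import prod

def perm(A):
    n = len(A)
    return sum(prod(A[i][s[i]] for i in range(n)) for s in permutations(range(n)))

# Rows are the lines {1,2,3}, {1,4,5}, {1,6,7}, {2,4,6}, {2,5,7}, {3,4,7}, {3,5,6}.
Z = [[1, 1, 1, 0, 0, 0, 0],
     [1, 0, 0, 1, 1, 0, 0],
     [1, 0, 0, 0, 0, 1, 1],
     [0, 1, 0, 1, 0, 1, 0],
     [0, 1, 0, 0, 1, 0, 1],
     [0, 0, 1, 1, 0, 0, 1],
     [0, 0, 1, 0, 1, 1, 0]]
print(perm(Z))  # 24
```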

Permanents can also be used to calculate the number of permutations with restricted (prohibited) positions. For the standard n-set {1, 2, ..., n}, let $A = (a_{ij})$ be the (0, 1)-matrix where $a_{ij} = 1$ if i → j is allowed in a permutation and $a_{ij} = 0$ otherwise. Then perm(A) is equal to the number of permutations of the n-set that satisfy all the restrictions. [9] Two well-known special cases of this are the solution of the derangement problem and the ménage problem: the number of permutations of an n-set with no fixed points (derangements) is given by

$$D_n = \operatorname{perm}(J - I),$$

where J is the n×n all-ones matrix and I is the identity matrix, and the ménage numbers are given by

$$U_n = \operatorname{perm}(J - I - I'),$$

where I′ is the (0, 1)-matrix with nonzero entries in positions (i, i + 1) and (n, 1).
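
Both counts are easy to reproduce for small n with the brute-force permanent; a sketch (derangements and menage are helper names of ours) building J − I and J − I − I′ explicitly:

```python
from itertools import permutations
from math import prod

def perm(A):
    n = len(A)
    return sum(prod(A[i][s[i]] for i in range(n)) for s in permutations(range(n)))

def derangements(n):
    # J - I: forbid the diagonal, i.e. fixed points.
    return perm([[0 if i == j else 1 for j in range(n)] for i in range(n)])

def menage(n):
    # J - I - I': additionally forbid positions (i, i+1), indices taken mod n.
    return perm([[0 if j in (i, (i + 1) % n) else 1 for j in range(n)] for i in range(n)])

print([derangements(n) for n in range(2, 7)])  # [1, 2, 9, 44, 265]
print([menage(n) for n in range(3, 7)])        # [1, 2, 13, 80]
```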

Bounds

The Bregman–Minc inequality, conjectured by H. Minc in 1963 [15] and proved by L. M. Brégman in 1973, [16] gives an upper bound for the permanent of an n × n (0, 1)-matrix. If A has $r_i$ ones in row i for each 1 ≤ i ≤ n, the inequality states that

$$\operatorname{perm} A \leq \prod_{i=1}^{n} (r_i!)^{1/r_i}.$$
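
A quick numerical sanity check of the bound on a small (0, 1)-matrix (the matrix is an arbitrary example of ours):

```python
from itertools import permutations
from math import prod, factorial

def perm(A):
    n = len(A)
    return sum(prod(A[i][s[i]] for i in range(n)) for s in permutations(range(n)))

A = [[1, 1, 0],
     [1, 1, 1],
     [0, 1, 1]]  # row sums r = (2, 3, 2)
bound = prod(factorial(sum(row)) ** (1.0 / sum(row)) for row in A)
print(perm(A), bound)  # 3 <= about 3.63
```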

Van der Waerden's conjecture

In 1926, van der Waerden conjectured that the minimum permanent among all n×n doubly stochastic matrices is $n!/n^n$, achieved by the matrix for which all entries are equal to 1/n. [17] Proofs of this conjecture were published in 1980 by B. Gyires [18] and in 1981 by G. P. Egorychev [19] and D. I. Falikman; [20] Egorychev's proof is an application of the Alexandrov–Fenchel inequality. [21] For this work, Egorychev and Falikman won the Fulkerson Prize in 1982. [22]

Computation

The naïve approach of computing permanents directly from the definition is computationally infeasible even for relatively small matrices. One of the fastest known algorithms is due to H. J. Ryser. [23] Ryser's method is based on an inclusion–exclusion formula that can be given [24] as follows: Let $A_k$ be obtained from A by deleting k columns, let $P(A_k)$ be the product of the row-sums of $A_k$, and let $\Sigma_k$ be the sum of the values of $P(A_k)$ over all possible $A_k$. Then

$$\operatorname{perm}(A) = \sum_{k=0}^{n-1} (-1)^k \Sigma_k.$$

It may be rewritten in terms of the matrix entries as follows:

$$\operatorname{perm}(A) = (-1)^n \sum_{S \subseteq \{1, \dots, n\}} (-1)^{|S|} \prod_{i=1}^{n} \sum_{j \in S} a_{ij}.$$
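
The entry-wise form of Ryser's formula leads to the following sketch (perm_ryser is a name of ours), which runs in O(2^n n^2) time, a large improvement over the n! terms of the definition:

```python
from itertools import combinations
from math import prod

def perm_ryser(A):
    """Ryser's inclusion-exclusion formula, summing over column subsets S."""
    n = len(A)
    total = 0
    for r in range(1, n + 1):
        for S in combinations(range(n), r):
            # Product over rows of the row-sum restricted to the columns in S.
            total += (-1) ** r * prod(sum(A[i][j] for j in S) for i in range(n))
    return (-1) ** n * total

print(perm_ryser([[1, 2], [3, 4]]))  # 10, agreeing with the definition
```

Ryser's method as usually implemented visits the subsets in Gray-code order so that each row sum is updated incrementally, saving a further factor of n; the version above favors clarity over that optimization.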

The permanent is believed to be more difficult to compute than the determinant. While the determinant can be computed in polynomial time by Gaussian elimination, Gaussian elimination cannot be used to compute the permanent. Moreover, computing the permanent of a (0,1)-matrix is #P-complete. Thus, if the permanent can be computed in polynomial time by any method, then FP = #P, which is an even stronger statement than P = NP. When the entries of A are nonnegative, however, the permanent can be computed approximately in probabilistic polynomial time, up to an error of $\varepsilon M$, where $M$ is the value of the permanent and $\varepsilon > 0$ is arbitrary. [25] The permanent of a certain set of positive semidefinite matrices is NP-hard to approximate within any subexponential factor. [26] If further conditions on the spectrum are imposed, the permanent can be approximated in probabilistic polynomial time: the best achievable error of this approximation is $\varepsilon \sqrt{M}$ ($M$ is again the value of the permanent). [27] The hardness in these instances is closely linked with the difficulty of simulating boson sampling experiments.

MacMahon's master theorem

Another way to view permanents is via multivariate generating functions. Let $A = (a_{ij})$ be a square matrix of order n. Consider the multivariate generating function

$$F(x_1, x_2, \dots, x_n) = \prod_{i=1}^{n} \left( \sum_{j=1}^{n} a_{ij} x_j \right).$$

The coefficient of $x_1 x_2 \cdots x_n$ in $F(x_1, x_2, \dots, x_n)$ is perm(A). [28]
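
This coefficient extraction can be checked symbolically; a small sketch using the third-party sympy library (the variable names are ours):

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
A = [[1, 2], [3, 4]]
# F = (a11*x1 + a12*x2) * (a21*x1 + a22*x2)
F = sp.expand((A[0][0]*x1 + A[0][1]*x2) * (A[1][0]*x1 + A[1][1]*x2))
print(sp.Poly(F, x1, x2).coeff_monomial(x1 * x2))  # 10 = perm(A)
```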

As a generalization, for any sequence of n non-negative integers $s_1, s_2, \dots, s_n$, define

$$\operatorname{perm}^{(s_1, s_2, \dots, s_n)}(A)$$

as the coefficient of $x_1^{s_1} x_2^{s_2} \cdots x_n^{s_n}$ in

$$\left( \sum_{j=1}^{n} a_{1j} x_j \right)^{s_1} \left( \sum_{j=1}^{n} a_{2j} x_j \right)^{s_2} \cdots \left( \sum_{j=1}^{n} a_{nj} x_j \right)^{s_n}.$$

MacMahon's master theorem relating permanents and determinants is: [29]

$$\operatorname{perm}^{(s_1, s_2, \dots, s_n)}(A) = \text{coefficient of } x_1^{s_1} x_2^{s_2} \cdots x_n^{s_n} \text{ in } \frac{1}{\det(I - XA)},$$

where I is the order n identity matrix and X is the diagonal matrix with diagonal $[x_1, x_2, \dots, x_n]$.

Rectangular matrices

The permanent function can be generalized to apply to non-square matrices. Indeed, several authors make this the definition of a permanent and consider the restriction to square matrices a special case. [30] Specifically, for an m × n matrix $A = (a_{ij})$ with m ≤ n, define

$$\operatorname{perm}(A) = \sum_{\sigma \in \operatorname{P}(n,m)} a_{1,\sigma(1)} a_{2,\sigma(2)} \cdots a_{m,\sigma(m)},$$

where P(n,m) is the set of all m-permutations of the n-set {1,2,...,n}. [31]
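
In Python, itertools.permutations(range(n), m) enumerates exactly the m-permutations P(n,m), so the rectangular permanent is a one-line change from the square case; a minimal sketch (perm_rect is a name of ours):

```python
from itertools import permutations
from math import prod

def perm_rect(A):
    """Permanent of an m x n matrix with m <= n, summing over m-permutations of columns."""
    m, n = len(A), len(A[0])
    return sum(prod(A[i][s[i]] for i in range(m)) for s in permutations(range(n), m))

print(perm_rect([[1, 1, 1],
                 [1, 1, 1]]))  # 6 = 3*2 injections of 2 rows into 3 columns
```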

Ryser's computational result for permanents also generalizes. If A is an m × n matrix with m ≤ n, let $A_k$ be obtained from A by deleting k columns, let $P(A_k)$ be the product of the row-sums of $A_k$, and let $\Sigma_k$ be the sum of the values of $P(A_k)$ over all possible $A_k$. Then [10]

$$\operatorname{perm}(A) = \sum_{k=0}^{m-1} (-1)^k \binom{n-m+k}{k} \Sigma_{n-m+k}.$$

Systems of distinct representatives

The generalization of the definition of a permanent to non-square matrices allows the concept to be used in a more natural way in some applications. For instance:

Let S_1, S_2, ..., S_m be subsets (not necessarily distinct) of an n-set with m ≤ n. The incidence matrix of this collection of subsets is an m × n (0,1)-matrix A. The number of systems of distinct representatives (SDRs) of this collection is perm(A). [32]
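
As an illustration, a sketch counting the SDRs of the collection S_1 = {1, 2}, S_2 = {2, 3} (an example of our choosing) via the rectangular permanent of its 2 × 3 incidence matrix:

```python
from itertools import permutations
from math import prod

def perm_rect(A):
    m, n = len(A), len(A[0])
    return sum(prod(A[i][s[i]] for i in range(m)) for s in permutations(range(n), m))

A = [[1, 1, 0],   # S1 = {1, 2}
     [0, 1, 1]]   # S2 = {2, 3}
print(perm_rect(A))  # 3 SDRs: (1,2), (1,3), (2,3)
```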

Notes

  1. Marcus, Marvin; Minc, Henryk (1965). "Permanents". Amer. Math. Monthly. 72 (6): 577–591. doi:10.2307/2313846. JSTOR 2313846.
  2. Minc (1978)
  3. Muir & Metzler (1960)
  4. Cauchy, A. L. (1815), "Mémoire sur les fonctions qui ne peuvent obtenir que deux valeurs égales et de signes contraires par suite des transpositions opérées entre les variables qu'elles renferment.", Journal de l'École Polytechnique, 10: 91–169
  5. Muir & Metzler (1960)
  6. van Lint & Wilson 2001, p. 108
  7. Ryser 1963, pp. 25–26
  8. Percus 1971, p. 2
  9. Percus 1971, p. 12
  10. Ryser 1963, p. 26
  11. Aaronson, Scott (14 Nov 2010). "The Computational Complexity of Linear Optics". arXiv:1011.3245 [quant-ph].
  12. Bhatia, Rajendra (1997). Matrix Analysis. New York: Springer-Verlag. pp. 16–19. ISBN 978-0-387-94846-1.
  13. Ryser 1963, p. 124
  14. Ryser 1963, p. 125
  15. Minc, Henryk (1963), "Upper bounds for permanents of (0,1)-matrices", Bulletin of the American Mathematical Society, 69 (6): 789–791, doi:10.1090/s0002-9904-1963-11031-9
  16. van Lint & Wilson 2001, p. 101
  17. van der Waerden, B. L. (1926), "Aufgabe 45", Jber. Deutsch. Math.-Verein., 35: 117.
  18. Gyires, B. (1980), "The common source of several inequalities concerning doubly stochastic matrices", Publicationes Mathematicae Institutum Mathematicum Universitatis Debreceniensis, 27 (3–4): 291–304, doi:10.5486/PMD.1980.27.3-4.15, MR 0604006.
  19. Egoryčev, G. P. (1980), Reshenie problemy van-der-Vardena dlya permanentov (in Russian), Krasnoyarsk: Akad. Nauk SSSR Sibirsk. Otdel. Inst. Fiz., p. 12, MR 0602332. Egorychev, G. P. (1981), "Proof of the van der Waerden conjecture for permanents", Akademiya Nauk SSSR (in Russian), 22 (6): 65–71, 225, MR 0638007. Egorychev, G. P. (1981), "The solution of van der Waerden's problem for permanents", Advances in Mathematics, 42 (3): 299–305, doi:10.1016/0001-8708(81)90044-X, MR 0642395.
  20. Falikman, D. I. (1981), "Proof of the van der Waerden conjecture on the permanent of a doubly stochastic matrix", Akademiya Nauk Soyuza SSR (in Russian), 29 (6): 931–938, 957, MR 0625097.
  21. Brualdi (2006), p. 487
  22. Fulkerson Prize, Mathematical Optimization Society, retrieved 2012-08-19.
  23. Ryser (1963, p. 27)
  24. van Lint & Wilson (2001), p. 99
  25. Jerrum, M.; Sinclair, A.; Vigoda, E. (2004), "A polynomial-time approximation algorithm for the permanent of a matrix with nonnegative entries", Journal of the ACM, 51 (4): 671–697, CiteSeerX 10.1.1.18.9466, doi:10.1145/1008731.1008738, S2CID 47361920
  26. Meiburg, Alexander (2023). "Inapproximability of Positive Semidefinite Permanents and Quantum State Tomography". Algorithmica. 85 (12): 3828–3854. arXiv:2111.03142. doi:10.1007/s00453-023-01169-1.
  27. Chakhmakhchyan, Levon; Cerf, Nicolas; Garcia-Patron, Raul (2017). "A quantum-inspired algorithm for estimating the permanent of positive semidefinite matrices". Phys. Rev. A. 96 (2): 022329. arXiv:1609.02416. Bibcode:2017PhRvA..96b2329C. doi:10.1103/PhysRevA.96.022329. S2CID 54194194.
  28. Percus 1971, p. 14
  29. Percus 1971, p. 17
  30. In particular, Minc (1978) and Ryser (1963) do this.
  31. Ryser 1963, p. 25
  32. Ryser 1963, p. 54


References

- Brualdi, Richard A. (2006), Combinatorial Matrix Classes, Encyclopedia of Mathematics and Its Applications, vol. 108, Cambridge University Press.
- Minc, Henryk (1978), Permanents, Encyclopedia of Mathematics and Its Applications, vol. 6, Addison–Wesley.
- Muir, Thomas; Metzler, William H. (1960), A Treatise on the Theory of Determinants, Dover.
- Percus, J. K. (1971), Combinatorial Methods, Applied Mathematical Sciences, vol. 4, Springer-Verlag.
- Ryser, Herbert John (1963), Combinatorial Mathematics, Carus Mathematical Monographs, vol. 14, Mathematical Association of America.
- van Lint, J. H.; Wilson, R. M. (2001), A Course in Combinatorics (2nd ed.), Cambridge University Press.