Isserlis' theorem

In probability theory, Isserlis' theorem or Wick's probability theorem is a formula that allows one to compute higher-order moments of the multivariate normal distribution in terms of its covariance matrix. It is named after Leon Isserlis.

This theorem is also particularly important in particle physics, where it is known as Wick's theorem after the work of Wick (1950). [1] Other applications include the analysis of portfolio returns, [2] quantum field theory [3] and generation of colored noise. [4]

Statement

If $(X_1, \dots, X_n)$ is a zero-mean multivariate normal random vector, then
$$\operatorname{E}[X_1 X_2 \cdots X_n] = \sum_{p \in P_n^2} \prod_{\{i,j\} \in p} \operatorname{E}[X_i X_j] = \sum_{p \in P_n^2} \prod_{\{i,j\} \in p} \operatorname{Cov}(X_i, X_j),$$
where the sum is over all the pairings $p \in P_n^2$ of $\{1, \dots, n\}$, i.e. all distinct ways of partitioning $\{1, \dots, n\}$ into pairs $\{i, j\}$, and the product is over the pairs contained in $p$. [5] [6]
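For small $n$ the pairing sum can be evaluated by brute force. The following Python sketch (the function names are illustrative, not from the source) enumerates all pairings and applies the formula:

```python
import itertools  # noqa: F401  (not strictly needed; pairings are built recursively)
import numpy as np

def pairings(indices):
    """Yield all ways of partitioning the sequence `indices` into unordered pairs."""
    if not indices:
        yield []
        return
    first, rest = indices[0], indices[1:]
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        for sub in pairings(remaining):
            yield [(first, partner)] + sub

def isserlis_moment(cov, idx):
    """E[X_{i1} ... X_{im}] for a zero-mean Gaussian vector with covariance `cov`."""
    if len(idx) % 2 == 1:
        return 0.0  # odd moments of a centered Gaussian vanish
    return sum(
        np.prod([cov[i, j] for i, j in p])
        for p in pairings(list(idx))
    )

# Fourth moment E[X1^2 X2^2] for Cov = [[1, r], [r, 1]]: the three pairings
# give Cov11*Cov22 + 2*Cov12^2 = 1 + 2 r^2.
r = 0.5
cov = np.array([[1.0, r], [r, 1.0]])
print(isserlis_moment(cov, (0, 0, 1, 1)))  # 1 + 2 r^2 = 1.5
```

Repeated indices are allowed, so the same routine computes mixed moments such as $\operatorname{E}[X_1^2 X_2^2]$ directly.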

More generally, if $(Z_1, \dots, Z_n)$ is a zero-mean complex-valued multivariate normal random vector, then the formula still holds.

The expression on the right-hand side is also known as the hafnian of the covariance matrix of $(X_1, \dots, X_n)$.

Odd case

If $n$ is odd, there does not exist any pairing of $\{1, \dots, n\}$. Under this hypothesis, Isserlis' theorem implies that $\operatorname{E}[X_1 X_2 \cdots X_n] = 0$. This also follows from the fact that $-X = (-X_1, \dots, -X_n)$ has the same distribution as $X = (X_1, \dots, X_n)$, which implies that $\operatorname{E}[X_1 \cdots X_n] = \operatorname{E}[(-X_1) \cdots (-X_n)] = -\operatorname{E}[X_1 \cdots X_n] = 0$.

Even case

In his original paper, [7] Leon Isserlis proves this theorem by mathematical induction, generalizing the formula for the fourth-order moments, [8] which takes the appearance
$$\operatorname{E}[X_1 X_2 X_3 X_4] = \operatorname{E}[X_1 X_2]\operatorname{E}[X_3 X_4] + \operatorname{E}[X_1 X_3]\operatorname{E}[X_2 X_4] + \operatorname{E}[X_1 X_4]\operatorname{E}[X_2 X_3].$$

If $n = 2m$ is even, there exist $(2m-1)!! = \frac{(2m)!}{2^m m!}$ (see double factorial) pair partitions of $\{1, \dots, 2m\}$: this yields $(2m-1)!!$ terms in the sum. For example, for fourth-order moments (i.e. four random variables) there are three terms. For sixth-order moments there are $3 \times 5 = 15$ terms, and for eighth-order moments there are $3 \times 5 \times 7 = 105$ terms.
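The double-factorial count of pairings can be checked in a few lines of Python (a minimal sketch, not from the source):

```python
from math import prod

def num_pairings(n):
    """Number of pair partitions of {1, ..., n}: (n-1)!! for even n, 0 for odd n."""
    if n % 2:
        return 0
    return prod(range(1, n, 2))  # 1 * 3 * 5 * ... * (n - 1)

for n in (2, 4, 6, 8):
    print(n, num_pairings(n))  # 1, 3, 15, 105
```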

Example

We can evaluate the characteristic function of Gaussians by Isserlis' theorem. For $X \sim \mathcal{N}(0, \sigma^2)$,
$$\operatorname{E}[e^{itX}] = \sum_{n=0}^{\infty} \frac{(it)^n}{n!} \operatorname{E}[X^n] = \sum_{m=0}^{\infty} \frac{(it)^{2m}}{(2m)!} (2m-1)!!\, \sigma^{2m} = \sum_{m=0}^{\infty} \frac{1}{m!} \left( -\frac{\sigma^2 t^2}{2} \right)^m = e^{-\sigma^2 t^2 / 2}.$$
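As a numerical sanity check (illustrative code; the parameter values are arbitrary choices), one can sum the moment series for a few dozen terms and compare it with the closed form $e^{-\sigma^2 t^2/2}$:

```python
from math import exp, factorial, prod

def gaussian_moment(sigma2, n):
    """E[X^n] for X ~ N(0, sigma2): (n-1)!! * sigma2^(n/2) if n is even, else 0."""
    if n % 2:
        return 0.0
    return prod(range(1, n, 2)) * sigma2 ** (n // 2)

def char_fn_series(t, sigma2, terms=40):
    """Partial sum of E[e^{itX}] = sum_n (it)^n E[X^n] / n!; odd terms vanish,
    and i^(2m) = (-1)^m keeps the series real."""
    return sum(
        (-1) ** m * t ** (2 * m) / factorial(2 * m) * gaussian_moment(sigma2, 2 * m)
        for m in range(terms)
    )

t, sigma2 = 0.7, 2.0
print(char_fn_series(t, sigma2), exp(-sigma2 * t ** 2 / 2))  # both ≈ 0.61263
```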

Proof

Since both sides of the formula are multilinear in $(X_1, \dots, X_n)$, if we can prove the real case, we get the complex case for free.

Let $\Sigma$ be the covariance matrix, so that we have the zero-mean multivariate normal random vector $(X_1, \dots, X_n) \sim \mathcal{N}(0, \Sigma)$. Since both sides of the formula are continuous with respect to $\Sigma$, it suffices to prove the case when $\Sigma$ is invertible.

Using the quadratic factorization $-\tfrac{1}{2} x^T \Sigma^{-1} x + t^T x - \tfrac{1}{2} t^T \Sigma t = -\tfrac{1}{2} (x - \Sigma t)^T \Sigma^{-1} (x - \Sigma t)$, we get
$$\frac{1}{\sqrt{(2\pi)^n \det \Sigma}} \int e^{-\frac{1}{2} x^T \Sigma^{-1} x + t^T x} \, dx = e^{\frac{1}{2} t^T \Sigma t}.$$

Differentiate under the integral sign with $\partial_{t_1} \cdots \partial_{t_n} \big|_{t=0}$ to obtain
$$\operatorname{E}[X_1 \cdots X_n] = \partial_{t_1} \cdots \partial_{t_n} \, e^{\frac{1}{2} t^T \Sigma t} \Big|_{t=0}.$$

That is, we need only find the coefficient of the term $t_1 t_2 \cdots t_n$ in the Taylor expansion of $e^{\frac{1}{2} t^T \Sigma t}$.

If $n$ is odd, this is zero. So let $n = 2m$; then we need only find the coefficient of the term $t_1 t_2 \cdots t_{2m}$ in the polynomial $\frac{1}{m!} \left( \frac{1}{2} t^T \Sigma t \right)^m$.

Expanding the polynomial and counting the terms, we obtain the formula.
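For $n = 4$ the coefficient extraction can be carried out symbolically. A small SymPy sketch (illustrative, with a generic symmetric covariance whose entries are named `s12`, `s34`, etc.):

```python
import sympy as sp

t = sp.symbols('t1:5')  # t1, t2, t3, t4
# Generic symmetric 4x4 covariance: entry (i, j) is the symbol s{min+1}{max+1}.
s = sp.Matrix(4, 4, lambda i, j: sp.Symbol(f's{min(i, j) + 1}{max(i, j) + 1}'))
tv = sp.Matrix(t)
quad = (tv.T * s * tv)[0, 0] / 2  # the exponent t^T Sigma t / 2

# Only the quad^2 / 2! term of exp(quad) can contribute a t1*t2*t3*t4 monomial.
poly = sp.expand(quad ** 2 / sp.factorial(2))
coeff = poly.coeff(t[0]).coeff(t[1]).coeff(t[2]).coeff(t[3])
print(coeff)  # s12*s34 + s13*s24 + s14*s23, the three pairings
```

The extracted coefficient is exactly the three-pairing sum of the fourth-order Isserlis formula.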

Generalizations

Gaussian integration by parts

An equivalent formulation of Wick's probability formula is Gaussian integration by parts. If $(X_1, \dots, X_n)$ is a zero-mean multivariate normal random vector, then
$$\operatorname{E}[X_1 f(X_1, \dots, X_n)] = \sum_{i=1}^{n} \operatorname{Cov}(X_1, X_i) \, \operatorname{E}[\partial_{X_i} f(X_1, \dots, X_n)].$$

This is a generalization of Stein's lemma.
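The identity can be checked by Monte Carlo simulation (an illustrative sketch; the test function $f$ and the covariance entries are arbitrary choices, not from the source):

```python
import numpy as np

rng = np.random.default_rng(0)
cov = np.array([[1.0, 0.6], [0.6, 2.0]])
x = rng.multivariate_normal([0.0, 0.0], cov, size=1_000_000)
x1, x2 = x[:, 0], x[:, 1]

f = x1 * np.cos(x2)       # f(X1, X2) = X1 cos(X2)
df1 = np.cos(x2)          # partial f / partial x1
df2 = -x1 * np.sin(x2)    # partial f / partial x2

# Gaussian integration by parts: E[X1 f] = Cov(X1,X1) E[df1] + Cov(X1,X2) E[df2]
lhs = np.mean(x1 * f)
rhs = cov[0, 0] * np.mean(df1) + cov[0, 1] * np.mean(df2)
print(lhs, rhs)  # the two estimates agree up to Monte Carlo error
```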

Wick's probability formula can be recovered by induction, considering the function $f : \mathbb{R}^n \to \mathbb{R}$ defined by $f(x_1, \dots, x_n) = x_2 \cdots x_n$. Among other things, this formulation is important in Liouville conformal field theory to obtain conformal Ward identities, BPZ equations [9] and to prove the Fyodorov–Bouchaud formula. [10]

Non-Gaussian random variables

For non-Gaussian random variables, the moment-cumulants formula [11] replaces Wick's probability formula. If $(X_1, \dots, X_n)$ is a vector of random variables, then
$$\operatorname{E}[X_1 \cdots X_n] = \sum_{p \in P_n} \prod_{b \in p} \kappa\big( (X_i)_{i \in b} \big),$$
where the sum is over all the partitions $p$ of $\{1, \dots, n\}$, the product is over the blocks $b$ of $p$, and $\kappa\big( (X_i)_{i \in b} \big)$ is the joint cumulant of $(X_i)_{i \in b}$.
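In the univariate case the formula expresses $\operatorname{E}[X^n]$ as a sum over set partitions of products of cumulants. The sketch below (illustrative helper names) enumerates the partitions and checks it against the Poisson distribution, all of whose cumulants equal $\lambda$:

```python
from math import prod

def set_partitions(elements):
    """Yield all partitions of the list `elements` into non-empty blocks."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for sub in set_partitions(rest):
        yield [[first]] + sub                                # own block
        for k in range(len(sub)):                            # join an existing block
            yield sub[:k] + [[first] + sub[k]] + sub[k + 1:]

def moment_from_cumulants(kappa, n):
    """E[X^n] via the moment-cumulant formula: sum over set partitions of
    {1,...,n}, product of kappa(|block|) over the blocks."""
    return sum(
        prod(kappa(len(b)) for b in p)
        for p in set_partitions(list(range(n)))
    )

# Poisson(lam): every cumulant equals lam, so E[X^3] = lam + 3 lam^2 + lam^3.
lam = 2.0
print(moment_from_cumulants(lambda k: lam, 3))  # 2 + 12 + 8 = 22.0
```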

Uniform distribution on the unit sphere

Consider $X = (X_1, \dots, X_n)$ uniformly distributed on the unit sphere $S^{n-1} \subset \mathbb{R}^n$, so that $\|X\| = 1$ almost surely. In this setting, the following holds.

If $m$ is odd, $\operatorname{E}[X_{i_1} X_{i_2} \cdots X_{i_m}] = 0$.

If $m$ is even,
$$\operatorname{E}[X_{i_1} X_{i_2} \cdots X_{i_m}] = \frac{(n-2)!!}{(n+m-2)!!} \sum_{p \in P_m^2} \prod_{\{a,b\} \in p} \delta_{i_a i_b},$$
where $P_m^2$ is the set of all pairings of $\{1, \dots, m\}$, $\delta$ is the Kronecker delta, and $\frac{(n-2)!!}{(n+m-2)!!} = \frac{1}{n(n+2) \cdots (n+m-2)}$. Here, $!!$ denotes the double factorial.
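These sphere moments can be evaluated mechanically by summing Kronecker deltas over pairings. A short Python sketch (function names are illustrative):

```python
from math import prod

def pairings(idx):
    """All ways of splitting the position tuple `idx` into unordered pairs."""
    if not idx:
        yield []
        return
    first, rest = idx[0], idx[1:]
    for k in range(len(rest)):
        for sub in pairings(rest[:k] + rest[k + 1:]):
            yield [(first, rest[k])] + sub

def sphere_moment(n, idx):
    """E[X_{i1} ... X_{im}] for X uniform on the unit sphere S^{n-1} in R^n."""
    m = len(idx)
    if m % 2:
        return 0.0
    total = sum(
        prod(int(idx[a] == idx[b]) for a, b in p)  # product of Kronecker deltas
        for p in pairings(tuple(range(m)))
    )
    return total / prod(range(n, n + m - 1, 2))    # divide by n(n+2)...(n+m-2)

# On S^2 in R^3: E[X1^4] = 3/(3*5) = 1/5 and E[X1^2 X2^2] = 1/(3*5) = 1/15.
print(sphere_moment(3, (0, 0, 0, 0)), sphere_moment(3, (0, 0, 1, 1)))
```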

These results are discussed in the context of random vectors and irreducible representations in the work by Kushkuley (2021). [12]

References

  1. Wick, G. C. (1950). "The evaluation of the collision matrix". Physical Review. 80 (2): 268–272. Bibcode:1950PhRv...80..268W. doi:10.1103/PhysRev.80.268.
  2. Repetowicz, Przemysław; Richmond, Peter (2005). "Statistical inference of multivariate distribution parameters for non-Gaussian distributed time series". Acta Physica Polonica B. 36 (9): 2785–2796. Bibcode:2005AcPPB..36.2785R.
  3. Perez-Martin, S.; Robledo, L. M. (2007). "Generalized Wick's theorem for multiquasiparticle overlaps as a limit of Gaudin's theorem". Physical Review C. 76 (6): 064314. arXiv:0707.3365. Bibcode:2007PhRvC..76f4314P. doi:10.1103/PhysRevC.76.064314.
  4. Bartosch, L. (2001). "Generation of colored noise". International Journal of Modern Physics C. 12 (6): 851–855. Bibcode:2001IJMPC..12..851B. doi:10.1142/S0129183101002012.
  5. Janson, Svante (1997). Gaussian Hilbert Spaces. Cambridge University Press. doi:10.1017/CBO9780511526169. ISBN 9780521561280.
  6. Michalowicz, J. V.; Nichols, J. M.; Bucholtz, F.; Olson, C. C. (2009). "An Isserlis' theorem for mixed Gaussian variables: application to the auto-bispectral density". Journal of Statistical Physics. 136 (1): 89–102. Bibcode:2009JSP...136...89M. doi:10.1007/s10955-009-9768-3.
  7. Isserlis, L. (1918). "On a formula for the product-moment coefficient of any order of a normal frequency distribution in any number of variables". Biometrika. 12 (1–2): 134–139. doi:10.1093/biomet/12.1-2.134. JSTOR 2331932.
  8. Isserlis, L. (1916). "On Certain Probable Errors and Correlation Coefficients of Multiple Frequency Distributions with Skew Regression". Biometrika. 11 (3): 185–190. doi:10.1093/biomet/11.3.185. JSTOR 2331846.
  9. Kupiainen, Antti; Rhodes, Rémi; Vargas, Vincent (2019). "Local Conformal Structure of Liouville Quantum Gravity". Communications in Mathematical Physics. 371 (3): 1005–1069. arXiv:1512.01802. Bibcode:2019CMaPh.371.1005K. doi:10.1007/s00220-018-3260-3.
  10. Remy, Guillaume (2020). "The Fyodorov–Bouchaud formula and Liouville conformal field theory". Duke Mathematical Journal. 169. arXiv:1710.06897. doi:10.1215/00127094-2019-0045.
  11. Leonov, V. P.; Shiryaev, A. N. (1959). "On a Method of Calculation of Semi-Invariants". Theory of Probability & Its Applications. 4 (3): 319–329. doi:10.1137/1104031.
  12. Kushkuley, Alexander (2021). "A Remark on Random Vectors and Irreducible Representations". arXiv:2110.15504.
