Isserlis' theorem

In probability theory, Isserlis' theorem or Wick's probability theorem is a formula that allows one to compute higher-order moments of the multivariate normal distribution in terms of its covariance matrix. It is named after Leon Isserlis.

This theorem is also particularly important in particle physics, where it is known as Wick's theorem after the work of Wick (1950).[1] Other applications include the analysis of portfolio returns,[2] quantum field theory[3] and the generation of colored noise.[4]

Statement

If $(X_1, \dots, X_{2n})$ is a zero-mean multivariate normal random vector, then
$$\operatorname{E}[X_1 X_2 \cdots X_{2n}] = \sum_{p \in P_{2n}^2} \prod_{\{i,j\} \in p} \operatorname{E}[X_i X_j] = \sum_{p \in P_{2n}^2} \prod_{\{i,j\} \in p} \operatorname{Cov}(X_i, X_j),$$
where the sum is over all the pairings of $\{1, \dots, 2n\}$, i.e. all distinct ways of partitioning $\{1, \dots, 2n\}$ into pairs $\{i, j\}$, and the product is over the pairs contained in $p$.[5][6]

More generally, if $(Z_1, \dots, Z_{2n})$ is a zero-mean complex-valued multivariate normal random vector, then the formula still holds.

The expression on the right-hand side is also known as the hafnian of the covariance matrix of $(X_1, \dots, X_{2n})$.
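For the fourth-order case ($2n = 4$) the theorem reads $\operatorname{E}[X_1 X_2 X_3 X_4] = \Sigma_{12}\Sigma_{34} + \Sigma_{13}\Sigma_{24} + \Sigma_{14}\Sigma_{23}$, which is easy to verify numerically. The following is a minimal sketch, not part of the original article, comparing a Monte Carlo estimate against the pairing sum; the covariance matrix and sample size are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# An arbitrary symmetric positive definite covariance matrix.
A = rng.standard_normal((4, 4))
Sigma = A @ A.T

# Monte Carlo estimate of E[X1 X2 X3 X4] for X ~ N(0, Sigma).
X = rng.multivariate_normal(np.zeros(4), Sigma, size=2_000_000)
mc_estimate = np.mean(X[:, 0] * X[:, 1] * X[:, 2] * X[:, 3])

# Isserlis' theorem: sum over the three pairings of {1, 2, 3, 4}.
pairing_sum = (Sigma[0, 1] * Sigma[2, 3]
               + Sigma[0, 2] * Sigma[1, 3]
               + Sigma[0, 3] * Sigma[1, 2])

print(mc_estimate, pairing_sum)  # the two values agree up to Monte Carlo error
```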

Odd case

If $n$ is odd, there does not exist any pairing of $\{1, \dots, n\}$. Under this hypothesis, Isserlis' theorem implies that
$$\operatorname{E}[X_1 X_2 \cdots X_n] = 0.$$
This also follows from the fact that $-X = (-X_1, \dots, -X_n)$ has the same distribution as $X = (X_1, \dots, X_n)$, which implies that $\operatorname{E}[X_1 \cdots X_n] = \operatorname{E}[(-X_1) \cdots (-X_n)] = -\operatorname{E}[X_1 \cdots X_n]$, hence $\operatorname{E}[X_1 \cdots X_n] = 0$.

Even case

In his original paper,[7] Leon Isserlis proves this theorem by mathematical induction, generalizing the formula for the fourth-order moments,[8] which takes the appearance
$$\operatorname{E}[X_1 X_2 X_3 X_4] = \operatorname{E}[X_1 X_2]\operatorname{E}[X_3 X_4] + \operatorname{E}[X_1 X_3]\operatorname{E}[X_2 X_4] + \operatorname{E}[X_1 X_4]\operatorname{E}[X_2 X_3].$$

If $n = 2m$ is even, there exist $(2m - 1)!! = 1 \cdot 3 \cdots (2m - 1)$ (see double factorial) pair partitions of $\{1, \dots, 2m\}$: this yields $(2m - 1)!!$ terms in the sum. For example, for fourth-order moments (i.e. four random variables) there are three terms. For sixth-order moments there are $3 \times 5 = 15$ terms, and for eighth-order moments there are $3 \times 5 \times 7 = 105$ terms.
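The pairings can be enumerated recursively: fix the lowest remaining index, choose its partner, and recurse on what is left. The sketch below is an illustration rather than part of the original article; it evaluates the right-hand side of Isserlis' formula for any order and checks the $(2m - 1)!!$ count.

```python
from math import prod

def pairings(indices):
    """Yield every pair partition of a list of indices (even length assumed)."""
    if not indices:
        yield []
        return
    first, rest = indices[0], indices[1:]
    for k, partner in enumerate(rest):
        for sub in pairings(rest[:k] + rest[k + 1:]):
            yield [(first, partner)] + sub

def isserlis_moment(Sigma, indices):
    """E[X_{i_1} ... X_{i_n}] for a zero-mean Gaussian vector with covariance Sigma."""
    if len(indices) % 2 == 1:
        return 0.0  # odd case: no pairings exist
    return sum(prod(Sigma[i][j] for i, j in p) for p in pairings(list(indices)))

# The number of pairings of 2m indices is (2m - 1)!!: 3, 15, 105, ...
assert sum(1 for _ in pairings(list(range(4)))) == 3
assert sum(1 for _ in pairings(list(range(6)))) == 15
assert sum(1 for _ in pairings(list(range(8)))) == 105
```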

Example

We can evaluate the characteristic function of Gaussians by Isserlis' theorem. For $X \sim N(0, \Sigma)$, the scalar $\langle t, X \rangle$ is a zero-mean Gaussian with variance $t^\top \Sigma t$, so its odd moments vanish and Isserlis' theorem gives $\operatorname{E}[\langle t, X \rangle^{2m}] = (2m - 1)!!\,(t^\top \Sigma t)^m$. Therefore
$$\operatorname{E}\big[e^{i\langle t, X\rangle}\big] = \sum_{m=0}^{\infty} \frac{(-1)^m}{(2m)!}\,(2m - 1)!!\,(t^\top \Sigma t)^m = \sum_{m=0}^{\infty} \frac{1}{m!}\left(-\frac{t^\top \Sigma t}{2}\right)^m = e^{-\frac{1}{2} t^\top \Sigma t},$$
using $(2m - 1)!!/(2m)! = 1/(2^m m!)$.

Proof

Since both sides of the formula are multilinear in $(X_1, \dots, X_n)$, if we can prove the real case, we get the complex case for free.

Let $\Sigma$ be the covariance matrix, so that we have the zero-mean multivariate normal random vector $(X_1, \dots, X_n) \sim N(0, \Sigma)$. Since both sides of the formula are continuous with respect to $\Sigma$, it suffices to prove the case when $\Sigma$ is invertible.

Using the quadratic factorization $-\frac{1}{2}x^\top \Sigma^{-1} x + t^\top x = -\frac{1}{2}(x - \Sigma t)^\top \Sigma^{-1} (x - \Sigma t) + \frac{1}{2} t^\top \Sigma t$, we get
$$\operatorname{E}\big[e^{t^\top X}\big] = \frac{1}{\sqrt{(2\pi)^n \det \Sigma}} \int_{\mathbb{R}^n} e^{-\frac{1}{2}x^\top \Sigma^{-1} x + t^\top x}\,dx = e^{\frac{1}{2} t^\top \Sigma t}.$$

Differentiate under the integral sign with $\partial_{t_1} \partial_{t_2} \cdots \partial_{t_n} \big|_{t=0}$ to obtain

$$\operatorname{E}[X_1 X_2 \cdots X_n] = \partial_{t_1} \partial_{t_2} \cdots \partial_{t_n} \Big|_{t=0}\, e^{\frac{1}{2} t^\top \Sigma t}.$$

That is, we need only find the coefficient of the term $t_1 t_2 \cdots t_n$ in the Taylor expansion of $e^{\frac{1}{2} t^\top \Sigma t}$.

If $n$ is odd, this is zero, since every term in the expansion has even degree. So let $n = 2m$; then we need only find the coefficient of the term $t_1 t_2 \cdots t_{2m}$ in the polynomial $\frac{1}{m!}\left(\frac{1}{2} t^\top \Sigma t\right)^m$.

Expand the polynomial and count: each monomial $t_1 t_2 \cdots t_{2m}$ arises from choosing a pairing $\{\{i_1, j_1\}, \dots, \{i_m, j_m\}\}$ of $\{1, \dots, 2m\}$, and each pairing is produced $2^m m!$ times (from the $m!$ orderings of the factors $t^\top \Sigma t$ and the two orderings within each pair), cancelling the prefactor $\frac{1}{2^m m!}$. We obtain the formula.
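The differentiation step can be checked symbolically. Below is a minimal sketch using sympy for the case $n = 4$ (an illustration, not part of the original proof); the symbols $s_{ij}$ stand for the entries of a symmetric $\Sigma$.

```python
import sympy as sp

n = 4
t = sp.symbols(f"t1:{n + 1}")  # (t1, t2, t3, t4)
# A generic symmetric covariance matrix with entries s_ij.
S = sp.Matrix(n, n, lambda i, j: sp.Symbol(f"s{min(i, j) + 1}{max(i, j) + 1}"))

tv = sp.Matrix(t)
mgf = sp.exp(sp.Rational(1, 2) * (tv.T * S * tv)[0])  # exp(t^T Sigma t / 2)

# Apply d^4 / dt1 dt2 dt3 dt4 and evaluate at t = 0.
moment = sp.diff(mgf, *t).subs({ti: 0 for ti in t})
print(sp.expand(moment))  # s12*s34 + s13*s24 + s14*s23
```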

Generalizations

Gaussian integration by parts

An equivalent formulation of Wick's probability formula is Gaussian integration by parts. If $(X_1, \dots, X_n)$ is a zero-mean multivariate normal random vector, then
$$\operatorname{E}[X_i f(X_1, \dots, X_n)] = \sum_{j=1}^{n} \operatorname{Cov}(X_i, X_j)\, \operatorname{E}\!\left[\frac{\partial f}{\partial X_j}(X_1, \dots, X_n)\right].$$

This is a generalization of Stein's lemma.
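The identity is easy to test numerically. Here is a minimal sketch, under the arbitrary choices of a random covariance and the test function $f(x_1, x_2, x_3) = \sin(x_2)\,x_3$; it is an illustration, not part of the original article.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))
Sigma = A @ A.T  # an arbitrary valid covariance matrix
X = rng.multivariate_normal(np.zeros(3), Sigma, size=4_000_000)

# Test function f(x1, x2, x3) = sin(x2) * x3 and its partial derivatives.
f = np.sin(X[:, 1]) * X[:, 2]
df = [np.zeros(len(X)),            # df/dx1
      np.cos(X[:, 1]) * X[:, 2],   # df/dx2
      np.sin(X[:, 1])]             # df/dx3

i = 0
lhs = np.mean(X[:, i] * f)
rhs = sum(Sigma[i, j] * np.mean(df[j]) for j in range(3))
print(lhs, rhs)  # should agree up to Monte Carlo error
```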

Wick's probability formula can be recovered by induction, considering the function $f : \mathbb{R}^n \to \mathbb{R}$ defined by $f(x_1, \dots, x_n) = x_2 \cdots x_n$. Among other things, this formulation is important in Liouville conformal field theory to obtain conformal Ward identities, BPZ equations[9] and to prove the Fyodorov–Bouchaud formula.[10]

Non-Gaussian random variables

For non-Gaussian random variables, the moment-cumulants formula[11] replaces Wick's probability formula. If $(X_1, \dots, X_n)$ is a vector of random variables, then
$$\operatorname{E}[X_1 \cdots X_n] = \sum_{p \in P_n} \prod_{b \in p} \kappa\big((X_i)_{i \in b}\big),$$
where the sum is over all the partitions of $\{1, \dots, n\}$, the product is over the blocks of $p$ and $\kappa\big((X_i)_{i \in b}\big)$ is the joint cumulant of $(X_i)_{i \in b}$.
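For example, for $n = 3$ the five partitions of $\{1, 2, 3\}$ give
$$\operatorname{E}[X_1 X_2 X_3] = \kappa(X_1, X_2, X_3) + \kappa(X_1)\kappa(X_2, X_3) + \kappa(X_2)\kappa(X_1, X_3) + \kappa(X_3)\kappa(X_1, X_2) + \kappa(X_1)\kappa(X_2)\kappa(X_3).$$
For a zero-mean Gaussian vector the first cumulants vanish and all joint cumulants of order three and higher vanish, so only the pair partitions survive and Wick's probability formula is recovered.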

References

  1. Wick, G. C. (1950). "The evaluation of the collision matrix". Physical Review. 80 (2): 268–272. Bibcode:1950PhRv...80..268W. doi:10.1103/PhysRev.80.268.
  2. Repetowicz, Przemysław; Richmond, Peter (2005). "Statistical inference of multivariate distribution parameters for non-Gaussian distributed time series" (PDF). Acta Physica Polonica B. 36 (9): 2785–2796. Bibcode:2005AcPPB..36.2785R.
  3. Perez-Martin, S.; Robledo, L. M. (2007). "Generalized Wick's theorem for multiquasiparticle overlaps as a limit of Gaudin's theorem". Physical Review C. 76 (6): 064314. arXiv:0707.3365. Bibcode:2007PhRvC..76f4314P. doi:10.1103/PhysRevC.76.064314. S2CID 119627477.
  4. Bartosch, L. (2001). "Generation of colored noise". International Journal of Modern Physics C. 12 (6): 851–855. Bibcode:2001IJMPC..12..851B. doi:10.1142/S0129183101002012. S2CID 54500670.
  5. Janson, Svante (1997). Gaussian Hilbert Spaces. Cambridge: Cambridge University Press. doi:10.1017/CBO9780511526169. ISBN 9780521561280.
  6. Michalowicz, J. V.; Nichols, J. M.; Bucholtz, F.; Olson, C. C. (2009). "An Isserlis' theorem for mixed Gaussian variables: application to the auto-bispectral density". Journal of Statistical Physics. 136 (1): 89–102. Bibcode:2009JSP...136...89M. doi:10.1007/s10955-009-9768-3. S2CID 119702133.
  7. Isserlis, L. (1918). "On a formula for the product-moment coefficient of any order of a normal frequency distribution in any number of variables". Biometrika. 12 (1–2): 134–139. doi:10.1093/biomet/12.1-2.134. JSTOR 2331932.
  8. Isserlis, L. (1916). "On Certain Probable Errors and Correlation Coefficients of Multiple Frequency Distributions with Skew Regression". Biometrika. 11 (3): 185–190. doi:10.1093/biomet/11.3.185. JSTOR 2331846.
  9. Kupiainen, Antti; Rhodes, Rémi; Vargas, Vincent (2019). "Local Conformal Structure of Liouville Quantum Gravity". Communications in Mathematical Physics. 371 (3): 1005–1069. arXiv:1512.01802. Bibcode:2019CMaPh.371.1005K. doi:10.1007/s00220-018-3260-3. S2CID 55282482.
  10. Remy, Guillaume (2020). "The Fyodorov–Bouchaud formula and Liouville conformal field theory". Duke Mathematical Journal. 169. arXiv:1710.06897. doi:10.1215/00127094-2019-0045. S2CID 54777103.
  11. Leonov, V. P.; Shiryaev, A. N. (1959). "On a Method of Calculation of Semi-Invariants". Theory of Probability & Its Applications. 4 (3): 319–329. doi:10.1137/1104031.
