The Gaussian correlation inequality (GCI), formerly known as the Gaussian correlation conjecture (GCC), is a mathematical theorem in the fields of mathematical statistics and convex geometry.
The Gaussian correlation inequality states:
Let μ be an n-dimensional Gaussian probability measure on ℝⁿ, i.e. a multivariate normal distribution, centered at the origin. Then for all convex sets K, L ⊆ ℝⁿ that are symmetric about the origin,

μ(K ∩ L) ≥ μ(K) · μ(L).
As a simple example for n=2, one can think of darts being thrown at a board, with their landing spots in the plane distributed according to a 2-variable normal distribution centered at the origin. (This is a reasonable assumption for any given darts player, with different players being described by different normal distributions.) If we now consider a circle and a rectangle in the plane, both centered at the origin, then the proportion of the darts landing in the intersection of both shapes is no less than the product of the proportions of the darts landing in each shape. This can also be formulated in terms of conditional probabilities: if you're informed that your last dart hit the rectangle, then this information will increase your estimate of the probability that the dart hit the circle.
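The claim in this example can be checked numerically. The following is a minimal Monte Carlo sketch in Python (NumPy assumed); the correlation, disc radius, and rectangle sides are arbitrary illustrative choices, not values from any source.

    # Monte Carlo check of the darts example: a centered bivariate normal,
    # a disc and a rectangle both symmetric about the origin.
    import numpy as np

    rng = np.random.default_rng(0)
    cov = [[1.0, 0.5], [0.5, 1.0]]                      # illustrative covariance
    pts = rng.multivariate_normal([0.0, 0.0], cov, size=1_000_000)

    in_disc = (pts ** 2).sum(axis=1) <= 1.0             # disc of radius 1
    in_rect = (np.abs(pts[:, 0]) <= 1.5) & (np.abs(pts[:, 1]) <= 0.5)

    p_both = (in_disc & in_rect).mean()
    p_product = in_disc.mean() * in_rect.mean()
    print(p_both, p_product)   # empirically p_both >= p_product, as the inequality asserts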
A special case of the inequality was conjectured in 1955; [1] further development was given by Olive Jean Dunn in 1958. [2] [3] The general case was stated in 1972, also as a conjecture. [4] The case of dimension n=2 was proved in 1977 [5] and certain special cases of higher dimension have also been proven in subsequent years. [6]
The general case of the inequality remained open until 2014, when Thomas Royen, a retired German statistician, proved it using relatively elementary tools. [7] In fact, Royen generalized the conjecture and proved it for multivariate gamma distributions. The proof did not gain attention when it was published in 2014, due to Royen's relative anonymity and the fact that it appeared in a predatory journal. [2] [8] Another reason was a history of false proofs (by others) and many failed attempts to prove the conjecture, which had caused skepticism among mathematicians in the field. [2]
The conjecture, and its solution, came to public attention in 2017, when other mathematicians described Royen's proof in a mainstream publication [9] and popular media reported on the story. [2] [10] [11]
In statistics and probability theory, the median is the value separating the higher half from the lower half of a data sample, a population, or a probability distribution. For a data set, it may be thought of as "the middle" value. The basic feature of the median in describing data compared to the mean is that it is not skewed by a small proportion of extremely large or small values, and therefore provides a better representation of the center. Median income, for example, may be a better way to describe the center of the income distribution because increases in the largest incomes alone have no effect on the median. For this reason, the median is of central importance in robust statistics.
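As a small illustrative sketch (Python with NumPy, using made-up income figures), adding one extreme value shifts the mean substantially while barely moving the median:

    import numpy as np

    incomes = np.array([30_000, 35_000, 40_000, 45_000, 50_000])
    print(np.mean(incomes), np.median(incomes))            # 40000.0 40000.0

    with_outlier = np.append(incomes, 10_000_000)          # one extremely large income
    print(np.mean(with_outlier), np.median(with_outlier))  # mean ~1.7 million, median 42500.0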
In statistics, a normal distribution or Gaussian distribution is a type of continuous probability distribution for a real-valued random variable. The general form of its probability density function is

f(x) = 1/(σ√(2π)) · exp(−(x − μ)² / (2σ²)),

where the parameter μ is the mean of the distribution and σ is its standard deviation.
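As a minimal sketch (Python with NumPy and SciPy; the parameters μ = 1 and σ = 2 are arbitrary), the density written out above can be evaluated directly and checked against a library implementation:

    import numpy as np
    from scipy.stats import norm

    mu, sigma = 1.0, 2.0
    x = np.linspace(-5.0, 7.0, 7)
    pdf = np.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))
    print(np.allclose(pdf, norm.pdf(x, loc=mu, scale=sigma)))   # True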
In probability theory, the central limit theorem (CLT) states that, under appropriate conditions, the distribution of a normalized version of the sample mean converges to a standard normal distribution. This holds even if the original variables themselves are not normally distributed. There are several versions of the CLT, each applying in the context of different conditions.
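A minimal simulation sketch (Python with NumPy; the exponential distribution and the sample sizes are illustrative choices) shows normalized sample means of a decidedly non-normal distribution behaving like a standard normal:

    import numpy as np

    rng = np.random.default_rng(1)
    n, reps = 1_000, 50_000
    samples = rng.exponential(scale=1.0, size=(reps, n))   # mean 1, variance 1
    z = (samples.mean(axis=1) - 1.0) * np.sqrt(n)          # normalized sample means
    print(z.mean(), z.std())   # close to 0 and 1; a histogram of z looks approximately Gaussian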
In probability theory and statistics, the multivariate normal distribution, multivariate Gaussian distribution, or joint normal distribution is a generalization of the one-dimensional (univariate) normal distribution to higher dimensions. One definition is that a random vector is said to be k-variate normally distributed if every linear combination of its k components has a univariate normal distribution. Its importance derives mainly from the multivariate central limit theorem. The multivariate normal distribution is often used to describe, at least approximately, any set of (possibly) correlated real-valued random variables each of which clusters around a mean value.
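The defining property can be probed by simulation. A sketch (Python with NumPy and SciPy; the mean vector, covariance matrix, and coefficients below are arbitrary) draws a trivariate normal sample and tests a linear combination of its components for normality:

    import numpy as np
    from scipy.stats import normaltest

    rng = np.random.default_rng(2)
    mean = np.array([0.0, 1.0, -1.0])
    cov = np.array([[2.0, 0.3, 0.1],
                    [0.3, 1.0, 0.2],
                    [0.1, 0.2, 0.5]])
    x = rng.multivariate_normal(mean, cov, size=100_000)
    a = np.array([1.0, -2.0, 0.5])        # an arbitrary linear combination
    y = x @ a
    print(normaltest(y).pvalue)           # typically large: no evidence against normality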
In probability theory, Chebyshev's inequality provides an upper bound on the probability of deviation of a random variable from its mean. More specifically, the probability that a random variable deviates from its mean by more than kσ is at most 1/k², where k is any positive constant and σ is the standard deviation of the distribution.
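A quick numerical sketch (Python with NumPy; an exponential distribution and a few values of k chosen for illustration) compares the empirical deviation probabilities with the 1/k² bound:

    import numpy as np

    rng = np.random.default_rng(3)
    x = rng.exponential(scale=1.0, size=1_000_000)   # mean 1, standard deviation 1
    mu, sigma = 1.0, 1.0
    for k in (1.5, 2.0, 3.0):
        # empirical deviation probability stays below the Chebyshev bound 1/k**2
        print(k, np.mean(np.abs(x - mu) >= k * sigma), 1 / k ** 2)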
In probability theory, the law of large numbers (LLN) is a mathematical theorem that states that the average of the results obtained from a large number of independent and identical random samples converges to the true value, if it exists. More formally, the LLN states that given a sample of independent and identically distributed values, the sample mean converges to the true mean.
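A minimal sketch (Python with NumPy; fair-die rolls used purely for illustration) shows the running sample mean settling toward the true mean of 3.5:

    import numpy as np

    rng = np.random.default_rng(4)
    rolls = rng.integers(1, 7, size=1_000_000)              # fair six-sided die
    running_mean = np.cumsum(rolls) / np.arange(1, rolls.size + 1)
    print(running_mean[[99, 9_999, 999_999]])               # successive values approach 3.5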
In mathematics, Jensen's inequality, named after the Danish mathematician Johan Jensen, relates the value of a convex function of an integral to the integral of the convex function. It was proved by Jensen in 1906, building on an earlier proof of the same inequality for doubly-differentiable functions by Otto Hölder in 1889. Given its generality, the inequality appears in many forms depending on the context, some of which are presented below. In its simplest form the inequality states that the convex transformation of a mean is less than or equal to the mean applied after convex transformation; it is a simple corollary that the opposite is true of concave transformations.
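This simplest form can be checked numerically. A minimal sketch (Python with NumPy, using the convex function f(x) = x² and an illustrative lognormal sample):

    import numpy as np

    rng = np.random.default_rng(5)
    x = rng.lognormal(mean=0.0, sigma=1.0, size=1_000_000)
    print(np.mean(x) ** 2, np.mean(x ** 2))   # f(E[X]) <= E[f(X)] for convex f(x) = x**2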
In probability and statistics, a mixture distribution is the probability distribution of a random variable that is derived from a collection of other random variables as follows: first, a random variable is selected by chance from the collection according to given probabilities of selection, and then the value of the selected random variable is realized. The underlying random variables may be random real numbers, or they may be random vectors, in which case the mixture distribution is a multivariate distribution.
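The two-step construction translates directly into a sampling procedure. A minimal sketch (Python with NumPy; a two-component Gaussian mixture with made-up weights and parameters):

    import numpy as np

    rng = np.random.default_rng(6)
    weights = np.array([0.3, 0.7])                         # probabilities of selection
    means, sds = np.array([-2.0, 3.0]), np.array([0.5, 1.0])

    component = rng.choice(2, size=100_000, p=weights)     # step 1: select a component
    sample = rng.normal(means[component], sds[component])  # step 2: realize its value
    print(sample.mean(), weights @ means)                  # mixture mean matches the weighted mean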
In mathematics, concentration of measure is a principle that is applied in measure theory, probability and combinatorics, and has consequences for other fields such as Banach space theory. Informally, it states that "A random variable that depends in a Lipschitz way on many independent variables is essentially constant".
In probability theory, the inverse Gaussian distribution is a two-parameter family of continuous probability distributions with support on (0,∞).
In the theory of probability and statistics, the Dvoretzky–Kiefer–Wolfowitz–Massart inequality provides a bound on the worst case distance of an empirically determined distribution function from its associated population distribution function. It is named after Aryeh Dvoretzky, Jack Kiefer, and Jacob Wolfowitz, who in 1956 proved the inequality

Pr( sup_x |Fₙ(x) − F(x)| > ε ) ≤ C·e^(−2nε²)

with an unspecified multiplicative constant C; Pascal Massart later showed that it holds with the sharp constant C = 2.
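An empirical sketch (Python with NumPy and SciPy; standard normal data, n = 1000 and ε = 0.05 chosen for illustration, and Massart's constant C = 2) compares the worst-case empirical-CDF gap with the bound:

    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(7)
    n, eps = 1_000, 0.05
    x = np.sort(rng.normal(size=n))
    cdf = norm.cdf(x)
    gap = max((np.arange(1, n + 1) / n - cdf).max(),   # sup-norm (Kolmogorov-Smirnov)
              (cdf - np.arange(0, n) / n).max())       # distance of the empirical CDF
    print(gap, eps, 2 * np.exp(-2 * n * eps ** 2))     # gap rarely exceeds eps; bound ~0.013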
Inequalities are very important in the study of information theory. There are a number of different contexts in which these inequalities appear.
In mathematics, the second moment method is a technique used in probability theory and analysis to show that a random variable has positive probability of being positive. More generally, the "moment method" consists of bounding the probability that a random variable fluctuates far from its mean, by using its moments.
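A common concrete form of this idea is the bound P(X > 0) ≥ (E[X])² / E[X²] for a nonnegative random variable X. A numerical sketch (Python with NumPy; an illustrative distribution that is zero with probability 0.8):

    import numpy as np

    rng = np.random.default_rng(8)
    x = rng.binomial(1, 0.2, size=1_000_000) * rng.exponential(2.0, size=1_000_000)
    print(np.mean(x > 0), np.mean(x) ** 2 / np.mean(x ** 2))   # P(X>0) dominates the bound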
In mathematics, the Fortuin–Kasteleyn–Ginibre (FKG) inequality is a correlation inequality, a fundamental tool in statistical mechanics and probabilistic combinatorics, due to Cees M. Fortuin, Pieter W. Kasteleyn, and Jean Ginibre (1971). Informally, it says that in many random systems, increasing events are positively correlated, while an increasing and a decreasing event are negatively correlated. It was obtained by studying the random cluster model.
In combinatorial mathematics, the XYZ inequality, also called the Fishburn–Shepp inequality, is an inequality for the number of linear extensions of finite partial orders. The inequality was conjectured by Ivan Rival and Bill Sands in 1981. It was proved by Lawrence Shepp in Shepp (1982). An extension was given by Peter Fishburn in Fishburn (1984).
Energy distance is a statistical distance between probability distributions. If X and Y are independent random vectors in ℝᵈ with cumulative distribution functions (cdf) F and G respectively, then the energy distance between the distributions F and G is defined to be the square root of

D²(F, G) = 2E‖X − Y‖ − E‖X − X′‖ − E‖Y − Y′‖,

where X′ is an independent and identically distributed copy of X and Y′ is an independent and identically distributed copy of Y.
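A plug-in estimate of this quantity from finite samples is a short computation. The sketch below (Python with NumPy and SciPy; two illustrative 2-dimensional Gaussian samples) averages pairwise Euclidean distances:

    import numpy as np
    from scipy.spatial.distance import cdist

    rng = np.random.default_rng(9)
    x = rng.normal(0.0, 1.0, size=(500, 2))               # sample from F
    y = rng.normal(1.0, 1.0, size=(500, 2))               # sample from G (shifted mean)

    d_sq = 2 * cdist(x, y).mean() - cdist(x, x).mean() - cdist(y, y).mean()
    print(np.sqrt(d_sq))                                  # estimated energy distance, > 0 here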
In probability theory, more specifically the study of random matrices, the circular law concerns the distribution of eigenvalues of an n × n random matrix with independent and identically distributed entries in the limit n → ∞.
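A small simulation sketch (Python with NumPy; Gaussian entries scaled by 1/√n, an illustrative choice giving mean 0 and variance 1/n) shows the eigenvalues roughly filling the unit disc:

    import numpy as np

    rng = np.random.default_rng(10)
    n = 1_000
    m = rng.normal(size=(n, n)) / np.sqrt(n)   # iid entries with variance 1/n
    eig = np.linalg.eigvals(m)
    print(np.abs(eig).max())                   # close to 1: the spectrum fills the unit disc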
In mathematics, Scheffé's lemma is a proposition in measure theory concerning the convergence of sequences of integrable functions. It states that, if (fₙ) is a sequence of integrable functions on a measure space that converges almost everywhere to another integrable function f, then ∫|fₙ − f| dμ → 0 if and only if ∫|fₙ| dμ → ∫|f| dμ.
Thomas Royen is a retired German professor of statistics who has been affiliated with the University of Applied Sciences Bingen. Royen came to prominence in the spring of 2017 for a relatively simple proof of the Gaussian correlation inequality (GCI), a conjecture that originated in the 1950s; he had published the proof three years earlier without much recognition. A proof of this conjecture, which lies at the intersection of geometry, probability theory and statistics, had eluded top experts for decades.
In probability theory, Bobkov's inequality is a functional isoperimetric inequality for the canonical Gaussian measure. It generalizes the Gaussian isoperimetric inequality. The inequality was proven in 1997 by the Russian mathematician Sergey Bobkov.