In statistics, Cochran's theorem, devised by William G. Cochran,[1] is a theorem used to justify results relating to the probability distributions of statistics that are used in the analysis of variance.[2]
If $X_1, \ldots, X_n$ are independent normally distributed random variables with mean $\mu$ and standard deviation $\sigma$, then

$$U_i = \frac{X_i - \mu}{\sigma}$$

is standard normal for each $i$. Note that the total $Q$ is equal to the sum of the squared $U$s (with the quadratic forms $Q_i$ and matrices $B^{(i)}$ as defined in the statement below):

$$\sum_i Q_i = \sum_{j,i,k} U_j B^{(i)}_{jk} U_k = \sum_{j,k} U_j U_k \sum_i B^{(i)}_{jk} = \sum_{j,k} U_j U_k \delta_{jk} = \sum_j U_j^2,$$

which stems from the original assumption that $B^{(1)} + B^{(2)} + \cdots = I$. So instead we will calculate this quantity and later separate it into $Q_i$'s. It is possible to write

$$\sum_{i=1}^n U_i^2 = \sum_{i=1}^n \left(\frac{X_i - \overline{X}}{\sigma}\right)^2 + n\left(\frac{\overline{X} - \mu}{\sigma}\right)^2$$

(here $\overline{X}$ is the sample mean). To see this identity, multiply throughout by $\sigma^2$ and note that

$$\sum_i (X_i - \mu)^2 = \sum_i (X_i - \overline{X} + \overline{X} - \mu)^2$$

and expand to give

$$\sum_i (X_i - \mu)^2 = \sum_i (X_i - \overline{X})^2 + \sum_i (\overline{X} - \mu)^2 + 2\sum_i (X_i - \overline{X})(\overline{X} - \mu).$$

The third term is zero because it is equal to a constant times

$$\sum_i (\overline{X} - X_i) = 0,$$

and the second term has just $n$ identical terms added together. Thus

$$\sum_i (X_i - \mu)^2 = \sum_i (X_i - \overline{X})^2 + n(\overline{X} - \mu)^2,$$

and hence

$$\sum_i \left(\frac{X_i - \mu}{\sigma}\right)^2 = \overbrace{\sum_i \left(\frac{X_i - \overline{X}}{\sigma}\right)^2}^{Q_1} + \overbrace{n\left(\frac{\overline{X} - \mu}{\sigma}\right)^2}^{Q_2} = Q_1 + Q_2.$$
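For a concrete illustration (a small worked check, not part of the original derivation), take $n = 2$, so that $\overline{X} = (X_1 + X_2)/2$; the identity obtained above reads

$$\sum_{i=1}^{2}(X_i - \overline{X})^2 + 2(\overline{X} - \mu)^2 = \frac{(X_1 - X_2)^2}{2} + \frac{(X_1 + X_2 - 2\mu)^2}{2} = (X_1 - \mu)^2 + (X_2 - \mu)^2,$$

which is confirmed by expanding the two squares in the middle expression and collecting terms.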
Now $B^{(2)} = \frac{J_n}{n}$ with $J_n$ the matrix of ones, which has rank 1. In turn $Q_2 = n\left(\frac{\overline{X} - \mu}{\sigma}\right)^2$, given that $\overline{X} = \frac{1}{n}\sum_i X_i$. This expression can also be obtained by expanding $Q_2$ in matrix notation. It can be shown that the rank of $B^{(1)} = I - \frac{J_n}{n}$ is $n - 1$, as the sum of all its rows is equal to zero. Thus the conditions for Cochran's theorem are met.
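A minimal numerical sketch of these rank conditions (using NumPy; the matrix names mirror the notation above and are otherwise illustrative):

```python
import numpy as np

n = 5
J = np.ones((n, n))        # J_n, the matrix of ones
B2 = J / n                 # B^(2): projects onto the all-ones direction, rank 1
B1 = np.eye(n) - B2        # B^(1): the centering matrix, rank n - 1

# The matrices are symmetric, sum to the identity, and their ranks add up to n.
assert np.allclose(B1 + B2, np.eye(n))
print(np.linalg.matrix_rank(B1), np.linalg.matrix_rank(B2))  # (n - 1, 1)
```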
Cochran's theorem then states that $Q_1$ and $Q_2$ are independent, with chi-squared distributions with $n - 1$ and 1 degree of freedom respectively. This shows that the sample mean and sample variance are independent. This can also be shown by Basu's theorem, and in fact this property characterizes the normal distribution – for no other distribution are the sample mean and sample variance independent.[3]
The result for the distributions is written symbolically as

$$n(\overline{X} - \mu)^2 \sim \sigma^2 \chi^2_1,$$
$$\sum_i \left(X_i - \overline{X}\right)^2 \sim \sigma^2 \chi^2_{n-1}.$$

Both these random variables are proportional to the true but unknown variance $\sigma^2$. Thus their ratio does not depend on $\sigma^2$ and, because they are statistically independent, the distribution of their ratio is given by

$$\frac{n\left(\overline{X} - \mu\right)^2}{\frac{1}{n-1}\sum_i\left(X_i - \overline{X}\right)^2} \sim \frac{\chi^2_1}{\frac{1}{n-1}\chi^2_{n-1}} \sim F_{1,n-1},$$

where $F_{1,n-1}$ is the F-distribution with 1 and $n - 1$ degrees of freedom (see also Student's t-distribution). The final step here is effectively the definition of a random variable having the F-distribution.
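These claims can be checked by simulation. The following sketch (NumPy/SciPy; the parameter values are arbitrary) draws many samples and compares $Q_1$, $Q_2$ and their ratio against the stated chi-squared and F distributions:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
mu, sigma, n, reps = 2.0, 3.0, 10, 100_000

X = rng.normal(mu, sigma, size=(reps, n))
xbar = X.mean(axis=1)
Q1 = ((X - xbar[:, None]) ** 2).sum(axis=1) / sigma**2   # expected ~ chi2(n - 1)
Q2 = n * (xbar - mu) ** 2 / sigma**2                     # expected ~ chi2(1)

print(stats.kstest(Q1, stats.chi2(n - 1).cdf).pvalue)    # large p-values indicate agreement
print(stats.kstest(Q2, stats.chi2(1).cdf).pvalue)
print(np.corrcoef(Q1, Q2)[0, 1])                         # near 0, consistent with independence

F = Q2 / (Q1 / (n - 1))                                  # expected ~ F(1, n - 1)
print(stats.kstest(F, stats.f(1, n - 1).cdf).pvalue)
```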
To estimate the variance $\sigma^2$, one estimator that is sometimes used is the maximum likelihood estimator of the variance of a normal distribution

$$\widehat{\sigma}^2 = \frac{1}{n}\sum_i\left(X_i - \overline{X}\right)^2.$$

Cochran's theorem shows that

$$\frac{n\widehat{\sigma}^2}{\sigma^2} \sim \chi^2_{n-1}$$

and the properties of the chi-squared distribution show that

$$\operatorname{E}\left(\widehat{\sigma}^2\right) = \frac{\sigma^2(n-1)}{n}.$$
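A short simulation sketch (NumPy; the values are arbitrary) of the bias implied by the last expression, namely that the maximum likelihood estimator underestimates $\sigma^2$ by the factor $(n-1)/n$:

```python
import numpy as np

rng = np.random.default_rng(1)
sigma2, n, reps = 4.0, 10, 200_000

X = rng.normal(0.0, np.sqrt(sigma2), size=(reps, n))
mle = ((X - X.mean(axis=1, keepdims=True)) ** 2).mean(axis=1)  # (1/n) * sum of squared deviations

print(mle.mean())               # close to (n - 1)/n * sigma2 = 3.6
print((n - 1) / n * sigma2)
```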
The following version is often seen when considering linear regression.[4] Suppose that $Y \sim N_n(0, \sigma^2 I_n)$ is a standard multivariate normal random vector (here $I_n$ denotes the n-by-n identity matrix), and if $A_1, \ldots, A_k$ are all n-by-n symmetric matrices with $\sum_{i=1}^k A_i = I_n$. Then, on defining $r_i = \operatorname{rank}(A_i)$, any one of the following conditions implies the other two:

- $\sum_{i=1}^k r_i = n$,
- $Y^T A_i Y \sim \sigma^2 \chi^2_{r_i}$ (thus the $A_i$ are positive semidefinite),
- $Y^T A_i Y$ is independent of $Y^T A_j Y$ for all $i \neq j$.
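In the regression setting this is typically applied with projection matrices. A hedged numerical sketch (NumPy/SciPy; the design matrix and sizes are made up for illustration) using the hat matrix $H$ and its complement $I - H$ as the $A_i$:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n, p, sigma2, reps = 20, 3, 1.5, 50_000

Xd = rng.normal(size=(n, p))                        # an arbitrary fixed design matrix
H = Xd @ np.linalg.inv(Xd.T @ Xd) @ Xd.T            # hat matrix: symmetric projection, rank p
A1, A2 = H, np.eye(n) - H                           # A1 + A2 = I_n

Y = rng.normal(0.0, np.sqrt(sigma2), size=(reps, n))
Q1 = np.einsum('ri,ij,rj->r', Y, A1, Y)             # Y^T A1 Y, expected ~ sigma^2 chi2(p)
Q2 = np.einsum('ri,ij,rj->r', Y, A2, Y)             # Y^T A2 Y, expected ~ sigma^2 chi2(n - p)

print(np.linalg.matrix_rank(A1) + np.linalg.matrix_rank(A2))   # n
print(stats.kstest(Q1 / sigma2, stats.chi2(p).cdf).pvalue)
print(stats.kstest(Q2 / sigma2, stats.chi2(n - p).cdf).pvalue)
print(np.corrcoef(Q1, Q2)[0, 1])                                # near 0
```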
Let $U_1, \ldots, U_N$ be i.i.d. standard normally distributed random variables, and $U = [U_1, \ldots, U_N]^T$. Let $B^{(1)}, B^{(2)}, \ldots, B^{(k)}$ be symmetric matrices. Define $r_i$ to be the rank of $B^{(i)}$. Define $Q_i = U^T B^{(i)} U$, so that the $Q_i$ are quadratic forms. Further assume $\sum_i Q_i = U^T U$.

Cochran's theorem states that the following are equivalent:

- $r_1 + \cdots + r_k = N$,
- the $Q_i$ are independent,
- each $Q_i \sim \chi^2(r_i)$ (the chi-squared distribution with $r_i$ degrees of freedom).
Often it's stated as $\sum_i A_i = A$, where $A$ is idempotent, and $\sum_i r_i = N$ is replaced by $\sum_i r_i = \operatorname{rank}(A)$. But after an orthogonal transform, $A = \operatorname{diag}(I_M, 0)$, and so we reduce to the above theorem.
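A brief justification of the orthogonal-transform step (a standard fact, spelled out here for completeness): if $A$ is symmetric and idempotent, each eigenvalue $\lambda$ satisfies $\lambda^2 = \lambda$, hence $\lambda \in \{0, 1\}$, and by the spectral theorem there is an orthogonal matrix $S$ with

$$S^T A S = \operatorname{diag}(I_M, 0), \qquad M = \operatorname{rank}(A),$$

while $V = S^T U$ is again a vector of i.i.d. standard normal variables, so the hypotheses can be rewritten in terms of the first $M$ coordinates of $V$.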
Claim: Let $X$ be a standard Gaussian in $\mathbb{R}^n$, then for any symmetric matrices $Q, Q'$, if $X^T Q X$ and $X^T Q' X$ have the same distribution, then $Q, Q'$ have the same eigenvalues (up to multiplicity).

Let the eigenvalues of $Q$ be $\lambda_1, \ldots, \lambda_n$, then calculate the characteristic function of $X^T Q X$. It comes out to be

$$\phi(t) = \left(\prod_j (1 - 2 i \lambda_j t)\right)^{-1/2}.$$

(To calculate it, first diagonalize $Q$, change into that frame, then use the fact that the characteristic function of the sum of independent variables is the product of their characteristic functions.)
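Writing the calculation out (a routine expansion of the parenthetical above): if $Q = S \Lambda S^T$ with $S$ orthogonal and $\Lambda = \operatorname{diag}(\lambda_1, \ldots, \lambda_n)$, then $Z = S^T X$ is again standard Gaussian and

$$X^T Q X = Z^T \Lambda Z = \sum_j \lambda_j Z_j^2, \qquad \operatorname{E}\!\left[e^{it\lambda_j Z_j^2}\right] = (1 - 2 i \lambda_j t)^{-1/2},$$

so by independence of the $Z_j$ the characteristic function of $X^T Q X$ is the product $\prod_j (1 - 2 i \lambda_j t)^{-1/2}$, as claimed.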
For $X^T Q X$ and $X^T Q' X$ to have the same distribution, their characteristic functions must be equal, so $Q$ and $Q'$ have the same eigenvalues (up to multiplicity).
Claim: $I = \sum_i B_i$.

$U^T\left(I - \sum_i B_i\right)U = U^T U - \sum_i Q_i = 0$. Since $I - \sum_i B_i$ is symmetric, and $U^T\left(I - \sum_i B_i\right)U$ has the same distribution as $U^T 0\, U$, by the previous claim, $I - \sum_i B_i$ has the same eigenvalues as 0, and is therefore the zero matrix.
Lemma: If $\sum_i M_i = I$, with all $M_i$ symmetric with eigenvalues 0, 1, then they are simultaneously diagonalizable.

Fix $i$, and consider the eigenvectors $v$ of $M_i$ such that $M_i v = v$. Then we have $v^T v = v^T I v = v^T M_i v + \sum_{j \neq i} v^T M_j v = v^T v + \sum_{j \neq i} v^T M_j v$, so all $v^T M_j v = 0$ for $j \neq i$; since each $M_j$ is positive semidefinite, this forces $M_j v = 0$. Thus we obtain a split of $\mathbb{R}^N$ into $V \oplus V^\perp$, such that $V$ is the 1-eigenspace of $M_i$, and contained in the 0-eigenspaces of all other $M_j$. Now induct by moving into $V^\perp$.
Now we prove the original theorem. We prove that the three cases are equivalent by proving that each case implies the next one in a cycle (independence $\Rightarrow$ chi-squared distributions $\Rightarrow$ $\sum_i r_i = N$ $\Rightarrow$ independence).
Case: All $Q_i$ are independent.

Fix some $i$, define $C_i = I - B_i = \sum_{j \neq i} B_j$, and diagonalize $B_i$ by an orthogonal transform $O$. Then consider $O C_i O^T = I - O B_i O^T$. It is diagonalized as well.

Let $W = OU$, then it is also standard Gaussian. Then we have

$$Q_i = W^T \left(O B_i O^T\right) W, \qquad \sum_{j \neq i} Q_j = W^T \left(I - O B_i O^T\right) W.$$

Inspect their diagonal entries, to see that $Q_i \perp \sum_{j \neq i} Q_j$ implies that their nonzero diagonal entries are disjoint.

Thus all eigenvalues of $B_i$ are 0, 1, so $Q_i$ is a $\chi^2$ distribution with $r_i$ degrees of freedom.
Case: Each $Q_i$ is a $\chi^2(r_i)$ distribution.

Fix any $i$, diagonalize $B_i$ by an orthogonal transform $O$, and reindex, so that $O B_i O^T = \operatorname{diag}(\lambda_1, \ldots, \lambda_{r_i}, 0, \ldots, 0)$. Then $Q_i = \sum_j \lambda_j {U'_j}^2$ for some $U' = OU$, a spherical rotation of $U$.

Since $Q_i \sim \chi^2(r_i)$, we get all $\lambda_j = 1$ (by the first claim). So all $B_i$ are positive semidefinite, and have eigenvalues 0, 1.

So diagonalize them simultaneously (by the lemma), add them up, to find $\sum_i r_i = N$.
Case: $\sum_i r_i = N$.

We first show that the matrices $B^{(i)}$ can be simultaneously diagonalized by an orthogonal matrix and that their non-zero eigenvalues are all equal to +1. Once that's shown, take this orthogonal transform to this simultaneous eigenbasis, in which the random vector $[U_1, \ldots, U_N]^T$ becomes $[U'_1, \ldots, U'_N]^T$, but all $U'_i$ are still independent and standard Gaussian. Then the result follows.
Each of the matrices $B^{(i)}$ has rank $r_i$ and thus $r_i$ non-zero eigenvalues. For each $i$, the sum $C^{(i)} \equiv \sum_{j \neq i} B^{(j)}$ has at most rank $\sum_{j \neq i} r_j = N - r_i$. Since $B^{(i)} + C^{(i)} = I_{N \times N}$, it follows that $C^{(i)}$ has exactly rank $N - r_i$.
Therefore $B^{(i)}$ and $C^{(i)}$ can be simultaneously diagonalized. This can be shown by first diagonalizing $B^{(i)}$, by the spectral theorem. In this basis, it is of the form:

$$\begin{bmatrix} \overline{B}_{r_i \times r_i} & 0 \\ 0 & 0 \end{bmatrix}.$$

Thus the lower $(N - r_i)$ rows are zero. Since $C^{(i)} = I - B^{(i)}$, it follows that these rows of $C^{(i)}$ in this basis contain a right block which is a $(N - r_i) \times (N - r_i)$ unit matrix, with zeros in the rest of these rows. But since $C^{(i)}$ has rank $N - r_i$, it must be zero elsewhere. Thus it is diagonal in this basis as well. It follows that all the non-zero eigenvalues of both $B^{(i)}$ and $C^{(i)}$ are +1. This argument applies for all $i$, thus all $B^{(i)}$ are positive semidefinite.
Moreover, the above analysis can be repeated in the diagonal basis for $C^{(1)} = B^{(2)} + \cdots + B^{(k)}$. In this basis $C^{(1)}$ is the identity of an $(N - r_1)$-dimensional vector space, so it follows that both $B^{(2)}$ and $\sum_{j=3}^k B^{(j)}$ are simultaneously diagonalizable in this vector space (and hence also together with $B^{(1)}$). By iteration it follows that all $B$-s are simultaneously diagonalizable.
Thus there exists an orthogonal matrix $S$ such that for all $i$, $S^T B^{(i)} S \equiv B^{(i)\prime}$ is diagonal, where any entry $B^{(i)\prime}_{x,y}$ with indices $x = y$, $\sum_{j=1}^{i-1} r_j < x = y \le \sum_{j=1}^{i} r_j$, is equal to 1, while any entry with other indices is equal to 0.
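To close the loop, a hedged numerical sketch of the theorem itself (NumPy/SciPy; the dimensions, ranks and seed are arbitrary). Symmetric matrices $B^{(i)}$ are built as orthogonal projections onto complementary subspaces, so they sum to the identity and their ranks sum to $N$; the resulting quadratic forms then behave as the theorem predicts:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
N, ranks = 8, [3, 4, 1]                        # ranks chosen so that they sum to N

# Projections onto complementary subspaces spanned by disjoint column blocks of a
# random orthogonal matrix: sum_i B[i] = I and rank(B[i]) = ranks[i].
S, _ = np.linalg.qr(rng.normal(size=(N, N)))
blocks = np.split(np.arange(N), np.cumsum(ranks)[:-1])
B = [S[:, b] @ S[:, b].T for b in blocks]
assert np.allclose(sum(B), np.eye(N))

reps = 50_000
U = rng.normal(size=(reps, N))
Q = [np.einsum('ri,ij,rj->r', U, Bi, U) for Bi in B]

for Qi, ri in zip(Q, ranks):
    print(stats.kstest(Qi, stats.chi2(ri).cdf).pvalue)   # each Q_i ~ chi2(r_i)
print(np.corrcoef(Q[0], Q[1])[0, 1])                      # near 0, consistent with independence
```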