MINQUE

In statistics, the theory of minimum norm quadratic unbiased estimation (MINQUE) [1] [2] [3] was developed by C. R. Rao. MINQUE is one of several methods in estimation theory, alongside the method of moments and maximum likelihood estimation. Similar to the theory of best linear unbiased estimation, MINQUE is specifically concerned with linear regression models. [1] The method was originally conceived to estimate heteroscedastic error variance in multiple linear regression. [1] MINQUE estimators also provide an alternative to maximum likelihood or restricted maximum likelihood estimators of the variance components in mixed effects models. [3] MINQUE estimators are quadratic forms of the response variable and are used to estimate a linear function of the variances.

Principles

We are concerned with a mixed effects model for the random vector $y$ with the following linear structure.

$y = X\beta + U_1 \xi_1 + U_2 \xi_2 + \cdots + U_k \xi_k$

Here, $X$ is a design matrix for the fixed effects, $\beta$ represents the unknown fixed-effect parameters, $U_i$ is a design matrix for the $i$-th random-effect component, and $\xi_i$ is a random vector of length $c_i$ for the $i$-th random-effect component. The random effects are assumed to have zero mean ($\mathbb{E}[\xi_i] = 0$) and be uncorrelated ($\mathbb{V}[\xi_i] = \sigma_i^2 I_{c_i}$). Furthermore, any two random effect vectors are also uncorrelated ($\mathbb{V}[\xi_i, \xi_j] = 0$ for $i \neq j$). The unknown variances $\sigma_1^2, \sigma_2^2, \ldots, \sigma_k^2$ represent the variance components of the model.

This is a general model that captures commonly used linear regression models.

  1. Gauss-Markov Model [3] : If we consider a one-component model where $U_1 = I_n$, then the model is equivalent to the Gauss-Markov model $y = X\beta + e$ with $e = \xi_1$ and $\sigma^2 = \sigma_1^2$.
  2. Heteroscedastic Model [1] : Each set of random variables in $y$ that shares a common variance can be modeled as an individual variance component with an appropriate $U_i$.

A compact representation for the model is the following, where $U = \begin{bmatrix} U_1 & \cdots & U_k \end{bmatrix}$ and $\xi' = \begin{bmatrix} \xi_1' & \cdots & \xi_k' \end{bmatrix}$.

$y = X\beta + U\xi$

Note that this model makes no distributional assumptions about $y$ other than the first and second moments. [3]
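As a concrete illustration of this structure, the sketch below (assuming NumPy; the dimensions and variance values are arbitrary) builds a two-component model and checks that the covariance of $y$ implied by the compact form $y = X\beta + U\xi$ matches the component-wise sum $\sum_i \sigma_i^2 U_i U_i'$.

```python
import numpy as np

rng = np.random.default_rng(0)

n, m = 12, 2          # observations, fixed effects
c1, c2 = 3, 4         # sizes of the two random-effect vectors
sigma2 = [2.0, 0.5]   # variance components (arbitrary values)

X = rng.standard_normal((n, m))    # fixed-effect design
U1 = rng.standard_normal((n, c1))  # random-effect designs
U2 = rng.standard_normal((n, c2))

# Compact form: U = [U1 U2], Var(xi) = diag(sigma1^2 I, sigma2^2 I)
U = np.hstack([U1, U2])
D = np.diag([sigma2[0]] * c1 + [sigma2[1]] * c2)

# Cov(y) from the compact form vs. the component-wise sum
cov_compact = U @ D @ U.T
cov_components = sigma2[0] * U1 @ U1.T + sigma2[1] * U2 @ U2.T
assert np.allclose(cov_compact, cov_components)
```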

The goal in MINQUE is to estimate $\theta = \sum_{i=1}^k p_i \sigma_i^2$ using a quadratic form $\hat{\theta} = y'Ay$. MINQUE estimators are derived by identifying a matrix $A$ such that the estimator has some desirable properties, [2] [3] described below.

Optimal Estimator Properties to Constrain MINQUE

Invariance to translation of the fixed effects

Consider a new fixed-effect parameter $\gamma = \beta - \beta_0$, which represents a translation of the original fixed effect. The new, equivalent model is now the following.

$y - X\beta_0 = X\gamma + U\xi$

Under this equivalent model, the MINQUE estimator is now $(y - X\beta_0)'A(y - X\beta_0)$. Rao argued that since the underlying models are equivalent, this estimator should be equal to $y'Ay$. [2] [3] This can be achieved by constraining $A$ such that $AX = 0$, which ensures that all terms other than $y'Ay$ in the expansion of the quadratic form are zero.
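This argument can be checked numerically (a sketch assuming NumPy; the matrices are arbitrary): any symmetric $A$ satisfying $AX = 0$ yields a quadratic form unchanged by translating the response by $X\beta_0$.

```python
import numpy as np

rng = np.random.default_rng(1)

n, m = 10, 3
X = rng.standard_normal((n, m))
y = rng.standard_normal(n)

# Construct a symmetric A with AX = 0 by sandwiching a symmetric matrix
# between residual projections (I - P), where P projects onto col(X).
P = X @ np.linalg.pinv(X)
M = rng.standard_normal((n, n))
A = (np.eye(n) - P) @ (M + M.T) @ (np.eye(n) - P)
assert np.allclose(A @ X, 0)

# Translating the fixed effects leaves the quadratic form unchanged.
beta0 = rng.standard_normal(m)
q_original = y @ A @ y
q_translated = (y - X @ beta0) @ A @ (y - X @ beta0)
assert np.isclose(q_original, q_translated)
```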

Unbiased estimation

Suppose that we constrain $AX = 0$, as argued in the section above. Then, the MINQUE estimator has the following form

$\hat{\theta} = y'Ay = (X\beta + U\xi)'A(X\beta + U\xi) = \xi'U'AU\xi$

To ensure that this estimator is unbiased, the expectation of the estimator $\mathbb{E}[\hat{\theta}]$ must equal the parameter of interest, $\theta$. Below, the expectation of the estimator can be decomposed for each component since the components are uncorrelated with each other. Furthermore, the cyclic property of the trace is used to evaluate the expectation with respect to $\xi_i$.

$\mathbb{E}[\hat{\theta}] = \mathbb{E}[\xi'U'AU\xi] = \sum_{i=1}^k \mathbb{E}[\xi_i'U_i'AU_i\xi_i] = \sum_{i=1}^k \sigma_i^2 \operatorname{tr}(U_i'AU_i)$

To ensure that this estimator is unbiased, Rao suggested setting $\sum_{i=1}^k \sigma_i^2 \operatorname{tr}(U_i'AU_i) = \sum_{i=1}^k p_i \sigma_i^2$, which can be accomplished by constraining $A$ such that $\operatorname{tr}(U_i'AU_i) = p_i$ for all components. [3]
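The decomposition above can be verified numerically (a sketch assuming NumPy): for any symmetric $A$ with $AX = 0$, the expectation $\mathbb{E}[y'Ay] = \operatorname{tr}(A \operatorname{Cov}(y))$ splits into $\sum_i \sigma_i^2 \operatorname{tr}(U_i'AU_i)$, with no Monte Carlo needed since both sides are exact.

```python
import numpy as np

rng = np.random.default_rng(2)

n, m, c1, c2 = 10, 2, 3, 4
sigma2 = [1.5, 0.25]
X = rng.standard_normal((n, m))
U1 = rng.standard_normal((n, c1))
U2 = rng.standard_normal((n, c2))

# Any symmetric A with AX = 0 (sandwich construction)
P = X @ np.linalg.pinv(X)
M = rng.standard_normal((n, n))
A = (np.eye(n) - P) @ (M + M.T) @ (np.eye(n) - P)

# E[y'Ay] = tr(A Cov(y)): AX = 0 removes the fixed-effect mean term
cov_y = sigma2[0] * U1 @ U1.T + sigma2[1] * U2 @ U2.T
expectation = np.trace(A @ cov_y)

# Component-wise decomposition via the cyclic property of the trace
decomposed = sum(s * np.trace(Ui.T @ A @ Ui)
                 for s, Ui in zip(sigma2, [U1, U2]))
assert np.allclose(expectation, decomposed)
```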

Minimum Norm

Rao argues that if $\xi$ were observed, a "natural" estimator for $\theta$ would be the following [2] [3] since $\mathbb{E}[\xi'\Delta\xi] = \sum_{i=1}^k p_i \sigma_i^2 = \theta$. Here, $\Delta$ is defined as the diagonal matrix $\operatorname{diag}\!\left(\frac{p_1}{c_1} I_{c_1}, \ldots, \frac{p_k}{c_k} I_{c_k}\right)$.

$\xi'\Delta\xi = \sum_{i=1}^k \frac{p_i}{c_i}\, \xi_i'\xi_i$

The difference between the proposed estimator and the natural estimator is $y'Ay - \xi'\Delta\xi = \xi'(U'AU - \Delta)\xi$. This difference can be minimized by minimizing the norm of the matrix $U'AU - \Delta$.

Procedure

Given the constraints and optimization strategy derived from the optimal properties above, the MINQUE estimator for $\theta = \sum_{i=1}^k p_i \sigma_i^2$ is derived by choosing a matrix $A$ that minimizes $\lVert U'AU - \Delta \rVert$, subject to the constraints

  1. $AX = 0$, and
  2. $\operatorname{tr}(U_i'AU_i) = p_i$ for all $i$.

Examples of Estimators

Standard Estimator for Homoscedastic Error

In the Gauss-Markov model $y = X\beta + e$ (with $X$ an $n \times m$ matrix), the error variance $\sigma^2$ is estimated using the following.

$s^2 = \frac{1}{n - m}(y - X\hat{\beta})'(y - X\hat{\beta})$

This estimator is unbiased and can be shown to minimize the Euclidean norm of the form $\lVert U'AU - \Delta \rVert$. [1] Thus, the standard estimator for error variance in the Gauss-Markov model is a MINQUE estimator.
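This can be checked directly (a sketch assuming NumPy; the data are arbitrary): writing $s^2 = y'Ay$ with $A = (I - P)/(n - m)$, the matrix $A$ satisfies both MINQUE constraints for the Gauss-Markov case, where $U_1 = I$ and $p_1 = 1$.

```python
import numpy as np

rng = np.random.default_rng(3)

n, m = 15, 4
X = rng.standard_normal((n, m))
y = rng.standard_normal(n)

# s^2 = y'(I - P)y / (n - m) written as a quadratic form y'Ay
P = X @ np.linalg.pinv(X)
A = (np.eye(n) - P) / (n - m)

# MINQUE constraints for the Gauss-Markov case (U1 = I, p1 = 1):
assert np.allclose(A @ X, 0)          # invariance: AX = 0
assert np.isclose(np.trace(A), 1.0)   # unbiasedness: tr(U1' A U1) = tr(A) = 1

# The quadratic form reproduces the usual residual variance estimator
resid = y - P @ y
assert np.isclose(y @ A @ y, resid @ resid / (n - m))
```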

Random Variables with Common Mean and Heteroscedastic Error

For random variables $y_1, \ldots, y_n$ with a common mean and different variances $\sigma_1^2, \ldots, \sigma_n^2$, the MINQUE estimator for $\sigma_j^2$ is $\frac{n}{n-2}\left[(y_j - \bar{y})^2 - \frac{s^2}{n}\right]$, where $\bar{y} = \frac{1}{n}\sum_{i=1}^n y_i$ and $s^2 = \frac{1}{n-1}\sum_{i=1}^n (y_i - \bar{y})^2$. [1]
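A short sketch of this estimator (assuming NumPy; the function name and the example variance values are illustrative only). Rather than simulating, the unbiasedness check below substitutes the exact second moments $\mathbb{E}[(y_j - \bar{y})^2] = (1 - 2/n)\sigma_j^2 + S/n^2$ and $\mathbb{E}[s^2] = S/n$, where $S = \sum_i \sigma_i^2$, into the estimator's formula.

```python
import numpy as np

def minque_hetero(y):
    """MINQUE estimates of sigma_j^2 under a common-mean model."""
    y = np.asarray(y, dtype=float)
    n = y.size
    ybar = y.mean()
    s2 = np.sum((y - ybar) ** 2) / (n - 1)
    return n / (n - 2) * ((y - ybar) ** 2 - s2 / n)

# Example usage on arbitrary data (one estimate per observation)
y_obs = np.array([2.1, -0.3, 0.8, 1.4, 0.2, -1.0])
estimates = minque_hetero(y_obs)
assert estimates.shape == (6,)

# Unbiasedness check using exact moments rather than simulation
n = 6
sigma2 = np.array([1.0, 2.0, 0.5, 3.0, 1.5, 0.25])
S = sigma2.sum()
e_sq_dev = (1 - 2 / n) * sigma2 + S / n ** 2   # E[(y_j - ybar)^2]
e_s2 = S / n                                    # E[s^2]
e_estimator = n / (n - 2) * (e_sq_dev - e_s2 / n)
assert np.allclose(e_estimator, sigma2)
```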

Estimator for Variance Components

Rao proposed a MINQUE estimator for the variance components model based on minimizing the Euclidean norm. [2] The Euclidean norm $\lVert \cdot \rVert_2$ is the square root of the sum of squares of all elements in the matrix. When evaluating this norm below, $V = UU' = \sum_{i=1}^k U_iU_i' = \sum_{i=1}^k V_i$, where $V_i = U_iU_i'$. Furthermore, using the cyclic property of traces, $\operatorname{tr}(U'AUU'AU) = \operatorname{tr}(AUU'AUU') = \operatorname{tr}(AVAV)$.

$\lVert U'AU - \Delta \rVert_2^2 = \operatorname{tr}\!\left[(U'AU - \Delta)'(U'AU - \Delta)\right] = \operatorname{tr}(AVAV) - 2\sum_{i=1}^k \frac{p_i}{c_i} \operatorname{tr}(U_i'AU_i) + \operatorname{tr}(\Delta^2)$

Note that since $\Delta$ does not depend on $A$, and the unbiasedness constraint fixes $\operatorname{tr}(U_i'AU_i) = p_i$, the MINQUE with the Euclidean norm is obtained by identifying the matrix $A$ that minimizes $\operatorname{tr}(AVAV)$, subject to the MINQUE constraints discussed above.

Rao showed that the matrix that satisfies this optimization problem is

$A^\star = \sum_{i=1}^k \lambda_i R V_i R$,

where $R = V^{-1}(I - P)$, $P = X(X'V^{-1}X)^{-}X'V^{-1}$ is the projection matrix into the column space of $X$, and $(\cdot)^{-}$ represents the generalized inverse of a matrix.

Therefore, the MINQUE estimator is the following, where the vectors $\lambda = (\lambda_1, \ldots, \lambda_k)'$ and $Q = (Q_1, \ldots, Q_k)'$ are defined based on the sum.

$\hat{\theta} = y'A^\star y = \sum_{i=1}^k \lambda_i\, y'RV_iRy = \sum_{i=1}^k \lambda_i Q_i = \lambda'Q$

The vector $\lambda$ is obtained by using the constraint $\operatorname{tr}(A^\star V_j) = p_j$. That is, the vector represents the solution to the following system of equations for $j = 1, \ldots, k$.

$\operatorname{tr}(A^\star V_j) = \sum_{i=1}^k \lambda_i \operatorname{tr}(RV_iRV_j) = p_j$

This can be written as a matrix product $S\lambda = p$, where $p = (p_1, \ldots, p_k)'$ and $S$ is the following.

$S = \begin{bmatrix} \operatorname{tr}(RV_1RV_1) & \cdots & \operatorname{tr}(RV_1RV_k) \\ \vdots & \ddots & \vdots \\ \operatorname{tr}(RV_kRV_1) & \cdots & \operatorname{tr}(RV_kRV_k) \end{bmatrix}$

Then, $\lambda = S^{-}p$. This implies that the MINQUE is $\hat{\theta} = \lambda'Q = p'(S^{-})'Q$. Note that $\theta = \sum_{i=1}^k p_i \sigma_i^2 = p'\sigma$, where $\sigma = (\sigma_1^2, \ldots, \sigma_k^2)'$. Therefore, the estimator for the variance components is $\hat{\sigma} = (S^{-})'Q$.
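The procedure above can be sketched end to end (assuming NumPy; the function name and the identity working covariance $V = I$ are assumptions of the sketch, since in practice $V$ depends on the unknown components and is replaced by a prior guess). As a sanity check, with a single component and $U_1 = I$ the estimator reduces to the residual variance $y'(I - P)y/(n - m)$ from the Gauss-Markov example above.

```python
import numpy as np

def minque(y, X, Us, V=None):
    """MINQUE variance-component estimates sigma_hat = S^- Q.

    Us : list of random-effect design matrices U_i.
    V  : working covariance; defaults to the identity (an assumed
         prior weighting for this sketch).
    """
    n = y.size
    V = np.eye(n) if V is None else V
    Vinv = np.linalg.inv(V)
    # Projection onto col(X), weighted by V^{-1}
    P = X @ np.linalg.pinv(X.T @ Vinv @ X) @ X.T @ Vinv
    R = Vinv @ (np.eye(n) - P)
    Vs = [U @ U.T for U in Us]
    S = np.array([[np.trace(R @ Vi @ R @ Vj) for Vj in Vs] for Vi in Vs])
    Q = np.array([y @ R @ Vi @ R @ y for Vi in Vs])
    return np.linalg.pinv(S) @ Q

rng = np.random.default_rng(4)
n, m = 12, 3
X = rng.standard_normal((n, m))
y = rng.standard_normal(n)

# One component with U1 = I: MINQUE equals the OLS residual variance
sigma_hat = minque(y, X, [np.eye(n)])
P0 = X @ np.linalg.pinv(X)
s2 = y @ (np.eye(n) - P0) @ y / (n - m)
assert np.isclose(sigma_hat[0], s2)
```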

Extensions

MINQUE estimators can be obtained without the invariance criterion, in which case the estimator is only unbiased and minimizes the norm. [2] Such estimators have slightly different constraints on the minimization problem.

The model can be extended to estimate covariance components. [3] In such a model, the random effects of a component are assumed to have a common covariance structure $\mathbb{V}[\xi_i] = \Sigma$. A MINQUE estimator for a mixture of variance and covariance components was also proposed. [3] In this model, $\mathbb{V}[\xi_i] = \Sigma$ for $i \leq s$ and $\mathbb{V}[\xi_i] = \sigma_i^2 I_{c_i}$ for $s < i \leq k$.

References

  1. Rao, C.R. (1970). "Estimation of heteroscedastic variances in linear models". Journal of the American Statistical Association. 65 (329): 161–172. doi:10.1080/01621459.1970.10481070. JSTOR 2283583.
  2. Rao, C.R. (1971). "Estimation of variance and covariance components - MINQUE theory". Journal of Multivariate Analysis. 1 (3): 257–275. doi:10.1016/0047-259X(71)90001-7. hdl:10338.dmlcz/104230.
  3. Rao, C.R. (1972). "Estimation of variance and covariance components in linear models". Journal of the American Statistical Association. 67 (337): 112–115. doi:10.1080/01621459.1972.10481212. JSTOR 2284708.