This article needs additional citations for verification .(November 2009) (Learn how and when to remove this template message) |

In statistics a **minimum-variance unbiased estimator (MVUE)** or **uniformly minimum-variance unbiased estimator (UMVUE)** is an unbiased estimator that has lower variance than any other unbiased estimator for all possible values of the parameter.

For practical statistics problems, it is important to determine the MVUE if one exists, since less-than-optimal procedures would naturally be avoided, other things being equal. This has led to substantial development of statistical theory related to the problem of optimal estimation.

While combining the constraint of unbiasedness with the desirability metric of least variance leads to good results in most practical settings—making MVUE a natural starting point for a broad range of analyses—a targeted specification may perform better for a given problem; thus, MVUE is not always the best stopping point.

Consider estimation of based on data i.i.d. from some member of a family of densities , where is the parameter space. An unbiased estimator of is *UMVUE* if ,

for any other unbiased estimator

If an unbiased estimator of exists, then one can prove there is an essentially unique MVUE.^{ [1] } Using the Rao–Blackwell theorem one can also prove that determining the MVUE is simply a matter of finding a complete sufficient statistic for the family and conditioning *any* unbiased estimator on it.

Further, by the Lehmann–Scheffé theorem, an unbiased estimator that is a function of a complete, sufficient statistic is the UMVUE estimator.

Put formally, suppose is unbiased for , and that is a complete sufficient statistic for the family of densities. Then

is the MVUE for

A Bayesian analog is a Bayes estimator, particularly with minimum mean square error (MMSE).

An efficient estimator need not exist, but if it does and if it is unbiased, it is the MVUE. Since the mean squared error (MSE) of an estimator *δ* is

the MVUE minimizes MSE *among unbiased estimators*. In some cases biased estimators have lower MSE because they have a smaller variance than does any unbiased estimator; see estimator bias.

Consider the data to be a single observation from an absolutely continuous distribution on with density

and we wish to find the UMVU estimator of

First we recognize that the density can be written as

Which is an exponential family with sufficient statistic . In fact this is a full rank exponential family, and therefore is complete sufficient. See exponential family for a derivation which shows

Therefore,

Here we use Lehmann–Scheffé theorem to get the MVUE

Clearly is unbiased and is complete sufficient, thus the UMVU estimator is

This example illustrates that an unbiased function of the complete sufficient statistic will be UMVU, as Lehmann–Scheffé theorem states.

- For a normal distribution with unknown mean and variance, the sample mean and (unbiased) sample variance are the MVUEs for the population mean and population variance.
- However, the sample standard deviation is not unbiased for the population standard deviation – see unbiased estimation of standard deviation.
- Further, for other distributions the sample mean and sample variance are not in general MVUEs – for a uniform distribution with unknown upper and lower bounds, the mid-range is the MVUE for the population mean.

- If
*k*exemplars are chosen (without replacement) from a discrete uniform distribution over the set {1, 2, ...,*N*} with unknown upper bound*N*, the MVUE for*N*is

- where
*m*is the sample maximum. This is a scaled and shifted (so unbiased) transform of the sample maximum, which is a sufficient and complete statistic. See German tank problem for details.

In statistics, an **estimator** is a rule for calculating an estimate of a given quantity based on observed data: thus the rule, the quantity of interest and its result are distinguished.

In statistics, **maximum likelihood estimation** (**MLE**) is a method of estimating the parameters of a probability distribution by maximizing a likelihood function, so that under the assumed statistical model the observed data is most probable. The point in the parameter space that maximizes the likelihood function is called the maximum likelihood estimate. The logic of maximum likelihood is both intuitive and flexible, and as such the method has become a dominant means of statistical inference.

In statistics, **completeness** is a property of a statistic in relation to a model for a set of observed data. In essence, it ensures that the distributions corresponding to different values of the parameters are distinct.

In statistics, the **mean squared error** (**MSE**) or **mean squared deviation** (**MSD**) of an estimator measures the average of the squares of the errors—that is, the average squared difference between the estimated values and the actual value. MSE is a risk function, corresponding to the expected value of the squared error loss. The fact that MSE is almost always strictly positive is because of randomness or because the estimator does not account for information that could produce a more accurate estimate.

In statistics, the **Lehmann–Scheffé theorem** is a prominent statement, tying together the ideas of completeness, sufficiency, uniqueness, and best unbiased estimation. The theorem states that any estimator which is unbiased for a given unknown quantity and that depends on the data only through a complete, sufficient statistic is the unique best unbiased estimator of that quantity. The Lehmann–Scheffé theorem is named after Erich Leo Lehmann and Henry Scheffé, given their two early papers.

In statistics, the **Rao–Blackwell theorem**, sometimes referred to as the **Rao–Blackwell–Kolmogorov theorem**, is a result which characterizes the transformation of an arbitrarily crude estimator into an estimator that is optimal by the mean-squared-error criterion or any of a variety of similar criteria.

In statistics, an **efficient estimator** is an estimator that estimates the quantity of interest in some “best possible” manner. The notion of “best possible” relies upon the choice of a particular loss function — the function which quantifies the relative degree of undesirability of estimation errors of different magnitudes. The most common choice of the loss function is quadratic, resulting in the mean squared error criterion of optimality.

In estimation theory and statistics, the **Cramér–Rao bound (CRB)**, **Cramér–Rao lower bound (CRLB)**, **Cramér–Rao inequality**, **Fréchet–Darmois–Cramér–Rao inequality**, or **information inequality** expresses a lower bound on the variance of unbiased estimators of a deterministic parameter. This term is named in honor of Harald Cramér, Calyampudi Radhakrishna Rao, Maurice Fréchet and Georges Darmois all of whom independently derived this limit to statistical precision in the 1940s.

In statistics, the **score** is the gradient of the log-likelihood function with respect to the parameter vector. Evaluated at a particular point of the parameter vector, the score indicates the steepness of the log-likelihood function and thereby the sensitivity to infinitesimal changes to the parameter values. If the log-likelihood function is continuous over the parameter space, the score will vanish at a local maximum or minimum; this fact is used in maximum likelihood estimation to find the parameter values that maximize the likelihood function.

In mathematical statistics, the **Fisher information** is a way of measuring the amount of information that an observable random variable *X* carries about an unknown parameter *θ* of a distribution that models *X*. Formally, it is the variance of the score, or the expected value of the observed information. In Bayesian statistics, the asymptotic distribution of the posterior mode depends on the Fisher information and not on the prior. The role of the Fisher information in the asymptotic theory of maximum-likelihood estimation was emphasized by the statistician Ronald Fisher. The Fisher information is also used in the calculation of the Jeffreys prior, which is used in Bayesian statistics.

In statistics, a **consistent estimator** or **asymptotically consistent estimator** is an estimator—a rule for computing estimates of a parameter *θ*_{0}—having the property that as the number of data points used increases indefinitely, the resulting sequence of estimates converges in probability to *θ*_{0}. This means that the distributions of the estimates become more and more concentrated near the true value of the parameter being estimated, so that the probability of the estimator being arbitrarily close to *θ*_{0} converges to one.

In statistics, the **delta method** is a result concerning the approximate probability distribution for a function of an asymptotically normal statistical estimator from knowledge of the limiting variance of that estimator.

In statistics, the **bias** of an estimator is the difference between this estimator's expected value and the true value of the parameter being estimated. An estimator or decision rule with zero bias is called **unbiased**. In statistics, "bias" is an **objective** property of an estimator.

In statistics, the **Chapman–Robbins bound** or **Hammersley–Chapman–Robbins bound** is a lower bound on the variance of estimators of a deterministic parameter. It is a generalization of the Cramér–Rao bound; compared to the Cramér–Rao bound, it is both tighter and applicable to a wider range of problems. However, it is usually more difficult to compute.

In statistics, the concept of being an **invariant estimator** is a criterion that can be used to compare the properties of different estimators for the same quantity. It is a way of formalising the idea that an estimator should have certain intuitively appealing qualities. Strictly speaking, "invariant" would mean that the estimates themselves are unchanged when both the measurements and the parameters are transformed in a compatible way, but the meaning has been extended to allow the estimates to change in appropriate ways with such transformations. The term **equivariant estimator** is used in formal mathematical contexts that include a precise description of the relation of the way the estimator changes in response to changes to the dataset and parameterisation: this corresponds to the use of "equivariance" in more general mathematics.

In statistics, **principal component regression** (**PCR**) is a regression analysis technique that is based on principal component analysis (PCA). More specifically, PCA is used for estimating the unknown regression coefficients in a standard linear regression model.

**Location estimation** in **wireless sensor networks** is the problem of estimating the location of an object from a set of noisy measurements. These measurements are acquired in a distributed manner by a set of sensors.

In statistics, **maximum spacing estimation**, or **maximum product of spacing estimation (MPS)**, is a method for estimating the parameters of a univariate statistical model. The method requires maximization of the geometric mean of *spacings* in the data, which are the differences between the values of the cumulative distribution function at neighbouring data points.

In the comparison of various statistical procedures, **efficiency** is a measure of quality of an estimator, of an experimental design, or of a hypothesis testing procedure. Essentially, a more efficient estimator, experiment, or test needs fewer observations than a less efficient one to achieve a given performance. This article primarily deals with efficiency of estimators.

In statistics, the **variance function** is a smooth function which depicts the variance of a random quantity as a function of its mean. The variance function plays a large role in many settings of statistical modelling. It is a main ingredient in the generalized linear model framework and a tool used in non-parametric regression, semiparametric regression and functional data analysis. In parametric modeling, variance functions take on a parametric form and explicitly describe the relationship between the variance and the mean of a random quantity. In a non-parametric setting, the variance function is assumed to be a smooth function.

- Keener, Robert W. (2006).
*Statistical Theory: Notes for a Course in Theoretical Statistics*. Springer. pp. 47–48, 57–58. - Voinov V. G., Nikulin M.S. (1993).
*Unbiased estimators and their applications, Vol.1: Univariate case*. Kluwer Academic Publishers. pp. 521p.

This page is based on this Wikipedia article

Text is available under the CC BY-SA 4.0 license; additional terms may apply.

Images, videos and audio are available under their respective licenses.

Text is available under the CC BY-SA 4.0 license; additional terms may apply.

Images, videos and audio are available under their respective licenses.