# Completeness (statistics)

In statistics, completeness is a property of a statistic in relation to a model for a set of observed data. In essence, it ensures that the distributions corresponding to different values of the parameters are distinct.

It is closely related to the idea of identifiability, but in statistical theory it is often found as a condition imposed on a sufficient statistic from which certain optimality results are derived.

## Definition

Consider a random variable X whose probability distribution belongs to a parametric model Pθ parametrized by θ.

Say T is a statistic; that is, a measurable function of a random sample X1, ..., Xn.

The statistic T is said to be complete for the distribution of X if, for every measurable function g,[1]

${\displaystyle {\text{if }}\operatorname {E} _{\theta }(g(T))=0{\text{ for all }}\theta {\text{ then }}\mathbf {P} _{\theta }(g(T)=0)=1{\text{ for all }}\theta .}$

The statistic T is said to be boundedly complete for the distribution of X if this implication holds for every measurable function g that is also bounded.

### Example 1: Bernoulli model

The Bernoulli model admits a complete statistic. [2] Let X be a random sample of size n such that each Xi has the same Bernoulli distribution with parameter p. Let T be the number of 1s observed in the sample, i.e. ${\displaystyle \textstyle T=\sum _{i=1}^{n}X_{i}}$. T is a statistic of X which has a binomial distribution with parameters (n,p). If the parameter space for p is (0,1), then T is a complete statistic. To see this, note that

${\displaystyle \operatorname {E} _{p}(g(T))=\sum _{t=0}^{n}{g(t){n \choose t}p^{t}(1-p)^{n-t}}=(1-p)^{n}\sum _{t=0}^{n}{g(t){n \choose t}\left({\frac {p}{1-p}}\right)^{t}}.}$

Observe also that neither p nor 1 − p can be 0. Hence ${\displaystyle E_{p}(g(T))=0}$ if and only if:

${\displaystyle \sum _{t=0}^{n}g(t){n \choose t}\left({\frac {p}{1-p}}\right)^{t}=0.}$

On denoting p/(1 − p) by r, one gets:

${\displaystyle \sum _{t=0}^{n}g(t){n \choose t}r^{t}=0.}$

First, observe that r ranges over all of the positive reals as p ranges over (0, 1). Moreover, E(g(T)) is a polynomial in r and can therefore be identically 0 only if all of its coefficients are 0, that is, only if g(t) = 0 for all t.

It is important to notice that the result that all coefficients must be 0 was obtained because of the range of r. Had the parameter space been finite and with a number of elements less than or equal to n, it might be possible to solve the linear equations in g(t) obtained by substituting the values of r and get solutions different from 0. For example, if n = 1 and the parameter space is {0.5}, a single observation and a single parameter value, T is not complete. Observe that, with the definition:

${\displaystyle g(t)=2(t-0.5),\,}$

then E(g(T)) = 0, even though g(t) is nonzero both for t = 0 and for t = 1.
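This failure of completeness on the restricted parameter space can be checked by direct computation. The sketch below (the function name is ours, not from the source) evaluates E_p(g(T)) for g(t) = 2(t − 0.5) with n = 1: the expectation vanishes at p = 0.5, but not at other values of p in (0, 1).

```python
from math import comb

# Counterexample sketch: a single Bernoulli observation (n = 1) with g(t) = 2(t - 0.5).
# Here T = X1, so E_p[g(T)] = sum over t of g(t) * C(n, t) * p^t * (1-p)^(n-t).
def expected_g(p, n=1):
    return sum(2 * (t - 0.5) * comb(n, t) * p**t * (1 - p)**(n - t)
               for t in range(n + 1))

# With the parameter space restricted to {0.5}, the expectation vanishes even
# though g is nonzero at t = 0 and t = 1, so T is not complete there.
print(expected_g(0.5))  # 0.0
# Over the full parameter space (0, 1) the expectation is not identically zero:
print(expected_g(0.3))  # about -0.4
```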

## Relation to sufficient statistics

For some parametric families, a complete sufficient statistic does not exist (for example, see Galili and Meilijson 2016 [3] ).

For example, given a sample of size n > 2 from a N(θ, θ²) distribution, ${\displaystyle \left(\sum _{i=1}^{n}X_{i},\sum _{i=1}^{n}X_{i}^{2}\right)}$ is a minimal sufficient statistic and is a function of any other minimal sufficient statistic, but ${\displaystyle 2\left(\sum _{i=1}^{n}X_{i}\right)^{2}-(n+1)\sum _{i=1}^{n}X_{i}^{2}}$ has expectation 0 for all θ, so no complete sufficient statistic can exist.
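The claim that this statistic has expectation 0 for every θ can be checked by simulation. The following is a minimal Monte Carlo sketch (the function name and parameter choices are ours, not from the source):

```python
import random

# For X_i ~ N(theta, theta^2), the statistic 2*(sum X_i)^2 - (n+1)*sum X_i^2
# is a nonzero function of the minimal sufficient statistic with expectation 0
# for every theta, which is why no complete sufficient statistic exists.
def zero_expectation_statistic(theta, n, reps=100_000, seed=42):
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        xs = [rng.gauss(theta, theta) for _ in range(n)]  # sd = theta, so var = theta^2
        s1 = sum(xs)
        s2 = sum(x * x for x in xs)
        total += 2 * s1 * s1 - (n + 1) * s2
    return total / reps

print(zero_expectation_statistic(theta=1.0, n=3))  # close to 0
print(zero_expectation_statistic(theta=2.0, n=5))  # close to 0 for a different theta
```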

If there is a minimal sufficient statistic then any complete sufficient statistic is also minimal sufficient. But there are pathological cases where a minimal sufficient statistic does not exist even if a complete statistic does.

## Importance of completeness

The notion of completeness has many applications in statistics, particularly in the following two theorems of mathematical statistics.

### Lehmann–Scheffé theorem

Completeness occurs in the Lehmann–Scheffé theorem,[4] which states that if a statistic is unbiased, complete, and sufficient for some parameter θ, then it is the best mean-unbiased estimator of θ. In other words, this statistic has a smaller expected loss than any other unbiased estimator for every convex loss function; in particular, under the squared loss function it has the smallest mean squared error among all estimators with the same expected value.

Examples exist in which the minimal sufficient statistic is not complete and several alternative statistics are available for unbiased estimation of θ, some with lower variance than others.[5]
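As a concrete illustration in the Bernoulli model above, T = ΣXi is complete and sufficient, and T/n is unbiased for p, so by the Lehmann–Scheffé theorem T/n is the minimum-variance unbiased estimator. The sketch below (the helper name is ours, not from the source) computes exact moments over the Binomial(n, p) distribution and compares the variance of T/n with that of the cruder unbiased estimator X1:

```python
from math import comb

def exact_mean_and_var(estimator, n, p):
    # Exact mean and variance of estimator(T), where T ~ Binomial(n, p).
    pmf = [comb(n, t) * p**t * (1 - p)**(n - t) for t in range(n + 1)]
    mean = sum(estimator(t) * w for t, w in enumerate(pmf))
    var = sum((estimator(t) - mean) ** 2 * w for t, w in enumerate(pmf))
    return mean, var

n, p = 10, 0.3
mean_umvue, var_umvue = exact_mean_and_var(lambda t: t / n, n, p)
print(mean_umvue)   # about 0.3: T/n is unbiased for p
print(var_umvue)    # about 0.021, i.e. p(1-p)/n
print(p * (1 - p))  # 0.21: variance of the unbiased estimator X_1, ten times larger
```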

### Basu's theorem

Bounded completeness occurs in Basu's theorem, [6] which states that a statistic that is both boundedly complete and sufficient is independent of any ancillary statistic.
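A classical illustration: for an i.i.d. sample from N(μ, 1) with known variance, the sample mean is a complete sufficient statistic for μ while the sample variance is ancillary, so Basu's theorem implies the two are independent. The Monte Carlo sketch below (function names and constants are ours, not from the source) estimates their correlation, which should be near 0:

```python
import random

# Simulate many samples from N(mu, 1) and record the sample mean and sample
# variance of each; by Basu's theorem these are independent, so their
# empirical correlation across replications should be close to 0.
def mean_var_correlation(mu=3.0, n=5, reps=100_000, seed=0):
    rng = random.Random(seed)
    means, variances = [], []
    for _ in range(reps):
        xs = [rng.gauss(mu, 1.0) for _ in range(n)]
        m = sum(xs) / n
        means.append(m)
        variances.append(sum((x - m) ** 2 for x in xs) / (n - 1))

    def corr(a, b):
        ma, mb = sum(a) / len(a), sum(b) / len(b)
        cov = sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)
        sa = (sum((x - ma) ** 2 for x in a) / len(a)) ** 0.5
        sb = (sum((y - mb) ** 2 for y in b) / len(b)) ** 0.5
        return cov / (sa * sb)

    return corr(means, variances)

print(mean_var_correlation())  # close to 0
```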

Bounded completeness also occurs in Bahadur's theorem: when at least one minimal sufficient statistic exists, any statistic that is sufficient and boundedly complete is necessarily minimal sufficient.

## Notes

1. Young, G. A. and Smith, R. L. (2005). Essentials of Statistical Inference. (p. 94). Cambridge University Press.
2. Casella, G. and Berger, R. L. (2001). Statistical Inference. (pp. 285–286). Duxbury Press.
3. Galili, T. and Meilijson, I. (2016). "An Example of an Improvable Rao–Blackwell Improvement, Inefficient Maximum Likelihood Estimator, and Unbiased Generalized Bayes Estimator". The American Statistician. 70 (1): 108–113. doi:10.1080/00031305.2015.1100683. PMID 27499547.
4. Casella, George; Berger, Roger L. (2001). Statistical Inference (2nd ed.). Duxbury Press. ISBN   978-0534243128.
5. Galili, T. and Meilijson, I. (2016). "An Example of an Improvable Rao–Blackwell Improvement, Inefficient Maximum Likelihood Estimator, and Unbiased Generalized Bayes Estimator". The American Statistician. 70 (1): 108–113. doi:10.1080/00031305.2015.1100683. PMID 27499547.
6. Casella, G. and Berger, R. L. (2001). Statistical Inference. (p. 287). Duxbury Press.
