# Pivotal quantity

In statistics, a pivotal quantity (or pivot) is a function of observations and unobservable parameters whose probability distribution does not depend on the unknown parameters (including nuisance parameters). [1] A pivotal quantity need not be a statistic: the function and its value can depend on the parameters of the model, but its distribution must not. If it is a statistic, it is known as an ancillary statistic.

## Definition

More formally, [2] let ${\displaystyle X=(X_{1},X_{2},\ldots ,X_{n})}$ be a random sample from a distribution that depends on a parameter (or vector of parameters) ${\displaystyle \theta }$. Let ${\displaystyle g(X,\theta )}$ be a random variable whose distribution is the same for all ${\displaystyle \theta }$. Then ${\displaystyle g}$ is called a pivotal quantity (or simply a pivot).

Pivotal quantities are commonly used for normalization to allow data from different data sets to be compared. It is relatively easy to construct pivots for location and scale parameters: for the former we form differences so that location cancels, for the latter ratios so that scale cancels.
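The cancellation idea can be checked by simulation. The sketch below (illustrative, not from the source; parameter values are arbitrary) draws normal samples under two very different settings of the location and scale parameters and confirms that the location-and-scale pivot ${\displaystyle ({\overline {X}}-\mu )/(\sigma /{\sqrt {n}})}$ has the same (standard normal) distribution in both cases:

```python
import numpy as np

rng = np.random.default_rng(0)

def pivot_samples(mu, sigma, n=25, reps=20_000):
    """Draw `reps` samples of size n and return the pivot (X̄ - mu)/(sigma/√n) for each."""
    x = rng.normal(mu, sigma, size=(reps, n))
    return (x.mean(axis=1) - mu) / (sigma / np.sqrt(n))

p1 = pivot_samples(mu=0.0, sigma=1.0)     # illustrative parameter choice
p2 = pivot_samples(mu=50.0, sigma=9.0)    # a very different choice

# Both empirical distributions are approximately N(0, 1), regardless of (mu, sigma).
print(p1.mean(), p1.std())  # both ≈ 0 and ≈ 1
print(p2.mean(), p2.std())  # both ≈ 0 and ≈ 1
```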

Pivotal quantities are fundamental to the construction of test statistics, since they yield statistics whose distributions do not depend on unknown parameters – for example, Student's t-statistic applies to a normal distribution with unknown mean and variance. They also provide one method of constructing confidence intervals, and the use of pivotal quantities improves performance of the bootstrap. In the form of ancillary statistics, they can be used to construct frequentist prediction intervals (predictive confidence intervals).

## Examples

### Normal distribution

One of the simplest pivotal quantities is the z-score; given a normal distribution with mean ${\displaystyle \mu }$ and variance ${\displaystyle \sigma ^{2}}$, and an observation x, the z-score:

${\displaystyle z={\frac {x-\mu }{\sigma }},}$

has distribution ${\displaystyle N(0,1)}$ – a normal distribution with mean 0 and variance 1. Similarly, since the sample mean of ${\displaystyle n}$ observations has sampling distribution ${\displaystyle N(\mu ,\sigma ^{2}/n),}$ the z-score of the mean

${\displaystyle z={\frac {{\overline {X}}-\mu }{\sigma /{\sqrt {n}}}}}$

also has distribution ${\displaystyle N(0,1).}$ Note that while these functions depend on the parameters – and thus one can only compute them if the parameters are known (they are not statistics) – the distribution is independent of the parameters.

Given ${\displaystyle n}$ independent, identically distributed (i.i.d.) observations ${\displaystyle X=(X_{1},X_{2},\ldots ,X_{n})}$ from the normal distribution with unknown mean ${\displaystyle \mu }$ and variance ${\displaystyle \sigma ^{2}}$, a pivotal quantity can be obtained from the function:

${\displaystyle g(x,X)={\frac {x-{\overline {X}}}{s/{\sqrt {n}}}}}$

where

${\displaystyle {\overline {X}}={\frac {1}{n}}\sum _{i=1}^{n}{X_{i}}}$

and

${\displaystyle s^{2}={\frac {1}{n-1}}\sum _{i=1}^{n}{(X_{i}-{\overline {X}})^{2}}}$

are unbiased estimates of ${\displaystyle \mu }$ and ${\displaystyle \sigma ^{2}}$, respectively. The function ${\displaystyle g(x,X)}$ is the Student's t-statistic for a new value ${\displaystyle x}$, to be drawn from the same population as the already observed set of values ${\displaystyle X}$.

Using ${\displaystyle x=\mu }$, the function ${\displaystyle g(\mu ,X)}$ becomes a pivotal quantity distributed according to Student's t-distribution with ${\displaystyle \nu =n-1}$ degrees of freedom. As required, even though ${\displaystyle \mu }$ appears as an argument to the function ${\displaystyle g}$, the distribution of ${\displaystyle g(\mu ,X)}$ does not depend on the parameters ${\displaystyle \mu }$ or ${\displaystyle \sigma }$ of the normal probability distribution that governs the observations ${\displaystyle X_{1},\ldots ,X_{n}}$.
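This too can be verified by simulation. The sketch below (illustrative, not from the source) computes ${\displaystyle g(\mu ,X)}$ over many small samples under two unrelated parameter settings, and checks that the upper 2.5% quantile in both cases matches the t-distribution value ${\displaystyle t_{0.975,\,4}\approx 2.776}$ (for ${\displaystyle n=5}$), not the standard-normal value 1.96:

```python
import numpy as np

rng = np.random.default_rng(1)

def t_pivot(mu, sigma, n=5, reps=100_000):
    """Pivot values (mu - X̄)/(s/√n) over many simulated samples of size n."""
    x = rng.normal(mu, sigma, size=(reps, n))
    xbar = x.mean(axis=1)
    s = x.std(axis=1, ddof=1)            # unbiased sample variance, as in the text
    return (mu - xbar) / (s / np.sqrt(n))

q1 = np.quantile(t_pivot(mu=0.0, sigma=1.0), 0.975)    # illustrative parameters
q2 = np.quantile(t_pivot(mu=-3.0, sigma=10.0), 0.975)  # a very different choice
print(q1, q2)  # both ≈ 2.776, the t quantile with 4 degrees of freedom
```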

This can be used to compute a prediction interval for the next observation ${\displaystyle X_{n+1};}$ see Prediction interval: Normal distribution.
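As a sketch of that use (illustrative, not from the source), the t-pivot gives the 95% prediction interval ${\displaystyle {\overline {X}}\pm t_{0.975,\,n-1}\,s{\sqrt {1+1/n}}}$; the simulation below checks that its coverage is about 95% for any parameter values:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10
t_crit = 2.262  # t_{0.975} with n - 1 = 9 degrees of freedom (standard table value)

def coverage(mu, sigma, reps=20_000):
    """Fraction of simulations where the interval contains the next observation."""
    x = rng.normal(mu, sigma, size=(reps, n))
    x_next = rng.normal(mu, sigma, size=reps)
    xbar, s = x.mean(axis=1), x.std(axis=1, ddof=1)
    half = t_crit * s * np.sqrt(1 + 1 / n)
    return np.mean((x_next >= xbar - half) & (x_next <= xbar + half))

cov1 = coverage(0.0, 1.0)      # illustrative parameter choice
cov2 = coverage(100.0, 15.0)   # a very different choice
print(cov1, cov2)  # both ≈ 0.95, independent of (mu, sigma)
```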

### Bivariate normal distribution

In more complicated cases it is impossible to construct exact pivots; however, approximate pivots improve convergence to asymptotic normality.

Suppose a sample of size ${\displaystyle n}$ of vectors ${\displaystyle (X_{i},Y_{i})'}$ is taken from a bivariate normal distribution with unknown correlation ${\displaystyle \rho }$.

An estimator of ${\displaystyle \rho }$ is the sample (Pearson, moment) correlation

${\displaystyle r={\frac {{\frac {1}{n-1}}\sum _{i=1}^{n}(X_{i}-{\overline {X}})(Y_{i}-{\overline {Y}})}{s_{X}s_{Y}}}}$

where ${\displaystyle s_{X}^{2},s_{Y}^{2}}$ are sample variances of ${\displaystyle X}$ and ${\displaystyle Y}$. The sample statistic ${\displaystyle r}$ has an asymptotically normal distribution:

${\displaystyle {\sqrt {n}}{\frac {r-\rho }{1-\rho ^{2}}}\Rightarrow N(0,1)}$.

However, a variance-stabilizing transformation

${\displaystyle z=\tanh ^{-1}r={\frac {1}{2}}\ln {\frac {1+r}{1-r}},}$

known as Fisher's z-transformation of the correlation coefficient, makes the asymptotic distribution of ${\displaystyle z}$ independent of the unknown parameters:

${\displaystyle {\sqrt {n}}(z-\zeta )\Rightarrow N(0,1)}$

where ${\displaystyle \zeta =\tanh ^{-1}\rho }$ is the corresponding distribution parameter. For finite sample sizes ${\displaystyle n}$, the random variable ${\displaystyle z}$ will have a distribution closer to normal than that of ${\displaystyle r}$. An even closer approximation to the standard normal distribution is obtained by using a better approximation to the exact variance; the usual form is

${\displaystyle \operatorname {Var} (z)\approx {\frac {1}{n-3}}.}$
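The approximate pivot ${\displaystyle (z-\zeta ){\sqrt {n-3}}}$ yields a confidence interval for ${\displaystyle \rho }$: transform ${\displaystyle r}$, add ${\displaystyle \pm 1.96/{\sqrt {n-3}}}$, and back-transform with ${\displaystyle \tanh }$. A simulation sketch (illustrative values, not from the source) checking its coverage:

```python
import numpy as np

rng = np.random.default_rng(3)

def fisher_ci(x, y):
    """Approximate 95% CI for rho via the Fisher z-transformation."""
    r = np.corrcoef(x, y)[0, 1]
    z = np.arctanh(r)                      # tanh^{-1} r
    half = 1.96 / np.sqrt(len(x) - 3)      # uses Var(z) ≈ 1/(n - 3)
    return np.tanh(z - half), np.tanh(z + half)

# Coverage check with illustrative values rho = 0.6, n = 50.
rho, n, reps, hits = 0.6, 50, 5_000, 0
for _ in range(reps):
    x = rng.standard_normal(n)
    y = rho * x + np.sqrt(1 - rho**2) * rng.standard_normal(n)  # correlated pair
    lo, hi = fisher_ci(x, y)
    hits += (lo <= rho <= hi)
cov = hits / reps
print(cov)  # ≈ 0.95
```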

## Robustness

From the point of view of robust statistics, pivotal quantities are robust to changes in the parameters – indeed, independent of the parameters – but not in general robust to changes in the model, such as violations of the assumption of normality. This is fundamental to the robust critique of non-robust statistics, often derived from pivotal quantities: such statistics may be robust within the family, but are not robust outside it.

## References

1. Shao, J. (2008). "Pivotal quantities". Mathematical Statistics (2nd ed.). New York: Springer. pp. 471–477. ISBN 978-0-387-21718-5.
2. DeGroot, Morris H.; Schervish, Mark J. (2011). Probability and Statistics (4th ed.). Pearson. p. 489. ISBN 978-0-321-70970-7.