In statistics, a **pivotal quantity** or **pivot** is a function of observations and unobservable parameters such that the function's probability distribution does not depend on the unknown parameters (including nuisance parameters).^{[1]} A pivot need not be a statistic: the function and its *value* can depend on the parameters of the model, but its *distribution* must not. If it is a statistic, then it is known as an *ancillary statistic*.

More formally,^{[2]} let $X = (X_1, X_2, \ldots, X_n)$ be a random sample from a distribution that depends on a parameter (or vector of parameters) $\theta$. Let $g(X, \theta)$ be a random variable whose distribution is the same for all $\theta$. Then $g$ is called a *pivotal quantity* (or simply a *pivot*).

Pivotal quantities are commonly used for normalization to allow data from different data sets to be compared. It is relatively easy to construct pivots for location and scale parameters: for the former we form differences so that location cancels, for the latter ratios so that scale cancels.
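The cancellation can be checked numerically. The following is a minimal sketch that is not part of the source: the helper names `differences_from_first` and `ratios_to_first` and the sample values are illustrative assumptions. It shows that differences are unchanged when the same constant is added to every observation, and that ratios are unchanged when every observation is multiplied by the same positive constant.

```python
import math

def differences_from_first(xs):
    """Differences x_i - x_0; a common location shift cancels."""
    return [x - xs[0] for x in xs[1:]]

def ratios_to_first(xs):
    """Ratios x_i / x_0; a common positive scale factor cancels."""
    return [x / xs[0] for x in xs[1:]]

base = [1.5, 2.0, 4.0, 8.5]          # illustrative values, not from the source
shifted = [x + 10.0 for x in base]   # location family: add a shift of 10
scaled = [3.0 * x for x in base]     # scale family: multiply by a scale of 3

# The differences ignore the shift, and the ratios ignore the scale.
same_differences = all(
    math.isclose(a, b)
    for a, b in zip(differences_from_first(base), differences_from_first(shifted))
)
same_ratios = all(
    math.isclose(a, b)
    for a, b in zip(ratios_to_first(base), ratios_to_first(scaled))
)
print(same_differences, same_ratios)
```

Since the values of these functions do not change with the shift or scale, neither do their distributions, which is the defining property of a pivot.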

Pivotal quantities are fundamental to the construction of test statistics, because they let the distribution of the statistic be free of unknown parameters; for example, Student's t-statistic applies to a normal distribution with unknown variance (and mean). They also provide one method of constructing confidence intervals, and the use of pivotal quantities improves the performance of the bootstrap. In the form of ancillary statistics, they can be used to construct frequentist prediction intervals (predictive confidence intervals).

One of the simplest pivotal quantities is the z-score; given a normal distribution with mean $\mu$ and variance $\sigma^2$, and an observation *x*, the z-score:

$$z = \frac{x - \mu}{\sigma}$$

has distribution $N(0, 1)$ – a normal distribution with mean 0 and variance 1. Similarly, since the *n*-sample sample mean has sampling distribution $N(\mu, \sigma^2/n)$, the z-score of the mean

$$z = \frac{\overline{X} - \mu}{\sigma/\sqrt{n}}$$

also has distribution $N(0, 1)$. Note that while these functions depend on the parameters – and thus one can only compute them if the parameters are known (they are not statistics) – the distribution is independent of the parameters.
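As a hedged illustration, the two z-scores can be written as short Python functions, and the second one can be inverted into a confidence interval for the mean when $\sigma$ is known. The function names and the sample values below are assumptions made for this sketch; only the Python standard library is used.

```python
import math
import statistics

def z_score(x, mu, sigma):
    """Pivot (x - mu) / sigma for one observation from N(mu, sigma^2)."""
    return (x - mu) / sigma

def z_score_of_mean(xs, mu, sigma):
    """Pivot (mean - mu) / (sigma / sqrt(n)) for an i.i.d. normal sample."""
    n = len(xs)
    return (statistics.fmean(xs) - mu) / (sigma / math.sqrt(n))

def mean_interval_known_sigma(xs, sigma, level=0.95):
    """Invert the pivot: mean -/+ q * sigma / sqrt(n), q a N(0, 1) quantile."""
    q = statistics.NormalDist().inv_cdf(0.5 + level / 2.0)
    half_width = q * sigma / math.sqrt(len(xs))
    center = statistics.fmean(xs)
    return center - half_width, center + half_width

sample = [4.0, 6.0, 8.0, 10.0]   # illustrative data, not from the source
print(z_score(7.0, 5.0, 2.0))
print(z_score_of_mean(sample, 5.0, 2.0))
print(mean_interval_known_sigma(sample, sigma=2.0))
```

The interval follows because the pivot lies between the lower and upper standard normal quantiles with the stated probability, whatever the true mean is; solving that inequality for the mean gives the endpoints.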

Given $n$ independent, identically distributed (i.i.d.) observations $X = (X_1, X_2, \ldots, X_n)$ from the normal distribution with unknown mean $\mu$ and variance $\sigma^2$, a pivotal quantity can be obtained from the function:

$$g(x, X) = \sqrt{n}\,\frac{x - \overline{X}}{s}$$

where

$$\overline{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$$

and

$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \overline{X})^2$$

are unbiased estimates of $\mu$ and $\sigma^2$, respectively. The function $g(x, X)$ is the Student's t-statistic for a new value $x$, to be drawn from the same population as the already observed set of values $X$.

Using $x = \mu$, the function $g(\mu, X)$ becomes a pivotal quantity, which is also distributed by the Student's t-distribution with $\nu = n - 1$ degrees of freedom. As required, even though $\mu$ appears as an argument to the function $g$, the distribution of $g(\mu, X)$ does not depend on the parameters $\mu$ or $\sigma$ of the normal probability distribution that governs the observations $X_1, \ldots, X_n$.
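The sketch below illustrates both points under stated assumptions. The names `t_pivot` and `mean_interval`, the sample data, and the caller-supplied critical value `t_crit` are all introduced here for illustration; the standard library has no Student's t quantile function, so the value 2.262 (the two-sided 95% critical value for 9 degrees of freedom from standard tables) is passed in by hand. The first part checks that the pivot takes the same value when the data and the mean are shifted and rescaled together, so its distribution cannot depend on the mean or variance. The second part inverts the pivot into a confidence interval for the mean.

```python
import math
import statistics

def t_pivot(mu, xs):
    """sqrt(n) * (mu - sample mean) / s, with s the (n - 1)-denominator std."""
    n = len(xs)
    return math.sqrt(n) * (mu - statistics.fmean(xs)) / statistics.stdev(xs)

def mean_interval(xs, t_crit):
    """Invert |pivot| <= t_crit: sample mean -/+ t_crit * s / sqrt(n)."""
    n = len(xs)
    center = statistics.fmean(xs)
    half_width = t_crit * statistics.stdev(xs) / math.sqrt(n)
    return center - half_width, center + half_width

# Illustrative data (n = 10), not from the source.
xs = [4.9, 5.1, 5.0, 4.8, 5.3, 5.2, 4.7, 5.0, 5.1, 4.9]
mu = 4.5

# Shift and rescale the data and the mean together (a = 10, b = 3).
a, b = 10.0, 3.0
transformed = [a + b * x for x in xs]
print(t_pivot(mu, xs), t_pivot(a + b * mu, transformed))  # same value twice

# 95% interval for the mean; 2.262 is a table value for 9 degrees of freedom.
print(mean_interval(xs, t_crit=2.262))
```

Where SciPy is available, the critical value could instead be obtained from `scipy.stats.t.ppf`; it is left as a parameter here to keep the sketch dependency-free.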

This can be used to compute a prediction interval for the next observation $X_{n+1}$; see Prediction interval: Normal distribution.
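A minimal sketch of such a prediction interval follows. It relies on the standard result, not spelled out in this article, that for a new observation independent of the sample the quantity $(X_{n+1} - \overline{X}) / (s\sqrt{1 + 1/n})$ follows Student's t-distribution with $n - 1$ degrees of freedom; the extra factor $\sqrt{1 + 1/n}$ reflects the variability of both the new observation and the sample mean. The function name `prediction_interval`, the data, and the caller-supplied `t_crit` (2.776 is the two-sided 95% table value for 4 degrees of freedom) are assumptions of this example.

```python
import math
import statistics

def prediction_interval(xs, t_crit):
    """Sample mean -/+ t_crit * s * sqrt(1 + 1/n) for the next observation."""
    n = len(xs)
    center = statistics.fmean(xs)
    half_width = t_crit * statistics.stdev(xs) * math.sqrt(1.0 + 1.0 / n)
    return center - half_width, center + half_width

observed = [1.0, 2.0, 3.0, 4.0, 5.0]   # illustrative data, n = 5
print(prediction_interval(observed, t_crit=2.776))
```

The interval is wider than a confidence interval for the mean built from the same sample, because it must cover a single future draw rather than the mean.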

In more complicated cases, it is impossible to construct exact pivots. However, having approximate pivots improves convergence to asymptotic normality.

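Suppose a sample of size $n$ of vectors $(X_i, Y_i)'$ is taken from a bivariate normal distribution with unknown correlation $\rho$.

An estimator of $\rho$ is the sample (Pearson, moment) correlation

$$r = \frac{\frac{1}{n-1}\sum_{i=1}^{n} (X_i - \overline{X})(Y_i - \overline{Y})}{s_X s_Y}$$

where $s_X^2, s_Y^2$ are sample variances of $X$ and $Y$. The sample statistic $r$ has an asymptotically normal distribution:

$$\sqrt{n}\,\frac{r - \rho}{1 - \rho^2} \Rightarrow N(0, 1).$$

However, a variance-stabilizing transformation

$$z = \tanh^{-1} r = \frac{1}{2}\ln\frac{1 + r}{1 - r}$$

known as Fisher's *z* transformation of the correlation coefficient allows creating the distribution of $z$ asymptotically independent of unknown parameters:

$$\sqrt{n}\,(z - \zeta) \Rightarrow N(0, 1)$$

where $\zeta = \tanh^{-1}\rho$ is the corresponding distribution parameter. For finite sample sizes $n$, the random variable $z$ will have distribution closer to normal than that of $r$. An even closer approximation to the standard normal distribution is obtained by using a better approximation for the exact variance: the usual form is

$$\operatorname{Var}(z) \approx \frac{1}{n - 3}.$$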
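The transformation can be used as an approximate pivot to build a confidence interval for the correlation. The sketch below is one way to do this under the usual variance approximation $1/(n-3)$: compute the sample correlation, apply the inverse hyperbolic tangent, form a normal interval on the transformed scale, and map the endpoints back with the hyperbolic tangent. The function names and the data are illustrative assumptions, and the interval is approximate, not exact.

```python
import math
import statistics

def pearson_r(xs, ys):
    """Sample correlation: sample covariance divided by s_X * s_Y."""
    n = len(xs)
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    return sxy / (statistics.stdev(xs) * statistics.stdev(ys))

def fisher_z(r):
    """Fisher's z transformation, atanh(r) = 0.5 * ln((1 + r) / (1 - r))."""
    return math.atanh(r)

def correlation_interval(r, n, level=0.95):
    """Approximate interval for rho using Var(z) ~ 1 / (n - 3); needs n > 3."""
    q = statistics.NormalDist().inv_cdf(0.5 + level / 2.0)
    se = 1.0 / math.sqrt(n - 3)
    z = fisher_z(r)
    # Back-transform the endpoints so they land inside (-1, 1).
    return math.tanh(z - q * se), math.tanh(z + q * se)

xs = [1.0, 2.0, 3.0, 4.0, 5.0]   # illustrative paired data
ys = [2.0, 1.0, 4.0, 3.0, 5.0]
r = pearson_r(xs, ys)
print(r, correlation_interval(r, n=len(xs)))
```

Because the standard error on the transformed scale does not involve the unknown correlation, the same recipe applies whatever its true value is, which is what makes the transformed statistic an approximate pivot.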

From the point of view of robust statistics, pivotal quantities are robust to changes in the parameters – indeed, independent of the parameters – but not in general robust to changes in the model, such as violations of the assumption of normality. This is fundamental to the robust critique of non-robust statistics, often derived from pivotal quantities: such statistics may be robust within the family, but are not robust outside it.


1. Shao, J. (2008). "Pivotal quantities". *Mathematical Statistics* (2nd ed.). New York: Springer. pp. 471–477. ISBN 978-0-387-21718-5.
2. DeGroot, Morris H.; Schervish, Mark J. (2011). *Probability and Statistics* (4th ed.). Pearson. p. 489. ISBN 978-0-321-70970-7.

This page is based on this Wikipedia article

Text is available under the CC BY-SA 4.0 license; additional terms may apply.

Images, videos and audio are available under their respective licenses.
