
In probability theory and statistics, a **scale parameter** is a special kind of numerical parameter of a parametric family of probability distributions. The larger the scale parameter, the more spread out the distribution.

If a family of probability distributions is such that there is a parameter *s* (and other parameters *θ*) for which the cumulative distribution function satisfies

$$F(x; s, \theta) = F(x/s; 1, \theta),$$

then *s* is called a **scale parameter**, since its value determines the "scale" or statistical dispersion of the probability distribution. If *s* is large, then the distribution will be more spread out; if *s* is small then it will be more concentrated.
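As a minimal sketch of this definition, the snippet below uses the exponential distribution, whose CDF is 1 − exp(−x/s) for x ≥ 0, and checks the relation F(x; s) = F(x/s; 1) at a few points. The function names (`exp_cdf`, `exp_quantile`) are illustrative and not taken from any library.

```python
import math

def exp_cdf(x: float, s: float = 1.0) -> float:
    """CDF of an exponential distribution with scale s (mean s)."""
    if x < 0:
        return 0.0
    return 1.0 - math.exp(-x / s)

def exp_quantile(p: float, s: float = 1.0) -> float:
    """Quantile function (inverse CDF) of the same distribution."""
    return -s * math.log(1.0 - p)

# Check F(x; s) == F(x / s; 1) for several points and scales.
checks = [
    math.isclose(exp_cdf(x, s), exp_cdf(x / s, 1.0), rel_tol=1e-12)
    for x in (0.1, 1.0, 5.0)
    for s in (0.5, 2.0, 10.0)
]
print(all(checks))

# The interquartile range grows in proportion to s: larger s, more spread.
iqr_unit = exp_quantile(0.75) - exp_quantile(0.25)
iqr_wide = exp_quantile(0.75, s=10.0) - exp_quantile(0.25, s=10.0)
print(iqr_wide / iqr_unit)
```

The second part illustrates the informal reading of "scale": every quantile, and hence every quantile-based measure of spread, is multiplied by *s*.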

If the probability density exists for all values of the complete parameter set, then the density (as a function of the scale parameter only) satisfies

$$f_s(x) = f(x/s)/s,$$

where *f* is the density of a standardized version of the density, i.e. $f(x) \equiv f_{s=1}(x)$.

An estimator of a scale parameter is called an **estimator of scale.**

In the case where a parametrized family has a location parameter, a slightly different definition is often used as follows. If we denote the location parameter by $m$ and the scale parameter by $s$, then we require that $F(x; s, m, \theta) = F((x - m)/s; 1, 0, \theta)$, where $F(x; s, m, \theta)$ is the CDF for the parametrized family.^{ [1] } This modification is necessary in order for the standard deviation of a non-central Gaussian to be a scale parameter, since otherwise the mean would change when we rescale $x$. However, this alternative definition is not consistently used.^{ [2] }
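The location-scale form of the definition, F(x; s, m) = F((x − m)/s; 1, 0), can be checked numerically for the normal distribution. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper names and the chosen values of `m` and `s` are assumptions made for illustration.

```python
from statistics import NormalDist

def location_scale_cdf(x: float, m: float, s: float) -> float:
    """CDF of a normal distribution with location m and scale s."""
    return NormalDist(mu=m, sigma=s).cdf(x)

def standardized_cdf(x: float, m: float, s: float) -> float:
    """Standard normal CDF evaluated at the shifted and rescaled point."""
    return NormalDist(mu=0.0, sigma=1.0).cdf((x - m) / s)

m, s = 3.0, 2.0
points = (-1.0, 0.0, 3.0, 4.5, 10.0)

# Largest discrepancy between the two sides of the defining relation.
max_gap = max(
    abs(location_scale_cdf(x, m, s) - standardized_cdf(x, m, s))
    for x in points
)
print(max_gap < 1e-12)
```

Shifting by the location before dividing by the scale is what keeps the mean fixed at *m* while only the spread changes with *s*.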

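We can write $f_s$ in terms of $g(x) = x/s$, as follows:

$$f_s(x) = f\left(\frac{x}{s}\right) \cdot \frac{1}{s} = f(g(x))\, g'(x).$$

Because *f* is a probability density function, it integrates to unity:

$$1 = \int_{-\infty}^{\infty} f(x)\, dx = \int_{g(-\infty)}^{g(\infty)} f(x)\, dx.$$

By the substitution rule of integral calculus, we then have

$$1 = \int_{-\infty}^{\infty} f(g(x))\, g'(x)\, dx = \int_{-\infty}^{\infty} f_s(x)\, dx.$$

So $f_s$ is also properly normalized.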
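The normalization argument can also be checked numerically. The following sketch takes the standard normal density as the standardized density *f* (an assumption made for the example), forms the rescaled density f(x/s)/s, and integrates it with a hand-written trapezoid rule over a range wide enough that the tails are negligible.

```python
import math

def std_density(x: float) -> float:
    """Standard normal density, playing the role of f."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def scaled_density(x: float, s: float) -> float:
    """Rescaled density f_s(x) = f(x / s) / s."""
    return std_density(x / s) / s

def trapezoid(func, lo: float, hi: float, n: int = 20000) -> float:
    """Composite trapezoid rule with n equal sub-intervals."""
    h = (hi - lo) / n
    total = 0.5 * (func(lo) + func(hi))
    for i in range(1, n):
        total += func(lo + i * h)
    return total * h

s = 3.0
# Integrate over +/- 12 scale units; the mass outside is negligible.
area = trapezoid(lambda x: scaled_density(x, s), -12.0 * s, 12.0 * s)
print(round(area, 6))
```

The factor 1/*s* is what compensates for stretching the horizontal axis by *s*; dropping it would give an area of *s* rather than 1.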

Some families of distributions use a **rate parameter** (or "**inverse scale parameter**"), which is simply the reciprocal of the *scale parameter*. So for example the exponential distribution with scale parameter β and probability density

$$f(x; \beta) = \frac{1}{\beta} e^{-x/\beta}, \quad x \ge 0$$

could equivalently be written with rate parameter λ as

$$f(x; \lambda) = \lambda e^{-\lambda x}, \quad x \ge 0.$$
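A short sketch of the two parameterizations, assuming the exponential densities exp(−x/β)/β (scale form) and λ exp(−λx) (rate form) with λ = 1/β. The function names are illustrative.

```python
import math

def exp_pdf_scale(x: float, beta: float) -> float:
    """Exponential density written with a scale parameter beta."""
    return math.exp(-x / beta) / beta if x >= 0 else 0.0

def exp_pdf_rate(x: float, lam: float) -> float:
    """Exponential density written with a rate parameter lam."""
    return lam * math.exp(-lam * x) if x >= 0 else 0.0

beta = 4.0
lam = 1.0 / beta  # the rate is the reciprocal of the scale

same = all(
    math.isclose(exp_pdf_scale(x, beta), exp_pdf_rate(x, lam))
    for x in (0.0, 0.5, 2.0, 10.0)
)
print(same)
```

Libraries differ in which convention they expose: Python's `random.expovariate` takes the rate, while NumPy's exponential sampler takes the scale, so it is worth checking which one a given function expects.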

- The uniform distribution can be parameterized with a location parameter of $(a+b)/2$ and a scale parameter $|b-a|$.
- The normal distribution has two parameters: a location parameter $\mu$ and a scale parameter $\sigma$. In practice the normal distribution is often parameterized in terms of the *squared* scale $\sigma^2$, which corresponds to the variance of the distribution.
- The gamma distribution is usually parameterized in terms of a scale parameter $\theta$ or its inverse.
- Special cases of distributions where the scale parameter equals unity may be called "standard" under certain conditions. For example, if the location parameter equals zero and the scale parameter equals one, the normal distribution is known as the *standard* normal distribution, and the Cauchy distribution as the *standard* Cauchy distribution.
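To make the uniform example concrete, the sketch below assumes a uniform distribution on [a, b] with location (a + b)/2 and scale |b − a|, and maps a "standard" uniform variable on [−1/2, 1/2] onto it. The helper name `uniform_from_standard` and the endpoints are hypothetical choices for illustration.

```python
import random

def uniform_from_standard(u: float, location: float, scale: float) -> float:
    """Map u from the standard uniform on [-1/2, 1/2] to location + scale * u."""
    return location + scale * u

a, b = 2.0, 10.0
location = (a + b) / 2.0   # midpoint of the interval
scale = abs(b - a)         # width of the interval

# The extreme values of the standard variable land on the endpoints a and b.
endpoints = (
    uniform_from_standard(-0.5, location, scale),
    uniform_from_standard(0.5, location, scale),
)
print(endpoints)

# Sampling: draw standard uniforms and rescale them.
rng = random.Random(0)
draws = [uniform_from_standard(rng.random() - 0.5, location, scale) for _ in range(1000)]
print(min(draws) >= a and max(draws) <= b)
```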

A statistic can be used to estimate a scale parameter so long as it:

- Is location-invariant,
- Scales linearly with the scale parameter, and
- Converges as the sample size grows.
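The first two conditions can be illustrated with the sample standard deviation, used here only as one example of a dispersion statistic. The data values, the shift, and the factor are arbitrary choices for the sketch; the third condition (convergence) concerns large samples and is not shown.

```python
import math
import statistics

data = [2.0, 3.5, 5.0, 7.5, 11.0]
shift, factor = 100.0, 3.0

spread = statistics.stdev(data)

# Location invariance: adding a constant to every observation changes nothing.
spread_shifted = statistics.stdev([x + shift for x in data])

# Linear scaling: multiplying every observation by a positive constant
# multiplies the statistic by the same constant.
spread_scaled = statistics.stdev([factor * x for x in data])

print(math.isclose(spread, spread_shifted))
print(math.isclose(spread_scaled, factor * spread))
```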

Various measures of statistical dispersion satisfy these. In order to make the statistic a consistent estimator for the scale parameter, one must in general multiply the statistic by a constant scale factor. This scale factor is the theoretical value of the ratio of the required scale parameter to the asymptotic value of the statistic. Note that the scale factor depends on the distribution in question.

For instance, in order to use the median absolute deviation (MAD) to estimate the standard deviation of the normal distribution, one must multiply it by the factor

$$1/\Phi^{-1}(3/4) \approx 1.4826,$$

where Φ^{−1} is the quantile function (inverse of the cumulative distribution function) for the standard normal distribution. (See MAD for details.) That is, the MAD is not a consistent estimator for the standard deviation of a normal distribution, but 1.4826... × MAD is a consistent estimator. Similarly, the average absolute deviation needs to be multiplied by approximately 1.2533 to be a consistent estimator for standard deviation. Different factors would be required to estimate the standard deviation if the population did not follow a normal distribution.
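The two correction factors can be computed and tried out on simulated data. The sketch below assumes normally distributed data, takes the MAD factor as 1/Φ^{−1}(3/4) and the average-absolute-deviation factor as √(π/2) ≈ 1.2533, and uses only the Python standard library. The sample size, seed, mean, and σ are arbitrary choices for the illustration.

```python
import math
import random
from statistics import NormalDist, median

# Scale factors for normally distributed data.
mad_factor = 1.0 / NormalDist().inv_cdf(0.75)   # about 1.4826
aad_factor = math.sqrt(math.pi / 2.0)           # about 1.2533

def mad(sample):
    """Median absolute deviation from the sample median."""
    center = median(sample)
    return median([abs(x - center) for x in sample])

def average_absolute_deviation(sample):
    """Mean absolute deviation from the sample mean."""
    mean = sum(sample) / len(sample)
    return sum(abs(x - mean) for x in sample) / len(sample)

rng = random.Random(0)          # fixed seed so the run is repeatable
sigma = 2.0
sample = [rng.gauss(5.0, sigma) for _ in range(20000)]

sigma_from_mad = mad_factor * mad(sample)
sigma_from_aad = aad_factor * average_absolute_deviation(sample)

print(round(mad_factor, 4), round(aad_factor, 4))
print(sigma_from_mad, sigma_from_aad)
```

Both rescaled statistics should land close to the true σ for a sample this large, whereas the raw MAD alone would settle near σ/1.4826. For non-normal data these particular factors would not apply.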


- [1] Prokhorov, A.V. (7 February 2011). "Scale parameter". *Encyclopedia of Mathematics*. Springer. Retrieved 7 February 2019.
- [2] Koski, Timo. "Scale parameter". *KTH Royal Institute of Technology*. Retrieved 7 February 2019.

- Mood, A. M.; Graybill, F. A.; Boes, D. C. (1974). "VII.6.2 *Scale invariance*". *Introduction to the theory of statistics* (3rd ed.). New York: McGraw-Hill.

This page is based on this Wikipedia article

Text is available under the CC BY-SA 4.0 license; additional terms may apply.

Images, videos and audio are available under their respective licenses.
