Geometric standard deviation

In probability theory and statistics, the geometric standard deviation (GSD) describes how spread out a set of numbers is when their preferred average is the geometric mean. For such data, it may be preferred to the more usual standard deviation. Unlike the usual arithmetic standard deviation, the geometric standard deviation is a multiplicative factor, and is thus dimensionless, rather than having the same dimension as the input values. For this reason, the geometric standard deviation may be more appropriately called the geometric SD factor. [1] [2] When the geometric SD factor is used in conjunction with the geometric mean, the result should be described as "the range from (the geometric mean divided by the geometric SD factor) to (the geometric mean multiplied by the geometric SD factor)"; one cannot add the geometric SD factor to, or subtract it from, the geometric mean. [3]
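
For illustration (the numbers here are chosen for concreteness and are not taken from the cited sources): if a data set has geometric mean 10 and geometric SD factor 2, the corresponding one-GSD range runs from 10 / 2 = 5 to 10 × 2 = 20, not from 10 - 2 = 8 to 10 + 2 = 12.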

Definition

If the geometric mean of a set of numbers $\{A_1, A_2, \dots, A_n\}$ is denoted as $\mu_g$, then the geometric standard deviation is

$$\sigma_g = \exp\left( \sqrt{ \frac{ \sum_{i=1}^n \left( \ln \frac{A_i}{\mu_g} \right)^2 }{ n } } \right).$$
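
A minimal computational sketch of this definition is given below; the data values are made-up examples, and the population (ddof = 0) convention is assumed to match the formula above.

```python
import numpy as np

values = np.array([1.0, 2.0, 4.0, 8.0, 16.0])   # made-up example data; all values must be positive

geo_mean = np.exp(np.mean(np.log(values)))       # geometric mean mu_g (4.0 for this data)

# Geometric SD factor: exponential of the root-mean-square of ln(A_i / mu_g)
gsd = np.exp(np.sqrt(np.mean(np.log(values / geo_mean) ** 2)))

print(geo_mean, gsd)                             # 4.0 and roughly 2.67
```

SciPy also provides scipy.stats.gstd for this quantity; note that it uses the sample (ddof = 1) convention by default, so its result differs slightly from the population formula used here.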

Derivation

If the geometric mean is

$$\mu_g = \sqrt[n]{A_1 A_2 \cdots A_n},$$

then taking the natural logarithm of both sides results in

$$\ln \mu_g = \frac{1}{n} \ln\left( A_1 A_2 \cdots A_n \right).$$

The logarithm of a product is a sum of logarithms (assuming $A_i$ is positive for all $i$), so

$$\ln \mu_g = \frac{1}{n} \left( \ln A_1 + \ln A_2 + \cdots + \ln A_n \right).$$

It can now be seen that $\ln \mu_g$ is the arithmetic mean of the set $\{\ln A_1, \ln A_2, \dots, \ln A_n\}$, therefore the arithmetic standard deviation of this same set should be

$$\ln \sigma_g = \sqrt{ \frac{ \sum_{i=1}^n \left( \ln A_i - \ln \mu_g \right)^2 }{ n } }.$$

This simplifies to

$$\sigma_g = \exp\left( \sqrt{ \frac{ \sum_{i=1}^n \left( \ln \frac{A_i}{\mu_g} \right)^2 }{ n } } \right).$$
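
The derivation can be checked numerically: exponentiating the (population) standard deviation of the logs gives the same value as the definition. A small sketch, again using made-up data:

```python
import numpy as np

values = np.array([1.0, 2.0, 4.0, 8.0, 16.0])    # made-up example data
log_vals = np.log(values)

gsd_via_logs = np.exp(np.std(log_vals))          # exp of the arithmetic SD of the logs (ddof = 0)

geo_mean = np.exp(log_vals.mean())
gsd_definition = np.exp(np.sqrt(np.mean(np.log(values / geo_mean) ** 2)))

assert np.isclose(gsd_via_logs, gsd_definition)  # both are roughly 2.67
```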

Geometric standard score

The geometric version of the standard score is

$$z = \frac{\ln x - \ln \mu_g}{\ln \sigma_g} = \frac{\ln\left( x / \mu_g \right)}{\ln \sigma_g} = \log_{\sigma_g} \left( x / \mu_g \right).$$

If the geometric mean, geometric standard deviation, and z-score of a datum are known, then the raw score can be reconstructed by

$$x = \mu_g \, \sigma_g^{\,z}.$$
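
A small sketch of both directions, with the geometric mean and geometric SD factor chosen arbitrarily for the example:

```python
import math

mu_g, sigma_g = 4.0, 2.0        # assumed geometric mean and geometric SD factor
x = 16.0                        # a raw datum

# Geometric standard score: how many multiplicative factors of sigma_g separate x from mu_g
z = math.log(x / mu_g) / math.log(sigma_g)       # 2.0 here

# Reconstruct the raw score from (mu_g, sigma_g, z)
x_back = mu_g * sigma_g ** z
assert math.isclose(x_back, x)
```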

Relationship to log-normal distribution

The geometric standard deviation is used as a measure of log-normal dispersion analogously to the geometric mean. [3] As the log-transform of a log-normal distribution results in a normal distribution, we see that the geometric standard deviation is the exponentiated value of the standard deviation of the log-transformed values, i.e. $\sigma_g = \exp\left( \operatorname{stddev}\left( \ln X \right) \right)$.

As such, the geometric mean and the geometric standard deviation of a sample of data from a log-normally distributed population may be used to find the bounds of confidence intervals analogously to the way the arithmetic mean and standard deviation are used to bound confidence intervals for a normal distribution. See discussion in log-normal distribution for details.
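
As a hedged sketch of that multiplicative analogy (the sample below is simulated, and the usual two-sided normal quantile 1.96 is assumed for roughly 95% coverage of the population):

```python
import numpy as np

rng = np.random.default_rng(0)
sample = rng.lognormal(mean=1.0, sigma=0.5, size=10_000)   # simulated log-normal data

log_sample = np.log(sample)
mu_g = np.exp(log_sample.mean())       # geometric mean
sigma_g = np.exp(log_sample.std())     # geometric SD factor

# Multiplicative analogue of mean +/- 1.96 * SD: divide and multiply by sigma_g ** 1.96
lower, upper = mu_g / sigma_g ** 1.96, mu_g * sigma_g ** 1.96
coverage = np.mean((sample >= lower) & (sample <= upper))
print(lower, upper, coverage)          # coverage comes out close to 0.95
```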

References

  1. GraphPad Guide
  2. Kirkwood, T.B.L. (1993). "Geometric standard deviation - reply to Bohidar". Drug Dev. Ind. Pharmacy. 19 (3): 395–6.
  3. Kirkwood, T.B.L. (1979). "Geometric means and measures of dispersion". Biometrics. 35: 908–9. JSTOR 2530139.