Contraharmonic mean

In mathematics, a contraharmonic mean is a function complementary to the harmonic mean. The contraharmonic mean is a special case of the Lehmer mean, $L_p$, where p = 2.

Definition

The contraharmonic mean of a set of positive real numbers [1] is defined as the arithmetic mean of the squares of the numbers divided by the arithmetic mean of the numbers:

$$C(x_1, x_2, \ldots, x_n) = \frac{\tfrac{1}{n}\left(x_1^2 + x_2^2 + \cdots + x_n^2\right)}{\tfrac{1}{n}\left(x_1 + x_2 + \cdots + x_n\right)} = \frac{x_1^2 + x_2^2 + \cdots + x_n^2}{x_1 + x_2 + \cdots + x_n}$$
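As a quick numerical illustration of the definition, here is a minimal sketch in Python (the helper name `contraharmonic_mean` is ours, not from the source):

```python
def contraharmonic_mean(values):
    """Contraharmonic mean: sum of the squares divided by the sum of the values."""
    return sum(v * v for v in values) / sum(values)

# For (1, 2, 6): the mean of the squares is 41/3 and the mean is 3,
# so the contraharmonic mean is (41/3) / 3 = 41/9 ≈ 4.56.
print(contraharmonic_mean([1, 2, 6]))  # 4.555...
```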

Two-variable formulae

From the formulas for the arithmetic mean and harmonic mean of two variables, $A(a, b) = \frac{a + b}{2}$ and $H(a, b) = \frac{2ab}{a + b}$, we have:

$$C(a, b) = \frac{a^2 + b^2}{a + b} = 2A(a, b) - H(a, b)$$

Notice that for two variables the average of the harmonic and contraharmonic means is exactly equal to the arithmetic mean:

A(H(a, b), C(a, b)) = A(a, b)

As a approaches 0, H(a, b) also approaches 0: the harmonic mean is very sensitive to low values. The contraharmonic mean, by contrast, is sensitive to larger values, so as a approaches 0, C(a, b) approaches b (and their average remains A(a, b)).
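A small numerical sketch of these two-variable relations (the one-line helpers A, H and C are illustrative, not from the source):

```python
def A(a, b): return (a + b) / 2              # arithmetic mean
def H(a, b): return 2 * a * b / (a + b)      # harmonic mean
def C(a, b): return (a**2 + b**2) / (a + b)  # contraharmonic mean

a, b = 0.001, 10.0
print(H(a, b))                       # ≈ 0.002, pulled toward the small value
print(C(a, b))                       # ≈ 9.999, pulled toward the large value
print(A(H(a, b), C(a, b)), A(a, b))  # both ≈ 5.0005
```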

There are two other notable relationships between 2-variable means. First, the geometric mean of the arithmetic and harmonic means is equal to the geometric mean of the two values:

$$G(A(a, b), H(a, b)) = \sqrt{\frac{a + b}{2} \cdot \frac{2ab}{a + b}} = \sqrt{ab} = G(a, b)$$

The second relationship is that the geometric mean of the arithmetic and contraharmonic means is the root mean square of the two values:

$$G(A(a, b), C(a, b)) = \sqrt{\frac{a + b}{2} \cdot \frac{a^2 + b^2}{a + b}} = \sqrt{\frac{a^2 + b^2}{2}} = R(a, b)$$
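Both identities can be checked numerically; a minimal sketch (the values 3 and 7 are arbitrary):

```python
import math

def A(a, b): return (a + b) / 2
def H(a, b): return 2 * a * b / (a + b)
def C(a, b): return (a**2 + b**2) / (a + b)

a, b = 3.0, 7.0
# Geometric mean of A and H equals the geometric mean of a and b:
print(math.sqrt(A(a, b) * H(a, b)), math.sqrt(a * b))              # both ≈ 4.583
# Geometric mean of A and C equals the root mean square of a and b:
print(math.sqrt(A(a, b) * C(a, b)), math.sqrt((a**2 + b**2) / 2))  # both ≈ 5.385
```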

The contraharmonic mean of two variables can be constructed geometrically using a trapezoid. [2]

Additional constructions

The contraharmonic mean can be constructed on a circle similar to the way the Pythagorean means of two variables are constructed. [3] The contraharmonic mean is the remainder of the diameter on which the harmonic mean lies. [4]

History

The contraharmonic mean was discovered by the Greek mathematician Eudoxus in the 4th century BCE. [5]

Properties

It is easy to show that this satisfies the characteristic properties of a mean of some list of values $\mathbf{x}$:

$$\min(\mathbf{x}) \leq C(\mathbf{x}) \leq \max(\mathbf{x})$$

$$C(t\,\mathbf{x}) = t\,C(\mathbf{x}) \quad \text{for } t > 0$$

The first property implies the fixed point property, that for all k > 0,

C(k, k, ..., k) = k
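A brief sketch checking these properties numerically (the helper C and the test values are ours):

```python
def C(values):
    return sum(v * v for v in values) / sum(values)

x = [2.0, 3.0, 10.0]
assert min(x) <= C(x) <= max(x)                       # bounded by the extremes
assert abs(C([4 * v for v in x]) - 4 * C(x)) < 1e-12  # homogeneous of degree 1
print(C([7.0, 7.0, 7.0]))                             # 7.0: the fixed point property
```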

The contraharmonic mean is higher in value than the arithmetic mean and also higher than the root mean square:

$$\min(\mathbf{x}) \leq H(\mathbf{x}) \leq G(\mathbf{x}) \leq L(\mathbf{x}) \leq A(\mathbf{x}) \leq R(\mathbf{x}) \leq C(\mathbf{x}) \leq \max(\mathbf{x})$$

where $\mathbf{x}$ is a list of values, H is the harmonic mean, G is the geometric mean, L is the logarithmic mean, A is the arithmetic mean, R is the root mean square and C is the contraharmonic mean. Unless all values of $\mathbf{x}$ are the same, the ≤ signs above can be replaced by <.
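Most of this chain (the logarithmic mean is omitted here for brevity) can be verified on a sample list; a minimal sketch with arbitrary values:

```python
import math

x = [2.0, 3.0, 10.0]
n = len(x)

H = n / sum(1 / v for v in x)             # harmonic mean
G = math.prod(x) ** (1 / n)               # geometric mean
A = sum(x) / n                            # arithmetic mean
R = math.sqrt(sum(v * v for v in x) / n)  # root mean square
C = sum(v * v for v in x) / sum(x)        # contraharmonic mean

print(H, G, A, R, C)  # ≈ 3.21, 3.91, 5.0, 6.14, 7.53
assert min(x) <= H <= G <= A <= R <= C <= max(x)
```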

The name contraharmonic may be due to the fact that when taking the mean of only two variables, the contraharmonic mean is as high above the arithmetic mean as the arithmetic mean is above the harmonic mean (i.e., the arithmetic mean of the two variables is equal to the arithmetic mean of their harmonic and contraharmonic means).

Relationship to arithmetic mean and variance

The contraharmonic mean of a random variable is equal to the sum of the arithmetic mean and the variance divided by the arithmetic mean: [6]

$$C = \bar{x} + \frac{\sigma^2}{\bar{x}}$$

Since the variance is always ≥ 0, the contraharmonic mean is always greater than or equal to the arithmetic mean.
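A quick numeric check of this identity; note that it holds with the population (divide-by-n) form of the variance (the sample values are arbitrary):

```python
x = [2.0, 3.0, 10.0]
n = len(x)

mean = sum(x) / n
var = sum((v - mean) ** 2 for v in x) / n  # population (biased) variance
C = sum(v * v for v in x) / sum(x)         # contraharmonic mean

print(C, mean + var / mean)  # both ≈ 7.533
```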

The ratio of the variance and the mean was proposed as a test statistic by Clapham. [7] This statistic is the contraharmonic mean less one.

Other relationships

Any integer contraharmonic mean of two different positive integers is the hypotenuse of a Pythagorean triple, while any hypotenuse of a Pythagorean triple is a contraharmonic mean of two different positive integers. [8]
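For example, C(3, 6) = (9 + 36)/9 = 5, the hypotenuse of the (3, 4, 5) triple. A small search sketch (the bound of 50 is an arbitrary choice for illustration):

```python
def C(a, b):
    return (a * a + b * b) / (a + b)

# Integer contraharmonic means of pairs of distinct positive integers up to 50:
hyps = sorted({int(C(a, b)) for a in range(1, 51) for b in range(a + 1, 51)
               if C(a, b).is_integer()})
print(hyps[:8])  # [5, 10, 13, 15, 17, 20, 25, 26] – each the hypotenuse of a Pythagorean triple
```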

It is also related to Katz's statistic [9]

$$J_n = \sqrt{\frac{n}{2}}\,\frac{s^2 - m}{m}$$

where m is the mean, s² the variance and n is the sample size.

$J_n$ is asymptotically normally distributed with a mean of zero and a variance of 1.
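A short sketch of computing $J_n$ from a sample (arbitrary data; the population form of the variance is used for s²):

```python
import statistics

x = [2, 5, 3, 8, 4, 6, 1, 7, 5, 4]
n = len(x)
m = statistics.fmean(x)
s2 = statistics.pvariance(x)  # population variance

J = (n / 2) ** 0.5 * (s2 - m) / m
print(J)  # ≈ -0.12 for this small sample
```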

Uses in statistics

The problem of size-biased sampling was discussed by Cox in 1969 in the context of sampling fibres. The expectation of a size-biased sample is equal to its contraharmonic mean, [10] and the contraharmonic mean is also used to estimate bias fields in multiplicative models, rather than the arithmetic mean as used in additive models. [11]

The contraharmonic mean can be used to average the intensity values of neighbouring pixels in image processing, so as to reduce noise in images and make them clearer to the eye. [12]
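A minimal sketch of such a contraharmonic mean filter over a square neighbourhood (pure Python, no image library; the function name and the 3×3 window are our illustrative choices):

```python
def contraharmonic_filter(img, size=3):
    """Replace each pixel with the contraharmonic mean of its size-by-size neighbourhood."""
    h, w = len(img), len(img[0])
    r = size // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            window = [img[y][x]
                      for y in range(max(0, i - r), min(h, i + r + 1))
                      for x in range(max(0, j - r), min(w, j + r + 1))]
            num = sum(v * v for v in window)
            den = sum(window)
            out[i][j] = num / den if den else 0.0
    return out

# A bright 4x4 patch with one dark ("pepper") pixel: the filter pulls it back up.
noisy = [[200, 200, 200, 200],
         [200,   0, 200, 200],
         [200, 200, 200, 200],
         [200, 200, 200, 200]]
print(round(contraharmonic_filter(noisy)[1][1]))  # 200: the pepper pixel is suppressed
```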

The probability of a fibre being sampled is proportional to its length. Because of this the usual sample mean (arithmetic mean) is a biased estimator of the true mean. To see this consider

$$g(x) = \frac{x f(x)}{m}$$

where f(x) is the true population distribution, g(x) is the length-weighted distribution and m is the sample mean. Taking the usual expectation of the mean here gives the contraharmonic mean rather than the usual (arithmetic) mean of the sample. [13] This problem can be overcome by taking instead the expectation of the harmonic mean (1/x). The expectation of 1/x is

$$\operatorname{E}\!\left[\frac{1}{x}\right] = \frac{1}{m}$$

and its variance is

$$\operatorname{Var}\!\left(\frac{1}{x}\right) = \frac{m \operatorname{E}\!\left[\frac{1}{x}\right] - 1}{n m^2}$$

where E is the expectation operator. Asymptotically E[1/x] is distributed normally.
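A simulation sketch of this effect, assuming an exponential fibre-length population with true mean 2 and implementing length-biased draws by weighted resampling (all names and parameters here are illustrative):

```python
import random

random.seed(1)
population = [random.expovariate(0.5) for _ in range(100_000)]  # true mean ≈ 2

# Length-biased sample: selection probability proportional to length.
sample = random.choices(population, weights=population, k=50_000)

true_mean = sum(population) / len(population)
biased_mean = sum(sample) / len(sample)
contraharmonic = sum(v * v for v in population) / sum(population)
harmonic_correction = 1 / (sum(1 / v for v in sample) / len(sample))

print(true_mean, biased_mean, contraharmonic, harmonic_correction)
# biased_mean ≈ contraharmonic mean of the population (≈ 4), while the
# harmonic correction 1/E[1/x] recovers the true mean (≈ 2).
```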

The asymptotic efficiency of length-biased sampling compared to random sampling depends on the underlying distribution. If f(x) is log-normal the efficiency is 1, while if the population is gamma distributed with index b, the efficiency is b/(b − 1). This distribution has been used in modelling consumer behaviour [14] as well as in quality sampling.

It has also been used alongside the exponential distribution in transport planning, in the form of its inverse. [15]


References

  1. See "Means of Complex Numbers" (PDF). Texas College Mathematics Journal. 1 (1). January 1, 2005. Archived from the original (PDF) on September 9, 2006.
  2. Umberger, Shannon. "Construction of the Contraharmonic Mean in a Trapezoid". University of Georgia.
  3. Nelsen, Roger B. Proofs Without Words: Exercises in Visual Thinking. p. 56. ISBN 0-88385-700-6.
  4. Slaev, Valery A.; Chunovkina, Anna G.; Mironovsky, Leonid A. (2019). Metrology and Theory of Measurement. De Gruyter. p. 217. ISBN 9783110652505.
  5. Antoine, C. (1998). Les Moyennes. Paris: Presses Universitaires de France.
  6. Kingley, Michael C.S. (1989). "The distribution of hauled out ringed seals an interpretation of Taylor's law". Oecologia. 79 (79): 106–110. doi:10.1007/BF00378246. PMID   28312819.
  7. Clapham, Arthur Roy (1936). "Overdispersion in grassland communities and the use of statistical methods in plant ecology". The Journal of Ecology (14): 232. doi:10.2307/2256277. JSTOR   2256277.
  8. Pahikkala, Jussi (2010). "On contraharmonic mean and Pythagorean triples". Elemente der Mathematik. 65 (2): 62–67. doi:10.4171/em/141.
  9. Katz, L. (1965). United treatment of a broad class of discrete probability distributions. Proceedings of the International Symposium on Discrete Distributions. Montreal.
  10. Zelen, Marvin (1972). Length-biased sampling and biomedical problems. Biometric Society Meeting. Dallas, Texas.
  11. Banerjee, Abhirup; Maji, Pradipta (2013). Rough Sets for Bias Field Correction in MR Images Using Contraharmonic Mean and Quantitative Index. IEEE Transactions on Medical Imaging.
  12. Mitra, Sabry (October 2021). "Contraharmonic Mean Filter". Kajian Ilmiah Informatika Dan Komputer. 2 (2): 75–79.
  13. Sudman, Seymour (1980). Quota sampling techniques and weighting procedures to correct for frequency bias.
  14. Keillor, Bruce D.; D'Amico, Michael; Horton, Veronica (2001). "Global Consumer Tendencies". Psychology and Marketing. 18 (1): 1–19. doi:10.1002/1520-6793(200101)18:1<1::AID-MAR1>3.0.CO;2-U.
  15. Amreen, Mohammed; Venkateswarlu, Bandi (2024). "A New Way for Solving Transportation Issues Based on the Exponential Distribution and the Contraharmonic Mean". Journal of Applied Mathematics and Informatics. 42 (3): 647–661.