Hermite distribution

Hermite
Probability mass function: the horizontal axis is the index k, the number of occurrences. The function is only defined at integer values of k; the connecting lines are only guides for the eye.
Cumulative distribution function: the horizontal axis is the index k, the number of occurrences. The CDF is discontinuous at the integers of k and flat everywhere else, because a Hermite-distributed variable takes on only integer values.
Notation: Herm(a1, a2)
Parameters: a1 ≥ 0, a2 ≥ 0
Support: x ∈ { 0, 1, 2, ... }
PMF: e^(−(a1+a2)) Σ_{j=0}^{⌊x/2⌋} a1^(x−2j) a2^j / ((x−2j)! j!)
CDF: Σ_{k=0}^{⌊x⌋} P(Y = k)
Mean: a1 + 2a2
Variance: a1 + 4a2
Skewness: (a1 + 8a2) / (a1 + 4a2)^(3/2)
Ex. kurtosis: (a1 + 16a2) / (a1 + 4a2)²
MGF: exp(a1(e^t − 1) + a2(e^(2t) − 1))
CF: exp(a1(e^(it) − 1) + a2(e^(2it) − 1))
PGF: exp(a1(s − 1) + a2(s² − 1))
In probability theory and statistics, the Hermite distribution, named after Charles Hermite, is a discrete probability distribution used to model count data with more than one parameter. The distribution is flexible in that it can accommodate moderate over-dispersion in the data.

The authors Kemp and Kemp [1] called it the "Hermite distribution" because its probability function and moment generating function can be expressed in terms of the coefficients of (modified) Hermite polynomials.

History

The distribution first appeared in the paper Applications of Mathematics to Medical Problems, [2] by Anderson Gray McKendrick in 1926. In this work the author explains several mathematical methods that can be applied to medical research. In one of these methods he considered the bivariate Poisson distribution and showed that the distribution of the sum of two correlated Poisson variables follows a distribution that would later become known as the Hermite distribution.

As a practical application, McKendrick considered the distribution of counts of bacteria in leucocytes. Using the method of moments he fitted the data with the Hermite distribution and found the model more satisfactory than fitting it with a Poisson distribution.

The distribution was formally introduced and published by C. D. Kemp and Adrienne W. Kemp in 1965 in their work Some Properties of 'Hermite' Distribution. The work focuses on the properties of this distribution, for instance a necessary condition on the parameters, their maximum likelihood estimators (MLEs), the analysis of the probability generating function (PGF), and how the PGF can be expressed in terms of the coefficients of (modified) Hermite polynomials. An example they used in this publication is the same distribution of counts of bacteria in leucocytes that McKendrick had studied, but Kemp and Kemp estimated the model using the maximum likelihood method.

The Hermite distribution is a special case of the discrete compound Poisson distribution with only two parameters. [3] [4]

The same authors published the paper An Alternative Derivation of the Hermite Distribution in 1966. [5] In this work they established that the Hermite distribution can be obtained formally by combining a Poisson distribution with a normal distribution.

In 1971, Y. C. Patel [6] did a comparative study of various estimation procedures for the Hermite distribution in his doctoral thesis. It included maximum likelihood, moment estimators, mean and zero frequency estimators and the method of even points.

In 1974, Gupta and Jain [7] carried out research on a generalized form of the Hermite distribution.

Definition

Probability mass function

Let X1 and X2 be two independent Poisson variables with parameters a1 and a2. The probability distribution of the random variable Y = X1 + 2X2 is the Hermite distribution with parameters a1 and a2, and its probability mass function is given by [8]

    P(Y = n) = e^(−(a1 + a2)) Σ_{j=0}^{⌊n/2⌋} a1^(n−2j) a2^j / ((n−2j)! j!),  n = 0, 1, 2, ...,

where ⌊n/2⌋ denotes the integer part of n/2.
The probability generating function of the probability mass function is [8]

    G(s) = E[s^Y] = exp(a1(s − 1) + a2(s² − 1)).
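The mass function above can be checked numerically: evaluating it directly should agree, term by term, with the defining convolution Y = X1 + 2X2 of two Poisson variables. A minimal sketch (function names are illustrative, not from the sources):

```python
import math

def hermite_pmf(n, a1, a2):
    """P(Y = n) for Y = X1 + 2*X2, with X1 ~ Poisson(a1), X2 ~ Poisson(a2)."""
    return math.exp(-(a1 + a2)) * sum(
        a1 ** (n - 2 * j) * a2 ** j / (math.factorial(n - 2 * j) * math.factorial(j))
        for j in range(n // 2 + 1))

def pmf_by_convolution(n, a1, a2):
    """Same probability computed directly as sum over j of P(X2 = j) * P(X1 = n - 2j)."""
    def poisson(k, lam):
        return math.exp(-lam) * lam ** k / math.factorial(k)
    return sum(poisson(j, a2) * poisson(n - 2 * j, a1) for j in range(n // 2 + 1))
```

Both routes agree for every n, and summing hermite_pmf over a wide range of n returns a total probability of (numerically) one.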

Notation

When a random variable Y = X1 + 2X2 follows a Hermite distribution, where X1 and X2 are two independent Poisson variables with parameters a1 and a2, we write

    Y ~ Herm(a1, a2).

Properties

Moment and cumulant generating functions

The moment generating function of a random variable X is defined as the expected value of e^(tX), as a function of the real parameter t. For a Hermite distribution with parameters a1 and a2, the moment generating function exists and is equal to

    M(t) = E[e^(tY)] = exp(a1(e^t − 1) + a2(e^(2t) − 1)).
The cumulant generating function is the logarithm of the moment generating function and is equal to [4]

    K(t) = log M(t) = a1(e^t − 1) + a2(e^(2t) − 1).

If we consider the coefficient of t^r/r! in the expansion of K(t) we obtain the r-th cumulant,

    κ_r = a1 + 2^r a2.
Hence the mean and the succeeding three moments about it are

Order	Moment	Cumulant
1	μ1 = a1 + 2a2	κ1 = a1 + 2a2
2	μ2 = a1 + 4a2	κ2 = a1 + 4a2
3	μ3 = a1 + 8a2	κ3 = a1 + 8a2
4	μ4 = a1 + 16a2 + 3(a1 + 4a2)²	κ4 = a1 + 16a2
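As a numerical sanity check, the mean and variance computed directly from the mass function should reproduce the first two cumulants, κ1 = a1 + 2a2 and κ2 = a1 + 4a2. A sketch with illustrative names, truncating the infinite support where the tail mass is negligible:

```python
import math

def hermite_pmf(n, a1, a2):
    # P(Y = n) for the Hermite distribution with parameters a1, a2.
    return math.exp(-(a1 + a2)) * sum(
        a1 ** (n - 2 * j) * a2 ** j / (math.factorial(n - 2 * j) * math.factorial(j))
        for j in range(n // 2 + 1))

def cumulant(r, a1, a2):
    # r-th cumulant of the Hermite distribution: kappa_r = a1 + 2**r * a2.
    return a1 + 2 ** r * a2

a1, a2 = 1.2, 0.7
support = range(100)                     # truncation; remaining tail mass is negligible
mean = sum(n * hermite_pmf(n, a1, a2) for n in support)
var = sum((n - mean) ** 2 * hermite_pmf(n, a1, a2) for n in support)
# mean matches kappa_1 = 2.6 and var matches kappa_2 = 4.0 for these parameters
```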

Skewness

The skewness is the third central moment divided by the 3/2 power of the variance, and for the Hermite distribution it is [4]

    γ1 = κ3 / κ2^(3/2) = (a1 + 8a2) / (a1 + 4a2)^(3/2).

Kurtosis

The kurtosis is the fourth central moment divided by the square of the variance, and for the Hermite distribution it is [4]

    β2 = μ4 / μ2² = 3 + (a1 + 16a2) / (a1 + 4a2)².

The excess kurtosis is just a correction that makes the kurtosis of the normal distribution equal to zero, and it is the following:

    γ2 = β2 − 3 = (a1 + 16a2) / (a1 + 4a2)².
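These closed forms translate directly into code. A convenient sanity check: setting a2 = 0 reduces the Hermite distribution to a Poisson(a1), whose skewness is a1^(−1/2) and whose excess kurtosis is 1/a1. A minimal sketch (names are illustrative):

```python
def skewness(a1, a2):
    # gamma_1 = kappa_3 / kappa_2**(3/2) = (a1 + 8*a2) / (a1 + 4*a2)**1.5
    return (a1 + 8 * a2) / (a1 + 4 * a2) ** 1.5

def excess_kurtosis(a1, a2):
    # gamma_2 = kappa_4 / kappa_2**2 = (a1 + 16*a2) / (a1 + 4*a2)**2
    return (a1 + 16 * a2) / (a1 + 4 * a2) ** 2
```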

Characteristic function

The characteristic function of any real-valued random variable is defined as the expected value of e^(itX), where i is the imaginary unit and t ∈ R. This function is related to the moment generating function via φ(t) = M(it). Hence for this distribution the characteristic function is [1]

    φ(t) = exp(a1(e^(it) − 1) + a2(e^(2it) − 1)).

Cumulative distribution function

The cumulative distribution function is [1]

    F(x; a1, a2) = P(Y ≤ x) = e^(−(a1 + a2)) Σ_{k=0}^{⌊x⌋} Σ_{j=0}^{⌊k/2⌋} a1^(k−2j) a2^j / ((k−2j)! j!).
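Since the CDF is just a partial sum of the mass function, it is straightforward to evaluate numerically; the sketch below (illustrative names) also makes the step-function behaviour explicit, being flat between integers:

```python
import math

def hermite_pmf(n, a1, a2):
    # P(Y = n) for the Hermite distribution with parameters a1, a2.
    return math.exp(-(a1 + a2)) * sum(
        a1 ** (n - 2 * j) * a2 ** j / (math.factorial(n - 2 * j) * math.factorial(j))
        for j in range(n // 2 + 1))

def hermite_cdf(x, a1, a2):
    # F(x) = sum of P(Y = k) for k = 0 .. floor(x); flat between integers.
    if x < 0:
        return 0.0
    return sum(hermite_pmf(k, a1, a2) for k in range(int(math.floor(x)) + 1))
```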

Other properties

The Hermite distribution can be multimodal; for example, for (a1, a2) = (0.1, 1.5) the probability mass function has more than one mode.

Parameter estimation

Method of moments

The mean and the variance of the Hermite distribution are a1 + 2a2 and a1 + 4a2, respectively. Equating them to the sample mean x̄ and the sample variance s² gives the two equations

    x̄ = ã1 + 2ã2,
    s² = ã1 + 4ã2.

Solving these two equations we get the moment estimators of a1 and a2, [6]

    ã1 = 2x̄ − s²,
    ã2 = (s² − x̄)/2.

Since a1 and a2 are both positive, the estimators ã1 and ã2 are admissible (≥ 0) only if x̄ ≤ s² ≤ 2x̄.
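The moment estimators and their admissibility condition carry over directly into code; a minimal sketch (names are illustrative):

```python
def moment_estimators(xs):
    """Moment estimators of (a1, a2); admissible only when xbar <= s2 <= 2*xbar."""
    m = len(xs)
    xbar = sum(xs) / m
    s2 = sum((x - xbar) ** 2 for x in xs) / m   # method-of-moments sample variance
    if not (xbar <= s2 <= 2 * xbar):
        raise ValueError("inadmissible sample: need xbar <= s2 <= 2*xbar")
    return 2 * xbar - s2, (s2 - xbar) / 2
```

For a sample with mean 2 and variance 2.25, for instance, this returns (1.75, 0.125); an under-dispersed sample (s² < x̄) is rejected.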

Maximum likelihood

Given a sample X1, ..., Xm of independent random variables, each having a Hermite distribution, we wish to estimate the values of the parameters a1 and a2. We know that the mean and the variance of the distribution are μ = a1 + 2a2 and σ² = a1 + 4a2, respectively. Using these two equations,

    a1 = μ(2 − d),   a2 = μ(d − 1)/2,

we can parameterize the probability function by the mean μ and the dispersion index d = σ²/μ.

Hence the log-likelihood function is [9]

    ℓ(μ, d) = Σ_{i=1}^{m} log P(X = x_i; μ, d),

where P(·; μ, d) denotes the probability mass function in the (μ, d) parameterization. Setting the partial derivatives of ℓ with respect to μ and d equal to zero gives the likelihood equations. [9] Straightforward calculations show that the first of them forces any solution to satisfy μ̂ = x̄, the sample mean. [9]
The likelihood equations do not always have a solution, as the following proposition shows.

Proposition: [9] Let X1, ..., Xm come from a generalized Hermite distribution with fixed n. Then the MLEs of the parameters are μ̂ = x̄ and d̂ > 1 if and only if m̃₍₂₎ > x̄², where m̃₍₂₎ = (1/m) Σ_{i=1}^{m} x_i(x_i − 1) indicates the empirical factorial moment of order 2.
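In practice the likelihood equations are solved numerically. The stdlib-only sketch below (illustrative; a real implementation would use a proper optimizer rather than a grid) maximizes the log-likelihood in the original (a1, a2) parameterization and checks that the fitted mean a1 + 2*a2 tracks the sample mean, as the constraint μ̂ = x̄ predicts:

```python
import math

def hermite_pmf(n, a1, a2):
    # P(Y = n) for the Hermite distribution with parameters a1, a2.
    return math.exp(-(a1 + a2)) * sum(
        a1 ** (n - 2 * j) * a2 ** j / (math.factorial(n - 2 * j) * math.factorial(j))
        for j in range(n // 2 + 1))

def loglik(a1, a2, xs):
    return sum(math.log(hermite_pmf(x, a1, a2)) for x in xs)

def mle_by_grid(xs, step=0.05, a_max=4.0):
    # Crude grid search over (a1, a2) in (0, a_max]; accuracy limited by `step`.
    grid = [step * i for i in range(1, int(a_max / step) + 1)]
    best = max(((loglik(a1, a2, xs), a1, a2) for a1 in grid for a2 in grid),
               key=lambda t: t[0])
    return best[1], best[2]

xs = [0, 0, 1, 2, 2, 4, 4, 3]      # over-dispersed sample: mean 2, variance 2.25
a1_hat, a2_hat = mle_by_grid(xs)
```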

Zero frequency and the mean estimators

A usual choice for discrete distributions is the zero relative frequency of the data set, which is equated to the probability of zero under the assumed distribution. Observing that P(Y = 0) = e^(−(a1 + a2)) and that the mean is a1 + 2a2, and following the example of Y. C. Patel (1976), the resulting system of equations is

    x̄ = â1 + 2â2,
    f0 = e^(−(â1 + â2)).

Solving it, we obtain the zero-frequency and mean estimators â1 of a1 and â2 of a2, [6]

    â1 = −x̄ − 2 log f0,
    â2 = x̄ + log f0,

where f0 > 0 is the zero relative frequency of the sample.

It can be seen that for distributions with a high probability at 0, the efficiency is high.
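The two-equation system above has the closed-form solution given, which the following sketch implements (illustrative names); the checks at the end are simply the two defining identities, that the fitted mean equals x̄ and the fitted zero probability equals f0:

```python
import math

def zero_frequency_estimators(xs):
    """Estimators of (a1, a2) from the sample mean and the zero relative frequency.

    Requires at least one zero in the sample (f0 > 0).
    """
    m = len(xs)
    xbar = sum(xs) / m
    f0 = sum(1 for x in xs if x == 0) / m
    # From a1 + a2 = -log f0 and a1 + 2*a2 = xbar:
    a1_hat = -xbar - 2 * math.log(f0)
    a2_hat = xbar + math.log(f0)
    return a1_hat, a2_hat
```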

Testing Poisson assumption

When the Hermite distribution is used to model a data sample, it is important to check whether the Poisson distribution is sufficient to fit the data. In the parametrized probability mass function used to calculate the maximum likelihood estimator, d = 1 corresponds to the Poisson distribution, so the hypotheses to test are

    H0: d = 1   versus   H1: d > 1.

Likelihood-ratio test

The likelihood-ratio test statistic [9] for the Hermite distribution is

    W = 2 (ℓ(x̄, d̂) − ℓ(x̄, 1)),

where ℓ is the log-likelihood function. As d = 1 belongs to the boundary of the domain of the parameters, under the null hypothesis W does not have the asymptotic χ²₁ distribution that would otherwise be expected. It can be established that the asymptotic distribution of W is a 50:50 mixture of the constant 0 and the χ²₁ distribution. The α upper-tail percentage points for this mixture are the same as the 2α upper-tail percentage points for a χ²₁; for instance, for α = 0.01, 0.05, and 0.10 they are 5.41189, 2.70554, and 1.64237, respectively.
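Because the null distribution is the 50:50 mixture just described, a p-value can be computed from the χ²₁ survival function, which for one degree of freedom equals erfc(√(w/2)). A small sketch (function name is illustrative):

```python
import math

def lrt_pvalue(w):
    """P-value of the likelihood-ratio statistic W under H0: d = 1.

    The null distribution of W is a 50:50 mixture of 0 and chi-square with
    1 degree of freedom, so P(W >= w) = 0.5 * erfc(sqrt(w / 2)) for w > 0.
    """
    if w <= 0:
        return 1.0
    return 0.5 * math.erfc(math.sqrt(w / 2))
```

For instance, the α = 0.05 critical value 2.70554 quoted above indeed yields a p-value of 0.05.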

The "score" or Lagrange multiplier test

The score statistic is, [9]

where m is the number of observations.

The asymptotic distribution of the score test statistic under the null hypothesis is a χ²₁ distribution. It may be convenient to use a signed version of the score test, √S, which asymptotically follows a standard normal distribution.

References

  1. Kemp, C. D.; Kemp, A. W. (1965). "Some Properties of the 'Hermite' Distribution". Biometrika. 52 (3–4): 381–394. doi:10.1093/biomet/52.3-4.381.
  2. McKendrick, A. G. (1926). "Applications of Mathematics to Medical Problems". Proceedings of the Edinburgh Mathematical Society. 44: 98–130. doi:10.1017/s0013091500034428.
  3. Zhang, Huiming; Liu, Yunxiao; Li, Bo (2014). "Notes on discrete compound Poisson model with applications to risk theory". Insurance: Mathematics and Economics. 59: 325–336. doi:10.1016/j.insmatheco.2014.09.012.
  4. Johnson, N. L.; Kemp, A. W.; Kotz, S. (2005). Univariate Discrete Distributions, 3rd Edition. Wiley. ISBN 978-0-471-27246-5.
  5. Kemp, Adrienne W.; Kemp, C. D. (1966). "An alternative derivation of the Hermite distribution". Biometrika. 53 (3–4): 627–628. doi:10.1093/biomet/53.3-4.627.
  6. Patel, Y. C. (1976). "Even Point Estimation and Moment Estimation in Hermite Distribution". Biometrics. 32 (4): 865–873. doi:10.2307/2529270. JSTOR 2529270.
  7. Gupta, R. P.; Jain, G. C. (1974). "A Generalized Hermite Distribution and Its Properties". SIAM Journal on Applied Mathematics. 27 (2): 359–363. doi:10.1137/0127027. JSTOR 2100572.
  8. Kotz, Samuel (1982–1989). Encyclopedia of Statistical Sciences. John Wiley. ISBN 978-0471055525.
  9. Puig, P. (2003). "Characterizing Additively Closed Discrete Models by a Property of Their Maximum Likelihood Estimators, with an Application to Generalized Hermite Distributions". Journal of the American Statistical Association. 98 (463): 687–692. doi:10.1198/016214503000000594. JSTOR 30045296. S2CID 120484966.