Quasi-likelihood

In statistics, quasi-likelihood methods are used to estimate parameters in a statistical model when exact likelihood methods, for example maximum likelihood estimation, are computationally infeasible. Because the function being maximized is not the true likelihood, quasi-likelihood estimators are generally less asymptotically efficient than maximum likelihood estimators. Under broadly applicable conditions, however, quasi-likelihood estimators are consistent and asymptotically normal, and their asymptotic covariance matrix can be obtained using the so-called sandwich estimator. Examples of quasi-likelihood methods include generalized estimating equations and pairwise likelihood approaches.
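As a rough sketch of what the sandwich form looks like, suppose the estimator solves an estimating equation built from independent, identically distributed observations. The symbols below (the estimating function ψ, the true value θ₀, and the matrices A and B) are notation assumed here for illustration and do not come from the article.

```latex
% Assumed setup: \hat\theta solves \sum_{i=1}^{n} \psi_i(\theta) = 0
\sqrt{n}\,(\hat\theta - \theta_0) \xrightarrow{d}
  N\!\left(0,\; A^{-1} B A^{-\top}\right),
\qquad
A = \operatorname{E}\!\left[-\frac{\partial \psi_i(\theta_0)}{\partial \theta^{\top}}\right],
\quad
B = \operatorname{E}\!\left[\psi_i(\theta_0)\,\psi_i(\theta_0)^{\top}\right]
```

In practice A and B would be replaced by sample averages evaluated at the estimate. When ψ is the score of the correctly specified likelihood, A and B coincide and the expression reduces to the usual inverse information matrix.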

History

The term quasi-likelihood function was introduced by Robert Wedderburn in 1974 to describe a function that has properties similar to those of the log-likelihood function but is not the log-likelihood corresponding to any actual probability distribution.[1] He proposed fitting certain quasi-likelihood models using a straightforward extension of the algorithms used to fit generalized linear models.
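A common textbook way of writing such a function, given here as an illustrative sketch rather than a transcription of Wedderburn's paper, uses a single observation y with mean μ and variance φV(μ). The symbols y, μ, φ, V, U and Q are notation assumed for this sketch.

```latex
% Quasi-score U and quasi-likelihood Q for one observation
U(\mu; y) = \frac{y - \mu}{\phi V(\mu)}, \qquad
Q(\mu; y) = \int_{y}^{\mu} \frac{y - t}{\phi V(t)}\, dt

% Properties that U shares with a genuine log-likelihood derivative
\operatorname{E}[U] = 0, \qquad
\operatorname{Var}(U) = \frac{1}{\phi V(\mu)}, \qquad
-\operatorname{E}\!\left[\frac{\partial U}{\partial \mu}\right] = \frac{1}{\phi V(\mu)}

% Example with V(\mu) = \mu
Q(\mu; y) = \frac{y \log \mu - \mu}{\phi} + \text{terms not involving } \mu
```

With V(μ) = μ and φ = 1 the last line matches the Poisson log-likelihood up to terms not involving μ. For other values of φ it does not, in general, correspond to a probability distribution for count data.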

Application to overdispersion modelling

Quasi-likelihood estimation is one way of allowing for overdispersion, that is, greater variability in the data than would be expected from the statistical model used. It is most often used with models for count data or grouped binary data, i.e. data that would otherwise be modelled using the Poisson or binomial distribution.

Instead of specifying a probability distribution for the data, only a relationship between the mean and the variance is specified in the form of a variance function giving the variance as a function of the mean. Generally, this function is allowed to include a multiplicative factor known as the overdispersion parameter or scale parameter that is estimated from the data. Most commonly, the variance function is of a form such that fixing the overdispersion parameter at unity results in the variance-mean relationship of an actual probability distribution such as the binomial or Poisson. (For formulae, see the binomial data example and count data example under generalized linear models.)
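The idea can be illustrated with a minimal sketch of a quasi-Poisson fit, assuming a log link and the variance function Var(y) = φμ. The function name `fit_quasi_poisson`, the simulated data, and the choice of the Pearson statistic to estimate φ are assumptions made for this example; none of them is prescribed by the article.

```python
import numpy as np

def fit_quasi_poisson(X, y, n_iter=100, tol=1e-10):
    """Fit a log-link model with Var(y) = phi * mu by iteratively
    reweighted least squares (hypothetical helper for illustration).

    Returns (beta, phi, cov): coefficients, estimated dispersion, and
    the dispersion-scaled covariance matrix of the coefficients.
    """
    n, p = X.shape
    mu = y + 0.5                 # crude positive starting values for the mean
    eta = np.log(mu)
    beta = np.zeros(p)
    for _ in range(n_iter):
        # Working response and weights for the log link with V(mu) = mu.
        z = eta + (y - mu) / mu
        XtWX = X.T @ (mu[:, None] * X)
        beta_new = np.linalg.solve(XtWX, X.T @ (mu * z))
        eta = X @ beta_new
        mu = np.exp(eta)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    # Pearson estimate of the dispersion: sum of squared Pearson
    # residuals divided by the residual degrees of freedom.
    phi = np.sum((y - mu) ** 2 / mu) / (n - p)
    cov = phi * np.linalg.inv(X.T @ (mu[:, None] * X))
    return beta, phi, cov

# Simulated data (assumption): y = phi * Poisson(mu / phi) has mean mu
# and variance phi * mu, so it is overdispersed relative to the Poisson.
rng = np.random.default_rng(0)
n = 5000
x = rng.uniform(0.0, 1.0, size=n)
X = np.column_stack([np.ones(n), x])
true_beta = np.array([0.5, 1.0])
true_phi = 3.0
mu_true = np.exp(X @ true_beta)
y = true_phi * rng.poisson(mu_true / true_phi)

beta_hat, phi_hat, cov_hat = fit_quasi_poisson(X, y)
print("coefficients:", beta_hat)
print("dispersion:", phi_hat)
print("standard errors:", np.sqrt(np.diag(cov_hat)))
```

In this setup the coefficient estimates coincide with those of an ordinary Poisson regression, because the dispersion parameter cancels from the estimating equations. Only the standard errors change, being multiplied by the square root of the estimated dispersion.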

Comparison to alternatives

Random-effects models, and more generally mixed models (hierarchical models), provide an alternative method of fitting data exhibiting overdispersion using fully specified probability models. However, these methods often become complex and computationally intensive to fit to binary or count data. Quasi-likelihood methods have the advantage of relative computational simplicity, speed and robustness, as they can make use of the more straightforward algorithms developed to fit generalized linear models.

Notes

  1. Wedderburn, R. W. M. (1974). "Quasi-likelihood functions, generalized linear models, and the Gauss–Newton method". Biometrika. 61 (3): 439–447. doi:10.1093/biomet/61.3.439. MR 0375592.
