Partial likelihood methods for panel data

Partial (pooled) likelihood estimation for panel data is a quasi-maximum likelihood method for panel analysis that assumes that the density of $y_{it}$ given $x_{it}$ is correctly specified for each time period but allows for misspecification in the conditional density of $y_i \equiv (y_{i1}, \dots, y_{iT})$ given $x_i \equiv (x_{i1}, \dots, x_{iT})$.

Description

Concretely, partial likelihood estimation uses the product of conditional densities as the density of the joint conditional distribution. This generality facilitates maximum likelihood methods in the panel data setting because fully specifying the joint conditional distribution of $y_i$ can be computationally demanding. [1] On the other hand, allowing for misspecification generally results in a violation of the information matrix equality and thus requires a robust standard error estimator for valid inference.

In the following exposition, we follow the treatment in Wooldridge. [1] In particular, the asymptotic derivation is done in the fixed-$T$, growing-$N$ setting.

Writing the conditional density of $y_{it}$ given $x_{it}$ as $f_t(y_{it} \mid x_{it}; \theta)$, the partial maximum likelihood estimator solves

$$\max_{\theta \in \Theta} \sum_{i=1}^{N} \sum_{t=1}^{T} \log f_t(y_{it} \mid x_{it}; \theta).$$

In this formulation, the joint conditional density of $y_i$ given $x_i$ is modeled as $\prod_t f_t(y_{it} \mid x_{it}; \theta)$. We assume that $f_t(y_{it} \mid x_{it}; \theta)$ is correctly specified for each $t = 1, \dots, T$ and that there exists $\theta_0 \in \Theta$ that uniquely maximizes $E\left[\sum_t \log f_t(y_{it} \mid x_{it}; \theta)\right]$. However, it is not assumed that the joint conditional density is correctly specified. Under some regularity conditions, the partial MLE is consistent and asymptotically normal.
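As a concrete illustration, the sketch below maximizes this pooled objective for a probit choice of $f_t$ using `scipy`. The probit specification, the simulated data-generating process, and all variable names are hypothetical, chosen only to make the objective concrete; any correctly specified per-period density would do.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def pooled_probit_nll(theta, y, X):
    """Negative partial log-likelihood: -sum_i sum_t log f_t(y_it | x_it; theta)."""
    p = np.clip(norm.cdf(X @ theta), 1e-12, 1 - 1e-12)  # guard against log(0)
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

# Hypothetical simulated panel, stacked over units i and periods t
rng = np.random.default_rng(0)
N, T, k = 500, 4, 3
X = rng.normal(size=(N * T, k))
theta_true = np.array([0.5, -1.0, 0.25])
y = (X @ theta_true + rng.normal(size=N * T) > 0).astype(float)

res = minimize(pooled_probit_nll, x0=np.zeros(k), args=(y, X), method="BFGS")
theta_hat = res.x  # the partial MLE
print(theta_hat)
```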

By the usual argument for M-estimators (details in Wooldridge [1]), the asymptotic variance of $\sqrt{N}(\hat{\theta}_{\mathrm{PMLE}} - \theta_0)$ is $A^{-1} B A^{-1}$, where

$$A = -E\left[\sum_t \nabla^2_\theta \log f_t(y_{it} \mid x_{it}; \theta_0)\right] \quad \text{and} \quad B = E\left[\left(\sum_t \nabla_\theta \log f_t(y_{it} \mid x_{it}; \theta_0)\right)\left(\sum_t \nabla_\theta \log f_t(y_{it} \mid x_{it}; \theta_0)\right)^{\mathsf{T}}\right].$$

If the joint conditional density of $y_i$ given $x_i$ is correctly specified, this formula simplifies because the information matrix equality implies $B = A$. Yet, except in special circumstances, the joint density modeled by partial MLE is not correct, so for valid inference the sandwich formula $A^{-1} B A^{-1}$ should be used. One sufficient condition for the information matrix equality to hold is that the scores of the densities for different time periods are uncorrelated. In dynamically complete models the condition holds, and thus the simplified asymptotic variance is valid. [1]
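Continuing the probit sketch above, a minimal sandwich-variance computation groups scores by unit before forming $B$. The finite-difference Hessian and the reshape (which assumes rows are sorted by unit, with each unit's $T$ periods contiguous) are illustrative assumptions, not a fixed recipe.

```python
import numpy as np
from scipy.optimize import approx_fprime
from scipy.stats import norm
# y, X, N, T, k, pooled_probit_nll, theta_hat continue from the previous sketch

def probit_score(theta, y, X):
    """Per-observation score nabla_theta log f_t(y_it | x_it; theta) for the probit sketch."""
    idx = X @ theta
    cdf = np.clip(norm.cdf(idx), 1e-12, 1 - 1e-12)
    return (norm.pdf(idx) * (y - cdf) / (cdf * (1 - cdf)))[:, None] * X

def numerical_hessian(f, x, eps=1e-5):
    """Crude nested finite-difference Hessian of a scalar function f at x."""
    H = np.array([approx_fprime(x, lambda z: approx_fprime(z, f, eps)[j], eps)
                  for j in range(x.size)])
    return 0.5 * (H + H.T)  # symmetrize

# A_hat estimates A = -E[sum_t Hessian of log f_t]; the nll already carries the minus sign
A_hat = numerical_hessian(lambda th: pooled_probit_nll(th, y, X), theta_hat) / N
# B_hat estimates B from unit-level scores s_i = sum_t score_it
s_i = probit_score(theta_hat, y, X).reshape(N, T, k).sum(axis=1)
B_hat = s_i.T @ s_i / N
avar = np.linalg.inv(A_hat) @ B_hat @ np.linalg.inv(A_hat) / N  # Avar(theta_hat)
print(np.sqrt(np.diag(avar)))  # robust (clustered-by-unit) standard errors
```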

Pooled QMLE for Poisson models

Pooled QMLE is a technique that allows estimating parameters when panel data are available with Poisson outcomes. For instance, one might have information on the number of patents filed by a number of different firms over time. Pooled QMLE does not model unobserved effects (which could otherwise be treated as random effects or fixed effects); the method is mainly proposed for settings where such effects are absent. Its computational requirements are less stringent, especially compared to fixed-effect Poisson models, but the trade-off is the possibly strong assumption of no unobserved heterogeneity. "Pooled" refers to pooling the data over the $T$ time periods, while "QMLE" refers to the quasi-maximum likelihood technique.

The Poisson distribution of $y_{it}$ given $x_{it}$ is specified as follows: [2]

$$f(y_{it} \mid x_{it}) = \frac{e^{-\mu_{it}} \mu_{it}^{y_{it}}}{y_{it}!}, \qquad y_{it} = 0, 1, 2, \dots$$

The starting point for Poisson pooled QMLE is the conditional mean assumption. Specifically, we assume that for some $b_0$ in a compact parameter space $B$, the conditional mean is given by [2]

$$E(y_{it} \mid x_{it}) = m(x_{it}, b_0), \qquad t = 1, \dots, T.$$

The compact parameter space condition is imposed to enable the use of M-estimation techniques, while the conditional mean reflects the fact that the population mean of a Poisson process is the parameter of interest. In this particular case, the parameter governing the Poisson process is allowed to vary with respect to the vector $x_{it}$. [2] The function $m$ can, in principle, change over time, even though it is often specified as static over time. [3] Note that only the conditional mean function is specified, and we will get consistent estimates of $b_0$ as long as this mean condition is correctly specified. Maximizing the pooled quasi-log-likelihood

$$\ell(b) = \sum_{i=1}^{N} \sum_{t=1}^{T} \left[ y_{it} \log m(x_{it}, b) - m(x_{it}, b) \right]$$

leads to the following first-order condition for the pooled Poisson estimation: [2]

$$\sum_{i=1}^{N} \sum_{t=1}^{T} \frac{\nabla_b m(x_{it}, \hat{b})^{\mathsf{T}} \left( y_{it} - m(x_{it}, \hat{b}) \right)}{m(x_{it}, \hat{b})} = 0.$$
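To make this concrete, here is a minimal pooled Poisson QMLE sketch with an exponential mean function (anticipating the choice discussed next). The simulated patent-count panel and all variable names are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

def neg_quasi_loglik(b, y, X):
    """Negative pooled quasi-log-likelihood; the log(y!) term is dropped (constant in b)."""
    xb = X @ b
    return -np.sum(y * xb - np.exp(xb))

def neg_score(b, y, X):
    """Gradient of the negative quasi-log-likelihood; setting it to zero is the FOC,
    which for m = exp(x'b) simplifies to sum_i sum_t x_it (y_it - exp(x_it'b)) = 0."""
    return -X.T @ (y - np.exp(X @ b))

# Hypothetical patent-count panel, stacked over firms i and years t
rng = np.random.default_rng(1)
N, T, k = 300, 5, 2
X = rng.normal(scale=0.5, size=(N * T, k))
b_true = np.array([0.8, -0.3])
y = rng.poisson(np.exp(X @ b_true))

res = minimize(neg_quasi_loglik, np.zeros(k), args=(y, X), jac=neg_score, method="BFGS")
b_hat = res.x  # consistent whenever E[y|x] = exp(x'b_0), even if y is not truly Poisson
print(b_hat)
```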

A popular choice is $m(x_{it}, b) = \exp(x_{it} b)$, since the mean of a Poisson random variable must be strictly positive. [3] This reduces the conditional mean to an exponential index function, where $x_{it} b$ is the linear index and $\exp(\cdot)$ is the inverse of the canonical log link function. [4]
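With this exponential choice the sandwich pieces have closed forms ($A$ is the expected sum over $t$ of $\exp(x_{it} b_0)\, x_{it} x_{it}^{\mathsf{T}}$), so robust standard errors for the sketch above take only a few lines. As before, the reshape assumes the rows are sorted by firm; this is an illustrative continuation of the previous block, not a fixed recipe.

```python
# Cluster-robust sandwich variance for the pooled Poisson sketch above
# (b_hat, y, X, N, T, k continue from the previous block)
mu = np.exp(X @ b_hat)
A_hat = (X * mu[:, None]).T @ X / N                          # -E[Hessian] per unit
s_i = ((y - mu)[:, None] * X).reshape(N, T, k).sum(axis=1)   # unit-level scores
B_hat = s_i.T @ s_i / N
avar = np.linalg.inv(A_hat) @ B_hat @ np.linalg.inv(A_hat) / N
print(np.sqrt(np.diag(avar)))  # valid even when the Poisson variance assumption fails
```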

References

  1. Wooldridge, J. M., Econometric Analysis of Cross Section and Panel Data, MIT Press, Cambridge, Mass.
  2. Cameron, A. C. and P. K. Trivedi (2015), "Count Panel Data", in The Oxford Handbook of Panel Data, ed. by B. Baltagi, Oxford University Press, pp. 233–256.
  3. Wooldridge, J. M. (2002), Econometric Analysis of Cross Section and Panel Data, MIT Press, Cambridge, Mass.
  4. McCullagh, P. and J. A. Nelder (1989), Generalized Linear Models, CRC Monographs on Statistics and Applied Probability (Book 37), 2nd ed., Chapman and Hall, London.