Bayesian econometrics

Bayesian econometrics is a branch of econometrics which applies Bayesian principles to economic modelling. Bayesianism is based on a degree-of-belief interpretation of probability, as opposed to a relative-frequency interpretation.

The Bayesian principle relies on Bayes' theorem, which states that the probability of B conditional on A is the joint probability of A and B divided by the probability of A. Bayesian econometricians assume that coefficients in the model have prior distributions.

This approach was first propagated by Arnold Zellner. [1]

Basics

Subjective probabilities have to satisfy the standard axioms of probability theory if one wishes to avoid losing a bet regardless of the outcome. [2] Before the data is observed, the parameter $\theta$ is regarded as an unknown quantity and thus a random variable, which is assigned a prior distribution $\pi(\theta)$ with $0 \leq \pi(\theta) \leq 1$. Bayesian analysis concentrates on the inference of the posterior distribution $\pi(\theta|y)$, i.e. the distribution of the random variable $\theta$ conditional on the observation of the discrete data $y$. The posterior density function can be computed based on Bayes' theorem:

$$\pi(\theta|y) = \frac{p(y|\theta)\,\pi(\theta)}{p(y)},$$

where $p(y) = \sum_{\theta} p(y|\theta)\,\pi(\theta)$, yielding a normalized probability function. For continuous data $y$, this corresponds to:

$$\pi(\theta|y) = \frac{p(y|\theta)\,\pi(\theta)}{p(y)}, \qquad p(y) = \int p(y|\theta)\,\pi(\theta)\,d\theta,$$

which is the centerpiece of Bayesian statistics and econometrics. It has the following components: the posterior density $\pi(\theta|y)$, the likelihood function $p(y|\theta)$, the prior density $\pi(\theta)$, and the marginal density of the data $p(y)$, which acts as the normalizing constant.
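As a concrete numerical illustration of the normalization step, the following sketch (Python with NumPy; the ten Bernoulli observations, the flat prior, and the parameter grid are all invented for illustration) evaluates the unnormalized posterior for a success probability on a discrete grid and divides by its sum, which plays the role of $p(y)$:

```python
import numpy as np

# Hypothetical observed data: 10 Bernoulli trials with 7 successes.
y = np.array([1, 1, 0, 1, 1, 1, 0, 1, 0, 1])

# Discrete grid of candidate parameter values theta in (0, 1).
theta = np.linspace(0.01, 0.99, 99)

# Uniform (flat) prior over the grid.
prior = np.ones_like(theta) / theta.size

# Bernoulli likelihood p(y | theta) evaluated at each grid point.
likelihood = theta ** y.sum() * (1 - theta) ** (y.size - y.sum())

# Bayes' theorem: posterior is proportional to likelihood times prior;
# dividing by the sum over the grid supplies the normalizing constant p(y).
unnormalized = likelihood * prior
posterior = unnormalized / unnormalized.sum()

print(theta[np.argmax(posterior)])  # posterior mode, close to 7/10 here
```

The same computation with an integral in place of the sum applies to continuous parameter spaces.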

The posterior function is given by $\pi(\theta|y) \propto p(y|\theta)\,\pi(\theta)$, i.e., the posterior function is proportional to the product of the likelihood function and the prior distribution, and can be understood as a method of updating information, with the difference between $\pi(\theta)$ and $\pi(\theta|y)$ being the information gain concerning $\theta$ after observing new data. The choice of the prior distribution is used to impose restrictions on $\theta$, e.g. $0 \leq \theta \leq 1$, with the beta distribution as a common choice due to (i) being defined between 0 and 1, (ii) being able to produce a variety of shapes, and (iii) yielding a posterior distribution of the standard form if combined with the likelihood function $p(y|\theta)$. Based on the properties of the beta distribution, an ever-larger sample size implies that the mean of the posterior distribution approximates the maximum likelihood estimator $\hat{\theta}$.

The assumed form of the likelihood function is part of the prior information and has to be justified. Different distributional assumptions can be compared using posterior odds ratios if a priori grounds fail to provide a clear choice. Commonly assumed forms include the beta distribution, the gamma distribution, and the uniform distribution, among others. If the model contains multiple parameters, the parameter can be redefined as a vector. Applying probability theory to that vector of parameters yields the marginal and conditional distributions of individual parameters or parameter groups.

If data generation is sequential, Bayesian principles imply that the posterior distribution for the parameter based on new evidence will be proportional to the product of the likelihood for the new data, given previous data and the parameter, and the posterior distribution for the parameter, given the old data. This provides an intuitive way of allowing new information to influence beliefs about a parameter through Bayesian updating. If the sample size is large, (i) the prior distribution plays a relatively small role in determining the posterior distribution, (ii) the posterior distribution converges to a degenerate distribution at the true value of the parameter, and (iii) the posterior distribution is approximately normally distributed with mean $\hat{\theta}$.
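The beta-Bernoulli case discussed above can be made concrete. The sketch below (Python with NumPy; the simulated data, the "true" success probability of 0.3, and the Beta(2, 2) prior are invented for illustration) updates a beta prior with Bernoulli observations, once in a single batch and once sequentially, and compares the posterior mean with the maximum likelihood estimate:

```python
import numpy as np

rng = np.random.default_rng(0)
true_theta = 0.3                      # "true" success probability (assumed)
y = rng.binomial(1, true_theta, 500)  # simulated Bernoulli data

# Beta(a, b) prior on theta; Beta(2, 2) is an arbitrary illustrative choice.
a, b = 2.0, 2.0

# Batch update: with a Bernoulli likelihood the beta prior is conjugate,
# so the posterior is Beta(a + successes, b + failures).
a_post = a + y.sum()
b_post = b + (y.size - y.sum())

# Sequential update: processing observations one at a time gives the
# same posterior, illustrating Bayesian updating as new evidence arrives.
a_seq, b_seq = a, b
for obs in y:
    a_seq += obs
    b_seq += 1 - obs
assert (a_seq, b_seq) == (a_post, b_post)

posterior_mean = a_post / (a_post + b_post)
mle = y.mean()
print(posterior_mean, mle)  # close for a large sample, as noted above
```

With 500 observations the prior contributes little, and the posterior mean is nearly indistinguishable from the maximum likelihood estimate, in line with the large-sample properties described above.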

History

The ideas underlying Bayesian statistics were developed by Rev. Thomas Bayes during the 18th century and later expanded by Pierre-Simon Laplace. As early as 1950, the potential of Bayesian inference in econometrics was recognized by Jacob Marschak. [3] The Bayesian approach was first applied to econometrics in the early 1960s by W. D. Fisher, Jacques Drèze, Clifford Hildreth, Thomas J. Rothenberg, George Tiao, and Arnold Zellner. The central motivation behind these early endeavors in Bayesian econometrics was the combination of the parameter estimators with available uncertain information on the model parameters that was not included in a given model formulation. [4]

From the mid-1960s to the mid-1970s, the reformulation of econometric techniques along Bayesian principles under the traditional structural approach dominated the research agenda, with Zellner's An Introduction to Bayesian Inference in Econometrics in 1971 as one of its highlights, and thus closely followed the work of frequentist econometrics. The main technical issues were the difficulty of specifying prior densities without losing either economic interpretation or mathematical tractability and the difficulty of integral calculation in the context of density functions. The result of the Bayesian reformulation program was to highlight the fragility of structural models to uncertain specification. This fragility came to motivate the work of Edward Leamer, who emphatically criticized modelers' tendency to indulge in "post-data model construction" and consequently developed a method of economic modelling based on the selection of regression models according to the types of prior density specification, in order to identify explicitly the prior structures underlying modelers' working rules in model selection. [5] The explicit probability specification of parameter restrictions also made Bayesian econometrics attractive to Christopher Sims in his move from structural modeling to VAR modeling.

Driven by the rapid growth of computing capacities from the mid-1980s on, the application of Markov chain Monte Carlo simulation to statistical and econometric models, first performed in the early 1990s, enabled Bayesian analysis to drastically increase its influence in economics and econometrics. [6]

Current research topics

Since the beginning of the 21st century, research in Bayesian econometrics has concentrated on a number of topics. [7]

Related Research Articles

The likelihood function is the joint probability of observed data viewed as a function of the parameters of a statistical model.

Bayesian inference is a method of statistical inference in which Bayes' theorem is used to update the probability for a hypothesis as more evidence or information becomes available. Fundamentally, Bayesian inference uses prior knowledge, in the form of a prior distribution, in order to estimate posterior probabilities. Bayesian inference is an important technique in statistics, and especially in mathematical statistics. Bayesian updating is particularly important in the dynamic analysis of a sequence of data. Bayesian inference has found application in a wide range of activities, including science, engineering, philosophy, medicine, sport, and law. In the philosophy of decision theory, Bayesian inference is closely related to subjective probability, often called "Bayesian probability".

In statistics, maximum likelihood estimation (MLE) is a method of estimating the parameters of an assumed probability distribution, given some observed data. This is achieved by maximizing a likelihood function so that, under the assumed statistical model, the observed data is most probable. The point in the parameter space that maximizes the likelihood function is called the maximum likelihood estimate. The logic of maximum likelihood is both intuitive and flexible, and as such the method has become a dominant means of statistical inference.

In statistics, a statistic is sufficient with respect to a statistical model and its associated unknown parameter if "no other statistic that can be calculated from the same sample provides any additional information as to the value of the parameter". In particular, a statistic is sufficient for a family of probability distributions if the sample from which it is calculated gives no additional information than the statistic, as to which of those probability distributions is the sampling distribution.

The posterior probability is a type of conditional probability that results from updating the prior probability with information summarized by the likelihood via an application of Bayes' rule. From an epistemological perspective, the posterior probability contains everything there is to know about an uncertain proposition, given prior knowledge and a mathematical model describing the observations available at a particular time. After the arrival of new information, the current posterior probability may serve as the prior in another round of Bayesian updating.

A prior probability distribution of an uncertain quantity, often simply called the prior, is its assumed probability distribution before some evidence is taken into account. For example, the prior could be the probability distribution representing the relative proportions of voters who will vote for a particular politician in a future election. The unknown quantity may be a parameter of the model or a latent variable rather than an observable variable.

In statistics, Gibbs sampling or a Gibbs sampler is a Markov chain Monte Carlo (MCMC) algorithm for sampling from a specified multivariate probability distribution when direct sampling from the joint distribution is difficult, but sampling from the conditional distribution of each variable given the others is more practical. The resulting sequence of samples can be used to approximate the joint distribution; to approximate the marginal distribution of one of the variables, or some subset of the variables; or to compute an integral such as an expected value. Typically, some of the variables correspond to observations whose values are known, and hence do not need to be sampled.
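As an illustration of the idea, the following sketch (Python with NumPy; the bivariate normal target with correlation 0.8 and the starting values are arbitrary choices) alternately draws each coordinate from its conditional distribution given the other, which is the situation in which Gibbs sampling is practical, even though this particular target could also be sampled directly:

```python
import numpy as np

rng = np.random.default_rng(1)
rho = 0.8          # correlation of the illustrative bivariate normal target
n_draws = 10_000

samples = np.empty((n_draws, 2))
x, z = 0.0, 0.0    # arbitrary starting values

for i in range(n_draws):
    # Conditional of x given z is N(rho * z, 1 - rho**2), and vice versa.
    x = rng.normal(rho * z, np.sqrt(1 - rho ** 2))
    z = rng.normal(rho * x, np.sqrt(1 - rho ** 2))
    samples[i] = (x, z)

# After discarding burn-in, the empirical correlation of the draws
# approximates rho, and the marginals approximate standard normals.
print(np.corrcoef(samples[1000:].T)[0, 1])
```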

A marginal likelihood is a likelihood function that has been integrated over the parameter space. In Bayesian statistics, it represents the probability of generating the observed sample for all possible values of the parameters; it can be understood as the probability of the model itself and is therefore often referred to as model evidence or simply evidence.

In Bayesian statistics, a maximum a posteriori probability (MAP) estimate is an estimate of an unknown quantity, that equals the mode of the posterior distribution. The MAP can be used to obtain a point estimate of an unobserved quantity on the basis of empirical data. It is closely related to the method of maximum likelihood (ML) estimation, but employs an augmented optimization objective which incorporates a prior distribution over the quantity one wants to estimate. MAP estimation can therefore be seen as a regularization of maximum likelihood estimation.
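A minimal sketch of the regularization interpretation, assuming a normal likelihood with known variance and a zero-mean normal prior on the location parameter (all numbers invented), shows that the MAP estimate is a shrunken version of the maximum likelihood estimate:

```python
import numpy as np

y = np.array([2.1, 1.7, 2.5, 1.9, 2.3])  # hypothetical observations
sigma2 = 1.0     # known observation variance (assumed)
tau2 = 0.5       # variance of the zero-mean normal prior on the mean

# MLE of the mean under a normal likelihood is the sample average.
mle = y.mean()

# MAP maximizes log-likelihood + log-prior; for this conjugate setup the
# optimum has a closed form that shrinks the MLE toward the prior mean 0.
n = y.size
map_estimate = (n / sigma2) / (n / sigma2 + 1 / tau2) * mle

print(mle, map_estimate)  # MAP lies between 0 (the prior mean) and the MLE
```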

In estimation theory and decision theory, a Bayes estimator or a Bayes action is an estimator or decision rule that minimizes the posterior expected value of a loss function. Equivalently, it maximizes the posterior expectation of a utility function. An alternative way of formulating an estimator within Bayesian statistics is maximum a posteriori estimation.

Bayesian inference of phylogeny combines the information in the prior and in the data likelihood to create the so-called posterior probability of trees, which is the probability that the tree is correct given the data, the prior and the likelihood model. Bayesian inference was introduced into molecular phylogenetics in the 1990s by three independent groups: Bruce Rannala and Ziheng Yang in Berkeley, Bob Mau in Madison, and Shuying Li at the University of Iowa, the last two being PhD students at the time. The approach has become very popular since the release of the MrBayes software in 2001, and is now one of the most popular methods in molecular phylogenetics.

In statistics, the monotone likelihood ratio property is a property of the ratio of two probability density functions (PDFs). Formally, distributions $f(x)$ and $g(x)$ bear the property if the ratio $f(x)/g(x)$ is a non-decreasing function of $x$.

Approximate Bayesian computation (ABC) constitutes a class of computational methods rooted in Bayesian statistics that can be used to estimate the posterior distributions of model parameters.
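A minimal ABC rejection sketch (Python with NumPy; the Bernoulli model, the uniform prior, the summary statistic, and the tolerance are all invented for illustration): parameter values are drawn from the prior, data are simulated from the model, and draws whose simulated summary statistic is close to the observed one are kept as approximate posterior samples:

```python
import numpy as np

rng = np.random.default_rng(2)

observed = rng.binomial(1, 0.6, 100)        # hypothetical observed data
obs_stat = observed.mean()                  # summary statistic

accepted = []
tolerance = 0.02
for _ in range(20_000):
    theta = rng.uniform(0, 1)               # draw from the uniform prior
    simulated = rng.binomial(1, theta, observed.size)
    if abs(simulated.mean() - obs_stat) <= tolerance:
        accepted.append(theta)              # keep draws close to the data

# The accepted draws approximate the posterior of theta without ever
# evaluating the likelihood function explicitly.
print(len(accepted), np.mean(accepted))
```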

In probability theory and statistics, the half-normal distribution is a special case of the folded normal distribution.

In statistical inference, the concept of a confidence distribution (CD) has often been loosely referred to as a distribution function on the parameter space that can represent confidence intervals of all levels for a parameter of interest. Historically, it has typically been constructed by inverting the upper limits of lower sided confidence intervals of all levels, and it was also commonly associated with a fiducial interpretation, although it is a purely frequentist concept. A confidence distribution is NOT a probability distribution function of the parameter of interest, but may still be a function useful for making inferences.

In Bayesian inference, the Bernstein–von Mises theorem provides the basis for using Bayesian credible sets for confidence statements in parametric models. It states that under some conditions, a posterior distribution converges in the limit of infinite data to a multivariate normal distribution centered at the maximum likelihood estimator with covariance matrix given by $n^{-1} I(\theta_0)^{-1}$, where $\theta_0$ is the true population parameter and $I(\theta_0)$ is the Fisher information matrix at the true population parameter value.

In probability theory and statistics, empirical likelihood (EL) is a nonparametric method for estimating the parameters of statistical models. It requires fewer assumptions about the error distribution while retaining some of the merits in likelihood-based inference. The estimation method requires that the data are independent and identically distributed (iid). It performs well even when the distribution is asymmetric or censored. EL methods can also handle constraints and prior information on parameters. Art Owen pioneered work in this area with his 1988 paper.

Bayesian hierarchical modelling is a statistical model written in multiple levels that estimates the parameters of the posterior distribution using the Bayesian method. The sub-models combine to form the hierarchical model, and Bayes' theorem is used to integrate them with the observed data and account for all the uncertainty that is present. The result of this integration is the posterior distribution, also known as the updated probability estimate, as additional evidence on the prior distribution is acquired.

In computational statistics, the pseudo-marginal Metropolis–Hastings algorithm is a Monte Carlo method to sample from a probability distribution. It is an instance of the popular Metropolis–Hastings algorithm that extends its use to cases where the target density is not available analytically. It relies on the fact that the Metropolis–Hastings algorithm can still sample from the correct target distribution if the target density in the acceptance ratio is replaced by an estimate. It is especially popular in Bayesian statistics, where it is applied if the likelihood function is not tractable.

Laplace's approximation provides an analytical expression for a posterior probability distribution by fitting a Gaussian distribution with a mean equal to the MAP solution and precision equal to the observed Fisher information. The approximation is justified by the Bernstein–von Mises theorem, which states that under regularity conditions the posterior converges to a Gaussian in large samples.
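A minimal sketch of the idea (Python with NumPy; the Beta(8, 4) posterior, i.e. 7 successes and 3 failures under a flat prior, is an invented example): the log-posterior is maximized on a grid to locate the mode, the curvature at the mode is estimated by a finite difference, and the resulting Gaussian moments are compared with the exact posterior mean and variance:

```python
import numpy as np

# Log of an (unnormalized) Beta(8, 4) posterior -- invented numbers.
a, b = 8.0, 4.0

def log_post(t):
    return (a - 1) * np.log(t) + (b - 1) * np.log(1 - t)

# Locate the posterior mode on a fine grid (the MAP solution).
grid = np.linspace(0.001, 0.999, 9999)
mode = grid[np.argmax(log_post(grid))]

# Curvature at the mode via a central finite difference; its negative is
# the precision (observed information) of the approximating Gaussian.
h = 1e-5
second_deriv = (log_post(mode + h) - 2 * log_post(mode) + log_post(mode - h)) / h ** 2
laplace_var = -1.0 / second_deriv

exact_mean = a / (a + b)
exact_var = a * b / ((a + b) ** 2 * (a + b + 1))
print(mode, laplace_var, exact_mean, exact_var)
```

The Gaussian is centered at the mode (0.7) rather than the exact mean (about 0.67), and its variance slightly overstates the exact posterior variance; the discrepancy shrinks as the sample size grows, consistent with the Bernstein–von Mises theorem mentioned above.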

References

  1. Greenberg, Edward (2012). Introduction to Bayesian Econometrics (Second ed.). Cambridge University Press. ISBN 978-1-107-01531-9.
  2. Chapter 3 in de Finetti, B. (1990). Theory of Probability. Chichester: John Wiley & Sons.
  3. Marschak made this acknowledgment in a lecture, which was formalized in Marschak (1954); cf. Marschak, J. (1954). "Probability in the Social Sciences". In Marschak, J. (1974). Economic Information, Decision, and Prediction. Selected Essays: Volume I Part I - Economics of Decision. Amsterdam: Springer Netherlands.
  4. Qin, D. (1996). "Bayesian Econometrics: The First Twenty Years". Econometric Theory. 12 (3): 500–516. doi:10.1017/S0266466600006836.
  5. Leamer, Edward E. (1974). "False Models and Post-Data Model Construction". Journal of the American Statistical Association. 69 (345): 122–131. doi:10.1080/01621459.1974.10480138.
  6. Koop, Gary; Korobilis, Dimitris (2010). "Bayesian Multivariate Time Series Methods for Empirical Macroeconomics". Foundations and Trends in Econometrics. 3 (4): 267–358. CiteSeerX 10.1.1.164.7962. doi:10.1561/0800000013.
  7. Basturk, N. (2013). Historical Developments in Bayesian Econometrics after Cowles Foundation Monographs 10, 14. Tinbergen Institute Discussion Paper 191/III.