Multivariate probit model

In statistics and econometrics, the multivariate probit model is a generalization of the probit model used to estimate several correlated binary outcomes jointly. For example, if the decision to send at least one child to public school and the decision to vote in favor of a school budget are believed to be correlated (both decisions are binary), then the multivariate probit model would be appropriate for jointly predicting these two choices on an individual-specific basis. J.R. Ashford and R.R. Sowden initially proposed an approach for multivariate probit analysis. [1] Siddhartha Chib and Edward Greenberg extended this idea and also proposed simulation-based inference methods for the multivariate probit model, which simplified and generalized parameter estimation. [2]

Example: bivariate probit

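In the ordinary probit model, there is only one binary dependent variable $Y$, so only one latent variable $Y^*$ is used. In contrast, the bivariate probit model has two binary dependent variables, $Y_1$ and $Y_2$, and therefore two latent variables, $Y_1^*$ and $Y_2^*$. Each observed variable is assumed to take the value 1 if and only if its underlying continuous latent variable is positive:

$$Y_1 = \begin{cases} 1 & \text{if } Y_1^* > 0, \\ 0 & \text{otherwise,} \end{cases} \qquad Y_2 = \begin{cases} 1 & \text{if } Y_2^* > 0, \\ 0 & \text{otherwise,} \end{cases}$$

with

$$Y_1^* = X_1\beta_1 + \varepsilon_1, \qquad Y_2^* = X_2\beta_2 + \varepsilon_2,$$

and

$$\begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \end{bmatrix} \Bigm| X \sim \mathcal{N}\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 & \rho \\ \rho & 1 \end{bmatrix} \right).$$

Fitting the bivariate probit model involves estimating the values of $\beta_1$, $\beta_2$ and $\rho$. To do so, the likelihood of the model has to be maximized. With the product taken over observations (the observation index is suppressed), this likelihood is

$$L(\beta_1, \beta_2, \rho) = \prod P(Y_1 = 1, Y_2 = 1)^{Y_1 Y_2} \, P(Y_1 = 0, Y_2 = 1)^{(1 - Y_1) Y_2} \, P(Y_1 = 1, Y_2 = 0)^{Y_1 (1 - Y_2)} \, P(Y_1 = 0, Y_2 = 0)^{(1 - Y_1)(1 - Y_2)},$$

where each probability is conditional on $\beta_1$, $\beta_2$ and $\rho$.

Substituting the latent variables $Y_1^*$ and $Y_2^*$ in the probability functions and taking logs gives

$$\sum \Big( Y_1 Y_2 \ln P(\varepsilon_1 > -X_1\beta_1, \varepsilon_2 > -X_2\beta_2) + (1 - Y_1) Y_2 \ln P(\varepsilon_1 \le -X_1\beta_1, \varepsilon_2 > -X_2\beta_2) + Y_1 (1 - Y_2) \ln P(\varepsilon_1 > -X_1\beta_1, \varepsilon_2 \le -X_2\beta_2) + (1 - Y_1)(1 - Y_2) \ln P(\varepsilon_1 \le -X_1\beta_1, \varepsilon_2 \le -X_2\beta_2) \Big).$$

After some rewriting, which uses the symmetry of the normal distribution, the log-likelihood function becomes:

$$\sum \Big( Y_1 Y_2 \ln \Phi(X_1\beta_1, X_2\beta_2, \rho) + (1 - Y_1) Y_2 \ln \Phi(-X_1\beta_1, X_2\beta_2, -\rho) + Y_1 (1 - Y_2) \ln \Phi(X_1\beta_1, -X_2\beta_2, -\rho) + (1 - Y_1)(1 - Y_2) \ln \Phi(-X_1\beta_1, -X_2\beta_2, \rho) \Big).$$

Note that $\Phi$ is the cumulative distribution function of the standard bivariate normal distribution, with the correlation as its third argument. $Y_1$ and $Y_2$ in the log-likelihood function are observed variables equal to one or zero.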
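The log-likelihood above can be evaluated numerically. The following is a minimal sketch, assuming NumPy and SciPy are available; the function names `bvn_cdf` and `bivariate_probit_loglik`, the simulated design matrices, and the "true" parameter values are all illustrative choices rather than part of any reference implementation. It uses the sign variables q = 2y - 1 to collapse the four terms into one, since each term has the form ln Φ(q1·X1β1, q2·X2β2, q1·q2·ρ).

```python
import numpy as np
from scipy.stats import multivariate_normal


def bvn_cdf(a, b, rho):
    """Standard bivariate normal CDF Phi(a, b, rho) for scalar a and b."""
    cov = np.array([[1.0, rho], [rho, 1.0]])
    return multivariate_normal.cdf([a, b], mean=[0.0, 0.0], cov=cov)


def bivariate_probit_loglik(beta1, beta2, rho, X1, X2, y1, y2):
    """Bivariate probit log-likelihood summed over observations."""
    q1 = 2 * y1 - 1          # +1 if y1 == 1, -1 if y1 == 0
    q2 = 2 * y2 - 1
    idx1 = X1 @ beta1        # linear index X1 * beta1
    idx2 = X2 @ beta2        # linear index X2 * beta2
    total = 0.0
    for a, b, s1, s2 in zip(idx1, idx2, q1, q2):
        p = bvn_cdf(s1 * a, s2 * b, s1 * s2 * rho)
        total += np.log(max(p, 1e-300))  # guard against log(0)
    return total


# Simulate data from the latent-variable model (illustrative values only).
rng = np.random.default_rng(0)
n = 200
X1 = np.column_stack([np.ones(n), rng.normal(size=n)])
X2 = np.column_stack([np.ones(n), rng.normal(size=n)])
beta1_true = np.array([0.2, 1.0])
beta2_true = np.array([-0.3, 0.8])
rho_true = 0.5

eps = rng.multivariate_normal(
    [0.0, 0.0], [[1.0, rho_true], [rho_true, 1.0]], size=n
)
y1 = (X1 @ beta1_true + eps[:, 0] > 0).astype(int)
y2 = (X2 @ beta2_true + eps[:, 1] > 0).astype(int)

ll_true = bivariate_probit_loglik(
    beta1_true, beta2_true, rho_true, X1, X2, y1, y2
)
ll_zero_rho = bivariate_probit_loglik(
    beta1_true, beta2_true, 0.0, X1, X2, y1, y2
)
print("log-likelihood at the simulated parameters:", ll_true)
print("log-likelihood with rho set to 0:", ll_zero_rho)
```

To fit the model, the negative of this function could be passed to a numerical optimizer, with ρ reparameterized (for example through tanh) so that it stays inside (-1, 1). The per-observation loop keeps the sketch easy to read but would be slow on large samples.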

Multivariate probit

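For the general case, $\mathbf{y}_i = (y_1, \ldots, y_J)$, $i = 1, \ldots, N$, where we can take $j = 1, \ldots, J$ as choices and $i$ as individuals or observations, the probability of observing choice $\mathbf{y}_i$ is

$$\Pr(\mathbf{y}_i \mid \mathbf{X}_i\boldsymbol{\beta}, \Sigma) = \int_{A_J} \cdots \int_{A_1} f_N(\mathbf{y}_i^* \mid \mathbf{X}_i\boldsymbol{\beta}, \Sigma) \, dy_1^* \cdots dy_J^* = \int \mathbb{1}_{\mathbf{y}_i^* \in A} \, f_N(\mathbf{y}_i^* \mid \mathbf{X}_i\boldsymbol{\beta}, \Sigma) \, d\mathbf{y}_i^*,$$

where $f_N$ is the multivariate normal density with mean $\mathbf{X}_i\boldsymbol{\beta}$ and covariance $\Sigma$, $A = A_1 \times \cdots \times A_J$, and

$$A_j = \begin{cases} (-\infty, 0] & \text{if } y_j = 0, \\ (0, \infty) & \text{if } y_j = 1. \end{cases}$$

The log-likelihood function in this case would be

$$\sum_{i=1}^{N} \log \Pr(\mathbf{y}_i \mid \mathbf{X}_i\boldsymbol{\beta}, \Sigma).$$

Except for $J \le 2$, there is typically no closed-form solution to the integrals in the log-likelihood equation. Instead, simulation methods can be used to simulate the choice probabilities. Methods using importance sampling include the GHK algorithm, [3] AR (accept-reject), and Stern's method. There are also MCMC approaches to this problem, including CRB (Chib's method with Rao–Blackwellization), CRT (Chib, Ritter, Tanner), ARK (accept-reject kernel), and ASK (adaptive sampling kernel). [4] A variational approach scaling to large datasets is proposed in Probit-LMM. [5]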
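As an illustration of how a choice probability can be simulated, the following is a minimal sketch of a GHK-style simulator, assuming NumPy and SciPy. The function name `ghk_probability`, its arguments, and the default number of draws are hypothetical choices made for this example; it is not taken from the cited references. The idea is to write the latent vector as the mean plus a Cholesky factor times independent standard normal shocks, draw each shock from the truncated normal range implied by the observed outcome, and average the product of the truncation probabilities over draws.

```python
import numpy as np
from scipy.stats import norm


def ghk_probability(y, mu, sigma, n_draws=2000, seed=0):
    """Simulate Pr(y | mu, sigma) for a multivariate probit outcome y.

    y     : sequence of 0/1 outcomes, length J
    mu    : latent means (the linear index X_i * beta), length J
    sigma : J x J positive-definite covariance of the latent errors
    """
    y = np.asarray(y)
    mu = np.asarray(mu, dtype=float)
    J = len(y)
    L = np.linalg.cholesky(np.asarray(sigma, dtype=float))  # lower triangular
    rng = np.random.default_rng(seed)

    eta = np.zeros((n_draws, J))   # standard normal shocks, one row per draw
    weight = np.ones(n_draws)      # running product of truncation probabilities

    for j in range(J):
        # y*_j = mu_j + sum_{k<j} L[j,k] eta_k + L[j,j] eta_j is positive
        # exactly when eta_j exceeds this threshold.
        t = -(mu[j] + eta[:, :j] @ L[j, :j]) / L[j, j]
        cdf_t = norm.cdf(t)
        if y[j] == 1:
            lo, hi = cdf_t, np.ones(n_draws)    # eta_j in (t, inf)
        else:
            lo, hi = np.zeros(n_draws), cdf_t   # eta_j in (-inf, t]
        weight *= hi - lo
        # Inverse-CDF draw from the truncated standard normal.
        u = rng.uniform(size=n_draws)
        p = np.clip(lo + u * (hi - lo), 1e-12, 1.0 - 1e-12)
        eta[:, j] = norm.ppf(p)

    return weight.mean()


# Bivariate check: with zero means and correlation 0.5, the closed form
# 1/4 + arcsin(0.5) / (2 * pi) equals 1/3.
estimate = ghk_probability([1, 1], [0.0, 0.0], [[1.0, 0.5], [0.5, 1.0]])
print("GHK estimate:", estimate, "closed form:", 1.0 / 3.0)
```

Summing the logarithm of such estimates over observations gives a simulated log-likelihood. The simulated probability is a smooth function of the parameters when the uniform draws are held fixed, which is one reason this kind of simulator is convenient inside an optimizer.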

References

  1. Ashford, J.R.; Sowden, R.R. (September 1970). "Multivariate Probit Analysis". Biometrics. 26 (3): 535–546. doi:10.2307/2529107. JSTOR 2529107. PMID 5480663.
  2. Chib, Siddhartha; Greenberg, Edward (June 1998). "Analysis of multivariate probit models". Biometrika. 85 (2): 347–361. CiteSeerX 10.1.1.198.8541. doi:10.1093/biomet/85.2.347.
  3. Hajivassiliou, Vassilis (1994). "Chapter 40 Classical estimation methods for LDV models using simulation". Handbook of Econometrics. 4: 2383–2441. doi:10.1016/S1573-4412(05)80009-1. ISBN 9780444887665. S2CID 13232902.
  4. Jeliazkov, Ivan (2010). "MCMC perspectives on simulated likelihood estimation". Advances in Econometrics. 26: 3–39. doi:10.1108/S0731-9053(2010)0000026005. ISBN 978-0-85724-149-8.
  5. Mandt, Stephan; Wenzel, Florian; Nakajima, Shinichi; Cunningham, John; Lippert, Christoph; Kloft, Marius (2017). "Sparse probit linear mixed model". Machine Learning. 106 (9–10): 1–22. arXiv:1507.04777. doi:10.1007/s10994-017-5652-6. S2CID 11588006.

Further reading