# Simultaneous equations model

Simultaneous equations models are a type of statistical model in which the dependent variables are functions of other dependent variables, rather than just independent variables. This means some of the explanatory variables are jointly determined with the dependent variable, which in economics is usually the consequence of some underlying equilibrium mechanism. Take the typical supply and demand model: while one would typically model the quantity supplied and demanded as a function of the price set by the market, the reverse is also possible, with producers observing the quantity that consumers demand and then setting the price.

Simultaneity poses challenges for the estimation of the statistical parameters of interest, because the Gauss–Markov assumption of strict exogeneity of the regressors is violated. And while it would be natural to estimate all simultaneous equations at once, this often leads to a computationally costly non-linear optimization problem even for the simplest system of linear equations. This situation prompted the development, spearheaded by the Cowles Commission in the 1940s and 1950s, of various techniques that estimate each equation in the model seriatim, most notably limited information maximum likelihood and two-stage least squares.

## Structural and reduced form

Suppose there are m regression equations of the form

$y_{it}=y_{-i,t}'\gamma _{i}+x_{it}'\;\!\beta _{i}+u_{it},\quad i=1,\ldots ,m,$

where $i$ is the equation number, and $t = 1, \ldots, T$ is the observation index. In these equations $x_{it}$ is the $k_i \times 1$ vector of exogenous variables, $y_{it}$ is the dependent variable, $y_{-i,t}$ is the $n_i \times 1$ vector of all other endogenous variables which enter the $i$th equation on the right-hand side, and $u_{it}$ are the error terms. The “$-i$” notation indicates that the vector $y_{-i,t}$ may contain any of the $y$’s except for $y_{it}$ (since it is already present on the left-hand side). The regression coefficients $\beta_i$ and $\gamma_i$ are of dimensions $k_i \times 1$ and $n_i \times 1$ respectively. Vertically stacking the $T$ observations corresponding to the $i$th equation, we can write each equation in vector form as

$y_{i}=Y_{-i}\gamma _{i}+X_{i}\beta _{i}+u_{i},\quad i=1,\ldots ,m,$

where $y_i$ and $u_i$ are $T \times 1$ vectors, $X_i$ is a $T \times k_i$ matrix of exogenous regressors, and $Y_{-i}$ is a $T \times n_i$ matrix of endogenous regressors on the right-hand side of the $i$th equation. Finally, we can move all endogenous variables to the left-hand side and write the $m$ equations jointly in vector form as

$Y\Gamma =X\mathrm {B} +U.\,$

This representation is known as the structural form. In this equation $Y = [y_1\; y_2\; \ldots\; y_m]$ is the $T \times m$ matrix of dependent variables. Each of the matrices $Y_{-i}$ is in fact an $n_i$-columned submatrix of this $Y$. The $m \times m$ matrix $\Gamma$, which describes the relation between the dependent variables, has a complicated structure. It has ones on the diagonal, and all other elements of each column $i$ are either the components of the vector $-\gamma_i$ or zeros, depending on which columns of $Y$ were included in the matrix $Y_{-i}$. The $T \times k$ matrix $X$ contains all exogenous regressors from all equations, but without repetitions (that is, matrix $X$ should be of full rank). Thus, each $X_i$ is a $k_i$-columned submatrix of $X$. Matrix $\mathrm{B}$ has size $k \times m$, and each of its columns consists of the components of vectors $\beta_i$ and zeros, depending on which of the regressors from $X$ were included or excluded from $X_i$. Finally, $U = [u_1\; u_2\; \ldots\; u_m]$ is a $T \times m$ matrix of the error terms.

Postmultiplying the structural equation by $\Gamma^{-1}$, the system can be written in the reduced form as

$Y=X\mathrm {B} \Gamma ^{-1}+U\Gamma ^{-1}=X\Pi +V.\,$

This is already a simple general linear model, and it can be estimated, for example, by ordinary least squares. Unfortunately, the task of decomposing the estimated matrix ${\hat {\Pi }}$ into the individual factors $\mathrm{B}$ and $\Gamma^{-1}$ is quite complicated, and therefore the reduced form is more suitable for prediction than for inference.
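To make the two representations concrete, here is a minimal NumPy sketch that builds a small structural system, solves it for the reduced form, and estimates $\Pi$ by ordinary least squares. The two-equation system, its coefficient values (`g1`, `g2`, `b1`, `b2`), and the simulated data are illustrative assumptions, not part of the text above.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 500

# Hypothetical two-equation system (illustrative values only):
#   y1 = g1*y2 + b1*x1 + u1
#   y2 = g2*y1 + b2*x2 + u2
g1, g2, b1, b2 = 0.5, -0.3, 1.0, 2.0

# Structural form Y Gamma = X B + U: column i of Gamma has a one on the
# diagonal and -gamma_i in the rows of the included endogenous variables.
Gamma = np.array([[1.0, -g2],
                  [-g1, 1.0]])
B = np.array([[b1, 0.0],
              [0.0, b2]])

X = rng.normal(size=(T, 2))   # exogenous regressors
U = rng.normal(size=(T, 2))   # structural errors

# Reduced form: Y = X Pi + V with Pi = B Gamma^{-1} and V = U Gamma^{-1}.
Gamma_inv = np.linalg.inv(Gamma)
Pi = B @ Gamma_inv
Y = (X @ B + U) @ Gamma_inv

# OLS on the reduced form, one column of Pi per endogenous variable.
Pi_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
print(np.round(Pi, 3))
print(np.round(Pi_hat, 3))
```

The estimate `Pi_hat` recovers the product $\mathrm{B}\Gamma^{-1}$ but not its two factors separately, which is why identifying restrictions are needed before the structural coefficients can be read off.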

### Assumptions

Firstly, the rank of the matrix $X$ of exogenous regressors must be equal to $k$, both in finite samples and in the limit as $T \to \infty$ (this latter requirement means that in the limit the expression ${\frac {1}{T}}X'\!X$ should converge to a nondegenerate $k \times k$ matrix). Matrix $\Gamma$ is also assumed to be non-degenerate.

Secondly, error terms are assumed to be serially independent and identically distributed. That is, if the $t$th row of matrix $U$ is denoted by $u_{(t)}$, then the sequence of vectors $\{u_{(t)}\}$ should be iid, with zero mean and some covariance matrix $\Sigma$ (which is unknown). In particular, this implies that $E[U] = 0$ and $E[U'U] = T\,\Sigma$.

Lastly, assumptions are required for identification.

## Identification

The identification conditions require that the system of linear equations be solvable for the unknown parameters.

More specifically, the order condition, a necessary condition for identification, is that for each equation $k_i + n_i \leq k$, which can be phrased as “the number of excluded exogenous variables is greater than or equal to the number of included endogenous variables”.

The rank condition, a stronger condition which is necessary and sufficient, is that the rank of $\Pi_{i0}$ equals $n_i$, where $\Pi_{i0}$ is a $(k - k_i) \times n_i$ matrix which is obtained from $\Pi$ by crossing out those columns which correspond to the excluded endogenous variables, and those rows which correspond to the included exogenous variables.
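Both conditions can be checked mechanically once $\Pi$ is known or estimated. The sketch below is one way to do so, assuming that $\Pi_{i0}$ keeps the rows of the excluded exogenous variables and the columns of the $n_i$ endogenous variables on the right-hand side of equation $i$. The function names and the example system are hypothetical.

```python
import numpy as np

def order_condition(k, k_i, n_i):
    """Necessary condition: k_i + n_i <= k."""
    return k_i + n_i <= k

def rank_condition(Pi, incl_exog, incl_endog):
    """Check rank(Pi_i0) == n_i for one equation.

    Pi         : k x m reduced-form coefficient matrix
    incl_exog  : row indices of exogenous variables included in the equation
    incl_endog : column indices of right-hand-side endogenous variables
    """
    k = Pi.shape[0]
    excl_exog = [r for r in range(k) if r not in set(incl_exog)]
    if len(incl_endog) == 0:
        return True            # nothing endogenous to identify
    if len(excl_exog) == 0:
        return False           # no excluded exogenous variable is available
    Pi_i0 = Pi[np.ix_(excl_exog, incl_endog)]
    return bool(np.linalg.matrix_rank(Pi_i0) == len(incl_endog))

# Hypothetical system: y1 = g1*y2 + b1*x1 + u1,  y2 = g2*y1 + b2*x2 + u2
g1, g2, b1 = 0.5, -0.3, 1.0
Gamma = np.array([[1.0, -g2], [-g1, 1.0]])

def reduced_form(b2):
    B = np.array([[b1, 0.0], [0.0, b2]])
    return B @ np.linalg.inv(Gamma)

# Equation 1 includes x1 (row 0) and y2 (column 1); x2 is excluded.
eq1_identified = rank_condition(reduced_form(2.0), [0], [1])
# If x2 has no effect anywhere (b2 = 0), the excluded variable is useless.
eq1_unidentified = rank_condition(reduced_form(0.0), [0], [1])
print(order_condition(2, 1, 1), eq1_identified, eq1_unidentified)
```

The second case illustrates why the order condition is only necessary: the count of excluded variables is the same in both cases, yet an excluded variable with a zero reduced-form coefficient carries no identifying information.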

### Using cross-equation restrictions to achieve identification

In simultaneous equations models, the most common method to achieve identification is by imposing within-equation parameter restrictions. Yet identification is also possible using cross-equation restrictions.

To illustrate how cross-equation restrictions can be used for identification, consider the following example from Wooldridge:

${\begin{aligned}y_{1}&=\gamma _{12}y_{2}+\delta _{11}z_{1}+\delta _{12}z_{2}+\delta _{13}z_{3}+u_{1}\\y_{2}&=\gamma _{21}y_{1}+\delta _{21}z_{1}+\delta _{22}z_{2}+u_{2}\end{aligned}}$

where the $z$'s are uncorrelated with the $u$'s and the $y$'s are endogenous variables. Without further restrictions, the first equation is not identified because there is no excluded exogenous variable. The second equation is just identified if $\delta_{13} \neq 0$, which is assumed to be true for the rest of the discussion.

Now we impose the cross-equation restriction $\delta_{12} = \delta_{22}$. Since the second equation is identified, we can treat $\delta_{12}$ as known for the purpose of identification. Then, the first equation becomes:

$y_{1}-\delta _{12}z_{2}=\gamma _{12}y_{2}+\delta _{11}z_{1}+\delta _{13}z_{3}+u_{1}$

Then, we can use $(z_1, z_2, z_3)$ as instruments to estimate the coefficients in the above equation, since there is one endogenous variable ($y_2$) and one excluded exogenous variable ($z_2$) on the right-hand side. Therefore, cross-equation restrictions in place of within-equation restrictions can achieve identification.

## Estimation

### Two-stage least squares (2SLS)

The simplest and most common estimation method for the simultaneous equations model is the so-called two-stage least squares method, developed independently by Theil (1953) and Basmann (1957). It is an equation-by-equation technique, where the endogenous regressors on the right-hand side of each equation are instrumented with the regressors $X$ from all other equations. The method is called “two-stage” because it conducts estimation in two steps:

• Step 1: Regress $Y_{-i}$ on $X$ and obtain the predicted values ${\hat {Y}}_{\!-i}$;
• Step 2: Estimate $\gamma_i$, $\beta_i$ by the ordinary least squares regression of $y_i$ on ${\hat {Y}}_{\!-i}$ and $X_i$.

If the ith equation in the model is written as

$y_{i}={\begin{pmatrix}Y_{-i}&X_{i}\end{pmatrix}}{\begin{pmatrix}\gamma _{i}\\\beta _{i}\end{pmatrix}}+u_{i}\equiv Z_{i}\delta _{i}+u_{i},$

where $Z_i$ is a $T \times (n_i + k_i)$ matrix of both endogenous and exogenous regressors in the $i$th equation, and $\delta_i$ is an $(n_i + k_i)$-dimensional vector of regression coefficients, then the 2SLS estimator of $\delta_i$ will be given by

${\hat {\delta }}_{i}={\big (}{\hat {Z}}'_{i}{\hat {Z}}_{i}{\big )}^{-1}{\hat {Z}}'_{i}y_{i}={\big (}Z'_{i}PZ_{i}{\big )}^{-1}Z'_{i}Py_{i},$

where $P = X(X'X)^{-1}X'$ is the projection matrix onto the linear space spanned by the exogenous regressors $X$.
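The two steps and the closed-form expression can be compared numerically. The following is a minimal sketch assuming a hypothetical two-equation system with simulated data. The function name `two_stage_least_squares` and all coefficient values are illustrative choices, not taken from the text.

```python
import numpy as np

def two_stage_least_squares(y_i, Y_minus_i, X_i, X):
    """Return the 2SLS estimate of delta_i = (gamma_i, beta_i)."""
    # Step 1: regress the endogenous regressors on all exogenous variables.
    coef, *_ = np.linalg.lstsq(X, Y_minus_i, rcond=None)
    Y_hat = X @ coef
    # Step 2: OLS of y_i on the fitted values and the included exogenous X_i.
    Z_hat = np.column_stack([Y_hat, X_i])
    delta, *_ = np.linalg.lstsq(Z_hat, y_i, rcond=None)
    return delta

# Simulated data from a hypothetical system (values are assumptions):
#   y1 = g1*y2 + b1*x1 + u1,   y2 = g2*y1 + b2*x2 + u2
rng = np.random.default_rng(1)
T = 1000
g1, g2, b1, b2 = 0.5, -0.3, 1.0, 2.0
Gamma = np.array([[1.0, -g2], [-g1, 1.0]])
B = np.array([[b1, 0.0], [0.0, b2]])
X = rng.normal(size=(T, 2))
U = rng.normal(size=(T, 2))
Y = (X @ B + U) @ np.linalg.inv(Gamma)

# Equation 1: y1 on y2 (endogenous) and x1 (included exogenous).
delta_two_step = two_stage_least_squares(Y[:, 0], Y[:, [1]], X[:, [0]], X)

# Closed form (Z'PZ)^{-1} Z'Py with P = X (X'X)^{-1} X'.
Z = np.column_stack([Y[:, [1]], X[:, [0]]])
P = X @ np.linalg.solve(X.T @ X, X.T)
delta_closed = np.linalg.solve(Z.T @ P @ Z, Z.T @ P @ Y[:, 0])

print(delta_two_step, delta_closed)
```

Forming the $T \times T$ matrix `P` explicitly is convenient for checking the formula but wasteful for large samples; the two-step version avoids it.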

### Indirect least squares

Indirect least squares is an approach in econometrics where the coefficients in a simultaneous equations model are estimated from the reduced form model using ordinary least squares. For this, the structural system of equations is first transformed into the reduced form. Once the reduced-form coefficients are estimated, they are mapped back into the coefficients of the structural form.
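As an illustration, consider a just-identified equation, where the mapping back from the reduced form has a unique solution. The sketch below assumes a hypothetical system $y_1 = \gamma_1 y_2 + \beta_1 x_1 + u_1$, $y_2 = \gamma_2 y_1 + \beta_2 x_2 + u_2$, for which the first column of $\Pi\Gamma = \mathrm{B}$ gives $\gamma_1 = \pi_{21}/\pi_{22}$ and $\beta_1 = \pi_{11} - \gamma_1\pi_{12}$. All names and values are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(3)
T = 1000
g1, g2, b1, b2 = 0.5, -0.3, 1.0, 2.0   # illustrative true values
Gamma = np.array([[1.0, -g2], [-g1, 1.0]])
B = np.array([[b1, 0.0], [0.0, b2]])
X = rng.normal(size=(T, 2))
U = rng.normal(size=(T, 2))
Y = (X @ B + U) @ np.linalg.inv(Gamma)

# Step 1: estimate the reduced form Y = X Pi + V by OLS.
Pi_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)

# Step 2: solve Pi[:, 0] - gamma_1 * Pi[:, 1] = (beta_1, 0)' for equation 1.
# The row of the excluded variable x2 pins down gamma_1 ...
gamma1_ils = Pi_hat[1, 0] / Pi_hat[1, 1]
# ... and the row of the included variable x1 then gives beta_1.
beta1_ils = Pi_hat[0, 0] - gamma1_ils * Pi_hat[0, 1]

print(gamma1_ils, beta1_ils)
```

In an over-identified equation there are more such restrictions than unknowns, so this back-substitution no longer yields a unique answer, which is one reason 2SLS is usually preferred in practice.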

### Limited information maximum likelihood (LIML)

The “limited information” maximum likelihood method was suggested by M. A. Girshick in 1947, and formalized by T. W. Anderson and H. Rubin in 1949. It is used when one is interested in estimating a single structural equation at a time (hence the name limited information), say for equation $i$:

$y_{i}=Y_{-i}\gamma _{i}+X_{i}\beta _{i}+u_{i}\equiv Z_{i}\delta _{i}+u_{i}$

The structural equations for the remaining endogenous variables $Y_{-i}$ are not specified, and they are given in their reduced form:

$Y_{-i}=X\Pi +U_{-i}$

Notation in this context is different from that of the simple IV case. One has:

• $Y_{-i}$ : The endogenous variable(s).
• $X_{i}$ : The exogenous variable(s) included in equation $i$.
• $X$ : The instrument(s) (often denoted $Z$ ).

The explicit formula for the LIML is: 

${\hat {\delta }}_{i}={\Big (}Z'_{i}(I-\lambda M)Z_{i}{\Big )}^{\!-1}Z'_{i}(I-\lambda M)y_{i},$

where $M = I - X(X'X)^{-1}X'$, and $\lambda$ is the smallest characteristic root of the matrix:

${\Big (}{\begin{bmatrix}y_{i}&Y_{-i}\end{bmatrix}}'M_{i}{\begin{bmatrix}y_{i}&Y_{-i}\end{bmatrix}}{\Big )}{\Big (}{\begin{bmatrix}y_{i}&Y_{-i}\end{bmatrix}}'M{\begin{bmatrix}y_{i}&Y_{-i}\end{bmatrix}}{\Big )}^{\!-1}$

where, in a similar way, $M_i = I - X_i(X_i'X_i)^{-1}X_i'$.

In other words, $\lambda$ is the smallest solution of the generalized eigenvalue problem, see Theil (1971, p. 503):

${\Big |}{\begin{bmatrix}y_{i}&Y_{-i}\end{bmatrix}}'M_{i}{\begin{bmatrix}y_{i}&Y_{-i}\end{bmatrix}}-\lambda {\begin{bmatrix}y_{i}&Y_{-i}\end{bmatrix}}'M{\begin{bmatrix}y_{i}&Y_{-i}\end{bmatrix}}{\Big |}=0$

#### K class estimators

The LIML is a special case of the K-class estimators: 

${\hat {\delta }}={\Big (}Z'(I-\kappa M)Z{\Big )}^{\!-1}Z'(I-\kappa M)y,$

with:

• $\delta ={\begin{bmatrix}\beta _{i}\\\gamma _{i}\end{bmatrix}}$
• $Z={\begin{bmatrix}X_{i}&Y_{-i}\end{bmatrix}}$

Several estimators belong to this class:

• $\kappa = 0$: OLS
• $\kappa = 1$: 2SLS. Indeed, in this case $I-\kappa M=I-M=P$, the usual projection matrix of 2SLS
• $\kappa = \lambda$: LIML
• $\kappa = \lambda - \alpha/(n-K)$: Fuller (1977) estimator. Here $K$ represents the number of instruments, $n$ the sample size, and $\alpha$ a positive constant to specify. A value of $\alpha = 1$ will yield an estimator that is approximately unbiased.
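The K-class family lends itself to a single routine parameterised by $\kappa$. The sketch below is a direct, unoptimised transcription of the formulas in this section, assuming Fuller's correction takes the form $\kappa = \lambda - \alpha/(n-K)$. The helper names (`annihilator`, `liml_lambda`, `k_class`) and the simulated over-identified equation are hypothetical.

```python
import numpy as np

def annihilator(A):
    """Residual-maker matrix I - A (A'A)^{-1} A'."""
    return np.eye(A.shape[0]) - A @ np.linalg.solve(A.T @ A, A.T)

def liml_lambda(y_i, Y_minus_i, X_i, X):
    """Smallest root of |W'M_i W - lambda W'M W| = 0, with W = [y_i, Y_{-i}]."""
    W = np.column_stack([y_i, Y_minus_i])
    A = W.T @ annihilator(X_i) @ W
    Bm = W.T @ annihilator(X) @ W
    # Eigenvalues of Bm^{-1} A solve the generalized problem A v = lambda Bm v.
    roots = np.linalg.eigvals(np.linalg.solve(Bm, A))
    return float(np.min(roots.real))

def k_class(y_i, Y_minus_i, X_i, X, kappa):
    """K-class estimate of (beta_i, gamma_i) with Z = [X_i, Y_{-i}]."""
    Z = np.column_stack([X_i, Y_minus_i])
    W = np.eye(len(y_i)) - kappa * annihilator(X)
    return np.linalg.solve(Z.T @ W @ Z, Z.T @ W @ y_i)

# Simulated over-identified equation (all values are assumptions):
#   y1 = 1.0*x1 + 0.5*y2 + u,  with y2 driven by x1, x2, x3.
rng = np.random.default_rng(2)
T = 200
X = rng.normal(size=(T, 3))
v = rng.normal(size=T)
u = 0.5 * v + rng.normal(size=T)          # u correlated with v: y2 endogenous
y2 = X @ np.array([0.5, 1.0, -1.0]) + v
y1 = 1.0 * X[:, 0] + 0.5 * y2 + u
X_1, Y_m1 = X[:, [0]], y2[:, None]

lam = liml_lambda(y1, Y_m1, X_1, X)
d_ols = k_class(y1, Y_m1, X_1, X, 0.0)
d_2sls = k_class(y1, Y_m1, X_1, X, 1.0)
d_liml = k_class(y1, Y_m1, X_1, X, lam)
alpha = 1.0
d_fuller = k_class(y1, Y_m1, X_1, X, lam - alpha / (T - X.shape[1]))
print(lam, d_ols, d_2sls, d_liml, d_fuller)
```

Because the column space of $X_i$ is contained in that of $X$, the root $\lambda$ is never below one, and it equals one when the equation is just identified, in which case LIML and 2SLS coincide.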

### Three-stage least squares (3SLS)

The three-stage least squares estimator was introduced by Zellner & Theil (1962). It can be seen as a special case of multi-equation GMM where the set of instrumental variables is common to all equations. If all regressors are in fact predetermined, then 3SLS reduces to seemingly unrelated regressions (SUR). Thus it may also be seen as a combination of two-stage least squares (2SLS) with SUR.
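The combination of 2SLS with SUR can be sketched in code. The version below uses the standard textbook form of the estimator, which is not spelled out in the text above, so it should be read as an assumption: project the regressors of every equation on $X$, estimate each equation by 2SLS, use the residuals to estimate the error covariance $\Sigma$, and then apply feasible GLS to the stacked system. The function name and the simulated system are hypothetical.

```python
import numpy as np

def three_stage_least_squares(ys, Zs, X):
    """ys: list of T-vectors; Zs: list of T x p_i regressor matrices;
    X: T x k matrix of all exogenous variables (the instruments)."""
    T, m = X.shape[0], len(ys)
    # Stage 1: fitted values of every regressor matrix from X.
    Zh = [X @ np.linalg.lstsq(X, Z, rcond=None)[0] for Z in Zs]
    # Stage 2: 2SLS per equation; residuals use the original regressors.
    d2 = [np.linalg.lstsq(Zh_i, y, rcond=None)[0] for Zh_i, y in zip(Zh, ys)]
    resid = np.column_stack([y - Z @ d for y, Z, d in zip(ys, Zs, d2)])
    S_inv = np.linalg.inv(resid.T @ resid / T)
    # Stage 3: GLS on the stacked system, assembled block by block so the
    # (mT x mT) weighting matrix is never formed explicitly.
    A = np.block([[S_inv[i, j] * Zh[i].T @ Zh[j] for j in range(m)]
                  for i in range(m)])
    b = np.concatenate([sum(S_inv[i, j] * Zh[i].T @ ys[j] for j in range(m))
                        for i in range(m)])
    d3 = np.linalg.solve(A, b)
    cuts = np.cumsum([Z.shape[1] for Z in Zs])[:-1]
    return d2, np.split(d3, cuts)

# Simulated system (values are assumptions):
#   y1 = g1*y2 + b1*x1 + u1          (over-identified)
#   y2 = g2*y1 + b2*x2 + b3*x3 + u2  (just identified)
rng = np.random.default_rng(4)
T = 2000
g1, g2, b1, b2, b3 = 0.5, -0.3, 1.0, 2.0, -1.0
Gamma = np.array([[1.0, -g2], [-g1, 1.0]])
B = np.array([[b1, 0.0], [0.0, b2], [0.0, b3]])
X = rng.normal(size=(T, 3))
U = rng.normal(size=(T, 2)) @ np.array([[1.0, 0.6], [0.0, 0.8]])  # correlated
Y = (X @ B + U) @ np.linalg.inv(Gamma)

ys = [Y[:, 0], Y[:, 1]]
Zs = [np.column_stack([Y[:, 1], X[:, 0]]),
      np.column_stack([Y[:, 0], X[:, 1], X[:, 2]])]
d2sls, d3sls = three_stage_least_squares(ys, Zs, X)
print(d2sls, d3sls)
```

The third stage only changes the estimates when the errors are correlated across equations and at least one equation is over-identified; otherwise it reproduces the 2SLS results.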

## Applications in social science

Across fields and disciplines, simultaneous equation models are applied to various observational phenomena that are assumed to be reciprocally causal. The classic example is supply and demand in economics. In other disciplines there are examples such as candidate evaluations and party identification or public opinion and social policy in political science; road investment and travel demand in geography; and educational attainment and parenthood entry in sociology or demography.

The simultaneous equation model requires a theory of reciprocal causality that includes special features if the causal effects are to be estimated as simultaneous feedback. This is in contrast to one-sided 'blocks' of an equation, where a researcher is interested in the causal effect of X on Y while holding the causal effect of Y on X constant, or where the researcher knows the exact amount of time it takes for each causal effect to take place, i.e., the length of the causal lags. Instead of lagged effects, simultaneous feedback means estimating the simultaneous and perpetual impact of X and Y on each other. This requires a theory that causal effects are simultaneous in time, or so complex that they appear to behave simultaneously; a common example is the moods of roommates. To estimate simultaneous feedback models a theory of equilibrium is also necessary: that X and Y are in relatively steady states or are part of a system (society, market, classroom) that is in a relatively stable state.

## References

1. Martin, Vance; Hurn, Stan; Harris, David (2013). Econometric Modelling with Time Series. Cambridge University Press. p. 159. ISBN 978-0-521-19660-4.
2. Maddala, G. S.; Lahiri, Kajal (2009). Introduction to Econometrics (Fourth ed.). Wiley. pp. 355–357. ISBN 978-0-470-01512-4.
3. Quandt, Richard E. (1983). "Computational Problems and Methods". In Griliches, Z.; Intriligator, M. D. (eds.). Handbook of Econometrics. Volume I. North-Holland. pp. 699–764. ISBN 0-444-86185-8.
4. Christ, Carl F. (1994). "The Cowles Commission's Contributions to Econometrics at Chicago, 1939–1955". Journal of Economic Literature. 32 (1): 30–59. JSTOR 2728422.
5. Johnston, J. (1971). "Simultaneous-equation Methods: Estimation". Econometric Methods (Second ed.). New York: McGraw-Hill. pp. 376–423. ISBN 0-07-032679-7.
6. Wooldridge, J. M. Econometric Analysis of Cross Section and Panel Data. Cambridge, Mass.: MIT Press.
7. Greene, William H. (2002). Econometric Analysis (5th ed.). Prentice Hall. pp. 398–99. ISBN 0-13-066189-9.
8. Basmann, R. L. (1957). "A generalized classical method of linear estimation of coefficients in a structural equation". Econometrica. 25 (1): 77–83. doi:10.2307/1907743. JSTOR 1907743.
9. Theil, Henri (1971). . New York: John Wiley.
10. Park, S-B. (1974). "On Indirect Least Squares Estimation of a Simultaneous Equation System". The Canadian Journal of Statistics / La Revue Canadienne de Statistique. 2 (1): 75–82. JSTOR 3314964.
11. Vajda, S.; Valko, P.; Godfrey, K. R. (1987). "Direct and indirect least squares methods in continuous-time parameter estimation". Automatica. 23 (6): 707–718. doi:10.1016/0005-1098(87)90027-6.
12. First application by Girshick, M. A.; Haavelmo, Trygve (1947). "Statistical Analysis of the Demand for Food: Examples of Simultaneous Estimation of Structural Equations". Econometrica. 15 (2): 79–110. doi:10.2307/1907066. JSTOR 1907066.
13. Anderson, T. W.; Rubin, H. (1949). "Estimator of the parameters of a single equation in a complete system of stochastic equations". Annals of Mathematical Statistics. 20 (1): 46–63. JSTOR 2236803.
14. Amemiya, Takeshi (1985). . Cambridge, Massachusetts: Harvard University Press. p. 235. ISBN 0-674-00560-0.
15. Davidson, Russell; MacKinnon, James G. (1993). Estimation and Inference in Econometrics. Oxford University Press. p. 649. ISBN 0-19-506011-3.
16. Fuller, Wayne (1977). "Some Properties of a Modification of the Limited Information Estimator". Econometrica. 45 (4): 939–953. doi:10.2307/1912683. JSTOR 1912683.
17. Zellner, Arnold; Theil, Henri (1962). "Three-stage least squares: simultaneous estimation of simultaneous equations". Econometrica. 30 (1): 54–78. doi:10.2307/1911287. JSTOR 1911287.
18. Kmenta, Jan (1986). "System Methods of Estimation". Elements of Econometrics (Second ed.). New York: Macmillan. pp. 695–701.
19. Hayashi, Fumio (2000). "Multiple-Equation GMM". Econometrics. Princeton University Press. pp. 276–279.
20. Page, Benjamin I.; Jones, Calvin C. (1979-12-01). "Reciprocal Effects of Policy Preferences, Party Loyalties and the Vote". American Political Science Review. 73 (4): 1071–1089. doi:10.2307/1953990. ISSN 0003-0554. JSTOR 1953990.
21. Wlezien, Christopher (1995-01-01). "The Public as Thermostat: Dynamics of Preferences for Spending". American Journal of Political Science. 39 (4): 981–1000. doi:10.2307/2111666. JSTOR 2111666.
22. Breznau, Nate (2016-07-01). "Positive Returns and Equilibrium: Simultaneous Feedback Between Public Opinion and Social Policy". Policy Studies Journal. 45 (4): 583–612. doi:10.1111/psj.12171. ISSN 1541-0072.
23. Xie, F.; Levinson, D. (2010-05-01). "How streetcars shaped suburbanization: a Granger causality analysis of land use and transit in the Twin Cities". Journal of Economic Geography. 10 (3): 453–470. doi:10.1093/jeg/lbp031. ISSN 1468-2702.
24. Marini, Margaret Mooney (1984-01-01). "Women's Educational Attainment and the Timing of Entry into Parenthood". American Sociological Review. 49 (4): 491–511. doi:10.2307/2095464. JSTOR 2095464.
25. Wong, Chi-Sum; Law, Kenneth S. (1999-01-01). "Testing Reciprocal Relations by Nonrecursive Structural Equation Models Using Cross-Sectional Data". Organizational Research Methods. 2 (1): 69–87. doi:10.1177/109442819921005. ISSN 1094-4281.
26. 2013. "Reverse Arrow Dynamics: Feedback Loops and Formative Measurement." In Structural Equation Modeling: A Second Course, edited by Gregory R. Hancock and Ralph O. Mueller, 2nd ed., 41–79. Charlotte, NC: Information Age Publishing.