
**Vector autoregression** (**VAR**) is a statistical model used to capture the relationship between multiple quantities as they change over time. VAR is a type of stochastic process model. VAR models generalize the single-variable (univariate) autoregressive model by allowing for multivariate time series. VAR models are often used in economics and the natural sciences.

- Specification
- Definition
- Order of integration of the variables
- Concise matrix notation
- Example
- Writing VAR(p) as VAR(1)
- Structural vs. reduced form
- Structural VAR
- Reduced-form VAR
- Estimation
- Estimation of the regression parameters
- Estimation of the covariance matrix of the errors
- Estimation of the estimator's covariance matrix
- Degrees of freedom
- Interpretation of estimated model
- Impulse response
- Forecasting using an estimated VAR model
- Applications
- Software
- See also
- Notes
- Further reading

Like the autoregressive model, each variable has an equation modelling its evolution over time. This equation includes the variable's lagged (past) values, the lagged values of the other variables in the model, and an error term. VAR models do not require as much knowledge about the forces influencing a variable as do structural models with simultaneous equations. The only prior knowledge required is a list of variables which can be hypothesized to affect each other over time.


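A VAR model describes the evolution of a set of *k* variables, called *endogenous variables*, over time. Each period of time is numbered, *t* = 1, ..., *T*. The variables are collected in a vector, *y*_{t}, which is of length *k*. The vector's components are referred to as *y*_{i,t}, meaning the observation at time *t* of the *i*th variable. For example, if the first variable in the model measures the price of wheat over time, then *y*_{1,1998} would indicate the price of wheat in the year 1998.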

VAR models are characterized by their *order*, which refers to the number of earlier time periods the model will use. Continuing the wheat price example, a 5th-order VAR would model each year's wheat price as a linear combination of the last five years of wheat prices, together with the last five years of every other variable in the model. A *lag* is the value of a variable in a previous time period. So in general a *p*th-order VAR refers to a VAR model which includes lags for the last *p* time periods. A *p*th-order VAR is denoted "VAR(*p*)" and sometimes called "a VAR with *p* lags". A *p*th-order VAR model is written as

$$y_t = c + A_1 y_{t-1} + A_2 y_{t-2} + \cdots + A_p y_{t-p} + e_t$$

The variables of the form *y*_{t−i} indicate that variable's value *i* time periods earlier and are called the "*i*th lag" of *y*_{t}. The variable *c* is a *k*-vector of constants serving as the intercept of the model. *A*_{i} is a time-invariant (*k* × *k*)-matrix and *e*_{t} is a *k*-vector of error terms. The error terms must satisfy three conditions:

- $\mathrm{E}(e_t) = 0$. Every error term has a mean of zero.
- $\mathrm{E}(e_t e_t') = \Omega$. The contemporaneous covariance matrix of error terms is a *k* × *k* positive-semidefinite matrix denoted Ω.
- $\mathrm{E}(e_t e_{t-j}') = 0$ for any non-zero lag *j*. There is no correlation across time. In particular, there is no serial correlation in individual error terms.^{[1]}

The process of choosing the maximum lag *p* in the VAR model requires special attention because inference is dependent on correctness of the selected lag order.^{ [2] }^{ [3] }

Note that all variables have to be of the same order of integration. The following cases are distinct:

- All the variables are I(0) (stationary): this is the standard case, i.e. a VAR in levels.
- All the variables are I(*d*) (non-stationary) with *d* > 0:^{[citation needed]}
  - The variables are cointegrated: the error correction term has to be included in the VAR. The model becomes a vector error correction model (VECM), which can be seen as a restricted VAR.
  - The variables are not cointegrated: the variables first have to be differenced *d* times, and one has a VAR in differences.
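The differencing step in the non-cointegrated case can be illustrated with a minimal sketch. The data here are hypothetical: two independent random walks, which are I(1) by construction, so one round of differencing (`d = 1`) recovers the stationary shocks. The names `levels`, `shocks` and `diffs` are chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical data: two independent random walks, each integrated of order 1.
shocks = rng.standard_normal((500, 2))
levels = np.cumsum(shocks, axis=0)

# Difference every series d times along the time axis before fitting a VAR.
d = 1
diffs = np.diff(levels, n=d, axis=0)

# Each round of differencing loses one observation.
print(levels.shape, diffs.shape)
```

Testing for the order of integration and for cointegration is a separate step that this sketch does not cover.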

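One can stack the vectors in order to write a VAR(*p*) as a stochastic matrix difference equation, with a concise matrix notation:

$$Y = BZ + U$$

Here *Y* collects the observations *y*_{t} as columns, *B* = [*c*, *A*_{1}, ..., *A*_{p}] collects the intercept and coefficient matrices, each column of *Z* stacks a 1 and the *p* lags of *y*_{t}, and *U* collects the error terms *e*_{t} as columns. Details of the matrices are in a separate page.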

For a general example of a VAR(*p*) with *k* variables, see General matrix notation of a VAR(p).

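A VAR(1) in two variables can be written in matrix form (more compact notation) as

$$\begin{bmatrix} y_{1,t} \\ y_{2,t} \end{bmatrix} = \begin{bmatrix} c_1 \\ c_2 \end{bmatrix} + \begin{bmatrix} a_{1,1} & a_{1,2} \\ a_{2,1} & a_{2,2} \end{bmatrix} \begin{bmatrix} y_{1,t-1} \\ y_{2,t-1} \end{bmatrix} + \begin{bmatrix} e_{1,t} \\ e_{2,t} \end{bmatrix}$$

(in which only a single *A* matrix appears because this example has a maximum lag *p* equal to 1), or, equivalently, as the following system of two equations

$$\begin{aligned} y_{1,t} &= c_1 + a_{1,1} y_{1,t-1} + a_{1,2} y_{2,t-1} + e_{1,t} \\ y_{2,t} &= c_2 + a_{2,1} y_{1,t-1} + a_{2,2} y_{2,t-1} + e_{2,t} \end{aligned}$$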

Each variable in the model has one equation. The current (time *t*) observation of each variable depends on its own lagged values as well as on the lagged values of each other variable in the VAR.
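As an illustration of how each variable depends on the lagged values of both variables, the following sketch simulates a two-variable VAR(1). The intercepts, coefficients, zero starting value and standard-normal errors are all assumptions made for this example, and `simulate_var1` is a hypothetical helper, not a library function.

```python
import numpy as np

def simulate_var1(c, A, T, rng):
    """Simulate y_t = c + A y_{t-1} + e_t, starting from y_0 = 0."""
    k = len(c)
    y = np.zeros((T + 1, k))
    for t in range(1, T + 1):
        e_t = rng.standard_normal(k)   # assumed: i.i.d. standard-normal errors
        y[t] = c + A @ y[t - 1] + e_t
    return y[1:]                       # drop the starting value

c = np.array([0.5, -0.2])              # hypothetical intercepts c_1, c_2
A = np.array([[0.5, 0.1],              # hypothetical a_{1,1}, a_{1,2}
              [0.2, 0.3]])             # hypothetical a_{2,1}, a_{2,2}

rng = np.random.default_rng(0)
y = simulate_var1(c, A, T=200, rng=rng)
print(y.shape)
```

Row `t` of `A` holds the coefficients of equation `t`, so the off-diagonal entries are what let one variable's past feed into the other.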

A VAR with *p* lags can always be equivalently rewritten as a VAR with only one lag by appropriately redefining the dependent variable. The transformation amounts to stacking the lags of the VAR(*p*) variable in the new VAR(1) dependent variable and appending identities to complete the number of equations.

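For example, the VAR(2) model

$$y_t = c + A_1 y_{t-1} + A_2 y_{t-2} + e_t$$

can be recast as the VAR(1) model

$$\begin{bmatrix} y_t \\ y_{t-1} \end{bmatrix} = \begin{bmatrix} c \\ 0 \end{bmatrix} + \begin{bmatrix} A_1 & A_2 \\ I & 0 \end{bmatrix} \begin{bmatrix} y_{t-1} \\ y_{t-2} \end{bmatrix} + \begin{bmatrix} e_t \\ 0 \end{bmatrix}$$

where *I* is the identity matrix.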

The equivalent VAR(1) form is more convenient for analytical derivations and allows more compact statements.
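The stacking can be sketched in code. The helper `companion` below is hypothetical, and the coefficient matrices are made-up values; the check at the end confirms that one step of the stacked VAR(1) reproduces the VAR(2) recursion (without intercept or error) and carries the previous value forward through the appended identities.

```python
import numpy as np

def companion(A_list):
    """Stack A_1, ..., A_p of a VAR(p) into the coefficient matrix of the equivalent VAR(1)."""
    p = len(A_list)
    k = A_list[0].shape[0]
    top = np.hstack(A_list)                          # [A_1 A_2 ... A_p]
    if p == 1:
        return top
    # Identity rows that simply copy y_{t-1}, ..., y_{t-p+1} forward.
    bottom = np.hstack([np.eye(k * (p - 1)), np.zeros((k * (p - 1), k))])
    return np.vstack([top, bottom])

A1 = np.array([[0.5, 0.1], [0.2, 0.3]])              # hypothetical coefficients
A2 = np.array([[0.1, 0.0], [0.0, 0.1]])
F = companion([A1, A2])

y_tm1 = np.array([1.0, 2.0])
y_tm2 = np.array([0.5, -1.0])
direct = A1 @ y_tm1 + A2 @ y_tm2                     # VAR(2) recursion
stacked = F @ np.concatenate([y_tm1, y_tm2])         # VAR(1) recursion on the stacked vector
print(np.allclose(stacked[:2], direct), np.allclose(stacked[2:], y_tm1))
```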

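A *structural VAR with p lags* (sometimes abbreviated SVAR) is

$$B_0 y_t = c_0 + B_1 y_{t-1} + B_2 y_{t-2} + \cdots + B_p y_{t-p} + \varepsilon_t$$

where *c*_{0} is a *k* × 1 vector of constants, *B*_{i} is a *k* × *k* matrix (for every *i* = 0, ..., *p*) and *ε*_{t} is a *k* × 1 vector of error terms. The main diagonal terms of the *B*_{0} matrix (the coefficients on the *i*th variable in the *i*th equation) are scaled to 1.

The error terms *ε*_{t} (structural shocks) satisfy the three conditions in the definition above, with the particularity that all the off-diagonal elements of the covariance matrix $\mathrm{E}(\varepsilon_t \varepsilon_t') = \Sigma$ are zero. That is, the structural shocks are uncorrelated.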

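For example, a two-variable structural VAR(1) is:

$$\begin{bmatrix} 1 & B_{0;1,2} \\ B_{0;2,1} & 1 \end{bmatrix} \begin{bmatrix} y_{1,t} \\ y_{2,t} \end{bmatrix} = \begin{bmatrix} c_{0;1} \\ c_{0;2} \end{bmatrix} + \begin{bmatrix} B_{1;1,1} & B_{1;1,2} \\ B_{1;2,1} & B_{1;2,2} \end{bmatrix} \begin{bmatrix} y_{1,t-1} \\ y_{2,t-1} \end{bmatrix} + \begin{bmatrix} \varepsilon_{1,t} \\ \varepsilon_{2,t} \end{bmatrix}$$

where

$$\Sigma = \mathrm{E}(\varepsilon_t \varepsilon_t') = \begin{bmatrix} \sigma_1^2 & 0 \\ 0 & \sigma_2^2 \end{bmatrix};$$

that is, the variances of the structural shocks are denoted $\mathrm{var}(\varepsilon_i) = \sigma_i^2$ (*i* = 1, 2) and the covariance is $\mathrm{cov}(\varepsilon_1, \varepsilon_2) = 0$.

Writing the first equation explicitly and passing *y*_{2,t} to the right hand side one obtains

$$y_{1,t} = c_{0;1} - B_{0;1,2}\, y_{2,t} + B_{1;1,1}\, y_{1,t-1} + B_{1;1,2}\, y_{2,t-1} + \varepsilon_{1,t}$$

Note that *y*_{2,t} can have a contemporaneous effect on *y*_{1,t} if *B*_{0;1,2} is not zero. This is different from the case when *B*_{0} is the identity matrix (all off-diagonal elements are zero, as in the initial definition), when *y*_{2,t} can directly affect *y*_{1,t+1} and subsequent future values, but not *y*_{1,t}.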

Because of the parameter identification problem, ordinary least squares estimation of the structural VAR would yield inconsistent parameter estimates. This problem can be overcome by rewriting the VAR in reduced form.

From an economic point of view, if the joint dynamics of a set of variables can be represented by a VAR model, then the structural form is a depiction of the underlying, "structural", economic relationships. Two features of the structural form make it the preferred candidate to represent the underlying relations:

1. *Error terms are not correlated*. The structural, economic shocks which drive the dynamics of the economic variables are assumed to be independent, which implies zero correlation between error terms as a desired property. This is helpful for separating out the effects of economically unrelated influences in the VAR. For instance, there is no reason why an oil price shock (as an example of a supply shock) should be related to a shift in consumers' preferences towards a style of clothing (as an example of a demand shock); therefore one would expect these factors to be statistically independent.

2. *Variables can have a contemporaneous impact on other variables*. This is a desirable feature especially when using low frequency data. For example, an indirect tax rate increase would not affect tax revenues the day the decision is announced, but one could find an effect in that quarter's data.

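By premultiplying the structural VAR with the inverse of *B*_{0}

$$y_t = B_0^{-1} c_0 + B_0^{-1} B_1 y_{t-1} + B_0^{-1} B_2 y_{t-2} + \cdots + B_0^{-1} B_p y_{t-p} + B_0^{-1} \varepsilon_t$$

and denoting

$$B_0^{-1} c_0 = c, \qquad B_0^{-1} B_i = A_i \ \text{ for } i = 1, \dots, p, \qquad B_0^{-1} \varepsilon_t = e_t$$

one obtains the *p*th order reduced VAR

$$y_t = c + A_1 y_{t-1} + A_2 y_{t-2} + \cdots + A_p y_{t-p} + e_t$$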

Note that in the reduced form all right hand side variables are predetermined at time *t*. As there are no time *t* endogenous variables on the right hand side, no variable has a *direct* contemporaneous effect on other variables in the model.

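However, the error terms in the reduced VAR are composites of the structural shocks: *e*_{t} = *B*_{0}^{−1}*ε*_{t}. Thus, the occurrence of one structural shock *ε*_{i,t} can potentially lead to the occurrence of shocks in all error terms *e*_{j,t}, thus creating contemporaneous movement in all endogenous variables. Consequently, the covariance matrix of the reduced VAR

$$\Omega = \mathrm{E}(e_t e_t') = \mathrm{E}\left(B_0^{-1} \varepsilon_t \varepsilon_t' (B_0^{-1})'\right) = B_0^{-1} \Sigma (B_0^{-1})'$$

can have non-zero off-diagonal elements, thus allowing non-zero correlation between error terms.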
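A small numerical sketch makes the point concrete. The matrix `B0` (unit diagonal, non-zero off-diagonal entries) and the diagonal structural covariance `Sigma` are hypothetical values chosen for illustration; the reduced-form covariance is computed as the inverse of `B0` times `Sigma` times the transpose of that inverse.

```python
import numpy as np

# Hypothetical contemporaneous-coefficient matrix with unit diagonal.
B0 = np.array([[1.0, 0.4],
               [0.3, 1.0]])

# Structural shocks are uncorrelated, so their covariance matrix is diagonal.
Sigma = np.diag([1.0, 0.25])

B0_inv = np.linalg.inv(B0)
Omega = B0_inv @ Sigma @ B0_inv.T   # covariance of the reduced-form errors e_t

print(Omega)
```

Although `Sigma` is diagonal, `Omega` has a non-zero off-diagonal entry because each reduced-form error mixes both structural shocks.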

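Starting from the concise matrix notation (for details see the general matrix notation of a VAR(*p*)):

$$Y = BZ + U$$

- The multivariate least squares (MLS) approach for estimating *B* yields:

$$\hat{B} = Y Z' (Z Z')^{-1}$$

This can be written alternatively as:

$$\operatorname{Vec}(\hat{B}) = \left((Z Z')^{-1} Z \otimes I_k\right) \operatorname{Vec}(Y)$$

where $\otimes$ denotes the Kronecker product and Vec the vectorization of the indicated matrix.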

This estimator is consistent and asymptotically efficient. It is furthermore equal to the conditional maximum likelihood estimator.^{ [4] }

- As the explanatory variables are the same in each equation, the multivariate least squares estimator is equivalent to the ordinary least squares estimator applied to each equation separately.^{[5]}
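The equivalences stated above can be checked numerically. The sketch below is written under stated assumptions: data are simulated from a hypothetical two-variable VAR(1) with standard-normal errors, *Y* is arranged as a *k* × *T* matrix, and each column of *Z* holds a 1 followed by the lagged observation. The variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
k, T = 2, 300
c = np.array([0.5, -0.2])                       # hypothetical true intercepts
A = np.array([[0.5, 0.1], [0.2, 0.3]])          # hypothetical true coefficients

# Simulate the VAR(1) starting from zero.
y = np.zeros((T + 1, k))
for t in range(1, T + 1):
    y[t] = c + A @ y[t - 1] + rng.standard_normal(k)

Y = y[1:].T                                     # k x T matrix of current values
Z = np.vstack([np.ones(T), y[:-1].T])           # (k + 1) x T: constant and first lag

# Multivariate least squares: B_hat = Y Z' (Z Z')^{-1}, columns ordered as [c, A].
ZZ_inv = np.linalg.inv(Z @ Z.T)
B_hat = Y @ Z.T @ ZZ_inv

# Ordinary least squares applied to each equation separately.
B_ols = np.vstack([np.linalg.lstsq(Z.T, Y[i], rcond=None)[0] for i in range(k)])

def vec(M):
    """Stack the columns of M into one long vector."""
    return M.reshape(-1, order="F")

# Kronecker form: Vec(B_hat) = ((Z Z')^{-1} Z ⊗ I_k) Vec(Y).
vec_B = np.kron(ZZ_inv @ Z, np.eye(k)) @ vec(Y)

print(np.allclose(B_hat, B_ols), np.allclose(vec_B, vec(B_hat)))
```

Inverting `Z @ Z.T` explicitly mirrors the formula; for production use a least-squares solver such as `np.linalg.lstsq` is usually the numerically safer choice.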

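As in the standard case, the maximum likelihood estimator (MLE) of the covariance matrix differs from the ordinary least squares (OLS) estimator. Writing $\hat{e}_t$ for the residuals of the estimated model:

ML estimator:^{[citation needed]}

$$\hat{\Omega} = \frac{1}{T} \sum_{t=1}^{T} \hat{e}_t \hat{e}_t'$$

OLS estimator:^{[citation needed]}

$$\hat{\Omega} = \frac{1}{T - kp - 1} \sum_{t=1}^{T} \hat{e}_t \hat{e}_t'$$

for a model with a constant, *k* variables and *p* lags.

In matrix notation, this gives:

$$\hat{\Omega} = \frac{1}{T - kp - 1} (Y - \hat{B} Z)(Y - \hat{B} Z)'$$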

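The covariance matrix of the parameters can be estimated as^{[citation needed]}

$$\widehat{\operatorname{Cov}}\left(\operatorname{Vec}(\hat{B})\right) = (Z Z')^{-1} \otimes \hat{\Omega}$$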
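The two error-covariance estimators and the parameter covariance matrix can be sketched as follows. This is an illustration under assumptions: data come from a hypothetical two-variable VAR(1), the error covariance is written `Omega` to match the Ω of the definition, and the parameter covariance is taken to be the Kronecker product of the inverse of *ZZ*′ with the OLS error-covariance estimate.

```python
import numpy as np

rng = np.random.default_rng(2)
k, p, T = 2, 1, 200
A = np.array([[0.5, 0.1], [0.2, 0.3]])          # hypothetical true coefficients

y = np.zeros((T + 1, k))
for t in range(1, T + 1):
    y[t] = A @ y[t - 1] + rng.standard_normal(k)

Y = y[1:].T                                     # k x T
Z = np.vstack([np.ones(T), y[:-1].T])           # (k p + 1) x T
ZZ_inv = np.linalg.inv(Z @ Z.T)
B_hat = Y @ Z.T @ ZZ_inv
U_hat = Y - B_hat @ Z                           # residuals, one column per period

# Error covariance: ML divides by T, OLS by T - k p - 1.
Omega_mle = U_hat @ U_hat.T / T
Omega_ols = U_hat @ U_hat.T / (T - k * p - 1)

# Estimated covariance of Vec(B_hat) and the implied standard errors,
# reshaped column by column so they line up with the entries of B_hat.
cov_vec_B = np.kron(ZZ_inv, Omega_ols)
std_err = np.sqrt(np.diag(cov_vec_B)).reshape(k, k * p + 1, order="F")

print(std_err)
```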

Vector autoregression models often involve the estimation of many parameters. For example, with seven variables and four lags, each matrix of coefficients for a given lag length is 7 by 7, and the vector of constants has 7 elements, so a total of 49×4 + 7 = 203 parameters are estimated, substantially lowering the degrees of freedom of the regression (the number of data points minus the number of parameters to be estimated). This can hurt the accuracy of the parameter estimates and hence of the forecasts given by the model.
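The parameter count generalizes directly: *k* variables and *p* lags give *k*² coefficients per lag plus *k* intercepts. The helper name below is hypothetical and simply reproduces the arithmetic of the example.

```python
def var_parameter_count(k, p, constant=True):
    """Number of regression parameters in a VAR(p) with k variables."""
    return k * k * p + (k if constant else 0)

total = var_parameter_count(7, 4)        # 49 * 4 + 7
per_equation = total // 7                # each equation has 7 * 4 + 1 regressors
print(total, per_equation)
```

Because the count grows with the square of *k*, adding variables uses up degrees of freedom much faster than adding lags.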

Properties of an estimated VAR model are usually summarized through structural analysis: Granger causality tests, impulse responses, and forecast error variance decompositions.

Consider the first-order case (i.e., with only one lag), with equation of evolution

$$y_t = A y_{t-1} + e_t$$

for evolving (state) vector $y$ and vector $e$ of shocks. To find, say, the effect of the *j*-th element of the vector of shocks upon the *i*-th element of the state vector 2 periods later, which is a particular impulse response, first write the above equation of evolution one period lagged:

$$y_{t-1} = A y_{t-2} + e_{t-1}$$

Use this in the original equation of evolution to obtain

$$y_t = A^2 y_{t-2} + A e_{t-1} + e_t;$$

then repeat using the twice lagged equation of evolution, to obtain

$$y_t = A^3 y_{t-3} + A^2 e_{t-2} + A e_{t-1} + e_t$$

From this, the effect of the *j*-th component of $e_{t-2}$ upon the *i*-th component of $y_t$ is the *i, j* element of the matrix $A^2$.

It can be seen from this induction process that any shock will have an effect on the elements of *y* infinitely far forward in time, although the effect will become smaller and smaller over time assuming that the AR process is stable — that is, that all the eigenvalues of the matrix *A* are less than 1 in absolute value.
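The induction can be sketched numerically for the first-order case: the response of the state vector *h* periods after a shock is given by the *h*-th power of *A*. The coefficient matrix is a hypothetical example and `impulse_responses` is an illustrative helper; the responses are to the reduced-form errors, not to identified structural shocks.

```python
import numpy as np

def impulse_responses(A, horizon):
    """Return [A^0, A^1, ..., A^horizon].

    Entry [h][i, j] is the effect on variable i, h periods later,
    of a unit shock to the j-th error term.
    """
    responses = [np.eye(A.shape[0])]
    for _ in range(horizon):
        responses.append(A @ responses[-1])
    return responses

A = np.array([[0.5, 0.1],
              [0.2, 0.3]])                       # hypothetical coefficients

irf = impulse_responses(A, horizon=20)
two_periods_later = irf[2]                       # equals A @ A

# Stability check: all eigenvalues of A inside the unit circle.
is_stable = np.max(np.abs(np.linalg.eigvals(A))) < 1
print(two_periods_later, is_stable)
```

For this stable example the entries of `irf[h]` shrink towards zero as `h` grows, matching the statement that the effect of a shock dies out over time.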

An estimated VAR model can be used for forecasting, and the quality of the forecasts can be judged, in ways that are completely analogous to the methods used in univariate autoregressive modelling.
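A minimal sketch of point forecasting, assuming a VAR(1) with intercept whose parameters are treated as known: future errors are replaced by their mean of zero and the recursion is iterated forward, as in the univariate case. The parameter values, the starting observation and the helper `forecast_var1` are hypothetical.

```python
import numpy as np

def forecast_var1(c, A, y_last, steps):
    """Iterate y_hat = c + A y_hat forward, setting future errors to zero."""
    forecasts = []
    y_prev = np.asarray(y_last, dtype=float)
    for _ in range(steps):
        y_prev = c + A @ y_prev
        forecasts.append(y_prev)
    return np.array(forecasts)

c = np.array([0.5, -0.2])                        # hypothetical estimates
A = np.array([[0.5, 0.1],
              [0.2, 0.3]])

forecasts = forecast_var1(c, A, y_last=[2.0, 1.0], steps=50)

# For a stable VAR(1) the forecasts converge to the unconditional mean (I - A)^{-1} c.
long_run_mean = np.linalg.solve(np.eye(2) - A, c)
print(forecasts[0], forecasts[-1], long_run_mean)
```

Forecast quality would then be judged on held-out data, for instance by comparing these point forecasts with realized values.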

Christopher Sims has advocated VAR models, criticizing the claims and performance of earlier modeling in macroeconomic econometrics.^{ [6] } He recommended VAR models, which had previously appeared in time series statistics and in system identification, a statistical specialty in control theory. Sims advocated VAR models as providing a theory-free method to estimate economic relationships, thus being an alternative to the "incredible identification restrictions" in structural models.^{ [6] } VAR models are also increasingly used in health research for automatic analyses of diary data^{ [7] } or sensor data.

- R: The package *vars* includes functions for VAR models.^{[8]}^{[9]} Other R packages are listed in the CRAN Task View: Time Series Analysis.
- Python: The *statsmodels* package's tsa (time series analysis) module supports VARs. *PyFlux* has support for VARs and Bayesian VARs.
- SAS: VARMAX
- Stata: "var"
- EViews: "VAR"
- Gretl: "var"
- Matlab: "varm"
- Regression analysis of time series: "SYSTEM"
- LDT

- Bayesian vector autoregression
- Convergent cross mapping
- Granger causality
- Panel vector autoregression, an extension of VAR models to panel data^{[10]}
- Variance decomposition

1. For multivariate tests for autocorrelation in the VAR models, see Hatemi-J, A. (2004). "Multivariate tests for autocorrelation in the stable and unstable VAR models". *Economic Modelling*. **21** (4): 661–683. doi:10.1016/j.econmod.2003.09.005.
2. Hacker, R. S.; Hatemi-J, A. (2008). "Optimal lag-length choice in stable and unstable VAR models under situations of homoscedasticity and ARCH". *Journal of Applied Statistics*. **35** (6): 601–615. doi:10.1080/02664760801920473.
3. Hatemi-J, A.; Hacker, R. S. (2009). "Can the LR test be helpful in choosing the optimal lag order in the VAR model when information criteria suggest different lag orders?". *Applied Economics*. **41** (9): 1489–1500.
4. Hamilton, James D. (1994). *Time Series Analysis*. Princeton University Press. p. 293.
5. Zellner, Arnold (1962). "An Efficient Method of Estimating Seemingly Unrelated Regressions and Tests for Aggregation Bias". *Journal of the American Statistical Association*. **57** (298): 348–368. doi:10.1080/01621459.1962.10480664.
6. Sims, Christopher (1980). "Macroeconomics and Reality". *Econometrica*. **48** (1): 1–48. CiteSeerX 10.1.1.163.5425. doi:10.2307/1912017. JSTOR 1912017.
7. van der Krieke; et al. (2016). "Temporal Dynamics of Health and Well-Being: A Crowdsourcing Approach to Momentary Assessments and Automated Generation of Personalized Feedback (2016)". *Psychosomatic Medicine*: 1. doi:10.1097/PSY.0000000000000378. PMID 27551988.
8. Pfaff, Bernhard. "VAR, SVAR and SVEC Models: Implementation Within R Package vars".
9. Hyndman, Rob J; Athanasopoulos, George (2018). "11.2: Vector Autoregressions". *Forecasting: Principles and Practice*. OTexts. pp. 333–335. ISBN 978-0-9875071-1-2.
10. Holtz-Eakin, D.; Newey, W.; Rosen, H. S. (1988). "Estimating Vector Autoregressions with Panel Data". *Econometrica*. **56** (6): 1371–1395.

- Asteriou, Dimitrios; Hall, Stephen G. (2011). "Vector Autoregressive (VAR) Models and Causality Tests". *Applied Econometrics* (Second ed.). London: Palgrave MacMillan. pp. 319–333.
- Enders, Walter (2010). *Applied Econometric Time Series* (Third ed.). New York: John Wiley & Sons. pp. 272–355. ISBN 978-0-470-50539-7.
- Favero, Carlo A. (2001). *Applied Macroeconometrics*. New York: Oxford University Press. pp. 162–213. ISBN 0-19-829685-1.
- Lütkepohl, Helmut (2005). *New Introduction to Multiple Time Series Analysis*. Berlin: Springer. ISBN 3-540-40172-5.
- Qin, Duo (2011). "Rise of VAR Modelling Approach". *Journal of Economic Surveys*. **25** (1): 156–174. doi:10.1111/j.1467-6419.2010.00637.x.


This page is based on this Wikipedia article

Text is available under the CC BY-SA 4.0 license; additional terms may apply.

Images, videos and audio are available under their respective licenses.
