Single-equation methods (econometrics)

Last updated

A variety of methods are used in econometrics to estimate models consisting of a single equation . The oldest and still the most commonly used is the ordinary least squares method used to estimate linear regressions.

Econometrics is the application of statistical methods to economic data in order to give empirical content to economic relationships. More precisely, it is "the quantitative analysis of actual economic phenomena based on the concurrent development of theory and observation, related by appropriate methods of inference". An introductory economics textbook describes econometrics as allowing economists "to sift through mountains of data to extract simple relationships". The first known use of the term "econometrics" was by Polish economist Paweł Ciompa in 1910. Jan Tinbergen is considered by many to be one of the founding fathers of econometrics. Ragnar Frisch is credited with coining the term in the sense in which it is used today.

In mathematics, an equation is a statement that asserts the equality of two expressions. The word equation and its cognates in other languages may have subtly different meanings; for example, in French an équation is defined as containing one or more variables, while in English any equality is an equation.

Ordinary least squares method for estimating the unknown parameters in a linear regression model

In statistics, ordinary least squares (OLS) is a type of linear least squares method for estimating the unknown parameters in a linear regression model. OLS chooses the parameters of a linear function of a set of explanatory variables by the principle of least squares: minimizing the sum of the squares of the differences between the observed dependent variable in the given dataset and those predicted by the linear function.

A variety of methods are available to estimate non-linear models. A particularly important class of non-linear models are those used to estimate relationships where the dependent variable is discrete, truncated or censored. These include logit, probit and Tobit models.

Logit inverse of the sigmoidal "logistic" function or logistic transform

In statistics, the logit function or the log-odds is the logarithm of the odds p/(1 − p) where p is probability. It is a type of function that creates a map of probability values from to . It is the inverse of the sigmoidal "logistic" function or logistic transform used in mathematics, especially in statistics.

Probit

In probability theory and statistics, the probit function is the quantile function associated with the standard normal distribution, which is commonly denoted as N(0,1). Mathematically, it is the inverse of the cumulative distribution function of the standard normal distribution, which is denoted as , so the probit is denoted as . It has applications in exploratory statistical graphics and specialized regression modeling of binary response variables.

The Tobit model refers to a class of regression models in which the observed range of the dependent variable is censored in some way. The term was coined by Arthur Goldberger in reference to James Tobin, who developed the model in 1958 to mitigate the problem of zero-inflated data for observations of household expenditure on durable goods. Because Tobin's method can be easily extended to handle truncated and other non-randomly selected samples, some authors adopt a broader definition of the Tobit model that includes these cases.

Single equation methods may be applied to time-series, cross section or panel data.


Related Research Articles

Least squares Method in statistics

The method of least squares is a standard approach in regression analysis to approximate the solution of overdetermined systems, i.e., sets of equations in which there are more equations than unknowns. "Least squares" means that the overall solution minimizes the sum of the squares of the residuals made in the results of every single equation.

Simultaneous equation models are a type of statistical model in the form of a set of linear simultaneous equations. They are often used in econometrics. One can estimate these models equation by equation; however, estimation methods that exploit the system of equations, such as generalized method of moments (GMM) and instrumental variables estimation (IV) tend to be more efficient.

Heteroscedasticity statistical property in which some subpopulations in a collection of random variables have different variabilities from others

In statistics, a collection of random variables is heteroscedastic if there are sub-populations that have different variabilities from others. Here "variability" could be quantified by the variance or any other measure of statistical dispersion. Thus heteroscedasticity is the absence of homoscedasticity.

Econometric models are statistical models used in econometrics. An econometric model specifies the statistical relationship that is believed to hold between the various economic quantities pertaining to a particular economic phenomenon. An econometric model can be derived from a deterministic economic model by allowing for uncertainty, or from an economic model which itself is stochastic. However, it is also possible to use econometric models that are not tied to any specific economic theory.

In statistics, the Wald test assesses constraints on statistical parameters based on the weighted distance between the unrestricted estimate and its hypothesized value under the null hypothesis, where the weight is the precision of the estimate. Intuitively, the larger this weighted distance, the less likely it is that the constraint is true. While the finite sample distributions of Wald tests are generally unknown, it has an asymptotic χ2-distribution under the null hypothesis, a fact that can be used to determine statistical significance.

In econometrics, the seemingly unrelated regressions (SUR) or seemingly unrelated regression equations (SURE) model, proposed by Arnold Zellner in (1962), is a generalization of a linear regression model that consists of several regression equations, each having its own dependent variable and potentially different sets of exogenous explanatory variables. Each equation is a valid linear regression on its own and can be estimated separately, which is why the system is called seemingly unrelated, although some authors suggest that the term seemingly related would be more appropriate, since the error terms are assumed to be correlated across the equations.

In statistics and econometrics, the parameter identification problem is the inability in principle to identify a best estimate of the value(s) of one or more parameters in a regression. This problem can occur in the estimation of multiple-equation econometric models where the equations have variables in common.

The Cowles Foundation for Research in Economics is an economic research institute at Yale University. It was created as the Cowles Commission for Research in Economics at Colorado Springs in 1932 by businessman and economist Alfred Cowles. In 1939, the Cowles Commission moved to the University of Chicago under Theodore O. Yntema. Jacob Marschak directed it from 1943 until 1948, when Tjalling C. Koopmans assumed leadership. Increasing opposition to the Cowles Commission from the department of economics of the University of Chicago during the 1950s impelled Koopmans to persuade the Cowles family to move the commission to Yale University in 1955 where it became the Cowles Foundation.

Semiparametric regression

In statistics, semiparametric regression includes regression models that combine parametric and nonparametric models. They are often used in situations where the fully nonparametric model may not perform well or when the researcher wants to use a parametric model but the functional form with respect to a subset of the regressors or the density of the errors is not known. Semiparametric regression models are a particular type of semiparametric modelling and, since semiparametric models contain a parametric component, they rely on parametric assumptions and may be misspecified and inconsistent, just like a fully parametric model.

Mixed model statistical model containing both fixed effects and random effects

A mixed model is a statistical model containing both fixed effects and random effects. These models are useful in a wide variety of disciplines in the physical, biological and social sciences. They are particularly useful in settings where repeated measurements are made on the same statistical units, or where measurements are made on clusters of related statistical units. Because of their advantage in dealing with missing values, mixed effects models are often preferred over more traditional approaches such as repeated measures ANOVA.

In statistics, the Ramsey Regression Equation Specification Error Test (RESET) test is a general specification test for the linear regression model. More specifically, it tests whether non-linear combinations of the fitted values help explain the response variable. The intuition behind the test is that if non-linear combinations of the explanatory variables have any power in explaining the response variable, the model is misspecified in the sense that the data generating process might be better approximated by a polynomial or another non-linear functional form.

RATS, an abbreviation of Regression Analysis of Time Series, is a statistical package for time series analysis and econometrics. RATS is developed and sold by Estima, Inc., located in Evanston, IL.

Structural estimation is a technique for estimating deep "structural" parameters of theoretical economic models. The term is inherited from the simultaneous equations model. In this sense "structural estimation" is contrasted with "reduced-form estimation", which is the statistical relationship between observed variables.

The Heckman correction is a statistical technique to correct bias from non-randomly selected samples or otherwise incidentally truncated dependent variables, a pervasive issue in quantitative social sciences when using observational data. Conceptually, this is achieved by explicitly modelling the individual sampling probability of each observation together with the conditional expectation of the dependent variable. The resulting likelihood function is mathematically similar to the Tobit model for censored dependent variables, a connection first drawn by James Heckman in 1976. Heckman also developed a two-step control function approach to estimate this model, which reduced the computional burden of having to estimate both equations jointly, albeit at the cost of inefficiency. Heckman received the Nobel Memorial Prize in Economic Sciences in 2000 for his work in this field.

The methodology of econometrics is the study of the range of differing approaches to undertaking econometric analysis.

In a linear panel data setting, it can be desirable to estimate the magnitude of the fixed effects, as they provide measures of the unobserved components. For instance, in wage equation regressions, Fixed Effects capture ability measures that are constant over time, such as motivation. Chamberlain's approach to unobserved effects models is a way of estimating the linear unobserved effects, under Fixed Effect assumptions, in the following unobserved effects model

Control functions are statistical methods to correct for endogeneity problems by modelling the endogeneity in the error term. The approach thereby differs in important ways from other models that try to account for the same econometric problem. Instrumental variables, for example, attempt to model the endogenous variable X as an often invertible model with respect to a relevant and exogenous instrument Z. Panel data use special data properties to difference out unobserved heterogeneity that is assumed to be fixed over time.

Linear regression statistical approach for modeling the relationship between a scalar dependent variable and one or more explanatory variables

In statistics, linear regression is a linear approach to modeling the relationship between a scalar response and one or more explanatory variables. The case of one explanatory variable is called simple linear regression. For more than one explanatory variable, the process is called multiple linear regression. This term is distinct from multivariate linear regression, where multiple correlated dependent variables are predicted, rather than a single scalar variable.