Trend-stationary process

In the statistical analysis of time series, a trend-stationary process is a stochastic process from which an underlying trend (a function solely of time) can be removed, leaving a stationary process.[1] The trend does not have to be linear.

Conversely, if the process requires differencing to be made stationary, then it is called difference stationary and possesses one or more unit roots.[2][3] The two concepts are sometimes confused, but while they share many properties, they differ in many respects. It is possible for a time series to be non-stationary, yet have no unit root and be trend-stationary. In both unit-root and trend-stationary processes, the mean can be growing or decreasing over time; the difference lies in the effect of a shock: in a trend-stationary process a shock is transitory and the process is mean-reverting (the series converges again towards its growing mean, which is not affected by the shock), whereas in a unit-root process a shock has a permanent impact on the mean (there is no convergence over time).[4]
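
To make the contrast concrete, the following minimal simulation sketch (all parameter values are illustrative assumptions, using NumPy) applies the same one-time shock to a trend-stationary process with stationary AR(1) errors and to a random walk with drift, then tracks each series' deviation from the deterministic trend line:

```python
import numpy as np

rng = np.random.default_rng(0)
T = 400
shock_t, shock = 200, 10.0        # one-time shock (illustrative values)

eps = rng.normal(size=T)
eps[shock_t] += shock             # the same shock hits both processes

# Trend-stationary: Y_t = 0.05*t + e_t with stationary AR(1) errors,
# so the shock decays geometrically and the series reverts to the trend.
e = np.zeros(T)
e[0] = eps[0]
for t in range(1, T):
    e[t] = 0.8 * e[t - 1] + eps[t]
trend_stationary = 0.05 * np.arange(T) + e

# Unit root (random walk with drift): Y_t = Y_{t-1} + 0.05 + eps_t,
# so the same shock shifts every later value permanently.
unit_root = np.cumsum(0.05 + eps)

# Deviations from the deterministic line 0.05*t: the trend-stationary
# deviation reverts toward zero after the shock, while the unit-root
# deviation retains the shock (plus an accumulating random-walk term).
dev_ts = trend_stationary - 0.05 * np.arange(T)
dev_ur = unit_root - 0.05 * np.arange(T)
print(dev_ts[-1], dev_ur[-1])
```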

Formal definition

A process {Y} is said to be trend-stationary if[5]

$Y_t = f(t) + e_t,$

where t is time, f is any function mapping from the reals to the reals, and {e} is a stationary process. The value $f(t)$ is said to be the trend value of the process at time t.

Simplest example: stationarity around a linear trend

Suppose the variable Y evolves according to

$Y_t = a \cdot t + b + e_t,$

where t is time and $e_t$ is the error term, which is hypothesized to be white noise or, more generally, to have been generated by any stationary process. Then one can use[5][6][7] linear regression to obtain an estimate $\hat{a}$ of the true underlying trend slope $a$ and an estimate $\hat{b}$ of the underlying intercept term $b$; if the estimate $\hat{a}$ is significantly different from zero, this is sufficient to show with high confidence that the variable Y is non-stationary. The residuals from this regression are given by

$\hat{e}_t = Y_t - \hat{a} \cdot t - \hat{b}.$

If these estimated residuals can be statistically shown to be stationary (more precisely, if one can reject the hypothesis that the true underlying errors are non-stationary), then the residuals are referred to as the detrended data,[8] and the original series $\{Y_t\}$ is said to be trend-stationary even though it is not stationary.
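
As an illustrative sketch (the data are simulated; the slope, intercept, and noise scale are assumptions), the detrending can be carried out with ordinary least squares, and the unit-root null on the residuals checked with statsmodels' augmented Dickey–Fuller test:

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(1)
t = np.arange(300)
y = 0.1 * t + 2.0 + rng.normal(scale=1.5, size=t.size)  # Y_t = a*t + b + e_t

# OLS regression of Y on [t, 1] yields the estimates a-hat and b-hat
X = np.column_stack([t, np.ones_like(t)])
(a_hat, b_hat), *_ = np.linalg.lstsq(X, y, rcond=None)

residuals = y - (a_hat * t + b_hat)   # e-hat_t: the detrended data

# Augmented Dickey-Fuller test: rejecting the unit-root null
# (small p-value) supports treating the residuals as stationary.
stat, pvalue, *_ = adfuller(residuals)
print(f"a_hat={a_hat:.3f}, b_hat={b_hat:.3f}, ADF p-value={pvalue:.4f}")
```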

Stationarity around other types of trend

Exponential growth trend

Many economic time series are characterized by exponential growth. For example, suppose that one hypothesizes that gross domestic product is characterized by stationary deviations from a trend involving a constant growth rate. Then it could be modeled as

$Y_t = B e^{at} U_t,$

with $U_t$ being hypothesized to be a stationary error process. To estimate the parameters $a$ and $B$, one first takes[8] the natural logarithm (ln) of both sides of this equation:

$\ln Y_t = \ln B + a t + \ln U_t.$

This log-linear equation is in the same form as the previous linear trend equation and can be detrended in the same way, giving the estimated $\ln U_t$ as the detrended value of $\ln Y_t$, and hence the implied $U_t$ as the detrended value of $Y_t$, assuming one can reject the hypothesis that $\ln U_t$ is non-stationary.
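
A minimal sketch of the same procedure for an exponential trend (again with simulated data; B, the growth rate a, and the error scale are illustrative assumptions): regress ln Y_t on t, then exponentiate to recover the implied U_t.

```python
import numpy as np

rng = np.random.default_rng(2)
t = np.arange(200)
# Simulated Y_t = B * exp(a*t) * U_t with B = 5.0, a = 0.02, and
# ln(U_t) stationary white noise (all values chosen for illustration).
ln_u = 0.1 * rng.normal(size=t.size)
y = 5.0 * np.exp(0.02 * t) * np.exp(ln_u)

# Regress ln(Y_t) on [t, 1] to estimate a and ln(B)
X = np.column_stack([t, np.ones_like(t)])
(a_hat, lnB_hat), *_ = np.linalg.lstsq(X, np.log(y), rcond=None)

ln_detrended = np.log(y) - (a_hat * t + lnB_hat)   # estimated ln(U_t)
detrended = np.exp(ln_detrended)                   # implied U_t
print(f"a_hat={a_hat:.4f}, B_hat={np.exp(lnB_hat):.3f}")
```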

Quadratic trend

Trends do not have to be linear or log-linear. For example, a variable could have a quadratic trend:

$Y_t = a \cdot t + c \cdot t^2 + b + e_t.$

This can be regressed linearly in the coefficients using t and $t^2$ as regressors; again, if the residuals are shown to be stationary then they are the detrended values of $Y_t$.
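
A corresponding sketch for the quadratic case (coefficients chosen arbitrarily for illustration): since the trend is linear in the coefficients, ordinary least squares with regressors t, t², and a constant applies directly.

```python
import numpy as np

rng = np.random.default_rng(3)
t = np.arange(300)
y = 0.5 * t + 0.01 * t**2 + 1.0 + rng.normal(size=t.size)  # illustrative

# The trend is linear in the coefficients, so OLS on [t, t^2, 1] applies.
X = np.column_stack([t, t**2, np.ones_like(t)])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

residuals = y - X @ coef   # detrended values, to be tested for stationarity
print("estimated (a, c, b):", np.round(coef, 4))
```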

Notes

  1. About.com Economics, Online Glossary of Research Economics.
  2. "Differencing And Unit Root Tests" (PDF). pages.stern.nyu.edu. Archived (PDF) from the original on 2004-05-13. Retrieved 27 May 2023.
  3. Burke, Orlaith (2011). "Non-Stationary Series" (PDF). www.stats.ox.ac.uk. University of Oxford. Archived from the original (PDF) on June 11, 2014. Retrieved 27 May 2023.
  4. Heino Bohn Nielsen. "Non-Stationary Time Series and Unit Root Tests" (PDF).
  5. Nelson, Charles R. and Plosser, Charles I. (1982), "Trends and Random Walks in Macroeconomic Time Series: Some Evidence and Implications," Journal of Monetary Economics, 10, 139–162.
  6. Hegwood, Natalie, and Papell, David H. "Are real GDP levels trend, difference, or regime-wise trend stationary? Evidence from panel data tests incorporating structural change." http://www.uh.edu/~dpapell/realgdp.pdf
  7. Lucke, Bernd. "Is Germany's GDP trend-stationary? A measurement-with-theory approach." Archived from the original (PDF) on 2011-07-08. Retrieved 2010-12-07.
  8. "Stationarity and differencing." http://www.duke.edu/~rnau/411diff.htm
