Box–Jenkins method

Last updated

In time series analysis, the Box–Jenkins method, [1] named after the statisticians George Box and Gwilym Jenkins, applies autoregressive moving average (ARMA) or autoregressive integrated moving average (ARIMA) models to find the best fit of a time-series model to past values of a time series.

A statistician is a person who works with theoretical or applied statistics. The profession exists in both the private and public sectors. It is common to combine statistical knowledge with expertise in other subjects, and statisticians may work as employees or as statistical consultants.

Gwilym Meirion Jenkins was a British statistician and systems engineer, born in Gowerton, Swansea, Wales. He is most notable for his pioneering work with George Box on autoregressive moving average models, also called Box–Jenkins models, in time-series analysis.

In statistics and econometrics, and in particular in time series analysis, an autoregressive integrated moving average (ARIMA) model is a generalization of an autoregressive moving average (ARMA) model. Both of these models are fitted to time series data either to better understand the data or to predict future points in the series (forecasting). ARIMA models are applied in some cases where data show evidence of non-stationarity, where an initial differencing step can be applied one or more times to eliminate the non-stationarity.


Modeling approach

The original model uses an iterative three-stage modeling approach:

  1. Model identification and model selection : making sure that the variables are stationary, identifying seasonality in the dependent series (seasonally differencing it if necessary), and using plots of the autocorrelation and partial autocorrelation functions of the dependent time series to decide which (if any) autoregressive or moving average component should be used in the model.
  2. Parameter estimation using computation algorithms to arrive at coefficients that best fit the selected ARIMA model. The most common methods use maximum likelihood estimation or non-linear least-squares estimation.
  3. Model checking by testing whether the estimated model conforms to the specifications of a stationary univariate process. In particular, the residuals should be independent of each other and constant in mean and variance over time. (Plotting the mean and variance of residuals over time and performing a Ljung–Box test or plotting autocorrelation and partial autocorrelation of the residuals are helpful to identify misspecification.) If the estimation is inadequate, we have to return to step one and attempt to build a better model.

The data they used were from a gas furnace. These data are well known as the Box and Jenkins gas furnace data for benchmarking predictive models.

Commandeur & Koopman (2007, §10.4) [2] argue that the Box–Jenkins approach is fundamentally problematic. The problem arises because in "the economic and social fields, real series are never stationary however much differencing is done". Thus the investigator has to face the question: how close to stationary is close enough? As the authors note, "This is a hard question to answer". The authors further argue that rather than using Box–Jenkins, it is better to use state space methods, as stationarity of the time series is then not required.

Box–Jenkins model identification

Stationarity and seasonality

The first step in developing a Box–Jenkins model is to determine if the time series is stationary and if there is any significant seasonality that needs to be modelled.

Time series Sequence of data over time

A time series is a series of data points indexed in time order. Most commonly, a time series is a sequence taken at successive equally spaced points in time. Thus it is a sequence of discrete-time data. Examples of time series are heights of ocean tides, counts of sunspots, and the daily closing value of the Dow Jones Industrial Average.

In mathematics and statistics, a stationary process is a stochastic process whose unconditional joint probability distribution does not change when shifted in time. Consequently, parameters such as mean and variance also do not change over time.

In time series data, seasonality is the presence of variations that occur at specific regular intervals less than a year, such as weekly, monthly, or quarterly. Seasonality may be caused by various factors, such as weather, vacation, and holidays and consists of periodic, repetitive, and generally regular and predictable patterns in the levels of a time series.

Detecting stationarity

Stationarity can be assessed from a run sequence plot. The run sequence plot should show constant location and scale. It can also be detected from an autocorrelation plot. Specifically, non-stationarity is often indicated by an autocorrelation plot with very slow decay.

Scale (ratio) proportional ratio of a linear dimension of a model to the same feature of the original

The scale ratio of a model represents the proportional ratio of a linear dimension of the model to the same feature of the original. Examples include a 3-dimensional scale model of a building or the scale drawings of the elevations or plans of a building. In such cases the scale is dimensionless and exact throughout the model or drawing.

Detecting seasonality

Seasonality (or periodicity) can usually be assessed from an autocorrelation plot, a seasonal subseries plot, or a spectral plot.

Seasonal subseries plots are a graphical tool to visualize and detect seasonality in a time series. Seasonal subseries plots involves the extraction of the seasons from a time series into a subseries. Based on a selected periodicity, it is an alternative plot that emphasizes the seasonal patterns are where the data for each season are collected together in separate mini time plots.

Differencing to achieve stationarity

Box and Jenkins recommend the differencing approach to achieve stationarity. However, fitting a curve and subtracting the fitted values from the original data can also be used in the context of Box–Jenkins models.

Curve fitting

Curve fitting is the process of constructing a curve, or mathematical function, that has the best fit to a series of data points, possibly subject to constraints. Curve fitting can involve either interpolation, where an exact fit to the data is required, or smoothing, in which a "smooth" function is constructed that approximately fits the data. A related topic is regression analysis, which focuses more on questions of statistical inference such as how much uncertainty is present in a curve that is fit to data observed with random errors. Fitted curves can be used as an aid for data visualization, to infer values of a function where no data are available, and to summarize the relationships among two or more variables. Extrapolation refers to the use of a fitted curve beyond the range of the observed data, and is subject to a degree of uncertainty since it may reflect the method used to construct the curve as much as it reflects the observed data.

Seasonal differencing

At the model identification stage, the goal is to detect seasonality, if it exists, and to identify the order for the seasonal autoregressive and seasonal moving average terms. For many series, the period is known and a single seasonality term is sufficient. For example, for monthly data one would typically include either a seasonal AR 12 term or a seasonal MA 12 term. For Box–Jenkins models, one does not explicitly remove seasonality before fitting the model. Instead, one includes the order of the seasonal terms in the model specification to the ARIMA estimation software. However, it may be helpful to apply a seasonal difference to the data and regenerate the autocorrelation and partial autocorrelation plots. This may help in the model identification of the non-seasonal component of the model. In some cases, the seasonal differencing may remove most or all of the seasonality effect.

Identify p and q

Once stationarity and seasonality have been addressed, the next step is to identify the order (i.e. the p and q) of the autoregressive and moving average terms. Different authors have different approaches for identifying p and q. Brockwell and Davis (1991) [3] state "our prime criterion for model selection [among ARMA(p,q) models] will be the AICc", i.e. the Akaike information criterion with correction. Other authors use the autocorrelation plot and the partial autocorrelation plot, described below.

Autocorrelation and partial autocorrelation plots

The sample autocorrelation plot and the sample partial autocorrelation plot are compared to the theoretical behavior of these plots when the order is known.

Specifically, for an AR(1) process, the sample autocorrelation function should have an exponentially decreasing appearance. However, higher-order AR processes are often a mixture of exponentially decreasing and damped sinusoidal components.

For higher-order autoregressive processes, the sample autocorrelation needs to be supplemented with a partial autocorrelation plot. The partial autocorrelation of an AR(p) process becomes zero at lag p + 1 and greater, so we examine the sample partial autocorrelation function to see if there is evidence of a departure from zero. This is usually determined by placing a 95% confidence interval on the sample partial autocorrelation plot (most software programs that generate sample autocorrelation plots also plot this confidence interval). If the software program does not generate the confidence band, it is approximately , with N denoting the sample size.

The autocorrelation function of a MA(q) process becomes zero at lag q + 1 and greater, so we examine the sample autocorrelation function to see where it essentially becomes zero. We do this by placing the 95% confidence interval for the sample autocorrelation function on the sample autocorrelation plot. Most software that can generate the autocorrelation plot can also generate this confidence interval.

The sample partial autocorrelation function is generally not helpful for identifying the order of the moving average process.

The following table summarizes how one can use the sample autocorrelation function for model identification.

ShapeIndicated Model
Exponential, decaying to zero Autoregressive model. Use the partial autocorrelation plot to identify the order of the autoregressive model.
Alternating positive and negative, decaying to zeroAutoregressive model. Use the partial autocorrelation plot to help identify the order.
One or more spikes, rest are essentially zero Moving average model, order identified by where plot becomes zero.
Decay, starting after a few lagsMixed autoregressive and moving average (ARMA) model.
All zero or close to zeroData are essentially random.
High values at fixed intervalsInclude seasonal autoregressive term.
No decay to zeroSeries is not stationary.

Hyndman & Athanasopoulos suggest the following: [4]

The data may follow an ARIMA(p,d,0) model if the ACF and PACF plots of the differenced data show the following patterns:
  • the ACF is exponentially decaying or sinusoidal;
  • there is a significant spike at lag p in PACF, but none beyond lag p.
The data may follow an ARIMA(0,d,q) model if the ACF and PACF plots of the differenced data show the following patterns:
  • the PACF is exponentially decaying or sinusoidal;
  • there is a significant spike at lag q in ACF, but none beyond lag q.

In practice, the sample autocorrelation and partial autocorrelation functions are random variables and do not give the same picture as the theoretical functions. This makes the model identification more difficult. In particular, mixed models can be particularly difficult to identify. Although experience is helpful, developing good models using these sample plots can involve much trial and error.

Box–Jenkins model estimation

Estimating the parameters for Box–Jenkins models involves numerically approximating the solutions of nonlinear equations. For this reason, it is common to use statistical software designed to handle to the approach – virtually all modern statistical packages feature this capability. The main approaches to fitting Box–Jenkins models are nonlinear least squares and maximum likelihood estimation. Maximum likelihood estimation is generally the preferred technique. The likelihood equations for the full Box–Jenkins model are complicated and are not included here. See (Brockwell and Davis, 1991) for the mathematical details.

Box–Jenkins model diagnostics

Assumptions for a stable univariate process

Model diagnostics for Box–Jenkins models is similar to model validation for non-linear least squares fitting.

That is, the error term At is assumed to follow the assumptions for a stationary univariate process. The residuals should be white noise (or independent when their distributions are normal) drawings from a fixed distribution with a constant mean and variance. If the Box–Jenkins model is a good model for the data, the residuals should satisfy these assumptions.

If these assumptions are not satisfied, one needs to fit a more appropriate model. That is, go back to the model identification step and try to develop a better model. Hopefully the analysis of the residuals can provide some clues as to a more appropriate model.

One way to assess if the residuals from the Box–Jenkins model follow the assumptions is to generate statistical graphics (including an autocorrelation plot) of the residuals. One could also look at the value of the Box–Ljung statistic.

Related Research Articles

Autocorrelation correlation of a signal with a time-shifted copy of itself, as a function of shift

Autocorrelation, also known as serial correlation, is the correlation of a signal with a delayed copy of itself as a function of delay. Informally, it is the similarity between observations as a function of the time lag between them. The analysis of autocorrelation is a mathematical tool for finding repeating patterns, such as the presence of a periodic signal obscured by noise, or identifying the missing fundamental frequency in a signal implied by its harmonic frequencies. It is often used in signal processing for analyzing functions or series of values, such as time domain signals.

In the statistical analysis of time series, autoregressive–moving-average (ARMA) models provide a parsimonious description of a (weakly) stationary stochastic process in terms of two polynomials, one for the autoregression (AR) and the second for the moving average (MA). The general ARMA model was described in the 1951 thesis of Peter Whittle, Hypothesis testing in time series analysis, and it was popularized in the 1970 book by George E. P. Box and Gwilym Jenkins.

In statistics, econometrics and signal processing, an autoregressive (AR) model is a representation of a type of random process; as such, it is used to describe certain time-varying processes in nature, economics, etc. The autoregressive model specifies that the output variable depends linearly on its own previous values and on a stochastic term ; thus the model is in the form of a stochastic difference equation. Together with the moving-average (MA) model, it is a special case and key component of the more general ARMA and ARIMA models of time series, which have a more complicated stochastic structure; it is also a special case of the vector autoregressive model (VAR), which consists of a system of more than one interlocking stochastic difference equation in more than one evolving random variable.

A portmanteau test is a type of statistical hypothesis test in which the null hypothesis is well specified, but the alternative hypothesis is more loosely specified. Tests constructed in this context can have the property of being at least moderately powerful against a wide range of departures from the null hypothesis. Thus, in applied statistics, a portmanteau test provides a reasonable way of proceeding as a general check of a model's match to a dataset where there are many different ways in which the model may depart from the underlying data generating process. Use of such tests avoids having to be very specific about the particular type of departure being tested.

In statistics, a unit root test tests whether a time series variable is non-stationary and possesses a unit root. The null hypothesis is generally defined as the presence of a unit root and the alternative hypothesis is either stationarity, trend stationarity or explosive root depending on the test used.

In probability theory, stochastic drift is the change of the average value of a stochastic (random) process. A related concept is the drift rate, which is the rate at which the average changes. For example, a process that counts the number of heads in a series of fair coin tosses has a drift rate of 1/2 per toss. This is in contrast to the random fluctuations about this average value. The stochastic mean of that coin-toss process is 1/2 and the drift rate of the stochastic mean is 0, assuming 1=heads and 0=tails.

Predictive analytics encompasses a variety of statistical techniques from data mining, predictive modelling, and machine learning, that analyze current and historical facts to make predictions about future or otherwise unknown events.

In statistics, the Durbin–Watson statistic is a test statistic used to detect the presence of autocorrelation at lag 1 in the residuals from a regression analysis. It is named after James Durbin and Geoffrey Watson. The small sample distribution of this ratio was derived by John von Neumann. Durbin and Watson applied this statistic to the residuals from least squares regressions, and developed bounds tests for the null hypothesis that the errors are serially uncorrelated against the alternative that they follow a first order autoregressive process. Later, John Denis Sargan and Alok Bhargava developed several von Neumann–Durbin–Watson type test statistics for the null hypothesis that the errors on a regression model follow a process with a unit root against the alternative hypothesis that the errors follow a stationary first order autoregression. Note that the distribution of this test statistic does not depend on the estimated regression coefficients and the variance of the errors.

Correlogram image of correlation statistics

In the analysis of data, a correlogram is an image of correlation statistics. For example, in time series analysis, a correlogram, also known as an autocorrelation plot, is a plot of the sample autocorrelations versus .

The Ljung–Box test is a type of statistical test of whether any of a group of autocorrelations of a time series are different from zero. Instead of testing randomness at each distinct lag, it tests the "overall" randomness based on a number of lags, and is therefore a portmanteau test.

In time series analysis, the partial autocorrelation function (PACF) gives the partial correlation of a stationary time series with its own lagged values, regressed the values of the time series at all shorter lags. It contrasts with the autocorrelation function, which does not control for other lags.

In time series analysis, the moving-average model, also known as moving-average process, is a common approach for modeling univariate time series. The moving-average model specifies that the output variable depends linearly on the current and various past values of a stochastic term.

NumXL An econometrics and time series analysis add-in for Microsoft Excel

Numerical Analysis for Excel (NumXL) is an econometrics and time series analysis add-in for Microsoft Excel. Developed by Spider Financial, NumXL provides a wide variety of statistical and time series analysis techniques, including linear and nonlinear time series modeling, statistical tests and others.

The following outline is provided as an overview of and topical guide to regression analysis:

In statistics and econometrics, the ADF-GLS test is a test for a unit root in an economic time series sample. It was developed by Elliott, Rothenberg and Stock (ERS) in 1992 as a modification of the augmented Dickey–Fuller test (ADF).


  1. Box, George; Jenkins, Gwilym (1970). Time Series Analysis: Forecasting and Control. San Francisco: Holden-Day.
  2. Commandeur, J. J. F.; Koopman, S. J. (2007). Introduction to State Space Time Series Analysis. Oxford University Press.
  3. Brockwell, Peter J.; Davis, Richard A. (1991). Time Series: Theory and Methods. Springer-Verlag. p. 273.
  4. Hyndman, Rob J; Athanasopoulos, George. "Forecasting: principles and practice" . Retrieved 18 May 2015.

Further reading

PD-icon.svg This article incorporates  public domain material from the National Institute of Standards and Technology website .