Moving-average model

Last updated May 06, 2024

In time series analysis, the moving-average model (MA model), also known as moving-average process, is a common approach for modeling univariate time series.^[1]^[2] The moving-average model specifies that the output variable is cross-correlated with a non-identical to itself random-variable.

Definition

The notation MA(q) refers to the moving average model of order q:

X_{t}=\mu +\varepsilon _{t}+\theta _{1}\varepsilon _{t-1}+\cdots +\theta _{q}\varepsilon _{t-q}=\mu +\sum _{i=1}^{q}\theta _{i}\varepsilon _{t-i}+\varepsilon _{t},

where $\mu$ is the mean of the series, the $\theta _{1},...,\theta _{q}$ are the coefficients of the model^{[ example needed ]} and $\varepsilon _{t},\varepsilon _{t-1},...,\varepsilon _{t-q}$ are the error terms. The value of q is called the order of the MA model. This can be equivalently written in terms of the backshift operator B as^[4]

X_{t}=\mu +(1+\theta _{1}B+\cdots +\theta _{q}B^{q})\varepsilon _{t}.

Thus, a moving-average model is conceptually a linear regression of the current value of the series against current and previous (observed) white noise error terms or random shocks. The random shocks at each point are assumed to be mutually independent and to come from the same distribution, typically a normal distribution, with location at zero and constant scale.

Interpretation

The moving-average model is essentially a finite impulse response filter applied to white noise, with some additional interpretation placed on it.^{[ clarification needed ]} The role of the random shocks in the MA model differs from their role in the autoregressive (AR) model in two ways. First, they are propagated to future values of the time series directly: for example, $\varepsilon _{t-1}$ appears directly on the right side of the equation for $X_{t}$ . In contrast, in an AR model $\varepsilon _{t-1}$ does not appear on the right side of the $X_{t}$ equation, but it does appear on the right side of the $X_{t-1}$ equation, and $X_{t-1}$ appears on the right side of the $X_{t}$ equation, giving only an indirect effect of $\varepsilon _{t-1}$ on $X_{t}$ . Second, in the MA model a shock affects $X$ values only for the current period and q periods into the future; in contrast, in the AR model a shock affects $X$ values infinitely far into the future, because $\varepsilon _{t}$ affects $X_{t}$ , which affects $X_{t+1}$ , which affects $X_{t+2}$ , and so on forever (see Impulse response).

Fitting the model

Fitting a moving-average model is generally more complicated than fitting an autoregressive model.^[5] This is because the lagged error terms are not observable. This means that iterative non-linear fitting procedures need to be used in place of linear least squares. Moving average models are linear combinations of past white noise terms, while autoregressive models are linear combinations of past time series values.^[6] ARMA models are more complicated than pure AR and MA models, as they combine both autoregressive and moving average components.^[5]

The autocorrelation function (ACF) of an MA(q) process is zero at lag q + 1 and greater. Therefore, we determine the appropriate maximum lag for the estimation by examining the sample autocorrelation function to see where it becomes insignificantly different from zero for all lags beyond a certain lag, which is designated as the maximum lag q.

Sometimes the ACF and partial autocorrelation function (PACF) will suggest that an MA model would be a better model choice and sometimes both AR and MA terms should be used in the same model (see Box–Jenkins method).

Autoregressive Integrated Moving Average (ARIMA) models are an alternative to segmented regression that can also be used for fitting a moving-average model.^[7]

Related Research Articles

Autocorrelation, sometimes known as serial correlation in the discrete time case, is the correlation of a signal with a delayed copy of itself as a function of delay. Informally, it is the similarity between observations of a random variable as a function of the time lag between them. The analysis of autocorrelation is a mathematical tool for finding repeating patterns, such as the presence of a periodic signal obscured by noise, or identifying the missing fundamental frequency in a signal implied by its harmonic frequencies. It is often used in signal processing for analyzing functions or series of values, such as time domain signals.

In statistics, the term linear model refers to any model which assumes linearity in the system. The most common occurrence is in connection with regression models and the term is often taken as synonymous with linear regression model. However, the term is also used in time series analysis with a different meaning. In each case, the designation "linear" is used to identify a subclass of models for which substantial reduction in the complexity of the related statistical theory is possible.

In econometrics, the autoregressive conditional heteroskedasticity (ARCH) model is a statistical model for time series data that describes the variance of the current error term or innovation as a function of the actual sizes of the previous time periods' error terms; often the variance is related to the squares of the previous innovations. The ARCH model is appropriate when the error variance in a time series follows an autoregressive (AR) model; if an autoregressive moving average (ARMA) model is assumed for the error variance, the model is a generalized autoregressive conditional heteroskedasticity (GARCH) model.

In the statistical analysis of time series, autoregressive–moving-average (ARMA) models provide a parsimonious description of a (weakly) stationary stochastic process in terms of two polynomials, one for the autoregression (AR) and the second for the moving average (MA). The general ARMA model was described in the 1951 thesis of Peter Whittle, Hypothesis testing in time series analysis, and it was popularized in the 1970 book by George E. P. Box and Gwilym Jenkins.

In statistics, econometrics, and signal processing, an autoregressive (AR) model is a representation of a type of random process; as such, it is used to describe certain time-varying processes in nature, economics, behavior, etc. The autoregressive model specifies that the output variable depends linearly on its own previous values and on a stochastic term ; thus the model is in the form of a stochastic difference equation which should not be confused with a differential equation. Together with the moving-average (MA) model, it is a special case and key component of the more general autoregressive–moving-average (ARMA) and autoregressive integrated moving average (ARIMA) models of time series, which have a more complicated stochastic structure; it is also a special case of the vector autoregressive model (VAR), which consists of a system of more than one interlocking stochastic difference equation in more than one evolving random variable.

In time series analysis, the lag operator (L) or backshift operator (B) operates on an element of a time series to produce the previous element. For example, given some time series

In statistics and econometrics, and in particular in time series analysis, an autoregressive integrated moving average (ARIMA) model is a generalization of an autoregressive moving average (ARMA) model. To better comprehend the data or to forecast upcoming series points, both of these models are fitted to time series data. ARIMA models are applied in some cases where data show evidence of non-stationarity in the sense of mean, where an initial differencing step can be applied one or more times to eliminate the non-stationarity of the mean function. When the seasonality shows in a time series, the seasonal-differencing could be applied to eliminate the seasonal component. Since the ARMA model, according to the Wold's decomposition theorem, is theoretically sufficient to describe a regular wide-sense stationary time series, we are motivated to make stationary a non-stationary time series, e.g., by using differencing, before we can use the ARMA model. Note that if the time series contains a predictable sub-process, the predictable component is treated as a non-zero-mean but periodic component in the ARIMA framework so that it is eliminated by the seasonal differencing.

In probability theory and statistics, a unit root is a feature of some stochastic processes that can cause problems in statistical inference involving time series models. A linear stochastic process has a unit root if 1 is a root of the process's characteristic equation. Such a process is non-stationary but does not always have a trend.

In time series analysis, the Box–Jenkins method, named after the statisticians George Box and Gwilym Jenkins, applies autoregressive moving average (ARMA) or autoregressive integrated moving average (ARIMA) models to find the best fit of a time-series model to past values of a time series.

In applied mathematics, the Wiener–Khinchin theorem or Wiener–Khintchine theorem, also known as the Wiener–Khinchin–Einstein theorem or the Khinchin–Kolmogorov theorem, states that the autocorrelation function of a wide-sense-stationary random process has a spectral decomposition given by the power spectral density of that process.

In statistics, Wold's decomposition or the Wold representation theorem, named after Herman Wold, says that every covariance-stationary time series $can be written as the sum of two time series, one deterministic and one stochastic .$

<span class="mw-page-title-main">Correlogram</span> Image of correlation statistics

In the analysis of data, a correlogram is a chart of correlation statistics. For example, in time series analysis, a plot of the sample autocorrelations $versus is an autocorrelogram . If cross-correlation is plotted, the result is called a cross-correlogram .$

The Ljung–Box test is a type of statistical test of whether any of a group of autocorrelations of a time series are different from zero. Instead of testing randomness at each distinct lag, it tests the "overall" randomness based on a number of lags, and is therefore a portmanteau test.

In statistics, autoregressive fractionally integrated moving average models are time series models that generalize ARIMA (autoregressive integrated moving average) models by allowing non-integer values of the differencing parameter. These models are useful in modeling time series with long memory—that is, in which deviations from the long-run mean decay more slowly than an exponential decay. The acronyms "ARFIMA" or "FARIMA" are often used, although it is also conventional to simply extend the "ARIMA(p, d, q)" notation for models, by simply allowing the order of differencing, d, to take fractional values. Fractional differencing and the ARFIMA model were introduced in the early 1980s by Clive Granger, Roselyne Joyeux, and Jonathan Hosking.

In statistical signal processing, the goal of spectral density estimation (SDE) or simply spectral estimation is to estimate the spectral density of a signal from a sequence of time samples of the signal. Intuitively speaking, the spectral density characterizes the frequency content of the signal. One purpose of estimating the spectral density is to detect any periodicities in the data, by observing peaks at the frequencies corresponding to these periodicities.

In time series analysis, the partial autocorrelation function (PACF) gives the partial correlation of a stationary time series with its own lagged values, regressed the values of the time series at all shorter lags. It contrasts with the autocorrelation function, which does not control for other lags.

In statistics, the Breusch–Godfrey test is used to assess the validity of some of the modelling assumptions inherent in applying regression-like models to observed data series. In particular, it tests for the presence of serial correlation that has not been included in a proposed model structure and which, if present, would mean that incorrect conclusions would be drawn from other tests or that sub-optimal estimates of model parameters would be obtained.

In statistics, identifiability is a property which a model must satisfy for precise inference to be possible. A model is identifiable if it is theoretically possible to learn the true values of this model's underlying parameters after obtaining an infinite number of observations from it. Mathematically, this is equivalent to saying that different values of the parameters must generate different probability distributions of the observable variables. Usually the model is identifiable only under certain technical restrictions, in which case the set of these requirements is called the identification conditions.

In computer networks, self-similarity is a feature of network data transfer dynamics. When modeling network data dynamics the traditional time series models, such as an autoregressive moving average model are not appropriate. This is because these models only provide a finite number of parameters in the model and thus interaction in a finite time window, but the network data usually have a long-range dependent temporal structure. A self-similar process is one way of modeling network data dynamics with such a long range correlation. This article defines and describes network data transfer dynamics in the context of a self-similar process. Properties of the process are shown and methods are given for graphing and estimating parameters modeling the self-similarity of network data.

Generalized filtering is a generic Bayesian filtering scheme for nonlinear state-space models. It is based on a variational principle of least action, formulated in generalized coordinates of motion. Note that "generalized coordinates of motion" are related to—but distinct from—generalized coordinates as used in (multibody) dynamical systems analysis. Generalized filtering furnishes posterior densities over hidden states generating observed data using a generalized gradient descent on variational free energy, under the Laplace assumption. Unlike classical filtering, generalized filtering eschews Markovian assumptions about random fluctuations. Furthermore, it operates online, assimilating data to approximate the posterior density over unknown quantities, without the need for a backward pass. Special cases include variational filtering, dynamic expectation maximization and generalized predictive coding.

References

1 2 Shumway, Robert H.; Stoffer, David S. (19 April 2017). Time series analysis and its applications : with R examples. Springer. ISBN 978-3-319-52451-1. OCLC 966563984.
↑ "2.1 Moving Average Models (MA models) | STAT 510". PennState: Statistics Online Courses. Retrieved 2023-02-27.
↑ Shumway, Robert H.; Stoffer, David S. (2019-05-17), "ARIMA Models", Time Series: A Data Analysis Approach Using R, Boca Raton : CRC Press, Taylor & Francis Group, 2019.: Chapman and Hall/CRC, pp. 99–128, doi:10.1201/9780429273285-5, ISBN 978-0-429-27328-5 , retrieved 2023-02-27{{citation}}: CS1 maint: location (link)
↑ Box, George E. P.; Jenkins, Gwilym M.; Reinsel, Gregory C.; Ljung, Greta M. (2016). Time series analysis : forecasting and control (5th ed.). Hoboken, New Jersey: John Wiley & Sons, Incorporated. p. 53. ISBN 978-1-118-67492-5. OCLC 908107438.
1 2 "Autoregressive Moving Average ARMA(p, q) Models for Time Series Analysis - Part 1 | QuantStart". www.quantstart.com. Retrieved 2023-02-27.
↑ "Autoregressive Moving Average ARMA(p, q) Models for Time Series Analysis - Part 2 | QuantStart". www.quantstart.com. Retrieved 2023-02-27.
↑ Schaffer, Andrea L.; Dobbins, Timothy A.; Pearson, Sallie-Anne (2021-03-22). "Interrupted time series analysis using autoregressive integrated moving average (ARIMA) models: a guide for evaluating large-scale health interventions". BMC Medical Research Methodology. 21 (1): 58. doi: 10.1186/s12874-021-01235-8 . ISSN 1471-2288. PMC 7986567 . PMID 33752604.

External links

Common approaches to univariate time series

This article incorporates public domain material from the National Institute of Standards and Technology

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

[:0-1] 1 2 Shumway, Robert H.; Stoffer, David S. (19 April 2017). Time series analysis and its applications : with R examples. Springer. ISBN 978-3-319-52451-1. OCLC 966563984.

[2] "2.1 Moving Average Models (MA models) | STAT 510". PennState: Statistics Online Courses. Retrieved 2023-02-27.

[3] Shumway, Robert H.; Stoffer, David S. (2019-05-17), "ARIMA Models", Time Series: A Data Analysis Approach Using R, Boca Raton : CRC Press, Taylor & Francis Group, 2019.: Chapman and Hall/CRC, pp. 99–128, doi:10.1201/9780429273285-5, ISBN 978-0-429-27328-5 , retrieved 2023-02-27{{citation}}: CS1 maint: location (link)

[4] Box, George E. P.; Jenkins, Gwilym M.; Reinsel, Gregory C.; Ljung, Greta M. (2016). Time series analysis : forecasting and control (5th ed.). Hoboken, New Jersey: John Wiley & Sons, Incorporated. p. 53. ISBN 978-1-118-67492-5. OCLC 908107438.

[:1-5] 1 2 "Autoregressive Moving Average ARMA(p, q) Models for Time Series Analysis - Part 1 | QuantStart". www.quantstart.com. Retrieved 2023-02-27.

[6] "Autoregressive Moving Average ARMA(p, q) Models for Time Series Analysis - Part 2 | QuantStart". www.quantstart.com. Retrieved 2023-02-27.

[7] Schaffer, Andrea L.; Dobbins, Timothy A.; Pearson, Sallie-Anne (2021-03-22). "Interrupted time series analysis using autoregressive integrated moving average (ARIMA) models: a guide for evaluating large-scale health interventions". BMC Medical Research Methodology. 21 (1): 58. doi: 10.1186/s12874-021-01235-8 . ISSN 1471-2288. PMC 7986567 . PMID 33752604.

[1]

[2]

[3]

[4]

[5]

[6]

[7]

v t e Stochastic processes
Discrete time	Bernoulli process Branching process Chinese restaurant process Galton–Watson process Independent and identically distributed random variables Markov chain Moran process Random walk Loop-erased Self-avoiding Biased Maximal entropy
Continuous time	Additive process Bessel process Birth–death process pure birth Brownian motion Bridge Excursion Fractional Geometric Meander Cauchy process Contact process Continuous-time random walk Cox process Diffusion process Dyson Brownian motion Empirical process Feller process Fleming–Viot process Gamma process Geometric process Hawkes process Hunt process Interacting particle systems Itô diffusion Itô process Jump diffusion Jump process Lévy process Local time Markov additive process McKean–Vlasov process Ornstein–Uhlenbeck process Poisson process Compound Non-homogeneous Schramm–Loewner evolution Semimartingale Sigma-martingale Stable process Superprocess Telegraph process Variance gamma process Wiener process Wiener sausage
Both	Branching process Galves–Löcherbach model Gaussian process Hidden Markov model (HMM) Markov process Martingale Differences Local Sub- Super- Random dynamical system Regenerative process Renewal process Stochastic chains with memory of variable length White noise
Fields and other	Dirichlet process Gaussian random field Gibbs measure Hopfield model Ising model Potts model Boolean network Markov random field Percolation Pitman–Yor process Point process Cox Poisson Random field Random graph
Time series models	Autoregressive conditional heteroskedasticity (ARCH) model Autoregressive integrated moving average (ARIMA) model Autoregressive (AR) model Autoregressive–moving-average (ARMA) model Generalized autoregressive conditional heteroskedasticity (GARCH) model Moving-average (MA) model
Financial models	Binomial options pricing model Black–Derman–Toy Black–Karasinski Black–Scholes Chan–Karolyi–Longstaff–Sanders (CKLS) Chen Constant elasticity of variance (CEV) Cox–Ingersoll–Ross (CIR) Garman–Kohlhagen Heath–Jarrow–Morton (HJM) Heston Ho–Lee Hull–White Korn-Kreer-Lenssen LIBOR market Rendleman–Bartter SABR volatility Vašíček Wilkie
Actuarial models	Bühlmann Cramér–Lundberg Risk process Sparre–Anderson
Queueing models	Bulk Fluid Generalized queueing network M/G/1 M/M/1 M/M/c
Properties	Càdlàg paths Continuous Continuous paths Ergodic Exchangeable Feller-continuous Gauss–Markov Markov Mixing Piecewise-deterministic Predictable Progressively measurable Self-similar Stationary Time-reversible
Limit theorems	Central limit theorem Donsker's theorem Doob's martingale convergence theorems Ergodic theorem Fisher–Tippett–Gnedenko theorem Large deviation principle Law of large numbers (weak/strong) Law of the iterated logarithm Maximal ergodic theorem Sanov's theorem Zero–one laws (Blumenthal, Borel–Cantelli, Engelbert–Schmidt, Hewitt–Savage, Kolmogorov, Lévy)
Inequalities	Burkholder–Davis–Gundy Doob's martingale Doob's upcrossing Kunita–Watanabe Marcinkiewicz–Zygmund
Tools	Cameron–Martin formula Convergence of random variables Doléans-Dade exponential Doob decomposition theorem Doob–Meyer decomposition theorem Doob's optional stopping theorem Dynkin's formula Feynman–Kac formula Filtration Girsanov theorem Infinitesimal generator Itô integral Itô's lemma Karhunen–Loève theorem Kolmogorov continuity theorem Kolmogorov extension theorem Lévy–Prokhorov metric Malliavin calculus Martingale representation theorem Optional stopping theorem Prokhorov's theorem Quadratic variation Reflection principle Skorokhod integral Skorokhod's representation theorem Skorokhod space Snell envelope Stochastic differential equation Tanaka Stopping time Stratonovich integral Uniform integrability Usual hypotheses Wiener space Classical Abstract
Disciplines	Actuarial mathematics Control theory Econometrics Ergodic theory Extreme value theory (EVT) Large deviations theory Mathematical finance Mathematical statistics Probability theory Queueing theory Renewal theory Ruin theory Signal processing Statistics Stochastic analysis Time series analysis Machine learning
List of topics Category