Bayesian structural time series

Last updated

Bayesian structural time series (BSTS) model is a statistical technique used for feature selection, time series forecasting, nowcasting, inferring causal impact and other applications. The model is designed to work with time series data.

Contents

The model has also promising application in the field of analytical marketing. In particular, it can be used in order to assess how much different marketing campaigns have contributed to the change in web search volumes, product sales, brand popularity and other relevant indicators. Difference-in-differences models [1] and interrupted time series designs [2] are alternatives to this approach. "In contrast to classical difference-in-differences schemes, state-space models make it possible to (i) infer the temporal evolution of attributable impact, (ii) incorporate empirical priors on the parameters in a fully Bayesian treatment, and (iii) flexibly accommodate multiple sources of variation, including the time-varying influence of contemporaneous covariates, i.e., synthetic controls." [1]

General model description

The model consists of three main components:

  1. Kalman filter . The technique for time series decomposition. In this step, a researcher can add different state variables: trend, seasonality, regression, and others.
  2. Spike-and-slab method. In this step, the most important regression predictors are selected.
  3. Bayesian model averaging. Combining the results and prediction calculation.

The model could be used to discover the causations with its counterfactual prediction and the observed data. [1]

A possible drawback of the model can be its relatively complicated mathematical underpinning and difficult implementation as a computer program. However, the programming language R has ready-to-use packages for calculating the BSTS model, [3] [4] which do not require strong mathematical background from a researcher.

See also

Related Research Articles

Causality is an influence by which one event, process, state, or object (acause) contributes to the production of another event, process, state, or object (an effect) where the cause is at least partly responsible for the effect, and the effect is at least partly dependent on the cause. The cause of something may also be described as the reason for the event or process.

<span class="mw-page-title-main">Prediction</span> Statement about a future event

A prediction or forecast is a statement about a future event or about future data. Predictions are often, but not always, based upon experience or knowledge of forecasters. There is no universal agreement about the exact difference between "prediction" and "estimation"; different authors and disciplines ascribe different connotations.

In statistics, path analysis is used to describe the directed dependencies among a set of variables. This includes models equivalent to any form of multiple regression analysis, factor analysis, canonical correlation analysis, discriminant analysis, as well as more general families of models in the multivariate analysis of variance and covariance analyses.

In the statistical analysis of time series, autoregressive–moving-average (ARMA) models are a way to describe a (weakly) stationary stochastic process using autoregression (AR) and a moving average (MA), each with a polynomial. They are a tool for understanding a series and predicting future values. AR involves regressing the variable on its own lagged (i.e., past) values. MA involves modeling the error as a linear combination of error terms occurring contemporaneously and at various times in the past. The model is usually denoted ARMA(p, q), where p is the order of AR and q is the order of MA.

<span class="mw-page-title-main">Regression analysis</span> Set of statistical processes for estimating the relationships among variables

In statistical modeling, regression analysis is a set of statistical processes for estimating the relationships between a dependent variable and one or more error-free independent variables.

<span class="mw-page-title-main">Granger causality</span> Statistical hypothesis test for forecasting

The Granger causality test is a statistical hypothesis test for determining whether one time series is useful in forecasting another, first proposed in 1969. Ordinarily, regressions reflect "mere" correlations, but Clive Granger argued that causality in economics could be tested for by measuring the ability to predict the future values of a time series using prior values of another time series. Since the question of "true causality" is deeply philosophical, and because of the post hoc ergo propter hoc fallacy of assuming that one thing preceding another can be used as a proof of causation, econometricians assert that the Granger test finds only "predictive causality". Using the term "causality" alone is a misnomer, as Granger-causality is better described as "precedence", or, as Granger himself later claimed in 1977, "temporally related". Rather than testing whether Xcauses Y, the Granger causality tests whether X forecastsY.

<span class="mw-page-title-main">Structural equation modeling</span> Form of causal modeling that fit networks of constructs to data

Structural equation modeling (SEM) is a diverse set of methods used by scientists for both observational and experimental research. SEM is used mostly in the social and behavioral science fields, but it is also used in epidemiology, business, and other fields. A common definition of SEM is, "...a class of methodologies that seeks to represent hypotheses about the means, variances, and covariances of observed data in terms of a smaller number of 'structural' parameters defined by a hypothesized underlying conceptual or theoretical model,".

<span class="mw-page-title-main">Causal model</span> Conceptual model in philosophy of science

In metaphysics, a causal model is a conceptual model that describes the causal mechanisms of a system. Several types of causal notation may be used in the development of a causal model. Causal models can improve study designs by providing clear rules for deciding which independent variables need to be included/controlled for.

In statistics, multivariate adaptive regression splines (MARS) is a form of regression analysis introduced by Jerome H. Friedman in 1991. It is a non-parametric regression technique and can be seen as an extension of linear models that automatically models nonlinearities and interactions between variables.

In statistics and machine learning, ensemble methods use multiple learning algorithms to obtain better predictive performance than could be obtained from any of the constituent learning algorithms alone. Unlike a statistical ensemble in statistical mechanics, which is usually infinite, a machine learning ensemble consists of only a concrete finite set of alternative models, but typically allows for much more flexible structure to exist among those alternatives.

The following outline is provided as an overview of and topical guide to regression analysis:

Clark N. Glymour is the Alumni University Professor Emeritus in the Department of Philosophy at Carnegie Mellon University. He is also a senior research scientist at the Florida Institute for Human and Machine Cognition.

Causal inference is the process of determining the independent, actual effect of a particular phenomenon that is a component of a larger system. The main difference between causal inference and inference of association is that causal inference analyzes the response of an effect variable when a cause of the effect variable is changed. The study of why things occur is called etiology, and can be described using the language of scientific causal notation. Causal inference is said to provide the evidence of causality theorized by causal reasoning.

In statistics, econometrics, epidemiology, genetics and related disciplines, causal graphs are probabilistic graphical models used to encode assumptions about the data-generating process.

<span class="mw-page-title-main">JASP</span> Free and open-source statistical program

JASP is a free and open-source program for statistical analysis supported by the University of Amsterdam. It is designed to be easy to use, and familiar to users of SPSS. It offers standard analysis procedures in both their classical and Bayesian form. JASP generally produces APA style results tables and plots to ease publication. It promotes open science via integration with the Open Science Framework and reproducibility by integrating the analysis settings into the results. The development of JASP is financially supported by sponsors several universities and research funds. As the JASP GUI is developed in C++ using Qt framework, some of the team left to make a notable fork which is Jamovi which has its GUI developed in JavaScript and HTML5.

Spike-and-slab regression is a type of Bayesian linear regression in which a particular hierarchical prior distribution for the regression coefficients is chosen such that only a subset of the possible regressors is retained. The technique is particularly useful when the number of possible predictors is larger than the number of observations. The idea of the spike-and-slab model was originally proposed by Mitchell & Beauchamp (1988). The approach was further significantly developed by Madigan & Raftery (1994) and George & McCulloch (1997). A recent and important contribution to this literature is Ishwaran & Rao (2005).

The following outline is provided as an overview of, and topical guide to, machine learning:

<i>The Book of Why</i> 2018 book by Judea Pearl and Dana Mackenzie

The Book of Why: The New Science of Cause and Effect is a 2018 nonfiction book by computer scientist Judea Pearl and writer Dana Mackenzie. The book explores the subject of causality and causal inference from statistical and philosophical points of view for a general audience.

References

  1. 1 2 3 "Inferring causal impact using Bayesian structural time-series models". research.google.com. Retrieved 2016-04-17.
  2. "Interrupted Time-Series Design". Interrupted Time-Series Design. Insights Association. Retrieved 21 March 2019.{{cite web}}: CS1 maint: url-status (link)
  3. "bsts" (PDF).
  4. "CausalImpact". google.github.io. Retrieved 2016-04-17.

Further reading