Stan (software)

Stan
Original author(s) Stan Development Team
Initial release August 30, 2012 (2012-08-30)
Stable release 2.34.1 [1] / 23 January 2024
Repository
Written in C++
Operating system Unix-like, Microsoft Windows, Mac OS X
Platform Intel x86 - 32-bit, x64
Type Statistical package
License New BSD License
Website mc-stan.org

Stan is a probabilistic programming language for statistical inference written in C++. [2] The Stan language is used to specify a (Bayesian) statistical model with an imperative program calculating the log probability density function. [2]
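The idea of an imperative program that calculates a log probability density can be illustrated outside of Stan. The following is a minimal Python sketch, not Stan code and not Stan's implementation: the model (a normal likelihood with a normal prior on the mean), the prior scale, and the function names `normal_lpdf` and `log_density` are all assumptions chosen for illustration.

```python
import math

def normal_lpdf(y, mu, sigma):
    """Log density of a Normal(mu, sigma) distribution evaluated at y."""
    z = (y - mu) / sigma
    return -0.5 * math.log(2.0 * math.pi) - math.log(sigma) - 0.5 * z * z

def log_density(mu, sigma, data):
    """Log posterior density (up to a constant) of a hypothetical model.

    Assumed model, for illustration only:
        mu   ~ Normal(0, 10)
        y[i] ~ Normal(mu, sigma), with a flat prior on sigma > 0
    """
    if sigma <= 0.0:
        return float("-inf")  # outside the support of sigma
    lp = 0.0
    lp += normal_lpdf(mu, 0.0, 10.0)     # contribution of the prior on mu
    for y in data:
        lp += normal_lpdf(y, mu, sigma)  # contribution of each observation
    return lp

data = [1.2, 0.7, 1.9]
print(log_density(1.0, 1.0, data))
```

Each statement adds a term to a running total, so the program as a whole defines a function from parameter values to a log density. Inference algorithms then only need to evaluate this function and, for gradient-based methods, its derivatives.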

Stan is licensed under the New BSD License. Stan is named in honour of Stanislaw Ulam, pioneer of the Monte Carlo method. [2]

Stan was created by a development team consisting of 34 members [3] that includes Andrew Gelman, Bob Carpenter, Matt Hoffman, and Daniel Lee.

Interfaces

The Stan language itself can be accessed through several interfaces:

In addition, higher-level interfaces are provided with packages using Stan as backend, primarily in the R language: [4]

Algorithms

Stan implements gradient-based Markov chain Monte Carlo (MCMC) algorithms for Bayesian inference, stochastic, gradient-based variational Bayesian methods for approximate Bayesian inference, and gradient-based optimization for penalized maximum likelihood estimation.
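To show what "gradient-based" means for MCMC, here is a minimal sketch of plain Hamiltonian Monte Carlo in Python for a one-dimensional standard normal target. It is an illustration under stated assumptions, not Stan's sampler: the step size and number of leapfrog steps are fixed by hand, the gradient is written out analytically, and all function names are hypothetical.

```python
import math
import random

def log_density(q):
    """Log density of the target, here a standard normal (up to a constant)."""
    return -0.5 * q * q

def grad_log_density(q):
    """Gradient of the log density, written by hand for this simple target."""
    return -q

def leapfrog(q, p, step_size, n_steps):
    """Simulate Hamiltonian dynamics with the leapfrog integrator."""
    p = p + 0.5 * step_size * grad_log_density(q)      # initial half step for momentum
    for i in range(n_steps):
        q = q + step_size * p                          # full step for position
        if i < n_steps - 1:
            p = p + step_size * grad_log_density(q)    # full step for momentum
    p = p + 0.5 * step_size * grad_log_density(q)      # final half step for momentum
    return q, p

def hmc_step(q, rng, step_size=0.1, n_steps=20):
    """One HMC transition: draw a momentum, integrate, then accept or reject."""
    p0 = rng.gauss(0.0, 1.0)
    q_new, p_new = leapfrog(q, p0, step_size, n_steps)
    h_old = -log_density(q) + 0.5 * p0 * p0
    h_new = -log_density(q_new) + 0.5 * p_new * p_new
    accept_prob = math.exp(min(0.0, h_old - h_new))    # Metropolis correction
    return q_new if rng.random() < accept_prob else q

def sample(n_draws, seed=0):
    """Run the chain from q = 0 and return the list of draws."""
    rng = random.Random(seed)
    q, draws = 0.0, []
    for _ in range(n_draws):
        q = hmc_step(q, rng)
        draws.append(q)
    return draws

draws = sample(2000, seed=1)
print(sum(draws) / len(draws))
```

The gradient steers each proposal along a trajectory of the simulated dynamics, which is why the sampler needs derivatives of the log density and not just its values.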

Automatic differentiation

Stan implements reverse-mode automatic differentiation to calculate gradients of the model, which are required by HMC, NUTS, L-BFGS, BFGS, and variational inference. [2] Stan's automatic differentiation can also be used outside of the probabilistic programming language.
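Reverse-mode automatic differentiation can be sketched in a few lines of Python. The toy implementation below is an assumption-laden illustration and is unrelated to Stan's C++ library: the `Var` class, the `exp` helper, and `backward` are hypothetical names, and only addition, subtraction, multiplication, and the exponential are supported. Each operation records its inputs together with the local partial derivatives, and a single backward sweep then accumulates the gradient of the output with respect to every input.

```python
import math

class Var:
    """A value that remembers how it was computed."""

    def __init__(self, value, parents=()):
        self.value = value
        self.grad = 0.0
        self.parents = parents  # tuple of (input Var, local partial derivative)

    def __add__(self, other):
        other = _lift(other)
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

    __rmul__ = __mul__

    def __neg__(self):
        return self * -1.0

    def __sub__(self, other):
        return self + (-_lift(other))

def _lift(x):
    """Wrap plain numbers so they can take part in the recorded graph."""
    return x if isinstance(x, Var) else Var(float(x))

def exp(x):
    v = math.exp(x.value)
    return Var(v, ((x, v),))  # d/dx exp(x) = exp(x)

def backward(output):
    """Propagate derivatives from the output back to all inputs."""
    order, seen = [], set()

    def visit(node):
        if id(node) in seen:
            return
        seen.add(id(node))
        for parent, _ in node.parents:
            visit(parent)
        order.append(node)  # inputs are appended before the nodes that use them

    visit(output)
    output.grad = 1.0
    for node in reversed(order):  # output first, inputs last
        for parent, partial in node.parents:
            parent.grad += partial * node.grad  # chain rule

# Gradient of -0.5 * (mu - 3)^2 with respect to mu, evaluated at mu = 1.
mu = Var(1.0)
d = mu - 3.0
lp = d * d * -0.5
backward(lp)
print(mu.grad)  # 2.0
```

One backward sweep yields the derivative with respect to every input at once, which is what makes the reverse mode attractive when a scalar log density depends on many parameters.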

Usage

Stan is used in fields including social science, [8] pharmaceutical statistics, [9] market research, [10] and medical imaging. [11]


References

  1. "Release 2.34.1". 23 January 2024. Retrieved 20 February 2024.
  2. Stan Development Team (2015). Stan Modeling Language User's Guide and Reference Manual, Version 2.9.0.
  3. "Development Team". stan-dev.github.io. Retrieved 2018-07-25.
  4. Gabry, Jonah. "The current state of the Stan ecosystem in R". Statistical Modeling, Causal Inference, and Social Science. Retrieved 25 August 2020.
  5. "BRMS: Bayesian Regression Models using 'Stan'". 23 August 2021.
  6. Hoffman, Matthew D.; Gelman, Andrew (April 2014). "The No-U-Turn Sampler: Adaptively Setting Path Lengths in Hamiltonian Monte Carlo". Journal of Machine Learning Research. 15: 1593–1623.
  7. Kucukelbir, Alp; Ranganath, Rajesh; Blei, David M. (June 2015). "Automatic Variational Inference in Stan". arXiv:1506.03431. Bibcode:2015arXiv150603431K.
  8. Goodrich, Benjamin King; Wawro, Gregory; Katznelson, Ira (2012). "Designing Quantitative Historical Social Inquiry: An Introduction to Stan". APSA 2012 Annual Meeting Paper. SSRN 2105531.
  9. Natanegara, Fanni; Neuenschwander, Beat; Seaman, John W.; Kinnersley, Nelson; Heilmann, Cory R.; Ohlssen, David; Rochester, George (2013). "The current state of Bayesian methods in medical product development: survey results and recommendations from the DIA Bayesian Scientific Working Group". Pharmaceutical Statistics. 13 (1): 3–12. doi:10.1002/pst.1595. ISSN 1539-1612. PMID 24027093. S2CID 19738522.
  10. Feit, Elea (15 May 2017). "Using Stan to Estimate Hierarchical Bayes Models". Retrieved 19 March 2019.
  11. Gordon, GSD; Joseph, J; Alcolea, MP; Sawyer, T; Macfaden, AJ; Williams, C; Fitzpatrick, CRM; Jones, PH; di Pietro, M; Fitzgerald, RC; Wilkinson, TD; Bohndiek, SE (2019). "Quantitative phase and polarization imaging through an optical fiber applied to detection of early esophageal tumorigenesis". Journal of Biomedical Optics. 24 (12): 1–13. arXiv:1811.03977. Bibcode:2019JBO....24l6004G. doi:10.1117/1.JBO.24.12.126004. PMC 7006047. PMID 31840442.
