Just another Gibbs sampler

Last updated
JAGS
Original author(s) Martyn Plummer
Initial releaseDecember 11, 2007 (2007-12-11)
Stable release
4.3.2 / March 4, 2023 (2023-03-04)
Repository
Written in C++
Operating system Unix-like, Microsoft Windows, Mac OS X
Platform Intel x86 - 32-bit, x64
Size 1.7 MB
Type Statistical package
License GNU General Public License
Website mcmc-jags.sourceforge.net

Just another Gibbs sampler (JAGS) is a program for simulation from Bayesian hierarchical models using Markov chain Monte Carlo (MCMC), developed by Martyn Plummer. JAGS has been employed for statistical work in many fields, for example ecology, management, and genetics. [1] [2] [3]

Contents

JAGS aims for compatibility with WinBUGS/OpenBUGS through the use of a dialect of the same modeling language (informally, BUGS), but it provides no GUI for model building and MCMC sample postprocessing, which must therefore be treated in a separate program (for example calling JAGS from R through a library such as rjags and post-processing MCMC output in R). [4]

The main advantage of JAGS in comparison to the members of the original BUGS family (WinBUGS and OpenBUGS) is its platform independence. It is written in C++, while the BUGS family is written in Component Pascal, a less widely known programming language. [5] [6] In addition, JAGS is already part of many repositories of Linux distributions such as Ubuntu. It can also be compiled as a 64-bit application on 64-bit platforms, thus making all the addressable space available to BUGS models.

JAGS can be used via the command line or run in batch mode through script files. This means that there is no need to redo the settings with every run and that the program can be called and controlled from within another program (e.g. from R via rjags as outlined above).

JAGS is licensed under the GNU General Public License.

See also

Related Research Articles

Monte Carlo methods, or Monte Carlo experiments, are a broad class of computational algorithms that rely on repeated random sampling to obtain numerical results. The underlying concept is to use randomness to solve problems that might be deterministic in principle. They are often used in physical and mathematical problems and are most useful when it is difficult or impossible to use other approaches. Monte Carlo methods are mainly used in three problem classes: optimization, numerical integration, and generating draws from a probability distribution.

<span class="mw-page-title-main">Metropolis–Hastings algorithm</span> Monte Carlo algorithm

In statistics and statistical physics, the Metropolis–Hastings algorithm is a Markov chain Monte Carlo (MCMC) method for obtaining a sequence of random samples from a probability distribution from which direct sampling is difficult. This sequence can be used to approximate the distribution or to compute an integral. Metropolis–Hastings and other MCMC algorithms are generally used for sampling from multi-dimensional distributions, especially when the number of dimensions is high. For single-dimensional distributions, there are usually other methods that can directly return independent samples from the distribution, and these are free from the problem of autocorrelated samples that is inherent in MCMC methods.

A hidden Markov model (HMM) is a statistical Markov model in which the system being modeled is assumed to be a Markov process — call it — with unobservable ("hidden") states. As part of the definition, HMM requires that there be an observable process whose outcomes are "influenced" by the outcomes of in a known way. Since cannot be observed directly, the goal is to learn about by observing HMM has an additional requirement that the outcome of at time must be "influenced" exclusively by the outcome of at and that the outcomes of and at must be conditionally independent of at given at time

A Bayesian network is a probabilistic graphical model that represents a set of variables and their conditional dependencies via a directed acyclic graph (DAG). Bayesian networks are ideal for taking an event that occurred and predicting the likelihood that any one of several possible known causes was the contributing factor. For example, a Bayesian network could represent the probabilistic relationships between diseases and symptoms. Given symptoms, the network can be used to compute the probabilities of the presence of various diseases.

In statistics, Markov chain Monte Carlo (MCMC) methods comprise a class of algorithms for sampling from a probability distribution. By constructing a Markov chain that has the desired distribution as its equilibrium distribution, one can obtain a sample of the desired distribution by recording states from the chain. The more steps that are included, the more closely the distribution of the sample matches the actual desired distribution. Various algorithms exist for constructing chains, including the Metropolis–Hastings algorithm.

Bayesian statistics is a theory in the field of statistics based on the Bayesian interpretation of probability where probability expresses a degree of belief in an event. The degree of belief may be based on prior knowledge about the event, such as the results of previous experiments, or on personal beliefs about the event. This differs from a number of other interpretations of probability, such as the frequentist interpretation that views probability as the limit of the relative frequency of an event after many trials.

In statistics, Gibbs sampling or a Gibbs sampler is a Markov chain Monte Carlo (MCMC) algorithm for obtaining a sequence of observations which are approximated from a specified multivariate probability distribution, when direct sampling is difficult. This sequence can be used to approximate the joint distribution ; to approximate the marginal distribution of one of the variables, or some subset of the variables ; or to compute an integral. Typically, some of the variables correspond to observations whose values are known, and hence do not need to be sampled.

OpenBUGS is a software application for the Bayesian analysis of complex statistical models using Markov chain Monte Carlo (MCMC) methods. OpenBUGS is the open source variant of WinBUGS. It runs under Microsoft Windows and Linux, as well as from inside the R statistical package. Versions from v3.0.7 onwards have been designed to be at least as efficient and reliable as WinBUGS over a range of test applications.

Multilevel models are statistical models of parameters that vary at more than one level. An example could be a model of student performance that contains measures for individual students as well as measures for classrooms within which the students are grouped. These models can be seen as generalizations of linear models, although they can also extend to non-linear models. These models became much more popular after sufficient computing power and software became available.

Bayesian inference of phylogeny combines the information in the prior and in the data likelihood to create the so-called posterior probability of trees, which is the probability that the tree is correct given the data, the prior and the likelihood model. Bayesian inference was introduced into molecular phylogenetics in the 1990s by three independent groups: Bruce Rannala and Ziheng Yang in Berkeley, Bob Mau in Madison, and Shuying Li in University of Iowa, the last two being PhD students at the time. The approach has become very popular since the release of the MrBayes software in 2001, and is now one of the most popular methods in molecular phylogenetics.

<span class="mw-page-title-main">WinBUGS</span> Statistical software for Bayesian analysis

WinBUGS is statistical software for Bayesian analysis using Markov chain Monte Carlo (MCMC) methods.

Probabilistic programming (PP) is a programming paradigm in which probabilistic models are specified and inference for these models is performed automatically. It represents an attempt to unify probabilistic modeling and traditional general purpose programming in order to make the former easier and more widely applicable. It can be used to create systems that help make decisions in the face of uncertainty.

Bayesian inference using Gibbs sampling (BUGS) is a statistical software for performing Bayesian inference using Markov chain Monte Carlo (MCMC) methods. It was developed by David Spiegelhalter at the Medical Research Council Biostatistics Unit in Cambridge in 1989 and released as free software in 1991.

<span class="mw-page-title-main">Stan (software)</span> Probabilistic programming language for Bayesian inference

Stan is a probabilistic programming language for statistical inference written in C++. The Stan language is used to specify a (Bayesian) statistical model with an imperative program calculating the log probability density function.

<span class="mw-page-title-main">JASP</span> Free and open-source statistical program

JASP is a free and open-source program for statistical analysis supported by the University of Amsterdam. It is designed to be easy to use, and familiar to users of SPSS. It offers standard analysis procedures in both their classical and Bayesian form. JASP generally produces APA style results tables and plots to ease publication. It promotes open science via integration with the Open Science Framework and reproducibility by integrating the analysis settings into the results. The development of JASP is financially supported by several universities and research funds.

PyMC is a Python package for Bayesian statistical modeling and probabilistic machine learning which focuses on advanced Markov chain Monte Carlo and variational fitting algorithms. It is a rewrite from scratch of the previous version of the PyMC software. Unlike PyMC2, which had used Fortran extensions for performing computations, PyMC relies on PyTensor, a Python library that allows defining, optimizing, and efficiently evaluating mathematical expressions involving multi-dimensional arrays. From version 3.8 PyMC relies on ArviZ to handle plotting, diagnostics, and statistical checks. PyMC and Stan are the two most popular probabilistic programming tools. PyMC is an open source project, developed by the community and fiscally sponsored by NumFOCUS.

<span class="mw-page-title-main">Outline of machine learning</span> Overview of and topical guide to machine learning

The following outline is provided as an overview of and topical guide to machine learning. Machine learning is a subfield of soft computing within computer science that evolved from the study of pattern recognition and computational learning theory in artificial intelligence. In 1959, Arthur Samuel defined machine learning as a "field of study that gives computers the ability to learn without being explicitly programmed". Machine learning explores the study and construction of algorithms that can learn from and make predictions on data. Such algorithms operate by building a model from an example training set of input observations in order to make data-driven predictions or decisions expressed as outputs, rather than following strictly static program instructions.

BEAST 2 is a cross-platform program for Bayesian analysis of molecular sequences. It estimates rooted, timed phylogenies using a range of substitution and clock models, and a variety of tree priors. There is an associated tool, called BEAUTi, for setting up standard analyses. BEAST 2 is a complete re-write of the earlier BEAST program and as such draws on a large body of work. A notable feature of BEAST 2 is the packaging system which has simplified the process of implementing novel models.

References

  1. Semmens, B. X.; Ward, E. J.; Moore, J. W.; Darimont, C. T. (2009). Getz, Wayne M (ed.). "Quantifying Inter- and Intra-Population Niche Variability Using Hierarchical Bayesian Stable Isotope Mixing Models". PLOS ONE. 4 (7): e6187. Bibcode:2009PLoSO...4.6187S. doi: 10.1371/journal.pone.0006187 . PMC   2704373 . PMID   19587790.
  2. Johnson, T. R.; Kuhn, K. M. (2013). "Bayesian Thurstonian models for ranking data using JAGS". Behavior Research Methods. 45 (3): 857–872. doi:10.3758/s13428-012-0300-3. PMID   23539504. S2CID   42660145.
  3. McKeigue, P. M.; Campbell, H.; Wild, S.; Vitart, V.; Hayward, C.; Rudan, I.; Wright, A. F.; Wilson, J. F. (2010). "Bayesian methods for instrumental variable analysis with genetic instruments ('Mendelian randomization'): Example with urate transporter SLC2A9 as an instrumental variable for effect of urate levels on metabolic syndrome". International Journal of Epidemiology. 39 (3): 907–918. doi:10.1093/ije/dyp397. PMC   2878456 . PMID   20348110.
  4. Martyn Plummer (2003). JAGS: A Program for Analysis of Bayesian Graphical Models Using Gibbs Sampling, Proceedings of the 3rd International Workshop on Distributed Statistical Computing (DSC 2003), March 20–22, Vienna, Austria. ISSN 1609-395X.
  5. Lunn, David; Spiegelhalter, David; Thomas, Andrew; Best, Nicky (2009). "The BUGS project: Evolution, critique and future directions" (PDF). Statistics in Medicine. 28 (25): 3049–3067. doi:10.1002/sim.3680. PMID   19630097. Archived from the original (PDF) on 2012-09-15.
  6. Simon Jackman (2009). Bayesian Analysis for the Social Sciences. Wiley Series in Probability and Statistics, volume 845. John Wiley and Sons. ISBN   978-0-470-01154-6.