Original author(s) | ArviZ Development Team |
---|---|
Initial release | July 21, 2018 |
Stable release | |
Repository | https://github.com/arviz-devs/arviz |
Written in | Python |
Operating system | Unix-like, Mac OS X, Microsoft Windows |
Platform | Intel x86 – 32-bit, x64 |
Type | Statistical package |
License | Apache License, Version 2.0 |
Website | python.arviz.org |
ArviZ (/ˈɑːrvɪz/ AR-vees) is a Python package for exploratory analysis of Bayesian models. [2] [3] [4] [5] It is specifically designed to work with the output of probabilistic programming libraries such as PyMC, Stan, and others, providing a set of tools for summarizing and visualizing the results of Bayesian inference in a convenient and informative way. ArviZ also provides a common data structure for storing and manipulating the data that typically arises in Bayesian analysis, such as posterior samples or observed data.
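A minimal sketch of typical usage, assuming ArviZ is installed; "centered_eight" is one of the example datasets shipped with the library:

```python
import arviz as az

# Load an example InferenceData object: the common data structure that
# groups posterior samples, observed data, sampling statistics, etc.
idata = az.load_arviz_data("centered_eight")

# Numerical summary of the posterior: means, credible intervals,
# effective sample size and R-hat diagnostics.
print(az.summary(idata))

# Visual summary: marginal densities and per-chain trace plots.
az.plot_trace(idata)
```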
ArviZ is an open-source project developed by the community and is an affiliated project of NumFOCUS. [6] It has been used to help interpret inference problems in several scientific domains, including astronomy, [7] neuroscience, [8] physics [9] and statistics. [10] [11]
The name ArviZ comes from reading "rvs" (the usual abbreviation of random variates) as a word rather than spelling it out, combined with the particle "viz", commonly used as an abbreviation of visualization.
When working with Bayesian models, a series of related tasks needs to be addressed besides inference itself:

- diagnosing the quality of the inference,
- model criticism, including evaluations of both model assumptions and model predictions,
- comparison of models, including model selection or model averaging,
- preparation of the results for a particular audience.
All these tasks are part of the Exploratory analysis of Bayesian models approach, and successfully performing them is central to the iterative and interactive modeling process. These tasks require both numerical and visual summaries. [12] [13] [14]
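A hedged sketch of how these tasks map onto ArviZ functions, using two example datasets shipped with the library and assuming they contain the posterior predictive samples and pointwise log-likelihood values the last two calls need:

```python
import arviz as az

idata_a = az.load_arviz_data("centered_eight")
idata_b = az.load_arviz_data("non_centered_eight")

# Diagnosing the quality of the inference.
print(az.rhat(idata_a))   # potential scale reduction factor (R-hat)
print(az.ess(idata_a))    # effective sample size

# Model criticism via a posterior predictive check.
az.plot_ppc(idata_a)

# Model comparison with leave-one-out cross-validation (PSIS-LOO).
print(az.compare({"centered": idata_a, "non_centered": idata_b}))
```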
A Bayesian network is a probabilistic graphical model that represents a set of variables and their conditional dependencies via a directed acyclic graph (DAG). While it is one of several forms of causal notation, causal networks are special cases of Bayesian networks. Bayesian networks are ideal for taking an event that occurred and predicting the likelihood that any one of several possible known causes was the contributing factor. For example, a Bayesian network could represent the probabilistic relationships between diseases and symptoms. Given symptoms, the network can be used to compute the probabilities of the presence of various diseases.
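As a concrete illustration of the disease/symptom example, a minimal sketch with made-up probabilities that applies Bayes' rule to a two-node network (Disease → Symptom) to compute the probability of the disease given an observed symptom:

```python
# Hypothetical numbers for a two-node network: Disease -> Symptom.
p_disease = 0.01                    # prior P(disease)
p_symptom_given_disease = 0.90      # P(symptom | disease)
p_symptom_given_healthy = 0.05      # P(symptom | no disease)

# Marginal probability of observing the symptom (law of total probability).
p_symptom = (p_symptom_given_disease * p_disease
             + p_symptom_given_healthy * (1 - p_disease))

# Posterior probability of the disease given the symptom (Bayes' rule).
p_disease_given_symptom = p_symptom_given_disease * p_disease / p_symptom
print(p_disease_given_symptom)      # roughly 0.15
```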
In statistics, Markov chain Monte Carlo (MCMC) is a class of algorithms used to draw samples from a probability distribution. Given a probability distribution, one can construct a Markov chain whose elements' distribution approximates it – that is, the Markov chain's equilibrium distribution matches the target distribution. The more steps that are included, the more closely the distribution of the sample matches the actual desired distribution.
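A minimal sketch of the idea using random-walk Metropolis, one common MCMC scheme, targeting a standard normal distribution; all names and tuning values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def log_target(x):
    # Unnormalized log-density of the target distribution (standard normal).
    return -0.5 * x**2

samples, x = [], 0.0
for _ in range(10_000):
    proposal = x + rng.normal(scale=1.0)           # random-walk proposal
    if np.log(rng.uniform()) < log_target(proposal) - log_target(x):
        x = proposal                               # Metropolis acceptance
    samples.append(x)

# With more steps, the sample mean and standard deviation approach 0 and 1.
print(np.mean(samples), np.std(samples))
```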
Bayesian statistics is a theory in the field of statistics based on the Bayesian interpretation of probability, where probability expresses a degree of belief in an event. The degree of belief may be based on prior knowledge about the event, such as the results of previous experiments, or on personal beliefs about the event. This differs from a number of other interpretations of probability, such as the frequentist interpretation, which views probability as the limit of the relative frequency of an event after many trials. More concretely, analysis in Bayesian methods codifies prior knowledge in the form of a prior distribution.
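A small worked example of codifying prior knowledge as a prior distribution, using the conjugate Beta-Binomial model for a coin's probability of heads; the numbers are illustrative:

```python
from scipy import stats

# Prior belief about the probability of heads, encoded as Beta(2, 2),
# which weakly favours values near 0.5.
prior_a, prior_b = 2, 2

# Observed data: 7 heads in 10 flips.
heads, flips = 7, 10

# Beta prior + binomial likelihood gives a Beta posterior (conjugacy).
posterior = stats.beta(prior_a + heads, prior_b + flips - heads)
print(posterior.mean())          # posterior mean, (2 + 7) / (4 + 10) ≈ 0.64
print(posterior.interval(0.94))  # 94% equal-tailed credible interval
```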
In statistics, Gibbs sampling or a Gibbs sampler is a Markov chain Monte Carlo (MCMC) algorithm for sampling from a specified multivariate probability distribution when direct sampling from the joint distribution is difficult, but sampling from the conditional distributions is practical. The resulting sequence of samples can be used to approximate the joint distribution; to approximate the marginal distribution of one of the variables, or of some subset of the variables; or to compute an integral. Typically, some of the variables correspond to observations whose values are known and hence do not need to be sampled.
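A minimal sketch of a Gibbs sampler for an illustrative target, a bivariate normal with correlation rho, where each full conditional is a univariate normal that is easy to sample:

```python
import numpy as np

rng = np.random.default_rng(0)
rho = 0.8                        # correlation of the illustrative target
x1, x2 = 0.0, 0.0
draws = []

for _ in range(5_000):
    # Sample each variable from its full conditional given the other.
    x1 = rng.normal(loc=rho * x2, scale=np.sqrt(1 - rho**2))
    x2 = rng.normal(loc=rho * x1, scale=np.sqrt(1 - rho**2))
    draws.append((x1, x2))

draws = np.array(draws)
print(np.corrcoef(draws.T)[0, 1])   # approaches rho as draws accumulate
```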
The nested sampling algorithm is a computational approach to the Bayesian statistics problems of comparing models and generating samples from posterior distributions. It was developed in 2004 by physicist John Skilling.
Approximate Bayesian computation (ABC) constitutes a class of computational methods rooted in Bayesian statistics that can be used to estimate the posterior distributions of model parameters.
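A minimal sketch of the simplest member of this class, rejection ABC, estimating the posterior of a coin's bias; the summary statistic and tolerance are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)

observed = rng.binomial(n=1, p=0.7, size=100)   # stand-in for real data
obs_summary = observed.mean()                   # summary statistic

accepted = []
for _ in range(50_000):
    theta = rng.uniform()                               # draw from the prior
    simulated = rng.binomial(n=1, p=theta, size=100)    # simulate a dataset
    if abs(simulated.mean() - obs_summary) < 0.02:      # keep if close enough
        accepted.append(theta)

# The accepted draws approximate the posterior distribution of theta.
print(np.mean(accepted), np.percentile(accepted, [3, 97]))
```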
GNU MCSim is a suite of simulation software. It allows one to design one's own statistical or simulation models, perform Monte Carlo simulations, and carry out Bayesian inference through (tempered) Markov chain Monte Carlo simulations. The latest version allows parallel computing of Monte Carlo or MCMC simulations.
Andrew Eric Gelman is an American statistician and professor of statistics and political science at Columbia University.
Probabilistic programming (PP) is a programming paradigm in which probabilistic models are specified and inference for these models is performed automatically. It represents an attempt to unify probabilistic modeling and traditional general purpose programming in order to make the former easier and more widely applicable. It can be used to create systems that help make decisions in the face of uncertainty.
The Hamiltonian Monte Carlo algorithm is a Markov chain Monte Carlo method for obtaining a sequence of random samples whose distribution converges to a target probability distribution that is difficult to sample directly. This sequence can be used to estimate integrals of the target distribution, such as expected values and moments.
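A minimal sketch of Hamiltonian Monte Carlo for a standard normal target, using leapfrog integration and a Metropolis correction; the step size and trajectory length are illustrative tuning choices:

```python
import numpy as np

rng = np.random.default_rng(0)

def U(x):       return 0.5 * x**2   # negative log-density of the target
def grad_U(x):  return x            # its gradient

def hmc_step(x, step_size=0.2, n_leapfrog=20):
    p = rng.normal()                 # resample the auxiliary momentum
    x_new, p_new = x, p
    for _ in range(n_leapfrog):      # leapfrog integration of the dynamics
        p_new -= 0.5 * step_size * grad_U(x_new)
        x_new += step_size * p_new
        p_new -= 0.5 * step_size * grad_U(x_new)
    # Metropolis accept/reject based on the change in total energy.
    current_h = U(x) + 0.5 * p**2
    proposed_h = U(x_new) + 0.5 * p_new**2
    return x_new if np.log(rng.uniform()) < current_h - proposed_h else x

samples, x = [], 0.0
for _ in range(5_000):
    x = hmc_step(x)
    samples.append(x)

# Estimate a moment of the target from the samples (should approach 1).
print(np.mean(np.square(samples)))
```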
LaplacesDemon is an open-source statistical package that is intended to provide a complete environment for Bayesian inference. LaplacesDemon has been used in numerous fields. The user writes their own model specification function and selects a numerical approximation algorithm to update their Bayesian model. Some numerical approximation families of algorithms include Laplace's method, numerical integration, Markov chain Monte Carlo (MCMC), and variational Bayesian methods.
Stan is a probabilistic programming language for statistical inference written in C++. The Stan language is used to specify a (Bayesian) statistical model with an imperative program calculating the log probability density function.
Radford M. Neal is a professor emeritus at the Department of Statistics and Department of Computer Science at the University of Toronto, where he holds a research chair in statistics and machine learning.
PyMC is a probabilistic programming language written in Python. It can be used for Bayesian statistical modeling and probabilistic machine learning.
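A minimal sketch of a PyMC model, assuming PyMC (v4 or later) and ArviZ are installed; the coin-flip data are simulated for illustration:

```python
import numpy as np
import pymc as pm
import arviz as az

# Simulated data for an illustrative coin-flip model.
data = np.random.default_rng(0).binomial(n=1, p=0.7, size=100)

with pm.Model():
    theta = pm.Beta("theta", alpha=1, beta=1)       # prior on the bias
    pm.Bernoulli("obs", p=theta, observed=data)     # likelihood
    idata = pm.sample()                             # MCMC sampling (NUTS)

# pm.sample returns an ArviZ InferenceData object.
print(az.summary(idata))
```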
Siddhartha Chib is an econometrician and statistician, the Harry C. Hartkopf Professor of Econometrics and Statistics at Washington University in St. Louis. His work is primarily in Bayesian statistics, econometrics, and Markov chain Monte Carlo methods.
This is a comparison of statistical analysis software that supports inference with Gaussian processes, often using approximations.
Probabilistic numerics is an active field of study at the intersection of applied mathematics, statistics, and machine learning centering on the concept of uncertainty in computation. In probabilistic numerics, tasks in numerical analysis such as finding numerical solutions for integration, linear algebra, optimization, simulation, and differential equations are seen as problems of statistical, probabilistic, or Bayesian inference.
Bayesian quadrature is a method for approximating intractable integration problems. It falls within the class of probabilistic numerical methods. Bayesian quadrature views numerical integration as a Bayesian inference task, where function evaluations are used to estimate the integral of that function. For this reason, it is sometimes also referred to as "Bayesian probabilistic numerical integration" or "Bayesian numerical integration". The name "Bayesian cubature" is also sometimes used when the integrand is multi-dimensional. A potential advantage of this approach is that it provides probabilistic uncertainty quantification for the value of the integral.
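For concreteness, under a zero-mean Gaussian-process prior on the integrand $f$ with kernel $k$, the Bayesian quadrature estimate of $\int f(x)\,\pi(x)\,dx$ given evaluations at $x_1, \dots, x_n$ has the standard closed form (a generic sketch, not tied to any particular package):

$$
\mathbb{E}\!\left[\int f(x)\,\pi(x)\,dx \;\middle|\; f(x_{1:n})\right] = z^{\top} K^{-1} f(x_{1:n}),
\qquad z_i = \int k(x, x_i)\,\pi(x)\,dx, \quad K_{ij} = k(x_i, x_j),
$$

with posterior variance $\iint k(x,x')\,\pi(x)\,\pi(x')\,dx\,dx' - z^{\top} K^{-1} z$, which supplies the probabilistic uncertainty quantification mentioned above.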
In probability theory and Bayesian statistics, the Lewandowski-Kurowicka-Joe distribution, often referred to as the LKJ distribution, is a probability distribution over positive definite symmetric matrices with unit diagonals.
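The distribution has a single shape parameter $\eta > 0$, and for a $d \times d$ correlation matrix $R$ its density is proportional to a power of the determinant:

$$
p(R \mid \eta) \propto \det(R)^{\eta - 1},
$$

so $\eta = 1$ yields a uniform distribution over correlation matrices, while $\eta > 1$ concentrates mass toward the identity matrix.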
Bambi is a high-level Bayesian model-building interface written in Python. It works with the PyMC probabilistic programming framework. Bambi provides an interface to build and solve Bayesian generalized (non-)linear multivariate multilevel models.
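A minimal sketch of Bambi's formula-based interface, assuming Bambi is installed; the dataset and column names are illustrative:

```python
import numpy as np
import pandas as pd
import bambi as bmb
import arviz as az

# Illustrative data: a noisy linear relationship.
rng = np.random.default_rng(0)
df = pd.DataFrame({"x": rng.normal(size=100)})
df["y"] = 2.0 * df["x"] + rng.normal(scale=0.5, size=100)

# Specify a Bayesian linear regression using an R-style formula.
model = bmb.Model("y ~ x", df)
idata = model.fit()        # runs MCMC via PyMC and returns InferenceData

print(az.summary(idata))
```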