Bambi (software)

Bambi
Original author(s): Bambinos
Initial release: May 15, 2016
Repository: https://github.com/bambinos/bambi
Written in: Python
Operating system: Unix-like, Mac OS X, Microsoft Windows
Platform: Intel x86 (32-bit), x64
Type: Statistical package
License: MIT License
Website: bambinos.github.io/bambi/

Bambi is a high-level Bayesian model-building interface written in Python, built on top of the PyMC probabilistic programming framework. It provides a concise syntax for building and fitting Bayesian generalized (non-)linear multivariate multilevel models. [1][2][3][4][5][6][7][8][9][10]

Bambi is an open-source project developed by the community and is an affiliated project of NumFOCUS.
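
A minimal sketch of typical usage (the data below are simulated purely for illustration): a model is specified with an lme4-style formula and fitted by MCMC through the PyMC backend.

    import numpy as np
    import pandas as pd
    import bambi as bmb

    # Simulated data, purely for illustration: an outcome "y", a
    # predictor "x", and a grouping factor with four levels.
    rng = np.random.default_rng(0)
    df = pd.DataFrame({
        "y": rng.normal(size=120),
        "x": rng.normal(size=120),
        "group": np.repeat(list("ABCD"), 30),
    })

    # lme4-style formula: a common (fixed) effect of x plus varying
    # (group-level) intercepts for each level of "group".
    model = bmb.Model("y ~ x + (1|group)", df)

    # Fitting runs MCMC through PyMC and returns an ArviZ
    # InferenceData object holding the posterior samples.
    idata = model.fit(draws=1000, chains=2)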

Etymology

Bambi is an acronym for BAyesian Model-Building Interface.

Related Research Articles

Bayesian inference is a method of statistical inference in which Bayes' theorem is used to update the probability for a hypothesis as more evidence or information becomes available. Fundamentally, Bayesian inference uses prior knowledge, in the form of a prior distribution, to estimate posterior probabilities. Bayesian inference is an important technique in statistics, and especially in mathematical statistics. Bayesian updating is particularly important in the dynamic analysis of a sequence of data. Bayesian inference has found application in a wide range of activities, including science, engineering, philosophy, medicine, sport, and law. In the philosophy of decision theory, Bayesian inference is closely related to subjective probability, often called "Bayesian probability".
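
In symbols (a standard statement of Bayes' theorem, added here for concreteness), the posterior combines the prior p(θ) with the likelihood of the observed data y:

    p(\theta \mid y) = \frac{p(y \mid \theta)\, p(\theta)}{p(y)} \propto p(y \mid \theta)\, p(\theta)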

A Bayesian network is a probabilistic graphical model that represents a set of variables and their conditional dependencies via a directed acyclic graph (DAG). While it is one of several forms of causal notation, causal networks are special cases of Bayesian networks. Bayesian networks are ideal for taking an event that occurred and predicting the likelihood that any one of several possible known causes was the contributing factor. For example, a Bayesian network could represent the probabilistic relationships between diseases and symptoms. Given symptoms, the network can be used to compute the probabilities of the presence of various diseases.
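
The disease–symptom case can be made concrete with a two-node sketch (all probabilities here are invented for illustration):

    # Two-node network, Disease -> Symptom; all numbers are invented.
    p_d = 0.01                             # P(Disease)
    p_s_given = {True: 0.90, False: 0.05}  # P(Symptom | Disease), P(Symptom | no Disease)

    # Marginal P(Symptom), summing over the parent node.
    p_s = p_s_given[True] * p_d + p_s_given[False] * (1 - p_d)

    # Posterior P(Disease | Symptom) by Bayes' theorem.
    print(p_s_given[True] * p_d / p_s)  # ≈ 0.154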

In statistics, Markov chain Monte Carlo (MCMC) is a class of algorithms used to draw samples from a probability distribution. Given a probability distribution, one can construct a Markov chain whose elements' distribution approximates it – that is, the Markov chain's equilibrium distribution matches the target distribution. The more steps that are included, the more closely the distribution of the sample matches the actual desired distribution.
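
A minimal random-walk Metropolis sampler illustrates the idea (a sketch, not a production sampler; the target here is a standard normal known only up to a constant):

    import numpy as np

    def metropolis(log_target, n_steps=10_000, step=1.0, seed=0):
        """Minimal random-walk Metropolis sampler (illustrative sketch)."""
        rng = np.random.default_rng(seed)
        x, samples = 0.0, np.empty(n_steps)
        for i in range(n_steps):
            proposal = x + step * rng.normal()
            # Accept with probability min(1, target(proposal) / target(x)).
            if np.log(rng.uniform()) < log_target(proposal) - log_target(x):
                x = proposal
            samples[i] = x
        return samples

    # The chain's equilibrium distribution matches the target as the
    # number of steps grows.
    draws = metropolis(lambda x: -0.5 * x * x)
    print(draws.mean(), draws.std())  # approach 0 and 1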

Bayesian statistics is a theory in the field of statistics based on the Bayesian interpretation of probability, where probability expresses a degree of belief in an event. The degree of belief may be based on prior knowledge about the event, such as the results of previous experiments, or on personal beliefs about the event. This differs from a number of other interpretations of probability, such as the frequentist interpretation, which views probability as the limit of the relative frequency of an event after many trials. More concretely, analysis in Bayesian methods codifies prior knowledge in the form of a prior distribution.

A graphical model or probabilistic graphical model (PGM) or structured probabilistic model is a probabilistic model for which a graph expresses the conditional dependence structure between random variables. They are commonly used in probability theory, statistics—particularly Bayesian statistics—and machine learning.

<span class="mw-page-title-main">Dynamic Bayesian network</span> Probabilistic graphical model

A dynamic Bayesian network (DBN) is a Bayesian network (BN) which relates variables to each other over adjacent time steps.

Bayesian inference of phylogeny combines the information in the prior and in the data likelihood to create the so-called posterior probability of trees, which is the probability that the tree is correct given the data, the prior and the likelihood model. Bayesian inference was introduced into molecular phylogenetics in the 1990s by three independent groups: Bruce Rannala and Ziheng Yang in Berkeley, Bob Mau in Madison, and Shuying Li at the University of Iowa, the last two being PhD students at the time. The approach has become very popular since the release of the MrBayes software in 2001, and is now one of the most popular methods in molecular phylogenetics.

Approximate Bayesian computation (ABC) constitutes a class of computational methods rooted in Bayesian statistics that can be used to estimate the posterior distributions of model parameters.
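
A rejection-ABC sketch shows the core loop (the summary statistic, tolerance, and prior below are illustrative choices, not canonical ones):

    import numpy as np

    rng = np.random.default_rng(0)
    observed = rng.normal(loc=2.0, size=100)  # stand-in for real data

    def abc_rejection(observed, n_draws=50_000, eps=0.05):
        """Rejection ABC: keep prior draws whose simulated data match
        the observed data to within eps on a summary statistic."""
        accepted = []
        for _ in range(n_draws):
            theta = rng.normal(0.0, 5.0)             # draw from the prior
            simulated = rng.normal(theta, size=100)  # simulate the model
            if abs(simulated.mean() - observed.mean()) < eps:
                accepted.append(theta)               # approximate posterior draw
        return np.array(accepted)

    posterior = abc_rejection(observed)
    print(posterior.mean())  # concentrates near the data-generating mean, 2.0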

<span class="mw-page-title-main">Andrew Gelman</span> American statistician

Andrew Eric Gelman is an American statistician and professor of statistics and political science at Columbia University.

Probabilistic programming (PP) is a programming paradigm in which probabilistic models are specified and inference for these models is performed automatically. It represents an attempt to unify probabilistic modeling and traditional general purpose programming in order to make the former easier and more widely applicable. It can be used to create systems that help make decisions in the face of uncertainty.

The free energy principle is a theoretical framework suggesting that the brain reduces surprise or uncertainty by making predictions based on internal models and updating them using sensory input. It highlights the brain's objective of aligning its internal model with the external world to enhance prediction accuracy. This principle integrates Bayesian inference with active inference, where actions are guided by predictions and sensory feedback refines them. It has wide-ranging implications for comprehending brain function, perception, and action.

<span class="mw-page-title-main">Stan (software)</span> Probabilistic programming language for Bayesian inference

Stan is a probabilistic programming language for statistical inference written in C++. The Stan language is used to specify a (Bayesian) statistical model with an imperative program calculating the log probability density function.

PyMC is a probabilistic programming language written in Python. It can be used for Bayesian statistical modeling and probabilistic machine learning.
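
As a minimal illustration (the model and numbers here are hypothetical), a PyMC program declares priors and a likelihood inside a model context and then samples the posterior:

    import numpy as np
    import pymc as pm

    # Hypothetical data: 50 noisy observations of an unknown mean.
    data = np.random.default_rng(0).normal(loc=1.0, size=50)

    with pm.Model():
        mu = pm.Normal("mu", mu=0.0, sigma=10.0)   # prior for the mean
        sigma = pm.HalfNormal("sigma", sigma=5.0)  # prior for the scale
        pm.Normal("obs", mu=mu, sigma=sigma, observed=data)  # likelihood
        idata = pm.sample(1000)  # NUTS by default; returns InferenceData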

PyClone is software that implements a hierarchical Bayesian statistical model to estimate cellular frequency patterns of mutations in a population of cancer cells using observed alternate allele frequencies, copy number, and loss of heterozygosity (LOH) information. PyClone outputs clusters of variants based on calculated cellular frequencies of mutations.

Robert E. Kass is the Maurice Falk Professor of Statistics and Computational Neuroscience in the Department of Statistics and Data Science, the Machine Learning Department, and the Neuroscience Institute at Carnegie Mellon University.

ArviZ is a Python package for exploratory analysis of Bayesian models. It is specifically designed to work with the output of probabilistic programming libraries like PyMC, Stan, and others by providing a set of tools for summarizing and visualizing the results of Bayesian inference in a convenient and informative way. ArviZ also provides a common data structure for manipulating and storing data commonly arising in Bayesian analysis, like posterior samples or observed data.
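
A brief sketch of that workflow (the draws below are random numbers standing in for real sampler output):

    import numpy as np
    import arviz as az

    # Wrap raw posterior draws (2 chains x 500 draws of one parameter)
    # in ArviZ's common InferenceData structure; in practice the draws
    # would come from PyMC, Stan, or another sampler.
    idata = az.from_dict(
        posterior={"mu": np.random.default_rng(0).normal(size=(2, 500))}
    )

    print(az.summary(idata))  # means, credible intervals, ESS, R-hat
    az.plot_trace(idata)      # trace and density plots per parameter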

This is a comparison of statistical analysis software packages that support inference with Gaussian processes, often using approximations.

Probabilistic numerics is an active field of study at the intersection of applied mathematics, statistics, and machine learning centering on the concept of uncertainty in computation. In probabilistic numerics, tasks in numerical analysis such as finding numerical solutions for integration, linear algebra, optimization and simulation and differential equations are seen as problems of statistical, probabilistic, or Bayesian inference.

In probability theory and Bayesian statistics, the Lewandowski-Kurowicka-Joe distribution, often referred to as the LKJ distribution, is a probability distribution over positive definite symmetric matrices with unit diagonals.
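
For a correlation matrix R and shape parameter η > 0, the LKJ density is proportional to a power of the determinant (a standard statement of the distribution, added for concreteness):

    p(R \mid \eta) \propto \det(R)^{\eta - 1}

Setting η = 1 gives a uniform distribution over correlation matrices, while η > 1 concentrates mass near the identity matrix.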

References

  1. Mikkola, Petrus; Martin, Osvaldo A.; Chandramouli, Suyog; Hartmann, Marcelo; Abril Pla, Oriol; Thomas, Owen; Pesonen, Henri; Corander, Jukka; Vehtari, Aki; Kaski, Samuel; Bürkner, Paul-Christian; Klami, Arto (2023). "Prior Knowledge Elicitation: The Past, Present, and Future". Bayesian Analysis. International Society for Bayesian Analysis: 1–33. doi:10.1214/23-BA1381.
  2. Štrumbelj, Erik; Bouchard-Côté, Alexandre; Corander, Jukka; Gelman, Andrew; Rue, Håvard; Murray, Lawrence; Pesonen, Henri; Plummer, Martyn; Vehtari, Aki (2024). "Past, Present and Future of Software for Bayesian Inference". Statistical Science. 39 (1). Institute of Mathematical Statistics: 46–61. doi:10.1214/23-STS907.
  3. Martin, OA; Kumar, R; Lao, J (2021). Bayesian Modeling and Computation in Python. Taylor & Francis.
  4. Qasim, SE; Mohan, UR; Stein, JM; Jacobs, J (2023). "Neuronal activity in the human amygdala and hippocampus enhances emotional memory encoding". Nature Human Behaviour. 7 (5): 754–764. doi:10.1038/s41562-022-01502-8. PMID 36646837.
  5. Pettine, WW; Raman, DV; Redish, AD (2023). "Human generalization of internal representations through prototype learning with goal-directed attention". Nature Human Behaviour. 7 (3): 442–463. doi:10.1038/s41562-023-01543-7. PMID 36894642.
  6. Pudhiyidath, A; Morton, NW; Viveros Duran, R; Schapiro, AC; Momennejad, I; Hinojosa-Rowland, DM; Molitor, RJ; Preston, AR (2022). "Representations of Temporal Community Structure in Hippocampus and Precuneus Predict Inductive Reasoning Decisions". Journal of Cognitive Neuroscience. 34 (10): 1736–1760. doi:10.1162/jocn_a_01864. PMC 10262802. PMID 35579986.
  7. Michiels, Lien; Vannieuwenhuyze, Jorre; Leysen, Jens; Verachtert, Robin; Smets, Annelien; Goethals, Bart (2023). "How Should We Measure Filter Bubbles? A Regression Model and Evidence for Online News". Proceedings of the 17th ACM Conference on Recommender Systems. RecSys '23. Association for Computing Machinery. pp. 640–651. doi:10.1145/3604915.3608805. ISBN 979-8-4007-0241-9.
  8. Kallioinen, N; Paananen, T; Bürkner, PC (2024). "Detecting and diagnosing prior and likelihood sensitivity with power-scaling". Statistics and Computing. 34 (1): 57. doi:10.1007/s11222-023-10366-5.
  9. Gehmacher, Q; Schubert, J; Schmidt, F (2024). "Eye movements track prioritized auditory features in selective attention to natural speech". Nature Communications. 15: 3692. Bibcode:2024NatCo..15.3692G. doi:10.1038/s41467-024-48126-2.
  10. Abril-Pla, O; Andreani, V; Carroll, C; Dong, L; Fonnesbeck, CJ; Kochurov, M; Kumar, R; Lao, J; Luhmann, CC; Martin, OA; Osthege, M; Vieira, R; Wiecki, T; Zinkov, R (2023). "PyMC: a modern, and comprehensive probabilistic programming framework in Python". PeerJ Computer Science. 9: e1516. doi:10.7717/peerj-cs.1516. PMC 10495961. PMID 37705656.