Bayesian approaches to brain function

Bayesian approaches to brain function investigate the capacity of the nervous system to operate in situations of uncertainty in a fashion close to the optimum prescribed by Bayesian statistics. [1] [2] The term is used in the behavioural sciences and neuroscience, and studies associated with it often strive to explain the brain's cognitive abilities based on statistical principles. It is frequently assumed that the nervous system maintains internal probabilistic models that are updated by neural processing of sensory information using methods approximating those of Bayesian probability. [3] [4]
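
As a minimal illustration of the kind of computation these models ascribe to the brain, consider Bayes' rule applied to a single noisy sensory observation. The sketch below is not drawn from the cited sources; the four candidate stimulus locations, the flat prior and the Gaussian noise model are all illustrative assumptions.

```python
import numpy as np

# Hypotheses: the stimulus lies at one of four locations; the prior is flat.
locations = np.array([0.0, 1.0, 2.0, 3.0])
prior = np.full(4, 0.25)

def likelihood(observation, loc, noise_sd=1.0):
    """Probability (up to a constant) of a noisy observation given a
    true stimulus location, under Gaussian sensory noise."""
    return np.exp(-0.5 * ((observation - loc) / noise_sd) ** 2)

obs = 1.4  # a single noisy sensory measurement
posterior = prior * likelihood(obs, locations)
posterior /= posterior.sum()  # Bayes' rule: posterior = prior x likelihood, normalized
print(posterior)  # belief concentrates on locations near the observation
```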

Origins

This field of study has its historical roots in numerous disciplines including machine learning, experimental psychology and Bayesian statistics. As early as the 1860s, with the work of Hermann Helmholtz in experimental psychology, the brain's ability to extract perceptual information from sensory data was modeled in terms of probabilistic estimation. [5] [6] The basic idea is that the nervous system needs to organize sensory data into an accurate internal model of the outside world.

Bayesian probability was developed by many important contributors. Pierre-Simon Laplace, Thomas Bayes, Harold Jeffreys, Richard Cox and Edwin Jaynes developed mathematical techniques and procedures for treating probability as the degree of plausibility that could be assigned to a given supposition or hypothesis based on the available evidence. [7] In 1988 Edwin Jaynes presented a framework for using Bayesian probability to model mental processes. [8] It was thus realized early on that the Bayesian statistical framework holds the potential to lead to insights into the function of the nervous system.

This idea was taken up in research on unsupervised learning, in particular the analysis-by-synthesis approach, a branch of machine learning. [9] [10] In 1983 Geoffrey Hinton and colleagues proposed that the brain could be seen as a machine making decisions based on the uncertainties of the outside world. [11] During the 1990s researchers including Peter Dayan, Geoffrey Hinton and Richard Zemel proposed that the brain represents knowledge of the world in terms of probabilities and made specific proposals for tractable neural processes that could implement such inference, notably the Helmholtz machine. [12] [13] [14]

Psychophysics

A wide range of studies interpret the results of psychophysical experiments in light of Bayesian perceptual models. Many aspects of human perceptual and motor behavior can be modeled with Bayesian statistics. This approach, which emphasizes behavioral outcomes as the ultimate expression of neural information processing, is also known for modeling sensory and motor decisions using Bayesian decision theory. Examples are the work of Landy, [15] [16] Jacobs, [17] [18] Jordan, Knill, [19] [20] Kording and Wolpert, [21] [22] and Goldreich. [23] [24] [25]
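
For example, when two sensory cues to the same quantity carry independent Gaussian noise, Bayesian models of the kind used in this literature predict that the cues should be fused by weighting each in proportion to its reliability (inverse variance). The sketch below is a generic textbook formulation rather than any one study's model, and the numbers are invented.

```python
def integrate_cues(mu_a, var_a, mu_b, var_b):
    """Optimal fusion of two conflicting Gaussian cue estimates:
    each cue is weighted by its reliability (inverse variance)."""
    w_a = (1 / var_a) / (1 / var_a + 1 / var_b)
    mu = w_a * mu_a + (1 - w_a) * mu_b
    var = 1 / (1 / var_a + 1 / var_b)  # the fused estimate is more reliable
    return mu, var

# e.g., a precise visual cue and a noisier haptic cue to object size
print(integrate_cues(mu_a=10.0, var_a=1.0, mu_b=12.0, var_b=4.0))
# -> (10.4, 0.8): the estimate is pulled mostly toward the visual cue
```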

Neural coding

Many theoretical studies ask how the nervous system could implement Bayesian algorithms. Examples are the work of Pouget, Zemel, Deneve, Latham, Hinton and Dayan. George and Hawkins published a paper establishing a model of cortical information processing called hierarchical temporal memory that is based on a Bayesian network of Markov chains. They further map this mathematical model onto the existing knowledge about the architecture of the cortex and show how neurons could recognize patterns by hierarchical Bayesian inference. [26]
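
As a hedged illustration of the recursive belief updating such proposals involve, the sketch below runs a two-state Bayesian filter over a Markov chain (a hidden Markov model); the transition and emission probabilities are invented for illustration and are not taken from the George and Hawkins model.

```python
import numpy as np

# Invented two-state example: P(state_t | state_{t-1}) rows index the
# previous state; P(observation | state) rows index the hidden state.
transition = np.array([[0.9, 0.1],
                       [0.2, 0.8]])
emission = np.array([[0.8, 0.2],
                     [0.3, 0.7]])

belief = np.array([0.5, 0.5])          # prior over the two hidden states
for obs in [0, 0, 1, 1]:               # a short sequence of observations
    belief = transition.T @ belief     # predict: propagate the dynamics
    belief *= emission[:, obs]         # update: weight by the likelihood
    belief /= belief.sum()             # renormalize to a probability
    print(belief)
```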

Electrophysiology

A number of recent electrophysiological studies focus on the representation of probabilities in the nervous system. Examples are the work of Shadlen and Schultz.

Predictive coding

Predictive coding is a neurobiologically plausible scheme for inferring the causes of sensory input by minimizing prediction error. [27] Such schemes are formally related to Kalman filtering and other Bayesian update schemes.
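
A minimal sketch of that formal relationship: in one dimension, a Kalman-style update corrects the current estimate in proportion to the prediction error, with a gain that reflects the relative uncertainties. The noise parameters below are illustrative assumptions, not a published model.

```python
def predictive_update(estimate, variance, observation,
                      obs_noise=1.0, process_noise=0.1):
    """One step of a scalar Kalman-style filter: the estimate is
    corrected in proportion to the prediction error."""
    variance += process_noise                  # prediction widens the belief
    error = observation - estimate             # prediction error
    gain = variance / (variance + obs_noise)   # how much the error counts
    estimate += gain * error                   # error-driven correction
    variance *= 1 - gain                       # belief sharpens after update
    return estimate, variance

est, var = 0.0, 1.0                            # initial guess and uncertainty
for obs in [0.9, 1.1, 1.0, 0.95]:              # noisy samples of a constant
    est, var = predictive_update(est, var, obs)
    print(round(est, 3), round(var, 3))        # estimate converges toward 1
```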

Free energy

During the 1990s some researchers, such as Geoffrey Hinton and Karl Friston, began examining the concept of free energy as a computationally tractable measure of the discrepancy between actual features of the world and the representations of those features captured by neural network models. [28] Karl Friston has recently attempted a synthesis [29] in which the Bayesian brain emerges from a general principle of free energy minimisation. [30] In this framework, both action and perception are seen as consequences of suppressing free energy, leading to perceptual [31] and active inference [32] and a more embodied (enactive) view of the Bayesian brain. Using variational Bayesian methods, it can be shown how internal models of the world are updated by sensory information to minimize free energy, or the discrepancy between sensory input and predictions of that input. This can be cast (in neurobiologically plausible terms) as predictive coding or, more generally, Bayesian filtering.
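
In standard variational notation (a generic textbook formulation, not quoted from the cited papers), the free energy of sensory data s under a recognition density q over hidden causes decomposes into surprise plus a non-negative divergence:

```latex
F(s, q) = \mathbb{E}_{q(\vartheta)}\big[\ln q(\vartheta) - \ln p(s, \vartheta)\big]
        = \underbrace{-\ln p(s)}_{\text{surprise}}
        + \underbrace{D_{\mathrm{KL}}\big[q(\vartheta) \,\|\, p(\vartheta \mid s)\big]}_{\ge\, 0}
```

Because the divergence is non-negative, free energy upper-bounds surprise: perception can reduce the divergence term by adjusting q, while action changes the sensory data s itself, which is the reading reflected in the quotation below.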

According to Friston: [33]

"The free-energy considered here represents a bound on the surprise inherent in any exchange with the environment, under expectations encoded by its state or configuration. A system can minimise free energy by changing its configuration to change the way it samples the environment, or to change its expectations. These changes correspond to action and perception, respectively, and lead to an adaptive exchange with the environment that is characteristic of biological systems. This treatment implies that the system’s state and structure encode an implicit and probabilistic model of the environment." [33]

This area of research was summarized in terms understandable by the layperson in a 2008 article in New Scientist that presented it as a candidate unified theory of brain function. [34] Friston makes the following claims about the explanatory power of the theory:

"This model of brain function can explain a wide range of anatomical and physiological aspects of brain systems; for example, the hierarchical deployment of cortical areas, recurrent architectures using forward and backward connections and functional asymmetries in these connections. In terms of synaptic physiology, it predicts associative plasticity and, for dynamic models, spike-timing-dependent plasticity. In terms of electrophysiology it accounts for classical and extra-classical receptive field effects and long-latency or endogenous components of evoked cortical responses. It predicts the attenuation of responses encoding prediction error with perceptual learning and explains many phenomena like repetition suppression, mismatch negativity and the P300 in electroencephalography. In psychophysical terms, it accounts for the behavioural correlates of these physiological phenomena, e.g., priming, and global precedence." [33]

"It is fairly easy to show that both perceptual inference and learning rest on a minimisation of free energy or suppression of prediction error." [33]

Related Research Articles

Unsupervised learning is a method in machine learning where, in contrast to supervised learning, algorithms learn patterns exclusively from unlabeled data. The hope is that through mimicry, which is an important mode of learning in people, the machine is forced to build a concise representation of its world and then generate imaginative content from it.

Computational neuroscience is a branch of neuroscience which employs mathematics, computer science, theoretical analysis and abstractions of the brain to understand the principles that govern the development, structure, physiology and cognitive abilities of the nervous system.

The memory-prediction framework is a theory of brain function created by Jeff Hawkins and described in his 2004 book On Intelligence. This theory concerns the role of the mammalian neocortex and its associations with the hippocampi and the thalamus in matching sensory inputs to stored memory patterns and how this process leads to predictions of what will happen in the future.

The Helmholtz machine is a type of artificial neural network that can account for the hidden structure of a set of data by being trained to create a generative model of the original set of data. The hope is that by learning economical representations of the data, the underlying structure of the generative model should reasonably approximate the hidden structure of the data set. A Helmholtz machine contains two networks, a bottom-up recognition network that takes the data as input and produces a distribution over hidden variables, and a top-down "generative" network that generates values of the hidden variables and the data itself. At the time, Helmholtz machines were one of a handful of learning architectures that used feedback as well as feedforward to ensure quality of learned models.
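
The division of labor between the two networks can be sketched in a few lines of Python; the layer sizes, random weights and sigmoid units below are illustrative assumptions rather than a faithful implementation.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Illustrative sizes: 8 visible (data) units, 4 binary hidden variables.
R = rng.normal(0.0, 0.1, size=(8, 4))   # recognition weights (bottom-up)
G = rng.normal(0.0, 0.1, size=(4, 8))   # generative weights (top-down)

def recognize(x):
    """Bottom-up pass: beliefs (Bernoulli means) over hidden variables."""
    return sigmoid(x @ R)

def generate(h):
    """Top-down pass: expected data given hidden variables."""
    return sigmoid(h @ G)

x = rng.random(8)                # a toy sensory input
h = recognize(x)                 # distribution over hidden causes
print(generate(h))               # reconstruction through the hidden layer
```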

Stephen Grossberg is a cognitive scientist, theoretical and computational psychologist, neuroscientist, mathematician, biomedical engineer, and neuromorphic technologist. He is the Wang Professor of Cognitive and Neural Systems and a Professor Emeritus of Mathematics & Statistics, Psychological & Brain Sciences, and Biomedical Engineering at Boston University.

The kappa effect or perceptual time dilation is a temporal perceptual illusion that can arise when observers judge the elapsed time between sensory stimuli applied sequentially at different locations. In perceiving a sequence of consecutive stimuli, subjects tend to overestimate the elapsed time between two successive stimuli when the distance between the stimuli is sufficiently large, and to underestimate the elapsed time when the distance is sufficiently small.

Peter Dayan is a British neuroscientist and computer scientist who is director at the Max Planck Institute for Biological Cybernetics in Tübingen, Germany, along with Ivan De Araujo. He is co-author of Theoretical Neuroscience, an influential textbook on computational neuroscience. He is known for applying Bayesian methods from machine learning and artificial intelligence to understand neural function and is particularly recognized for relating neurotransmitter levels to prediction errors and Bayesian uncertainties. He has pioneered the field of reinforcement learning (RL) where he helped develop the Q-learning algorithm, and made contributions to unsupervised learning, including the wake-sleep algorithm for neural networks and the Helmholtz machine.

The cutaneous rabbit illusion is a tactile illusion evoked by tapping two or more separate regions of the skin in rapid succession. The illusion is most readily evoked on regions of the body surface that have relatively poor spatial acuity, such as the forearm. A rapid sequence of taps delivered first near the wrist and then near the elbow creates the sensation of sequential taps hopping up the arm from the wrist towards the elbow, although no physical stimulus was applied between the two actual stimulus locations. Similarly, stimuli delivered first near the elbow then near the wrist evoke the illusory perception of taps hopping from elbow towards wrist. The illusion was discovered by Frank Geldard and Carl Sherrick of Princeton University, in the early 1970s, and further characterized by Geldard (1982) and in many subsequent studies. Geldard and Sherrick likened the perception to that of a rabbit hopping along the skin, giving the phenomenon its name. While the rabbit illusion has been most extensively studied in the tactile domain, analogous sensory saltation illusions have been observed in audition and vision. The word "saltation" refers to the leaping or jumping nature of the percept.

Hierarchical temporal memory (HTM) is a biologically constrained machine intelligence technology developed by Numenta. Originally described in the 2004 book On Intelligence by Jeff Hawkins with Sandra Blakeslee, HTM is primarily used today for anomaly detection in streaming data. The technology is based on neuroscience and the physiology and interaction of pyramidal neurons in the neocortex of the mammalian brain.

Neurorobotics is the combined study of neuroscience, robotics, and artificial intelligence. It is the science and technology of embodied autonomous neural systems. Neural systems include brain-inspired algorithms, computational models of biological neural networks and actual biological systems. Such neural systems can be embodied in machines with mechanic or any other forms of physical actuation. This includes robots, prosthetic or wearable systems but also, at smaller scale, micro-machines and, at the larger scales, furniture and infrastructures.

In physiology, an efference copy or efferent copy is an internal copy of an outflowing (efferent), movement-producing signal generated by an organism's motor system. It can be collated with the (reafferent) sensory input that results from the agent's movement, enabling a comparison of actual movement with desired movement, and a shielding of perception from particular self-induced effects on the sensory input to achieve perceptual stability. Together with internal models, efference copies can serve to enable the brain to predict the effects of an action.

Common coding theory is a cognitive psychology theory describing how perceptual representations and motor representations are linked. The theory claims that there is a shared representation for both perception and action. More important, seeing an event activates the action associated with that event, and performing an action activates the associated perceptual event.

The Troland Research Awards are an annual prize given by the United States National Academy of Sciences to two researchers in recognition of psychological research on the relationship between consciousness and the physical world. The areas where these award funds are to be spent include but are not limited to areas of experimental psychology, the topics of sensation, perception, motivation, emotion, learning, memory, cognition, language, and action. The award preference is given to experimental work with a quantitative approach or experimental research seeking physiological explanations.

Karl John Friston FRS FMedSci FRSB is a British neuroscientist and theoretician at University College London. He is an authority on brain imaging and theoretical neuroscience, especially the use of physics-inspired statistical methods to model neuroimaging data and other random dynamical systems. Friston is a key architect of the free energy principle and active inference. In imaging neuroscience he is best known for statistical parametric mapping and dynamic causal modelling. In October 2022, he joined VERSES Inc, a California-based cognitive computing company focusing on artificial intelligence designed using the principles of active inference, as Chief Scientist.

The wake-sleep algorithm is an unsupervised learning algorithm for deep generative models, especially Helmholtz machines. The algorithm is similar to the expectation-maximization algorithm, and optimizes the model likelihood for observed data. The name of the algorithm derives from its use of two learning phases, the “wake” phase and the “sleep” phase, which are performed alternately. It can be conceived as a model for learning in the brain, but is also applied in machine learning.
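
A toy sketch of the two alternating phases, under heavily simplified assumptions (a single hidden layer of binary units, no biases, a uniform hidden prior in the sleep phase, and invented sizes, data and learning rate):

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1 / (1 + np.exp(-x))
sample = lambda p: (rng.random(p.shape) < p).astype(float)

n_vis, n_hid, lr = 6, 3, 0.05                   # invented sizes and rate
R = rng.normal(0, 0.1, (n_vis, n_hid))          # recognition (bottom-up)
G = rng.normal(0, 0.1, (n_hid, n_vis))          # generative (top-down)
data = sample(np.full((20, n_vis), 0.5))        # placeholder binary data

for x in data:
    # Wake phase: infer hidden causes from data, then adjust the
    # generative weights so those causes reconstruct the data better.
    h = sample(sigmoid(x @ R))
    G += lr * np.outer(h, x - sigmoid(h @ G))
    # Sleep phase: "dream" a (hidden, visible) pair from the generative
    # model, then adjust the recognition weights to invert the dream.
    h_dream = sample(np.full(n_hid, 0.5))       # simplistic hidden prior
    x_dream = sample(sigmoid(h_dream @ G))
    R += lr * np.outer(x_dream, h_dream - sigmoid(x_dream @ R))
```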

The free energy principle is a theoretical framework suggesting that the brain reduces surprise or uncertainty by making predictions based on internal models and updating them using sensory input. It highlights the brain's objective of aligning its internal model with the external world to enhance prediction accuracy. This principle integrates Bayesian inference with active inference, where actions are guided by predictions and sensory feedback refines them. It has wide-ranging implications for comprehending brain function, perception, and action.

Radford M. Neal is a professor emeritus at the Department of Statistics and Department of Computer Science at the University of Toronto, where he holds a research chair in statistics and machine learning.

In neuroscience, predictive coding is a theory of brain function which postulates that the brain is constantly generating and updating a "mental model" of the environment. According to the theory, such a mental model is used to predict input signals from the senses that are then compared with the actual input signals from those senses. With the rising popularity of representation learning, the theory is being actively pursued and applied in machine learning and related fields.

Dynamic causal modeling (DCM) is a framework for specifying models, fitting them to data and comparing their evidence using Bayesian model comparison. It uses nonlinear state-space models in continuous time, specified using stochastic or ordinary differential equations. DCM was initially developed for testing hypotheses about neural dynamics. In this setting, differential equations describe the interaction of neural populations, which directly or indirectly give rise to functional neuroimaging data e.g., functional magnetic resonance imaging (fMRI), magnetoencephalography (MEG) or electroencephalography (EEG). Parameters in these models quantify the directed influences or effective connectivity among neuronal populations, which are estimated from the data using Bayesian statistical methods.

References

  1. Clark, A. (2013). Whatever next? Predictive brains, situated agents, and the future of cognitive science. Behavioral and Brain Sciences, 36(3), 181–204. doi:10.1017/s0140525x12000477
  2. Sanders, L. (2016, May 13). "Bayesian reasoning implicated in some mental disorders". Science News. Retrieved 20 July 2016.
  3. Doya, K., Ishii, S., Pouget, A., & Rao, R. P. N. (Eds.) (2007). Bayesian Brain: Probabilistic Approaches to Neural Coding. The MIT Press.
  4. Knill, D. C., & Pouget, A. (2004). The Bayesian brain: the role of uncertainty in neural coding and computation. Trends in Neurosciences, 27(12).
  5. Helmholtz, H. (1860/1962). Handbuch der physiologischen Optik (J. P. C. Southall, Ed., English trans.), Vol. 3. New York: Dover.
  6. Westheimer, G. (2008). Was Helmholtz a Bayesian? Perception, 39, 642–50.
  7. Jaynes, E. T. (1986). "Bayesian Methods: General Background". In J. H. Justice (Ed.), Maximum-Entropy and Bayesian Methods in Applied Statistics. Cambridge: Cambridge University Press.
  8. Jaynes, E. T. (1988). "How Does the Brain Do Plausible Reasoning?". In G. J. Erickson & C. R. Smith (Eds.), Maximum-Entropy and Bayesian Methods in Science and Engineering, 1.
  9. Ghahramani, Z. (2004). Unsupervised learning. In O. Bousquet, G. Raetsch, & U. von Luxburg (Eds.), Advanced Lectures on Machine Learning. Berlin: Springer-Verlag.
  10. Neisser, U. (1967). Cognitive Psychology. New York: Appleton-Century-Crofts.
  11. Fahlman, S. E., Hinton, G. E., & Sejnowski, T. J. (1983). Massively parallel architectures for A.I.: Netl, Thistle, and Boltzmann machines. Proceedings of the National Conference on Artificial Intelligence, Washington, DC.
  12. Dayan, P., Hinton, G. E., & Neal, R. M. (1995). The Helmholtz machine. Neural Computation, 7, 889–904.
  13. Dayan, P., & Hinton, G. E. (1996). Varieties of Helmholtz machines. Neural Networks, 9, 1385–1403.
  14. Hinton, G. E., Dayan, P., To, A., & Neal, R. M. (1995). The Helmholtz machine through time. In Fogelman-Soulie & R. Gallinari (Eds.), ICANN-95, 483–490.
  15. Tassinari, H., Hudson, T. E., & Landy, M. S. (2006). Combining priors and noisy visual cues in a rapid pointing task. Journal of Neuroscience, 26(40), 10154–10163.
  16. Hudson, T. E., Maloney, L. T., & Landy, M. S. (2008). Optimal compensation for temporal uncertainty in movement planning. PLoS Computational Biology, 4(7).
  17. Jacobs, R. A. (1999). Optimal integration of texture and motion cues to depth. Vision Research, 39(21), 3621–9.
  18. Battaglia, P. W., Jacobs, R. A., & Aslin, R. N. (2003). Bayesian integration of visual and auditory signals for spatial localization. Journal of the Optical Society of America, 20(7), 1391–7.
  19. Knill, D. C. (2005). Reaching for visual cues to depth: The brain combines depth cues differently for motor control and perception. Journal of Vision, 5(2), 103–115.
  20. Knill, D. C. (2007). Learning Bayesian priors for depth perception. Journal of Vision, 7(8), 1–20.
  21. Koerding, K. P., & Wolpert, D. M. (2004). Bayesian integration in sensorimotor learning. Nature, 427, 244–7.
  22. Koerding, K. P., Ku, S., & Wolpert, D. M. (2004). Bayesian integration in force estimation. Journal of Neurophysiology, 92, 3161–5.
  23. Goldreich, D. (2007). "A Bayesian perceptual model replicates the cutaneous rabbit and other tactile spatiotemporal illusions". PLOS ONE, 2(3), e333. Bibcode:2007PLoSO...2..333G. doi:10.1371/journal.pone.0000333. PMC 1828626. PMID 17389923.
  24. Goldreich, D., & Tong, J. (2013). "Prediction, Postdiction, and Perceptual Length Contraction: A Bayesian Low-Speed Prior Captures the Cutaneous Rabbit and Related Illusions". Frontiers in Psychology, 4(221), 221. doi:10.3389/fpsyg.2013.00221. PMC 3650428. PMID 23675360.
  25. Goldreich, D., & Peterson, M. A. (2012). "A Bayesian observer replicates convexity context effects in figure-ground perception". Seeing and Perceiving, 25(3–4), 365–95. doi:10.1163/187847612X634445. PMID 22564398. S2CID 4931501.
  26. George, D., & Hawkins, J. (2009). Towards a Mathematical Theory of Cortical Micro-circuits. PLoS Computational Biology, 5(10), e1000532. doi:10.1371/journal.pcbi.1000532
  27. Rao, R. P. N., & Ballard, D. H. (1999). Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effects. Nature Neuroscience, 2, 79–87.
  28. Hinton, G. E., & Zemel, R. S. (1994). Autoencoders, minimum description length, and Helmholtz free energy. In J. D. Cowan, G. Tesauro, & J. Alspector (Eds.), Advances in Neural Information Processing Systems 6. San Mateo, CA: Morgan Kaufmann.
  29. Friston, K. (2010). The free-energy principle: A unified brain theory? Nature Reviews Neuroscience, 11, 127–38.
  30. Friston, K., Kilner, J., & Harrison, L. (2006). A free energy principle for the brain. Journal of Physiology-Paris, 100, 70–87.
  31. Friston, K. (2005). A theory of cortical responses. Philosophical Transactions of the Royal Society of London B: Biological Sciences, 360, 815–36.
  32. Friston, K. J., Daunizeau, J., Kilner, J., & Kiebel, S. J. (2010). Action and behavior: A free-energy formulation. Biological Cybernetics, 102, 227–60.
  33. Friston, K., & Stephan, K. E. (2007). Free energy and the brain. Synthese, 159, 417–458.
  34. Huang, G. (2008, May 23). "Is This a Unified Theory of the Brain?". New Scientist.