Radford M. Neal

Born: September 12, 1956 [1]
Citizenship: Canadian
Education: University of Calgary; University of Toronto
Scientific career
Fields: Statistics, Machine Learning, Artificial Intelligence
Institutions: University of Toronto
Thesis: Bayesian Learning for Neural Networks (1995)
Doctoral advisor: Geoffrey Hinton
Other academic advisors: David Hill
Website: www.cs.utoronto.ca/~radford/

Radford M. Neal is a professor emeritus in the Department of Statistics and the Department of Computer Science at the University of Toronto, where he held a research chair in statistics and machine learning.

Education and career

Neal studied computer science at the University of Calgary, where he received his B.Sc. in 1977 and M.Sc. in 1980, with thesis work supervised by David Hill. He worked for several years as a sessional instructor at the University of Calgary and as a statistical consultant in industry before returning to academia. Neal continued his studies at the University of Toronto, where he received his Ph.D. in 1995 under the supervision of Geoffrey Hinton. [2] Neal became an assistant professor at the University of Toronto in 1995, an associate professor in 1999, and a full professor in 2001. He held the Canada Research Chair in Statistics and Machine Learning from 2003 to 2016 and retired in 2017.

Neal has made significant contributions to machine learning and statistics. He is particularly well known for his work on Markov chain Monte Carlo, [3] [4] error-correcting codes [5] and Bayesian learning for neural networks. [6] He is also known for his blog [7] and as the developer of pqR, a new version of the R interpreter. [8]

Bibliography

Books and chapters

Selected papers

Related Research Articles

A hidden Markov model (HMM) is a Markov model in which the observations are dependent on a latent Markov process X. An HMM requires that there be an observable process Y whose outcomes depend on the outcomes of X in a known way. Since X cannot be observed directly, the goal is to learn about the state of X by observing Y. By definition of being a Markov model, an HMM has an additional requirement that the outcome of Y at time t must be "influenced" exclusively by the outcome of X at time t, and that the outcomes of X and Y at earlier times must be conditionally independent of Y at time t given X at time t. Estimation of the parameters in an HMM can be performed using maximum likelihood. For linear chain HMMs, the Baum–Welch algorithm can be used to estimate the parameters.
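
As an illustration, the likelihood of an observation sequence under a small linear-chain HMM can be computed with the forward recursion. The Python sketch below uses a hypothetical two-state, two-symbol model; the matrices are illustrative values, not taken from any source above.

```python
# Minimal sketch of the forward algorithm for a linear-chain HMM.
# The two-state model below is hypothetical; the numbers are illustrative only.
import numpy as np

def forward(obs, pi, A, B):
    """Return p(observations) given initial dist pi, transitions A, emissions B."""
    alpha = pi * B[:, obs[0]]             # joint prob. of first state and first symbol
    for y in obs[1:]:
        alpha = (alpha @ A) * B[:, y]     # propagate the hidden state, absorb next symbol
    return alpha.sum()

pi = np.array([0.6, 0.4])                 # initial state distribution
A = np.array([[0.7, 0.3],                 # state transition probabilities
              [0.2, 0.8]])
B = np.array([[0.9, 0.1],                 # emission probabilities per state
              [0.3, 0.7]])

print(forward([0, 1, 1, 0], pi, A, B))    # likelihood of the symbol sequence 0, 1, 1, 0
```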

A Bayesian network is a probabilistic graphical model that represents a set of variables and their conditional dependencies via a directed acyclic graph (DAG). While it is one of several forms of causal notation, causal networks are special cases of Bayesian networks. Bayesian networks are ideal for taking an event that occurred and predicting the likelihood that any one of several possible known causes was the contributing factor. For example, a Bayesian network could represent the probabilistic relationships between diseases and symptoms. Given symptoms, the network can be used to compute the probabilities of the presence of various diseases.
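
A hypothetical two-node network (Disease influencing Symptom) makes the disease-and-symptom example concrete; the probabilities below are made up purely for illustration.

```python
# Minimal sketch of exact inference in a two-node Bayesian network
# Disease -> Symptom; all probabilities are illustrative, not from any source.
p_disease = 0.01                              # prior P(D = 1)
p_symptom_given_d = {1: 0.90, 0: 0.05}        # P(S = 1 | D = d)

# Joint P(D = d, S = 1) for each disease state, then condition on the symptom.
joint = {d: (p_disease if d else 1 - p_disease) * p_symptom_given_d[d] for d in (0, 1)}
p_disease_given_symptom = joint[1] / (joint[0] + joint[1])   # Bayes' rule
print(p_disease_given_symptom)                # about 0.15 with these numbers
```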

In statistics, Markov chain Monte Carlo (MCMC) methods comprise a class of algorithms for sampling from a probability distribution. By constructing a Markov chain that has the desired distribution as its equilibrium distribution, one can obtain a sample of the desired distribution by recording states from the chain. The more steps that are included, the more closely the distribution of the sample matches the actual desired distribution. Various algorithms exist for constructing chains, including the Metropolis–Hastings algorithm.
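
For example, a minimal random-walk Metropolis–Hastings sampler targeting a standard normal distribution can be sketched in a few lines of Python; the target, proposal scale, and chain length are illustrative choices.

```python
# Minimal sketch of random-walk Metropolis-Hastings targeting N(0, 1).
# Target, proposal scale, and chain length are illustrative choices.
import numpy as np

rng = np.random.default_rng(0)

def log_target(x):
    return -0.5 * x**2                    # unnormalised log density of N(0, 1)

x, samples = 0.0, []
for _ in range(10_000):
    proposal = x + rng.normal(scale=1.0)  # symmetric random-walk proposal
    # Accept with probability min(1, target(proposal) / target(x)).
    if np.log(rng.uniform()) < log_target(proposal) - log_target(x):
        x = proposal
    samples.append(x)                     # record the current state of the chain

print(np.mean(samples), np.var(samples))  # should approach 0 and 1
```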

Bayesian statistics is a theory in the field of statistics based on the Bayesian interpretation of probability where probability expresses a degree of belief in an event. The degree of belief may be based on prior knowledge about the event, such as the results of previous experiments, or on personal beliefs about the event. This differs from a number of other interpretations of probability, such as the frequentist interpretation that views probability as the limit of the relative frequency of an event after many trials. More concretely, analysis in Bayesian methods codifies prior knowledge in the form of a prior distribution.
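
A conjugate Beta-Binomial model is perhaps the simplest concrete example of codifying prior knowledge in a prior distribution; the prior and data below (7 heads in 10 flips) are hypothetical.

```python
# Minimal sketch of a conjugate Beta-Binomial update for a coin's bias.
# Prior and data are hypothetical, chosen only to illustrate the mechanics.
a_prior, b_prior = 1.0, 1.0              # Beta(1, 1): a uniform prior belief
heads, tails = 7, 3                      # observed data: 7 heads in 10 flips

a_post, b_post = a_prior + heads, b_prior + tails   # posterior is Beta(8, 4)
posterior_mean = a_post / (a_post + b_post)
print(posterior_mean)                    # 0.667: pulled from the prior mean 0.5 toward 0.7
```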

In statistics, Gibbs sampling or a Gibbs sampler is a Markov chain Monte Carlo (MCMC) algorithm for sampling from a specified multivariate probability distribution when direct sampling from the joint distribution is difficult, but sampling from each conditional distribution is more practical. The algorithm produces a sequence of samples by repeatedly drawing each variable from its distribution conditioned on the current values of the other variables. This sequence can be used to approximate the joint distribution; to approximate the marginal distribution of one of the variables, or some subset of the variables; or to compute an integral. Typically, some of the variables correspond to observations whose values are known, and hence do not need to be sampled.
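
As a small illustration, Gibbs sampling from a bivariate normal distribution with correlation rho alternates between the two univariate conditional normals; the correlation and chain length below are illustrative.

```python
# Minimal sketch of Gibbs sampling from a bivariate normal with correlation rho.
# Each full conditional is a univariate normal; the numbers are illustrative only.
import numpy as np

rng = np.random.default_rng(0)
rho = 0.8
x, y = 0.0, 0.0
samples = []
for _ in range(10_000):
    x = rng.normal(rho * y, np.sqrt(1 - rho**2))   # draw x from p(x | y)
    y = rng.normal(rho * x, np.sqrt(1 - rho**2))   # draw y from p(y | x)
    samples.append((x, y))

print(np.corrcoef(np.array(samples).T)[0, 1])      # should approach rho = 0.8
```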

The Helmholtz machine is a type of artificial neural network that can account for the hidden structure of a set of data by being trained to create a generative model of the original set of data. The hope is that by learning economical representations of the data, the underlying structure of the generative model should reasonably approximate the hidden structure of the data set. A Helmholtz machine contains two networks, a bottom-up recognition network that takes the data as input and produces a distribution over hidden variables, and a top-down "generative" network that generates values of the hidden variables and the data itself. When introduced, Helmholtz machines were one of a handful of learning architectures that used feedback as well as feedforward connections to ensure the quality of learned models.

Peter Dayan (researcher in computational neuroscience)

Peter Dayan is a British neuroscientist and computer scientist who is director at the Max Planck Institute for Biological Cybernetics in Tübingen, Germany, along with Ivan De Araujo. He is co-author of Theoretical Neuroscience, an influential textbook on computational neuroscience. He is known for applying Bayesian methods from machine learning and artificial intelligence to understand neural function and is particularly recognized for relating neurotransmitter levels to prediction errors and Bayesian uncertainties. He has pioneered the field of reinforcement learning (RL) where he helped develop the Q-learning algorithm, and made contributions to unsupervised learning, including the wake-sleep algorithm for neural networks and the Helmholtz machine.

Bayesian inference of phylogeny combines the information in the prior and in the data likelihood to create the so-called posterior probability of trees, which is the probability that the tree is correct given the data, the prior and the likelihood model. Bayesian inference was introduced into molecular phylogenetics in the 1990s by three independent groups: Bruce Rannala and Ziheng Yang in Berkeley, Bob Mau in Madison, and Shuying Li at the University of Iowa, the last two being PhD students at the time. The approach has become very popular since the release of the MrBayes software in 2001, and is now one of the most popular methods in molecular phylogenetics.

WinBUGS (statistical software for Bayesian analysis)

WinBUGS is statistical software for Bayesian analysis using Markov chain Monte Carlo (MCMC) methods.

Just another Gibbs sampler (JAGS) is a program for simulation from Bayesian hierarchical models using Markov chain Monte Carlo (MCMC), developed by Martyn Plummer. JAGS has been employed for statistical work in many fields, for example ecology, management, and genetics.

The Hamiltonian Monte Carlo algorithm is a Markov chain Monte Carlo method for obtaining a sequence of random samples which converge to being distributed according to a target probability distribution for which direct sampling is difficult. This sequence can be used to estimate integrals with respect to the target distribution.
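
A minimal sketch of the algorithm for a one-dimensional standard normal target is shown below; the leapfrog step size and trajectory length are illustrative tuning choices, not prescribed values.

```python
# Minimal sketch of Hamiltonian Monte Carlo for a standard normal target.
# Step size and trajectory length are illustrative tuning choices.
import numpy as np

rng = np.random.default_rng(0)

def grad_log_target(q):
    return -q                                   # gradient of log N(0, 1) density

def hmc_step(q, step=0.1, n_leapfrog=20):
    p = rng.normal()                            # resample an auxiliary momentum
    q_new = q
    p_new = p + 0.5 * step * grad_log_target(q_new)      # initial half step for momentum
    for _ in range(n_leapfrog):
        q_new = q_new + step * p_new                     # full step for position
        p_new = p_new + step * grad_log_target(q_new)    # full step for momentum
    p_new = p_new - 0.5 * step * grad_log_target(q_new)  # trim last update to a half step
    # Metropolis accept/reject on the change in total energy H = potential + kinetic.
    h_old = 0.5 * q**2 + 0.5 * p**2
    h_new = 0.5 * q_new**2 + 0.5 * p_new**2
    return q_new if np.log(rng.uniform()) < h_old - h_new else q

q, samples = 0.0, []
for _ in range(5_000):
    q = hmc_step(q)
    samples.append(q)

print(np.mean(samples), np.var(samples))        # should approach 0 and 1
```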

Wake-sleep algorithm (unsupervised learning algorithm)

The wake-sleep algorithm is an unsupervised learning algorithm for deep generative models, especially Helmholtz machines. The algorithm is similar to the expectation-maximization algorithm, and optimizes the model likelihood for observed data. The name of the algorithm derives from its use of two learning phases, the “wake” phase and the “sleep” phase, which are performed alternately. It can be conceived as a model for learning in the brain, but it is also applied in machine learning.

Stuart Geman (American mathematician)

Stuart Alan Geman is an American mathematician, known for influential contributions to computer vision, statistics, probability theory, machine learning, and the neurosciences. He and his brother, Donald Geman, are well known for proposing the Gibbs sampler, and for the first proof of convergence of the simulated annealing algorithm.

Stan (probabilistic programming language for Bayesian inference)

Stan is a probabilistic programming language for statistical inference written in C++. The Stan language is used to specify a (Bayesian) statistical model with an imperative program calculating the log probability density function.

PyMC is a probabilistic programming language written in Python. It can be used for Bayesian statistical modeling and probabilistic machine learning.
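
For instance, a small model for the mean of normally distributed data might look like the sketch below (assuming a PyMC 4+ style API; the priors and data are hypothetical).

```python
# Minimal sketch of a PyMC model: inferring the mean of synthetic normal data.
# Assumes a PyMC 4+ style API; priors and data are hypothetical.
import numpy as np
import pymc as pm

data = np.random.default_rng(0).normal(loc=2.0, scale=1.0, size=50)  # synthetic data

with pm.Model():
    mu = pm.Normal("mu", mu=0.0, sigma=10.0)            # prior on the unknown mean
    pm.Normal("obs", mu=mu, sigma=1.0, observed=data)   # likelihood of the observations
    trace = pm.sample(1000, tune=1000)                  # MCMC (NUTS by default)

print(float(trace.posterior["mu"].mean()))              # posterior mean, near 2.0
```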

Yee-Whye Teh is a professor of statistical machine learning in the Department of Statistics, University of Oxford. Prior to 2012 he was a reader at the Gatsby Charitable Foundation computational neuroscience unit at University College London. His work is primarily in machine learning, artificial intelligence, statistics and computer science.

Éric Moulines (French researcher in statistical learning)

Éric Moulines is a French researcher in statistical learning and signal processing. He received the silver medal from the CNRS in 2010 and the France Télécom prize, awarded in collaboration with the French Academy of Sciences, in 2011. He was appointed a Fellow of the European Association for Signal Processing in 2012 and of the Institute of Mathematical Statistics in 2016. He is a General Engineer of the Corps des Mines (X81).

ArviZ is a Python package for exploratory analysis of Bayesian models.

An energy-based model (EBM) is a form of generative model (GM) imported directly from statistical physics to learning.

Siddhartha Chib is an econometrician and statistician, the Harry C. Hartkopf Professor of Econometrics and Statistics at Washington University in St. Louis. His work is primarily in Bayesian statistics, econometrics, and Markov chain Monte Carlo methods.

References

  1. "Radford M. Neal Curriculum Vitae" (PDF). User radford at cs.utoronto.ca. Retrieved 4 May 2015.
  2. Neal, Radford M. (2022-05-31). "Curriculum Vitae" (PDF).
  3. Neal, Radford (1993). Probabilistic Inference Using Markov Chain Monte Carlo Methods (PDF) (Report). Technical Report CRG-TR-93-1, Department of Computer Science, University of Toronto. p. 144. Retrieved 9 May 2015.
  4. Neal, Radford M. (2011). "MCMC Using Hamiltonian Dynamics" (PDF). In Steve Brooks; Andrew Gelman; Galin L. Jones; Xiao-Li Meng (eds.). Handbook of Markov Chain Monte Carlo. Chapman and Hall/CRC. ISBN 978-0470177938.
  5. MacKay, D. J. C.; Neal, R. M. (1996). "Near Shannon limit performance of low density parity check codes". Electronics Letters. 32 (18): 1645. Bibcode:1996ElL....32.1645M. doi:10.1049/el:19961141.
  6. Neal, R. M. (1996). Bayesian Learning for Neural Networks. Lecture Notes in Statistics. Vol. 118. doi:10.1007/978-1-4612-0745-0. ISBN 978-0-387-94724-2.
  7. "Radford Neal's blog" . Retrieved 9 May 2015.
  8. "pqR - a pretty quick version of R" . Retrieved 9 May 2015.