Andrew G. Barto | |
---|---|
Born | c. 1948 (age 75–76) |
Nationality | American |
Alma mater | University of Michigan |
Awards | IEEE Neural Networks Society Pioneer Award, IJCAI Award for Research Excellence |
Scientific career | |
Fields | Computer science |
Institutions | University of Massachusetts Amherst |
Doctoral students |
Andrew G. Barto (born c. 1948) is an American computer scientist, currently Professor Emeritus of computer science at the University of Massachusetts Amherst. Barto is best known for his foundational contributions to the field of modern computational reinforcement learning. [1]
Barto received his B.S. with distinction in mathematics from the University of Michigan in 1970, after having initially majored in naval architecture and engineering. After reading work by Michael Arbib and by McCulloch and Pitts, he became interested in using computers and mathematics to model the brain, and five years later was awarded a Ph.D. in computer science for a thesis on cellular automata. [2]
In 1977, Barto joined the College of Information and Computer Sciences at the University of Massachusetts Amherst as a postdoctoral research associate. He was promoted to associate professor in 1982 and to full professor in 1991. He was department chair from 2007 to 2011 and a core faculty member of the Neuroscience and Behavior program. [3]
During his time at UMass, Barto co-directed the Autonomous Learning Laboratory (initially the Adaptive Network Laboratory), which generated several key ideas in reinforcement learning. Richard Sutton, with whom he co-authored the influential book Reinforcement Learning: An Introduction (MIT Press 1998; 2nd edition 2018), was his first PhD student. Barto graduated 27 PhD students, thirteen of whom went on to become professors. [3]
Barto published over one hundred papers or chapters in journals, books, and conference and workshop proceedings. He is co-author with Richard Sutton of the book Reinforcement Learning: An Introduction, MIT Press 1998 (2nd edition 2018), and co-editor with Jennie Si, Warren Powell, and Don Wunsch II of the Handbook of Learning and Approximate Dynamic Programming, Wiley-IEEE Press, 2004. [4]
Barto is a Fellow of the American Association for the Advancement of Science, a Fellow and Senior Member of the IEEE, [5] and a member of the American Association for Artificial Intelligence and the Society for Neuroscience.
Barto was awarded the IEEE Neural Networks Society Pioneer Award in 2004, [6] the IJCAI Award for Research Excellence in 2017, and the UMass Neurosciences Lifetime Achievement Award in 2019. The citation for the IJCAI award read:
Professor Barto is recognized for his groundbreaking and impactful research in both the theory and application of reinforcement learning. [1]
Reinforcement learning (RL) is an interdisciplinary area of machine learning and optimal control concerned with how an intelligent agent should take actions in a dynamic environment in order to maximize a reward signal. Reinforcement learning is one of the three basic machine learning paradigms, alongside supervised learning and unsupervised learning.
Stephen Byram Furber is a British computer scientist, mathematician and hardware engineer, and Emeritus ICL Professor of Computer Engineering in the Department of Computer Science at the University of Manchester, UK. After completing his education at the University of Cambridge, he spent the 1980s at Acorn Computers, where he was a principal designer of the BBC Micro and the ARM 32-bit RISC microprocessor. As of 2023, over 250 billion ARM chips have been manufactured, powering much of the world's mobile computing and embedded systems, everything from sensors to smartphones to servers.
Terrence Joseph Sejnowski is the Francis Crick Professor at the Salk Institute for Biological Studies where he directs the Computational Neurobiology Laboratory and is the director of the Crick-Jacobs center for theoretical and computational biology. He has performed pioneering research in neural networks and computational neuroscience.
Temporal difference (TD) learning refers to a class of model-free reinforcement learning methods which learn by bootstrapping from the current estimate of the value function. These methods sample from the environment, like Monte Carlo methods, and perform updates based on current estimates, like dynamic programming methods.
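As a rough illustration of the bootstrapping idea described above, the following is a minimal sketch of tabular TD(0) value estimation. The environment is an assumption chosen for illustration: a five-state random walk that starts in the middle, moves left or right with equal probability, and pays a reward of 1 only when it exits on the right. The function name `td0_random_walk` and all parameter values are hypothetical and do not come from any particular library.

```python
import random


def td0_random_walk(episodes=5000, alpha=0.05, gamma=1.0, seed=0):
    """Estimate state values of a 5-state random walk with tabular TD(0).

    Hypothetical setup: states 0..4, start in state 2, step left or right
    with equal probability. Exiting left gives reward 0, exiting right
    gives reward 1; all other transitions give reward 0.
    """
    rng = random.Random(seed)
    n_states = 5
    values = [0.5] * n_states  # initial value estimates

    for _ in range(episodes):
        state = n_states // 2
        while True:
            # Sample one transition from the environment (as in Monte Carlo).
            next_state = state + rng.choice((-1, 1))
            if next_state < 0:
                reward, next_value, done = 0.0, 0.0, True
            elif next_state >= n_states:
                reward, next_value, done = 1.0, 0.0, True
            else:
                reward, next_value, done = 0.0, values[next_state], False

            # Bootstrap: the target uses the current estimate of the next
            # state's value (as in dynamic programming), not the full return.
            td_error = reward + gamma * next_value - values[state]
            values[state] += alpha * td_error

            if done:
                break
            state = next_state

    return values


if __name__ == "__main__":
    for s, v in enumerate(td0_random_walk()):
        print(f"state {s}: estimated value {v:.2f}")
```

Under these assumptions the true values are 1/6, 2/6, 3/6, 4/6, and 5/6, so the estimates should rise from left to right; with a constant step size they fluctuate around those values rather than converging exactly.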
Michael Anthony Arbib is an American computational neuroscientist. He is an adjunct professor of Psychology at the University of California at San Diego and professor emeritus at the University of Southern California; before his 2016 retirement he was the Fletcher Jones Professor of computer science, as well as a professor of biological sciences, biomedical engineering, electrical engineering, neuroscience and psychology.
Stephen Grossberg is a cognitive scientist, theoretical and computational psychologist, neuroscientist, mathematician, biomedical engineer, and neuromorphic technologist. He is the Wang Professor of Cognitive and Neural Systems and a Professor Emeritus of Mathematics & Statistics, Psychological & Brain Sciences, and Biomedical Engineering at Boston University.
Eduardo Daniel Sontag is an Argentine-American mathematician and distinguished university professor at Northeastern University, who works in the fields of control theory, dynamical systems, systems molecular biology, cancer and immunology, theoretical computer science, neural networks, and computational biology.
Peter Dayan is a British neuroscientist and computer scientist who is director at the Max Planck Institute for Biological Cybernetics in Tübingen, Germany, along with Ivan De Araujo. He is co-author of Theoretical Neuroscience, an influential textbook on computational neuroscience. He is known for applying Bayesian methods from machine learning and artificial intelligence to understand neural function and is particularly recognized for relating neurotransmitter levels to prediction errors and Bayesian uncertainties. He has pioneered the field of reinforcement learning (RL) where he helped develop the Q-learning algorithm, and made contributions to unsupervised learning, including the wake-sleep algorithm for neural networks and the Helmholtz machine.
Jacek M. Zurada is a Polish-American computer scientist who serves as a Professor in the Electrical and Computer Engineering Department at the University of Louisville, Kentucky. His M.S. and Ph.D. degrees are from Politechnika Gdańska, ranked as #1 among Polish universities of technology. He has held visiting appointments at the Swiss Federal Institute of Technology, Zurich, Princeton, Northeastern, Auburn, and at overseas universities in Australia, Chile, China, France, Germany, Hong Kong, Italy, Japan, Poland, Singapore, Spain, and South Africa. He is a Life Fellow of the IEEE, a Fellow of the International Neural Networks Society, and Doctor Honoris Causa of Czestochowa Institute of Technology, Poland.
Richard S. Sutton is a Canadian computer scientist. He is a professor of computing science at the University of Alberta and a research scientist at Keen Technologies. Sutton is considered one of the founders of modern computational reinforcement learning, having made several significant contributions to the field, including temporal difference learning and policy gradient methods.
Hava Siegelmann is an American computer scientist and Provost Professor at the University of Massachusetts Amherst.
Leslie Pack Kaelbling is an American roboticist and the Panasonic Professor of Computer Science and Engineering at the Massachusetts Institute of Technology. She is widely recognized for adapting partially observable Markov decision processes from operations research for application in artificial intelligence and robotics. Kaelbling received the IJCAI Computers and Thought Award in 1997 for applying reinforcement learning to embedded control systems and developing programming tools for robot navigation. In 2000, she was elected as a Fellow of the Association for the Advancement of Artificial Intelligence.
Dimitri Panteli Bertsekas is an applied mathematician, electrical engineer, and computer scientist, a McAfee Professor at the Department of Electrical Engineering and Computer Science in the School of Engineering at the Massachusetts Institute of Technology (MIT), Cambridge, Massachusetts, and also a Fulton Professor of Computational Decision Making at Arizona State University, Tempe.
Eric Poe Xing is an American computer scientist whose research spans machine learning, computational biology, and statistical methodology. Xing is founding President of the world’s first artificial intelligence university, Mohamed bin Zayed University of Artificial Intelligence (MBZUAI).
Christoph von der Malsburg is a German physicist and neuroscientist.
Klaus-Robert Müller is a German computer scientist and physicist, most noted for his work in machine learning and brain–computer interfaces.
Stefan Schaal is a German-American computer scientist specializing in robotics, machine learning, autonomous systems, and computational neuroscience.
Prashant Shenoy is an Indian-American computer scientist. He is a Distinguished Professor of Computer Science in the College of Information and Computer Sciences at the University of Massachusetts Amherst. He is known for his contributions to distributed computing, computer networks, cloud computing, and computational sustainability.
Victor R. Lesser is Distinguished Professor Emeritus in the School of Computer Science at the University of Massachusetts at Amherst and the Director of the Multi-Agent Systems Laboratory. He is widely considered the founding father of multi-agent systems. He received the IJCAI Award for Research Excellence in 2009.
Donald C. Wunsch II is Mary K. Finley Distinguished Professor of computer engineering at the Missouri University of Science and Technology, and a Fellow of the Institute of Electrical and Electronics Engineers. He is known for his work on "hardware implementations, reinforcement and unsupervised learning".