Leslie P. Kaelbling | |
---|---|
Nationality | American |
Alma mater | Stanford University |
Known for | Partially observable Markov decision process; founder and first editor-in-chief of the Journal of Machine Learning Research |
Awards | IJCAI Computers and Thought Award (1997); AAAI Fellow (2000) |
Scientific career | |
Fields | Robotics; Computer Science |
Institutions | SRI International; Brown University; Massachusetts Institute of Technology |
Thesis | Learning in Embedded Systems (1990) |
Doctoral advisor | Nils J. Nilsson |
Doctoral students | Michael L. Littman; Leonid Peshkin; Kristian Kersting |
Website | people |
Leslie Pack Kaelbling is an American roboticist and the Panasonic Professor of Computer Science and Engineering at the Massachusetts Institute of Technology.[1] She is widely recognized for adapting partially observable Markov decision processes from operations research for application in artificial intelligence and robotics.[2][3][4] Kaelbling received the IJCAI Computers and Thought Award in 1997 for applying reinforcement learning to embedded control systems and developing programming tools for robot navigation.[5] In 2000, she was elected as a Fellow of the Association for the Advancement of Artificial Intelligence.[6]
Kaelbling received an A.B. in Philosophy in 1983 and a Ph.D. in Computer Science in 1990, both from Stanford University.[7] During this time she was also affiliated with the Center for the Study of Language and Information.[8] She then worked at SRI International and its robotics spin-off Teleos Research before joining the faculty at Brown University. She left Brown in 1999 to join the faculty at MIT.[9] Her research focuses on decision-making under uncertainty, machine learning, and sensing, with applications to robotics.[7]
In the spring of 2000, she and two-thirds of the editorial board of the Kluwer-owned journal Machine Learning resigned in protest of its pay-to-access archives, which charged readers for access while offering authors little financial compensation.[10] Kaelbling co-founded and served as the first editor-in-chief of the Journal of Machine Learning Research, a peer-reviewed open-access journal on the same topics that lets researchers publish articles for free and retain copyright, with its archives freely available online.[11] In response to the mass resignation, Kluwer changed its publishing policy to allow authors to self-archive their papers online after peer review. Kaelbling called the new policy reasonable and acknowledged that it would have made an alternative journal unnecessary, but noted that the editorial board had long made clear it wanted such a policy, and that the change came only after the threat of resignations and the actual founding of JMLR.[12]
Reinforcement learning (RL) is an interdisciplinary area of machine learning and optimal control concerned with how an intelligent agent should take actions in a dynamic environment in order to maximize a reward signal. Reinforcement learning is one of the three basic machine learning paradigms, alongside supervised learning and unsupervised learning.
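The trial-and-error loop at the heart of reinforcement learning can be illustrated with a short sketch. Below is a minimal tabular Q-learning example on an invented five-state chain world; the environment, reward structure, and hyperparameters are all hypothetical, chosen only to demonstrate the value-update rule.

```python
import random

# A hypothetical chain world: states 0..4, actions 0 (left) and 1 (right).
# Reaching state 4 yields reward 1 and ends the episode; every other step yields 0.
N_STATES, N_ACTIONS = 5, 2

def step(state, action):
    next_state = max(0, state - 1) if action == 0 else min(N_STATES - 1, state + 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward, next_state == N_STATES - 1

Q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]  # tabular action-value estimates
alpha, gamma, epsilon = 0.1, 0.95, 0.1            # step size, discount, exploration rate

for episode in range(500):
    s, done = 0, False
    while not done:
        # Epsilon-greedy: usually exploit the best-known action, sometimes explore.
        if random.random() < epsilon:
            a = random.randrange(N_ACTIONS)
        else:
            a = max(range(N_ACTIONS), key=lambda x: Q[s][x])
        s2, r, done = step(s, a)
        # Temporal-difference update toward the one-step bootstrapped target.
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2

print(Q)  # the learned values come to favor action 1 (move right) in every state
```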
A multi-agent system is a computerized system composed of multiple interacting intelligent agents. Multi-agent systems can solve problems that are difficult or impossible for an individual agent or a monolithic system to solve. Intelligence may include methodic, functional, procedural approaches, algorithmic search or reinforcement learning.
Automated planning and scheduling, sometimes denoted as simply AI planning, is a branch of artificial intelligence that concerns the realization of strategies or action sequences, typically for execution by intelligent agents, autonomous robots and unmanned vehicles. Unlike classical control and classification problems, the solutions are complex and must be discovered and optimized in multidimensional space. Planning is also related to decision theory.
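Classical planning can be read as search for an action sequence through a state space. The sketch below is a minimal breadth-first forward-search planner over an invented STRIPS-style domain; the propositions and actions are hypothetical, for illustration only.

```python
from collections import deque

# Each hypothetical action: (preconditions, add list, delete list) over propositions.
ACTIONS = {
    "pick_up":      ({"hand_empty", "block_on_table"}, {"holding_block"},
                     {"hand_empty", "block_on_table"}),
    "put_on_shelf": ({"holding_block"}, {"block_on_shelf", "hand_empty"},
                     {"holding_block"}),
}

def plan(initial, goal):
    """Breadth-first search for an action sequence reaching a state containing goal."""
    frontier = deque([(frozenset(initial), [])])
    visited = {frozenset(initial)}
    while frontier:
        state, seq = frontier.popleft()
        if goal <= state:
            return seq
        for name, (pre, add, delete) in ACTIONS.items():
            if pre <= state:  # action is applicable in this state
                nxt = frozenset((state - delete) | add)
                if nxt not in visited:
                    visited.add(nxt)
                    frontier.append((nxt, seq + [name]))
    return None  # no plan exists

print(plan({"hand_empty", "block_on_table"}, {"block_on_shelf"}))
# -> ['pick_up', 'put_on_shelf']
```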
Michael Irwin Jordan is an American scientist, professor at the University of California, Berkeley, research scientist at Inria Paris, and researcher in machine learning, statistics, and artificial intelligence.
A partially observable Markov decision process (POMDP) is a generalization of a Markov decision process (MDP). A POMDP models an agent decision process in which it is assumed that the system dynamics are determined by an MDP, but the agent cannot directly observe the underlying state. Instead, it must maintain a belief over the possible states, computed from a sensor model (the probability of each observation given the underlying state) together with the underlying MDP. Unlike the policy function in an MDP, which maps the underlying states to actions, a POMDP's policy is a mapping from the history of observations (or belief states) to actions.
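The belief a POMDP agent maintains is updated with Bayes' rule from the transition and observation models. A minimal sketch of that update for an invented two-state problem follows; the model probabilities are hypothetical, chosen only to show the mechanics.

```python
# Hypothetical two-state POMDP with a single "listen" action that leaves the state unchanged.
T = {"listen": [[1.0, 0.0], [0.0, 1.0]]}      # T[a][s][s2]: transition probabilities
O = {"listen": [[0.85, 0.15], [0.15, 0.85]]}  # O[a][s2][o]: observation probabilities

def update_belief(belief, action, obs):
    """b'(s2) is proportional to O(o | s2, a) * sum_s T(s2 | s, a) * b(s)."""
    new_belief = []
    for s2 in range(len(belief)):
        predicted = sum(T[action][s][s2] * belief[s] for s in range(len(belief)))
        new_belief.append(O[action][s2][obs] * predicted)
    total = sum(new_belief)
    return [p / total for p in new_belief]  # renormalize

b = [0.5, 0.5]                     # start fully uncertain about the hidden state
b = update_belief(b, "listen", 0)  # observe evidence for state 0
print(b)                           # -> [0.85, 0.15]
```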
In computer science, a predictive state representation (PSR) is a way to model the state of a controlled dynamical system from a history of actions taken and the resulting observations. A PSR captures the state of a system as a vector of predictions for future tests (experiments) that can be done on the system. A test is a sequence of action-observation pairs, and its prediction is the probability of the test's observation sequence occurring if the test's action sequence were executed on the system. One advantage of using PSRs is that the predictions are directly related to observable quantities. This is in contrast to other models of dynamical systems, such as partially observable Markov decision processes (POMDPs), where the state of the system is represented as a probability distribution over unobserved nominal states.
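To make the notion of a test concrete, the sketch below computes a test's prediction (the probability of its observation sequence, given that its action sequence is executed) using a known, invented hidden-state model; a full PSR would instead carry a vector of such test predictions as its state representation.

```python
# Hypothetical two-state hidden model used only to define test probabilities.
T = {"a": [[0.9, 0.1], [0.2, 0.8]]}  # T[action][s][s2]: transition probabilities
O = {"a": [[0.7, 0.3], [0.4, 0.6]]}  # O[action][s2][obs]: observation probabilities

def test_prediction(belief, test):
    """P(the test's observations occur | its actions are executed from `belief`)."""
    prob = 1.0
    for action, obs in test:
        # Probability of seeing `obs` after taking `action` from the current belief.
        joint = [O[action][s2][obs] * sum(T[action][s][s2] * belief[s] for s in range(2))
                 for s2 in range(2)]
        step_prob = sum(joint)
        prob *= step_prob
        belief = [p / step_prob for p in joint]  # condition on having seen `obs`
    return prob

# Prediction of the two-step test (a, obs 0), (a, obs 0) from a uniform belief.
print(test_prediction([0.5, 0.5], [("a", 0), ("a", 0)]))
```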
Michael Lederman Littman is a computer scientist, researcher, educator, and author. His research interests focus on reinforcement learning. He is currently a University Professor of Computer Science at Brown University, where he has taught since 2012.
In probability theory, a Markov model is a stochastic model used to model randomly changing systems. It is assumed that future states depend only on the current state, not on the events that occurred before it (that is, the model has the Markov property). Generally, this assumption enables reasoning and computation with the model that would otherwise be intractable. For this reason, in the fields of predictive modelling and probabilistic forecasting, it is desirable for a given model to exhibit the Markov property.
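Because of the Markov property, simulating such a model requires only the current state and a table of transition probabilities. The sketch below runs an invented two-state weather chain; the states and probabilities are hypothetical.

```python
import random

# Hypothetical two-state weather chain: tomorrow depends only on today.
P = {"sunny": {"sunny": 0.8, "rainy": 0.2},
     "rainy": {"sunny": 0.4, "rainy": 0.6}}

def simulate(start, steps):
    state, path = start, [start]
    for _ in range(steps):
        r, cumulative = random.random(), 0.0
        for nxt, p in P[state].items():  # sample the next state from P[state]
            cumulative += p
            if r < cumulative:
                state = nxt
                break
        path.append(state)
    return path

print(simulate("sunny", 10))  # e.g. ['sunny', 'sunny', 'rainy', ...]
```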
Sven Koenig is a full professor in computer science at the University of Southern California. He received an M.S. degree in computer science from the University of California at Berkeley in 1991 and a Ph.D. in computer science from Carnegie Mellon University in 1997, advised by Reid Simmons.
John Matthew Hollerbach is a professor of computer science and research professor of mechanical engineering at the University of Utah. He is the editor of The International Journal of Robotics Research, a Senior Editor of Presence: Teleoperators & Virtual Environments, and a Governing Board member of the electronic journal Haptics-e.
Action model learning is an area of machine learning concerned with the creation and modification of a software agent's knowledge about the effects and preconditions of the actions that can be executed within its environment. This knowledge is usually represented in a logic-based action description language and used as the input for automated planners.
Peter Stone is an American computer scientist who holds the Truchard Foundation Chair of Computer Science at The University of Texas at Austin. He is also Chief Scientist of Sony AI, an Alfred P. Sloan Research Fellow, Guggenheim Fellow, AAAI Fellow, IEEE Fellow, AAAS Fellow, ACM Fellow, and Fulbright Scholar.
Shlomo Zilberstein is an Israeli-American computer scientist. He is a Professor of Computer Science and Associate Dean for Research and Engagement in the College of Information and Computer Sciences at the University of Massachusetts, Amherst. He graduated with a B.A. in Computer Science summa cum laude from Technion – Israel Institute of Technology in 1982, and received a Ph.D. in Computer Science from University of California at Berkeley in 1993, advised by Stuart J. Russell. He is known for his contributions to artificial intelligence, anytime algorithms, multi-agent systems, and automated planning and scheduling algorithms, notably within the context of Markov decision processes (MDPs), Partially Observable MDPs (POMDPs), and Decentralized POMDPs (Dec-POMDPs).
Stephen John Young is a British researcher, Professor of Information Engineering at the University of Cambridge and an entrepreneur. He is one of the pioneers of automated speech recognition and statistical spoken dialogue systems. He served as the Senior Pro-Vice-Chancellor of the University of Cambridge from 2009 to 2015, responsible for planning and resources. From 2015 to 2019, he held a joint appointment between his professorship at Cambridge and Apple, where he was a senior member of the Siri development team.
Joëlle Pineau is a Canadian computer scientist and Associate Professor at McGill University. She is the global Vice President of Facebook Artificial Intelligence Research (FAIR), now known as Meta AI, and is based in Montreal, Quebec. She was elected a Fellow of the Royal Society of Canada in 2023.
Pieter Abbeel is a professor of electrical engineering and computer sciences, Director of the Berkeley Robot Learning Lab, and co-director of the Berkeley AI Research (BAIR) Lab at the University of California, Berkeley. He is also the co-founder of Covariant, a venture-funded start-up that aims to teach robots new, complex skills, and co-founder of Gradescope, an online grading system that has been implemented in over 500 universities across the USA. He is best known for his cutting-edge research in robotics and machine learning, particularly in deep reinforcement learning. In 2021, he joined AIX Ventures as an Investment Partner. AIX Ventures is a venture capital fund that invests in artificial intelligence startups.
Deep reinforcement learning is a subfield of machine learning that combines reinforcement learning (RL) and deep learning. RL considers the problem of a computational agent learning to make decisions by trial and error. Deep RL incorporates deep learning into the solution, allowing agents to make decisions from unstructured input data without manual engineering of the state space. Deep RL algorithms are able to take in very large inputs and decide what actions to perform to optimize an objective. Deep reinforcement learning has been used for a diverse set of applications including but not limited to robotics, video games, natural language processing, computer vision, education, transportation, finance and healthcare.
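As a minimal sketch of the idea, the following uses PyTorch (assumed available) to run a REINFORCE-style policy-gradient update on an invented two-armed bandit. A single learnable parameter vector stands in for a deep network, so this shows the learning loop in miniature rather than a realistic deep RL system.

```python
import torch

torch.manual_seed(0)
logits = torch.zeros(2, requires_grad=True)   # learnable action preferences (the "network")
optimizer = torch.optim.Adam([logits], lr=0.1)
TRUE_REWARDS = [0.2, 0.8]                     # hypothetical expected payoff of each arm

for step in range(200):
    dist = torch.distributions.Categorical(logits=logits)
    action = dist.sample()                    # sample an arm from the current policy
    reward = torch.bernoulli(torch.tensor(TRUE_REWARDS[action.item()]))
    loss = -dist.log_prob(action) * reward    # REINFORCE: ascend expected reward
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print(torch.softmax(logits, dim=0))  # probability mass shifts toward the better arm
```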
Thomas L. Dean is an American computer scientist known for his work in robot planning, probabilistic graphical models, and computational neuroscience. He was one of the first to introduce ideas from operations research and control theory to artificial intelligence. In particular, he introduced the idea of the anytime algorithm and was the first to apply the factored Markov decision process to robotics. He has authored several influential textbooks on artificial intelligence.
Chelsea Finn is an American computer scientist and assistant professor at Stanford University. Her research investigates intelligence through the interactions of robots, in the hope of creating robotic systems that can learn how to learn. She is part of the Google Brain group.
Andrea L. Thomaz is a senior research scientist in the Department of Electrical and Computer Engineering at The University of Texas at Austin and Director of Socially Intelligent Machines Lab. She specializes in Human-Robot Interaction, Artificial Intelligence and Interactive Machine Learning.