David Silver | |
---|---|
Born | 1976 (age 47–48) |
Alma mater | University of Cambridge (BA), University of Alberta (PhD) |
Known for | AlphaGo, AlphaZero, AlphaStar |
Awards | Royal Society University Research Fellowship (2011), ACM Prize in Computing (2019) |
Scientific career | |
Fields | Artificial intelligence, machine learning, reinforcement learning, planning, computer games [1] |
Institutions | Google DeepMind, University College London, Elixir Studios |
Thesis | Reinforcement learning and simulation-based search in computer Go (2009) |
Website | www |
David Silver FRS (born 1976) is a principal research scientist at Google DeepMind and a professor at University College London. He has led research on reinforcement learning with AlphaGo and AlphaZero, and co-led AlphaStar. [1] [2]
He studied at Christ's College, Cambridge, [3] graduating in 1997 with the Addison-Wesley award; he befriended Demis Hassabis while at Cambridge. [4] Silver returned to academia in 2004 at the University of Alberta to study for a PhD on reinforcement learning, [5] during which he co-introduced the algorithms used in the first master-level 9×9 Go programs; he graduated in 2009. [6] [7] His version of the program MoGo (co-authored with Sylvain Gelly) was one of the strongest Go programs as of 2009. [8]
After graduating from university, Silver co-founded the video games company Elixir Studios, where he was CTO and lead programmer, receiving several awards for technology and innovation. [4] [9]
Silver was awarded a Royal Society University Research Fellowship in 2011, and subsequently became a lecturer at University College London. [10] His lectures on reinforcement learning are available on YouTube. [11] Silver consulted for Google DeepMind from its inception, joining full-time in 2013.
His recent work has focused on combining reinforcement learning with deep learning, including a program that learns to play Atari games directly from pixels. [12] Silver led the AlphaGo project, culminating in the first program to defeat a top professional player in the full-size game of Go. [13] AlphaGo subsequently received an honorary 9 dan professional certification and won the Cannes Lion award for innovation. [14] He then led development of AlphaZero, which used the same AI to learn to play Go from scratch (learning only by playing itself and not from human games) before learning to play chess and shogi in the same way, to higher levels than any other computer program.
Silver is among the most published members of staff at Google DeepMind, with over 200,000 citations and an h-index of 93 according to Google Scholar. [1]
Silver was awarded the 2019 ACM Prize in Computing for breakthrough advances in computer game-playing. [15]
In 2021, Silver was elected Fellow of the Royal Society (FRS) for his contributions to Deep Q-Networks and AlphaGo. [16] He was elected a Fellow of the Association for the Advancement of Artificial Intelligence in 2022. [17]
Sir Demis Hassabis is a British computer scientist, artificial intelligence researcher and entrepreneur. In his early career he was a video game AI programmer and designer, and an expert board games player. He is the chief executive officer and co-founder of DeepMind and Isomorphic Labs, and a UK Government AI Advisor.
Richard S. Sutton is a Canadian computer scientist. He is a professor of computing science at the University of Alberta and a research scientist at Keen Technologies. Sutton is considered one of the founders of modern computational reinforcement learning, having several significant contributions to the field, including temporal difference learning and policy gradient methods.
Michael Lederman Littman is a computer scientist, researcher, educator, and author. His research interests focus on reinforcement learning. He is currently a University Professor of Computer Science at Brown University, where he has taught since 2012.
Manuela Maria Veloso is the Head of J.P. Morgan AI Research and Herbert A. Simon University Professor Emeritus in the School of Computer Science at Carnegie Mellon University, where she was previously Head of the Machine Learning Department. She served as president of the Association for the Advancement of Artificial Intelligence (AAAI) until 2014, and is a co-founder and past president of the RoboCup Federation. She is a fellow of the AAAI, the Institute of Electrical and Electronics Engineers (IEEE), the American Association for the Advancement of Science (AAAS), and the Association for Computing Machinery (ACM). She is an international expert in artificial intelligence and robotics.
Dimitri Panteli Bertsekas is an applied mathematician, electrical engineer, and computer scientist, a McAfee Professor at the Department of Electrical Engineering and Computer Science in the School of Engineering at the Massachusetts Institute of Technology (MIT), Cambridge, Massachusetts, and also a Fulton Professor of Computational Decision Making at Arizona State University, Tempe.
In computer science, Monte Carlo tree search (MCTS) is a heuristic search algorithm for some kinds of decision processes, most notably those employed in software that plays board games. In that context MCTS is used to solve the game tree.
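The core MCTS loop (selection, expansion, simulation, backpropagation) can be illustrated on a toy game. The following sketch is an illustration only, not DeepMind's implementation: it runs UCT-style MCTS on a small subtraction game (a pile of counters, each player removes 1–3, and whoever takes the last counter wins); all names and parameters are chosen for the example.

```python
import math
import random

random.seed(0)

class Node:
    """One game state in the search tree. `wins` is counted from the
    perspective of the player who moved INTO this node."""
    def __init__(self, state, parent=None, move=None):
        self.state = state                    # counters remaining
        self.parent = parent
        self.move = move                      # move that produced this state
        self.children = []
        self.visits = 0
        self.wins = 0.0
        self.untried = [m for m in (1, 2, 3) if m <= state]

def uct_select(node, c=1.4):
    # UCB1: exploit high win rate, explore rarely-visited children.
    return max(node.children,
               key=lambda ch: ch.wins / ch.visits
                              + c * math.sqrt(math.log(node.visits) / ch.visits))

def rollout(state):
    """Random playout; True if the player to move at `state` wins."""
    turn = 0
    while True:
        state -= random.choice([m for m in (1, 2, 3) if m <= state])
        if state == 0:
            return turn == 0                  # mover who emptied the pile wins
        turn ^= 1

def mcts(root_state, iters=3000):
    root = Node(root_state)
    for _ in range(iters):
        node = root
        # 1. Selection: descend through fully expanded nodes.
        while not node.untried and node.children:
            node = uct_select(node)
        # 2. Expansion: add one untried child.
        if node.untried:
            m = node.untried.pop()
            child = Node(node.state - m, parent=node, move=m)
            node.children.append(child)
            node = child
        # 3. Simulation: value from the perspective of the mover into `node`.
        if node.state == 0:
            mover_wins = True                 # that move took the last counter
        else:
            mover_wins = not rollout(node.state)
        # 4. Backpropagation: flip the perspective at each level.
        result = 1.0 if mover_wins else 0.0
        while node is not None:
            node.visits += 1
            node.wins += result
            result = 1.0 - result
            node = node.parent
    # Recommend the most-visited move.
    return max(root.children, key=lambda ch: ch.visits).move
```

From a pile of 5 the only winning move is to take 1 (leaving a multiple of 4), and the search converges on it; game-specific programs such as AlphaGo replace the random rollout with a learned evaluation.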
Google DeepMind Technologies Limited is a British-American artificial intelligence research laboratory which serves as a subsidiary of Google. Founded in the UK in 2010, it was acquired by Google in 2014 and merged with Google AI's Google Brain division to become Google DeepMind in April 2023. The company is based in London, with research centres in Canada, France, Germany, and the United States.
Francesca Rossi is an Italian computer scientist, currently working at the IBM Thomas J. Watson Research Center as an IBM Fellow and the IBM AI Ethics Global Leader.
Pushmeet Kohli is a highly regarded computer and machine learning scientist at Google DeepMind, where he holds the position of Vice President of Research for "Secure and Reliable AI" and "AI for Science and Sustainability". Before joining DeepMind, he was a partner scientist and director of research at Microsoft Research and a post-doctoral fellow at the University of Cambridge. Kohli's research investigates applications of machine learning and computer vision. He has also made contributions in game theory, discrete algorithms and psychometrics.
AlphaGo is a computer program that plays the board game Go. It was developed by the London-based DeepMind Technologies, an acquired subsidiary of Google. Subsequent versions of AlphaGo became increasingly powerful, including a version that competed under the name Master. After retiring from competitive play, AlphaGo Master was succeeded by an even more powerful version known as AlphaGo Zero, which was completely self-taught without learning from human games. AlphaGo Zero was then generalized into a program known as AlphaZero, which played additional games, including chess and shogi. AlphaZero has in turn been succeeded by a program known as MuZero which learns without being taught the rules.
AlphaGo versus Lee Sedol, also known as the DeepMind Challenge Match, was a five-game Go match between top Go player Lee Sedol and AlphaGo, a computer Go program developed by DeepMind, played in Seoul, South Korea between 9 and 15 March 2016. AlphaGo won all but the fourth game; all games were won by resignation. The match has been compared with the historic chess match between Deep Blue and Garry Kasparov in 1997.
In artificial intelligence, a differentiable neural computer (DNC) is a memory augmented neural network architecture (MANN), which is typically recurrent in its implementation. The model was published in 2016 by Alex Graves et al. of DeepMind.
AlphaGo Zero is a version of DeepMind's Go software AlphaGo. AlphaGo's team published an article in the journal Nature on 19 October 2017, introducing AlphaGo Zero, a version created without using data from human games, and stronger than any previous version. By playing games against itself, AlphaGo Zero surpassed the strength of AlphaGo Lee in three days by winning 100 games to 0, reached the level of AlphaGo Master in 21 days, and exceeded all the old versions in 40 days.
AlphaZero is a computer program developed by artificial intelligence research company DeepMind to master the games of chess, shogi and Go. This algorithm uses an approach similar to AlphaGo Zero.
AlphaStar is a computer program by DeepMind that plays the video game StarCraft II. It was unveiled to the public by name in January 2019. In a significant milestone for artificial intelligence, AlphaStar attained Grandmaster status in August 2019.
Deep reinforcement learning is a subfield of machine learning that combines reinforcement learning (RL) and deep learning. RL considers the problem of a computational agent learning to make decisions by trial and error. Deep RL incorporates deep learning into the solution, allowing agents to make decisions from unstructured input data without manual engineering of the state space. Deep RL algorithms are able to take in very large inputs and decide what actions to perform to optimize an objective. Deep reinforcement learning has been used for a diverse set of applications including but not limited to robotics, video games, natural language processing, computer vision, education, transportation, finance and healthcare.
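The trial-and-error loop underlying deep RL can be sketched in tabular form. The example below is an illustrative assumption, not any DeepMind system: it runs Q-learning on a toy corridor environment (states 0–5, reward for reaching the right end), with the Q-table standing in for the neural network that deep RL would use on unstructured inputs.

```python
import random

random.seed(1)

# Toy corridor: states 0..5; the agent starts at 0 and earns +1 on reaching 5.
GOAL = 5
ACTIONS = (+1, -1)   # step right or left (greedy ties resolve to +1)

def step(state, action):
    nxt = min(max(state + action, 0), GOAL)
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward, nxt == GOAL

# Tabular value function; a deep RL agent would replace this dict with a network.
Q = {(s, a): 0.0 for s in range(GOAL + 1) for a in ACTIONS}
alpha, gamma, eps = 0.5, 0.9, 0.1     # learning rate, discount, exploration

for _ in range(500):                  # episodes of trial and error
    s, done = 0, False
    for _ in range(100):              # step cap per episode
        # Epsilon-greedy action selection.
        if random.random() < eps:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda m: Q[(s, m)])
        s2, r, done = step(s, a)
        # Q-learning update: bootstrap from the best value in the next state.
        target = r if done else r + gamma * max(Q[(s2, m)] for m in ACTIONS)
        Q[(s, a)] += alpha * (target - Q[(s, a)])
        s = s2
        if done:
            break

# The learned greedy policy: one action per non-terminal state.
greedy = [max(ACTIONS, key=lambda m: Q[(s, m)]) for s in range(GOAL)]
```

After training, the greedy policy steps right in every state. Deep Q-Networks, the line of work cited in Silver's FRS election, apply the same update rule with a convolutional network estimating Q from raw pixels.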
Artificial intelligence and machine learning techniques are used in video games for a wide variety of applications such as non-player character (NPC) control and procedural content generation (PCG). Machine learning is a subset of artificial intelligence that uses historical data to build predictive and analytical models. This is in sharp contrast to traditional methods of artificial intelligence such as search trees and expert systems.
Timothy P. Lillicrap is a Canadian neuroscientist and AI researcher, adjunct professor at University College London, and staff research scientist at Google DeepMind, where he has been involved in the AlphaGo and AlphaZero projects mastering the games of Go, Chess and Shogi. His research focuses on machine learning and statistics for optimal control and decision making, as well as using these mathematical frameworks to understand how the brain learns. He has developed algorithms and approaches for exploiting deep neural networks in the context of reinforcement learning, and new recurrent memory architectures for one-shot learning.
John Michael Jumper is an American senior research scientist at DeepMind Technologies. Jumper and his colleagues created AlphaFold, an artificial intelligence (AI) model to predict protein structures from their amino acid sequence with high accuracy. Jumper has stated that the AlphaFold team plans to release 100 million protein structures. The scientific journal Nature included Jumper as one of the ten "people who mattered" in science in their annual listing of Nature's 10 in 2021.
AlphaDev is an artificial intelligence system developed by Google DeepMind to discover enhanced computer science algorithms using reinforcement learning. AlphaDev is based on AlphaZero, a system that mastered the games of chess, shogi and go by self-play. AlphaDev applies the same approach to finding faster algorithms for fundamental tasks such as sorting and hashing.