David Silver (computer scientist)

David Silver FRS
Born: 1976 (age 46–47)
Alma mater: University of Cambridge (BA); University of Alberta (PhD)
Known for: AlphaGo, AlphaZero, AlphaStar
Awards: Royal Society University Research Fellowship (2011); ACM Prize in Computing (2019)
Scientific career
Fields: Artificial intelligence, machine learning, reinforcement learning, planning, computer games [1]
Institutions: Google DeepMind; University College London; Elixir Studios
Thesis: Reinforcement learning and simulation-based search in computer Go (2009)
Website: www.davidsilver.uk

David Silver FRS (born 1976) is a principal research scientist at Google DeepMind and a professor at University College London. He has led reinforcement learning research on AlphaGo and AlphaZero, and co-led the AlphaStar project. [1] [2]

Education

He studied at Christ's College, Cambridge, [3] graduating in 1997 with the Addison-Wesley award; while at Cambridge he befriended Demis Hassabis. [4] Silver returned to academia in 2004 at the University of Alberta to study for a PhD on reinforcement learning, [5] during which he co-introduced the algorithms used in the first master-level 9×9 Go programs; he graduated in 2009. [6] [7] The MoGo program he co-authored with Sylvain Gelly was one of the strongest Go programs as of 2009. [8]

Career and research

After graduating from university, Silver co-founded the video games company Elixir Studios, where he was CTO and lead programmer, receiving several awards for technology and innovation. [4] [9]

Silver was awarded a Royal Society University Research Fellowship in 2011, and subsequently became a lecturer at University College London. [10] His lectures on Reinforcement Learning are available on YouTube. [11] Silver consulted for Google DeepMind from its inception, joining full-time in 2013.

His recent work has focused on combining reinforcement learning with deep learning, including a program that learns to play Atari games directly from pixels. [12] Silver led the AlphaGo project, which produced the first program to defeat a top professional player in the full-size game of Go. [13] AlphaGo subsequently received an honorary 9-dan professional certification and won the Cannes Lions Innovation Grand Prix. [14] He then led the development of AlphaZero, which used a single algorithm to learn Go from scratch (learning only by playing against itself, not from human games) before learning chess and shogi in the same way, reaching a higher level than any other computer program.

Silver is among the most-published members of staff at Google DeepMind, with over 170,000 citations and an h-index of 86 according to Google Scholar. [1]

Awards and honours

Silver was awarded the 2019 ACM Prize in Computing for breakthrough advances in computer game-playing. [15]

In 2021, Silver was elected Fellow of the Royal Society (FRS) for his contributions to Deep Q-Networks and AlphaGo. [16]

Related Research Articles

Computer Go: Field of artificial intelligence around Go computer programs

Computer Go is the field of artificial intelligence (AI) dedicated to creating a computer program that plays the traditional board game Go. The field is sharply divided into two eras. Before 2015, programs were weak: the best efforts of the 1980s and 1990s produced only AIs that could be defeated by beginners, and AIs of the early 2000s were intermediate level at best. Professionals could defeat these programs even given handicaps of 10+ stones in favor of the AI. Many algorithms, such as alpha-beta minimax, that performed well as AIs for checkers and chess fell apart on Go's 19×19 board, as there were too many branching possibilities to consider. Creating a program of human professional quality with the techniques and hardware of the time was out of reach. Some AI researchers speculated that the problem was unsolvable without the creation of human-like AI.

Demis Hassabis: British entrepreneur and artificial intelligence researcher (born 1976)

Demis Hassabis is a British computer scientist, artificial intelligence researcher and entrepreneur. In his early career he was a video game AI programmer and designer, and an expert board games player. He is the chief executive officer and co-founder of DeepMind and Isomorphic Labs, and a UK Government AI Advisor. He is a Fellow of the Royal Society, and has won many prestigious awards for his work on AlphaFold including the Breakthrough Prize, the Canada Gairdner International Award, and the Lasker Award. In 2017 he was appointed a CBE and listed in the Time 100 most influential people list.

Dimitri Bertsekas

Dimitri Panteli Bertsekas is an applied mathematician, electrical engineer, and computer scientist. He is a McAfee Professor in the Department of Electrical Engineering and Computer Science at the Massachusetts Institute of Technology (MIT), Cambridge, Massachusetts, and also a Fulton Professor of Computational Decision Making at Arizona State University, Tempe.

In computer science, Monte Carlo tree search (MCTS) is a heuristic search algorithm for some kinds of decision processes, most notably those employed in software that plays board games. In that context MCTS is used to solve the game tree.
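The four MCTS phases (selection, expansion, simulation, backpropagation) can be sketched on a toy game. The Nim-style rules, UCB1 exploration constant, and iteration count below are illustrative assumptions, not taken from any particular engine:

```python
# Minimal UCT-style Monte Carlo tree search on a toy Nim game:
# players alternately remove 1 or 2 stones; whoever takes the last stone wins.
import math
import random

class Node:
    def __init__(self, stones, player, parent=None, move=None):
        self.stones = stones      # stones left in the pile
        self.player = player      # player to move (1 or 2)
        self.parent = parent
        self.move = move          # the move that led into this node
        self.children = []
        self.visits = 0
        self.wins = 0.0           # wins for the player who moved into this node

    def untried_moves(self):
        tried = {c.move for c in self.children}
        return [m for m in (1, 2) if m <= self.stones and m not in tried]

def uct_select(node, c=1.4):
    # Pick the child maximizing the UCB1 score.
    return max(node.children, key=lambda ch:
               ch.wins / ch.visits + c * math.sqrt(math.log(node.visits) / ch.visits))

def rollout(stones, player):
    # Play uniformly random moves to the end; return the winner.
    while True:
        take = random.choice([m for m in (1, 2) if m <= stones])
        stones -= take
        if stones == 0:
            return player         # this player took the last stone and wins
        player = 3 - player

def mcts(root_stones, root_player, iterations=2000):
    root = Node(root_stones, root_player)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend while fully expanded and non-terminal.
        while node.stones > 0 and not node.untried_moves():
            node = uct_select(node)
        # 2. Expansion: add one untried child if the node is non-terminal.
        if node.stones > 0:
            move = random.choice(node.untried_moves())
            child = Node(node.stones - move, 3 - node.player, node, move)
            node.children.append(child)
            node = child
        # 3. Simulation: random playout from the new state.
        if node.stones == 0:
            winner = node.parent.player if node.parent else node.player
        else:
            winner = rollout(node.stones, node.player)
        # 4. Backpropagation: update statistics up to the root.
        while node is not None:
            node.visits += 1
            if node.parent is not None and winner == node.parent.player:
                node.wins += 1
            node = node.parent
    # Recommend the most-visited move, the usual final-selection rule.
    return max(root.children, key=lambda ch: ch.visits).move
```

Programs such as AlphaGo keep this same skeleton but replace the random rollout with a learned value network and bias selection with a learned policy.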

Google DeepMind: Artificial intelligence division

DeepMind Technologies Limited, doing business as Google DeepMind, is a British-American artificial intelligence research laboratory which serves as a subsidiary of Google. Founded in the UK in 2010, it was acquired by Google in 2014. The company is based in London, with research centres in Canada, France, Germany and the United States.

Pushmeet Kohli is a computer scientist at Google DeepMind, where he holds the position of Vice President of Research for the "Secure and Reliable AI" and "AI for Science and Sustainability" areas. Before joining DeepMind, he was a partner scientist and director of research at Microsoft Research and a post-doctoral fellow at the University of Cambridge. Kohli's research investigates applications of machine learning and computer vision. He has also made contributions in game theory, discrete algorithms and psychometrics.

AlphaGo is a computer program that plays the board game Go. It was developed by the London-based DeepMind Technologies, an acquired subsidiary of Google. Subsequent versions of AlphaGo became increasingly powerful, including a version that competed under the name Master. After retiring from competitive play, AlphaGo Master was succeeded by an even more powerful version known as AlphaGo Zero, which was completely self-taught without learning from human games. AlphaGo Zero was then generalized into a program known as AlphaZero, which played additional games, including chess and shogi. AlphaZero has in turn been succeeded by a program known as MuZero which learns without being taught the rules.

AlphaGo versus Lee Sedol, also known as the DeepMind Challenge Match, was a five-game Go match between top Go player Lee Sedol and AlphaGo, a computer Go program developed by DeepMind, played in Seoul, South Korea between 9 and 15 March 2016. AlphaGo won all but the fourth game; all games were won by resignation. The match has been compared with the historic chess match between Deep Blue and Garry Kasparov in 1997.

Ilya Sutskever: Computer scientist (born 1985/86)

Ilya Sutskever is a computer scientist working in machine learning. He is a co-founder and Chief Scientist at OpenAI.

Differentiable neural computer: Artificial neural network architecture

In artificial intelligence, a differentiable neural computer (DNC) is a memory-augmented neural network (MANN) architecture, typically recurrent in its implementation. The model was published in 2016 by Alex Graves et al. of DeepMind.

AlphaGo Zero is a version of DeepMind's Go software AlphaGo. AlphaGo's team published an article in the journal Nature on 19 October 2017, introducing AlphaGo Zero, a version created without using data from human games, and stronger than any previous version. By playing games against itself, AlphaGo Zero surpassed the strength of AlphaGo Lee in three days by winning 100 games to 0, reached the level of AlphaGo Master in 21 days, and exceeded all the old versions in 40 days.

AlphaGo versus Fan Hui was a five-game Go match between European champion Fan Hui, a 2-dan professional, and AlphaGo, a computer Go program developed by DeepMind, held at DeepMind's headquarters in London in October 2015. AlphaGo won all five games. This was the first time a computer Go program had beaten a professional human player on a full-sized board without handicap. This match was not disclosed to the public until 27 January 2016 to coincide with the publication of a paper in the journal Nature describing the algorithms AlphaGo used.

AlphaZero: Game-playing artificial intelligence

AlphaZero is a computer program developed by artificial intelligence research company DeepMind to master the games of chess, shogi and Go. The algorithm uses an approach similar to AlphaGo Zero.

AlphaStar is a computer program by DeepMind that plays the video game StarCraft II. It was unveiled to the public by name in January 2019. In a significant milestone for artificial intelligence, AlphaStar attained Grandmaster status in August 2019.

Deep reinforcement learning is a subfield of machine learning that combines reinforcement learning (RL) and deep learning. RL considers the problem of a computational agent learning to make decisions by trial and error. Deep RL incorporates deep learning into the solution, allowing agents to make decisions from unstructured input data without manual engineering of the state space. Deep RL algorithms are able to take in very large inputs and decide what actions to perform to optimize an objective. Deep reinforcement learning has been used for a diverse set of applications including but not limited to robotics, video games, natural language processing, computer vision, education, transportation, finance and healthcare.
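The trial-and-error loop described above can be sketched in a few dozen lines. The toy five-state chain environment, network size, and hyperparameters below are illustrative assumptions; this is a minimal Q-learning sketch in plain NumPy, without the replay buffer and target network that systems like DQN add:

```python
# A compact deep Q-learning sketch: a one-hidden-layer network approximates
# Q(s, a) for a 5-state chain. Moving right from state 3 into state 4 pays
# reward 1 and ends the episode; everything else pays 0.
import numpy as np

rng = np.random.default_rng(0)
N_STATES, N_ACTIONS, HIDDEN = 5, 2, 16    # actions: 0 = left, 1 = right

# Network parameters: one tanh hidden layer, linear output head.
W1 = rng.normal(0, 0.5, (HIDDEN, N_STATES))
b1 = np.zeros(HIDDEN)
W2 = rng.normal(0, 0.5, (N_ACTIONS, HIDDEN))
b2 = np.zeros(N_ACTIONS)

def q_values(state):
    x = np.eye(N_STATES)[state]           # one-hot state encoding
    h = np.tanh(W1 @ x + b1)
    return W2 @ h + b2, (x, h)            # Q-values plus cached activations

def step(state, action):
    nxt = min(max(state + (1 if action == 1 else -1), 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done

def train(episodes=500, gamma=0.9, lr=0.05, eps=0.2):
    global W1, b1, W2, b2
    for _ in range(episodes):
        s, done, steps = 0, False, 0
        while not done and steps < 50:
            q, (x, h) = q_values(s)
            # Epsilon-greedy action selection (trial and error).
            a = int(rng.integers(N_ACTIONS)) if rng.random() < eps else int(np.argmax(q))
            s2, r, done = step(s, a)
            # Bootstrapped TD target: r + gamma * max_a' Q(s', a').
            target = r if done else r + gamma * np.max(q_values(s2)[0])
            # Gradient of 0.5 * (Q(s,a) - target)^2 w.r.t. the parameters.
            delta = q[a] - target
            grad_h = delta * W2[a] * (1 - h ** 2)   # backprop through tanh
            W2[a] -= lr * delta * h
            b2[a] -= lr * delta
            W1 -= lr * np.outer(grad_h, x)
            b1 -= lr * grad_h
            s, steps = s2, steps + 1

train()
# After training, the greedy policy should prefer moving right along the chain.
policy = [int(np.argmax(q_values(s)[0])) for s in range(N_STATES - 1)]
```

With one-hot inputs this tiny problem is nearly tabular; the deep-RL point is that the same update rule works unchanged when the one-hot encoding is replaced by raw, unstructured observations such as pixels.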

Artificial intelligence and machine learning techniques are used in video games for a wide variety of applications such as non-player character (NPC) control and procedural content generation (PCG). Machine learning is a subset of artificial intelligence that uses historical data to build predictive and analytical models. This is in sharp contrast to traditional methods of artificial intelligence such as search trees and expert systems.

Timothy P. Lillicrap is a Canadian neuroscientist and AI researcher, adjunct professor at University College London, and staff research scientist at Google DeepMind, where he has been involved in the AlphaGo and AlphaZero projects mastering the games of Go, Chess and Shogi. His research focuses on machine learning and statistics for optimal control and decision making, as well as using these mathematical frameworks to understand how the brain learns. He has developed algorithms and approaches for exploiting deep neural networks in the context of reinforcement learning, and new recurrent memory architectures for one-shot learning.

MuZero: Game-playing artificial intelligence

MuZero is a computer program developed by artificial intelligence research company DeepMind to master games without knowing their rules. Its release in 2019 included benchmarks of its performance in Go, chess, shogi, and a standard suite of Atari games. The algorithm uses an approach similar to AlphaZero's. It matched AlphaZero's performance in chess and shogi, improved on its performance in Go, and improved on the state of the art in mastering a suite of 57 Atari games, a visually complex domain.

John Michael Jumper is an American senior research scientist at DeepMind Technologies. Jumper and his colleagues created AlphaFold, an artificial intelligence (AI) model to predict protein structures from their amino acid sequence with high accuracy. Jumper has stated that the AlphaFold team plans to release 100 million protein structures. The scientific journal Nature included Jumper as one of the ten "people who mattered" in science in their annual listing of Nature's 10 in 2021.

AlphaDev is an artificial intelligence system developed by Google DeepMind to discover enhanced computer science algorithms using reinforcement learning. AlphaDev is based on AlphaZero, a system that mastered the games of chess, shogi and go by self-play. AlphaDev applies the same approach to finding faster algorithms for fundamental tasks such as sorting and hashing.

References

  1. David Silver publications indexed by Google Scholar
  2. Oriol Vinyals; Igor Babuschkin; Wojciech M Czarnecki; et al. (30 October 2019). "Grandmaster level in StarCraft II using multi-agent reinforcement learning". Nature . 575 (7782): 350–354. doi:10.1038/S41586-019-1724-Z. ISSN   1476-4687. PMID   31666705. Wikidata   Q72988805.
  3. The Cambridge University List of Members up to 31 July 1998
  4. Shead, Sam. "David Silver: The unsung hero and intellectual powerhouse at Google DeepMind". businessinsider.com. Retrieved 26 September 2020.
  5. David Silver at the Mathematics Genealogy Project
  6. Silver, David (2009). Reinforcement Learning and Simulation-Based Search in Computer Go. ualberta.ca (PhD thesis). University of Alberta. doi:10.7939/R39D8T. OCLC   575410609.
  7. Sylvain Gelly; David Silver (2008). "Achieving Master Level Play in 9 × 9 Computer Go" (PDF). Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence.
  8. Stuart J. Russell; Peter Norvig (2009). Artificial Intelligence: A Modern Approach (3rd ed.). Prentice Hall.
  9. "What the AI Behind AlphaGo Can Teach Us About Being Human". Wired.com. Retrieved 17 May 2016.
  10. "CSML | David Silver". ucl.ac.uk. Archived from the original on 24 April 2021. Retrieved 27 May 2017.
  11. "RL Course by David Silver - Lecture 1: Introduction to Reinforcement Learning". 13 May 2015 via YouTube.
  12. Volodymyr Mnih; Koray Kavukcuoglu; David Silver; et al. (25 February 2015). "Human-level control through deep reinforcement learning". Nature . 518 (7540): 529–533. doi:10.1038/NATURE14236. ISSN   1476-4687. PMID   25719670. Wikidata   Q27907579.
  13. David Silver; Aja Huang; Chris J. Maddison; et al. (27 January 2016). "Mastering the game of Go with deep neural networks and tree search". Nature . 529 (7587): 484–489. doi:10.1038/NATURE16961. ISSN   1476-4687. PMID   26819042. Wikidata   Q28005460.
  14. "Google DeepMind AlphaGo in U.K. Wins Innovation Grand Prix" . Retrieved 27 May 2017.
  15. Ormond, Jim. "ACM Prize in Computing Awarded to AlphaGo Developer: David Silver Recognized for Breakthrough Advances in Computer Game-Playing". acm.org . Retrieved 2 April 2020.
  16. "Royal Society elects outstanding new Fellows and Foreign Members". royalsociety.org . Retrieved 8 June 2021.