Timothy P. Lillicrap | |
---|---|
Nationality | Canadian |
Alma mater | Queen's University |
Known for | Optimal control, decision making, deep learning, AlphaGo, AlphaZero |
Awards | CfNS Award for Excellence; Governor General's Academic Medal |
Scientific career | |
Fields | Artificial intelligence; theoretical neuroscience |
Institutions | Google DeepMind; University College London |
Doctoral advisor | Stephen H. Scott |
Timothy P. Lillicrap is a Canadian neuroscientist and AI researcher, adjunct professor at University College London, and staff research scientist at Google DeepMind, where he has worked on the AlphaGo and AlphaZero projects that mastered the games of Go, chess, and shogi. His research focuses on machine learning and statistics for optimal control and decision making, as well as on using these mathematical frameworks to understand how the brain learns. He has developed algorithms and approaches for exploiting deep neural networks in the context of reinforcement learning, as well as new recurrent memory architectures for one-shot learning. [1]
His contributions to the field have earned him a number of honors, including the Governor General's Academic Medal, an NSERC Fellowship, the Centre for Neuroscience Studies Award for Excellence, and several European Research Council grants. He has also won a number of social learning tournaments.
Lillicrap earned a B.Sc. in cognitive science and artificial intelligence from the University of Toronto in 2005, and a Ph.D. in systems neuroscience from Queen's University in 2012 under Stephen H. Scott. He then became a postdoctoral research fellow at Oxford University, and joined Google DeepMind as a research scientist in 2014. Following a series of promotions, he became a DeepMind staff research scientist in 2016, a position he still holds as of 2021. [2] [3]
In 2016, Lillicrap accepted an adjunct professorship at University College London. [4]
Timothy Lillicrap has an extensive publication record spanning theoretical neuroscience and machine learning.
An evaluation function, also known as a heuristic evaluation function or static evaluation function, is a function used by game-playing computer programs to estimate the value or goodness of a position in a game tree. Most of the time, the value is either a real number or a quantized integer, often expressed in fractions of the value of a playing piece such as a stone in Go or a pawn in chess, with tenths, hundredths, or another convenient fraction as the unit; sometimes, however, the value is an array of three values in the unit interval, representing the win, draw, and loss percentages of the position.
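As a concrete illustration, here is a minimal static evaluation function in Python based on material counting in centipawns (hundredths of a pawn). The piece values and board encoding are conventional illustrations invented for this sketch, not taken from any particular engine.

```python
# Conventional chess material values, in centipawns (pawn = 100).
PIECE_VALUES = {"P": 100, "N": 320, "B": 330, "R": 500, "Q": 900}

def evaluate(board):
    """Score a position from White's perspective, in centipawns.

    `board` is a string of piece letters: uppercase for White,
    lowercase for Black; the king and anything else are ignored.
    """
    score = 0
    for square in board:
        if square.upper() in PIECE_VALUES:
            value = PIECE_VALUES[square.upper()]
            score += value if square.isupper() else -value
    return score

# White has an extra knight: the evaluation is +320 centipawns.
print(evaluate("RNBQKBNR" + "PPPPPPPP" + "pppppppp" + "rbqkbnr"))
```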
In the game of Go, a ladder (征子) is a basic sequence of moves in which an attacker pursues a group in atari in a zig-zag pattern across the board. If there are no intervening stones, the group will hit the edge of the board and be captured.
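The outcome can be sketched with a deliberately simplified model: the chased group flees along a diagonal, and escapes only if a friendly "ladder breaker" stone lies on that path before the edge. Real ladder reading tracks liberties move by move; the diagonal-path model below is an illustrative assumption.

```python
# Simplified ladder check: does the diagonal escape path reach a
# friendly breaker stone before running off the board?
BOARD_SIZE = 19

def ladder_is_captured(start, direction, breakers):
    """start: (row, col) of the chased stone; direction: (dr, dc)
    diagonal step; breakers: set of stones friendly to the runner."""
    r, c = start
    dr, dc = direction
    while 0 <= r < BOARD_SIZE and 0 <= c < BOARD_SIZE:
        if (r, c) in breakers:
            return False  # a breaker stone lets the group escape
        r, c = r + dr, c + dc
    return True  # the path hit the edge: the group is captured

# Without breakers the runner is caught; a stone on the diagonal saves it.
print(ladder_is_captured((3, 3), (1, 1), set()))       # True
print(ladder_is_captured((3, 3), (1, 1), {(10, 10)}))  # False
```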
The KGS Go Server, known until 2006 as the Kiseido Go Server, is a game server first developed in 1999 and established in 2000 for people to play Go. The system was developed by William M. Shubert and its code is now written entirely in Java. In the spring of 2017, Shubert transferred ownership to the American Go Foundation.
Dimitri Panteli Bertsekas is an applied mathematician, electrical engineer, and computer scientist, a McAfee Professor in the Department of Electrical Engineering and Computer Science in the School of Engineering at the Massachusetts Institute of Technology (MIT), Cambridge, Massachusetts, and also a Fulton Professor of Computational Decision Making at Arizona State University, Tempe.
DeepMind Technologies is a British artificial intelligence research laboratory founded in 2010. It was acquired by Google in 2014 and became a wholly owned subsidiary of Alphabet Inc. after Google's restructuring in 2015. The company is based in London, with research centres in Canada, France, and the United States.
AlphaGo is a computer program that plays the board game Go. It was developed by London-based DeepMind Technologies, a subsidiary of Google. Subsequent versions of AlphaGo became increasingly powerful, including a version that competed under the name Master. After retiring from competitive play, AlphaGo Master was succeeded by an even more powerful version known as AlphaGo Zero, which was completely self-taught without learning from human games. AlphaGo Zero was then generalized into a program known as AlphaZero, which played additional games, including chess and shogi. AlphaZero has in turn been succeeded by a program known as MuZero, which learns without being taught the rules.
Fan Hui is a Chinese-born French Go player. He became a professional Go player in 1996, moved to France in 2000, and became the coach of the French national Go team in 2005. He was the winner of the European Go Championship in 2013, 2014 and 2015. As of 2015, he is ranked as a 2 dan professional. He additionally won the 2016 European Professional Go Championship.
Darkforest is a computer Go program developed by Meta Platforms, based on deep learning techniques using a convolutional neural network. Its updated version Darkfores2 combines the techniques of its predecessor with Monte Carlo tree search (MCTS). MCTS replaces the exhaustive tree search commonly seen in computer chess programs with statistics gathered from randomized playouts. With the update, the system is known as Darkfmcts3.
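To make the idea concrete, here is a minimal UCT-style Monte Carlo tree search in Python, illustrated on the simple game of Nim rather than on Go or on Darkforest's own machinery; the game, exploration constant, and iteration count are assumptions chosen for the sketch.

```python
import math, random

random.seed(1)

# Nim: take 1-3 sticks from a pile; taking the last stick wins.
def moves(sticks):
    return [t for t in (1, 2, 3) if t <= sticks]

class Node:
    def __init__(self, sticks, parent=None, move=None):
        self.sticks, self.parent, self.move = sticks, parent, move
        self.children, self.untried = [], moves(sticks)
        self.wins, self.visits = 0.0, 0

    def ucb_child(self, c=1.4):
        # Upper-confidence bound balances exploitation and exploration.
        return max(self.children, key=lambda n: n.wins / n.visits
                   + c * math.sqrt(math.log(self.visits) / n.visits))

def rollout(sticks):
    # Random playout: +1 if the player to move from `sticks` wins.
    sign = 1
    while sticks > 0:
        sticks -= random.choice(moves(sticks))
        sign = -sign
    return -sign  # the player who took the last stick just moved

def mcts(root_sticks, iters=2000):
    root = Node(root_sticks)
    for _ in range(iters):
        node = root
        # 1. Selection: descend fully expanded nodes by UCB.
        while not node.untried and node.children:
            node = node.ucb_child()
        # 2. Expansion: add one untried move as a new child.
        if node.untried:
            t = node.untried.pop()
            child = Node(node.sticks - t, parent=node, move=t)
            node.children.append(child)
            node = child
        # 3. Simulation: a random playout instead of exhaustive search.
        result = rollout(node.sticks)
        # 4. Backpropagation: flip perspective at each level.
        while node is not None:
            node.visits += 1
            node.wins += (1 - result) / 2  # wins for the player who moved into node
            result = -result
            node = node.parent
    return max(root.children, key=lambda n: n.visits).move

print("Suggested first move from 21 sticks:", mcts(21))  # optimal play takes 1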
Tygem is an internet Go server owned by the South Korean company TongYang Online. Popular in Asia, its website states that over 500 professional Go players use the service.
David Silver is a British computer scientist and businessman who leads the reinforcement learning research group at DeepMind and was lead researcher on AlphaGo, AlphaZero and co-lead on AlphaStar.
Aja Huang is a Taiwanese computer scientist and expert on artificial intelligence. He works for DeepMind and was a member of the AlphaGo project.
AlphaGo versus Ke Jie was a three-game Go match between the computer Go program AlphaGo Master and Ke Jie, then the world No. 1 ranked player, played as part of the Future of Go Summit in Wuzhen, China, on 23, 25, and 27 May 2017. AlphaGo defeated Ke Jie in all three games.
AlphaGo Zero is a version of DeepMind's Go software AlphaGo. AlphaGo's team published an article in the journal Nature on 19 October 2017, introducing AlphaGo Zero, a version created without using data from human games, and stronger than any previous version. By playing games against itself, AlphaGo Zero surpassed the strength of AlphaGo Lee in three days by winning 100 games to 0, reached the level of AlphaGo Master in 21 days, and exceeded all the old versions in 40 days.
AlphaGo versus Fan Hui was a five-game Go match between European champion Fan Hui, a 2-dan professional, and AlphaGo, a computer Go program developed by DeepMind, held at DeepMind's headquarters in London in October 2015. AlphaGo won all five games. This was the first time a computer Go program had beaten a professional human player on a full-sized board without handicap. This match was not disclosed to the public until 27 January 2016 to coincide with the publication of a paper in the journal Nature describing the algorithms AlphaGo used.
AlphaZero is a computer program developed by artificial intelligence research company DeepMind to master the games of chess, shogi and go. This algorithm uses an approach similar to AlphaGo Zero.
Elmo, stylized as elmo, is a computer shogi evaluation function and book file (joseki) created by Makoto Takizawa (瀧澤誠). It is designed to be used with a third-party shogi alpha–beta search engine.
Deep reinforcement learning is a subfield of machine learning that combines reinforcement learning (RL) and deep learning. RL considers the problem of a computational agent learning to make decisions by trial and error. Deep RL incorporates deep learning into the solution, allowing agents to make decisions from unstructured input data without manual engineering of the state space. Deep RL algorithms are able to take in very large inputs and decide what actions to perform to optimize an objective. Deep reinforcement learning has been used for a diverse set of applications including but not limited to robotics, video games, natural language processing, computer vision, education, transportation, finance and healthcare.
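As a minimal sketch of the idea, the following NumPy snippet trains a tiny two-layer Q-network by trial and error on a toy corridor environment; the environment, network sizes, and hyperparameters are invented for illustration and are not drawn from any DeepMind system.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy corridor: states 0..N-1, start at 0, reward +1 on reaching the right end.
N = 6
ACTIONS = (-1, +1)  # move left, move right

def step(s, a):
    s2 = min(max(s + ACTIONS[a], 0), N - 1)
    done = s2 == N - 1
    return s2, (1.0 if done else 0.0), done

def onehot(s):
    x = np.zeros(N)
    x[s] = 1.0
    return x

# A tiny two-layer Q-network: one ReLU hidden layer, two action values out.
H = 16
W1 = rng.normal(0.0, 0.5, (H, N)); b1 = np.zeros(H)
W2 = rng.normal(0.0, 0.5, (2, H)); b2 = np.zeros(2)

def qvals(x):
    h = np.maximum(W1 @ x + b1, 0.0)
    return W2 @ h + b2, h

gamma, lr, eps = 0.9, 0.05, 0.2
for episode in range(500):
    s, done = 0, False
    while not done:
        x = onehot(s)
        q, h = qvals(x)
        # Epsilon-greedy action selection: explore, else pick the best Q.
        a = int(rng.integers(2)) if rng.random() < eps else int(np.argmax(q))
        s2, r, done = step(s, a)
        target = r if done else r + gamma * np.max(qvals(onehot(s2))[0])
        td = target - q[a]  # temporal-difference error
        # Semi-gradient update of the chosen action's Q-value.
        dh = td * W2[a] * (h > 0)
        W2[a] += lr * td * h
        b2[a] += lr * td
        W1 += lr * np.outer(dh, x)
        b1 += lr * dh
        s = s2

# The greedy policy should choose "right" (action 1) in every state.
print([int(np.argmax(qvals(onehot(s))[0])) for s in range(N)])
```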
In video games, various artificial intelligence techniques have been used in a variety of ways, ranging from non-player character (NPC) control to procedural content generation (PCG). Machine learning is a subset of artificial intelligence that uses algorithms and statistical models to let machines improve at a task from data rather than through explicitly programmed behavior. This is in sharp contrast to traditional artificial intelligence methods such as search trees and expert systems.
MuZero is a computer program developed by artificial intelligence research company DeepMind to master games without knowing their rules. Its release in 2019 included benchmarks of its performance in Go, chess, shogi, and a standard suite of Atari games. The algorithm uses an approach similar to AlphaZero. It matched AlphaZero's performance in chess and shogi, improved on its performance in Go, and improved on the state of the art in mastering a suite of 57 Atari games, a visually complex domain.
Self-play is a technique for improving the performance of reinforcement learning agents. Intuitively, agents learn to improve their performance by playing "against themselves".
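A minimal sketch of the idea in Python: a single Q-table improves by playing both sides of the game of Nim against itself. The game, learning rate, and exploration rate are illustrative assumptions for the sketch, not a description of any particular self-play system.

```python
import random

random.seed(0)

# Nim: take 1-3 sticks from a pile of 21; taking the last stick wins.
def legal(sticks):
    return [t for t in (1, 2, 3) if t <= sticks]

Q = {}  # (sticks, take) -> estimated return for the player to move
ALPHA, EPS = 0.1, 0.2

def choose(sticks):
    if random.random() < EPS:  # occasional random exploration
        return random.choice(legal(sticks))
    return max(legal(sticks), key=lambda t: Q.get((sticks, t), 0.0))

for episode in range(20000):
    sticks, history = 21, []
    while sticks > 0:           # the same policy plays both sides
        t = choose(sticks)
        history.append((sticks, t))
        sticks -= t
    # Monte Carlo update: the last mover won (+1). Alternate the sign
    # going backward, since consecutive plies belong to opposing players.
    ret = 1.0
    for s, t in reversed(history):
        old = Q.get((s, t), 0.0)
        Q[(s, t)] = old + ALPHA * (ret - old)
        ret = -ret

# After training, the greedy policy should leave a multiple of 4 sticks.
print([max(legal(s), key=lambda t: Q.get((s, t), 0.0)) for s in range(1, 9)])
```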