Darkforest

Last updated

Darkforest is a computer go program developed by Meta Platforms, based on deep learning techniques using a convolutional neural network. Its updated version Darkfores2 combines the techniques of its predecessor with Monte Carlo tree search. [1] [2] The MCTS effectively takes tree search methods commonly seen in computer chess programs and randomizes them. [3] With the update, the system is known as Darkfmcts3. [4]

Contents

Darkforest is of similar strength to programs like CrazyStone and Zen. [5] It has been tested against a professional human player at the 2016 UEC cup. Google's AlphaGo program won against a professional player in October 2015 using a similar combination of techniques. [6]

Darkforest is named after Liu Cixin's science fiction novel The Dark Forest . [7]

Background

Competing with top human players in the ancient game of Go has been a long-term goal of artificial intelligence. Go’s high branching factor makes traditional search techniques ineffective, even on cutting-edge hardware, and Go’s evaluation function could change drastically with one stone change. However, by using a Deep Convolutional Neural Network designed for long-term predictions, Darkforest has been able to substantially improve the win rate for bots over more traditional Monte Carlo Tree Search based approaches.

Matches

Against human players, Darkfores2 achieves a stable 3d ranking on KGS Go Server, which roughly corresponds to an advanced amateur human player. However, after adding Monte Carlo Tree Search to Darkfores2 to create a much stronger player named darkfmcts3, it can achieve a 5d ranking on the KGS Go Server.

Against other AI

darkfmcts3 is on par with state-of-the-art Go AIs such as Zen, DolBaram and Crazy Stone but lags behind AlphaGo. [8] It won 3rd place in January 2016 KGS Bot Tournament against other Go AIs.

News coverage

After Google's AlphaGo won against Fan Hui in 2015, Facebook made its AI's hardware designs public, alongside releasing the code behind DarkForest as open-source, along with heavy recruiting to strengthen its team of AI engineers. [3]

Style of play

Darkforest uses a neural network to sort through the 10100 board positions, and find the most powerful next move. [9] However, neural networks alone cannot match the level of good amateur players or the best search-based Go engines, and so Darkfores2 combines the neural network approach with a search-based machine. A database of 250,000 real Go games were used in the development of Darkforest, with 220,000 used as a training set and the rest used to test the neural network's ability to predict the next moves played in the real games. This allows Darkforest to accurately evaluate the global state of the board, but local tactics were still poor. Search-based engines have poor global evaluation, but are good at local tactics. Combining these two approaches is difficult because search-based engines work much faster than neural networks, a problem which was solved in Darkfores2 by running the processes in parallel with frequent communication between the two. [9]

Conventional strategies

Go is generally played by analyzing the position of the stones on the board. Some advanced players have described it as playing in some part subconsciously. Unlike chess and checkers, where AI players can simply look farther forward at moves than human players, but with each round of Go having on average 250 possible moves, that approach is ineffective. Instead, neural networks copy human play by training the AI systems on images of successful moves, the AI can effectively learn how to interpret how the board looks, as many grandmasters do. [10] In November 2015, Facebook demonstrated the combination of MCTS with neural networks, which played with a style that "felt human". [10]

Flaws

It has been noted that Darkforest still has flaws in its play style. Sometimes the bot plays tenuki ("move elsewhere") pointlessly when local powerful moves are required. When the bot is losing, it shows the typical behavior of MCTS, it plays bad moves and loses more. The Facebook AI team has acknowledged these as areas of future improvement. [11]

Program architecture

The family of Darkforest computer go programs is based on convolution neural networks. [3] The most recent advances in Darkfmcts3 combined convolutional neural networks with more traditional Monte Carlo tree search. [3] Darkfmcts3 is the most advanced version of Darkforest, which combines Facebook's most advanced convolutional neural network architecture from Darkfores2 with a Monte Carlo tree search.

Darkfmcts3 relies on a convolution neural networks that predicts the next k moves based on the current state of play. It treats the board as a 19x19 image with multiple channels. Each channel represents a different aspect of board information based upon the specific style of play. For standard and extended play, there are 21 and 25 different channels, respectively. In standard play, each players liberties are represented as six binary channels or planes. The respective plane is true if the player one, two, or three or more liberties available. Ko (i.e. illegal moves) is represented as one binary plane. Stone placement for each opponent and empty board positions are represented as three binary planes, and the duration since a stone has been placed is represented as real numbers on two planes, one for each player. Lastly, the opponents rank is represented by nine binary planes, where if all are true, the player is a 9d level, if 8 are true, a 8d level, and so forth. Extended play additionally considers the boarder (binary plane that is true at the border), position mask (represented as distance from the board center, i.e. , where is a real number at a position), and each player's territory (binary, based on which player a location is closer to).

Darkfmct3 uses a 12-layer full convolutional network with a width of 384 nodes without weight sharing or pooling. Each convolutional layer is followed by a rectified linear unit, a popular activation function for deep neural networks. [12] A key innovation of Darkfmct3 compared to previous approaches is that it uses only one softmax function to predict the next move, which enables the approach to reduce the overall number of parameters. [3] Darkfmct3 was trained against 300 random selected games from an empirical dataset representing different game stages. The learning rate was determined by vanilla stochastic gradient descent.

Darkfmct3 synchronously couples a convolutional neural network with a Monte Carlo tree search. Because the convolutional neural network is computationally taxing, the Monte Carlo tree search focuses computation on the more likely game play trajectories. By running the neural network synchronously with the Monte Carlo tree search, it is possible to guarantee that each node is expanded by the moves predicted by the neural network.

Comparison with other systems

Darkfores2 beats Darkforest, its neural network-only predecessor, around 90% of the time, and Pachi, one of the best search-based engines, around 95% of the time. [9] On the Kyu rating system, Darkforest holds a 1-2d level. Darkfores2 achieves a stable 3d level on KGS Go Server as a ranked bot. [1] With the added Monte Carlo tree search, Darkfmcts3 with 5,000 rollouts beats Pachi with 10k rollouts in all 250 games; with 75k rollouts it achieves a stable 5d level in KGS server, on par with state-of-the-art Go AIs (e.g., Zen, DolBaram, CrazyStone); with 110k rollouts, it won the 3rd place in January KGS Go Tournament. [4]

See also

Related Research Articles

<span class="mw-page-title-main">Computer chess</span> Computer hardware and software capable of playing chess

Computer chess includes both hardware and software capable of playing chess. Computer chess provides opportunities for players to practice even in the absence of human opponents, and also provides opportunities for analysis, entertainment and training. Computer chess applications that play at the level of a chess grandmaster or higher are available on hardware from supercomputers to smart phones. Standalone chess-playing machines are also available. Stockfish, Leela Chess Zero, GNU Chess, Fruit, and other free open source applications are available for various platforms.

<span class="mw-page-title-main">Evaluation function</span> Function in a computer game-playing program that evaluates a game position

An evaluation function, also known as a heuristic evaluation function or static evaluation function, is a function used by game-playing computer programs to estimate the value or goodness of a position in a game tree. Most of the time, the value is either a real number or a quantized integer, often in nths of the value of a playing piece such as a stone in go or a pawn in chess, where n may be tenths, hundredths or other convenient fraction, but sometimes, the value is an array of three values in the unit interval, representing the win, draw, and loss percentages of the position.

<span class="mw-page-title-main">Computer Go</span> Field of artificial intelligence around Go computer programs

Computer Go is the field of artificial intelligence (AI) dedicated to creating a computer program that plays the traditional board game Go. The field is sharply divided into two eras. Before 2015, the programs of the era were weak. The best efforts of the 1980s and 1990s produced only AIs that could be defeated by beginners, and AIs of the early 2000s were intermediate level at best. Professionals could defeat these programs even given handicaps of 10+ stones in favor of the AI. Many of the algorithms such as alpha-beta minimax that performed well as AIs for checkers and chess fell apart on Go's 19x19 board, as there were too many branching possibilities to consider. Creation of a human professional quality program with the techniques and hardware of the time was out of reach. Some AI researchers speculated that the problem was unsolvable without creation of human-like AI.

<span class="mw-page-title-main">Havannah (board game)</span> Board game

Havannah is a two-player abstract strategy board game invented by Christian Freeling. It belongs to the family of games commonly called connection games; its relatives include Hex and TwixT. Havannah has "a sophisticated and varied strategy" and is best played on a base-10 hexagonal board, 10 hex cells to a side.

Maven is an artificial intelligence Scrabble player, created by Brian Sheppard in 1986. It is no longer commercially available, the rights having been bought by Hasbro in 1996. It has been used in official licensed Hasbro Scrabble games since then.

A computer poker player is a computer program designed to play the game of poker, against human opponents or other computer opponents. It is commonly referred to as pokerbot or just simply bot. As of 2019, computers can beat any human player in poker.

General game playing (GGP) is the design of artificial intelligence programs to be able to play more than one game successfully. For many games like chess, computers are programmed to play these games using a specially designed algorithm, which cannot be transferred to another context. For instance, a chess-playing computer program cannot play checkers. General game playing is considered as a necessary milestone on the way to artificial general intelligence.

The KGS Go Server, known until 2006 as the Kiseido Go Server, is a game server first developed in 1999 and established in 2000 for people to play Go. The system was developed by William M. Shubert and its code is now written entirely in Java. In Spring of 2017, Shubert transferred ownership to the American Go Foundation.

<span class="mw-page-title-main">Anti-computer tactics</span> Human methods against game-playing computers

Anti-computer tactics are methods used by humans to try to beat computer opponents at various games, most typically board games such as chess and Arimaa. They are most associated with competitions against computer AIs that are playing to their utmost to win, rather than AIs merely programmed to be an interesting challenge that can be given intentional weaknesses and quirks by the programmer. Such tactics are most associated with the era when AIs searched a game tree with an evaluation function looking for promising moves, often with Alpha–beta pruning or other minimax algorithms used to narrow the search. Against such algorithms, a common tactic is to play conservatively aiming for a long-term advantage. The theory is that this advantage will manifest slowly enough that the computer is unable to notice in its search, and the computer won't play around the threat correctly. This may result in, for example, a subtle advantage that eventually turns into a winning chess endgame with a passed pawn.

Computer Arimaa refers to the playing of the board game Arimaa by computer programs.

In computer science, Monte Carlo tree search (MCTS) is a heuristic search algorithm for some kinds of decision processes, most notably those employed in software that plays board games. In that context MCTS is used to solve the game tree.

DeepMind Technologies Limited, also known by its trade name Google DeepMind, is a British-American artificial intelligence research laboratory which serves as a subsidiary of Google. Founded in the UK in 2010, it was acquired by Google in 2014 and merged with Google AI's Google Brain division to become Google DeepMind in April 2023. The company is based in London, with research centres in Canada, France, Germany, and the United States.

AlphaGo is a computer program that plays the board game Go. It was developed by the London-based DeepMind Technologies, an acquired subsidiary of Google. Subsequent versions of AlphaGo became increasingly powerful, including a version that competed under the name Master. After retiring from competitive play, AlphaGo Master was succeeded by an even more powerful version known as AlphaGo Zero, which was completely self-taught without learning from human games. AlphaGo Zero was then generalized into a program known as AlphaZero, which played additional games, including chess and shogi. AlphaZero has in turn been succeeded by a program known as MuZero which learns without being taught the rules.

AlphaGo versus Lee Sedol, also known as the DeepMind Challenge Match, was a five-game Go match between top Go player Lee Sedol and AlphaGo, a computer Go program developed by DeepMind, played in Seoul, South Korea between 9 and 15 March 2016. AlphaGo won all but the fourth game; all games were won by resignation. The match has been compared with the historic chess match between Deep Blue and Garry Kasparov in 1997.

Leela Zero is a free and open-source computer Go program released on 25 October 2017. It is developed by Belgian programmer Gian-Carlo Pascutto, the author of chess engine Sjeng and Go engine Leela.

<span class="mw-page-title-main">AlphaZero</span> Game-playing artificial intelligence

AlphaZero is a computer program developed by artificial intelligence research company DeepMind to master the games of chess, shogi and go. This algorithm uses an approach similar to AlphaGo Zero.

OpenAI Five is a computer program by OpenAI that plays the five-on-five video game Dota 2. Its first public appearance occurred in 2017, where it was demonstrated in a live one-on-one game against the professional player Dendi, who lost to it. The following year, the system had advanced to the point of performing as a full team of five, and began playing against and showing the capability to defeat professional teams.

Amos James Storkey is Professor of Machine Learning and Artificial Intelligence at the School of Informatics, University of Edinburgh.

Artificial intelligence and machine learning techniques are used in video games for a wide variety of applications such as non-player character (NPC) control and procedural content generation (PCG). Machine learning is a subset of artificial intelligence that uses historical data to build predictive and analytical models. This is in sharp contrast to traditional methods of artificial intelligence such as search trees and expert systems.

KataGo is a free and open-source computer Go program, capable of defeating top-level human players. First released on 27 February 2019, it is developed by David Wu, who also developed the Arimaa playing program bot_Sharp which defeated three top human players to win the Arimaa AI Challenge in 2015.

References

  1. 1 2 Tian, Yuandong; Zhu, Yan (2015). "Better Computer Go Player with Neural Network and Long-term Prediction". arXiv: 1511.06410v1 [cs.LG].
  2. "How Facebook's AI Researchers Built a Game-Changing Go Engine". MIT Technology Review. December 4, 2015. Retrieved 2016-02-03.
  3. 1 2 3 4 5 "Facebook AI Go Player Gets Smarter With Neural Network And Long-Term Prediction To Master World's Hardest Game". Tech Times. 2016-01-28. Retrieved 2016-04-24.
  4. 1 2 "Facebook's artificially intelligent Go player is getting smarter". VentureBeat. 27 January 2016. Retrieved 2016-04-24.
  5. "Strachey Lecture - Dr Demis Hassabis by University of Oxford Live".
  6. "No Go: Facebook fails to spoil Google's big AI day". The Guardian. 2016-01-28. ISSN   0261-3077 . Retrieved 2016-02-01.
  7. "FB围棋项目负责人谈人机大战" [FB Go Project Manager Discusses Man vs Machine Showdown] (in Chinese). Tencent. 2016-03-01.
  8. Silver, David; Huang, Aja; Maddison, Chris J.; Guez, Arthur; Sifre, Laurent; Driessche, George van den; Schrittwieser, Julian; Antonoglou, Ioannis; Panneershelvam, Veda; Lanctot, Marc; Dieleman, Sander; Grewe, Dominik; Nham, John; Kalchbrenner, Nal; Sutskever, Ilya; Lillicrap, Timothy; Leach, Madeleine; Kavukcuoglu, Koray; Graepel, Thore; Hassabis, Demis (28 January 2016). "Mastering the game of Go with deep neural networks and tree search". Nature . 529 (7587): 484–489. Bibcode:2016Natur.529..484S. doi:10.1038/nature16961. ISSN   0028-0836. PMID   26819042. S2CID   515925. Closed Access logo transparent.svg
  9. 1 2 3 "How Facebook's AI Researchers Built a Game-Changing Go Engine". MIT Technology Review. Retrieved 2016-04-24.
  10. 1 2 Metz, Cade (7 December 2015). "Google and Facebook Race to Solve the Ancient Game of Go With AI". WIRED. Retrieved 2016-04-24.
  11. Kelion, Leo (27 January 2016). "Facebook trains AI to beat humans at Go board game - BBC News". BBC News. Retrieved 2016-04-24.
  12. LeCun, Yann; Bengio, Yoshua; Hinton, Geoffrey (27 May 2015). "Deep learning". Nature. 521 (7553): 436–444. Bibcode:2015Natur.521..436L. doi:10.1038/nature14539. PMID   26017442. S2CID   3074096.