Skip to main content

Über dieses Buch

This book constitutes the thoroughly refereed post-conference proceedings of the 9th International Conference on Computers and Games, CG 2016, held in Leiden, The Netherlands,in conjunction with the 19th Computer Olympiad and the 22nd World Computer-Chess Championship.
The 20 papers presented were carefully reviewed and selected of 30 submitted papers.
The 20 papers cover a wide range of computer games and many different research topics in four main classes which determined the order of publication: Monte Carlo Tree Search (MCTS) and its enhancements (seven papers), concrete games (seven papers), theoretical aspects and complexity (five papers) and cognition model (one paper). The paper Using Partial Tablebases in Breakthrough by Andrew Isaac and Richard Lorentz received the Best Paper Award.



Using Partial Tablebases in Breakthrough

In the game of Breakthrough the endgame is reached when there are still many pieces on the board. This means there are too many possible positions to be able to construct a reasonable endgame tablebase on the standard 8 \(\times \) 8 board, or even on a 6 \(\times \) 6 board. The fact that Breakthrough pieces only move forward allows us to create partial tablebases on the last n rows of each side of the board. We show how doing this enables us to create a much stronger MCTS based 6 \(\times \) 6 player and allows us to solve positions that would otherwise be out of reach.
Andrew Isaac, Richard Lorentz

Using Deep Convolutional Neural Networks in Monte Carlo Tree Search

Deep Convolutional Neural Networks have revolutionized Computer Go. Large networks have emerged as state-of-the-art models for move prediction and are used not only as stand-alone players but also inside Monte Carlo Tree Search to select and bias moves. Using neural networks inside the tree search is a challenge due to their slow execution time even if accelerated on a GPU. In this paper we evaluate several strategies to limit the number of nodes in the search tree in which neural networks are used. All strategies are assessed using the freely available cuDNN library. We compare our strategies against an optimal upper bound which can be estimated by removing timing constraints. We show that the best strategies are only 50 ELO points worse than this upper bound.
Tobias Graf, Marco Platzner

Monte Carlo Approaches to Parameterized Poker Squares

Parameterized Poker Squares (PPS) is a generalization of Poker Squares where players must adapt to a point system supplied at play time and thus dynamically compute highly-varied strategies. Herein, we detail the top three performing AI players in a PPS research competition, all three of which make various use of Monte Carlo techniques.
Todd W. Neller, Zuozhi Yang, Colin M. Messinger, Calin Anton, Karo Castro-Wunsch, William Maga, Steven Bogaerts, Robert Arrington, Clay Langley

Monte Carlo Tree Search with Robust Exploration

This paper presents a new Monte-Carlo tree search method that focuses on identifying the best move. UCT which minimizes the cumulative regret, has achieved remarkable success in Go and other games. However, recent studies on simple regret reveal that there are better exploration strategies. To further improve the performance, a leaf to be explored is determined not only by the mean but also by the whole reward distribution. We adopted a hybrid approach to obtain reliable distributions. A negamax-style backup of reward distributions is used in the shallower half of a search tree, and UCT is adopted in the rest of the tree. Experiments on synthetic trees show that this presented method outperformed UCT and similar methods, except for trees having uniform width and depth.
Takahisa Imagawa, Tomoyuki Kaneko

Pruning Playouts in Monte-Carlo Tree Search for the Game of Havannah

Monte-Carlo Tree Search (MCTS) is a popular technique for playing multi-player games. In this paper, we propose a new method to bias the playout policy of MCTS. The idea is to prune the decisions which seem “bad” (according to the previous iterations of the algorithm) before computing each playout. Thus, the method evaluates the estimated “good” moves more precisely. We have tested our improvement for the game of Havannah and compared it to several classic improvements. Our method outperforms the classic version of MCTS (with the RAVE improvement) and the different playout policies of MCTS that we have experimented.
Joris Duguépéroux, Ahmad Mazyad, Fabien Teytaud, Julien Dehos

Fast Seed-Learning Algorithms for Games

Recently, a methodology has been proposed for boosting the computational intelligence of randomized game-playing programs. We propose faster variants of these algorithms, namely rectangular algorithms (fully parallel) and bandit algorithms (faster in a sequential setup). We check the performance on several board games and card games. In addition, in the case of Go, we check the methodology when the opponent is completely distinct to the one used in the training.
Jialin Liu, Olivier Teytaud, Tristan Cazenave

Heuristic Function Evaluation Framework

We present a heuristic function evaluation framework that allows to quickly compare a heuristic function’s output to benchmark values that are precomputed for a subset of the state space of the game. Our framework reduces the time to evaluate a heuristic function drastically while also providing some insight into where the heuristic is performing well or below par. We analyze the feasibility of using Monte-Carlo Tree Search to compute benchmark values instead of relying on game theoretic values that are hard to obtain in many cases. We also propose several metrics for comparing heuristic evaluations to benchmark values and discuss the feasibility of using MCTS benchmarks with those metrics.
Nera Nešić, Stephan Schiffel

Systematic Selection of N-Tuple Networks for 2048

The puzzle game 2048, a single-player stochastic game played on a \(4\,\times \,4\) grid, is the most popular among similar slide-and-merge games. One of the strongest computer players for 2048 uses temporal difference learning (TD learning) with N-tuple networks, and it matters a great deal how to design N-tuple networks. In this paper, we study the N-tuple networks for the game 2048. In the first set of experiments, we conduct TD learning by selecting 6- and 7-tuples exhaustively, and evaluate the usefulness of those tuples. In the second set of experiments, we conduct TD learning with high-utility tuples, varying the number of tuples. The best player with ten 7-tuples achieves an average score 234,136 and the maximum score 504,660. It is worth noting that this player utilize no game-tree search and plays a move in about 12 \(\upmu \)s.
Kazuto Oka, Kiminori Matsuzaki

Human-Side Strategies in the Werewolf Game Against the Stealth Werewolf Strategy

The werewolf game contains unique features, such as persuasion and deception, which are not included in games that have been previously studied in AI research. Studying the werewolf game could be one of the next challenging targets for AI research. In this paper, we concentrate on a werewolf-side strategy called the “stealth werewolf” strategy. With this strategy, each of the werewolf-side players behaves like a villager, and the player does not pretend to have a special role. Even though the strategy is thought to be suboptimal, so far this has not been proved. In this paper, we limit the human-side strategies such that the seer reveals his/her role on the first day and the bodyguard never reveals his/her role. So, the advantage of the werewolves in determining the player to be eliminated by vote is nullified. We calculated the \(\varepsilon \)-Nash equilibrium of strategies for both sides under this limitation. The solution shows that the winning rates of the human-side are more than half when the number of werewolves is assigned as in common play. Since it is thought to be fair and interesting for the winning rate to stay near 50%, the result suggests that the “stealth werewolf” strategy is not a good strategy for werewolf-side players. Furthermore, the result also suggests that there exist unusual actions in the strategies that result in an \(\varepsilon \)-Nash equilibrium.
Xiaoheng Bi, Tetsuro Tanaka

Werewolf Game Modeling Using Action Probabilities Based on Play Log Analysis

In this study, we construct a non-human agent that can play the werewolf game (i.e., AI wolf) with aims of creating more advanced intelligence and acquire more advanced communication skills for AI-based systems. We therefore constructed a behavioral model using information regarding human players and the decisions made by such players; all such information was obtained from play logs of the werewolf game. To confirm our model, we conducted simulation experiments of the werewolf game using an agent based on our proposed behavioral model, as well as a random agent for comparison. Consequently, we obtained an 81.55% coincidence ratio of agent behavior versus human behavior.
Yuya Hirata, Michimasa Inaba, Kenichi Takahashi, Fujio Toriumi, Hirotaka Osawa, Daisuke Katagami, Kousuke Shinoda

Nash Equilibrium in Mastermind

Mastermind is a famous two-player deduction game. A Codemaker chooses a secret code and a Codebreaker tries to guess this secret code in as few guesses as possible, with feedback information after each guess. Many existing works have computed optimal worst-case and average-case strategies of the Codebreaker, assuming that the Codemaker chooses the secret code uniformly at random. However, the Codemaker can freely choose any distribution probability on the secret codes. An optimal strategy in this more general setting is known as a Nash Equilibrium. In this research, we compute such a Nash Equilibrium for all instances of Mastermind up to the most classical instance of 4 pegs and 6 colors, showing that the uniform distribution is not always the best choice for the Codemaker. We also show the direct relation between Nash Equilibrium computations and computations of worst-case and average-case strategies.
François Bonnet, Simon Viennot

11  11 Domineering Is Solved: The First Player Wins

We have developed a program called MUDoS (Maastricht University Domineering Solver) that solves Domineering positions in a very efficient way. It enables the solution of known positions (up to the \(10\times 10\) board) to be much quicker.
More importantly, it enables the solution of \(11\times 11\) Domineering, a board size that up till now was far out of the reach of previous Domineering solvers. The solution needed the investigation of 259,689,994,008 nodes, using almost half a year of computation time on a single simple desktop computer. The results show that under optimal play the first player wins \(11\times 11\) Domineering, irrespective whether Vertical or Horizontal starts.
In addition, several other new boards were solved. Using the convention that Vertical starts, the \(8\times 15\), \(11\times 9\), \(12\times 8\), \(12\times 15\), \(14\times 8\), and \(17\times 6\) boards are all won by Vertical, whereas the \(6\times 17\), \(8\times 12\), \(9\times 11\), and \(11\times 10\) boards are all won by Horizontal.
Jos W. H. M. Uiterwijk

A Reverse Hex Solver

We present Solrex, an automated solver for the game of Reverse Hex. Reverse Hex, also known as Rex, or Misère Hex, is the variant of the game of Hex in which the player who joins her two sides loses the game. Solrex performs a mini-max search of the state space using Scalable Parallel Depth First Proof Number Search, enhanced by the pruning of inferior moves and the early detection of certain winning strategies.
Solrex is implemented on the same code base as the Hex program Solver, and can solve arbitrary positions on board sizes up to 6 \(\times \) 6, with the hardest position taking less than four hours on four threads.
Kenny Young, Ryan B. Hayward

Computer-Aided Go: Chess as a Role Model

Recently computers have gained strength in the Asian board game Go. The Chess community experienced some 15 to 30 years ago that teams with humans and computers may be much stronger than each of their components. This paper claims that time is ripe for computer-aided Go on a large scale, although neither most users nor the Go programmers have realized it. A central part of the paper describes successful pioneers in Go play with computer help. Progress in computer-aided Go may also lead to progress in human Go and in computer Go itself.
Ingo Althöfer

Quantified Integer Programs with Polyhedral Uncertainty Set

Quantified Integer Programs (QIPs) are integer programs with variables being either existentially or universally quantified. They can be interpreted as a two-person zero-sum game with an existential and a universal player where the existential player tries to meet all constraints and the universal player intends to force at least one constraint to be not satisfied.
Originally, the universal player is only restricted to set the universal variables within their upper and lower bounds. We extend this idea by adding constraints for the universal variables, i.e., restricting the universal player to some polytope instead of the hypercube created by bounds. We also show how this extended structure can be polynomial-time reduced to a QIP.
Michael Hartisch, Thorsten Ederer, Ulf Lorenz, Jan Wolf

A Class Grammar for General Games

While there exist a variety of game description languages (GDLs) for modeling various classes of games, these are aimed at game playing rather than the more particular needs of game design. This paper describes a new approach to general game modeling that arose from this need. A class grammar is automatically generated from a given library of source code, from the constructors and associated parameters found along its class hierarchy, to give a context-free grammar that provides access to the underlying code while hiding its implementation details.
Cameron Browne

The Number of Legal Go Positions

The number of legal \(19\times 19\) Go positions has been determined as A roughly 1.2 % fraction of the \(3^{19\times 19}\) total number of positions, this is more naturally expressed in ternary. Replacing the usual ternary digits 0,1,2 by \(+\)(empty), (black), and (white) respectively, yields the following (illegal) position that counts all legal positions:
John Tromp

A Googolplex of Go Games

We establish the existence of \(10^{10^{100}}\) Go games, addressing an open problem in “Combinatorics of Go” by Tromp and Farnebäck.
Matthieu Walraet, John Tromp

An Analysis of Majority Systems with Dependent Agents in a Simple Subtraction Game

It is common knowledge that a majority system is typically better than its components, when the components are stochastically independent. However, in practice the independency assumption is often not justified. We investigate systems of experts which are constituted by couples of dependent agents. Based on recent theoretical work we analyse their performance in a simple 2-player subtraction game. It turns out that systems with negatively correlated couples perform better than those with positive correlation within the couples. From computer chess practice it was at least known that systems of very positively correlated bots were not too successful.
Raphael Thiele, Ingo Althöfer

Do People Think Like Computers?

Human cognition inspired the earliest algorithms for game-playing computer programs. However, the studies of human and computer game play quickly diverged: the Artificial Intelligence community focused on theory and techniques to solve games, while behavioral scientists empirically examined simple decision-making in humans. In this paper, we combine concepts and methods from the two fields to investigate whether human and AI players take similar approaches in an adversarial combinatorial game. We develop and compare five models that capture human behavior. We then demonstrate that our models can predict behavior in two related tasks. To conclude, we use our models to describe what makes a strong human player.
Bas van Opheusden, Zahy Bnaya, Gianni Galbiati, Wei Ji Ma


Weitere Informationen

Premium Partner