
2015 | Book

Advances in Computer Games

14th International Conference, ACG 2015, Leiden, The Netherlands, July 1-3, 2015, Revised Selected Papers


About this book

This book constitutes the thoroughly refereed post-conference proceedings of the 14th International Conference on Advances in Computer Games, ACG 2015, held in Leiden, The Netherlands, in July 2015.
The 22 revised full papers presented were carefully reviewed and selected from 34 submissions. The papers cover a wide range of topics such as Monte-Carlo Tree Search and its enhancements; theoretical aspects and complexity; analysis of game characteristics; search algorithms; and machine learning.

Table of Contents

Frontmatter
Adaptive Playouts in Monte-Carlo Tree Search with Policy-Gradient Reinforcement Learning
Abstract
Monte-Carlo Tree Search evaluates positions with the help of a playout policy. If the playout policy evaluates a position wrongly, there are cases where the tree search has difficulty finding the correct move because of the large search space. This paper explores adaptive playout policies, which improve the playout policy during a tree search. With the help of policy-gradient reinforcement learning techniques, we optimize the playout policy to give better evaluations. We tested the algorithm in Computer Go and measured an increase in playing strength of more than 100 Elo. The resulting program was able to deal with difficult test cases that are known to pose a problem for Monte-Carlo Tree Search.
Tobias Graf, Marco Platzner
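The policy-gradient playout adaptation described above can be sketched as a REINFORCE-style update on a softmax policy over move weights. This is a minimal illustration under assumed per-move weights, not the authors' Go implementation:

```python
import math
import random

def softmax_probs(weights, moves):
    """Softmax distribution over candidate moves from per-move weights."""
    exps = [math.exp(weights.get(m, 0.0)) for m in moves]
    total = sum(exps)
    return [e / total for e in exps]

def sample_move(weights, moves):
    """Sample a playout move according to the current softmax policy."""
    return random.choices(moves, weights=softmax_probs(weights, moves), k=1)[0]

def policy_gradient_update(weights, played, moves, reward, alpha=0.1):
    """REINFORCE-style update: raise the weight of the played move and
    lower the others in proportion to their current probability."""
    probs = softmax_probs(weights, moves)
    for m, p in zip(moves, probs):
        grad = (1.0 if m == played else 0.0) - p
        weights[m] = weights.get(m, 0.0) + alpha * reward * grad
    return weights
```

Applying the update after each playout nudges the policy toward moves that correlate with won playouts, which is the general mechanism the abstract describes.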
Early Playout Termination in MCTS
Abstract
Many researchers view mini-max and MCTS-based searches as competing and incompatible approaches. For example, it is generally agreed that chess and checkers require a mini-max approach while Go and Havannah require MCTS. However, a hybrid technique is possible that has features of both mini-max and MCTS. It works by stopping the random MCTS playouts early and using an evaluation function to determine the winner of the playout. We call this algorithm MCTS-EPT (MCTS with early playout termination) and study it using MCTS-EPT programs we have written for Amazons, Havannah, and Breakthrough.
Richard Lorentz
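The early-termination idea can be sketched as follows. The game-interface callables (`legal_moves`, `play`, `is_terminal`, `eval_fn`) are hypothetical names, and the confidence-threshold stopping rule is an assumption for illustration, not the authors' exact criterion:

```python
import random

def playout_with_early_termination(state, legal_moves, play, is_terminal,
                                   eval_fn, max_steps=20, threshold=0.9):
    """Random playout that stops early once the static evaluation is
    confident, as in MCTS-EPT.  Interface (hypothetical names):
      legal_moves(state) -> list of moves
      play(state, move)  -> successor state
      is_terminal(state) -> bool
      eval_fn(state)     -> score in [-1, 1], positive = first player ahead
    Returns +1 or -1, the (estimated) playout winner."""
    for _ in range(max_steps):
        if is_terminal(state):
            return 1 if eval_fn(state) >= 0 else -1
        score = eval_fn(state)
        if abs(score) >= threshold:
            # Confident evaluation: terminate the playout early and use
            # the evaluation's sign as the playout result.
            return 1 if score > 0 else -1
        state = play(state, random.choice(legal_moves(state)))
    # Step budget exhausted: fall back to the static evaluation.
    return 1 if eval_fn(state) >= 0 else -1
```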
Playout Policy Adaptation for Games
Abstract
Monte-Carlo Tree Search (MCTS) is the state-of-the-art algorithm for General Game Playing (GGP). We propose to learn a playout policy online so as to improve MCTS for GGP. We test the resulting algorithm, named Playout Policy Adaptation (PPA), on Atarigo, Breakthrough, Misere Breakthrough, Domineering, Misere Domineering, Go, Knightthrough, Misere Knightthrough, Nogo and Misere Nogo. For most of these games, PPA is better than UCT with a uniform random playout policy, with the notable exceptions of Go and Nogo.
Tristan Cazenave
Strength Improvement and Analysis for an MCTS-Based Chinese Dark Chess Program
Abstract
Monte-Carlo tree search (MCTS) has been successfully applied to Chinese dark chess (CDC). In this paper, we study how to improve and analyze the playing strength of an MCTS-based CDC program, named DarkKnight, which won the CDC tournament in the 17th Computer Olympiad. We incorporate three recent techniques into the program: early playout termination, implicit minimax backups, and quality-based rewards. With early playout termination, playouts end when they reach states with likely outcomes. Implicit minimax backups use heuristic evaluations to help guide the selections of MCTS. Quality-based rewards adjust rewards based on online collected information. Our experiments showed win rates against the original DarkKnight of 60.75 %, 70.90 %, and 59.00 %, respectively, when incorporating each of the three techniques individually. Incorporating all of them together yielded a win rate of 76.70 %.
Chu-Hsuan Hsueh, I-Chen Wu, Wen-Jie Tseng, Shi-Jim Yen, Jr-Chang Chen
LinUCB Applied to Monte-Carlo Tree Search
Abstract
UCT is a de facto standard method for Monte-Carlo tree search (MCTS) algorithms, which have been applied to various domains and have achieved remarkable success. This study proposes a family of LinUCT algorithms that incorporate LinUCB into MCTS algorithms. LinUCB is a recently developed method that generalizes past episodes by ridge regression with feature vectors and rewards. LinUCB outperforms UCB1 in contextual multi-armed bandit problems. We introduce a straightforward application of LinUCB, \(\text {LinUCT}_{\text {PLAIN}}\) by substituting UCB1 with LinUCB in UCT. We show that it does not work well owing to the minimax structure of game trees. To better handle such tree structures, we present \(\text {LinUCT}_{\text {RAVE}}\) and \(\text {LinUCT}_{\text {FP}}\) by further incorporating two existing techniques, rapid action value estimation (RAVE) and feature propagation, which recursively propagates the feature vector of a node to that of its parent. Experiments were conducted with a synthetic model, which is an extension of the standard incremental random tree model in which each node has a feature vector that represents the characteristics of the corresponding position. The experimental results indicate that \(\text {LinUCT}_{\text {RAVE}}\), \(\text {LinUCT}_{\text {FP}}\), and their combination \(\text {LinUCT}_{\text {RAVE-FP}}\) outperform UCT, especially when the branching factor is relatively large.
Yusaku Mandai, Tomoyuki Kaneko
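For reference, a single LinUCB arm can be sketched in a few lines: it maintains the inverse of the ridge-regression design matrix via a Sherman-Morrison rank-1 update and scores contexts optimistically. This is a generic LinUCB sketch, not the paper's LinUCT variants:

```python
def mat_vec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

class LinUCBArm:
    """One LinUCB arm over d-dimensional feature vectors (minimal sketch;
    lam is the ridge parameter, alpha the exploration parameter)."""
    def __init__(self, d, lam=1.0):
        self.A_inv = [[(1.0 / lam if i == j else 0.0) for j in range(d)]
                      for i in range(d)]
        self.b = [0.0] * d

    def ucb(self, x, alpha=1.0):
        theta = mat_vec(self.A_inv, self.b)   # ridge-regression estimate
        mean = sum(t * xi for t, xi in zip(theta, x))
        var = sum(xi * yi for xi, yi in zip(x, mat_vec(self.A_inv, x)))
        return mean + alpha * var ** 0.5      # optimistic score

    def update(self, x, reward):
        # Sherman-Morrison rank-1 update of the inverse design matrix.
        Ax = mat_vec(self.A_inv, x)
        denom = 1.0 + sum(xi * axi for xi, axi in zip(x, Ax))
        d = len(x)
        self.A_inv = [[self.A_inv[i][j] - Ax[i] * Ax[j] / denom
                       for j in range(d)] for i in range(d)]
        self.b = [bi + reward * xi for bi, xi in zip(self.b, x)]
```

At each decision the bandit plays the arm with the highest `ucb(x)` for the current context `x`, then feeds the observed reward back through `update`.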
Adapting Improved Upper Confidence Bounds for Monte-Carlo Tree Search
Abstract
The UCT algorithm, which combines the UCB algorithm and Monte-Carlo Tree Search (MCTS), is currently the most widely used variant of MCTS. Recently, a number of investigations into applying other bandit algorithms to MCTS have produced interesting results. In this research, we investigate the possibility of combining the improved UCB algorithm, proposed by Auer et al. [2], with MCTS. However, various characteristics and properties of the improved UCB algorithm may not be ideal for a direct application to MCTS. Therefore, we made some modifications to the improved UCB algorithm to make it more suitable for game-tree search. The Mi-UCT algorithm applies this modified UCB algorithm to trees. The performance of Mi-UCT is demonstrated on the games of \(9\times 9\) Go and \(9\times 9\) NoGo; it is shown to outperform the plain UCT algorithm when only a small number of playouts is given, and to be roughly on the same level when more playouts are available.
Yun-Ching Liu, Yoshimasa Tsuruoka
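For comparison, the plain UCT baseline that Mi-UCT is measured against selects the child maximising the UCB1 value \(\bar{x}_i + c\sqrt{\ln N / n_i}\). A minimal sketch (the unvisited-first rule is a common convention, not mandated by the paper):

```python
import math

def ucb1_select(children, c=math.sqrt(2)):
    """Plain UCT selection.  `children` maps move -> (visits, total_reward);
    unvisited children are expanded first (a common convention)."""
    total = sum(n for n, _ in children.values())
    best, best_val = None, float('-inf')
    for move, (n, w) in children.items():
        if n == 0:
            return move  # expand unvisited child first
        val = w / n + c * math.sqrt(math.log(total) / n)
        if val > best_val:
            best, best_val = move, val
    return best
```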
On Some Random Walk Games with Diffusion Control
Abstract
Random walks with discrete time steps and discrete state spaces have widely been studied for several decades. We investigate such walks as games with “Diffusion Control”: a player (=controller) with certain intentions influences the random movements of the particle. In our models the controller decides only about the step size for a single particle. It turns out that this small amount of control is sufficient to cause the particle to stay in “premium regions” of the state space with surprisingly high probabilities.
Ingo Althöfer, Matthias Beckmann, Friedrich Salzer
Go Complexities
Abstract
The game of Go is often said to be exptime-complete. The result refers to classical Go under Japanese rules, but many variants of the problem exist and affect the complexity. We survey what is known on the computational complexity of Go and highlight challenging open problems. We also propose a few new results. In particular, we show that Atari-Go is pspace-complete and that hardness results for classical Go carry over to their Partially Observable variant.
Abdallah Saffidine, Olivier Teytaud, Shi-Jim Yen
On Some Evacuation Games with Random Walks
Abstract
We consider a single-player game where a particle on a board has to be steered to evacuation cells. The actor has no direct control over this particle but may indirectly influence the movement of the particle by blockades. We examine optimal blocking strategies and the recurrence property experimentally and conclude that the random walk of our game is recurrent. Furthermore, we are interested in the average time in which an evacuation cell is reached.
Matthias Beckmann
Crystallization of Domineering Snowflakes
Abstract
In this paper we present a combinatorial game-theoretic analysis of special Domineering positions. In particular we investigate complex positions that are aggregates of simpler fragments, linked via bridging squares.
We aim to extend two theorems that exploit the characteristic that an aggregate of two fragments has as its game-theoretic value the sum of the values of the fragments. We extend these theorems to deal with multiply-connected networks with an arbitrary number of fragments, possibly also including cycles.
As an application, we introduce an interesting, special Domineering position with value \(*2\). We dub this position the Snowflake. We then show how from this fragment larger chains of Snowflakes can be built with known values, including flat networks of Snowflakes (a kind of crystallization).
Jos W. H. M. Uiterwijk
First Player’s Cannot-Lose Strategies for Cylinder-Infinite-Connect-Four with Widths 2 and 6
Abstract
Cylinder-Infinite-Connect-Four is Connect-Four played on a cylindrical square grid board with infinite row height and columns that cycle about its width. In previous work, the first player’s cannot-lose strategies have been discovered for all widths except 2 and 6, and the second player’s cannot-lose strategies have been discovered with all widths except 6 and 11. In this paper, we show the first player’s cannot-lose strategies for widths 2 and 6.
Yoshiaki Yamaguchi, Todd W. Neller
Development of a Program for Playing Progressive Chess
Abstract
We present the design of a computer program for playing Progressive Chess. In this game, players play progressively longer series of moves rather than just making one move per turn. Our program follows the generally recommended strategy for this game, which consists of three phases: looking for possibilities to checkmate the opponent, playing generally good moves when no checkmate can be found, and preventing checkmates from the opponent. In this paper, we focus on efficiently searching for checkmates, putting to test various heuristics for guiding the search. We also present the findings of self-play experiments between different versions of the program.
Vito Janko, Matej Guid
A Comparative Review of Skill Assessment: Performance, Prediction and Profiling
Abstract
The assessment of chess players is both an increasingly attractive opportunity and an unfortunate necessity. The chess community needs to limit potential reputational damage by inhibiting cheating and unjustified accusations of cheating: there has been a recent rise in both. A number of counter-intuitive discoveries have been made by benchmarking the intrinsic merit of players’ moves: these call for further investigation. Is Capablanca actually, objectively the most accurate World Champion? Has ELO rating inflation not taken place? Stimulated by FIDE/ACP, we revisit the fundamentals of the subject to advance a framework suitable for improved standards of computational experiment and more precise results. Other games and domains look to chess as demonstrator of good practice, including the rating of professionals making high-value decisions under pressure, personnel evaluation by Multichoice Assessment and the organization of crowd-sourcing in citizen science projects. The ‘3P’ themes of performance, prediction and profiling pervade all these domains.
Guy Haworth, Tamal Biswas, Ken Regan
Boundary Matching for Interactive Sprouts
Abstract
The simplicity of the pen-and-paper game Sprouts hides a surprising combinatorial complexity. We describe an optimization called boundary matching that accommodates this complexity to allow move generation for Sprouts games of arbitrary size at interactive speeds.
Cameron Browne
Draws, Zugzwangs, and PSPACE-Completeness in the Slither Connection Game
Abstract
Two features set Slither apart from other connection games. Previously played stones can be relocated and some stone configurations are forbidden. We show that the interplay of these peculiar mechanics with the standard goal of connecting opposite edges of a board results in a game with a few properties unexpected among connection games, for instance, the existence of mutual Zugzwangs. We also establish that, although there are positions where one player has no legal move, there is no position where both players lack a legal move, and that the game cannot end in a draw. From the standpoint of computational complexity, we show that the game is pspace-complete; the relocation rule can indeed be tamed so as to simulate a Hex game on a Slither board.
Édouard Bonnet, Florian Jamain, Abdallah Saffidine
Constructing Pin Endgame Databases for the Backgammon Variant Plakoto
Abstract
Palamedes is an ongoing project for building expert playing bots for backgammon variants. Until recently, position evaluation relied only on self-trained neural networks. This paper describes the first attempt to augment Palamedes by constructing databases for certain endgame positions of the backgammon variant Plakoto. The result is 5 databases containing 12,480,720 records in total; they can accurately calculate the best move for roughly \(3.4 \times 10^{15}\) positions. To the best of our knowledge, this is the first time that an endgame database has been created for this game.
Nikolaos Papahristou, Ioannis Refanidis
Reducing the Seesaw Effect with Deep Proof-Number Search
Abstract
In this paper we introduce DeepPN, a modified version of proof-number (PN) search with a procedure to mitigate the seesaw effect. DeepPN associates two values with each node: the usual proof number and a deep value, defined as the depth to which each child node has been searched; the deep value thus measures the progress of the search in the depth direction. By mixing proof numbers and deep values, DeepPN combines two search characteristics: the best-first manner of the original proof-number search and a depth-first manner. A parameter (called R in this paper) adjusts the balance between the two, and in our experiments the best results were obtained at an R value in between the two extremes of pure best-first search (original proof-number search) and depth-first search. On random Othello and Hex positions, DeepPN performs almost twice as well as PN-search, so we may conclude that Deep Proof-Number Search outperforms PN-search considerably in Othello and Hex.
Taichi Ishitobi, Aske Plaat, Hiroyuki Iida, Jaap van den Herik
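The mixing of proof numbers and deep values can be illustrated with a child-selection rule that interpolates between the two orderings. The linear blend below is a hypothetical formula for illustration only; the abstract does not give the paper's exact combination:

```python
def select_child(children, r):
    """Pick the index of the most attractive child.
    `children` is a list of (proof_number, deep_value) pairs, where smaller
    proof numbers are better (best-first) and larger deep values mark
    deeper-explored lines (depth-first).  r = 1 recovers plain PN-search
    ordering, r = 0 a pure depth-first ordering; intermediate r mixes them.
    (Hypothetical blend, not the paper's exact formula.)"""
    def key(pair):
        pn, depth = pair
        return r * pn - (1.0 - r) * depth
    return min(range(len(children)), key=lambda i: key(children[i]))
```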
Feature Strength and Parallelization of Sibling Conspiracy Number Search
Abstract
Recently we introduced Sibling Conspiracy Number Search (SCNS), an algorithm based not on evaluation of leaf states of the search tree but, for each node, on relative evaluation scores of all children of that node, and implemented an SCNS Hex bot. Here we show the strength of SCNS features: most critical is to initialize leaves via a multi-step process. We also show a simple parallel version of SCNS: it scales well for 2 threads but less efficiently for 4 or 8 threads.
Jakub Pawlewicz, Ryan B. Hayward
Parameter-Free Tree Style Pipeline in Asynchronous Parallel Game-Tree Search
Abstract
Asynchronous parallel game-tree search methods are effective in improving playing strength by using many computers connected through relatively slow networks. In game-position parallelization, the master program manages a game tree and distributes positions in the tree to workers. Each worker then asynchronously searches for the best move and the corresponding evaluation of its assigned position. We present a new method for constructing an appropriate master tree that assigns more workers to the sub-trees of more important moves, to improve playing strength. Our method has two advantages: (1) it is parameter-free, in that users do not need to tune parameters through trial and error, and (2) it is efficient enough even for short-time matches, such as one second per move. We implemented our method in chess with the top-level chess program Stockfish and evaluated playing strength through self-play. We confirm that playing strength improves with up to sixty workers.
Shu Yokoyama, Tomoyuki Kaneko, Tetsuro Tanaka
Transfer Learning by Inductive Logic Programming
Abstract
In this paper, we propose a Transfer Learning method for games based on Inductive Logic Programming. We generate general knowledge from one game and specialize that knowledge so that it is applicable in another game; this process is called Transfer Learning. We demonstrate the working of Transfer Learning by taking knowledge from Tic-tac-toe and transferring it to Connect4 and Connect5. For Connect4 we developed 30 heuristic functions; for Connect5, 20.
Yuichiro Sato, Hiroyuki Iida, H. J. van den Herik
Developing Computer Hex Using Global and Local Evaluation Based on Board Network Characteristics
Abstract
The game of Hex was invented in the 1940s, and many studies have proposed ideas that led to the development of computer Hex programs. One of the main approaches to developing computer Hex uses an evaluation function based on an electric-circuit model. However, such a function evaluates board states from only one perspective; consequently, it has recently been defeated by Monte-Carlo Tree Search approaches. In this paper, we therefore propose a novel evaluation function that uses network characteristics to capture features of board states from two perspectives. Our evaluation function separately evaluates the board network and the shortest-path network using betweenness centrality, and combines the results of these evaluations. Furthermore, our method adjusts the ratio between global and local evaluations with a support vector machine (SVM), which yields an improved strategy for Hex. Our method, called Ezo, was tested against the world-champion Hex program MoHex. The results show that our method is superior to the 2011 version of MoHex on an \(11 \times 11\) board.
Kei Takada, Masaya Honjo, Hiroyuki Iizuka, Masahito Yamamoto
Machine-Learning of Shape Names for the Game of Go
Abstract
Computer Go programs with only a 4-stone handicap have recently defeated professional humans. Now that the strength of Go programs is sufficiently close to that of humans, a new target in artificial intelligence is to develop programs able to provide commentary on Go games. A fundamental difficulty in this development is to learn the terminology of Go, which is often not well defined. An example is the problem of naming shapes such as Atari, Attachment or Hane. In this research, our goal is to allow a program to label relevant moves with an associated shape name. We use machine learning to deduce these names based on local patterns of stones. First, strong amateur players recorded for each game move the associated shape name, using a pre-selected list of 71 terms. Next, these records were used to train a supervised machine learning algorithm. The result is a program able to output the shape name from the local patterns of stones. Including other Go features such as change in liberties improved the performance. Humans agreed on a shape name with a rate of about 82 %. Our algorithm achieved a similar performance, picking the name most preferred by the humans with a rate of about 82 %. This performance is a first step towards a program that is able to communicate with human players in a game review or match.
Kokolo Ikeda, Takanari Shishido, Simon Viennot
Backmatter
Metadata
Title
Advances in Computer Games
Editors
Aske Plaat
Jaap van den Herik
Walter Kosters
Copyright Year
2015
Electronic ISBN
978-3-319-27992-3
Print ISBN
978-3-319-27991-6
DOI
https://doi.org/10.1007/978-3-319-27992-3
