2011 | OriginalPaper | Book chapter
Computational Experiments with the RAVE Heuristic
Authors: David Tom, Martin Müller
Published in: Computers and Games
Publisher: Springer Berlin Heidelberg
The Monte-Carlo tree search algorithm Upper Confidence bounds applied to Trees (UCT) has become extremely popular in computer games research. The Rapid Action Value Estimation (RAVE) heuristic is a strong estimator that often improves the performance of UCT-based algorithms. However, there are situations where RAVE misleads the search whereas pure UCT search can find the correct solution. Two games, the simple abstract game Sum of Switches (SOS) and the game of Go, are used to study the behavior of the RAVE heuristic. In SOS, RAVE updates are manipulated to mimic game situations where RAVE misleads the search. Such false RAVE updates are used to create RAVE overestimates and underestimates. A study of the distributions of mean and RAVE values reveals great differences between Go and SOS. While the RAVE-max update rule is able to correct extreme cases of RAVE underestimation, it is not effective in settings closer to practice or in Go.
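
The abstract does not spell out how mean and RAVE values are combined, so the following is only a minimal sketch under stated assumptions. It assumes a move's value is a weighted blend of its mean value and its RAVE value, with a weight that shrinks as the move's visit count grows, and it assumes that RAVE-max substitutes max(mean, RAVE) for the RAVE value. The weighting schedule, the constant `k`, and all function names are hypothetical and are not taken from the chapter. The UCT exploration term is left out to keep the focus on the RAVE part.

```python
import math


def rave_weight(visits: int, k: float = 1000.0) -> float:
    """Hypothetical schedule for the RAVE weight (an assumption, not the chapter's).

    Equals 1.0 at zero visits and decays toward 0 as visits grow,
    so the mean value gradually takes over from the RAVE value.
    """
    return math.sqrt(k / (3.0 * visits + k))


def blended_value(mean: float, rave: float, visits: int, k: float = 1000.0) -> float:
    """Blend mean and RAVE values: (1 - beta) * mean + beta * rave."""
    beta = rave_weight(visits, k)
    return (1.0 - beta) * mean + beta * rave


def blended_value_rave_max(mean: float, rave: float, visits: int, k: float = 1000.0) -> float:
    """Assumed RAVE-max variant: use max(mean, rave) in place of rave.

    This only changes the result when the RAVE value is below the mean,
    i.e. when RAVE underestimates the move.
    """
    return blended_value(mean, max(mean, rave), visits, k)


if __name__ == "__main__":
    # A move that RAVE underestimates: good mean value, poor RAVE value.
    mean, rave = 0.8, 0.2
    for visits in (0, 100, 10_000):
        plain = blended_value(mean, rave, visits)
        capped = blended_value_rave_max(mean, rave, visits)
        print(f"visits={visits:>6}  plain={plain:.3f}  rave_max={capped:.3f}")
```

Under these assumptions, an underestimating RAVE value drags the plain blend down while the visit count is low, whereas the RAVE-max variant falls back to the mean. An overestimating RAVE value is left untouched by the variant, which matches the abstract's description of RAVE-max as a correction for underestimation only.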