Skip to main content

2015 | OriginalPaper | Buchkapitel

Playout Policy Adaptation for Games

verfasst von : Tristan Cazenave

Erschienen in: Advances in Computer Games

Verlag: Springer International Publishing

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Monte-Carlo Tree Search (MCTS) is the state of the art algorithm for General Game Playing (GGP). We propose to learn a playout policy online so as to improve MCTS for GGP. We test the resulting algorithm named Playout Policy Adaptation (PPA) on Atarigo, Breakthrough, Misere Breakthrough, Domineering, Misere Dominee-ring, Go, Knightthrough, Misere Knightthrough, Nogo and Misere Nogo. For most of these games, PPA is better than UCT with a uniform random playout policy, with the notable exceptions of Go and Nogo.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Fußnoten
1
For brevity, we use ‘he’ and ‘his’, whenever ‘he or she’ and ‘his or her’ are meant.
 
Literatur
1.
Zurück zum Zitat Boissac, F., Cazenave, T.: De nouvelles heuristiques de recherche appliquées à la résolution d’Atarigo. In: Intelligence artificielle et jeux, pp. 127–141. Hermes Science (2006) Boissac, F., Cazenave, T.: De nouvelles heuristiques de recherche appliquées à la résolution d’Atarigo. In: Intelligence artificielle et jeux, pp. 127–141. Hermes Science (2006)
2.
Zurück zum Zitat Browne, C., Powley, E., Whitehouse, D., Lucas, S., Cowling, P., Rohlfshagen, P., Tavener, S., Perez, D., Samothrakis, S., Colton, S.: A survey of Monte Carlo tree search methods. IEEE Trans. Comput. Intell. AI Games 4(1), 1–43 (2012)CrossRef Browne, C., Powley, E., Whitehouse, D., Lucas, S., Cowling, P., Rohlfshagen, P., Tavener, S., Perez, D., Samothrakis, S., Colton, S.: A survey of Monte Carlo tree search methods. IEEE Trans. Comput. Intell. AI Games 4(1), 1–43 (2012)CrossRef
3.
Zurück zum Zitat Cazenave, T.: Nested Monte-Carlo search. In: Boutilier, C. (ed.) IJCAI, pp. 456–461 (2009) Cazenave, T.: Nested Monte-Carlo search. In: Boutilier, C. (ed.) IJCAI, pp. 456–461 (2009)
4.
Zurück zum Zitat Cazenave, T.: Sequential halving applied to trees. IEEE Trans. Comput. Intell. AI Games 7(1), 102–105 (2015)CrossRef Cazenave, T.: Sequential halving applied to trees. IEEE Trans. Comput. Intell. AI Games 7(1), 102–105 (2015)CrossRef
5.
Zurück zum Zitat Cazenave, T., Saffidine, A., Schofield, M., Thielscher, M.: Discounting and pruning for nested playouts in general game playing. GIGA at IJCAI (2015) Cazenave, T., Saffidine, A., Schofield, M., Thielscher, M.: Discounting and pruning for nested playouts in general game playing. GIGA at IJCAI (2015)
6.
Zurück zum Zitat Chou, C.-W., Teytaud, O., Yen, S.-J.: Revisiting Monte-Carlo tree search on a normal form game: NoGo. In: Di Chio, C., et al. (eds.) EvoApplications 2011, Part I. LNCS, vol. 6624, pp. 73–82. Springer, Heidelberg (2011) CrossRef Chou, C.-W., Teytaud, O., Yen, S.-J.: Revisiting Monte-Carlo tree search on a normal form game: NoGo. In: Di Chio, C., et al. (eds.) EvoApplications 2011, Part I. LNCS, vol. 6624, pp. 73–82. Springer, Heidelberg (2011) CrossRef
7.
Zurück zum Zitat Coulom, R.: Efficient selectivity and backup operators in Monte-Carlo tree search. In: van den Herik, H.J., Ciancarini, P., Donkers, H.H.L.M.J. (eds.) CG 2006. LNCS, vol. 4630, pp. 72–83. Springer, Heidelberg (2007) CrossRef Coulom, R.: Efficient selectivity and backup operators in Monte-Carlo tree search. In: van den Herik, H.J., Ciancarini, P., Donkers, H.H.L.M.J. (eds.) CG 2006. LNCS, vol. 4630, pp. 72–83. Springer, Heidelberg (2007) CrossRef
8.
Zurück zum Zitat Coulom, R.: Computing elo ratings of move patterns in the game of go. ICGA J. 30(4), 198–208 (2007) Coulom, R.: Computing elo ratings of move patterns in the game of go. ICGA J. 30(4), 198–208 (2007)
9.
Zurück zum Zitat Enzenberger, M., Muller, M., Arneson, B., Segal, R.: Fuego - an open-source framework for board games and go engine based on Monte Carlo tree search. IEEE Trans. Comput. Intell. AI Games 2(4), 259–270 (2010)CrossRef Enzenberger, M., Muller, M., Arneson, B., Segal, R.: Fuego - an open-source framework for board games and go engine based on Monte Carlo tree search. IEEE Trans. Comput. Intell. AI Games 2(4), 259–270 (2010)CrossRef
10.
Zurück zum Zitat Finnsson, H., Björnsson, Y.: Simulation-based approach to general game playing. In: AAAI, pp. 259–264 (2008) Finnsson, H., Björnsson, Y.: Simulation-based approach to general game playing. In: AAAI, pp. 259–264 (2008)
11.
Zurück zum Zitat Finnsson, H., Björnsson, Y.: Learning simulation control in general game-playing agents. In: AAAI (2010) Finnsson, H., Björnsson, Y.: Learning simulation control in general game-playing agents. In: AAAI (2010)
12.
13.
Zurück zum Zitat Gelly, S., Silver, D.: Monte-Carlo tree search and rapid action value estimation in computer go. Artif. Intell. 175(11), 1856–1875 (2011)MathSciNetCrossRef Gelly, S., Silver, D.: Monte-Carlo tree search and rapid action value estimation in computer go. Artif. Intell. 175(11), 1856–1875 (2011)MathSciNetCrossRef
14.
Zurück zum Zitat Genesereth, M.R., Love, N., Pell, B.: General game playing: overview of the AAAI competition. AI Mag. 26(2), 62–72 (2005) Genesereth, M.R., Love, N., Pell, B.: General game playing: overview of the AAAI competition. AI Mag. 26(2), 62–72 (2005)
15.
Zurück zum Zitat Huang, S., Arneson, B., Hayward, R.B., Müller, M., Pawlewicz, J.: Mohex 2.0: a pattern-based MCTS hex player. In: Computers and Games - 8th International Conference, CG 2013, Yokohama, Japan, 13–15 August 2013, Revised Selected Papers, pp. 60–71 (2013) Huang, S., Arneson, B., Hayward, R.B., Müller, M., Pawlewicz, J.: Mohex 2.0: a pattern-based MCTS hex player. In: Computers and Games - 8th International Conference, CG 2013, Yokohama, Japan, 13–15 August 2013, Revised Selected Papers, pp. 60–71 (2013)
16.
Zurück zum Zitat Huang, S.-C., Coulom, R., Lin, S.-S.: Monte-Carlo simulation balancing in practice. In: van den Herik, H.J., Iida, H., Plaat, A. (eds.) CG 2010. LNCS, vol. 6515, pp. 81–92. Springer, Heidelberg (2011) CrossRef Huang, S.-C., Coulom, R., Lin, S.-S.: Monte-Carlo simulation balancing in practice. In: van den Herik, H.J., Iida, H., Plaat, A. (eds.) CG 2010. LNCS, vol. 6515, pp. 81–92. Springer, Heidelberg (2011) CrossRef
17.
Zurück zum Zitat Kocsis, L., Szepesvári, C.: Bandit based Monte-Carlo planning. In: Fürnkranz, J., Scheffer, T., Spiliopoulou, M. (eds.) ECML 2006. LNCS (LNAI), vol. 4212, pp. 282–293. Springer, Heidelberg (2006) CrossRef Kocsis, L., Szepesvári, C.: Bandit based Monte-Carlo planning. In: Fürnkranz, J., Scheffer, T., Spiliopoulou, M. (eds.) ECML 2006. LNCS (LNAI), vol. 4212, pp. 282–293. Springer, Heidelberg (2006) CrossRef
18.
Zurück zum Zitat Lee, C., Wang, M., Chaslot, G., Hoock, J., Rimmel, A., Teytaud, O., Tsai, S., Hsu, S., Hong, T.: The computational intelligence of MoGo revealed in taiwan’s computer go tournaments. IEEE Trans. Comput. Intell. AI Games 1(1), 73–89 (2009)CrossRef Lee, C., Wang, M., Chaslot, G., Hoock, J., Rimmel, A., Teytaud, O., Tsai, S., Hsu, S., Hong, T.: The computational intelligence of MoGo revealed in taiwan’s computer go tournaments. IEEE Trans. Comput. Intell. AI Games 1(1), 73–89 (2009)CrossRef
19.
Zurück zum Zitat Lorentz, R., Horey, T.: Programming breakthrough. In: van den Herik, H.J., Iida, H., Plaat, A. (eds.) CG 2013. LNCS, vol. 8427, pp. 49–59. Springer, Heidelberg (2014) Lorentz, R., Horey, T.: Programming breakthrough. In: van den Herik, H.J., Iida, H., Plaat, A. (eds.) CG 2013. LNCS, vol. 8427, pp. 49–59. Springer, Heidelberg (2014)
20.
Zurück zum Zitat Méhat, J., Cazenave, T.: A parallel general game player. KI 25(1), 43–47 (2011) Méhat, J., Cazenave, T.: A parallel general game player. KI 25(1), 43–47 (2011)
21.
Zurück zum Zitat Pitrat, J.: Realization of a general game-playing program. IFIP Congr. 2, 1570–1574 (1968) Pitrat, J.: Realization of a general game-playing program. IFIP Congr. 2, 1570–1574 (1968)
22.
Zurück zum Zitat Rimmel, A., Teytaud, F., Cazenave, T.: Optimization of the nested Monte-Carlo algorithm on the traveling salesman problem with time windows. In: Di Chio, C., et al. (eds.) EvoApplications 2011, Part II. LNCS, vol. 6625, pp. 501–510. Springer, Heidelberg (2011) CrossRef Rimmel, A., Teytaud, F., Cazenave, T.: Optimization of the nested Monte-Carlo algorithm on the traveling salesman problem with time windows. In: Di Chio, C., et al. (eds.) EvoApplications 2011, Part II. LNCS, vol. 6625, pp. 501–510. Springer, Heidelberg (2011) CrossRef
23.
Zurück zum Zitat Rimmel, A., Teytaud, F., Teytaud, O.: Biasing Monte-Carlo simulations through RAVE values. In: van den Herik, H.J., Iida, H., Plaat, A. (eds.) CG 2010. LNCS, vol. 6515, pp. 59–68. Springer, Heidelberg (2011) CrossRef Rimmel, A., Teytaud, F., Teytaud, O.: Biasing Monte-Carlo simulations through RAVE values. In: van den Herik, H.J., Iida, H., Plaat, A. (eds.) CG 2010. LNCS, vol. 6515, pp. 59–68. Springer, Heidelberg (2011) CrossRef
24.
Zurück zum Zitat Rosin, C.D.: Nested rollout policy adaptation for Monte Carlo tree search. In: IJCAI, pp. 649–654 (2011) Rosin, C.D.: Nested rollout policy adaptation for Monte Carlo tree search. In: IJCAI, pp. 649–654 (2011)
25.
Zurück zum Zitat Saffidine, A., Jouandeau, N., Cazenave, T.: Solving breakthrough with race patterns and job-level proof number search. In: van den Herik, H.J., Plaat, A. (eds.) ACG 2011. LNCS, vol. 7168, pp. 196–207. Springer, Heidelberg (2012) CrossRef Saffidine, A., Jouandeau, N., Cazenave, T.: Solving breakthrough with race patterns and job-level proof number search. In: van den Herik, H.J., Plaat, A. (eds.) ACG 2011. LNCS, vol. 7168, pp. 196–207. Springer, Heidelberg (2012) CrossRef
26.
Zurück zum Zitat Swiechowski, M., Mandziuk, J.: Self-adaptation of playing strategies in general game playing. IEEE Trans. Comput. Intell. AI Games 6(4), 367–381 (2014)CrossRefMATH Swiechowski, M., Mandziuk, J.: Self-adaptation of playing strategies in general game playing. IEEE Trans. Comput. Intell. AI Games 6(4), 367–381 (2014)CrossRefMATH
27.
Zurück zum Zitat Tak, M.J.W., Winands, M.H.M., Björnsson, Y.: N-grams and the last-good-reply policy applied in general game playing. IEEE Trans. Comput. Intell. AI Games 4(2), 73–83 (2012)CrossRef Tak, M.J.W., Winands, M.H.M., Björnsson, Y.: N-grams and the last-good-reply policy applied in general game playing. IEEE Trans. Comput. Intell. AI Games 4(2), 73–83 (2012)CrossRef
28.
Zurück zum Zitat Uiterwijk, J.W.H.M.: Perfectly solving domineering boards. In: Cazenave, T., Winands, M.H.M., Lida, H. (eds.) CGW 2013. Communications in Computer and Information Science, vol. 408, pp. 97–121. Springer, Switzerland (2013) Uiterwijk, J.W.H.M.: Perfectly solving domineering boards. In: Cazenave, T., Winands, M.H.M., Lida, H. (eds.) CGW 2013. Communications in Computer and Information Science, vol. 408, pp. 97–121. Springer, Switzerland (2013)
Metadaten
Titel
Playout Policy Adaptation for Games
verfasst von
Tristan Cazenave
Copyright-Jahr
2015
DOI
https://doi.org/10.1007/978-3-319-27992-3_3