
2015 | Original Paper | Book Chapter

The Application of Co-evolutionary Genetic Programming and TD(1) Reinforcement Learning in Large-Scale Strategy Game VCMI

Authors: Łukasz Wilisowski, Rafał Dreżewski

Published in: Agent and Multi-Agent Systems: Technologies and Applications

Publisher: Springer International Publishing


Abstract

VCMI is a new open-source project that could become one of the largest testing platforms for modern AI algorithms. Its complex environment and turn-based gameplay make it well suited to AI-driven solutions, and its large community of active players improves the testability of candidate algorithms. This paper explores VCMI's environment and assesses its complexity by providing a baseline solution to the battle-handling problem using two global optimization algorithms: co-evolution of genetic programming trees and the TD(1) algorithm with a backpropagation neural network. Both algorithms were used in VCMI to evolve battle strategies through a fully autonomous learning process. Finally, the obtained strategies were tested against existing solutions and compared with players' best tactics.
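The TD(1) method named in the abstract is TD(λ) with λ = 1, i.e. temporal-difference learning whose accumulating eligibility traces make each update equivalent to an every-visit Monte Carlo return. A minimal sketch of that update rule is below, using a linear value approximator on a toy random-walk task in place of the paper's backpropagation network; the function names and the environment are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def td1_episode(theta, states, rewards, alpha=0.05, gamma=1.0, lam=1.0):
    """One TD(lambda) pass over an episode; with lam=1 this is TD(1).

    theta   -- weight vector of a linear value function V(s) = theta . s
    states  -- feature vectors s_0 .. s_T (one more than rewards)
    rewards -- rewards r_1 .. r_T received after each transition
    """
    e = np.zeros_like(theta)                      # eligibility trace
    for t in range(len(rewards)):
        s, s_next = states[t], states[t + 1]
        v = theta @ s
        v_next = 0.0 if t == len(rewards) - 1 else theta @ s_next
        delta = rewards[t] + gamma * v_next - v   # TD error
        e = gamma * lam * e + s                   # accumulate grad of V(s)
        theta = theta + alpha * delta * e
    return theta

# Toy 5-state random walk (states 0..4, start in the middle): one-hot
# features, reward 1 only on stepping off the right end.
rng = np.random.default_rng(0)
theta = np.zeros(5)
for _ in range(500):
    pos, feats, rews = 2, [np.eye(5)[2]], []
    while 0 <= pos <= 4:
        pos += rng.choice([-1, 1])
        rews.append(1.0 if pos > 4 else 0.0)
        feats.append(np.eye(5)[pos] if 0 <= pos <= 4 else np.zeros(5))
    theta = td1_episode(theta, feats, rews)
# theta[i] now approximates the probability of terminating on the right
# when starting from state i (true values 1/6 .. 5/6).
```

Replacing the dot product with a neural network forward pass, and the feature vector in the trace update with the network's gradient, gives the TD(1)-plus-backpropagation combination the paper describes.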


Metadata
Title
The Application of Co-evolutionary Genetic Programming and TD(1) Reinforcement Learning in Large-Scale Strategy Game VCMI
Authors
Łukasz Wilisowski
Rafał Dreżewski
Copyright year
2015
DOI
https://doi.org/10.1007/978-3-319-19728-9_7