
2022 | Original Paper | Book Chapter

25. IBPO: Solving 3D Strategy Game with the Intrinsic Reward

Authors: Huale Li, Rui Cao, Xiaohan Hou, Xuan Wang, Linlin Tang, Jiajia Zhang, Shuhan Qi

Published in: Advances in Smart Vehicular Technology, Transportation, Communication and Applications

Publisher: Springer Singapore


Abstract

In recent years, deep reinforcement learning has achieved great success in many fields, especially in games, as demonstrated by AlphaGo, AlphaZero, and AlphaStar. However, reward sparsity remains a problem in 3D strategy games, which have higher-dimensional state spaces and more complex game scenarios. To address this problem, we propose an intrinsic-reward-based policy optimization algorithm (IBPO). IBPO incorporates an intrinsic reward into traditional policy optimization through a differential fusion mechanism and a modified value network. Experimental results show that our method outperforms previous methods on VizDoom.
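The abstract does not spell out the differential fusion mechanism, but the general idea of intrinsic-reward methods is to add a weighted exploration bonus (e.g., an ICM-style prediction error [8]) to the sparse extrinsic reward before computing returns. The sketch below illustrates only that general idea; the function names, the weight `beta0`, and the decay schedule are hypothetical and are not taken from the paper.

```python
import numpy as np

def fuse_rewards(r_ext, r_int, step, beta0=0.01, decay=1e-4):
    """Blend extrinsic and intrinsic rewards.

    Hypothetical fusion rule (not the paper's): the intrinsic weight
    decays over training, so exploration dominates early on and the
    sparse task reward dominates later.
    """
    beta = beta0 / (1.0 + decay * step)
    return r_ext + beta * r_int

def discounted_returns(rewards, gamma=0.99):
    """Standard discounted return, computed backwards over one episode."""
    returns = np.zeros_like(rewards, dtype=np.float64)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns

# Toy episode: sparse extrinsic reward, dense curiosity-style intrinsic bonus.
r_ext = np.array([0.0, 0.0, 0.0, 0.0, 1.0])    # reward only at the goal state
r_int = np.array([0.3, 0.2, 0.25, 0.1, 0.05])  # e.g. ICM prediction error
fused = fuse_rewards(r_ext, r_int, step=1000)
print(discounted_returns(fused))
```

Without the intrinsic term, every pre-goal step in this toy episode carries zero immediate reward, which is exactly the sparsity problem the paper targets.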


References
1. Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A.A., Veness, J., Bellemare, M.G., Graves, A., Riedmiller, M., Fidjeland, A.K., Ostrovski, G., et al.: Human-level control through deep reinforcement learning. Nature 518(7540), 529 (2015)
2. Silver, D., Schrittwieser, J., Simonyan, K., Antonoglou, I., Huang, A., Guez, A., Hubert, T., Baker, L., Lai, M., Bolton, A., et al.: Mastering the game of Go without human knowledge. Nature 550(7676), 354 (2017)
3. Kempka, M., Wydmuch, M., Runc, G., Toczek, J., Jaśkowski, W.: ViZDoom: a Doom-based AI research platform for visual reinforcement learning. In: Proceedings of the 2016 IEEE Conference on Computational Intelligence and Games, pp. 1–8. IEEE (2016)
4. Shao, K., Zhu, Y., Zhao, D.: StarCraft micromanagement with reinforcement learning and curriculum transfer learning. IEEE Trans. Emerg. Topics Comput. Intell. 3(1), 73–84 (2018)
5. Vinyals, O., Babuschkin, I., Chung, J., et al.: Grandmaster level in StarCraft II using multi-agent reinforcement learning. Nature 575(7782), 350–354 (2019)
6. Kulkarni, T.D., Narasimhan, K., Saeedi, A., Tenenbaum, J.: Hierarchical deep reinforcement learning: integrating temporal abstraction and intrinsic motivation. In: Proceedings of the Advances in Neural Information Processing Systems, pp. 3675–3683 (2016)
7. Song, S., Weng, J., Su, H., Yan, D., Zou, H., Zhu, J.: Playing FPS games with environment-aware hierarchical reinforcement learning. In: Proceedings of the 28th International Joint Conference on Artificial Intelligence, pp. 3475–3482. AAAI Press (2019)
8. Pathak, D., Agrawal, P., Efros, A.A., Darrell, T.: Curiosity-driven exploration by self-supervised prediction. In: Proceedings of the International Conference on Machine Learning, pp. 2778–2787 (2017)
9. Mnih, V., Badia, A.P., Mirza, M., Graves, A., Lillicrap, T., Harley, T., Silver, D., Kavukcuoglu, K.: Asynchronous methods for deep reinforcement learning. In: Proceedings of the International Conference on Machine Learning, pp. 1928–1937 (2016)
10. Zhang, S., Su, X., Jiang, X., Chen, M., Wu, T.: A traffic prediction method of bicycle-sharing based on long and short term memory network. J. Network Intell. 4(2), 17–29 (2019)
11. Wang, Y., Wang, J., Deng, C., Zhu, H., Wang, S.: L1–L2 norms based target representation for visual tracking. J. Network Intell. 3(2), 102–112 (2018)
12. Lample, G., Chaplot, D.S.: Playing FPS games with deep reinforcement learning. In: Proceedings of the 31st AAAI Conference on Artificial Intelligence, pp. 2140–2146 (2017)
13. Dosovitskiy, A., Koltun, V.: Learning to act by predicting the future. In: Proceedings of the International Conference on Learning Representations, pp. 637–645 (2017)
Metadata
Title
IBPO: Solving 3D Strategy Game with the Intrinsic Reward
Authors
Huale Li
Rui Cao
Xiaohan Hou
Xuan Wang
Linlin Tang
Jiajia Zhang
Shuhan Qi
Copyright Year
2022
Publisher
Springer Singapore
DOI
https://doi.org/10.1007/978-981-16-4039-1_25
