Verifying the Gaming Strategy of Self-learning Game by Using PRISM-Games

Zaw, Hein Htoo; Hlaing, Swe Zin

doi:10.1007/978-3-030-33585-4_15

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1072)

Included in the following conference series:

International Conference on Intelligent Computing & Optimization

Abstract

Reinforcement Learning (RL) has gained considerable popularity in computer science and is applied in fields such as gaming, intelligent robots, and remote sensing. The objective of reinforcement learning is to generate an optimal policy. The main problem with that policy is that it is not guaranteed to satisfy all of the system specifications. Model checking is a technique for verifying that a system meets its specifications. PRISM-games is a model-checking tool used to verify probabilistic systems with competitive or collaborative behavior. Safe Reinforcement Learning via Shielding is a method that uses a shield to restrict the actions of the RL agent when they would violate a specification expressed in temporal logic. This paper compares the winning strategies of three agents in the game of Tic-Tac-Toe: a Monte-Carlo Tree Search (MCTS) agent, an RL agent, and a shielded RL (SRL) agent that uses PRISM-games to restrict its actions. In over a thousand simulations, the experiments show that the MCTS agent has the highest win rate of the three, but that the losing rate of the shielded agent is reduced by using PRISM-games.
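The shielding idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation and it does not call PRISM-games: the function `is_safe` is a hypothetical stand-in for the model checker's verdict, and the safety specification assumed here ("never leave the opponent an immediate winning move") is chosen only for illustration. The board encoding (a 9-character string, row-major) and all function names are likewise assumptions.

```python
# All eight winning lines on a 3x3 board indexed 0..8 (row-major).
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if some line is complete, else None."""
    for a, b, c in LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return None


def legal_moves(board):
    """Indices of the empty cells."""
    return [i for i, cell in enumerate(board) if cell == ' ']


def place(board, move, player):
    """Return a new board string with `player` placed at `move`."""
    return board[:move] + player + board[move + 1:]


def is_safe(board, move, player):
    """Hypothetical stand-in for a model-checker verdict (assumption).

    A move counts as safe if it wins at once, or if the opponent has
    no immediately winning reply afterwards.
    """
    opponent = 'O' if player == 'X' else 'X'
    after = place(board, move, player)
    if winner(after) == player:
        return True
    return all(winner(place(after, reply, opponent)) != opponent
               for reply in legal_moves(after))


def shielded_action(board, player, proposed):
    """Pass the agent's proposed move through the shield.

    A safe proposal is returned unchanged; an unsafe one is replaced
    by the first safe legal move. If no safe move exists, the original
    proposal is kept, since the shield has nothing better to offer.
    """
    if is_safe(board, proposed, player):
        return proposed
    safe = [m for m in legal_moves(board) if is_safe(board, m, player)]
    return safe[0] if safe else proposed
```

For example, with the board `"OO  X    "` and X to move, a proposed move to cell 8 would leave O free to complete the top row, so the shield overrides it with cell 2. A learning agent can keep its own policy untouched and simply route every chosen action through `shielded_action` before playing it.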

References

Alshiekh, M., Bloem, R., Ehlers, R., Könighofer, B., Niekum, S., Topcu, U.: Safe reinforcement learning via shielding. In: The Thirty-Second AAAI Conference on Artificial Intelligence (2018)
Chen, P., Doan, J., Xu, E.: AI Agents for Ultimate Tic-Tac-Toe, 30 December 2018
Kwiatkowska, M., Norman, G., Parker, D.: PRISM 4.0: verification of probabilistic real-time systems. In: 23rd International Conference on Computer Aided Verification (2011)
Chen, T., Forejt, V., Kwiatkowska, M., Parker, D., Simaitis, A.: PRISM-games: a model checker for stochastic multiplayer games. In: 19th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (2013)
Ahantab, A., Filip, R.: Formal verification of RL-based approaches
Baier, H., Winands, M.H.M.: Monte-Carlo Tree Search and minimax hybrids
Jamieson, K.: Lecture 19: Monte Carlo Tree Search. CSE599i: Online and Adaptive Machine Learning, Winter (2018)
Mason, G., Calinescu, R., Kudenko, D., Banks, A.: Assured reinforcement learning with formally verified abstract policies. In: 9th International Conference on Agents and Artificial Intelligence (ICAART) (2017)
Mason, G., Calinescu, R., Banks, A.: Assured reinforcement learning for safety-critical applications. In: 10th International Conference on Agents and Artificial Intelligence (ICAART) (2017)
Kwiatkowska, M., Parker, D., Wiltsche, C.: PRISM-games: verification and strategy synthesis for stochastic multi-player games with multiple objectives. Int. J. Softw. Tools Technol. Transf. 20, 195–210 (2018)
Basset, N., Kwiatkowska, M., Wiltsche, C.: Compositional strategy synthesis for stochastic games with multiple objectives
Amrani, M., Lucio, L., Bibal, A.: A survey on the application of machine learning to formal verification
PRISM website. www.prismmodelchecker.org/
PRISM-games website. www.prismmodelchecker.org/games/

Acknowledgments

Foremost, I would like to express my sincere gratitude to my supervisor, Dr. Swe Zin Hlaing, for her continuous support of my research and for her patience, motivation, enthusiasm, and immense knowledge. I would also like to thank all the experts who were involved in developing PRISM and its extension PRISM-games.

Author information

Authors and Affiliations

University of Information Technology, Yangon, Myanmar
Hein Htoo Zaw & Swe Zin Hlaing

Corresponding author

Correspondence to Hein Htoo Zaw.

Editor information

Editors and Affiliations

Department of Fundamental and Applied Sciences, Universiti Teknologi Petronas, Tronoh, Perak, Malaysia
Pandian Vasant
Computer Science, FEI, VSB-TU Ostrava, Ostrava, Czech Republic
Ivan Zelinka
Faculty of Engineering Management, Poznan University of Technology, Poznan, Poland
Gerhard-Wilhelm Weber

About this paper

Cite this paper

Zaw, H.H., Hlaing, S.Z. (2020). Verifying the Gaming Strategy of Self-learning Game by Using PRISM-Games. In: Vasant, P., Zelinka, I., Weber, GW. (eds) Intelligent Computing and Optimization. ICO 2019. Advances in Intelligent Systems and Computing, vol 1072. Springer, Cham. https://doi.org/10.1007/978-3-030-33585-4_15

DOI: https://doi.org/10.1007/978-3-030-33585-4_15
Published: 27 October 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-33584-7
Online ISBN: 978-3-030-33585-4
eBook Packages: Intelligent Technologies and Robotics (R0)
