Skip to main content

Verifying the Gaming Strategy of Self-learning Game by Using PRISM-Games

  • Conference paper
  • First Online:
Intelligent Computing and Optimization (ICO 2019)

Part of the book series: Advances in Intelligent Systems and Computing ((AISC,volume 1072))

Included in the following conference series:

Abstract

Reinforcement Learning (RL) gained a huge amount of popularity in computer science; applied in fields such as gaming, intelligent robots, remote sensing, and so on. The objective of reinforcement learning is to generate the optimal policy. The main problem of that optimal policy is that it is not fully guaranteed to be satisfied all the system specifications. Model checking is a technique to verify the system to meet the system specifications. PRISM-games is one of the model-checking tools that is used to verify the probabilistic system with competitive or collaborative behavior. Safe Reinforcement Learning via Shielding is a method that uses shield to restrict the action of the RL agent if it violates the specification using temporal logic. This paper presents to compare the winning strategies between three agents; Monte-Carlo Tree Search agent (MCTS), RL agent and shielded RL agent (SRL) which uses PRISM-games to restrict the action based on Tic-Tac-Toe game. Over thousand times of simulations has been made, the experiments show that MCTS agent has the highest win rate compared to other agents, but the losing rate of the shielded agent is reduced by using PRISM-games.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 129.00
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 169.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

  1. Alshiekh, M., Bloem, R., Ehlers, R., Kὄnighofer, B., Niekum, S., Topcu, U.: Safe reinforcement learning via shielding. In: The Thirty-Second AAAI Conference on Artificial Intelligence (2018)

    Google Scholar 

  2. Chen, P., Doan, J., Xu, E.: AI Agents for Ultimate Tic-Tac-Toe, 30 December 2018

    Google Scholar 

  3. Kwiatkowska, M., Norman, G., Parker, D.: PRISM: 4.0: verification of probabilistic real-time systems. In: 23rd International Conference on Computer Aided Verification (2011)

    Google Scholar 

  4. Chen, T., Forejt, V., Kwiakowskam, M., Parker, D., Simaitis, A.: PRISM-games: a model checker for stochastic multiplayer games. In: 19th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (2013)

    Google Scholar 

  5. Ahantab, A., Filip, R.: Formal verification of RL-based approaches

    Google Scholar 

  6. Baier, H., Winands, M.H.M.: Monte-Carlo Tree Search and minimax hybrids

    Google Scholar 

  7. Jamieson, K.: Lecture 19: Monte Carlo Tree Search. CSE599i: Online and Adaptive Machine Learning, Winter (2018)

    Google Scholar 

  8. Mason, G., Calinescu, R., Kudenko, D., Banks, A.: Assured reinforcement learning with formally verified abstract policies. In: 9th International Conference on Agents and Artificial Intelligence (ICAART) (2017)

    Google Scholar 

  9. Mason, G., Calinescu, R., Banks, A.: Assured reinforcement learning for safety-critical applications. In: 10th International Conference on Agents and Artificial Intelligence (ICAART) (2017)

    Google Scholar 

  10. Kwiatkowska, M., Parker, D., Wiltsche, C.: PRISM-games: verification and strategy synthesis for stochastic multi-player games with multiple objectives. Int. J. Softw. Tools Technol. Transf. 20, 195–210 (2018)

    Article  Google Scholar 

  11. Basset, N., Kwiatkowska, M., Wiltsche, C.: Compositional strategy synthesis for stochastic games with multiple objectives

    Google Scholar 

  12. Amrani, M., Lucio, L., Bibal, A.: A survey on the application of machine learning to formal verification

    Google Scholar 

  13. PRISM website. www.prismmodelchecker.org/

  14. PRISM-games website. www.prismmodelchecker.org/games/

Download references

Acknowledgments

Foremost, I would like to express my sincere gratitude to my supervisor, Dr. Swe Zin Hlaing for the continuous support of my research, for her patience, motivation, enthusiasm, and immense knowledge. And I would also like to thank all the experts who were involved in developing PRISM and its extension PRISM-Games.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Hein Htoo Zaw .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Zaw, H.H., Hlaing, S.Z. (2020). Verifying the Gaming Strategy of Self-learning Game by Using PRISM-Games. In: Vasant, P., Zelinka, I., Weber, GW. (eds) Intelligent Computing and Optimization. ICO 2019. Advances in Intelligent Systems and Computing, vol 1072. Springer, Cham. https://doi.org/10.1007/978-3-030-33585-4_15

Download citation

Publish with us

Policies and ethics