2024 | OriginalPaper | Chapter

Synthesizing Understandable Strategies

Author: Peter Backeman

Published in: Engineering of Computer-Based Systems

Publisher: Springer Nature Switzerland

Abstract

The result of reinforcement learning is often a Q-table that maps actions to future rewards. We propose using SMT solvers and strategy trees to generate a representation of a learned strategy in a format that a human can understand. We present the methodology and demonstrate it on a small game.
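
As a rough illustration of the starting point described in the abstract, the sketch below reads a greedy strategy off a Q-table by picking the highest-valued action in each state. The table layout (keys are state-action pairs), the numeric values, and the helper name `greedy_policy` are assumptions made for illustration; none of them are taken from the chapter, and the SMT-based synthesis step is not shown.

```python
from typing import Dict, Tuple

# Hypothetical Q-table: (state, action) -> estimated future reward.
# The numbers are invented for illustration only.
Q: Dict[Tuple[int, int], float] = {
    (1, 0): 0.9,
    (2, 0): -0.4, (2, 1): 0.8,
    (3, 0): -0.7, (3, 1): -0.6,
    (4, 0): 0.7, (4, 1): -0.5,
}


def greedy_policy(q: Dict[Tuple[int, int], float]) -> Dict[int, int]:
    """Return, for each state, the action with the highest Q-value."""
    best: Dict[int, int] = {}
    for (state, action), value in q.items():
        # Keep the action seen so far with the largest value for this state.
        if state not in best or value > q[(state, best[state])]:
            best[state] = action
    return best


policy = greedy_policy(Q)
```

A table like `policy` is exactly the kind of raw state-to-action listing that is hard to read at scale, which is the motivation for seeking a more compact, human-readable representation.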

Footnotes
1
As well as a set of meta-parameters to the learning algorithm, e.g., learning rate.
 
2
For simplicity, action \(a_2\) is forbidden when \(s = 1\), i.e., it is impossible to pick more sticks than remaining.
 
3
Where % is the remainder operator.
 
4
The subtraction of one comes from the action space being defined as \(\{a_1= 0, a_2 = 1\}\) instead of the number of sticks removed (\(\{a_1 = 1, a_2 = 2\}\)).
 
5
Interval constraints are added on edges, limiting the functions' domains for efficiency.
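
The action encoding mentioned in footnotes 2 and 4 can be sketched as follows. This is a minimal illustration that assumes only what those footnotes state: actions are encoded as 0 and 1, they correspond to removing 1 and 2 sticks, and an action is forbidden when it would remove more sticks than remain. The function names are hypothetical.

```python
from typing import List


def sticks_removed(action: int) -> int:
    """Map an encoded action (0 or 1) to the number of sticks removed (1 or 2)."""
    return action + 1


def action_for_sticks(sticks: int) -> int:
    """Inverse mapping: the 'subtraction of one' from footnote 4."""
    return sticks - 1


def legal_actions(s: int) -> List[int]:
    """Encoded actions allowed with s sticks remaining.

    An action is legal only if it removes at most s sticks, so the
    two-stick action is excluded when s == 1 (footnote 2).
    """
    return [a for a in (0, 1) if sticks_removed(a) <= s]
```

With one stick remaining, `legal_actions(1)` contains only the encoded action 0, while any state with at least two sticks allows both actions.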
 
Metadata
Title
Synthesizing Understandable Strategies
Author
Peter Backeman
Copyright Year
2024
DOI
https://doi.org/10.1007/978-3-031-49252-5_15
