
2018 | Original Paper | Book Chapter

Two-Stage Reinforcement Learning Algorithm for Quick Cooperation in Repeated Games

Authors: Wataru Fujita, Koichi Moriyama, Ken-ichi Fukui, Masayuki Numao

Published in: Transactions on Computational Collective Intelligence XXVIII

Publisher: Springer International Publishing


Abstract

People often learn their behavior from its outcomes, such as success and failure. Moreover, in the real world people are not alone; many interactions occur among them every day. To model such learning and interaction, we consider reinforcement learning agents playing games. Many reinforcement learning algorithms have been studied for obtaining good strategies in games, but most of them are "suspicious", i.e., they focus on avoiding exploitation by greedy opponents, so it takes a long time for such agents to establish cooperation. On the other hand, "innocent" agents, i.e., those prone to trust others, establish cooperation easily but are exploited by acquisitive opponents. In this work, we propose an algorithm that uses two complementary algorithms, an "innocent" one in the early stage and a "suspicious" one in the late stage. The algorithm allows an agent to cooperate with good associates quickly while also handling greedy opponents appropriately. Experiments in ten games showed that the proposed algorithm learned good strategies quickly in nine of them.
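To give a concrete picture of the two-stage idea, the following Python sketch pairs a simple aspiration-based ("innocent") learner with a stateless Q-value ("suspicious") learner and switches between them after a fixed number of rounds of a repeated 2x2 game. The specific component learners, the fixed switching point `switch_round`, and all parameter values are illustrative assumptions, not the paper's method; the actual algorithms and switching criterion are described in the full text.

```python
# Minimal sketch of a two-stage agent for a repeated 2x2 game (e.g., the
# Prisoner's Dilemma). The learners, the switching rule, and the parameter
# values below are illustrative assumptions only.
import random

ACTIONS = ["C", "D"]  # cooperate / defect


class TwoStageAgent:
    def __init__(self, switch_round=200, alpha=0.1, epsilon=0.1):
        self.switch_round = switch_round  # assumed fixed stage boundary
        self.alpha = alpha                # learning / adaptation rate
        self.epsilon = epsilon            # exploration rate in stage 2
        self.t = 0
        # Stage 1: "innocent" aspiration-based learner that keeps its current
        # action as long as the payoff meets the aspiration level.
        self.aspiration = 3.0
        self.current = random.choice(ACTIONS)
        # Stage 2: "suspicious" stateless Q-learner over the agent's own actions.
        self.q = {a: 0.0 for a in ACTIONS}

    def act(self):
        self.t += 1
        if self.t <= self.switch_round:
            return self.current
        if random.random() < self.epsilon:
            return random.choice(ACTIONS)
        return max(self.q, key=self.q.get)

    def observe(self, action, payoff):
        if self.t <= self.switch_round:
            # Stage 1 update: change action only when the payoff falls below
            # the (slowly adapting) aspiration level.
            if payoff < self.aspiration:
                self.current = "D" if action == "C" else "C"
            self.aspiration += self.alpha * (payoff - self.aspiration)
        else:
            # Stage 2 update: running Q-value estimate for a single-state game.
            self.q[action] += self.alpha * (payoff - self.q[action])
```

In this sketch the early stage tends to settle on mutual cooperation against similarly trusting partners, while the late-stage Q-learner protects the agent against opponents that keep defecting; the hand-off point is a design choice the sketch leaves as a plain constant.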


Metadata
Title
Two-Stage Reinforcement Learning Algorithm for Quick Cooperation in Repeated Games
Authors
Wataru Fujita
Koichi Moriyama
Ken-ichi Fukui
Masayuki Numao
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-319-78301-7_3