2018 | OriginalPaper | Chapter

Two-Stage Reinforcement Learning Algorithm for Quick Cooperation in Repeated Games

Authors: Wataru Fujita, Koichi Moriyama, Ken-ichi Fukui, Masayuki Numao

Published in: Transactions on Computational Collective Intelligence XXVIII

Publisher: Springer International Publishing

Abstract

People often learn their behavior from its outcome, e.g., success and failure. Moreover, in the real world people are not alone; many interactions occur among them every day. To model such learning and interactions, we consider reinforcement learning agents playing games. Researchers have proposed many reinforcement learning algorithms for obtaining good strategies in games. However, most of these algorithms are “suspicious”, i.e., they focus on avoiding exploitation by greedy opponents, so it takes a long time for such agents to establish cooperation. On the other hand, “innocent” agents, i.e., those prone to trust others, establish cooperation easily but are exploited by acquisitive opponents. In this work, we propose an algorithm that uses two complementary algorithms: an “innocent” one in the early stage and a “suspicious” one in the late stage. This allows the agent to cooperate with good associates quickly while also handling greedy opponents well. Experiments in ten games showed that the proposed algorithm quickly learned good strategies in nine of them.
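The abstract describes the two-stage idea only at a high level. As a rough illustration, the following is a minimal Python sketch of one possible reading: an "innocent" satisficing-style rule in the early rounds, followed by a "suspicious" epsilon-greedy Q-learner, playing an iterated Prisoner's Dilemma in self-play. The payoff matrix, the stage boundary switch_round, the aspiration rule, and all learning parameters are illustrative assumptions, not the algorithms or settings used in the paper.

```python
import random

# Payoffs for the row player in a Prisoner's Dilemma (C = cooperate, D = defect).
# These values, and every algorithmic detail below, are illustrative assumptions.
PAYOFF = {('C', 'C'): 3, ('C', 'D'): 0, ('D', 'C'): 5, ('D', 'D'): 1}
ACTIONS = ['C', 'D']


class TwoStageAgent:
    """Plays an 'innocent' aspiration-based rule early, then a 'suspicious'
    epsilon-greedy Q-learner whose state is the opponent's last action."""

    def __init__(self, switch_round=200, alpha=0.1, gamma=0.95, epsilon=0.1):
        self.switch_round = switch_round        # hypothetical stage boundary
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.aspiration = 3.0                   # start by aspiring to mutual cooperation
        self.q = {(s, a): 0.0 for s in ACTIONS for a in ACTIONS}
        self.last_action = 'C'
        self.round = 0

    def act(self, opp_last):
        self.round += 1
        if self.round <= self.switch_round:
            # Innocent stage: repeat the previous action; update() switches it
            # only when the payoff falls below the aspiration level.
            return self.last_action
        # Suspicious stage: epsilon-greedy over learned Q-values.
        if random.random() < self.epsilon:
            return random.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.q[(opp_last, a)])

    def update(self, my_action, opp_action, opp_last):
        reward = PAYOFF[(my_action, opp_action)]
        if self.round <= self.switch_round:
            # Satisficing-style rule: switch actions if dissatisfied.
            if reward < self.aspiration:
                self.last_action = 'D' if my_action == 'C' else 'C'
            else:
                self.last_action = my_action
            self.aspiration += 0.01 * (reward - self.aspiration)  # slow aspiration tracking
        else:
            # One-step Q-learning update; the next state is the opponent's current action.
            best_next = max(self.q[(opp_action, a)] for a in ACTIONS)
            key = (opp_last, my_action)
            self.q[key] += self.alpha * (reward + self.gamma * best_next - self.q[key])


def play(rounds=1000):
    """Self-play between two two-stage agents in the repeated game."""
    a, b = TwoStageAgent(), TwoStageAgent()
    opp_a, opp_b = 'C', 'C'     # assume cooperation before the first round
    total_a = 0
    for _ in range(rounds):
        act_a, act_b = a.act(opp_a), b.act(opp_b)
        a.update(act_a, act_b, opp_a)
        b.update(act_b, act_a, opp_b)
        opp_a, opp_b = act_b, act_a
        total_a += PAYOFF[(act_a, act_b)]
    return total_a / rounds


if __name__ == '__main__':
    print('average payoff per round:', play())
```

In this sketch the two agents settle into mutual cooperation during the innocent stage, and the later Q-learning stage only has to defend that cooperation against defection; the choice of the concrete algorithms for each stage is the design question the chapter addresses.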

Metadata
Title
Two-Stage Reinforcement Learning Algorithm for Quick Cooperation in Repeated Games
Authors
Wataru Fujita
Koichi Moriyama
Ken-ichi Fukui
Masayuki Numao
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-319-78301-7_3
