Published in: Theory and Decision 1/2015

1 July 2015

Stubborn learning

Authors: Jean-François Laslier, Bernard Walliser

Abstract

The paper studies a specific adaptive learning rule for settings in which each player faces a unidimensional strategy set. The rule states that a player keeps incrementing her strategy in the same direction if her utility increased and reverses direction if it decreased. The paper concentrates on games on the square \([0,1]\times [0,1]\), viewed as mixed extensions of \(2\times 2\) games. We study the general behavior of the system in the interior as well as on the borders of the strategy space. We then describe the system's asymptotic behavior for symmetric, zero-sum, and twin games. Original patterns emerge. For instance, for the “prisoner’s dilemma” with symmetric initial conditions, the system goes directly to the symmetric Pareto optimum. For “matching pennies,” the system follows slowly expanding cycles around the mixed strategy equilibrium.
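
The rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the step size, how ties in utility are handled, or what happens on the borders of the square, so the choices below are assumptions made for illustration only. Here both players move simultaneously by a fixed step, keep their direction when utility is unchanged, and are clipped to \([0,1]\) on the borders. The names `expected_payoff`, `clip01`, `stubborn_learning`, `PD_A`, and `PD_B`, as well as the prisoner's dilemma payoff values, are hypothetical.

```python
from typing import List, Tuple

Matrix = List[List[float]]

# Illustrative prisoner's dilemma payoffs (assumed values, not from the paper).
# Row/column 0 = cooperate, row/column 1 = defect.
PD_A: Matrix = [[3.0, 0.0], [5.0, 1.0]]  # payoffs to player 1
PD_B: Matrix = [[3.0, 5.0], [0.0, 1.0]]  # payoffs to player 2


def expected_payoff(m: Matrix, x: float, y: float) -> float:
    """Expected payoff in the mixed extension of a 2x2 game.

    x is the probability that player 1 plays row 0,
    y is the probability that player 2 plays column 0.
    """
    return (m[0][0] * x * y + m[0][1] * x * (1 - y)
            + m[1][0] * (1 - x) * y + m[1][1] * (1 - x) * (1 - y))


def clip01(v: float) -> float:
    """Keep a strategy inside [0, 1] (assumed border treatment)."""
    return min(1.0, max(0.0, v))


def stubborn_learning(a: Matrix, b: Matrix, x: float, y: float,
                      dx: int, dy: int, step: float = 1 / 64,
                      periods: int = 200) -> List[Tuple[float, float]]:
    """Simulate the directional rule; dx and dy are initial directions (+1 or -1)."""
    u1, u2 = expected_payoff(a, x, y), expected_payoff(b, x, y)
    path = [(x, y)]
    for _ in range(periods):
        # Both players move simultaneously in their current direction.
        x = clip01(x + dx * step)
        y = clip01(y + dy * step)
        new_u1, new_u2 = expected_payoff(a, x, y), expected_payoff(b, x, y)
        # Reverse direction only after a strict decrease in own utility
        # (keeping direction on a tie is an assumption).
        if new_u1 < u1:
            dx = -dx
        if new_u2 < u2:
            dy = -dy
        u1, u2 = new_u1, new_u2
        path.append((x, y))
    return path


if __name__ == "__main__":
    # Symmetric start, both players initially moving toward defection.
    path = stubborn_learning(PD_A, PD_B, 0.25, 0.25, -1, -1)
    print(path[0], path[-1])
```

Under these assumptions, a symmetric start in the illustrative prisoner's dilemma ends at `(1.0, 1.0)`, that is, mutual cooperation, which is consistent with the abstract's statement about the symmetric Pareto optimum. The step `1/64` and start `0.25` are chosen so that the floating-point arithmetic is exact; other values may behave slightly differently near the borders.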

Footnotes
1
Within the class of symmetric games, generically, \(E\ne 0\). But this rules out games which are both symmetric and zero-sum. Generic zero-sum games will be treated in Sect. 4.
 
2
Within the class of symmetric games, generically, \(b\ne c\). But this rules out games which are both symmetric and twin. Generic twin games will be treated in Sect. 5.
 
Metadata
Title
Stubborn learning
Authors
Jean-François Laslier
Bernard Walliser
Publication date
1 July 2015
Publisher
Springer US
Published in
Theory and Decision / Issue 1/2015
Print ISSN: 0040-5833
Electronic ISSN: 1573-7187
DOI
https://doi.org/10.1007/s11238-014-9450-3