Published in: Quantitative Marketing and Economics, Issue 4/2012

1 December 2012

Dynamic learning in behavioral games: A hidden Markov mixture of experts approach

Authors: Asim Ansari, Ricardo Montoya, Oded Netzer

Abstract

Over the course of a repeated game, players often exhibit learning in selecting their best response. Research in economics and marketing has identified two key types of learning rules: belief and reinforcement. It has been shown that players use either one of these learning rules or a combination of them, as in the Experience-Weighted Attraction (EWA) model. Accounting for such learning may help in understanding and predicting the outcomes of games. In this research, we demonstrate that players not only employ learning rules to determine what actions to choose based on past choices and outcomes, but also change their learning rules over the course of the game. We investigate the degree of state dependence in learning and uncover the latent learning rules and learning paths used by the players. We build a non-homogeneous hidden Markov mixture of experts model which captures shifts between different learning rules over the course of a repeated game. The transition between the learning rule states can be affected by the players’ experiences in the previous round of the game. We empirically validate our model using data from six games that have been previously used in the literature. We demonstrate that one can obtain a richer understanding of how different learning rules impact the observed strategy choices of players by accounting for the latent dynamics in the learning rules. In addition, we show that such an approach can improve our ability to predict observed choices in games.
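
The idea of latent learning-rule states with experience-dependent transitions can be illustrated with a forward filter. The block below is a minimal sketch, not the authors' specification or estimation procedure: it assumes two learning-rule "experts", a single scalar covariate `x[t]` summarizing the previous round's experience, and a multinomial-logit transition matrix. The names `transition_matrix`, `forward_filter`, `gamma`, and `expert_probs` are hypothetical.

```python
import numpy as np

def transition_matrix(x, gamma):
    """Row-stochastic transition matrix that depends on a covariate x.

    gamma has shape (S, S, 2): an intercept and a slope on x for every
    (from-state, to-state) pair. This parameterization is an assumption
    made for illustration only.
    """
    logits = gamma[..., 0] + gamma[..., 1] * x
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)

def forward_filter(expert_probs, x, gamma, init):
    """Filtered state probabilities and log-likelihood of observed choices.

    expert_probs[t, s]: probability that learning rule s assigns to the
                        action the player actually chose in round t.
    x[t]:               covariate from the previous round (x[0] is unused).
    init:               initial distribution over learning-rule states.
    """
    T, S = expert_probs.shape
    filtered = np.zeros((T, S))
    loglik = 0.0
    prior = init
    for t in range(T):
        if t > 0:
            # Propagate last round's filtered belief through the
            # covariate-dependent (non-homogeneous) transition matrix.
            prior = filtered[t - 1] @ transition_matrix(x[t], gamma)
        joint = prior * expert_probs[t]   # state prior times expert likelihood
        norm = joint.sum()                # one-step-ahead choice probability
        filtered[t] = joint / norm
        loglik += np.log(norm)
    return filtered, loglik

# Toy example: expert 0 explains the observed choices better in later rounds.
expert_probs = np.array([[0.5, 0.5], [0.7, 0.2], [0.8, 0.1]])
x = np.array([0.0, 1.0, -1.0])
gamma = np.zeros((2, 2, 2))               # all-zero gamma gives uniform transitions
init = np.array([0.5, 0.5])
filtered, loglik = forward_filter(expert_probs, x, gamma, init)
print(filtered)
print(loglik)
```

In this sketch each row of `filtered` is the posterior probability that the player is using each learning rule in that round, given the choices observed so far. A nonzero slope in `gamma` would let the previous round's experience shift those probabilities.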

Footnotes
1
The model could be extended to capture more elaborate reinforcement behaviors (e.g., Erev and Roth 1998; Roth and Erev 1995).
 
2
Note that in some EWA papers ρ_i is replaced by ϕ_i(1 − κ_i). We decided to keep the original notation used in Camerer and Ho (1999). The transformation of the results is straightforward.
 
3
Alternative functional forms for the choice probabilities have been proposed in the literature (see e.g., Camerer and Ho 1998, 1999; Erev and Roth 1998). However, the logit formulation has consistently shown equal or better fit and predictive ability.
 
4
Accordingly, the first period in the likelihood function in Eq. 9 corresponds to the second period of the game.
 
5
One can allow for a richer specification of heterogeneity by using a mixture of normals (e.g., Allenby et al. 1998) or by using Mixtures of Dirichlet Process priors (MDP, e.g., Ansari and Mela 2003; Ansari and Iyengar 2006).
 
6
We conducted a series of simulations to analyze the empirical identification of the NH-HMME parameters (Salmon 2001). We found that for data that mimics the data in our empirical application, the model parameters can be correctly recovered (see Appendix B for details).
 
7
For the p-Beauty game, we follow Camerer and Ho (1999) and assume that players knew only the winning number and neglected the effect of their own choice on the target number. We modified the computation of forgone payoffs accordingly.
 
8
We thank Professor Teck-Hua Ho of the University of California, Berkeley, and the authors of the corresponding papers for generously providing us with the behavioral games experimental data.
 
9
To ensure that N is non-decreasing over time, we follow the previous literature by imposing N_0 ≤ 1/(1 − ρ), where N_0 = N(t = 0), for the belief, EWA, ME, HMMs, HMME, and NH-HMME models (Ho et al. 2002). For the time-varying EWA, we impose this constraint by constraining ρ_t to be non-decreasing in t.
 
10
Estimating our EWA models without heterogeneity resulted in estimates that are very similar to those reported in the literature.
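
The quantities mentioned in footnotes 2, 3, and 9 can be illustrated with a short sketch. It assumes the standard EWA updating equations of Camerer and Ho (1999), N(t) = ρN(t − 1) + 1 and A_j(t) = [ϕN(t − 1)A_j(t − 1) + (δ + (1 − δ)I_j)π_j] / N(t), together with a logit choice rule; the exact specification used in the paper may differ. The function names and all numeric values are hypothetical.

```python
import numpy as np

def rho_from_phi_kappa(phi, kappa):
    """Footnote 2: map the (phi, kappa) parameterization to rho."""
    return phi * (1.0 - kappa)

def n0_upper_bound(rho):
    """Footnote 9: largest N_0 that keeps N(t) non-decreasing (needs rho < 1)."""
    return 1.0 / (1.0 - rho)

def ewa_update(A_prev, N_prev, payoffs, chosen, phi, rho, delta):
    """One round of the standard EWA attraction update.

    payoffs[j] is the payoff strategy j earned or would have earned;
    forgone payoffs of unchosen strategies are weighted by delta.
    """
    N_new = rho * N_prev + 1.0
    weights = np.full(len(A_prev), delta)
    weights[chosen] = 1.0                  # delta + (1 - delta) * 1 for the chosen strategy
    A_new = (phi * N_prev * A_prev + weights * payoffs) / N_new
    return A_new, N_new

def logit_choice(A, lam):
    """Footnote 3: logit choice probabilities with sensitivity lam."""
    z = lam * A
    z = z - z.max()                        # numerical stability
    p = np.exp(z)
    return p / p.sum()

# Illustrative values only.
phi, kappa, delta, lam = 0.9, 0.5, 0.5, 1.0
rho = rho_from_phi_kappa(phi, kappa)
A, N = np.zeros(2), 1.0                    # N_0 = 1 is below n0_upper_bound(rho)
A, N = ewa_update(A, N, np.array([1.0, 3.0]), chosen=0, phi=phi, rho=rho, delta=delta)
print(rho, n0_upper_bound(rho), N)
print(logit_choice(A, lam))
```

The bound follows from N(t) − N(t − 1) = 1 − (1 − ρ)N(t − 1), which is non-negative exactly when N(t − 1) ≤ 1/(1 − ρ). Starting at or below that value therefore keeps N from decreasing in every later round.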
 
References
Allenby, G., Arora, N., Ginter, J. (1998). On the heterogeneity of demand. Journal of Marketing Research, 35(3), 384–389.
Ansari, A., & Iyengar, R. (2006). Semiparametric Thurstonian models for recurrent choices: a Bayesian analysis. Psychometrika, 71(4), 631–657.
Ansari, A., & Mela, C. (2003). E-customization. Journal of Marketing Research, 40(2), 131–145.
Amaldoss, W., Chong, J.-K., Ho, T.H. (2005). Choosing the right pond: EWA learning in games with different group sizes. Working paper, Department of Marketing, University of California, Berkeley.
Atchadé, Y.F. (2006). An adaptive version for the Metropolis adjusted Langevin algorithm with a truncated drift. Methodology and Computing in Applied Probability, 8(2), 235–254.
Brown, G.W. (1951). Iterative solution of games by fictitious play. In T.C. Koopmans (Ed.), Activity analysis of production and allocation. New York: Wiley.
Camerer, C. (2003). Behavioral game theory: Experiments on strategic interaction. Princeton: Princeton University Press.
Camerer, C., & Ho, T.H. (1998). EWA learning in coordination games: probability rules, heterogeneity, and time variation. Journal of Mathematical Psychology, 42, 305–326.
Camerer, C., & Ho, T.H. (1999). Experience-weighted attraction learning in games. Econometrica, 67, 827–874.
Camerer, C., Ho, T.H., Chong, J.-K. (2002a). Sophisticated learning and strategic teaching. Journal of Economic Theory, 104, 137–188.
Camerer, C., Ho, T.H., Chong, J.-K. (2003). Models of thinking, and teaching in games. The American Economic Review, 93(2), 192–195.
Camerer, C., Ho, T.H., Chong, J.-K. (2007). Self-tuning experience weighted attraction learning in games. Journal of Economic Theory, 133, 177–198.
Camerer, C., Hsia, D., Ho, T.H. (2002b). EWA learning in bilateral call markets. In A. Rapoport, & R. Zwick (Eds.), Experimental business research.
Cournot, A. (1960). Recherches sur les Principes Mathematiques de la Theorie des Richesses. Translated into English by N. Bacon as Researches in the Mathematical Principles of the Theory of Wealth. London: Haffner.
Crawford, V. (1995). Adaptive dynamics in coordination games. Econometrica, 63, 103–143.
Erev, I., & Haruvy, E. (2005). Generality, repetition, and the role of descriptive learning models. Journal of Mathematical Psychology, 49(5), 357–371.
Erev, I., & Haruvy, E. (2012). Learning and the economics of small decisions. In J. Kagel, & A. Roth (Eds.), The handbook of experimental economics (Vol. 2).
Erev, I., & Roth, A. (1998). Predicting how people play games: reinforcement learning in experimental games with unique mixed strategy equilibria. American Economic Review, 88, 848–881.
Erev, I., & Roth, A. (2007). Multi-agent learning and the descriptive value of simple models. Artificial Intelligence, 171, 423–428.
Fudenberg, D., & Levine, D. (1998). The theory of learning in games. Cambridge: MIT.
Gelman, A., & Rubin, D.B. (1992). Inference from iterative simulation using multiple sequences. Statistical Science, 7, 457–472.
Goldfarb, A., & Yang, B. (2009). Are all managers created equal? Journal of Marketing Research, 46(5), 612–622.
Haruvy, E., & Erev, I. (2002). On the application and interpretation of learning models. In R. Zwick, & A. Rapoport (Eds.), Experimental business research (pp. 285–300). Boston: Kluwer Academic.
Haruvy, E., & Stahl, D. (2012). Learning transference between dissimilar symmetric normal-form games. Games and Economic Behavior, 74(1), 208–221.
Heckman, J.J. (1981). Heterogeneity and state dependence. In S. Rosen (Ed.), Studies in labor markets (pp. 91–139). Chicago: University of Chicago Press.
Ho, T.H., Camerer, C., Chong, J.-K. (2002). Functional EWA: a one-parameter theory of learning in games. Working paper, University of California, Berkeley.
Ho, T.H., Camerer, C., Chong, J.-K. (2007). Self-tuning experience weighted attraction learning in games. Journal of Economic Theory, 133, 177–198.
Ho, T.H., Camerer, C., Weigelt, K. (1998). Iterated dominance and iterated best-response in experimental P-beauty contests. American Economic Review, 88, 947–969.
Ho, T.H., Wang, X., Camerer, C. (2008). Individual differences in the EWA learning with partial payoff information. The Economic Journal, 118, 37–59.
Huck, S., Normann, H.-T., Oechssler, J. (1999). Learning in Cournot oligopoly - an experiment. The Economic Journal, 109, 80–95.
Jordan, M.I., & Jacobs, R.A. (1994). Hierarchical mixtures of experts and the EM algorithm. Neural Computation, 6(2), 181–214.
Keane, M.P. (1997). Modeling heterogeneity and state dependence in consumer choice behavior. Journal of Business and Economic Statistics, 15(July), 310–327.
Kunreuther, H., Silvasi, G., Bradlow, E.T., Small, D. (2009). Bayesian analysis of deterministic and stochastic prisoner’s dilemma games. Judgment and Decision Making, 4(5), 363–384.
MacDonald, I., & Zucchini, W. (1997). Hidden Markov and other models for discrete-valued time series. London: Chapman & Hall.
Montoya, R., Netzer, O., Jedidi, K. (2010). Dynamic allocation of pharmaceutical detailing and sampling for long-term profitability. Marketing Science, 29(5), 909–924.
Mookherjee, D., & Sopher, B. (1994). Learning behavior in an experimental matching pennies game. Games and Economic Behavior, 7, 62–91.
Mookherjee, D., & Sopher, B. (1997). Learning and decision costs in experimental constant sum games. Games and Economic Behavior, 19(1), 97–132.
Netzer, O., Lattin, J.M., Srinivasan, V. (2008). A hidden Markov model of customer relationship dynamics. Marketing Science, 27(2), 185–204.
Nevo, I., & Erev, I. (2010). On surprise, change, and the effect of recent outcomes. Working paper, Technion, Haifa, Israel.
Peng, F., Jacobs, R.A., Tanner, M.A. (1996). Bayesian inference in mixtures-of-experts and hierarchical mixtures-of-experts models with an application to speech recognition. Journal of the American Statistical Association, 91, 953–960.
Rapoport, A., & Amaldoss, W. (2000). Mixed strategies and iterative elimination of strongly dominated strategies: an experimental investigation of states of knowledge. Journal of Economic Behavior and Organization, 42, 483–521.
Roth, A.E., & Erev, I. (1995). Learning in extensive-form games: experimental data and simple dynamic models in the intermediate term. Games and Economic Behavior, 8, 164–212.
Salmon, T. (2001). An evaluation of econometric models of adaptive learning. Econometrica, 69, 1597–1628.
Salmon, T. (2004). Evidence for learning to learn behavior in normal form games. Theory and Decision, 56(4), 367–404.
Selten, R. (1991). Evolution, learning, and economic behavior. Journal of Risk Uncertainty, 3, 3–24.
Stahl, D. (1999). Evidence based rules and learning in symmetric normal form games. International Journal of Game Theory, 28, 111–130.
Stahl, D. (2000). Rule learning in symmetric normal-form games: theory and evidence. Games and Economic Behavior, 32, 105–138.
Stahl, D. (2001). Population rule learning in symmetric normal-form games: theory and evidence. Journal of Economic Behavior and Organization, 1304, 1–17.
Stahl, D. (2003). Mixture models of individual heterogeneity. In Encyclopedia of cognitive science. London: MacMillan.
Stahl, D., & Haruvy, E. (2002). Aspiration-based and reciprocity-based rules in learning dynamics for symmetric normal-form games. Journal of Mathematical Psychology, 46(5), 531–553.
Thorndike, E.L. (1898). Animal intelligence: an experimental study of the associative processes in animals. In Psychological review, Monograph supplements (No. 8). New York: Macmillan.
Van Huyck, J., Battalio, R., Beil, R. (1990). Tacit coordination games, strategic uncertainty, and coordination failure. American Economic Review, 80, 234–248.
Van Huyck, J., Cook, J., Battalio, R. (1997). Adaptive behavior and coordination failure. Journal of Economic Behavior and Organization, 32, 483–503.
Van Huyck, J., Battalio, R., Rankin, F. (2007). Selection dynamics and adaptive behavior without much information. Economic Theory, 33(1), 53–65.
Wilcox, N. (2006). Theories of learning in games and heterogeneity bias. Econometrica, 74, 1271–1292.
Metadata
Title
Dynamic learning in behavioral games: A hidden Markov mixture of experts approach
Authors
Asim Ansari
Ricardo Montoya
Oded Netzer
Publication date
1 December 2012
Publisher
Springer US
Published in
Quantitative Marketing and Economics / Issue 4/2012
Print ISSN: 1570-7156
Electronic ISSN: 1573-711X
DOI
https://doi.org/10.1007/s11129-012-9125-8