Published in: Dynamic Games and Applications 4/2019

03-12-2018

Dynamic Exploitation of Myopic Best Response

Author: Burkhard C. Schipper

Abstract

How can a rational player manipulate a myopic best response player in a repeated two-player game? We show that in games with strategic substitutes or strategic complements the optimal control strategy is monotone in the initial action of the opponent, in time periods, and in the discount rate. As an interesting example outside this class of games we present a repeated “textbook-like” Cournot duopoly with nonnegative prices and show that the optimal control strategy involves a cycle.

Footnotes
1
The game was repeated over 40 rounds. The participant played the cycle of quantities (108, 70, 54, 42). This cycle yields an average payoff of 1520, which is well above the Stackelberg leader payoff of 1458. In this game, the Stackelberg leader’s quantity is 54, the follower’s quantity is 27 (payoff 728), and the Cournot Nash equilibrium quantity is 36 (payoff 1296). The computer is programmed to play a myopic best response with some noise. The x-axis in Fig. 1 indicates the rounds of play, the y-axis the quantities. The lower time series depicts the computer’s sequence of actions. The upper time series shows the participant’s quantities. See Duersch et al. [15] for details of the game and the experiment.
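The cycle average, the Stackelberg leader payoff, and the Cournot payoff quoted in this footnote can be reproduced with a short sketch. It assumes a specification that the footnote itself does not state: inverse demand max(109 − Q, 0), a constant unit cost of 1, and a noise-free computer that best responds to the participant's previous quantity. All function names are illustrative.

```python
def price(total_quantity):
    # Assumed inverse demand; the price cannot fall below zero.
    return max(109 - total_quantity, 0)


def profit(own, other):
    # Assumed constant unit cost of 1.
    return (price(own + other) - 1) * own


def best_response(other):
    # Myopic best response to the opponent's previous quantity (no noise).
    return max((108 - other) / 2, 0)


def average_cycle_payoff(cycle):
    # In a repeating cycle, the computer's quantity in each period is the
    # best response to the participant's quantity one period earlier.
    total = 0
    for t, quantity in enumerate(cycle):
        previous = cycle[t - 1]  # index -1 wraps around to the last element
        total += profit(quantity, best_response(previous))
    return total / len(cycle)


participant_cycle = (108, 70, 54, 42)
print(average_cycle_payoff(participant_cycle))  # 1520.0
print(profit(54, best_response(54)))            # Stackelberg leader: 1458.0
print(profit(36, 36))                           # Cournot Nash: 1296
```

Under these assumptions, playing 108 drives the price to zero and costs the participant 108 in that period, but it pushes the computer's next quantity to zero, which is what makes the following three periods so profitable.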
 
2
In fact, the average payoff of the optimal cycle is 1522, only a minor improvement over the average payoff (1520) of the cycle played by the participant.
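As a rough check of this comparison, the sketch below searches for a better four-period cycle under an assumed specification that the footnotes do not state: inverse demand max(109 − Q, 0), unit cost 1, a noise-free myopic best response, and continuous quantities. The four-period length and the first quantity of 108 are taken from the participant's cycle rather than derived, and the update rules are the first-order conditions of the cycle payoff under those assumptions. The function names are illustrative.

```python
def profit(own, other):
    # Assumed inverse demand max(109 - Q, 0) and unit cost 1.
    return (max(109 - own - other, 0) - 1) * own


def best_response(other):
    # Noise-free myopic best response to the previous quantity.
    return max((108 - other) / 2, 0)


def average_cycle_payoff(cycle):
    # cycle[t - 1] wraps around, so the cycle is treated as repeating.
    payoffs = [profit(q, best_response(cycle[t - 1])) for t, q in enumerate(cycle)]
    return sum(payoffs) / len(payoffs)


def improve_cycle(rounds=200):
    # Coordinate ascent on the three quantities that follow 108.
    # Each update solves one first-order condition of the total cycle payoff
    # (108 - q1) q1 + (54 + q1/2 - q2) q2 + (54 + q2/2 - q3) q3 - 108.
    q1, q2, q3 = 70.0, 54.0, 42.0  # start from the participant's cycle
    for _ in range(rounds):
        q1 = 54 + q2 / 4
        q2 = 27 + (q1 + q3) / 4
        q3 = 27 + q2 / 4
    return (108.0, q1, q2, q3)


cycle = improve_cycle()
print([round(q, 2) for q in cycle])
print(round(average_cycle_payoff(cycle), 3))
```

Under these assumptions the iteration settles at approximately (108, 67.5, 54, 40.5) with an average payoff of about 1522.1, which is consistent with the figure in this footnote. The sketch does not establish that this is the cycle derived in the paper.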
 
3
As a reviewer pointed out, this literature is related to the literature on indirect evolution (e.g., [25, 29]). Yet, instead of the evolution of utility functions, the evolution of learning heuristics is featured.
 
4
In Sect. 4 we explain why we do not consider here multi-dimensional strategy sets.
 
5
Note that throughout the analysis we do not allow the manipulator to suitably choose the initial action of the puppet.
 
6
As a reviewer rightfully points out, this would be problematic if the manipulator does not know the learning heuristic used by the puppet.
 
7
As a reviewer pointed out, we could have stated the model just in terms of assumptions on m and a continuous best-response function b. This might be even more realistic as the manipulator may observe the opponent’s best responses but not necessarily the opponent’s payoff function.
 
8
In the first four periods, the cyclic example of Sect. 3 coincides with the smooth problem that we discuss in Sect. 3. Proposition 1 applies to this smooth problem. The manipulator’s quantity in the last period is 41, which is the best response to the puppet’s Stackelberg follower quantity.
 
9
Amir ([1], Theorem 2 (ii)) does not state explicitly that the one-period value function is increasing and \(X_y\) is expanding. Yet, this property is required in the proof.
 
10
This finding that an optimal control strategy involves strictly dominated actions is not restricted to games for which monotone differences differ among players.
 
11
Since we look at cycles (of finite length), we can neglect discounting in the calculations below.
 
12
To save space, we write out only the objective functions for \(n = 1, 2, 3\).
 
13
Interestingly, the denominator of the linear factor in \(s_n\) is identical to the numerator of the linear factor in \(s_{n+1}\).
 
14
We would like to remark that the optimal control strategy of the manipulator does not involve a cycle in every zero-sum game. This is the case for some classes of zero-sum games studied in Duersch et al. [16, 17].
 
15
One reviewer suggested that if the puppet uses fictitious play rather than myopic best response, then it is much more difficult to manipulate with a cycle. Fictitious play is an uncoupled learning heuristic. Moreover, in our Cournot example, the Stackelberg outcome is unique. Thus, it follows from Schipper [40] that the payoff to the dynamic optimizer would be strictly above the Nash equilibrium payoff. So fictitious play can be exploited by a patient dynamic optimizer in our Cournot example, although the strategy may not be cyclic. At present, the form of the optimal manipulation strategy against a fictitious player is not clear to us and is left for future research.
 
16
A real-valued function f on a lattice X is supermodular on X if \(f(x'' \vee x') - f(x'') \ge f(x') - f(x'' \wedge x')\) for all \(x'', x' \in X\) (see [45], p. 43).
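The inequality in this definition can be checked by brute force on a small finite lattice. The sketch below is only an illustration: it assumes the lattice is a finite grid of integer pairs ordered componentwise, so that the join is the componentwise maximum and the meet is the componentwise minimum. The function names and the test functions are hypothetical choices.

```python
from itertools import product


def join(a, b):
    # Componentwise maximum: the least upper bound in the product order.
    return tuple(max(s, t) for s, t in zip(a, b))


def meet(a, b):
    # Componentwise minimum: the greatest lower bound in the product order.
    return tuple(min(s, t) for s, t in zip(a, b))


def is_supermodular(f, points):
    # Check f(a v b) - f(a) >= f(b) - f(a ^ b) for every pair of points.
    return all(
        f(join(a, b)) - f(a) >= f(b) - f(meet(a, b))
        for a, b in product(points, repeat=2)
    )


# A 4 x 4 grid of integer pairs; a product of chains is closed under join and meet.
grid = list(product(range(4), repeat=2))

print(is_supermodular(lambda x: x[0] * x[1], grid))    # True
print(is_supermodular(lambda x: -x[0] * x[1], grid))   # False
print(is_supermodular(lambda x: x[0] + x[1], grid))    # True (holds with equality)
```

The second function fails at the pair (1, 0) and (0, 1): the left-hand side is −1 while the right-hand side is 0.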
 
Literature
1.
Amir R (1996a) Sensitivity analysis of multisector optimal economic dynamics. J Math Econ 25:123–141
2.
Amir R (1996b) Cournot oligopoly and the theory of supermodular games. Games Econ Behav 15:132–148
3.
Aoyagi M (1996) Evolution of beliefs and the Nash equilibrium of normal form games. J Econ Theory 70:444–469
4.
Banerjee A, Weibull JW (1995) Evolutionary selection and rational behavior. In: Kirman A, Salmon M (eds) Learning and rationality in economics. Blackwell, Oxford, pp 343–363
5.
6.
Berge C (1963) Topological spaces, Dover edition, 1997. Dover Publications Inc, Mineola
7.
Bertsekas DP (2005) Dynamic programming and optimal control, vol I & II, 3rd edn. Athena Scientific, Belmont
8.
Boldrin M, Montrucchio L (1986) On the indeterminacy of capital accumulation paths. J Econ Theory 40:26–29
9.
Bryant J (1983) A simple rational expectations Keynes-type coordination model. Q J Econ 98:525–528
10.
Bulavsky VA, Kalashnikov VV (1996) Equilibria in generalized Cournot and Stackelberg markets. Z Angew Math Mech 76(S3):387–388
11.
Camerer CF, Ho T-H, Chong J-K (2002) Sophisticated experience-weighted attraction learning and strategic teaching in repeated games. J Econ Theory 104:137–188
12.
Chong J-K, Camerer CF, Ho T-H (2006) A learning-based model of repeated games with incomplete information. Games Econ Behav 55:340–371
13.
Cournot A (1838) Researches into the mathematical principles of the theory of wealth. MacMillan, London
14.
Dubey P, Haimanko O, Zapechelnyuk A (2006) Strategic substitutes and complements, and potential games. Games Econ Behav 54:77–94
15.
Duersch P, Kolb A, Oechssler J, Schipper BC (2010) Rage against the machines: how subjects learn to play against computers. Econ Theory 43:407–430
16.
17.
Duersch P, Oechssler J, Schipper BC (2014) When is tit-for-tat unbeatable? Int J Game Theory 43:25–36
19.
Droste E, Hommes C, Tuinstra J (2002) Endogenous fluctuations under evolutionary pressure in Cournot competition. Games Econ Behav 40:232–269
20.
Ellison G (1997) Learning from personal experience: one rational guy and the justification of myopia. Games Econ Behav 19:180–210
21.
Fudenberg D, Kreps DM, Maskin ES (1990) Repeated games with long-run short-run players. Rev Econ Stud 57:555–573
22.
Fudenberg D, Levine DK (1998) The theory of learning in games. The MIT Press, Cambridge
23.
Fudenberg D, Levine DK (1994) Efficiency and observability with long-run and short-run players. J Econ Theory 62:103–135
24.
Fudenberg D, Levine DK (1989) Reputation and equilibrium selection in games with a patient player. Econometrica 57:759–778
25.
Güth W, Peleg B (2001) When will payoff maximization survive? An indirect evolutionary analysis. J Evol Econ 11:479–499
26.
Hart S, Mas-Colell A (2013) Simple adaptive strategies: from regret-matching to uncoupled dynamics. World Scientific Publishing, Singapore
27.
Hart S, Mas-Colell A (2006) Stochastic uncoupled dynamics and Nash equilibrium. Games Econ Behav 57:286–303
28.
Hehenkamp B, Kaarbøe O (2006) Imitators and optimizers in a changing environment. J Econ Dyn Control 32:1357–1380
29.
30.
Hyndman K, Ozbay EY, Schotter A, Ehrblatt WZ (2012) Convergence: an experimental study of teaching and learning in repeated games. J Eur Econ Assoc 10:573–604
32.
Kordonis I, Charalampidis AC, Papavassilopoulos GP (2018) Pretending in dynamic games: alternative outcomes and application to electricity markets. Dyn Games Appl 8:844–873
33.
Kukushkin NS (2004) Best response dynamics in finite games with additive aggregation. Games Econ Behav 48:94–110
34.
Milgrom P, Roberts J (1990) Rationalizability, learning, and equilibrium in games with strategic complementarities. Econometrica 58:1255–1277
37.
Osborne M (2004) An introduction to game theory. Oxford University Press, Oxford
38.
Puterman ML (1994) Markov decision processes. Discrete stochastic dynamic programming. Wiley, New York
39.
Rand D (1978) Exotic phenomena in games and duopoly models. J Math Econ 5:173–184
40.
Schipper BC (2017) Strategic teaching and learning in games. The University of California, Davis, Davis
41.
Schipper BC (2009) Imitators and optimizers in Cournot oligopoly. J Econ Dyn Control 33:1981–1990
42.
Stokey NL, Lucas RE, Prescott EC (1989) Recursive methods in economic dynamics. Harvard University Press, Cambridge
43.
Terracol A, Vaksmann J (2009) Dumbing down rational players: learning and teaching in an experimental game. J Econ Behav Organ 70:54–71
45.
Topkis D (1998) Supermodularity and complementarity. Princeton University Press, Princeton
46.
Van Huyck J, Battalio R, Beil R (1990) Tacit coordination games, strategic uncertainty and coordination failure. Am Econ Rev 80:234–248
47.
Vives X (1999) Oligopoly pricing. Old ideas and new tools. Cambridge University Press, Cambridge
48.
Walker JM, Gardner R, Ostrom E (1990) Rent dissipation in a limited access Common-Pool resource: experimental evidence. J Environ Econ Manag 19:203–211
49.
Young P (2013) Strategic learning and its limits. Oxford University Press, Oxford
Metadata
Title
Dynamic Exploitation of Myopic Best Response
Author
Burkhard C. Schipper
Publication date
03-12-2018
Publisher
Springer US
Published in
Dynamic Games and Applications / Issue 4/2019
Print ISSN: 2153-0785
Electronic ISSN: 2153-0793
DOI
https://doi.org/10.1007/s13235-018-0289-z
