
Open Access 16.03.2018

Minmax via Replicator Dynamics

Author: Josef Hofbauer

Published in: Dynamic Games and Applications | Issue 3/2018



Abstract

I present a short proof of the minmax theorem using the replicator dynamics.

1 Introduction

The minmax theorem of von Neumann [10] says that
$$\begin{aligned} { \max _{x\in X} \min _{y \in Y} U(x,y)= \min _{y\in Y} \max _{x \in X} U(x,y) } \end{aligned}$$
where X, Y are the unit simplices in \({\mathbb {R}}^n, {\mathbb {R}}^m\) and \(U: X \times Y \rightarrow {\mathbb {R}}\) is a continuous function, quasi-concave in x and quasi-convex in y. The proof was by induction on the number of variables, see also [7]. An important special case is where U is a bilinear function \(U(x,y) = x \! \cdot \!Ay\), with A an \(n \times m \) matrix.
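For instance, in the bilinear case the theorem can be checked by hand for matching pennies:
$$\begin{aligned} A = \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}, \end{aligned}$$
where \(x \! \cdot \!Ay = (2x_1-1)(2y_1-1)\), so both sides of the minmax identity equal 0, attained at \(x = y = (\tfrac{1}{2}, \tfrac{1}{2})\).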
The idea to use dynamics for proving the minmax theorem (and computing the equilibria) goes back to Brown [1, 2]: for symmetric zero-sum games, i.e., \(A = -A^{{\mathsf {T}}}\), he proved together with von Neumann [2] that the solutions of a certain differential equation converge to the set of equilibria. In [1], he showed that the (continuous time) fictitious play process approaches the set of equilibria in any finite zero-sum game, which implies the minmax theorem for any \(n \times m \) matrix A. Brown’s fictitious play process [1] is now often framed as best response dynamics and can be used to prove the minmax theorem for more general payoff functions U, which are continuous and concave/convex, see [6]. For the original version [10] for continuous quasi-concave/quasi-convex functions, a dynamic proof is still missing. Another proof based on differential inclusions can be found in [8].
In the present note, I give a short proof of the minmax theorem in the matrix case, based on the replicator dynamics.

2 Replicator Dynamics

The replicator dynamics [5] for an \(n \times m \) bimatrix game (AB) is given by
$$\begin{aligned} \dot{x}_i= & {} x_i \left( e^i \! \cdot \!Ay - x\! \cdot \!Ay \right) \qquad i=1, \ldots , n \\ \dot{y}_j= & {} y_j \left( e^j \! \cdot \!Bx - y\! \cdot \!Bx \right) \qquad j=1, \ldots , m \end{aligned}$$
Here \(x_i\) denotes the frequency of strategy i of player 1, hence \(x = (x_1, \dots , x_n)\) is in the probability simplex \( \Delta _n = \{x \in [0,1]^n: \sum x_i = 1\}\), \(y_j\) is the frequency of strategy j of player 2, \(y = (y_1, \dots , y_m) \in \Delta _m\), and \(e^i\) denotes the ith unit vector.
Besides its original derivation from evolution and natural selection, there are at least two economic motivations based on imitation and on reinforcement learning.
For a zero-sum game \(B = -A^{{\mathsf {T}}}\), in the interior of \(\Delta _n \times \Delta _m\) we obtain
$$\begin{aligned} \dot{x}_i/x_i= & {} e^i \! \cdot \!Ay - x\! \cdot \!Ay \quad i=1, \ldots , n \end{aligned}$$
(1)
$$\begin{aligned} \dot{y}_j/y_j= & {} - x \! \cdot \!A e^j + x\! \cdot \!Ay \quad j=1, \ldots , m \end{aligned}$$
(2)
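Indeed, with \(B = -A^{{\mathsf {T}}}\),
$$\begin{aligned} e^j \! \cdot \!Bx = - e^j \! \cdot \!A^{{\mathsf {T}}} x = - x \! \cdot \!A e^j, \qquad y \! \cdot \!Bx = - y \! \cdot \!A^{{\mathsf {T}}} x = - x \! \cdot \!Ay, \end{aligned}$$
which gives (2).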
Now add these equations
$$\begin{aligned} \frac{\dot{x}_i}{x_i} + \frac{\dot{y}_j}{y_j} = e^i \! \cdot \!Ay - x \! \cdot \!A e^j \quad \forall i,j \end{aligned}$$
and integrate
$$\begin{aligned} \frac{\log x_i(T) - \log x_i(0)+ \log y_j(T) - \log y_j(0)}{T} = e^i \! \cdot \!A{\bar{y}}(T) - {\bar{x}}(T) \! \cdot \!A e^j \end{aligned}$$
where
$$\begin{aligned} {\bar{x}}(T) = \frac{1}{T} \int _0^T x(t)\mathrm{d}t, \quad {\bar{y}}(T) = \frac{1}{T} \int _0^T y(t)\mathrm{d}t\quad \end{aligned}$$
denote the time averages of the solutions of (1, 2). Now consider limit points, i.e., choose a sequence \(T_k \rightarrow \infty \) s.t. \({\bar{x}}(T_k) \rightarrow {\bar{x}}\), \({\bar{y}}(T_k) \rightarrow {\bar{y}}\). Since \(\log x_i (T) \le 0\) and \(\log y_j (T) \le 0\), while \(\log x_i(0)\) and \(\log y_j(0)\) are fixed, the left-hand side has nonpositive limit superior as \(T \rightarrow \infty \), so we obtain \( 0 \ge e^i \! \cdot \!A{\bar{y}} - {\bar{x}} \! \cdot \!A e^j \quad \forall i,j \) or
$$\begin{aligned} e^i \! \cdot \!A{\bar{y}} \le {\bar{x}} \! \cdot \!A e^j \quad \forall i,j. \end{aligned}$$
(3)
Multiplying (3) by \(x_i y_j\) for arbitrary \(x \in \Delta _n\), \(y \in \Delta _m\) and summing over i and j yields \(x \! \cdot \!A{\bar{y}} \le {\bar{x}} \! \cdot \!A y\) for all such x, y, and hence
$$\begin{aligned} \max _{x \in \Delta _n} x \! \cdot \!A{\bar{y}} \le \min _{y\in \Delta _m} {\bar{x}} \! \cdot \!A y \end{aligned}$$
(4)
and
$$\begin{aligned} \min _y \max _x x \! \cdot \!A y \le \max _x x \! \cdot \!A{\bar{y}} \le \min _y {\bar{x}} \! \cdot \!A y \le \max _x \min _y x \! \cdot \!A y, \end{aligned}$$
and together with the obvious inequality \(\max _x \min _y x \! \cdot \!A y \le \min _y \max _x x \! \cdot \!A y\), we obtain
$$\begin{aligned} \min _{y \in \Delta _m} \max _{x \in \Delta _n} x \! \cdot \!A y = \max _{x \in \Delta _n} \min _{y \in \Delta _m} x \! \cdot \!A y. \end{aligned}$$
(5)
Additionally, (3) or (4) also imply
$$\begin{aligned} x \! \cdot \!A{\bar{y}} \le {\bar{x}} \! \cdot \!A{\bar{y}} \le {\bar{x}} \! \cdot \!A y \quad \forall x,y \end{aligned}$$
(6)
so \(({\bar{x}}, {\bar{y}})\) is a pair of optimal strategies for the zero-sum game. (In particular, this shows the existence of equilibria.)
Furthermore, if we integrate (1, 2) directly, then we obtain \( 0 \ge e^i \! \cdot \!A{\bar{y}} - {\bar{a}}\) and \(0 \ge - {\bar{x}} \! \cdot \!A e^j + {\bar{a}}\), with \({\bar{a}} = \lim _{k \rightarrow \infty } \frac{1}{T_k} \int _0^{T_k} x(t)\! \cdot \!A y(t)\, \mathrm{d}t\) (passing to a further subsequence of \((T_k)\) if necessary, so that this average converges), and hence
$$\begin{aligned} e^i \! \cdot \!A{\bar{y}} \le {\bar{a}} \le {\bar{x}} \! \cdot \!A e^j \quad \forall i,j. \end{aligned}$$
(7)
Comparing with (6), we get \({\bar{a}} = {\bar{x}} \! \cdot \!A{\bar{y}}\).
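Indeed, multiplying (7) by \({\bar{x}}_i {\bar{y}}_j\) and summing over i and j already gives
$$\begin{aligned} {\bar{x}} \! \cdot \!A{\bar{y}} \le {\bar{a}} \le {\bar{x}} \! \cdot \!A{\bar{y}}. \end{aligned}$$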
Summarizing, besides the minmax theorem (5), we have shown:
Theorem
Every limit point \(({\bar{x}}, {\bar{y}})\) of the time averages \(({\bar{x}}(T), {\bar{y}}(T))\) of positive solutions (x(t), y(t)) of the replicator dynamics is a pair of optimal strategies of the zero-sum game. And the time averages of the payoffs
$$\begin{aligned} \frac{1}{T} \int _0^T \sum _{i,j} a_{ij} x_i(t)y_j(t)\,\mathrm{d}t \end{aligned}$$
converge to the value \({\bar{x}} \! \cdot \!A {\bar{y}}\) of the game, as \(T \rightarrow \infty \).
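The following rough numerical sketch (not part of the paper) illustrates the theorem for the zero-sum game rock–paper–scissors; it assumes NumPy is available, and the integrator, step size, horizon and initial conditions are arbitrary illustrative choices. The time averages should come out close to the unique optimal strategy \((\tfrac{1}{3}, \tfrac{1}{3}, \tfrac{1}{3})\) for both players, and the averaged payoff close to the value 0.

import numpy as np

# Rock-paper-scissors: payoff matrix of player 1; player 2's matrix is B = -A^T.
A = np.array([[0.0, -1.0, 1.0],
              [1.0, 0.0, -1.0],
              [-1.0, 1.0, 0.0]])

def replicator(x, y):
    # Right-hand side of (1), (2) for the zero-sum game (A, -A^T).
    Ay, ATx = A @ y, A.T @ x
    xdot = x * (Ay - x @ Ay)
    ydot = y * (-ATx + x @ Ay)
    return xdot, ydot

def rk4_step(x, y, h):
    # One classical Runge-Kutta step for the coupled system.
    k1x, k1y = replicator(x, y)
    k2x, k2y = replicator(x + h / 2 * k1x, y + h / 2 * k1y)
    k3x, k3y = replicator(x + h / 2 * k2x, y + h / 2 * k2y)
    k4x, k4y = replicator(x + h * k3x, y + h * k3y)
    return (x + h / 6 * (k1x + 2 * k2x + 2 * k3x + k4x),
            y + h / 6 * (k1y + 2 * k2y + 2 * k3y + k4y))

h, steps = 0.01, 100_000                      # time horizon T = 1000
x = np.array([0.6, 0.3, 0.1])                 # arbitrary interior initial conditions
y = np.array([0.2, 0.5, 0.3])
x_sum, y_sum, payoff_sum = np.zeros(3), np.zeros(3), 0.0
for _ in range(steps):
    x_sum += x
    y_sum += y
    payoff_sum += x @ A @ y
    x, y = rk4_step(x, y, h)

print("time-averaged x:", np.round(x_sum / steps, 3))         # approx. [1/3, 1/3, 1/3]
print("time-averaged y:", np.round(y_sum / steps, 3))
print("time-averaged payoff:", round(payoff_sum / steps, 4))  # approx. 0 (the value)

The Riemann sums over the integration grid play the role of the time averages \({\bar{x}}(T)\), \({\bar{y}}(T)\) and of the averaged payoff in the theorem.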

3 Remarks

1.
If \(\log x_i(T)\) and \(\log y_j(T)\) are bounded functions of T (i.e., the solution stays at a positive distance from the boundary of \(\Delta _n\) and \(\Delta _m\)), then we have equality in (3) for all i, j, and the existence of a fully mixed equilibrium follows. The converse holds as well, see [3, 5, 9]: If \((p,q) > 0\) is an equilibrium of the zero-sum game, then the relative entropy
$$\begin{aligned} H(x,y ) = - \sum _i p_i \log \frac{x_i}{p_i} - \sum _j q_j \log \frac{y_j}{q_j} \ge 0 \end{aligned}$$
(or Kullback–Leibler divergence) is a constant of motion for (1, 2): \(\dot{H} = 0\); a short verification is given after these remarks. The replicator dynamics is even a Hamiltonian system w.r.t. a suitable symplectic or Poisson structure [3], and hence, on each level set of H, by Poincaré’s recurrence theorem, almost every solution is recurrent. The behavior of the solutions might be chaotic, but by the above theorem, their time averages approach the set of equilibria.
 
2.
For nonzero-sum games, a similar argument shows that the time averages
$$\begin{aligned} \frac{1}{T} \int _0^T x_i(t)y_j(t)\mathrm{d}t \end{aligned}$$
(i.e., how often player 1 uses strategy i against strategy j of player 2 over a given period) converge (as \(T \rightarrow \infty \)) to the set of exact coarse correlated equilibria, see [4]. This also holds for N-player normal form games.
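For completeness, here is the short computation behind \(\dot{H} = 0\) in Remark 1. Along interior solutions of (1, 2),
$$\begin{aligned} \dot{H} = - \sum _i p_i \frac{\dot{x}_i}{x_i} - \sum _j q_j \frac{\dot{y}_j}{y_j} = -\left( p \! \cdot \!Ay - x \! \cdot \!Ay \right) - \left( - x \! \cdot \!Aq + x \! \cdot \!Ay \right) = x \! \cdot \!Aq - p \! \cdot \!Ay = 0, \end{aligned}$$
since the fully mixed equilibrium \((p,q)\) satisfies \(e^i \! \cdot \!Aq = p \! \cdot \!A e^j = v\) (the value) for all i, j, and hence \(x \! \cdot \!Aq = p \! \cdot \!Ay = v\) for all \(x \in \Delta _n\), \(y \in \Delta _m\).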
 

Acknowledgements

Open access funding provided by University of Vienna.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
References
1.
Brown GW (1951) Iterative solution of games by fictitious play. In: Koopmans TC (ed) Activity analysis of production and allocation. Wiley, New York, pp 374–376
2.
Brown GW, von Neumann J (1950) Solutions of games by differential equations. Ann Math Stud 24:73–79
3.
Hofbauer J (1996) Evolutionary dynamics for bimatrix games: a Hamiltonian system? J Math Biol 34:675–688
4.
Hofbauer J (2004) Time averages of the replicator dynamics and correlated equilibria. Preprint
5.
Hofbauer J, Sigmund K (1998) Evolutionary games and population dynamics. Cambridge University Press, Cambridge
6.
Hofbauer J, Sorin S (2006) Best response dynamics for continuous zero-sum games. Discrete Continuous Dyn Syst B 6:215–224
7.
Kjeldsen TH (2001) John von Neumann’s conception of the minimax theorem: a journey through different mathematical contexts. Arch Hist Exact Sci 56:39–68
8.
9.
Schuster P, Sigmund K, Hofbauer J, Wolff R (1981) Selfregulation of behaviour in animal societies. Part 2: games between two populations without selfinteraction. Biol Cybern 40:9–15
10.
von Neumann J (1928) Zur Theorie der Gesellschaftsspiele. Math Ann 100:295–320