Published in: Dynamic Games and Applications, Issue 2/2021

Open Access | 22-07-2020

An Update on Continuous-Time Stochastic Games of Fixed Duration

Author: Yehuda John Levy

DOI: https://doi.org/10.1007/s13235-020-00361-0

Abstract

This paper shows that continuous-time stochastic games of fixed duration need not possess equilibria in Markov strategies. The example requires payoffs and transitions to depend on time in a continuous but irregular (almost nowhere approximately differentiable) way. This example offers a correction to the erroneous construction presented previously in Levy (Dyn Games Appl 3(2):279–312, 2013. https://doi.org/10.1007/s13235-012-0067-2).

1 Introduction

Following [5], a framework of continuous-time stochastic games of fixed duration is studied. In such games, the game is played on a fixed time interval, there are finitely many possible states and actions, and the players control the rate of the payoffs and the rate of transition between states. The staple model, due to [9], assumes that payoffs and transition rates are stationary—that is, time-independent—functions of the actions and state. However, as discussed in Levy [5, Sec. 9], many results concerning the model—including all the results in that paper—extend fairly automatically if the payoffs and transition rates depend in any (bounded and Borel) way on time.
The purpose of this corrigendum is to show that such games need not possess equilibria in Markov strategies—a natural class of strategies for these games which depend only on time and state, not on histories. Indeed, Levy [5] establishes a number of results concerning these strategies and their variations, in particular, the existence of extensive-form public-signal correlated Markov equilibria, various optimality equations for Markov equilibria, and a study of how approximate-Markov equilibria can be constructed. Zachrisson [9] had previously established the existence of optimal Markov strategies in zero-sum games.
Levy [5] also claims to show that Markov equilibria—equilibria which depend only on the current state and on time—need not exist, even when the payoffs and transitions are stationary. That example is based on a construction carried out in Levy [4], in the framework of discounted (discrete-time) stochastic games, which is used to give an example of a discounted stochastic game which does not possess stationary equilibria, despite the transitions satisfying commonly assumed absolute continuity conditions.1 In Levy and McLennan [6], it is pointed out that the construction in Levy [4] is flawed, and an alternative—similar, but in some respects simpler—construction is presented to show the falsity of the same conjecture. The relevance of the error for [5] is that the crucial Proposition 4 on [4, p. 293] is incorrect; the reader is referred to [6] for the details of the error. Hence, the purpose of this note is to remark that the correction offered in Levy and McLennan [6] can be adapted back to continuous-time stochastic games.
Hence, to summarize: If one allows (continuous) dependence of the payoffs and transitions on the time coordinate, Markov equilibria may fail to exist (Theorem 2.1). However, the question of whether Markov equilibria need exist when the payoffs and transitions are stationary has been re-opened.
When carrying out the construction (correctly), it appears necessary to condition the payoffs and transitions on the time coordinate in a continuous but (for the payoffs at least) rather irregular manner. The required irregularity is that the function be almost nowhere approximately differentiable, a property we will call erratic,2 which in particular implies that it is almost nowhere equal to any given absolutely continuous function. This type of behavior is typical of, say, a Brownian motion path.
The intuition (and construction) of the revised example, in many ways, is similar to that of Levy [5]; as written there (p. 282), ‘There is an “active state” in which the game begins and an absorbing state with payoff 0. We will focus on a particular pair of players, C,D, out of a large set of players. If either of these two players expects a positive average payoff in the future, he will choose an action such that when the other players choose an equilibrium reply, he receives a negative payoff. And, conversely, a negative average payoff in the future will lead to a positive payoff. Hence, for both C,D, the payoff must always be 0; this is a result of the continuous time parameter.’ However, the continuation given there, ‘However, we take advantage of a nonsimply connected structure of the equilibria of the other players - in particular, of players A,B—to have that, at each point in time, at least one of the players C,D receives a nonzero payoff’ proves to be insufficient.3 To make the construction work, as in Levy and McLennan [6], we add to the payoff a time-dependent4 perturbation with the sufficient irregularity discussed above. This irregularity guarantees that the players cannot ‘catch’ the future average payoffs and ‘cancel them out’ with the present running payoffs.
We conclude by remarking that, structurally, the construction presented here is quite similar to the construction conducted in Levy and McLennan [6]. Conceptually, in fact, one could say that the main required transition from one to the other is that the unit interval is used as a state space in Levy and McLennan [6], while in the current paper, the unit interval is used to represent the time variable. However, a very large number of fine details change along the way, as the optimality conditions for continuous-time games (recalled in Sect. 3.2, essentially ‘relatives’ of Hamilton–Jacobi–Bellman equations) are quite different from the optimality criteria in discrete-time discounted processes (multi-player versions of Bellman’s equation). Both frameworks include the possibility of transition to an absorbing state with payoff 0, but while continuation payoffs enter the optimality conditions in discrete time with a positive sign, reflecting the fact that an agent is guaranteed the payoff of the current indivisible time period even if facing absorption immediately thereafter, the continuation payoffs enter our framework with a negative sign, reflecting the fact that a quick absorption will result in the loss of ‘the payoff that could have been’. In addition to the required sign changes at various points in the analysis, one combats the intricacies of accounting for both present and future payoffs by forcing the payoffs to shrink as a function of time, hence guaranteeing that future payoff concerns are small (but, crucially, not negligible).

2 Recalling Model and Result

Following [5], the general framework for a continuous-time stochastic game—also called a Markov game; see [9]—of finite duration, allowing for time-dependent payoffs and transitions, consists of the following:5
  • A finite set of states Z.
  • A finite set of players \(\mathcal {P}\).
  • A finite set of actions \(I^p\) for each \(p \in \mathcal {P}\). Denote \(I^\mathcal {P}:= \prod _{p \in \mathcal {P}} I^p\) and \(\Delta ^\mathcal {P}(I) = \prod _{p \in \mathcal {P}} \Delta (I^p)\), the mixed action profiles.
  • A duration \(T\in \mathbb {R}\), \(T>0\).
  • A Borel bounded payoff function (a.k.a. running payoff) \(r:[0,T] \times Z \times I^\mathcal {P}\rightarrow \mathbb {R}^\mathcal {P}\).
  • A Borel bounded transition rate \(\mu :[0,T] \times Z \times Z \times I^\mathcal {P}\rightarrow \mathbb {R}\), where for all \(a \in I^\mathcal {P}\), \(t \in [0,T]\), and \(z \in Z\), \(\sum _{z' \in Z} \mu (z'|t, z,a) = 0\) and for all \(z' \ne z\), \(\mu (z'|t, z,a) \ge 0\).
  • The payoff functions and transition rates both extend multi-linearly to mixed-action profiles.
Given an initial state \(z_0 \in Z\), the game is played in continuous-time on the interval [0, T]. The states are governed by a stochastic process, in which the probability of a transition from state z to a state \(z' \ne z\) in time \([t,t+h]\), during which the players play action profile \(a \in \Delta ^\mathcal {P}(I)\), is given by \(\mu (z'|t,z,a)\cdot h + o(h)\); all this is formalized in Levy [5, Sec. 2].
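To make the transition dynamics concrete, the following minimal simulation sketch (ours, for illustration only) discretizes [0, T] with step h and treats \(\mu (z'|t,z,a)\cdot h\) as the per-step jump probability, in line with the description above up to the o(h) term. The arguments `states`, `mu`, and `profile` are user-supplied placeholders mirroring Z, \(\mu \), and a Markov strategy profile.

    import random

    def simulate_state_path(states, mu, profile, z0, T=1.0, h=1e-3):
        """Simulate the controlled jump process: on [t, t+h] the chain moves
        from z to z' != z with probability mu(z', t, z, a) * h + o(h)."""
        t, z = 0.0, z0
        path = [(t, z)]
        while t < T:
            a = profile(z, t)               # Markov strategy: depends only on (z, t)
            r, acc = random.random(), 0.0
            for z2 in states:
                if z2 == z:
                    continue
                acc += mu(z2, t, z, a) * h  # first-order jump probability; small for small h
                if r < acc:
                    z = z2
                    break
            t += h
            path.append((t, z))
        return path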
A Markov strategy for player \(p \in \mathcal {P}\) is a Lebesgue-measurable mapping \(u^p:Z\times [0,T] \rightarrow \Delta (I^p)\). Given a Markov strategy profile \(u = (u^p)_{p \in \mathcal {P}}\), for each \(t \in [0,T]\), let \(u_t:Z \times [0,T-t] \rightarrow \Delta ^\mathcal {P}(I)\) be defined by \(u_t(z,s) = u(z,s+t)\). Also denote, for each \(t \in [0,T]\), \(z \in Z\), and \(p \in \mathcal {P}\),6
$$\begin{aligned} \gamma ^p_u(z,t) = E^{z}_{u_t} \big [\int _0^{T-t} r^p(s, z(s),u_t(z(s),s))\mathrm{d}s \big ] \end{aligned}$$
(2.1)
where the expectation is taken w.r.t. the measure induced by the initial state z and the profile \(u_t\), and z(s) denotes the state at time s. Denote further \(\gamma ^p_u(t) = (\gamma ^p_u(z,t))_z\), \(\gamma _u = (\gamma ^p_u)_p\). \(\gamma ^p_u(z,t)\) can be viewed as the payoff to a player p who evaluates his future payoffs starting at time t, assuming he is in state z at that time, under the profile u. A profile u of Markov strategies is a Markov equilibrium if for every \(z \in Z\), every \(p \in \mathcal {P}\), and every Markov strategy \(\tau ^p\) of player p, we have
$$\begin{aligned} \gamma ^p_u(z,0) \ge \gamma ^p_{(\tau ^p,u^{-p})}(z,0). \end{aligned}$$
The purpose of this paper is to show:
Theorem 2.1
There exists a continuous-time stochastic game of fixed duration, with continuous payoffs and transition rates, possessing no Markov equilibrium.
We note that the time-dependence of at least the payoffs is crucial for the example presented here; in fact, we impose very irregular time dependence, typical of, e.g., the path of a Brownian motion.7 As such, it is still an open question as to whether stationarity of payoffs and transitions guarantees existence of Markov equilibria. If the answer to this question is affirmative, it still remains an open question as to whether one can allow time-dependence with sufficient regularity properties and still obtain equilibrium existence.
We remark that we strongly conjecture that time dependence of the payoffs alone (i.e., with stationary transition rates) would suffice to construct the counterexample, but in the name of simplicity we have not attempted to do so, inasmuch as it does not seem to add value.

3 Preliminaries

3.1 Notations

Recall that \(\langle \cdot ,\cdot \rangle \) denotes the inner product of vectors. In addition, the following notational conventions will be used:
  • Throughout \(\Vert \cdot \Vert \) denotes the \(L_\infty \) norm. That is, for a vector or bounded real-valued function f, \(\Vert f\Vert = \sup |f|\), where the supremum is taken over the set of indices or the domain of f.
  • If p is a mixed action over an action space I and \(i \in I\), then p[i] denotes the probability that p chooses i.
  • In connection with a tuple c indexed by the elements of some set \(T \subset \mathcal {P}\) of players, if \(\ell _1, \ldots , \ell _k \in T\), then \(c^{\ell _1, \ldots , \ell _k}\) will denote \((c^{\ell _1}, \ldots , c^{\ell _k})\).

3.2 Optimality Criteria and Payoff Evolution

Fix a continuous-time stochastic game as per Sect. 2, and a Markov strategy profile \(u = (u^p)_{p\in \mathcal {P}}\). In Levy [5], Theorem 1 (p. 285) describes the evolution of \(\gamma _u\) over time, and Theorem 2 (p. 286) gives a criterion for a profile u to be a Markov equilibrium:8
Theorem 3.1
For each \(p\in \mathcal {P}\), \(\gamma ^p_u(\cdot ):[0,T]\rightarrow \mathbb {R}^Z\) is the unique absolutely continuous function satisfying the following differential equation for a.e. \(t \in [0,T]\):
$$\begin{aligned} \frac{\mathrm{d}\gamma _{u,z}^p}{\mathrm{d}t}(t) = -\big [ r^p(t,z,u(z,t)) + \langle \mu (t,z,u(z,t)), \gamma _{u}^p(t) \rangle \big ] \end{aligned}$$
(3.1)
with boundary condition \(\gamma ^p_u(T) =0\), where \(\langle ,\rangle \) denotes the inner product in \(\mathbb {R}^Z\).
It follows in particular that \(\gamma _u\) is absolutely continuous, and in fact Lipschitz, and hence in particular a.e. differentiable.
Theorem 3.2
\(u = (u^p)_{p\in \mathcal {P}}\) is a Markov equilibrium iff for all \(z\in Z\) and a.e. \(t \in [0,T]\),
$$\begin{aligned} u(z,t) \in NE\Big (\big (r^p(t, z,\cdot ) + \langle \mu (t,z,\cdot ), \gamma ^p_u(t)\rangle \big )_{p \in \mathcal {P}} \Big ) \end{aligned}$$
or, equivalently,
$$\begin{aligned} \frac{\mathrm{d} \gamma _u}{\mathrm{d}t}(z,t) \in -NEP\Big (\big (r^p(t,z,\cdot ) + \langle \mu (t,z,\cdot ), \gamma ^p_u(t)\rangle \big )_{p \in \mathcal {P}} \Big ) \end{aligned}$$
where NE (resp. NEP) denotes the Nash equilibria (resp. Nash equilibria payoff) correspondence, which assigns to each normal-form game its set of Nash equilibria (resp. Nash equilibria payoffs).
In this paper, we will discuss games with a particular structure: There are only two states, one denoted \(z_0\) and the other denoted \(\overline{0}\), the latter of which is an absorbing state with payoff 0, i.e., \(r(\cdot , \overline{0}, \cdot ) \equiv \mu (z_0 \mid \cdot , \overline{0}, \cdot ) \equiv 0\). Clearly, as only the non-absorbing state \(z_0\) is of interest, we may drop reference to it and write \(\gamma _u(t),u(t),r(t,\cdot )\), etc., instead of \(\gamma _u(z_0,t),u(z_0,t),r(t,z_0,\cdot )\), etc., and we let \(\mu (t,\cdot ) \ge 0\) denote the transition rate out of \(z_0\), that is, we write \(\mu (t,\cdot )\) instead of \(\mu (\overline{0} \mid t, z_0, \cdot )\). Theorems 3.1 and 3.2 imply in this case that if \(u = (u^p)_{p \in \mathcal {P}}\) is a Markov equilibrium, then for a.e. \(t \in [0,T]\),
$$\begin{aligned} \frac{\mathrm{d}\gamma _u^p}{\mathrm{d}t}(t) = -\big ( r^p(t,u(t)) - \mu (t,u(t)) \gamma _u^p(t) \big ) \end{aligned}$$
(3.2)
and for a.e. \(t \in [0,T]\),
$$\begin{aligned} u(t)\text { is a Nash Equilibrium of } r(t,\cdot ) - \mu (t,\cdot ) \gamma _u(t) \end{aligned}$$
(3.3)
In (3.2) and (3.3), we see how, for both payoff and strategic purposes, one can cleanly separate, in classic dynamic-programming fashion, the components resulting from the present/running payoff \(r(t,\cdot )\) from the components resulting from the expected continuation/future payoff, \(\mu (t,\cdot ) \gamma _u(t)\). This separation will prove most useful along the way for the intuitions driving the constructions, in particular as we will force the continuation payoff vectors to be small (in norm) when compared to the running payoff vectors.
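To illustrate (3.2), the continuation payoff of a fixed player p in the two-state setting can be computed by integrating backward from the boundary condition \(\gamma (T) = 0\). This is a numerical sketch only (ours); `r_of(t)` and `mu_of(t)`, standing for \(r^p(t,u(t))\) and \(\mu (t,u(t))\) along a fixed Markov profile u, are placeholders.

    def continuation_payoff(r_of, mu_of, T=1.0, n=100_000):
        """Backward-Euler solve of (3.2):
        d(gamma)/dt = -(r(t, u(t)) - mu(t, u(t)) * gamma(t)),  gamma(T) = 0."""
        h = T / n
        gamma = 0.0
        out = [(T, gamma)]
        for k in range(n, 0, -1):
            t = k * h
            gamma += h * (r_of(t) - mu_of(t) * gamma)  # step from t back to t - h
            out.append((t - h, gamma))
        out.reverse()
        return out                                     # list of (t, gamma(t))

    # Sanity check: with r = 1 and mu = 0, gamma(t) = T - t, so gamma(0) = 1.
    approx = continuation_payoff(lambda t: 1.0, lambda t: 0.0)
    assert abs(approx[0][1] - 1.0) < 1e-3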

3.3 Erratic Functions

Spurred by the error in the previous work, as we have discussed in the introduction, the construction at hand requires a perturbation by a sufficiently ‘erratic’ function, in a sense we make precise here. Let \(\lambda \) denote the Lebesgue measure on \(\mathbb {R}\). The following definition can be found, e.g., in Saks [8, Sec VII.3]:
Definition 3.1
If \(E \subseteq \mathbb {R}\) is Lebesgue measurable, \(f:E \rightarrow \mathbb {R}\) is Lebesgue measurable, \(x \in E\), and \(L \in \mathbb {R}\), then f is approximately differentiable at x with approximate derivative L if, for all \(\varepsilon >0\),
$$\begin{aligned} \frac{1}{2\delta } \lambda \Big ((x - \delta , x + \delta ) \cap \big \{y \in E : \Big |\frac{f(y) - f(x)}{y - x} - L\Big | < \varepsilon \big \}\Big ) \rightarrow 1 \end{aligned}$$
as \(\delta \rightarrow 0\).
Clearly, if f is differentiable at x with \(f'(x)=L\), then f is approximately differentiable at x with approximate derivative L.
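To illustrate the gap between the two notions (an example of ours, not from [8]): the indicator function of the rationals, \(f = \chi _{\mathbb {Q}}\), is nowhere continuous, hence nowhere differentiable; yet at every irrational x, the difference quotient \(\frac{f(y)-f(x)}{y-x}\) vanishes for all irrational y, a set of full density, so f is approximately differentiable a.e. with approximate derivative 0.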
Definition 3.2
For \(E \subseteq \mathbb {R}\) Lebesgue measurable, we will call a Lebesgue-measurable \(f:E \rightarrow \mathbb {R}\) erratic if it is almost nowhere approximately differentiable.
The following is included in Theorem 3.3 of Saks [8, Sec VII.3]9:
Lemma 3.3
If \(f, g:[0,1] \rightarrow \mathbb {R}\) are Lebesgue measurable, f is approximately differentiable a.e., g is erratic, and \(E = \{\, x : f(x) = g(x) \,\}\), then \(\lambda (E) = 0\).
Berman [1] shows that, with probability one, the path of a Brownian motion is nowhere approximately differentiable, and in particular erratic; the path of a Brownian motion is well-known to be continuous with probability one. The existence of erratic continuous functions is also shown more directly in Jarník [3]; see also [7] and the references within.
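For illustration (this plays no role in the proofs), a Brownian path can be sampled as a cumulative sum of independent Gaussian increments and then rescaled into \((-\tfrac{1}{2},\tfrac{1}{2})\), as the function \(\varrho \) of Sect. 5 requires. Any finite sample is, of course, also consistent with smooth functions, so this illustrates rather than certifies erraticness; the grid size and the 0.49 rescaling factor are arbitrary choices of ours.

    import math, random

    def brownian_path(n=4096, seed=0):
        """Sample B(k/n), k = 0..n, via independent N(0, 1/n) increments."""
        rng = random.Random(seed)
        sd = math.sqrt(1.0 / n)
        b, path = 0.0, [0.0]
        for _ in range(n):
            b += rng.gauss(0.0, sd)
            path.append(b)
        return path

    # Rescale linearly into (-1/2, 1/2); a constant rescaling of an erratic
    # function is still erratic.
    path = brownian_path()
    m = max(abs(x) for x in path) + 1e-12
    rho_grid = [0.49 * x / m for x in path]   # grid values of a candidate rho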

4 The Example’s Stage Game

Our construction has three phases: (a) selecting four perturbations of a “base” game (Sect. 4.1); (b) specifying a rescaled version of the stage game (Sect. 4.2); and (c) assembling the stochastic game itself (Sect. 5.1).10

4.1 The Base Game

The base game G has four players, A, B, C and D. The pure strategies of player A are U and D, the pure strategies of B are L, M, and R, and players C and D are dummy players, because their sets of pure strategies are singletons. The payoffs of players A and B are shown in Table 1.
Table 1 The Payoffs to A and B (\(G^{A,B}\))

    A \ B      L         M         R
    U        (1, 1)    (1, 1)    (0, 0)
    D        (0, 0)    (1, 1)    (1, 1)
The Nash equilibria are the pure strategy profiles (U, L), (U, M), (D, M), and (D, R), as well as all “convex combinations” of successive pairs of elements of this list (a mechanical verification appears after Table 2). The payoffs to C and D, as a function of A and B’s actions, are shown in Table 2.
Table 2 The Payoffs to C and D (\(G^{C,D}\))

    A \ B      L           M           R
    U        (-1, 1)     (1, 1)      (0, 0)
    D        (0, 0)      (1, -1)     (-1, -1)
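As promised above, a mechanical check of the pure equilibria of Table 1 (a sketch of ours, not part of the paper's argument); the dictionaries transcribe Table 1 directly, and in Table 1 the two players' payoffs coincide:

    # Payoffs to A and B from Table 1, indexed by (A's action, B's action).
    GA = {('U', 'L'): 1, ('U', 'M'): 1, ('U', 'R'): 0,
          ('D', 'L'): 0, ('D', 'M'): 1, ('D', 'R'): 1}
    GB = dict(GA)
    ROWS, COLS = ('U', 'D'), ('L', 'M', 'R')

    def is_pure_nash(rA, cB):
        best_for_A = all(GA[(rA, cB)] >= GA[(r2, cB)] for r2 in ROWS)
        best_for_B = all(GB[(rA, cB)] >= GB[(rA, c2)] for c2 in COLS)
        return best_for_A and best_for_B

    print([p for p in GA if is_pure_nash(*p)])
    # -> [('U', 'L'), ('U', 'M'), ('D', 'M'), ('D', 'R')]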
We state the properties of G that figure in the subsequent analysis. For a mixed strategy profile x, let G(x) be the vector of expected payoffs.
Lemma 4.1
(a)
For each \((j,k) \in \{-1,1\}^2\), any neighborhood of G contains a game \(G_{j,k}\) whose unique Nash equilibrium x satisfies \(G^{C,D}(x) = (j,k)\).
 
(b)
For any equilibrium x of G, \(\Vert G^{C,D}(x)\Vert = 1\).
 
Proof
Obvious. \(\square \)
In view of (b) and the bounds on payoffs for C and D, the upper semicontinuity of the Nash equilibrium correspondence implies that there is an \(\eta _0 > 0\) such that
$$\begin{aligned} \tfrac{7}{8} \le \Vert G^{C,D}(x)\Vert \le 1 \end{aligned}$$
(4.1)
whenever x is an equilibrium of a game \(G'\) such that \(\Vert G' - G\Vert \le \eta _0\). (Note that the game in (4.1) is the original game G, but the profile x is the equilibrium of a perturbed game.) We fix such \(\eta _0>0\), and for each \((j,k) \in \{-1,1\}^2\) we fix such a perturbation \(G_{j,k}\) of G such that the unique Nash equilibrium x of \(G_{j,k}\) satisfies \(G^{C,D}(x) = (j,k)\). (The payoffs of A and B in \(G_{j,k}\) play no role in our analysis after Lemma 4.1 has been established.)

4.2 The Stage Game

Next we describe a second strategic form game; in our stochastic game there will be two states, one of which is absorbing with 0 payoff to all, and the other of which has running payoff that is a rescaling of this strategic form game.
The set of players is \(\mathcal {P}= \{ A,B,C,C',D,D',E,F \}\). As above, player A has the pure strategies U and D, and player B has the pure strategies L, M and R, but in this game players C and D have pure strategies 0 and 1. Players \(C'\) and \(D'\) also have pure strategies 0 and 1, and players E and F have pure strategies \(-1\) and 1. Pure and mixed strategy profiles will be denoted by
$$\begin{aligned} a = (a^A, a^B, a^C, a^{C'}, a^D, a^{D'}, a^E, a^{F}) \; \text {and} \; x = (x^A, x^B, x^C, x^{C'}, x^D, x^{D'}, x^E, x^{F}). \end{aligned}$$
The payoffs of this strategic form game depend on a parameter \(\varrho \in (-\tfrac{1}{2},\tfrac{1}{2})\). Let
$$\begin{aligned} \psi (a) = (\psi ^C(a^C),\psi ^D(a^D)) = (2 a^C - 1, 2 a^D - 1) \end{aligned}$$
and
$$\begin{aligned} \psi (x) = (\psi ^C(x^C),\psi ^D(x^D)) = (2 x^C[1] - 1, 2 x^D[1] - 1) \in [-1,1]^2 \end{aligned}$$
(4.2)
where \(x^C[1]\), \(x^D[1]\) denote the probabilities that these players play 1. The payoffs in the game \(g_1(\varrho ,\cdot )\) are:
$$\begin{aligned} \begin{aligned}&g_1^A(\varrho ,a) = G^A_{a^E,a^{F}}(a^A,a^B), \\&g_1^B(\varrho ,a) = G^B_{a^E,a^{F}}(a^A,a^B), \\&g_1^C(\varrho ,a) = {\left\{ \begin{array}{ll} -G^C(a^A,a^B) - \tfrac{1}{16}, &{} a^C = a^{C'}, \\ -G^C(a^A,a^B) + \tfrac{1}{16}, &{} a^C \ne a^{C'}, \end{array}\right. } \\&g_1^{C'}(\varrho ,a) = -g_1^C(\varrho ,a), \\&g_1^D(\varrho ,a) = {\left\{ \begin{array}{ll} -G^D(a^A,a^B) - \tfrac{1}{16}, &{} a^D = a^{D'}, \\ -G^D(a^A,a^B) + \tfrac{1}{16}, &{} a^D \ne a^{D'}, \end{array}\right. } \\&g_1^{D'}(\varrho ,a) = -g_1^D(\varrho ,a), \\&g_1^E(\varrho ,a) = a^E \cdot \langle (1,\varrho ), \psi (a) \rangle , \\&g_1^F(\varrho ,a) = a^{F} \cdot \langle (-\varrho ,1), \psi (a) \rangle . \\ \end{aligned} \end{aligned}$$
where \(\langle , \rangle \) denotes the standard inner product, and the games \(G_{\pm 1, \pm 1}\) were chosen at the end of Sect. 4.1. Observe that \((1,\varrho ) \bot (-\varrho ,1)\).
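The payoffs of C, \(C'\), D, \(D'\), E, and F in \(g_1\) are fully explicit (only A's and B's payoffs involve the abstract perturbations \(G_{j,k}\)), so they can be transcribed directly. A sketch of ours on pure action profiles; the profile is a dict whose keys 'Cp' and 'Dp' are our labels for \(C'\) and \(D'\), and GC, GD transcribe Table 2:

    # Table 2: payoffs to C and D as a function of (A's action, B's action).
    GC = {('U', 'L'): -1, ('U', 'M'): 1, ('U', 'R'): 0,
          ('D', 'L'): 0, ('D', 'M'): 1, ('D', 'R'): -1}
    GD = {('U', 'L'): 1, ('U', 'M'): 1, ('U', 'R'): 0,
          ('D', 'L'): 0, ('D', 'M'): -1, ('D', 'R'): -1}

    def g1_CDEF(rho, a):
        """Payoffs of C, C', D, D', E, F in g_1(rho, .) on a pure profile a;
        actions: a['C'], a['Cp'], a['D'], a['Dp'] in {0, 1}; a['E'], a['F'] in {-1, 1}."""
        psi = (2 * a['C'] - 1, 2 * a['D'] - 1)        # psi(a), eq. (4.2)
        ab = (a['A'], a['B'])
        gC = -GC[ab] + (1 / 16 if a['C'] != a['Cp'] else -1 / 16)
        gD = -GD[ab] + (1 / 16 if a['D'] != a['Dp'] else -1 / 16)
        gE = a['E'] * (psi[0] + rho * psi[1])         # a^E * <(1, rho), psi(a)>
        gF = a['F'] * (-rho * psi[0] + psi[1])        # a^F * <(-rho, 1), psi(a)>
        return {'C': gC, 'Cp': -gC, 'D': gD, 'Dp': -gD, 'E': gE, 'F': gF}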
In the stochastic game given in Sect. 5, \(\varrho (\cdot )\) will depend on time, and the transition rates are controlled by C, \(C'\), D, and \(D'\), so at each point in time the other players will only be concerned with maximizing their running payoffs, which are a rescaling of \(g_1(\varrho (t),\cdot )\). Players A and B are playing a perturbation of the game G, as described above.
The running payoff to \(C'\) is the negation of the running payoff to C, so C and \(C'\) will have opposite views concerning the desirability of the game continuing (as opposed to transitioning to the absorbing state with zero payoffs). Leaving aside the components of the stage game payoffs for C and \(C'\) that depend only on the behavior of A and B, the conflict between C and \(C'\) at time t is a zero-sum game that consists of matching pennies perturbed by these concerns about absorption to the state with payoff 0. These perturbations, i.e., these concerns, will be small enough that there is always a unique equilibrium, which is mixed. The conflict between D and \(D'\) is similar to the conflict between C and \(C'\), albeit with different payoffs, as they are affected by A and B in a different way.
The best responses of players E and F depend on the signs of the expectations of the inner products \(\langle (1,\varrho ), \psi (a) \rangle \) and \(\langle (-\varrho ,1), \psi (a) \rangle \), respectively. For \(\varrho \in (-\frac{1}{2},\frac{1}{2})\) and \(j,k = \pm 1\) let
$$\begin{aligned} \mathcal {D}^\varrho _{j,k}:=\{\, \psi \in \mathbb {R}^2 \mid j \cdot \langle (1,\varrho ), \psi \rangle> 0\text { and }k \cdot \langle (-\varrho ,1), \psi \rangle > 0 \,\}. \end{aligned}$$
(4.3)
Observe that \((1,\varrho ),(-\varrho ,1)\) are orthogonal, so the \(\mathcal {D}^\varrho _{j,k}\) are just the open quadrants of the plane under a certain rotation (see Fig. 1).
Set
$$\begin{aligned} \mathcal {D}^\varrho = \bigcup _{j,k = \pm 1} \mathcal {D}^\varrho _{j,k} \end{aligned}$$
(4.4)
As mentioned, in the stochastic game defined in Sect. 5.1, \(\varrho (\cdot )\) will be a function of the time \(t \in [0,1]\), and we will see that in any Markov equilibrium, for a.e. t, behavior at time t is characterized by a mixed strategy profile x such that \(\psi (x)\) defined in (4.2) lies in \(\mathcal {D}^{\varrho (t)}\), so that E and F play pure strategies, and consequently A and B are playing one of the perturbations \(G_{j,k}\) of G. In this sense, the behavior of A and B is well controlled.
The following lemma summarizes the properties of \(g_1\) needed going forward concerning ABEF; thereafter, we will only reference the payoffs of \(C,D,C',D'\).
Lemma 4.2
Let x be a mixed action profile in which ABEF are best-replying in \(g_1\), i.e., \(x^{A,B,E,F}\) is an equilibrium of \(g^{A,B,E,F}_1(\cdot , x^{C,D,C',D'})\). Then:
(a)
\(\tfrac{7}{8} \le \Vert G^{C,D}(x)\Vert \le 1\).
 
(b)
If \(\psi (x) \in \mathcal {D}^{\varrho }_{j,k}\) for \(j,k \in \{ \pm 1 \}\), then \(G^{C,D}(x) = (j,k)\).
 
Proof
\(x^{A,B}\) is an equilibrium of some game \(G'\) which is a convex combination of \((G_{j,k})_{j = \pm 1, k = \pm 1}\). Since \(\Vert G_{\pm 1, \pm 1} - G\Vert < \eta _0\), also \(\Vert G' - G\Vert < \eta _0\), so \(\tfrac{7}{8} \le \Vert G^{C,D}(x)\Vert \le 1\) (see end of Sect. 4.1), which yields Part (a).
For Part (b), observe that if \(\psi (x) \in \mathcal {D}^{\varrho }_{j,k}\), then \(j \cdot \langle (1, \varrho ), \psi (x) \rangle > 0\) and \(k \cdot \langle (-\varrho ,1), \psi (x) \rangle > 0\); from the payoffs of \(g_1\), we see that players EF play pure with \((a^E,a^F) = (j,k)\), so \(x^{A,B}\) is an equilibrium of \(G_{j,k}\), which in turn implies \(G^{C,D}(x) = (j,k)\). \(\square \)

4.3 Equilibrium in a Stage

To complement the function \(g_1\) already defined, define a payoff function \(g_2\), which depends on a parameter \(\omega = (\omega ^C,\omega ^D) \in \mathbb {R}^2\), in the following way:
$$\begin{aligned} g^p_2(\omega ,a) = \frac{1}{64}(a^C + a^{C'} + a^D + a^{D'}) \times \left\{ \begin{array}{ll} \omega ^p &{} \text {if } p = C,D\\ -\omega ^C &{} \text {if } p = C' \\ -\omega ^D &{} \text {if } p = D' \\ 0 &{} \text {if } p = A,B,E,F\\ \end{array}\right. \end{aligned}$$
(4.5)
(In particular, \(g^C_2 \equiv -g^{C'}_2\), \(g^D_2 \equiv -g^{D'}_2\).) We also define a payoff function \(g(\varrho ,\omega ,\cdot )\) as the sum of the two payoffs:
$$\begin{aligned} g(\varrho ,\omega ,\cdot ) := g_1(\varrho ,\cdot ) + g_2(\omega ,\cdot ) \end{aligned}$$
(4.6)
In the analysis conducted in Sect. 5.2 of the stochastic game we will present, the stage payoffs at time t will be a rescaling of \( g_1(\varrho ,\cdot )\), where \(\varrho \) will be time-dependent as well, and \(g_2(\omega ,\cdot )\) will be a rescaling of the continuation payoffs, where \(\omega ^{C,D} = \gamma ^{C,D}_u(t)\) for a candidate Markov equilibrium u; hence \(g(\varrho ,\omega ,\cdot )\) will encompass all the strategic considerations of the agents at each time.11
Recall the notation \(\mathcal {D}^\varrho _{j,k}\) given in (4.3). Equilibrium analysis for \(g(\varrho ,\omega ,\cdot )\) will yield Proposition 4.4, which will summarize the properties of the equilibria of \(g(\varrho ,\omega ,\cdot )\) needed later. En route to that proposition, we need the following lemma, which will play no role after Proposition 4.4 is established:
Lemma 4.3
Suppose that \(\varrho \in \mathbb {R}\), \(\omega = (\omega ^C,\omega ^D) \in \mathbb {R}^2\), with \(|\varrho | < \frac{1}{2}\), \(\Vert \omega \Vert < 2\), and that x is an equilibrium of \(g(\varrho , \omega ,\cdot )\).
(a)
\(\tfrac{7}{8} \le \Vert G^{C,D}(x)\Vert \le 1\).
 
(b)
If \(\omega ^{C,D} \in \mathcal {D}^{\varrho }_{j,k}\), then \(G^{C,D}(x) = (j,k)\).
 
Proof
Since \(g^{A,B,E,F} = g_1^{A,B,E,F}\), Lemma 4.2 applies to x, so (a) follows from Lemma 4.2(a). For part (b), we claim that
$$\begin{aligned} x^C[1] = \tfrac{1}{2} + \tfrac{1}{16}\omega ^C\text { and }x^D[1] = \tfrac{1}{2} + \tfrac{1}{16}\omega ^D \end{aligned}$$
Observe that \(g^{C,C'}(\varrho , \omega , x)\) is the sum of
$$\begin{aligned} (-G^C(x^A,x^B),G^C(x^A,x^B)) + \tfrac{1}{64} (x^D[1] + x^{D'}[0])\omega ^{C}\cdot (1,-1), \end{aligned}$$
which is unaffected by \(x^{C,C'}\), and \(\tfrac{1}{16}\) times the payoffs resulting from applying \(x^{C,C'}\) to the bimatrix game below.
    C \ C'                     1                                                      0
    1        \(\big (-1+\tfrac{1}{2}\omega ^C,\ 1-\tfrac{1}{2}\omega ^{C}\big )\)     \(\big (1+ \tfrac{1}{4} \omega ^C,\ -1- \tfrac{1}{4} \omega ^{C}\big )\)
    0        \(\big (1 + \tfrac{1}{4} \omega ^C,\ -1 -\tfrac{1}{4} \omega ^{C}\big )\)     \(\big (-1,\ 1\big )\)
(For example, the part of C’s payoff affected by C and \(C'\)’s behavior that accrues in the future is \((a^C + a^{C'}) \tfrac{1}{64} \omega ^C\).) Since \(|\omega ^C|< 2\) this bimatrix game has a unique equilibrium, which must be \(x^{C,C'}\). To see that \(x^C[1] = \tfrac{1}{2} + \tfrac{1}{16} \omega ^{C}\), one can simply compare the payoff differences for \(C'\). The result for \(x^D[1]\) follows by symmetry.
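For completeness, the payoff-difference comparison can be spelled out as follows (our elaboration). Since \(|\omega ^C| < 2\), no pure profile of the bimatrix game is an equilibrium, so \(C'\) mixes, and hence \(p = x^C[1]\) must equate \(C'\)'s expected payoffs from its two actions:
$$\begin{aligned} p\big (1-\tfrac{1}{2}\omega ^C\big ) + (1-p)\big (-1-\tfrac{1}{4}\omega ^C\big ) = p\big (-1-\tfrac{1}{4}\omega ^C\big ) + (1-p) \cdot 1, \end{aligned}$$
which simplifies to \(4p = 2 + \tfrac{1}{4}\omega ^C\), i.e., \(p = \tfrac{1}{2} + \tfrac{1}{16}\omega ^C\).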
Recalling the definition of \(\psi \) given in (4.2), it follows that \(\psi (x) = \frac{1}{8}\omega = \frac{1}{8}(\omega ^C, \omega ^D)\), so (b) follows from Lemma 4.2(b). \(\square \)
Proposition 4.4
Suppose that \(\varrho \in \mathbb {R}\), \(\omega =(\omega ^C,\omega ^D)\in \mathbb {R}^2\) with \(|\varrho | < \frac{1}{2}\), \(\Vert \omega \Vert < 2\), and that x is an equilibrium of \(g(\varrho , \omega ,\cdot )\).
1.
$$\begin{aligned} \tfrac{13}{16} \le \Vert g_1^{C,D}(\varrho ,x) \Vert \le \tfrac{17}{16} \end{aligned}$$
(4.7)
 
2.
If, furthermore, \(\omega \in \mathcal {D}^\varrho \), then
(a)
\(\tfrac{15}{16} \le |g_1^C(\varrho ,x)| \le \tfrac{17}{16}\) and \(\tfrac{15}{16} \le |g_1^D(\varrho ,x)| \le \tfrac{17}{16}\);
 
(b)
if \(|\omega ^C| \ge \tfrac{1}{2}|\omega ^D|\), then \(g_1^C(\varrho ,x) \cdot \omega ^C < 0\);
 
(c)
if \(|\omega ^D| \ge \tfrac{1}{2}|\omega ^C|\), then \(g_1^D(\varrho ,x) \cdot \omega ^D < 0\).
 
 
Proof
From the definition of \(g_1\),
$$\begin{aligned} \left\| g_1^{C,D}(\varrho ,x) - (-G^{C,D}(x))\right\| \le \tfrac{1}{16} \end{aligned}$$
(4.8)
so the first part follows from Lemma 4.3(a). If \(\omega ^{C,D} \in \mathcal {D}^{\varrho }\), then Lemma 4.3(b) together with (4.8) yields (a) of Part 2. If say \(\omega \in \mathcal {D}^\varrho \) and \(\omega ^C \ge \tfrac{1}{2}|\omega ^D|\), since \(|\varrho |<\frac{1}{2}\),
$$\begin{aligned} \langle (1,\varrho ), \omega \rangle = \omega ^C + \varrho \cdot \omega ^D \ge \omega ^C - \frac{1}{2} |\omega ^D| \ge 0 \end{aligned}$$
we have that \(\langle (1,\varrho ), \omega \rangle \) is non-negative, and since \(\omega \in \mathcal {D}^\varrho \) it is non-zero, so \(\omega \in \mathcal {D}^\varrho _{1, \pm 1}\). Therefore, \(G^C(x) = 1\) from Lemma 4.3(b). From (4.8), \(g_1^{C}(\varrho ,x) < 0\), hence (b) follows for the case \(\omega ^C \ge \tfrac{1}{2}|\omega ^D|\); the case \(\omega ^C \le -\tfrac{1}{2}|\omega ^D|\), as well as (c), follow symmetrically. \(\square \)

5 The Stochastic Game and Analysis

This section presents the example (or class of examples, insofar as there is a given function that is a parameter) of a continuous-time stochastic game of fixed duration not possessing a Markov equilibrium.

5.1 The Stochastic Game

We now specify the stochastic game. Let \(\varrho :[0,1] \rightarrow (-\tfrac{1}{2},\tfrac{1}{2})\) be a Borel function. The stochastic game \({\tilde{\Gamma }}= {\tilde{\Gamma }}_\varrho \) is as follows:
  • The players are \(\mathcal {P}= \{ A,B,E,F,C,C',D,D' \}\) as in Sect. 4.2, along with the actions sets as given there.
  • The game is played on the unit time interval, [0, 1].
  • The set of states is \(Z = \{ z_0, \overline{0} \}\), where \(\overline{0}\) is an absorbing state of payoff 0, i.e., \(r(\cdot , \overline{0}, \cdot ) \equiv \mu (z_0 \mid \cdot , \overline{0}, \cdot ) \equiv 0\). As discussed in Sect. 3.2, we will often drop reference to \(z_0\) as it is the only non-trivial state.
  • The payoff function in \(z_0\) is
    $$\begin{aligned} r(t,\cdot ) (:= r(t, z_0,\cdot )) := (1-t)g_1(\varrho (t),\cdot ) \end{aligned}$$
    where \(g_1\) is as in Sect. 4.2.
  • The transition rate \(\mu (t, \cdot ) := \mu (\overline{0} \mid t, z_0, \cdot )\ge 0\) (the intensity of the flow out of \(z_0\)) is determined by the actions of players \(C,D,C',D'\), and is given by:
    $$\begin{aligned} \mu (t, \cdot ) = \frac{1}{64}(1-t)((1-a^C) + (1-a^{C'}) + (1-a^D) + (1-a^{D'})) \end{aligned}$$
    (5.1)
    where recall that the actions \(a^C,a^{C'},a^D,a^{D'}\) are in the action space \(\{0,1\}\).
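The transition rate (5.1) is explicit and transcribes directly; the running payoff is then the rescaling \(r(t,\cdot ) = (1-t)g_1(\varrho (t),\cdot )\), with \(g_1\) as in Sect. 4.2. A sketch of ours; the action-profile convention (a dict with keys 'C', 'Cp', 'D', 'Dp' for C, \(C'\), D, \(D'\)) follows the \(g_1\) sketch in Sect. 4.2.

    def mu_rate(t, a):
        """Transition rate out of z_0, eq. (5.1); the four actions are in {0, 1}."""
        return (1 - t) * sum(1 - a[p] for p in ('C', 'Cp', 'D', 'Dp')) / 64.0

    # mu_rate is largest, (1 - t) / 16, when C, C', D, D' all play 0, and zero when
    # all play 1: these four players jointly control absorption, and the factor
    # (1 - t) makes both payoffs and transition rates shrink toward the horizon.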
Recall the notion of an erratic function introduced in Definition 3.2. We now state the main step in the argument:
Proposition 5.1
Suppose that \(\varrho :[0,1] \rightarrow (-\tfrac{1}{2},\tfrac{1}{2})\) is an erratic function. Then the game \({\tilde{\Gamma }}_\varrho \) does not possess a Markov equilibrium.
To see the key intuition underlying the construction, suppose that \(u = (u^p)_{p \in \mathcal {P}}\) is a Markov equilibrium of \({\tilde{\Gamma }}_\varrho \). The optimality criteria recalled in Sect. 3.2, together with the fairly low transition rates, will show that at a.e. time t agents are playing an equilibrium of a game which is close to the game \(g_1(\varrho (t),\cdot )\). Since G is by far the largest component of this payoff for ABCD, at a.e. time t these agents are getting a payoff close to an equilibrium payoff of G. Hence, \(\gamma _u^{C,D}\) is absolutely continuous and with derivative not far from an equilibrium payoff \(G^{C,D}\), which is non-zero.12
Because \(\varrho \) is erratic, for a.e. t such that \(\gamma _u^{C,D}(t) \ne (0,0)\), the best responses of E and F are pure, leading the perturbation of the base game to be one of the \(G_{j,k}\) whose equilibrium pushes the vector of future payoffs of C and D away from the origin in \(\mathbb {R}^2\) as we go forward in time, which is to say that the derivative of \(s \mapsto \Vert \gamma _u^{C,D}(s)\Vert _2\) is positive at t, for almost all t, where \(\Vert \cdot \Vert _2\) is the Euclidean norm. Since \(\Vert \gamma _u^{C,D}\Vert _2\) is absolutely continuous and \(\gamma _u^{C,D}(1) = (0,0)\), this is impossible, which is the desired contradiction.
The technical machinery which drives the precise derivations of these results is, in addition to the optimality criteria recalled in Sect. 3.2, the equilibrium analysis of the perturbations of the stage payoff, as summarized in Proposition 4.4.
In particular, combining Proposition 5.1 with the discussion in Sect. 3.3 of the existence of erratic continuous functions yields the following refinement of Theorem 2.1:
Theorem 5.1
There exists a stochastic game of the form \({\tilde{\Gamma }}_\varrho \), for continuous \(\varrho \), which does not possess a Markov equilibrium.

5.2 Proof: Preliminaries

We set about proving Proposition 5.1, beginning with preliminaries in this section, and completing the proof in the next section. Let \(\varrho : [0,1] \rightarrow (-\tfrac{1}{2},\tfrac{1}{2})\) be erratic. By way of contradiction, we suppose that \(u = (u^p)_{p \in \mathcal {P}}\) is a Markov equilibrium of \({\tilde{\Gamma }}_\varrho \). For brevity, denote \(\gamma \) instead of \(\gamma _u\).
Recall the definitions of \(g_2\) and g given in (4.5) and (4.6). Note that using (5.1) and (4.5), and the fact that \(\gamma ^{C'} = -\gamma ^C\), \(\gamma ^{D'} = -\gamma ^D\),
$$\begin{aligned} (1-t) g_2^p(\gamma ^{C,D}(t),\cdot ) = [\frac{1}{16}(1-t) - \mu (t,\cdot ) ]\gamma ^p(t),~p = C,C',D,D' \end{aligned}$$
(5.2)
By (5.2), and since \(r(t,\cdot ) = (1-t)g_1(\varrho (t),\cdot )\), it holds for each \(p \in \mathcal {P}\),
$$\begin{aligned} r^p(t,\cdot ) - \mu (t, \cdot ) \gamma ^p(t)&= (1-t)\big [ g^p_1(\varrho (t),\cdot ) + g^p_2(\gamma ^{C,D}(t),\cdot )\big ]\nonumber \\&\quad + \left\{ \begin{array}{ll} -\frac{1}{16}(1-t)\gamma ^p(t) &{} \text {if } p = C,C',D,D'\\ -\mu (t,\cdot ) \gamma ^p(t) &{} \text {if } p = A,B,E,F \end{array}\right. \end{aligned}$$
(5.3)
(recall \(g_2^p \equiv 0\) for \(p=A,B,E,F\)). Hence, the optimality criteria (3.2) and (3.3) presented in Sect. 3.2 give, along with the facts that \(g = g_1 + g_2\) and that players ABEF do not affect \(\mu (t,\cdot )\):
Proposition 5.2
For a.e. \(t \in (0,1)\), we have:
  • For each player \(p = C,C',D,D'\),
    $$\begin{aligned} \frac{\mathrm{d} \gamma ^p}{\mathrm{d}t}(t) = -(1-t)\big [ g^p_1(\varrho (t),u(t)) + g^p_2(\gamma ^{C,D}(t),u(t)) - \frac{1}{16}\gamma ^p(t) \big ] \end{aligned}$$
    (5.4)
  • u(t) is an equilibrium of \(g(\varrho (t), \gamma ^{C,D}(t),\cdot )\).
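As an aside, the bookkeeping identity (5.2) behind this proposition can be confirmed mechanically. A minimal sketch of ours, checking the player-C component; the sampled values of t, \(\gamma ^C\), and the actions are arbitrary test data:

    import random

    def g2_C(omega_C, a):                 # eq. (4.5), player C's component
        return (sum(a) / 64.0) * omega_C

    def mu_rate(t, a):                    # eq. (5.1); a lists the actions of C, C', D, D'
        return (1 - t) * sum(1 - x for x in a) / 64.0

    rng = random.Random(1)
    for _ in range(1000):
        t = rng.random()
        gamma_C = rng.uniform(-2.0, 2.0)
        a = [rng.randint(0, 1) for _ in range(4)]
        lhs = (1 - t) * g2_C(gamma_C, a)
        rhs = ((1 - t) / 16.0 - mu_rate(t, a)) * gamma_C
        assert abs(lhs - rhs) < 1e-12     # identity (5.2) holds exactly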

5.3 Proof: Equilibrium Over Time

Lemma 5.3
There is \(t_0 \in (0,1)\) such that for a.e. \(t \in (t_0,1)\):
  • $$\begin{aligned} \left\| \frac{\mathrm{d} \gamma ^{C,D}}{\mathrm{d}t}(t) +(1-t) g^{C,D}_1(\varrho (t),u(t)) \right\| \le \frac{1}{8}(1-t) \end{aligned}$$
    (5.5)
  • $$\begin{aligned} (1-t)\frac{11}{16} \le \left\| \frac{\mathrm{d} \gamma ^{C,D}}{\mathrm{d}t}(t)\right\| \le (1-t)\frac{19}{16} \end{aligned}$$
    (5.6)
Proof
Observe
$$\begin{aligned} \Vert r(t,\cdot )\Vert \le (1-t) \cdot \max _{|\varrho | \le \frac{1}{2}} \Vert g_1(\varrho ,\cdot )\Vert \end{aligned}$$
(5.7)
Hence, for all \(t \in [0,1]\),
$$\begin{aligned} \Vert g_2(\gamma ^{C,D}(t),\cdot )\Vert \le \frac{1}{16}\Vert \gamma ^{C,D}(t)\Vert \le \frac{1}{16} \int _t^1 \Vert r(s,\cdot )\Vert \mathrm{d}s \le \frac{1}{32}(1-t)^2 \cdot \max _{|\varrho | \le \frac{1}{2}} \Vert g_1(\varrho ,\cdot )\Vert \end{aligned}$$
Therefore, (5.5) follows, for \(t_0\) close enough to 1, from (5.4) of Proposition 5.2. Now, by the above calculation, for \(t_0\) close enough to 1, \(\Vert \gamma ^{C,D}(t)\Vert < 2\) for all \(t \in (t_0,1)\). Hence, (5.6) follows from (5.5) and by (4.7) of Proposition 4.4. \(\square \)
Henceforth, fix some such \(t_0\) as in Lemma 5.3.
Lemma 5.4
For a.e. \(t \in (t_0,1)\):
(a)
\(\gamma ^{C,D}(t) \ne 0\).
 
(b)
\(\gamma ^{C,D}(t) \in \mathcal {D}^{\varrho (t)}\), where \(\mathcal {D}^{\varrho }\) was defined in (4.4).
 
(c)
If \(|\gamma ^C(t)| \ge \tfrac{1}{2}|\gamma ^D(t)|\), then \(\frac{\mathrm{d} \gamma ^C}{\mathrm{d}t}(t) \cdot \gamma ^C(t) \ge \tfrac{13}{16}(1-t)|\gamma ^C(t)|\).
 
(d)
If \(|\gamma ^D(t)| \ge \tfrac{1}{2}|\gamma ^C(t)|\), then \(\frac{\mathrm{d} \gamma ^D}{\mathrm{d}t}(t) \cdot \gamma ^D(t) \ge \tfrac{13}{16}(1-t)|\gamma ^D(t)|\).
 
The proof13 follows from Proposition 4.4, Lemma 5.3 and our irregularity assumptions on the function \(\varrho \):
Proof
(a)
This follows from (5.6).
 
(b)
Define \(\eta :[0,1] \rightarrow \mathbb {R}^2\) by \(\eta (t) = \frac{\gamma ^{C,D}(t)}{\Vert \gamma ^{C,D}(t)\Vert }\). This is a.e. defined and a.e. differentiable, since \(\gamma ^{C,D}\) is Lipschitz and therefore both the denominator and numerator are Lipschitz, hence a.e. differentiable by Rademacher’s theorem (e.g., Federer [2, Thm. 3.1.6]), and the denominator is a.e. non-zero by (a). Clearly, \(\eta (t) \in \mathcal {D}^{\varrho (t)}\) if and only if \(\gamma ^{C,D}(t) \in \mathcal {D}^{\varrho (t)}\). For a.e. t, the requirement \(\eta (t) \notin \mathcal {D}^{\varrho (t)}\) is equivalent (because \(\Vert \eta (\cdot )\Vert \equiv 1\) and \(\Vert \varrho \Vert < \tfrac{1}{2}\)) to \(\eta (t) \in \{ \pm (-\varrho (t),1),\pm (1, \varrho (t))\}\). Due to the assumed irregularity of \(\varrho (\cdot )\) and Lemma 3.3, \(\eta ^C(t) \ne \pm \varrho (t)\) and \(\eta ^D(t) \ne \pm \varrho (t)\) for almost all t.
 
(c)
In view of (b), we may assume that \(\gamma ^{C,D}(t) \in \mathcal {D}^{\varrho (t)}\), so
$$\begin{aligned} |(1-t)g^{C}_1(\varrho (t),u(t)) + \frac{\mathrm{d}\gamma ^C}{\mathrm{d}t}(t)|&\le \tfrac{1}{8}(1-t) \\ g^{C}_1(\varrho (t),u(t)) \cdot \gamma ^C(t)&< 0 \\ |g^{C}_1(\varrho (t),u(t))|&\ge \tfrac{15}{16} \end{aligned}$$
(These inequalities follow from (5.5), Proposition 4.4 Part 2(b), and Proposition 4.4 Part 2(a), respectively.) Applying these in the order that they are given,
$$\begin{aligned} \frac{\mathrm{d}\gamma ^C}{\mathrm{d}t}(t) \cdot \gamma ^C(t)&\ge -(1-t)g^{C}_1(\varrho (t),u(t)) \cdot \gamma ^C(t) - \frac{1}{8}(1-t)|\gamma ^C(t)| \\&= |(1-t)g^{C}_1(\varrho (t),u(t)) \cdot \gamma ^C(t)| - \frac{1}{8}(1-t)|\gamma ^C(t)| \\&\ge \frac{15}{16}(1-t)|\gamma ^C(t) | - \frac{1}{8}(1-t)|\gamma ^C(t)| = \frac{13}{16}(1-t)|\gamma ^C(t)| \end{aligned}$$
 
(d)
By symmetry, the proof of (c) also establishes (d). \(\square \)
 
Define
$$\begin{aligned} J(t) = \frac{1}{2}(\gamma ^C(t))^2 + \frac{1}{2}(\gamma ^D(t))^2 \end{aligned}$$
Then \(J \ge 0\), \(J(1)=0\), and J is absolutely continuous. We claim:
Proposition 5.5
\(J' > 0\) a.e. in \((t_0,1)\).
Given the proposition,14 we have \(0 = J(1) > J(t_0) \ge 0\), a contradiction to the existence of a Markov equilibrium \(u(\cdot )\).
Proof
Let \(t \in (t_0,1)\) be such that all the properties of Lemma 5.4 hold. To simplify notation, we drop the argument t. Denote \(\delta = \frac{\mathrm{d}\gamma ^{C,D}}{\mathrm{d}t}\). We have \(J' = \gamma ^C \cdot \delta ^C + \gamma ^D \cdot \delta ^D\). Either
$$\begin{aligned} |\gamma ^C| \ge \tfrac{1}{2}|\gamma ^D| \quad \text {and hence} \quad \delta ^C \cdot \gamma ^C \ge \tfrac{13}{16}(1-t)|\gamma ^C| \end{aligned}$$
or
$$\begin{aligned} |\gamma ^D| \ge \tfrac{1}{2}|\gamma ^C| \quad \text {and hence} \quad \delta ^D \cdot \gamma ^D \ge \tfrac{13}{16}(1-t)|\gamma ^D|. \end{aligned}$$
If both hold, then
$$\begin{aligned} \delta ^C \cdot \gamma ^C+ \delta ^D \cdot \gamma ^D \ge \tfrac{13}{16}(1-t)\cdot (|\gamma ^C| + |\gamma ^D|) > 0. \end{aligned}$$
(The strict inequality is from Lemma 5.4(a).) Therefore, we may suppose that one of these holds, say the first without loss of generality, and the other does not, so \(2 |\gamma ^D| < |\gamma ^C|\). By (5.6), \(|\delta ^D| \le \tfrac{19}{16}(1-t)\), so
$$\begin{aligned} \delta ^C \cdot \gamma ^C + \gamma ^D \cdot \delta ^D&\ge \tfrac{13}{16}(1-t)|\gamma ^C| - |\gamma ^D| \cdot |\delta ^D| \ge (1-t)\cdot (\tfrac{13}{16}|\gamma ^C| - \tfrac{19}{16}|\gamma ^D|) \\&> (1-t)\cdot (\tfrac{26}{16}|\gamma ^D| - \tfrac{19}{16}|\gamma ^D|) = \tfrac{7}{16}(1-t)|\gamma ^D|\ge 0 \end{aligned}$$
\(\square \)
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Footnotes
1
That paper presents two examples; in the first, transitions are deterministic, and to the best of our knowledge, it suffers from no errors.
 
2
I am grateful to an anonymous referee for suggesting the terminology.
 
3
We do remark that one does not need, it turns out, a non-simply connected equilibrium component, although one does require a component which is a continuum with additional properties; see [6] for discussion.
 
4
In Levy and McLennan [6], this is a state-dependent perturbation.
 
5
In Levy [5, Sec. 2], we first present a model in which the payoffs and transitions do not depend on time, but in Sec. 9 there, it is remarked that the model and all results generalize immediately. Zachrisson [9] similarly works only with stationary payoffs and transitions.
 
6
We note an additional typo in the middle term of Equation (3.1) in Levy [5, Sec. 3]; \(E^z_u\) there should be \(E^z_{u_t}\), as it is written here. The evaluation is written correctly on the right side of that equation in terms of the transition matrix.
 
7
The incorrect example of Levy [5, Sec. 6] had payoffs and transitions independent of time.
 
8
Like for most of that paper, these results are stated for payoffs and transitions which do not depend on time, but as remarked in Sec. 9 there, these results generalize immediately when this stationarity is dropped; the proofs remain precisely the same.
 
9
A proof is sketched in Footnote 6 of Levy and McLennan [6, p. 1245].
 
10
Sections 4.1 and 4.2 closely follow Sections 3.2 and 3.3 of Levy and McLennan [6].
 
11
This section from this point follows Section 4.1 of Levy and McLennan [6].
 
12
Indeed, an essential feature of the construction is that G does not have any equilibria that give expected utility zero to both C and D, but nonetheless the origin is in the convex hull of the set of pairs of expected payoffs for C and D induced by the equilibria of G. The reader is referred to the discussion on this point on [6, p. 1244].
 
13
The proof is very similar to the proof of Lemma 4.9 of Levy and McLennan [6], with \(\gamma ^{C,D}\) replacing both W and \(\omega \) there, and \(\frac{\mathrm{d} \gamma ^p}{\mathrm{d}t}\) replacing V.
 
14
The proof of the proposition is very similar to the proof of Lemma 4.7 of Levy and McLennan [6], with \(\gamma ^{C,D}\) replacing \(W^{C,D}\), and the inequalities/signs reversed.
 
Literature
1. Berman SM (1970) Gaussian processes with stationary increments: local times and sample function properties. Ann Math Stat 41:1173–1396
2. Federer H (1969) Geometric measure theory. Die Grundlehren der mathematischen Wissenschaften in Einzeldarstellungen. Springer, Berlin
3. Jarník V (1934) Sur la dérivabilité des fonctions continues. Publications de la Faculté des Sciences de l’Université Charles 129:9
4. Levy Y (2013) Discounted stochastic games with no stationary Nash equilibrium: two examples. Econometrica 81(5):1973–2007
5. Levy Y (2013) Continuous-time stochastic games of fixed duration. Dyn Games Appl 3(2):279–312. https://doi.org/10.1007/s13235-012-0067-2
6. Levy YJ, McLennan A (2015) Corrigendum to “Discounted stochastic games with no stationary Nash equilibrium: two examples”. Econometrica 83(3):1237–1252
7. Preiss D, Zajíček L (2000) On Dini and approximate Dini derivatives of typical continuous functions. Real Anal Exchange 26:401–412
8. Saks S (1937) The theory of the integral. Monografie Matematyczne, vol 7, 2nd edn. G.E. Stechert and Co., New York
9. Zachrisson LE (1964) Markov games. In: Dresher M, Shapley LS, Tucker AW (eds) Advances in game theory. Princeton University Press, Princeton, pp 211–253