1 Introduction

A common assumption in Markov decision processes and stochastic games is that the decision makers' preferences are represented by an overall utility defined through an expectation operator with respect to the current information. More precisely, if u is an instantaneous utility of the agent and \(c_t\) is the consumption level in period t, then the discounted lifetime utility \(V_t\) from period t onwards is defined in a recursive way as

$$\begin{aligned} V_t=u(c_t)+\beta E_t V_{t+1}, \end{aligned}$$

where \(\beta \in (0,1)\) is a discount factor and \(E_t\) is the expectation operator with respect to the information in period t. However, taking only an expectation of \(V_{t+1}\) means that the agent is risk neutral to future discounted utility. In real life, this assumption is very often violated. For example, in an optimal growth model, the agent may have a higher risk aversion, which generates precautionary saving. Therefore, we propose to equip the agent with a constant absolute risk aversion coefficient, say \(\gamma >0,\) and assume that he/she uses the entropic risk measure, known also as certainty equivalent for the exponential function. In other words, the lifetime utility of the agent is defined now as

$$\begin{aligned} \tilde{V}_t=u(c_t)-\frac{\beta }{\gamma }\ln E_t\exp \{-\gamma \tilde{V}_{t+1}\}. \end{aligned}$$
(1)

Within this framework, the agent is risk averse in future utility \(\tilde{V}_{t+1},\) in addition to being risk averse in future consumption levels \(c_{t+1}, c_{t+2},\ldots .\) The latter risk attitude is reflected in the concave function u of the agent. According to the properties of the entropic risk measure listed in Sect. 3, the agent takes into account not only the expected value of the future lifetime utility, but also all further moments with appropriate weights (see also Section 3.4 in Bäuerle and Rieder (2011)). These preferences have drawn the attention of many authors. For instance, Hansen and Sargent (1995) applied them to a linear quadratic Gaussian control model, and Weil (1993) used them to examine precautionary savings and the permanent income hypothesis. Moreover, these preferences have found applications in problems of Pareto optimal allocations (Anderson 2005), in the study of Markov decision processes (Asienkiewicz and Jaśkiewicz 2017) and in a one-sector optimal growth model with an unbounded felicity function (Bäuerle and Jaśkiewicz 2018). As argued by Hansen and Sargent (1995), the preferences in (1) are also attractive because they can be viewed as robustness preferences. In this context, \(\gamma \) denotes the degree of robustness of the agent. This fact is a consequence of the robust representation of the entropic risk measure via the relative entropy as a penalty function, see Chapter 4 in Föllmer and Schied (2001).

In this paper, we study a strategic version of the discrete-time one-sector optimal growth model. Specifically, we deal with two players who own a natural resource and consume a certain amount of the available stock in each time period. We assume that each player possesses the same risk coefficient \(\gamma \) and the same felicity function. Moreover, each player defines, using the aforementioned risk measure, his non-expected discounted utility. Our objective is to prove the existence of a symmetric Nash equilibrium in non-randomized strategies. In their seminal paper, Levhari and Mirman (1980) studied such a strategic optimal growth model with the same logarithmic felicity function for each agent and a deterministic Cobb–Douglas production function. Their model was extended in Sundaram (1989) to arbitrary production and felicity functions. Further generalizations to stochastic production functions were reported in Majumdar and Sundaram (1991), Dutta and Sundaram (1992) and Jaśkiewicz and Nowak (2018a). Other models of capital accumulation or resource extraction games with risk-neutral agents can be found in Jaśkiewicz and Nowak (2018b) and Balbus et al. (2016). Moreover, it is worth mentioning that there exist some iterative procedures (under special conditions) for finding Nash equilibria in such games, which were developed in Balbus and Nowak (2004) and Szajowski (2006). Finally, we wish to stress that games with risk-sensitive players have already been examined in the literature, but not with non-expected discounted payoffs; see, for instance, Bäuerle and Rieder (2017), Jaśkiewicz and Nowak (2014), Klompstra (2000) and the references cited therein. Namely, Bäuerle and Rieder (2017) dealt with zero-sum stochastic games, where the players take the expectation of the exponential function of accumulated discounted payoffs. Such an approach leads to a non-stationary model. Klompstra (2000), on the other hand, studied Nash equilibria for a two-person non-zero-sum game with a quadratic-exponential cost criterion, whilst in Jaśkiewicz and Nowak (2014) the authors treated intergenerational models with risk-sensitive generations. In addition, Başar (1999) dealt with risk-sensitive players playing a differential game. Therefore, to the best of our knowledge, this work is the first to study recursive utilities in dynamic games.

To show the existence of an equilibrium, we need to impose some conditions on the felicity function and the transition probabilities. Our assumptions are borrowed from Balbus et al. (2015a) and Jaśkiewicz and Nowak (2018a). Namely, we present two alternative sets of conditions. We assume either non-atomic transition probabilities or transition probabilities that allow atoms and embrace the purely deterministic case. These assumptions allow us to prove the existence of an equilibrium in the class of stationary Markov strategies.

The paper is organized as follows. Section 2 is devoted to a model description. In Sect. 3, we carefully define a non-expected discounted utility in the infinite time horizon. The assumptions and the main result are formulated in Sect. 4, whereas Sect. 5 contains the proof. Examples are placed in Sect. 6.

2 The model

Put \(\mathbb {R}_+=[0,+\infty ).\) Consider a two-person stochastic game with the following objects:

  1. (i)

    \(S= \mathbb {R}_+\) is the state space, i.e., the space of available resource stocks;

  2. (ii)

    \(A_i(s)=[0,s]\) is the space of actions available for player \(i \in \{1,2\}\), when the current resource stock is \(s\in S\);

  3. (iii)

    \(u_i:S\times S\times S\rightarrow \mathbb {R}_+\) is a felicity function for player \(i \in \{1,2\}\); we assume that for every \(s\in S\), \(a\in A_1(s)\) and \(b\in A_2(s),\) \(u_1(s,a,b)=u(a)\) and \(u_2(s,a,b)=u(b)\), where \(u:S\rightarrow \mathbb {R}_+\) is a temporal utility for both agents; note that the utility for player 1 depends only on his/her consumption; the same remark applies to agent 2;

  4. (iv)

    \(q(\cdot | s-a-b)\) is a Borel measurable transition probability on S for the given feasible pair of actions \((a,b) \in A_1(s)\times A_2(s),\) \(a+b\le s,\) and the current resource stock \(s\in S;\)

  5. (v)

    we define

    $$\begin{aligned} D:=\left\{ (s,a,b)\in S\times S\times S: a+b \le s \right\} \end{aligned}$$

    and

    $$\begin{aligned} D(s):=\left\{ (a,b)\in S\times S:(s,a,b)\in D\right\} ; \end{aligned}$$
  6. (vi)

    \(\gamma > 0\) is a risk coefficient;

  7. (vii)

    \(\beta \in (0,1)\) is a discount factor.

We assume that \(u(s) \le d\) for every \(s\in S\) and some constant \(d >0\). In each period, both agents observe the state \(s \in S\) and simultaneously choose their actions \((a,b)\in A_1(s)\times A_2(s)\) provided that the actions are feasible, i.e., \((a,b) \in D(s)\). Immediately, player 1 enjoys the utility u(a), whereas player 2 enjoys u(b). The next state of the game \(s'\) has the distribution \(q(\cdot |s-a-b)\). If the pair of actions \((a,b)\) is infeasible in state s, then the players choose their actions again. Therefore, we restrict our attention only to strategies generating feasible action pairs during the play. Next, we define a history of the game as follows:

$$\begin{aligned} h_t= \left\{ \begin{array}{ll} s_1,&{} \quad t=1\\ (s_1,a_1,b_1,s_2,a_2,b_2,s_3,\ldots ,s_t),&{} \quad t \ge 2,\\ \end{array}\right. \end{aligned}$$

where \(s_k \in S\) for all \(k=1,\ldots ,t\) and \(a_k+b_k \le s_k \) for all \(k=1,\ldots ,t-1\). Let \(H_t\) be the set of all histories up to the t-th step. We endow \(H_t\) with the natural product topology. We shall consider only pure strategies.

Definition 1

A strategy \(\pi \) for player 1 is a sequence \((\pi _{t})_{t=1}^{\infty }\) such that each \(\pi _{t}\) is a Borel measurable mapping from the history space to the space of actions available to player 1. The set of all strategies for player 1 is denoted by \(\varPi \). Similarly, we define a strategy \(\sigma \) for player 2 and denote the set of all his/her strategies by \(\varSigma \).

Furthermore, we introduce the following set of functions

$$\begin{aligned} F_i:=\{ \phi :S\rightarrow S: \phi (s) \in [0,s] \text{ for } \text{ every } s\in S \text{ and } \phi \text{ is } \text{ Borel } \text{ measurable } \text{ function }\}. \end{aligned}$$

Definition 2

A stationary Markov strategy for player 1 is a sequence \((\pi _{t})_{t=1}^{\infty }\) such that \(\pi _{t}=\phi \) for all \(t\in \mathbb {N}\) and some \(\phi \in F_1\). Analogously, we define a stationary strategy for player 2 as a sequence of \((\sigma _{t})_{t=1}^{\infty }\) such that \(\sigma _{t}=\hat{\phi }\) for all \(t\in \mathbb {N}\) and some \(\hat{\phi }\in F_2\). Further, we shall identify a stationary Markov strategy with the element of the sequence.
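The way a play unfolds under a pair of stationary Markov strategies can be illustrated by a short simulation. The following Python sketch is only an illustration under stated assumptions: the transition law (a Cobb–Douglas-type output \(y^{\alpha }\) multiplied by a uniform shock), the strategies \(\phi (s)=s/4\), and the names `simulate` and `alpha` are hypothetical and are not taken from the model above.

```python
import numpy as np

def simulate(phi1, phi2, s1, horizon, rng, alpha=0.5):
    """Simulate (s_t, a_t, b_t) under stationary Markov strategies phi1, phi2."""
    s = s1
    path = []
    for _ in range(horizon):
        a, b = phi1(s), phi2(s)
        # only feasible action pairs (a, b) in D(s) are allowed during the play
        if a < 0 or b < 0 or a + b > s:
            raise ValueError("infeasible action pair")
        y = max(s - a - b, 0.0)            # stock left after consumption
        path.append((s, a, b))
        # hypothetical kernel q(.|y): output y**alpha times a uniform shock
        s = (y ** alpha) * rng.uniform(0.5, 1.5)
    return path

rng = np.random.default_rng(0)
path = simulate(lambda s: s / 4, lambda s: s / 4, 1.0, 20, rng)
for s, a, b in path[:3]:
    print(f"stock={s:.4f}  a={a:.4f}  b={b:.4f}")
```

Since both strategies depend only on the current stock, the simulated history is fully determined by the initial state and the realized shocks.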

3 Non-expected \(\beta \)-discounted utility function

In this section, we define the non-expected utilities for the players. We assume that each player is equipped with the risk coefficient \(\gamma >0.\) Before giving a formal definition of the discounted utility in the infinite time horizon for each player, we introduce the notion of the entropic risk measure. Let \((\varOmega ,\mathcal {F},P)\) be a probability space and let \(X \in L^{\infty }(\varOmega ,\mathcal {F},P)\) be a random payoff. Then the entropic risk measure is defined as follows:

$$\begin{aligned} \rho (X):=-\frac{1}{\gamma }\ln \left( \int _{\varOmega }e^{-\gamma X(\omega )}P(\mathrm{d}\omega )\right) . \end{aligned}$$
(2)

Let X and Y be random variables from \(L^{\infty }(\varOmega ,\mathcal {F},P)\). Then \(\rho (\cdot )\) satisfies the following properties:

  1. (P1)

    monotonicity: if \(X\le Y,\) then \(\rho (X)\le \rho (Y)\);

  2. (P2)

    translation invariance: if \(k\in \mathbb {R},\) then \(\rho (X+k)=\rho (X)+k\);

  3. (P3)

    \(\rho (X)\le E(X)\), which is a consequence of Jensen’s inequality.

Using the Taylor expansion for the exponential and logarithmic functions, for \(\gamma \) sufficiently close to 0, we obtain the following approximation:

$$\begin{aligned} \rho (X)\approx E(X)-\frac{\gamma }{2}\mathrm{{Var}}(X). \end{aligned}$$
(3)

This means that the risk-sensitive player, when evaluating his random payoff, takes into account not only its expected value but also its variance. Formula (2) is also known in the literature as the certainty equivalent of the exponential function; see Weil (1993). For further properties of \(\rho \), the reader is referred to Föllmer and Schied (2001).
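Properties (P1)–(P3) and approximation (3) are easy to check numerically for a finite-valued payoff. The following Python sketch is only an illustration: the three-point distribution, the values of \(\gamma \) and the helper name `entropic_risk` are assumptions made here and do not come from the text.

```python
import numpy as np

def entropic_risk(x, p, gamma):
    """rho(X) = -(1/gamma) * ln E[exp(-gamma * X)] for a finite-valued X."""
    x = np.asarray(x, dtype=float)
    p = np.asarray(p, dtype=float)
    m = x.min()  # shift by min(x) so that every exponential lies in (0, 1]
    return m - np.log(np.sum(p * np.exp(-gamma * (x - m)))) / gamma

# hypothetical payoff: values x with probabilities p
x = np.array([0.0, 1.0, 4.0])
p = np.array([0.2, 0.5, 0.3])
gamma = 0.5

mean = float(p @ x)
var = float(p @ (x - mean) ** 2)
rho = entropic_risk(x, p, gamma)

print("rho(X) and E(X):", rho, mean)                           # (P3)
print("rho(X + 2) - rho(X):", entropic_risk(x + 2.0, p, gamma) - rho)  # (P2)

# approximation (3) for a small risk coefficient
small = 1e-3
print("exact:", entropic_risk(x, p, small),
      "approx:", mean - 0.5 * small * var)
```

The shift by the minimum of the payoff is a standard log-sum-exp device; it changes nothing mathematically, by translation invariance (P2).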

Let \((\pi ,\sigma )\in \varPi \times \varSigma \). By \(\mathcal {B}(H_t)\) we denote the set of all Borel measurable bounded non-negative real-valued functions defined on \(H_t\) equipped with the supremum norm \(||\cdot ||\). For \(v_{t+1}\in \mathcal {B}(H_{t+1})\) and \(h_t \in H_t\), we set

$$\begin{aligned} \rho _{\pi _t,\sigma _t,h_t}(v_{t+1}):= - \frac{1}{\gamma } \ln \int _{S} e^{-\gamma v_{t+1}(h_t,\pi _{t}(h_t),\sigma _{t}(h_t),s')} q(\mathrm{d}s'|s_t-\pi _{t}(h_t)-\sigma _{t}(h_t)). \end{aligned}$$

By properties (P1) and (P3), we have that

$$\begin{aligned} 0 \le \rho _{\pi _t,\sigma _t,h_t}(v_{t+1})\le ||v_{t+1}||. \end{aligned}$$

Next, for any \(v_{t+1}\in \mathcal {B}(H_{t+1})\), we define the operator \(L_{\pi _t,\sigma _t}^i\) for player i as follows:

$$\begin{aligned} \big (L_{\pi _{t},\sigma _{t}}^i v_{t+1}\big )(h_t)= \left\{ \begin{array}{ll} u(\pi _{t}(h_t))+\beta \rho _{\pi _{t},\sigma _{t},h_t}(v_{t+1}),&{} \quad \text{ if } i=1,\\ u(\sigma _{t}(h_t))+\beta \rho _{\pi _{t},\sigma _{t},h_t}(v_{t+1}),&{} \quad \text{ if } i=2. \end{array}\right. \end{aligned}$$

Note that \(L_{\pi _t,\sigma _t}^i: \mathcal {B}(H_{t+1})\rightarrow \mathcal {B}(H_{t}).\) Indeed, observe that for every player i

$$\begin{aligned} 0 \le \big (L_{\pi _{t},\sigma _{t}}^i v_{t+1}\big )(h_t) \le d+ \beta ||v_{t+1}||. \end{aligned}$$
(4)

Further, we define an N-stage total discounted utility for player i by

$$\begin{aligned} U_N^i(s,\pi ,\sigma ):= \underbrace{ L_{\pi _{1},\sigma _{1}}^i \circ \cdots \circ L_{\pi _{N},\sigma _{N}}^i }_{N \text{ times } }\mathbf {0}(s), \end{aligned}$$

where \(\mathbf {0}\) is a function that assigns 0 for any argument. For instance, for player 1 and stage 2 we have

$$\begin{aligned} U_2^1(s,\pi ,\sigma )=u(\pi _{1}(s))-\frac{\beta }{\gamma } \ln \int _{S} e^{-\gamma u(\pi _{2}(s,\pi _{1}(s),\sigma _{1}(s),s'))} q(\mathrm{d}s'|s-\pi _{1}(s)-\sigma _{1}(s)). \end{aligned}$$

Similarly, we can define \(U_2^2(s,\pi ,\sigma )\) for player 2.

From the monotonicity of \(\rho \), the sequence \(\big ( U_N^i(s,\pi ,\sigma )\big )_{N\in \mathbb {N}}\) is non-decreasing and bounded from below by 0 for every \(s\in S\) and \((\pi ,\sigma ) \in \varPi \times \varSigma \). Moreover, by properties (P1)–(P3) it follows that

$$\begin{aligned} U^i_N(s,\pi ,\sigma ) \le \frac{d}{1-\beta } \end{aligned}$$
(5)

for all \(s\in S\) and \((\pi ,\sigma )\in \varPi \times \varSigma \). The reader is referred to Asienkiewicz and Jaśkiewicz (2017), where (5) and further details are proved. Hence, the limit \(\lim \nolimits _{N\rightarrow \infty } U_N^i(s,\pi ,\sigma )\) exists, and we denote it by \(U^i(s,\pi ,\sigma )\). In view of the above discussion, each player is cautious about his unknown future continuation utility. Therefore, at every stage he uses the entropic risk measure, parametrized by his risk aversion coefficient \(\gamma ,\) to calculate the discounted utility in the infinite time horizon.
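For stationary strategies, the N-stage utilities satisfy \(U_N^i = L^i U_{N-1}^i\) with \(U_0^i=\mathbf {0}\), so they can be computed by a simple recursion. The Python sketch below illustrates the monotonicity of \((U_N^i)\) and bound (5) on a toy model. Everything specific in it is an assumption made for illustration: the truncated grid, the felicity \(u(c)=1-e^{-c}\) (so \(d=1\)), the symmetric consumption rule, and the two-point transition law in which an investment y yields next stock y or 2y with probability 1/2 each.

```python
import numpy as np

# Hypothetical discretised model (all values are assumptions for illustration)
beta, gamma = 0.9, 2.0            # discount factor and risk coefficient
h, n = 0.1, 41                    # grid step and size: states 0, h, ..., 4
d = 1.0                           # upper bound of u

def u(c):
    """Bounded, increasing, strictly concave felicity with u(0) = 0."""
    return 1.0 - np.exp(-c)

idx = np.arange(n)
invest = idx // 2                           # grid index of the stock left after consumption
cons = h * (idx - invest) / 2.0             # each player consumes half of the remainder
low = invest                                # next stock index if the shock is low
high = np.minimum(n - 1, 2 * invest)        # next stock index if the shock is high (truncated)

def step(v):
    """Apply L once: u(consumption) + beta * rho(v) under q(.|investment)."""
    expectation = 0.5 * np.exp(-gamma * v[low]) + 0.5 * np.exp(-gamma * v[high])
    return u(cons) + beta * (-np.log(expectation) / gamma)

U = [np.zeros(n)]                           # U_0 = 0
for _ in range(60):
    U.append(step(U[-1]))                   # U_N = L U_{N-1}

increments = [float(np.min(U[k + 1] - U[k])) for k in range(60)]
print("smallest increment U_{N+1} - U_N:", min(increments))
print("max of U_60:", float(U[-1].max()), " bound d/(1-beta):", d / (1 - beta))
```

In this toy model both players receive the same utility because the profile is symmetric; for an asymmetric profile only the consumption term in `step` would differ between the players.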

4 Existence of symmetric stationary Nash equilibria

Definition 3

A feasible profile \((\pi ^*,\sigma ^*)\in \varPi \times \varSigma \) is called a Nash equilibrium, if

$$\begin{aligned} U^1(s,\pi ^*,\sigma ^*)\ge U^1(s,\pi ,\sigma ^*) \end{aligned}$$

for each \(s\in S\) and any \(\pi \in \varPi \) such that \((\pi ,\sigma ^*)\) is feasible and

$$\begin{aligned} U^2(s,\pi ^*,\sigma ^*)\ge U^2(s,\pi ^*,\sigma ) \end{aligned}$$

for each \(s\in S\) and any \(\sigma \in \varSigma \) such that \((\pi ^*,\sigma )\) is feasible.

Definition 4

A Stationary Markov Perfect Equilibrium (SMPE) is a Nash equilibrium \((\phi ^*_1,\phi ^*_2)\) that belongs to the class of strategy pairs \(F_1\times F_2\). An SMPE \((\phi _1^*,\phi ^*_2)\) is symmetric if \(\phi _1^*=\phi _2^*.\)

The purpose of this section is to find a symmetric stationary pure Nash equilibrium in an appropriate class of strategies. Therefore, we define the subset of \(F_i\) as follows:

$$\begin{aligned} F_i^0:= & {} \{ \phi \in F_i: 0 \le \phi (s) \le s/2 \text{ for } \text{ all } s\in S \\&\text{ and } \text{ the } \text{ function } \varphi (s):=s/2-\phi (s) \text{ is } \text{ non-decreasing }\\&\text{ and } \text{ upper } \text{ semicontinuous }\}. \end{aligned}$$

The definition of the sets \(F_i^0\) (\(i=1,2\)) given in Jaśkiewicz and Nowak (2018a) on p. 243 should be the same as above. More precisely, the function \(\varphi (s):=s-\phi (s)\) in Jaśkiewicz and Nowak (2018a) must be replaced by \(\varphi (s):=s/2-\phi (s).\)

We shall need the following assumptions imposed on the felicity function.

Assumption 1

(Felicity function) Function u is increasing, bounded, strictly concave and continuous at \(s=0\).

We also propose two alternative sets of assumptions for the transition probability.

Assumption 2

(Transition probability)

  1. (i)

    q is stochastically increasing, i.e., the function

    $$\begin{aligned} y\rightarrow \int _S f(z)q(\mathrm{d}z|y) \end{aligned}$$

    is increasing, whenever \(f: S\rightarrow \mathbb {R}\) is increasing;

  2. (ii)

    q is weakly continuous, i.e., if \(y_n \rightarrow y\) in S, then \(q(\cdot |y_n) \Rightarrow q(\cdot |y)\) as \(n \rightarrow \infty \);

  3. (iii)

    For each \(s\in S\) the set \(Z_s:=\{y\in S:q(\{s\}|y)>0\}\) is countable and \(q(\{0\}|0)=1.\)

Assumption 3

(Transition probability)

  1. (i)

    For each \(y \in S_+:=(0,+\infty )\) the probability measure \(q(\cdot |y)\) is non-atomic and \(q(\cdot |0)\) has no atoms in \(S_+\);

  2. (ii)

    q is weakly continuous.

Theorem 1

Let either Assumptions 1 and 2 or Assumptions 1 and 3 be satisfied. Then there exists a symmetric SMPE \((\phi ^*,\phi ^*) \in F_1^0 \times F_2^0\).

Remark 1

The predecessors of our work on symmetric dynamic games of resource extraction are Dutta and Sundaram (1992) and Jaśkiewicz and Nowak (2018a). The common feature of these works is that the authors deal with standard discounted expected payoffs or utilities for the players. This in turn implies that the players care only about the expected value of the future random payoffs. In other words, when calculating the discounted expected utility in the infinite time horizon, the players take into account only the expectation of the continuation function. In our approach, we allow the agents to be risk averse towards future random payoffs in the sense that according to (3) the players care not only about the expectation but also about the variance of the continuation function. Therefore, they evaluate the discounted utility in a recursive way by using the entropic risk measure (or the exponential certainty equivalent) parametrized by the risk coefficient. As in Dutta and Sundaram (1992), a felicity function is bounded (in contrast to Jaśkiewicz and Nowak (2018a)) and as in Jaśkiewicz and Nowak (2018a) the resource stock takes values in \([0,+\infty )\) [in contrast to Dutta and Sundaram (1992)].

Our assumptions imposed on the model are borrowed from Balbus et al. (2015a) and Jaśkiewicz and Nowak (2018a). More precisely, Assumption 3 coincides with Assumption (A) in Jaśkiewicz and Nowak (2018a). However, Assumption 2, analogous to the one in Balbus et al. (2015a), is slightly stronger than Assumptions (B1)–(B3) in Jaśkiewicz and Nowak (2018a). This is because the risk measure \(\rho \) used in evaluating the discounted utility is not additive in the sense that, in general, \(\rho (X+Y)\not =\rho (X)+\rho (Y)\) for random payoffs X and Y. Therefore, the transition probability q cannot be a convex combination of stochastic kernels with coefficients depending on the investment amount as in Jaśkiewicz and Nowak (2018a).

On the other hand, our result can also be viewed as an extension of the optimization problem (one player case), studied in Asienkiewicz and Jaśkiewicz (2017) and Bäuerle and Jaśkiewicz (2018), to a strategic version of a one-sector optimal growth model. In contrast to Bäuerle and Jaśkiewicz (2018), we examine, as mentioned above, a model with bounded felicity functions. A crucial role in the study of the unbounded case is played by the fact that both investment and consumption functions are non-decreasing. Here, this property does not hold, since the unique solution \(V_\phi \) to the Bellman equation in Lemma 5 depends on the consumption strategy \(\phi \) of the other player.

5 Proof of Theorem 1

The methods of proving Theorem 1 resemble the ones used in Jaśkiewicz and Nowak (2018a). However, most of the preceding results must be formulated in terms of the entropic risk measure. Moreover, for the sake of completeness and clarity, we decided to provide all lemmas with their proofs.

Let X be the vector space of all continuous from the right functions with bounded variation on every [0, n], \(n\in \mathbb {N}\). We endow X with the topology of weak convergence. Recall that a sequence \((\eta _t)_{t=1}^\infty \) converges weakly to \(\eta \in X\) if and only if \(\eta _t(s)\rightarrow \eta (s)\) as \(t\rightarrow \infty \) at any continuity point \(s\in S\) of \(\eta \). The weak convergence of \((\eta _t)_{t=1}^\infty \) to \(\eta \) is denoted by \(\eta _t \xrightarrow {w} \eta .\)

Let \(X^*\) be the set of all non-decreasing functions \(\eta \in X\) such that \(0 \le \eta (s) \le \frac{d}{1-\beta }\) for all \(s\in S.\) Note that each \(\eta \in X^*\) is upper semicontinuous. Furthermore, we notice that 0 is a continuity point of every function \(\eta \in X^*.\) By Proposition 1 in Jaśkiewicz and Nowak (2018a), we have that \(X^*\) is sequentially compact in X. Moreover, Proposition 2 in Jaśkiewicz and Nowak (2018a) yields that \(F_i^0\) is also a convex and sequentially compact subset of X when endowed with the topology of weak convergence.

Now we start with a sequence of preliminary lemmas.

Lemma 1

Assume that \(f_n \xrightarrow {w} f\) in \(X^*\) and \(y_n \rightarrow y\) in S as \(n \rightarrow \infty \). Then \(f(y) \ge \limsup \nolimits _{n\rightarrow \infty } f_n(y_n)\).

Proof

Let \(y_0 > y\) be a continuity point of f. Then there exists \(N \in \mathbb {N}\) such that \(y_n < y_0\) for all \(n>N\). Therefore, \(f_n(y_n) \le f_n(y_0)\) for \(n > N\) and finally \(\limsup \nolimits _{n \rightarrow \infty } f_n(y_n) \le \limsup \nolimits _{n \rightarrow \infty } f_n(y_0)=f(y_0)\). Since \(y_0\) can be chosen arbitrarily close to y and f is continuous from the right, we deduce that

$$\begin{aligned} \limsup \limits _{n\rightarrow \infty } f_n(y_n) \le f(y). \end{aligned}$$

Lemma 2

Let Assumption 2 or Assumption 3 hold. Assume that \(f_n \xrightarrow {w} f\) in \(X^*\) and \(y_n \rightarrow y\) in S as \(n \rightarrow \infty \). Then we have

$$\begin{aligned} \limsup \limits _{n\rightarrow \infty } -\frac{1}{\gamma } \ln \int _S e^{-\gamma f_n(z)} q(\mathrm{d}z|y_n) \le -\frac{1}{\gamma } \ln \int _S e^{-\gamma f(z)} q(\mathrm{d}z|y). \end{aligned}$$

Proof

Define

$$\begin{aligned} f^*(z):=\sup \{\limsup \limits _{n \rightarrow \infty } f_n(z_n):z_n\rightarrow z \}. \end{aligned}$$

We have that

$$\begin{aligned} \limsup \limits _{n\rightarrow \infty } -\frac{1}{\gamma } \ln \int _S e^{-\gamma f_n(z)}q(\mathrm{d}z|y_n)\le & {} -\frac{1}{\gamma } \ln \int _S e^{-\gamma f^*(z)}q(\mathrm{d}z|y) \\\le & {} -\frac{1}{\gamma } \ln \int _S e^{-\gamma f(z)} q(\mathrm{d}z|y). \end{aligned}$$

The first inequality follows from property (P1) and Lemma 3.2 in Serfozo (1966), whereas the second one is a consequence of Lemma 1 and (P1). Thus, the result follows.

Lemma 3

Let Assumption 3 hold. Assume that \(f\in X^*\) and \(y_n \rightarrow y\) in S as \(n \rightarrow \infty \). Then we obtain

$$\begin{aligned} \lim \limits _{n\rightarrow \infty } -\frac{1}{\gamma } \ln \int _S e^{-\gamma f(z)} q(\mathrm{d}z|y_n)= -\frac{1}{\gamma } \ln \int _S e^{-\gamma f(z)} q(\mathrm{d}z|y). \end{aligned}$$

Proof

For any \(z \in S\) define

$$\begin{aligned} f_*(z):=\inf \{\liminf \limits _{n \rightarrow \infty }f(z_n):z_n\rightarrow z \}. \end{aligned}$$

The function \(f_*\) is lower semicontinuous. Furthermore, \(f_*(z)=f(z)\) for any continuity point \(z \in S\) of f. Recall that 0 is a continuity point of f. Hence, \(f_*(0)=f(0)\). By Assumption 3(i), we have that

$$\begin{aligned} -\frac{1}{\gamma } \ln \int _S e^{-\gamma f_*(z)} q(\mathrm{d}z|y)= -\frac{1}{\gamma } \ln \int _S e^{-\gamma f(z)} q(\mathrm{d}z|y), \text{ for } y\in S. \end{aligned}$$
(6)

By Lemma 3.2 in Serfozo (1966), we obtain

$$\begin{aligned} \liminf \limits _{n\rightarrow \infty } -\frac{1}{\gamma } \ln \int _S e^{-\gamma f(z)} q(\mathrm{d}z|y_n) \ge -\frac{1}{\gamma } \ln \int _S e^{-\gamma f_*(z)} q(\mathrm{d}z|y). \end{aligned}$$
(7)

Combining (6) and (7) with Lemma 2, we infer that

$$\begin{aligned} \limsup \limits _{n\rightarrow \infty } -\frac{1}{\gamma } \ln \int _S e^{-\gamma f(z)} q(\mathrm{d}z|y_n)\le & {} -\frac{1}{\gamma } \ln \int _S e^{-\gamma f(z)} q(\mathrm{d}z|y) \\\le & {} \liminf \limits _{n\rightarrow \infty } -\frac{1}{\gamma } \ln \int _S e^{-\gamma f(z)} q(\mathrm{d}z|y_n). \end{aligned}$$

Thus, the result follows.

Lemma 4

Let Assumption 2 hold. Assume that \(y_n \searrow y\) in S as \(n \rightarrow \infty \) and \(f \in X^*\). Then it follows

$$\begin{aligned} \lim \limits _{n \rightarrow \infty } -\frac{1}{\gamma } \ln \int _S e^{-\gamma f(z)} q(\mathrm{d}z|y_n)=-\frac{1}{\gamma } \ln \int _S e^{-\gamma f(z)} q(\mathrm{d}z|y). \end{aligned}$$

Proof

By Assumption 2(i), we infer that

$$\begin{aligned} -\frac{1}{\gamma } \ln \int _S e^{-\gamma f(z)} q(\mathrm{d}z|y_n) \ge -\frac{1}{\gamma } \ln \int _S e^{-\gamma f(z)} q(\mathrm{d}z|y). \end{aligned}$$

Hence, the above inequality and Lemma 2 yield

$$\begin{aligned} \liminf \limits _{n \rightarrow \infty }-\frac{1}{\gamma } \ln \int _S e^{-\gamma f(z)} q(\mathrm{d}z|y_n)\ge & {} -\frac{1}{\gamma } \ln \int _S e^{-\gamma f(z)} q(\mathrm{d}z|y) \\\ge & {} \limsup \limits _{n \rightarrow \infty }-\frac{1}{\gamma } \ln \int _S e^{-\gamma f(z)} q(\mathrm{d}z|y_n). \end{aligned}$$

These inequalities finish the proof.

Let \(\phi \in F_2^0\) and \(\varPi (\phi )\) be the set of all strategies \(\pi \) for player 1 for which the pair \((\pi ,\phi )\) is feasible. We are now ready to formulate our next lemma.

Lemma 5

Put \(\varPhi (s)=[0,s-\phi (s)]\) for each \(s\in S\). Let either Assumptions 1 and 2 or Assumptions 1 and 3 be satisfied. Then there exists a unique function \(V_\phi \in X^*\) such that

$$\begin{aligned} \begin{aligned} V_\phi (s)&=\max _{c \in \varPhi (s)} \Bigg ( u(c)-\frac{\beta }{\gamma } \ln \int _S e^{-\gamma V_\phi (z)} q(\mathrm{d}z|s-\phi (s)-c)\Bigg )\\&=\max _{y\in \varPhi (s)} \Bigg ( u(s-\phi (s)-y)-\frac{\beta }{\gamma } \ln \int _S e^{-\gamma V_\phi (z)} q(\mathrm{d}z|y) \Bigg ) \end{aligned} \end{aligned}$$

for all \(s \in S\). Moreover,

$$\begin{aligned} V_\phi (s)=\sup \limits _{\pi \in \varPi (\phi )} U^1(s,\pi ,\phi ), s\in S. \end{aligned}$$

Proof

For any \(V\in X^*\), define the operator T as follows:

$$\begin{aligned} TV(s)=\max _{y\in \varPhi (s)} \Bigg ( u(s-\phi (s)-y)-\frac{\beta }{\gamma }\ln \int _S e^{-\gamma V(z)} q(\mathrm{d}z|y) \Bigg ), s\in S. \end{aligned}$$

Observe that since \(s\rightarrow s-\phi (s)\) is upper semicontinuous and u is increasing and continuous, it follows that the function \((s,y)\rightarrow u(s-\phi (s)-y)\) is upper semicontinuous. Moreover, by Lemma 2, the function \(y\rightarrow -\frac{\beta }{\gamma }\ln \int _S e^{-\gamma V(z)} q(\mathrm{d}z|y)\) is also upper semicontinuous. Hence, by Proposition D.5 in Hernández-Lerma and Lasserre (1996), the function TV is upper semicontinuous. This fact and (4) yield that \(T:X^*\rightarrow X^*.\) We have to prove that T is contractive. Assume that \(V_1,V_2 \in X^*\). By properties (P1) and (P2) for each \(s\in S\), we have

$$\begin{aligned} \begin{aligned} TV_1(s)-TV_2(s)\le&\sup \limits _{y \in \varPhi (s)} \Bigg (-\frac{\beta }{\gamma }\ln \int _S e^{-\gamma ||V_1-V_2||-\gamma V_2(z)} q(\mathrm{d}z|y)\\&+ \frac{\beta }{\gamma }\ln \int _S e^{-\gamma V_2(z)} q(\mathrm{d}z|y) \Bigg ) \\ \le&\beta ||V_1-V_2||. \end{aligned} \end{aligned}$$

Changing the roles of \(V_1\) and \(V_2\) we get

$$\begin{aligned} ||TV_1-TV_2||\le \beta ||V_1-V_2||. \end{aligned}$$

By the Banach fixed point theorem, there exists a unique function \(V_\phi \in X^*\) such that \(TV_\phi =V_\phi \).
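The contraction property and the resulting successive approximations \(V_{k+1}=TV_k\) can be illustrated on a discretised model. The Python sketch below is a hedged illustration and not the construction used in the proof: the truncated grid, the felicity \(u(c)=1-e^{-c}\), the opponent's strategy \(\phi (s)=s/4\), and the two-point transition law (an investment y yields next stock y or 2y with probability 1/2 each) are all assumptions introduced here.

```python
import numpy as np

# Hypothetical discretised model (grid, kernel and parameters are assumptions)
beta, gamma = 0.9, 2.0
h, n = 0.1, 41
i = np.arange(n)[:, None]                    # state index, s = h * i (rows)
j = np.arange(n)[None, :]                    # investment index, y = h * j (columns)

# Opponent plays phi(s) = s / 4, so Phi(s) = [0, 3s/4]; feasibility in exact integers
feasible = 4 * j <= 3 * i
c = h * np.where(feasible, 3 * i - 4 * j, 0) / 4.0       # consumption s - phi(s) - y
reward = np.where(feasible, 1.0 - np.exp(-c), -np.inf)   # u(c) = 1 - exp(-c)

high = np.minimum(n - 1, 2 * np.arange(n))   # next stock index after the high shock

def rho(v):
    """Entropic value of v under q(.|y) for every grid investment y."""
    return -np.log(0.5 * np.exp(-gamma * v) + 0.5 * np.exp(-gamma * v[high])) / gamma

def T(v):
    """Discretised T: maximise u(s - phi(s) - y) + beta * rho over feasible y."""
    return np.max(reward + beta * rho(v)[None, :], axis=1)

# Contraction check on two arbitrary bounded functions
rng = np.random.default_rng(0)
V1, V2 = rng.uniform(0, 10, n), rng.uniform(0, 10, n)
lhs = np.max(np.abs(T(V1) - T(V2)))
rhs = beta * np.max(np.abs(V1 - V2))
print("||TV1 - TV2|| =", lhs, " beta * ||V1 - V2|| =", rhs)

# Successive approximations V_{k+1} = T V_k starting from 0
V = np.zeros(n)
for _ in range(1000):
    V_next = T(V)
    done = np.max(np.abs(V_next - V)) < 1e-10
    V = V_next
    if done:
        break
print("fixed-point residual:", np.max(np.abs(T(V) - V)))
```

The inequality checked here needs only (P1) and (P2), which is why it can be tested on arbitrary bounded vectors rather than on elements of \(X^*\) alone.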

Now we prove that \(V_\phi (s)=\sup \nolimits _{\pi \in \varPi (\phi )} U^1(s,\pi ,\phi )\). We have

$$\begin{aligned} V_\phi (s) \ge u(a)-\frac{\beta }{\gamma } \ln \int _S e^{-\gamma V_\phi (z)} q(\mathrm{d}z|s-\phi (s)-a) \end{aligned}$$

for every feasible consumption a for agent 1 (that means \(a+\phi (s) \le s\) for every \(s \in S\)). Proceeding along similar lines as in Asienkiewicz and Jaśkiewicz (2017) (see formula (3.6)) we obtain by iteration that for every \(N \in \mathbb {N}\) and \(\pi \in \varPi (\phi )\)

$$\begin{aligned} V_\phi (s) \ge U^1_N(s,\pi ,\phi ). \end{aligned}$$

Letting N tend to infinity, we have that

$$\begin{aligned} V_\phi (s) \ge U^1(s,\pi ,\phi ) \text{ for } \text{ any } \pi \in \varPi (\phi ) \text{ and } s \in S. \end{aligned}$$

Hence,

$$\begin{aligned} V_\phi (s) \ge \sup \limits _{\pi \in \varPi (\phi )} U^1(s,\pi ,\phi ) \text{ for } s \in S. \end{aligned}$$
(8)

From Proposition D.5 in Hernández-Lerma and Lasserre (1996), there exists \(\psi \in F_1\) such that

$$\begin{aligned} V_\phi (s)= u(s-\phi (s)-\psi (s))-\frac{\beta }{\gamma } \ln \int _S e^{-\gamma V_\phi (z)} q(\mathrm{d}z|\psi (s)). \end{aligned}$$

Put \(\phi ^*(s)=s-\phi (s)-\psi (s)\). Hence, for every \(s \in S\) we get

$$\begin{aligned} V_\phi (s) = u(\phi ^*(s))-\frac{\beta }{\gamma } \ln \int _S e^{-\gamma V_\phi (z)} q(\mathrm{d}z|s-\phi (s)-\phi ^*(s)). \end{aligned}$$

Again, by iteration of this equation and making use of properties (P1)–(P3), we obtain that for every \(s \in S\)

$$\begin{aligned} V_\phi (s) \le U^1_N(s,\phi ^*,\phi )+\beta ^N ||V_\phi ||. \end{aligned}$$

Letting N go to infinity, we have

$$\begin{aligned} V_\phi (s) \le U^1(s,\phi ^*,\phi ) \end{aligned}$$

for all \(s \in S\) and, consequently,

$$\begin{aligned} V_\phi (s) \le \sup \limits _{\pi \in \varPi (\phi )} U^1(s,\pi ,\phi ). \end{aligned}$$
(9)

Inequalities (8) and (9) imply that

$$\begin{aligned} V_\phi (s) = \sup \limits _{\pi \in \varPi (\phi )} U^1(s,\pi ,\phi ). \end{aligned}$$

Define

$$\begin{aligned} A_\phi (s):= \text{ arg }\max \limits _{y \in \varPhi (s)} \Big ( u(s-\phi (s)-y)- \frac{\beta }{\gamma } \ln \int _S e^{-\gamma V_\phi (z)} q(\mathrm{d}z|y) \Big ). \end{aligned}$$

For any \(s \in S\) we set \(g(\phi )(s):= \max A_\phi (s)\).

Lemma 6

The correspondence \(s \rightarrow A_\phi (s)\) is ascending, i.e., if \(s_1 < s_2\) and \(y_1 \in A_\phi (s_1)\), \(y_2 \in A_\phi (s_2)\), then \(y_1 \le y_2\).

Proof

Suppose that \(s \rightarrow A_\phi (s)\) is not ascending. This means that there exist \(s_1 <s_2\) and \(y_1 \in A_\phi (s_1)\), \(y_2 \in A_\phi (s_2)\) such that \(y_1 >y_2\). Observe that the set \({{\mathcal {L}}}:=\{(s,y):\ s\in S, \ y\in \varPhi (s)\}\) is a lattice with the usual component-wise order on \(\mathbb {R}^2.\) Consequently, the points \((s_1,y_2)\) and \((s_2,y_1)\) belong to \({{\mathcal {L}}}.\) From Assumption 1, u is strictly concave. From the proof of Lemma 2 in Nowak (2006) and the fact that \(s_2-\phi (s_2) > s_1-\phi (s_1)\), we infer

$$\begin{aligned} u(s_2-\phi (s_2)-y_1)-u(s_2-\phi (s_2)-y_2)>u(s_1-\phi (s_1)-y_1)-u(s_1-\phi (s_1)-y_2). \end{aligned}$$

Adding \(- \frac{\beta }{\gamma } \ln \int _S e^{-\gamma V_\phi (z)} q(\mathrm{d}z|y_1)- \big (- \frac{\beta }{\gamma } \ln \int _S e^{-\gamma V_\phi (z)} q(\mathrm{d}z|y_2)\big )\) to both sides it follows that

$$\begin{aligned} \begin{aligned} 0=&V_\phi (s_2)-V_\phi (s_2) \ge u(s_2-\phi (s_2)-y_1)- \frac{\beta }{\gamma } \ln \int _S e^{-\gamma V_\phi (z)} q(\mathrm{d}z|y_1)-V_\phi (s_2) \\&> V_\phi (s_1)-\big (u(s_1-\phi (s_1)-y_2)- \frac{\beta }{\gamma } \ln \int _S e^{-\gamma V_\phi (z)} q(\mathrm{d}z|y_2)\big ) \ge V_\phi (s_1)-V_\phi (s_1)=0. \end{aligned} \end{aligned}$$

Thus, we have a contradiction.

Lemma 7

Let \(\psi \) be any selector of the correspondence \(s \rightarrow A_\phi (s)\), i.e., \(\psi (s) \in A_\phi (s)\) for all \(s \in S\). If \(\psi \) is continuous at \(s_0\), then \(A_\phi (s_0)\) is a singleton.

Proof

Clearly, \(A_\phi (0)\) is a singleton. Assume that \(s_0>0\) and \(y_1,\) \(y_2\) are elements of \(A_\phi (s_0)\) such that \(y_1<y_2\). Since \(s \rightarrow A_\phi (s)\) is ascending, we conclude that

$$\begin{aligned} \lim \limits _{s\rightarrow s^{-}_0} \psi (s) \le y_1 <y_2 \le \lim \limits _{s\rightarrow s^{+}_0} \psi (s). \end{aligned}$$

But \(\psi \) is continuous at \(s_0 \in S\). Thus, we have a contradiction.

Lemma 8

The function \(g(\phi )\) is the unique non-decreasing selector of the correspondence \(s \rightarrow A_\phi (s)\) that is continuous from the right.

Proof

From Lemma 6 the function \(g(\phi )\) is non-decreasing. Moreover, we observe that the graph of the correspondence \(s \rightarrow A_{\phi }(s)\) is closed from the right. Indeed, take \(s_n\searrow s\) and \(y_n\in A_\phi (s_n).\) From Lemma 6 it follows that the sequence \((y_n)\) is non-increasing, so it converges to some y. Lemma 3 (under Assumption 3) or Lemma 4 (under Assumption 2) and Assumption 1 imply that \(y\in A_\phi (s).\) Therefore, \(g(\phi )\) is continuous from the right. Hence, \(g(\phi )\) is an upper semicontinuous selector of the correspondence \(s \rightarrow A_\phi (s)\). The uniqueness follows from Lemma 7.

Proof of Theorem 1

Define the operator L as follows: \(L\phi (s):=\frac{s-g(\phi )(s)}{2}\) for \(s\in S\) and \(\phi \in F_2^0\). Lemma 8 implies that \(L\phi \in F_1^0\). Hence, \(L: F_2^0\rightarrow F_1^0.\) We have to show that the operator L is continuous when \(F_1^0\) and \(F_2^0\) are equipped with the topology of weak convergence. Suppose that \(\phi _n \xrightarrow {w} \phi \) as \(n \rightarrow \infty \). From the fact that the set \(X^*\) is sequentially compact in X, we infer that there exists a subsequence of \((V_{\phi _n})_{n=1}^\infty \) converging to some V in \(X^*\). Without loss of generality, we may assume that \(V_n:= V_{\phi _n} \xrightarrow {w} V\) in \(X^*\) as \(n\rightarrow \infty .\) Analogously, we may assume that \(\psi _n:=g(\phi _n) \xrightarrow {w} \psi \) in \(F^0_1\). Thus, for each \(n \in \mathbb {N}\), we obtain from Lemma 5 that

$$\begin{aligned} V_n(s)=u(s-\phi _n(s)-\psi _n(s))-\frac{\beta }{\gamma } \ln \int _S e^{-\gamma V_n(z)} q(\mathrm{d}z|\psi _n(s)), \text{ for } \text{ all } s \in S. \end{aligned}$$

Let \(S_1 \subset S\) be the set of all continuity points of the functions V, \(\phi \) and \(\psi \). For any \(s\in S_1\) we get \(V_n(s)\rightarrow V(s)\), \(\phi _n(s) \rightarrow \phi (s)\) and \(\psi _n(s) \rightarrow \psi (s)\). Using Assumption 1, Lemma 2 and the last display, we obtain that

$$\begin{aligned} V(s) \le u(s-\phi (s)-\psi (s))-\frac{\beta }{\gamma } \ln \int _S e^{-\gamma V(z)} q(\mathrm{d}z|\psi (s)). \end{aligned}$$
(10)

Let \(s \notin S_1\). Since \(S_1\) is dense in S and the functions V, \(\psi \) and \(\phi \) are continuous from the right, we may choose a sequence \((s_m)_{m=1}^\infty \) in \(S_1\) such that \(s_m \searrow s\) as \(m \rightarrow \infty \). Therefore, we get

$$\begin{aligned} V(s_m) \le u(s_m-\phi (s_m)-\psi (s_m))-\frac{\beta }{\gamma } \ln \int _S e^{-\gamma V(z)} q(\mathrm{d}z|\psi (s_m)). \end{aligned}$$

Letting \(m \rightarrow \infty \) and using Lemma 2, we conclude that (10) holds for all \(s \in S\). On the other hand, for any \(n\in \mathbb {N}\), \(y\in [0,s-\phi _n(s)]\) and \(s\in S\), by Lemma 5 we have

$$\begin{aligned} V_n(s)\ge u(s-\phi _n(s)-y)-\frac{\beta }{\gamma } \ln \int _S e^{-\gamma V_n(z)} q(\mathrm{d}z|y). \end{aligned}$$

Now we define the following sets:

  • \(S_d\) is the countable set of discontinuity points of the function V;

  • \(S_2\) is the set of all continuity points of the functions V and \(\phi \);

  • \(S_3\) is the set of all \(y \in S\) such that \(q(S_d |y)=0\).

Recall that \(0 \notin S_d\). Clearly, the set \(S_2\) is dense in S. The set \(S_3\) is also dense in S and contains the state 0. These two facts follow from either Assumption 2(iii) or Assumption 3(i). Choose any \(s \in S_2 \cap S_+\) and \(y \in S_3 \cap [0, s-\phi (s))\). Then there exists some \(N \in \mathbb {N}\) such that \(y \in [0,s- \phi _n(s)]\) for all \(n >N\). Hence, we have the following inequality:

$$\begin{aligned} V_n(s) \ge u(s-\phi _n(s)-y)-\frac{\beta }{\gamma } \ln \int _S e^{-\gamma V_n(z)} q(\mathrm{d}z|y), \quad n>N. \end{aligned}$$

By the dominated convergence theorem and the fact that \(y \in S_3\), we obtain

$$\begin{aligned} \lim \limits _{n\rightarrow \infty } -\frac{1}{\gamma } \ln \int _S e^{-\gamma V_n(z)} q(\mathrm{d}z|y)= -\frac{1}{\gamma } \ln \int _S e^{-\gamma V(z)} q(\mathrm{d}z|y). \end{aligned}$$

Thus, we can conclude that

$$\begin{aligned} V(s) \ge u(s-\phi (s)-y)-\frac{\beta }{\gamma } \ln \int _S e^{-\gamma V(z)} q(\mathrm{d}z|y), \end{aligned}$$
(11)

for \(y \in [0,s-\phi (s))\cap S_3\) and \(s\in S_2 \cap S_+\). Let us consider \(s_0 \in S\) and \(y_0 \in [0,s_0-\phi (s_0)]\). Now we choose two sequences \((s_m)_{m=1}^\infty \) and \((y_m)_{m=1}^\infty \) such that \(s_m \searrow s_0\), \(y_m \searrow y_0\) as \(m \rightarrow \infty \) and \(s_m \in S_2 \cap S_+\), \(y_m \in S_3 \cap [0, s_m-\phi (s_m))\) for all \(m \in \mathbb {N}\). Obviously, \(s_m-\phi (s_m) \ge s_0-\phi (s_0)\). Therefore, by (11), we obtain

$$\begin{aligned} V(s_m) \ge u(s_m-\phi (s_m)-y_m)-\frac{\beta }{\gamma } \ln \int _S e^{-\gamma V(z)} q(\mathrm{d}z|y_m). \end{aligned}$$

Letting \(m\rightarrow \infty \) and making use of Lemma 3 in case of Assumption 2 or Lemma 4 in case of Assumption 3, the continuity of u, and the continuity from the right of the functions V and \(s\rightarrow s-\phi (s)\), we infer that inequality (11) holds for \(s_0\in S\) and \(y_0 \in [0,s_0-\phi (s_0)]\). Finally, inequalities (10) and (11) yield that

$$\begin{aligned} \begin{aligned} V(s)=&u(s-\phi (s)-\psi (s))-\frac{\beta }{\gamma }\ln \int _S e^{-\gamma V(z)} q(\mathrm{d}z|\psi (s)) \\ =&\max _{y\in [0,s-\phi (s)]} \left( u(s-\phi (s)-y)-\frac{\beta }{\gamma } \ln \int _S e^{-\gamma V(z)} q(\mathrm{d}z|y) \right) ,\ s\in S. \end{aligned} \end{aligned}$$

Since \(\psi \) is non-decreasing and upper semicontinuous, it follows by Lemma 8 that \(g(\phi )=\psi \). Thus, the operator L is continuous. By the Schauder–Tychonoff fixed point theorem (see Corollary 17.56 in Aliprantis and Border (2006)), there exists \(\phi ^*\in F_2^0\) such that \(L{\phi ^*}=\phi ^*\). This implies that \((\phi ^*,\phi ^*) \in F_1^0 \times F_2^0\) is a symmetric SMPE.
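The proof above is an existence argument and does not claim that iterating L converges. Still, the objects involved can be made concrete on a grid. The following Python sketch computes, for a given \(\phi \), the value \(V_\phi \) by value iteration on the risk-sensitive Bellman equation, takes the largest maximiser as a discrete stand-in for \(g(\phi )\), and applies \(L\phi (s)=(s-g(\phi )(s))/2\) a few times. Everything numerical here is an assumption made for illustration: the truncation of the state space to a grid on [0, 1], the parameter values, the starting guess, and the use of the kernel \(F(y,\omega )=y^{\alpha }\omega \) and utility \(u(c)=1-e^{-c}\) from Example 1.

```python
import numpy as np

beta, gamma, alpha = 0.9, 1.0, 0.5       # illustrative parameter values (assumed)
grid = np.linspace(0.0, 1.0, 41)         # truncated state grid (assumed)
omega = (np.arange(50) + 0.5) / 50       # midpoint nodes for Lebesgue measure on [0, 1]
Z = np.outer(grid**alpha, omega)         # next states F(y, omega) = y^alpha * omega

def u(c):
    # Utility from Example 1.
    return 1.0 - np.exp(-c)

def certainty_equivalent(V):
    # -(1/gamma) ln E exp(-gamma V(F(y, omega))) for every investment y on the grid.
    Vz = np.interp(Z.ravel(), grid, V).reshape(Z.shape)
    return -np.log(np.mean(np.exp(-gamma * Vz), axis=1)) / gamma

def best_response(phi, n_iter=200, tol=1e-12):
    # Value iteration for a player facing the consumption strategy phi.
    # Returns the approximate value V_phi and the largest maximiser on the grid.
    V = np.zeros_like(grid)
    g = np.zeros_like(grid)
    for _ in range(n_iter):
        rho = certainty_equivalent(V)
        V_new = np.empty_like(V)
        for i, s in enumerate(grid):
            budget = s - phi[i]
            feasible = grid <= budget            # y in [0, s - phi(s)]
            vals = u(budget - grid[feasible]) + beta * rho[feasible]
            best = np.flatnonzero(vals >= vals.max() - tol)[-1]
            V_new[i], g[i] = vals[best], grid[feasible][best]
        V = V_new
    return V, g

def L(phi):
    # Discrete analogue of L phi(s) = (s - g(phi)(s)) / 2.
    _, g = best_response(phi)
    return (grid - g) / 2.0

phi = grid / 4.0                         # arbitrary starting guess (assumed)
for k in range(5):
    phi_next = L(phi)
    residual = np.max(np.abs(phi_next - phi))   # sup-norm change between iterates
    phi = phi_next
```

A small `residual` would suggest an approximate fixed point of the discretised operator, but nothing in the theorem guarantees that this naive iteration settles down; the sketch is meant only to show how \(V_\phi \), \(g(\phi )\) and L fit together.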

6 Examples

In this section, we provide two examples satisfying our assumptions. Further examples can be found in Balbus et al. (2015a, b), Brock and Mirman (1972) and Jaśkiewicz and Nowak (2018b).

Example 1

Let \(\varOmega :=[0,1]\) and let \(\lambda \) be the standard Lebesgue measure. Let \(F:S\times \varOmega \rightarrow S\) be Borel measurable, and non-decreasing and continuous in the first argument, with \(F(0,\omega )=0\) for each \(\omega \in \varOmega \). Let \(F_y(\omega ):=F(y,\omega )\) for each \((y,\omega )\in S\times \varOmega \). Let q have the form \(q(\cdot |y):=\lambda F_y^{-1}(\cdot ).\) Obviously, q is weakly continuous; hence Assumption 3 (ii) is satisfied. Clearly, if \(F(y,\cdot )\) is one-to-one for each \(y\in S\setminus \{0\}\), then \(q(\cdot |y)\) is non-atomic. Hence, Assumption 3 (i) is also satisfied. For example, we can consider a multiplicative shock \(F(y,\omega )=y^{\alpha }\omega \) with \(\alpha \in (0,1)\). The utility function for the agent can be, for instance, of the form \(u(c)=1-e^{-c}\) for \(c\in S\). Clearly, u is increasing, strictly concave and continuous at 0. Hence, Assumption 1 is satisfied.
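To see how the entropic term behaves under the multiplicative shock of this example, the following Python sketch evaluates \(-\frac{1}{\gamma }\ln \int _S e^{-\gamma V(z)} q(\mathrm{d}z|y)\) by a midpoint rule on \(\varOmega \). The continuation function \(V(z)=z\) is a purely hypothetical choice that admits a closed form for comparison; it is not an equilibrium value function, and the parameter values are likewise assumed.

```python
import numpy as np

def certainty_equivalent(V, y, gamma, alpha, n_nodes=20000):
    # Midpoint rule for the Lebesgue measure on Omega = [0, 1].
    omega = (np.arange(n_nodes) + 0.5) / n_nodes
    z = y**alpha * omega                 # next states F(y, omega) = y^alpha * omega
    return -np.log(np.mean(np.exp(-gamma * V(z)))) / gamma

def closed_form(y, gamma, alpha):
    # Valid only for the hypothetical V(z) = z and y > 0:
    # E exp(-gamma * a * omega) = (1 - exp(-gamma * a)) / (gamma * a), a = y^alpha.
    a = y**alpha
    return -np.log((1.0 - np.exp(-gamma * a)) / (gamma * a)) / gamma

y, gamma, alpha = 0.25, 2.0, 0.5         # illustrative values (assumed)
ce_num = certainty_equivalent(lambda z: z, y, gamma, alpha)
ce_exact = closed_form(y, gamma, alpha)
mean_value = y**alpha / 2.0              # E[V(Z)] when V(z) = z
```

Under these assumptions the certainty equivalent lies strictly below the mean of the continuation value, reflecting the agent's aversion to risk in future utility.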

Example 2

Let \(\mu (\cdot |y)\) be a non-atomic measure for each \(y\in S\setminus \{0\}\) and \(\mu (\{0\}|0)=1\). Furthermore, assume that \(\mu \) is stochastically increasing and weakly continuous. Suppose that \(f_j\) is increasing, continuous and \(f_j(0)=0\) for each \(j=1,\ldots ,m.\) Assume that \(\sum _{j=1}^m\alpha _j+\alpha _0=1,\) where \(\alpha _0,\alpha _j\in [0,1]\) for \(j=1,\ldots ,m.\) Let

$$\begin{aligned} q(\cdot |y)=\alpha _0\mu (\cdot |y)+\sum \limits _{j=1}^m\alpha _j\delta _{f_j(y)}(\cdot ). \end{aligned}$$

Observe that q satisfies Assumption 2 (i) and (ii). To verify that Assumption 2 (iii) is satisfied, observe that

$$\begin{aligned} Z_{s}=\{y\in S:f_j(y)=s \text{ for } \text{ some } j=1,\ldots ,m\} \end{aligned}$$

for each \(s\in S\). Hence, the cardinality of \(Z_s\) is at most m. As a result, q obeys Assumption 2. Here, we may assume that the utility function for both players has the following form:

$$\begin{aligned} u(c)=\sqrt{c} \text{ for } c\in [0,1)\quad \text{ and }\quad u(c)=\frac{3}{2}-\frac{1}{1+c^2} \text{ for } c\ge 1. \end{aligned}$$

Clearly u is increasing, strictly concave and continuous at 0. As a result, Assumption 1 is satisfied.
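The two branches of this utility meet at \(c=1\), so monotonicity and strict concavity across the junction are worth a quick numerical check. The Python sketch below evaluates u on a grid over [0, 5] and inspects first and second differences; the grid range and step are arbitrary choices for illustration, and a finite-difference check is of course not a proof.

```python
import numpy as np

def u(c):
    # Piecewise utility from Example 2.
    c = np.asarray(c, dtype=float)
    return np.where(c < 1.0, np.sqrt(c), 1.5 - 1.0 / (1.0 + c**2))

c = np.linspace(0.0, 5.0, 501)           # grid with step 0.01 (assumed range)
vals = u(c)

first_diff = np.diff(vals)               # positive if u is increasing on the grid
second_diff = np.diff(vals, n=2)         # negative if u is strictly concave on the grid

# Size of the jump between the two branches at c = 1.
jump_at_one = abs(float(u(1.0)) - float(u(1.0 - 1e-12)))
```

Both branches also have slope \(1/2\) at \(c=1\), since \(\frac{d}{dc}\sqrt{c}=\frac{1}{2\sqrt{c}}\) and \(\frac{d}{dc}\big (\frac{3}{2}-\frac{1}{1+c^2}\big )=\frac{2c}{(1+c^2)^2}\), so u is continuously differentiable there.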