Open Access 09012021
Dynamic Coordination Games with Activation Costs
Published in: Dynamic Games and Applications  Issue 3/2021
Abstract
Motivated by inventory control problems with setup costs, we consider a coordination game where each player’s dynamics is an inventory model characterized by a controlled input and an uncontrolled output. An activation cost is shared among active players, namely players who control their dynamics at a given time. At each time, each player decides to be active or not depending on its inventory level. The main contribution of this paper is to show that strategies at a Nash equilibrium have a threshold structure on the number of active players. Furthermore, we provide an explicit expression for the lower and upper threshold is given both in the deterministic case, namely when the exogenous signal is known, and in the singlestage game. The relevance of the above results is discussed in the context of inventory control where Nash equilibrium reordering strategies imply that a single retailer reorders only if jointly with a number of other retailers and will reorder to restore a preassigned inventory level.
1 Introduction
This paper studies a discretestate discretetime dynamic game where players have to coordinate actions within a finite horizon window [2, 3]. Each player’s dynamics is an inventory model characterized by a controlled input and an uncontrolled output. The output flow is an uncontrolled exogenous signal. The input flow is controlled by the player and is subject to an activation cost. The state of the player is the accumulated discrepancy between input flow and output flow. The activation cost is shared among active players, namely those players who control their dynamics at a given time. The possibility of sharing the activation cost determines the need for coordination of control strategies on the part of the players. We study the cases under deterministic and stochastic disturbances. All results can be extended to the vector case by using the robust decomposition approach in [4, Section 3]. Applications arise in coordinated replenishment [8], and opportunistic maintenance [7].
1.1 Contribution
This study contributes in different ways to advance the theory on dynamic coordination games with activation costs for the control. An example of twothreshold strategy is the (s, S) strategy used in inventory control, see [6] and [5, Chapter 4]. We recall that (s, S) strategies are strategies where replenishments occur anytime the inventory level goes below a lower threshold s. Replenishments bring back the inventory level up to a higher threshold S. In particular, we highlight the following results.This paper is organized as follows. In Sect. 2, we introduce the dynamic inventory game. In Sect. 4, we first show that all Nash equilibrium strategies have a twothreshold structure with a reorder level and an orderupto level. We then provide a dual interpretation of such strategies as threshold strategies on the number of active players. In Sect. 5, we specialize our results to the case of single stage coordination game. In Sect. 6, we provide numerical analysis. Finally, in Sect. 7, we draw conclusions and discuss future works.

Strategies at a Nash equilibrium have a threshold structure. We obtain this result in two steps. First, we prove that Nash equilibria are associated with (s, S) strategies via \({\mathscr {K}}\)convex analysis. Second, we view the (s, S) strategies as threshold strategies on the number of active retailers.

Lower and upper thresholds have an explicit expression in the deterministic case, namely when the exogenous signal is known, or in singlestage games.

We corroborate our results with a numerical analysis of a stylized inventory model.
Advertisement
2 Dynamic Inventory Coordination Game
Consider a set of n retailers \(\varGamma =\{1,\ldots ,n\}\). At stage \(t=0,\ldots ,N1\), the ith retailer holds inventory \(x_i^t\in {\mathbb {Z}}\), faces a stochastic demand \(\omega _i^t \in \mathbb Z_+\) and orders a quantity \(u_i^t \in U_i^t \subseteq {\mathbb {Z}}_+\), where \(U_i^t\) denotes the set of admissible decisions, \({\mathbb {Z}}\) the set of integers, and \({\mathbb {Z}}_+\) the set of nonnegative integers. Thus, for all retailers \(i \in \varGamma \), inventory \(x_i^t\), which we refer to as the state of retailer i, evolves according to a linear finitestate, discretetime model of the form:Here, we assume that there are no delays between orders and deliveries. For all retailers, we also suppose that the inventory at hand plus inventory ordered may not exceed the storage capacity denoted by \(C_{\text {{store}}}\). Hence, we have \( x_i^t+u_i^t\le C_{{\text {store}}}. \) We also assume that \(C_{\text {{store}}} \ge x^i_0\) so to exclude an empty set of feasible orders.
$$\begin{aligned} x_i^{t+1}=x_i^t+u_i^t\omega _i^t. \end{aligned}$$
(1)
Now, for each time t, let us introduce the vector of the retailers’ decisions \(u^t = [u_i^t]_{i \in \varGamma }\) and the vector of decisions of all retailers other than i \(u_{i}^t=[u_j^t]_{j\in \varGamma ,j\ne i} \in U_{i}^t\), where \(U_{i}^t\) denotes the Cartesian product of all sets \(U_j^t\) \(j \ne i\). At each stage, the ith retailer has a costwhere \({\mathbb {E}} \{.\}\) indicates expectation, \(K \ge 0\) represents the transportation cost, \(c \ge 0\) is the purchase cost per stock unit, \(h\ge 0\) is the penalty on holding, \(p \ge 0\) the penalty on shortage. The term \(\delta (u_i^t)\) is one if the ith retailer replenishes, i.e., is active, and zero otherwise.
$$\begin{aligned} \begin{array}{lll} g_{i}(x_i^t,u_i^t,u_{i}^t) = \frac{K}{1+\sum \nolimits _{j=1,j\ne i}^n \delta (u_j^t)}\delta (u_i^t)+cu_i^t \\ \quad + p {\mathbb {E}} \{ \max (0,x_i^{t+1}) \}+h {\mathbb {E}} \{\max (0,x_i^{t+1}) \},\end{array} \end{aligned}$$
(2)
We henceforth denote by \(a^t\) the number of active retailers at stage t, i.e.:Note that the term \(\frac{K}{1+\sum \nolimits _{j=1,j\ne i}^n \delta (u_j^t)}\delta (u_i^t)\), which describes the fixed cost paid by retailer i in (2), is equal to \(\frac{K}{a^t}\) if retailer i is active and equal to zero otherwise.
$$\begin{aligned} a^t:=\sum \nolimits _{j=1}^n \delta (u_j^t). \end{aligned}$$
After introducing the N stage decision vectors \({{\mathbf {u}}_{\mathbf {i}}}^{0\sim N1}=[u_i^0,\ldots ,u_i^{N1}]\) and \({{\mathbf {u}}_{\mathbf {i}}}^{0\sim N1}=[u_{i}^0,\ldots ,u_{i}^{N1}]\), and denoting by \(\varPhi _i(x_i^{N})\) a penalty term on final state, the cost over the horizon from 0 to N is of the formA challenging issue in the definition of the stage cost (2) is its dependence on the number of active retailers through the term \(\frac{K}{1+\sum \nolimits _{j=1,j\ne i}^n \delta (u_j^t)}\delta (u_i^t)\). This term establishes that the transportation cost K is equally divided by all active retailers. This in turn implies that the cost of one retailer also depends on the decisions of all other retailers. Conditions (1)–(3) describe the dynamics and the costs of our game.
$$\begin{aligned} \begin{array}{ll} J_i\left( x_i^0,{{\mathbf {u}}_{\mathbf {i}}}^{0\sim N1},{{\mathbf {u}}_{\mathbf {i}}}^{0\sim N1}\right) = \varPhi _i\left( x_i^{N}\right) + \sum \nolimits _{t=0}^{N1}g_i\left( x_i^{t},u_i^{t},u_{i}^{t}\right) . \end{array} \end{aligned}$$
(3)
Advertisement
Other concepts we will make use of in the rest of the paper are Nash equilibrium strategies and \({\mathscr {K}}\)convexity which we briefly recall next.
Definition 1
Decisions \({{\mathbf {u}}_{\mathbf {i}}}^{\star }\) are at a Nash equilibrium, if it holds for all \(i \in \varGamma \)
$$\begin{aligned} J_i\left( x_i^0,{{\mathbf {u}}_{\mathbf {i}}}^{\star },{{\mathbf {u}}_{\mathbf {i}}}^{\star }\right) \le J_i\left( x_i^0,{{\mathbf {u}}_{\mathbf {i}}},{{\mathbf {u}}_{\mathbf {i}}}^{\star }\right) \; \forall \; {{\mathbf {u}}_{\mathbf {i}}} \in U_i^0\times \cdots \times U_i^{N1}. \end{aligned}$$
For the inventory problem, once at a Nash equilibrium, no retailer benefits from changing its replenishment decisions. The following definition of \({\mathscr {K}}\)convexity is borrowed from [6].
Definition 2
A function \(f: {\mathbb {R}} \rightarrow {\mathbb {R}}\) is \({\mathscr {K}}\)convex, where \({\mathscr {K}}\ge 0\), if
$$\begin{aligned} {\mathscr {K}} + f(z+y) \ge f(y) + z \left[ \frac{f(y)f(yb)}{b} \right] \, \forall z\ge 0, \, b>0,\, y. \end{aligned}$$
\({\mathscr {K}}\)convexity is used in [6] and reiterated in [5] to prove optimality of (s, S) strategies. We will make use of \({\mathscr {K}}\)convexity to prove the main result of this work.
In the following section, we consider threshold strategies according to which, given an inventory level \(x_i^t\), there exists a threshold \(l_i^t\in \{1,\,2,\,\ldots ,\,n\}\), such that retailer i reorders only if the number of active players \(a^t\) is greater than or equal to such a threshold. Such strategies are given byAs main result, we will show that all Nash equilibrium strategies have the threshold structure (4). To emphasize that \(l_i^t\) depends on \(x_i^t\), we sometimes write \(l_i^t(x_i^t)\).
$$\begin{aligned} \mu _i(x_i^t,a^t)=\left\{ \begin{array}{ll} \hbox {reorder}&{} \qquad \hbox {if} \quad a^t \ge l_i^t,\\ \hbox {do not reorder} &{} \qquad \hbox {if} \quad a^t < l_i^t. \end{array}\right. \end{aligned}$$
(4)
Note that orders depend on the history of the game as they are function of the state variable \(x_i^t\) which in turn depends on the past orders of the retailer as in (1). Orders of a single retailer also depend on her competitors’ orders through variable \(a^t\).
An additional concept that is important to explain is the one of subgame perfect equilibrium. We have borrowed and adapted from [9] the following definition:
Definition 3
A subgame perfect equilibrium is a ntuple of strategies \({\mathbf {u}}^{\star }=[{{\mathbf {u}}_{\mathbf {i}}}^{\star }]_{i \in \varGamma }\), such that for every \(i \in \varGamma \) and every history \(x_i^t\), we have that:
$$\begin{aligned} J_i\left( x_i^t,{{\mathbf {u}}_{\mathbf {i}}}^{\star },{{\mathbf {u}}_{\mathbf {i}}}^{\star }\right) \le J_i\left( x_i^t,{{\mathbf {u}}_{\mathbf {i}}},{{\mathbf {u}}_{\mathbf {i}}}^{\star }\right) \; \forall \; {{\mathbf {u}}_{\mathbf {i}}} \in U_i^0\times \cdots \times U_i^{N1}.\end{aligned}$$
Therefore, note that, if strategy (4) returns a Nash equilibrium, then such equilibrium is also subgame perfect in the sense that the strategy returns the optimal order depending on the current state \(x_i^t\) and irrespective of the fact that past orders might not be optimal.
To simplify the proofs and the graphs plotted in the following figures, in the rest of the paper, we assume that the penalty term on the final state \(\varPhi (x_i^n)\) is null. However, the results that we prove still hold if \(\varPhi (x_i^n)\) is a generic convex function with a minimum in \(x_i^n = 0\).
3 On the Generality of the Model
Consider an ndimensional inventory model characterized by discrete states \(x^t \in {\mathbb {Z}}^n\), integer controls \(u^t \in {\mathbb {Z}}_+^n\), and binary controls \(y^t \in \{0,1\}^n\), and discrete stochastic disturbances \(w^t\in {\mathbb {Z}}_+^n\), where \(t=0,1,\ldots \) is the time index. The evolution of the state is described by a linear discretetime (difference) equation in the general form (5) below, where A and E are matrices of compatible dimensions and \(x(0)=\xi _0 \ge 0\) is a given initial state. Integer and binary controls are linked through the general capacity constraints (6), where the (scalar) parameter c is an upper bound on control, with the inequalities in (5) and (6) to be interpreted componentwise.The above dynamics are characterized by two discrete valued control variables per each state. Starting from nonnegative initial states, we wish to control the state to remain confined to the positive orthant, which may describe a safety region in engineering applications or reflect the desire to prevent shortfalls in inventory applications.
$$\begin{aligned} x^{t+1}&=A x^t + E w^t + u^t, \end{aligned}$$
(5)
$$\begin{aligned} \quad 0&\le u^t \le cy^t, \quad y^t\in \{0,1\}^n. \end{aligned}$$
(6)
A common situation is where the disturbance seeks to push the state out of the desired region. Its value is given at the beginning and fixed that way. Each column of matrix E establishes how each disturbance component influences the evolution of the state vector. Then, it is reasonable to assume \( Ew(k) < 0,\) where the inequality is to be interpreted componentwise.
With regard to (5), we can isolate the dependence of one component state on the other ones and rewrite (5) in a way that establishes similarity with standard lot sizing models [10]:Equation (7) is a straightforward representation of (5) whereTo preserve the nature of the problem, which has stabilizing control actions playing against unstabilizing disturbances, we assume that the influence of other states on state i is relatively “weak.” In other words, we assume that the influence of \(B x^t\) is small if compared with the unstabilizing effects of disturbances captured by the term \(E w^t\). This is captured by assuming that the sum \(\varDelta x^t+Ew^t\) has same (negative) sign of \(Ew^t\), namelywhere inequality is again componentwise and it holds almost everywhere. Essentially, the states’ mutual dependence expressed by \(B x^t\) only emphasizes or reduces “weakly” the destabilizing effects of the disturbances. In the following, we present a robust decomposition approach that translates dynamics (7) into n scalar dynamics in “lot sizing” form [10].
$$\begin{aligned} x^{t+1}=x^t + B x^t + E w^t + u^t. \end{aligned}$$
(7)
$$\begin{aligned}&B:= AI =: \{b_{ij}\},\, b_{ij} = a_{ij}  \delta _{ij},\\&\quad \delta _{ij}:=\left\{ \begin{array}{ll}1, &{} \text{ if } i=j\text{, }\\ 0, &{} \text{ otherwise. } \end{array}\right. \end{aligned}$$
$$\begin{aligned} B x^t + E w^t < 0, \end{aligned}$$
With the term “robust decomposition” we mean a transformation through which dynamics (7) are replaced by n independent uncertain lot sizing models of the form (8) where \(x_i^t\) is the inventory, \(d_i^t\) the demand, \(u_i^t\) the reordered quantity and \({\mathscr {D}}_i^t \subset {\mathbb {R}}\) denotes the uncertainty set:Recall that in (7) the disturbance is given at the beginning and fixed that way. We use those values of the disturbance to determine set \({\mathscr {D}}_i^t\) in (8), as explained in the following. Replacing (7) with (8) is possible once we relate the demand \(d_i^t\) to the current values of all other state components and disturbances as expressed below:where we denote by \(B_{i \bullet }\) the ith row of the matrix B, with the same convention applying to \(E_{i \bullet }\).
$$\begin{aligned} x_i^{t+1}= x_i^t  d_i^t + u_i^t, \quad d_i^t \in {\mathscr {D}}_i^t. \end{aligned}$$
(8)
$$\begin{aligned} \begin{array}{lll} d_i^t &{} = &{} \left[ \sum \nolimits _{j=1}^n b_{ij} x_j^t + \sum \nolimits _{j=1}^n E_{ij} w_j^t \right] \\ &{} = &{}  \left[ \langle B_{i \bullet } x^t \rangle + \langle E_{i \bullet } w^t \rangle \right] , \end{array} \end{aligned}$$
(9)
In other words, we assume that the influence that all other states have on state i enters into Eq. (8) through demand \(d_i^t\) defined in (9).
Following the decomposition, each lot sizing model is controlled by an agent i (whose state is \(x_i\)) who plays against a virtual opponent which selects a worstcase demand, which can be viewed as a twoplayer game.
Our next step is to make the n dynamics in the form (8) mutually independent. Toward that end, we introduce \(X^t\) as the set of \(x^t\) and observe that this set is bounded for bounded \(d_i^t\). The set \(X^t\) can be defined in two steps. First, we assume that the states never leave a given region, then we compute the worstcase vector \(x^t\) in the region, namely the vector \(x^t\) that, once substituted in (9), has the effect of pushing the ith state out of the safe region. Then, we check whether the trajectory still lies within the region.
Boundedness of \(X^t\) means that there exists a scalar \(\phi >0 \) such that \(\Vert x\Vert _{\infty }\le \phi \) for all \(x\in X^t\). In view of this, it is possible to decompose the system by replacing the current demand \(d_i^t\) by the maximal or minimal demand as computed below:where \([B_{ij}]_+\) denotes the positive part of \(B_{ij}\), i.e., \(\max \{B_{ij},0\} \) and \([B_{ij}]_\) the negative part.
$$\begin{aligned} {\overline{d}}_i^t= & {} \max _{\xi \in X^t} \left\{  \langle B_{i \bullet } \xi \rangle  \langle E_{i \bullet } w^t \rangle \right\} = \sum \nolimits _{j}[B_{ij}]_ \phi  \langle E_{i \bullet } w^t \rangle \\ {\underline{d}}_i^t= & {} \min _{\xi \in X^t} \left\{  \langle B_{i \bullet } \xi \rangle  \langle E_{i \bullet } w^t \rangle \right\} = \sum \nolimits _{j}[B_{ij}]_+ \phi  \langle E_{i \bullet } w^t \rangle , \end{aligned}$$
From the above preamble, we derive the uncertainty set asLikewise, (10) describes the demand that would push the state out of the positive orthant in the longest time.
$$\begin{aligned} {\mathscr {D}}_i^t = \{\eta \in {\mathbb {R}}: \, {\underline{d}}_i^t\le \eta \le {\overline{d}}_i^t\}. \end{aligned}$$
4 Nash Equilibrium Strategies
In this section, we show that all Nash equilibrium strategies are threshold strategies of type (4): retailer i reorders only if the number of active retailers is greater than or equal to a given threshold. For the general model explained in Sect. 3, proving that strategies at a Nash equilibrium have a threshold structure is not straight forward, for that reason in this section the results are given for a single retailer i. To show this, in the next subsection we prove the optimality of the (s, S)like strategies via \({\mathscr {K}}\)convex analysis (see the definition in [5], chapter 4). We recall from [5] that (s, S) strategies are strategies where replenishments occur anytime the inventory level goes below a lower threshold s. Replenishments bring back the inventory level up to a higher threshold S [6]. This is formally stated below where \(\mu (.)\) is the strategy, x the inventory, and s and S lower and upper thresholds, respectively:We refer to (s, S)like strategies as (s, S) strategies whose thresholds depend on the players and on time, i.e., we will have \(s:=s_i^t\) and \(S:=S_i^t\) for fixed i and t.
$$\begin{aligned} \mu (x)=\left\{ \begin{array}{ll} Sx &{} \qquad \hbox {if} \quad x < s,\\ 0 &{} \qquad \hbox {if} \quad x \ge s. \end{array}\right. \end{aligned}$$
(10)
In Theorem 1, we prove the optimality of (s, S)like strategies. Before doing this, we need some preliminary analysis which is inspired by [5, Chapter 4].
Let \(K^t(u_{i}^t)=\frac{K}{1 + \sum \nolimits _{i\in \varGamma ,j\ne i} \delta (u_j^t)}\) be the transportation cost charged to each retailer i that replenishes at stage t. Fix decisions \(u_{i}^0,\ldots ,u_{i}^{N1}\) of all retailers other than i over the horizon and denote such decisions \({\bar{u}}_{i}^0,\ldots ,{\bar{u}}_{i}^{N1}\). Similarly, denote the resulting transportation costs by \(K^0,\ldots ,K^{N1}\). Note that \(K^t\) is a function of \(u_{i}^t\) but for ease of notation sometimes we omit the dependence. Then, let us rewrite the stage cost (2) for retailer i asNow, we can write the costtogo from stage t to the final stage recursively using dynamic programming and the Bellman equation. Let us use the superscript t to indicate the iteration. Then, we havewhere \(J^{0}_{i}(x_i^0,{\bar{u}}_{i}^0)\) is equal to the cost \(J_{i}(x_i^0,{u}_{i}^0)\) introduced in (3). Being \(y_i^t=x_i^t+u_i^t\), the instantaneous inventory position, i.e., the inventory level just after the order has been issued, let us define the new functionand rewrite the Bellman Eq. (11) as followsNote that if we can show that \(v_i^{t+1}\) is \({\mathscr {K}}\)convex with \({\mathscr {K}} = K^t\) then \(G_i^t\) is also \({\mathscr {K}}\)convex for \({\mathscr {K}} = K^{t}\) and the Bellman Eq. (13) has a unique minimizer. Indeed, it has been proved in [5], chapter 4.2, that \({\mathscr {K}}\)convexity of \(G_i^t(y_i^t,{\bar{u}}_{i}^{t+1})\) implies \({\mathscr {K}}\)convexity of \(v_i^t(x_i^t,{\bar{u}}_{i}^t)\). This represents a sufficient optimality condition for the (s, S)like strategies with thresholds depending on time t, that is, \(s:=s_i^t\) and \(S:=S_i^t\), where \(s_i^t\) and \(S_i^t\) satisfy:The meaning of \(s_i^t\) and \(S_i^t\) is exactly the same as in the (s, S) strategies (cfg. [5]), that is, \(s_i^t\) represents the minimum threshold on inventory level below which retailers replenish to restore the inventory up to level \(S_i^t\). Now, let us call \({\underline{s}}_i^t\), the threshold which corresponds to the assumption that the ith retailer is charged the whole transportation cost, i.e.,In the above condition, we have set \(K^t=K\).
$$\begin{aligned} g_{i}(x_i^t,u_i^t,{\bar{u}}_{i}^t)&= K^t \delta (u_i^t)+cu_i^t\\&\quad + p {\mathbb {E}} \{ \max (0,x_i^{t+1}) \} +h {\mathbb {E}} \{ \max (0,x_i^{t+1}) \}. \end{aligned}$$
$$\begin{aligned} v^{t}_{i}(x_i^t,{\bar{u}}_{i}^{t\sim N1})&=\min _{u_{i}^t\in U} [g_{i}(x_i^t,u_i^t,{\bar{u}}_{i}^t) \nonumber \\&\quad \, + {\mathbb {E}} \{ v^{t+1}_{i}(x_i^{t+1},{\bar{u}}_{i}^{t+1\sim N1}) \} ], \;\; t= 0,\ldots N1, \end{aligned}$$
(11)
$$\begin{aligned} {J_i^N(x_i^N)=0}, \end{aligned}$$
(12)
$$\begin{aligned} G^{t}_{i}(y_i^t,\,{\bar{u}}_{i}^{t+1\sim N1})&= cy_i^t+p {\mathbb {E}} \{ \max (0,(y_i^t\omega _{i}^t)) \} \\&\quad +\,h {\mathbb {E}} \{ \max (0,y_i^t\omega _i^t) \} + {\mathbb {E}} \{ v_{i}^{t+1}(x_i^{t+1},{\bar{u}}_{i}^{t+1\sim N1}) \}, \end{aligned}$$
$$\begin{aligned} v_{i}^{t}(x_{i}^t,{\bar{u}}_{i}^{t\sim N1})&=c_{i}x_{i}^t+\min _{y_{i}^t\ge x_i^t}[K^t(u_{i}^t)\nonumber \\&\quad +\,G_{i}^t(y_i^t,{\bar{u}}_{i}^{t+1\sim N1}),\,G_i^t(x_i^t,{\bar{u}}_{i}^{t+1\sim N1})]. \end{aligned}$$
(13)
$$\begin{aligned} S_i^t= & {} \arg \min _\gamma G_i^t(\gamma ,{\bar{u}}_{i}^{t+1\sim N1}),\\ G_i^t(s_i^t,{\bar{u}}_{i}^{t+1\sim N1})= & {} G_i^t(S_i^t,{\bar{u}}_{i}^{t+1\sim N1})+K^t(u_{i}^t). \end{aligned}$$
$$\begin{aligned} G_i^t\left( {\underline{s}}_i^t,{\bar{u}}_{i}^{t+1 \sim N1}\right) =G_i^t\left( S_i^t,{\bar{u}}_{i}^{t+1 \sim N1}\right) +K. \end{aligned}$$
Analogously, let us denote by \({\overline{s}}_i^t\) the threshold computed as if all retailers would share equally the transportation cost, i.e.,In essence, in the condition above, each retailer is charged a transportation cost \(K^t=\frac{K}{n}\), namely one nth of the full cost K. Hence, we have \({\underline{s}}_i \le s_i^t\le {\overline{s}}_i\).
$$\begin{aligned} G_i^t({\overline{s}}_i^t,{\bar{u}}_{i}^{t+1 \sim N1})=G_i^t(S_i^t,{\bar{u}}_{i}^{t+1 \sim N1})+\frac{K}{n}. \end{aligned}$$
The following theorem establishes the optimality of (s, S)like strategies, where each pair of thresholds is valid on different intervals of inventory levels.
Theorem 1
Let \(K^t\) be nondecreasing. Solutions of the Bellman Eq. (11) are at most N different (s, S)like strategies \((s_i^t,\,{S_i^t})\), \(t=0,\ldots ,N1\), where \(S_i^t \in \{\sum \nolimits _{{\hat{t}}=t}^{t+j}\omega _i({\hat{t}})\), \(j=t,\ldots ,N1t\)} and threshold \(s_i^t\) verifies \(G_i^t(s_i^t,{\bar{u}}_{i}^{t+1})=G_i^t(S_i^t,{\bar{u}}_{i}^{t+1})+K^t\).
Proof
The proof is by induction. Assume \(J_i^N(x_i^N)=0\), and consider the convex functionThen, we say that \(G^{N1}_i(\cdot )\) is convex and hence, it is also \({\mathscr {K}}\)convex where \({\mathscr {K}}=K^{N1}\) as shown in Fig. 1. Here, we also use the notationThe above reasoning on \({\mathscr {K}}\)convexity implies that the piecewise linear functionis \(K^{N1}\)convex, with a global minimum at \({S_i^{N1}}:= \arg \min _{\gamma } G^{N1}_{i}(\gamma ,{\bar{u}}_{i}^N)\) (in the deterministic case if the cost of purchase is relatively small then \({S_i^{N1}}=\omega _i^{N1}\)) (see, e.g., Fig. 1).
$$\begin{aligned} G^{N1}_{i}\left( y_{i}^{N1},{\bar{u}}_{i}^N\right)&= cy_{i}^{N1}+p {\mathbb {E}} \{ \max \left( 0,\left( y_{i}^{N1}\omega _{i}^{N1}\right) \right) \} \nonumber \\&\quad +\, h {\mathbb {E}} \{ \max \left( 0,y_{i}^{N1}\omega _{i}^{N1}\right) \} . \end{aligned}$$
(14)
$$\begin{aligned} H_i^{N1}(x_i)&:=p {\mathbb {E}} \{ \max (0,(x_{i}^{N1}\omega _{i}^{N1}) \} \\&\quad +\,h {\mathbb {E}} \{ \max (0,x_{i}^{N1}\omega _{i}^{N1}) \}. \end{aligned}$$
$$\begin{aligned} v_{i}^{N1}\left( x_{i}^{N1},{\bar{u}}_{i}^{N1}\right)&=c_{i}x_{i}^{N1} +\min _{y_{i}^{N1}\ge x_i^{N1}}\left[ K^{N1}\left( u_{i}^{N1}\right) \right. \nonumber \\&\quad \left. +\,G_{i}^{N1}\left( y_i^{N1},{\bar{u}}_{i}^N\right) , G_i^{N1}\left( x_i^{N1},{\bar{u}}_{i}^N\right) \right] \end{aligned}$$
(15)
To obtain \(S_i^{N1}\), let a probability distribution function \(\phi ^{N1}: {\mathbb {Z}}_+ \rightarrow [0,1]\) be given, namely \(\omega \mapsto \phi ^{N1}(\omega )\) where \(\phi ^{N1}(\omega )\) is the probability that \(\omega _i^{N1} = \omega \) for all \(\omega \in {\mathbb {Z}}_+\).
Then, the cost of reordering is given bywhere \({\mathbf {E}}_h^t(\gamma )\) and \({\mathbf {E}}_s^t(\gamma )\) are the expected holding and shortage, respectively, defined as:Let the discrete difference operator be given, \(\frac{{\text {d}}}{{\text {d}}S}\) and let us apply such an operator to function \( G^{N1}_{i}\left( \gamma ,{\bar{u}}_{i}^N\right) = c_i \left( \gamma  x_{i}^{N1} \right) + h {\mathbf {E}}_h(\gamma ) + p {\mathbf {E}}_s(\gamma ). \) Then, we havewhereIn the above equations, we make use of the following conditionsThe orderupto level \(S_i^{N1}\) is the optimal \(\gamma \), which is obtained from solvingFrom the above, we then obtainTo obtain \(s_i^{N1}\), let us consider the cost of not reordering, which is given byAlso, we haveNow, we are going to assume that the statement is true for some \(t=m\), and we are going to proof that it is also valid for \(t=m1\).
$$\begin{aligned}&K^{N1}(u_{i}^{N1}) c_{i}x_{i}^{N1} +G_{i}^{N1}(\gamma ,{\bar{u}}_{i}^N) \\&\quad = K^{N1}(u_{i}^{N1}) + c_i u_{i}^{N1} \\&\qquad +p {\mathbb {E}} \{ \max (0,(\gamma \omega _{i}^{N1}))\} + h {\mathbb {E}} \{ \max (0,\gamma \omega _{i}^{N1}) \} \\&\quad = K^{N1}(u_{i}^{N1}) + c_i ( \gamma  x_{i}^{N1} ) + h \mathbf{E}_h^{N1}(\gamma ) + p {\mathbf {E}}_s^{N1}(\gamma ), \end{aligned}$$
$$\begin{aligned} {\mathbf {E}}_h^t(\gamma ):={\mathbb {E}} \left\{ \max \left( 0,\gamma \omega _{i}^t\right) \right\} , \end{aligned}$$
$$\begin{aligned} {\mathbf {E}}_s^t\left( \gamma ):={\mathbb {E}}\left\{ \max (0,\left( \gamma \omega _{i}^t\right) \right) \right\} . \end{aligned}$$
$$\begin{aligned} \frac{\text {{d}}}{\text {{d}}\gamma } G^{N1}_{i}\left( \gamma ,{\bar{u}}_{i}^N\right)&:= G^{N1}_{i}\left( \gamma +1,{\bar{u}}_{i}^N\right)  G^{N1}_{i}\left( \gamma ,{\bar{u}}_{i}^N\right) \\&\quad = c_i + h \varPhi ^{N1}_\omega [\gamma ]  p\left( 1  \varPhi ^{N1}_\omega [\gamma ]\right) . \end{aligned}$$
$$\begin{aligned} \varPhi ^t_\omega [\gamma ]:=\sum \nolimits _{\omega =0}^\gamma \phi ^t_{\omega }, \quad 1 \varPhi ^t_\omega [\gamma ]:=\sum \nolimits _{\omega =\gamma +1}^\infty \phi ^t_{\omega }. \end{aligned}$$
$$\begin{aligned} \begin{array}{ll} \sum \nolimits _{\omega =0}^{\gamma +1 }(\gamma +1 \omega ) \phi ^{N1}_{\omega } = \sum \nolimits _{\omega =0}^{\gamma }(\gamma +1 \omega ) \phi ^{N1}_{\omega } \\ \qquad = \sum \nolimits _{\omega =0}^{\gamma }(\gamma \omega ) \phi ^{N1}_{\omega } + \sum \nolimits _{\omega =0}^{\gamma } \phi ^{N1}_{\omega },\\ \sum \nolimits _{\omega =\gamma +2}^\infty (\omega  \gamma 1) \phi ^{N1}_{\omega }= \sum \nolimits _{\omega =\gamma +1}^\infty (\omega  \gamma 1) \phi ^{N1}_{\omega } \\ \qquad = \sum \nolimits _{\omega =\gamma +1}^\infty (\omega  \gamma ) \phi ^{N1}_{\omega }  \sum \nolimits _{\omega =\gamma +1}^\infty \phi ^{N1}_{\omega }.\end{array} \end{aligned}$$
(16)
$$\begin{aligned}&\min _\gamma \, \left\{ \gamma  \, \frac{\text {d}}{{\text {d}}\gamma } G^{N1}_{i}\left( \gamma ,{\bar{u}}_{i}^N\right) \ge 0\right\} \\&\quad = \min _\gamma \,\left\{ \gamma  \, c_i + h \varPhi ^{N1}_\omega [\gamma ]  p \left( 1  \varPhi ^{N1}_\omega [\gamma ]\right) \ge 0\right\} . \end{aligned}$$
$$\begin{aligned} S_i^{N1} = \arg \min _\gamma \Big \{\gamma  \, \varPhi ^{N1}_\omega [\gamma ] \ge \frac{c_i + p }{h +p} \Big \}. \end{aligned}$$
$$\begin{aligned} c_{i}x_{i}^{N1} +G_{i}^{N1}(x_{i}^{N1},{\bar{u}}_{i}^N)&= p {\mathbb {E}} \{ \max (0,(x_{i}^{N1}\omega _{i}^{N1}))\} \\&\quad +\, h {\mathbb {E}} \{ \max (0,x_{i}^{N1} \omega _{i}^{N1}) \} \\&= h {\mathbf {E}}_h(x_{i}^{N1}) + p {\mathbf {E}}_s(x_{i}^{N1}) \end{aligned}$$
$$\begin{aligned} s_i^{N1}&:= \arg \min _{x_{i}^{N1}} \left\{ x_{i}^{N1}  \, h {\mathbf {E}}_h\left( x_{i}^{N1}\right) + p {\mathbf {E}}_s\left( x_{i}^{N1}\right) \right. \\&\le K^{N1}\left( u_{i}^{N1}\right) c_i \left( S_{i}^{N1}  x_{i}^{N1}\right) \\&\left. \quad +\, h {\mathbf {E}}_h\left( S_{i}^{N1}\right) + p {\mathbf {E}}_s\left( S_{i}^{N1}\right) \right\} . \end{aligned}$$
Consider now the convex function (see Fig. 2 which illustrate the example of \(t=N2\))We know that \(G_i^{m1}\) is \({\mathscr {K}}\)convex, with \({\mathscr {K}}=K^{m1}\). This property implies that the functionis \(K^{m1}\)convex, with a global minimum at \(S_i^{m1}:=\) \(argmin_{\gamma }G_i^{m1}(\gamma ,{\bar{u}}_{i}^m)\). It is important to notice that we can ensure the existence of a unique minimum value in (18) thanks to the nondecreasing property of \(K^{m1}\).
$$\begin{aligned} G_i^{m1}(y_{i}^{m1},{\bar{u}}_{i}^{m})&= c_iy_{i}^{m1}+p {\mathbb {E}} \left\{ \max \left( 0,\left( y_{i}^{m1}\omega _{i}^{m1}\right) \right) \right\} \nonumber \\&\quad +\,h {\mathbb {E}} \left\{ \max \left( 0,y_{i}^{m1}\omega _{i}^{m1}\right) \right\} + {\mathbb {E}} \left\{ v_{i}^{m}\left( x_i^{m},{\bar{u}}_{i}^{m}\right) \right\} \nonumber \\&= c_i y_{i}^{m1} + h {\mathbf {E}}_h\left( y_{i}^{m1}\right) + p {\mathbf { E}}_s\left( y_{i}^{m1}\right) \nonumber \\&\quad + \sum \nolimits _{\omega =0}^{\infty } v_{i}^{m}\left( y_{i}^{m1}\omega , {\bar{u}}_{i}^{m}\right) \phi ^{m1}_{\omega }. \end{aligned}$$
(17)
$$\begin{aligned} v_{i}^{m1}\left( x_{i}^{m1},{\bar{u}}_{i}^{m1}\right)&= c_{i}x_{i}^{m1}+\min _{y_{i}^{m1}\ge x_i^{m1}}\left[ K^{m1}\left( u_{i}^{m1}\right) \right. \nonumber \\&\quad \left. +\,G_{i}^{m1}\left( y_{i}^{m1},{\bar{u}}_{i}^{m}\right) , G_{i}^{m1}\left( x_{i}^{m1},{\bar{u}}_{i}^{m}\right) \right] , \end{aligned}$$
(18)
The cost of reordering for \(t=m1\) is given byApplying operator \(\frac{{\text {d}}}{{\text {d}}\gamma }\) to function \(G_i^{m1}(\gamma ,{\bar{u}}_{i}^m)\), we haveHence, the orderupto level \(S_i^{m1}\) is the optimal \(\gamma \), which is obtained from solvingTo obtain \(s_i^{m1}\), let us consider the cost of not reordering, which is given byThen, we haveThe above can be rewritten asThus by induction backwards in time, we have proved Theorem 1. \(\square \)
$$\begin{aligned}&K^{m1}\left( u_{i}^{m1}\right) c_{i}x_{i}^{m1} +G_{i}^{m1}\left( \gamma ,{\bar{u}}_{i}^m\right) \\&\quad = K^{m1}\left( u_{i}^{m1}\right) + c_i u_{i}^{m1} \\&\qquad +p {\mathbb {E}} \left\{ \max \left( 0,\left( \gamma \omega _{i}^{m1}\right) \right) \right\} + h {\mathbb {E}} \left\{ \max \left( 0,\gamma \omega _{i}^{m1}\right) \right\} \\&\quad = K^{m1}(u_{i}^{m1}) + c_i \left( \gamma  x_{i}^{m1} \right) + h \mathbf{E}_h^{m1}(\gamma ) + p {\mathbf {E}}_s^{m1}(\gamma ). \end{aligned}$$
$$\begin{aligned} \frac{\text {d}}{\text {d}\gamma } G^{m1}_{i}(\gamma ,{\bar{u}}_{i}^m)&:= G^{m1}_{i}\left( \gamma +1,{\bar{u}}_{i}^m\right)  G^{m1}_{i}\left( \gamma ,{\bar{u}}_{i}^m\right) \\&= c_i + h \varPhi ^{m1}_\omega [\gamma ]  p \left( 1  \varPhi ^{m1}_\omega [\gamma ]\right) + \\&\quad \sum \nolimits _{\omega =0}^{\infty } \left[ v_{i}^{m}(\gamma +1\omega , \cdot )  v_{i}^{m}\left( \gamma \omega , \cdot \right) \right] \phi ^{m1}_{\omega }. \end{aligned}$$
$$\begin{aligned} S_i^{m1}&= \arg \min _\gamma \,\left\{ \gamma  \, c_i + h \varPhi ^{m1}_\omega [\gamma ]  p \left( 1  \varPhi ^{m1}_\omega [\gamma ]\right) \right. \\&\left. \quad + \sum \nolimits _{\omega =0}^{\infty } \left[ v_{i}^{m}\left( \gamma +1\omega , \cdot \right)  v_{i}^{m}\left( \gamma \omega , \cdot \right) \right] \phi ^{m1}_{\omega } \ge 0\right\} . \end{aligned}$$
$$\begin{aligned}&c_{i}x_{i}^{m1} +G_{i}^{m1}\left( x_{i}^{m1},{\bar{u}}_{i}^{m}\right) \\&\quad = h {\mathbf {E}}_h\left( x_{i}^{m1}\right) + p {\mathbf {E}}_s(x_{i}^{m1}) + \sum \nolimits _{\omega =0}^{\infty } v_{i}^{m}\left( x_{i}^{m1}\omega , {\bar{u}}_{i}^{m}\right) \phi ^{m1}_{\omega }. \end{aligned}$$
$$\begin{aligned} s_i^{m1}&:= \arg \min _{x_{i}^{m1}} \left\{ x_{i}^{m1}  \, c_{i}x_{i}^{m1} +G_{i}^{m1}(x_{i}^{m1} ,{\bar{u}}_{i}^{m}) \right. \\&\left. \quad + \sum \nolimits _{\omega =0}^{\infty } v_{i}^{m}(x_{i}^{m1}\omega , {\bar{u}}_{i}^{m}) \phi ^{m1}_{\omega }\right. \\&\left. \le K^{m1}(u_{i}^{m1})c_{i}S_{i}^{m1} +G_{i}^{m1}(S_{i}^{m1},{\bar{u}}_{i}^{m})\right. \\&\left. \quad + \sum \nolimits _{\omega =0}^{\infty } v_{i}^{m}(S_{i}^{m1}\omega , {\bar{u}}_{i}^{m}) \phi ^{m1}_{\omega }\right\} . \end{aligned}$$
$$\begin{aligned} s_i^{m1}&:= \arg \min _{x_{i}^{m1}} \Big \{x_{i}^{m1}  \, h {\mathbf {E}}_h(x_{i}^{m1}) + p {\mathbf {E}}_s(x_{i}^{m1}) \\&\quad + \sum \nolimits _{\omega =0}^{\infty } v_{i}^{m}(x_{i}^{m1}\omega , {\bar{u}}_{i}^{m}) \phi ^{m1}_{\omega } \\&\le K^{m1}(u_{i}^{m1})c_i ( S_{i}^{m1}  x_{i}^{m1} ) \\&\quad + h {\mathbf {E}}_h(S_{i}^{m1}) + p {\mathbf { E}}_s(S_{i}^{m1})\\&\quad + \,\sum \nolimits _{\omega =0}^{\infty } v_{i}^{m}(S_{i}^{m1}\omega , {\bar{u}}_{i}^{m}) \phi ^{m1}_{\omega } \Big \}. \end{aligned}$$
×
×
We can reinterpret the (s, S)like strategies as threshold strategies on the number of active retailers. The result is that all Nash equilibrium strategies have the threshold structure (4).
In the following result on a singlestage inventory game (where we have dropped index t), we reinterpret a threshold on the inventory level as a threshold on the number of “active retailers”.
Theorem 2
For each inventory level \(x_i\), there exists a threshold \(l_i\in \{1,\,2,\,\ldots ,\,n\}\), such that the replenishment strategyis a Nash equilibrium for the singlestage formulation of the inventory game. For the sake of simplicity, we have dropped dependence on time.
$$\begin{aligned} \mu _i(x_i,a)=\left\{ \begin{array}{cc} S_ix_i, &{} \qquad \hbox {if} \quad a \ge l_i,\\ 0, &{} \qquad \hbox {if} \quad a < l_i, \end{array}\right. \end{aligned}$$
(19)
Proof
From Theorem 1, if \(N=1\), we have a unique multiperiod strategy \((s_i,\,S_i)\). This means that the retailers make decisions according toNote that from \(G_i(s_i)= G_i(S_i)+(\frac{K}{a})\) we have that \(s_i\) depends on the number of active players a. Now, for given \(x_i\), the idea is to find \(l_i\) as the minimum number of active players such that the cost of replenishing does not exceed the cost of not replenishing. This can be expressed by the minimization below (in a singlestage optimization, we can drop the second argument \({\bar{u}}_{i}^{t+1}\) from \(G_i(.,.)\))Strategy (20) implies (19) once we compute \(l_i\) from (21) for fixed \(x_i\).
$$\begin{aligned} u_i=\mu _i(x_i)=\left\{ \begin{array}{cc} S_ix_i, &{} \qquad \hbox {if} \quad x_i < s_i,\\ 0, &{} \qquad \hbox {if} \quad x_i \ge s_i. \end{array}\right. \end{aligned}$$
(20)
$$\begin{aligned} \begin{array}{ll} l_i=\min _{a=1,\ldots ,n} \Big \{ \, a \, G_i(x_i)\ge G_i(s_i),\, G_i(s_i)=G_i(S_i)+(K/a)\Big \}. \end{array} \end{aligned}$$
(21)
In solving (21), we distinguish three cases.

The inventory level is “low,” namely, \(x_i < {\underline{s}}_i\). Then, the optimal decision is “replenish” independently of a. Actually, the minimization (21) returns \(l_i=1\) and as it always holds \(a \ge l_i\) we have \(\mu _i(x_i,a)=S_ix_i\).

The inventory level is “high,” namely \(x_i \ge {\overline{s}}_i\). Then, the optimal decision is “do not replenish.” Indeed, the minimization (21) is infeasible. With a little abuse of notation, we can take \(l_i=n+1\) so that it always holds \(a<l_i\) and therefore also \(\mu _i(x_i,a)=0\).

The inventory level verifies \({\underline{s}}_i \le x_i \le {\overline{s}}_i\). To see this, note that the computation of \(l_i\) as in (21) leads to \(1 \le l_i \le n\). Then, if \(a\ge l_i\) from (21), we have \(x_i<s_i\) which substituted in (20) returns \(\mu _i(x_i,a)=S_ix_i\). Differently if \(a< l_i\) from (21), we have \(x_i \ge s_i\) which, again, substituted in (20) returns \(\mu _i(x_i,a)=0\). The obtained strategy for \(\mu _i(x_i,a)\) is in accordance with (19), and this concludes the proof. \(\square \)
5 SingleStage Coordination
In this section, we specialize our results to the case of singlestage game. In particular, we provide explicit expressions for the two thresholds, as a function of the probability distribution function which determines the stochastic demand.
Let us start by noting that in the singlestage game function \(G^{t}_{i}(y_{i}^{t},{\bar{u}}_{i}^{t+1 \sim N1})\) does not depend on \({\bar{u}}_{i}^N\) and therefore, we simply write \(G^{t}_{i}(y_{i}^{t})\):Then, we have for the value functionTo obtain \(S_i^t\), consider the cost of reordering, which is given byLet the discrete difference operator be given, \(\frac{{\text {d}}}{{\text {d}}S}\) and let us apply such an operator to functionBy applying the difference operator to function \(G^t_{i}(\gamma )\), we then haveFurther derivations yieldIn the above, we have used the following equalitiesThe orderupto level \(S_i^t\) is the optimal \(\gamma \), which is obtained from solvingFrom the above, we then obtainTo obtain \(s_i^t\), let us consider the cost of not reordering, which is given byFrom the above, we then obtainIn particular, we haveEquations (25) and (27) represent explicit expressions for the two thresholds and fully characterize then the reordering strategy once the probability distribution of the stochastic demand is given.
$$\begin{aligned} G^{t}_{i}\left( y_{i}^{t}\right) = cy_{i}^t+p {\mathbb {E}} \left\{ \max \left( 0,\left( y_{i}^t\omega _{i}^t\right) \right) \right\} + h {\mathbb {E}} \left\{ \max \left( 0,y_{i}^t\omega _{i}^t\right) \right\} . \end{aligned}$$
(22)
$$\begin{aligned} v_{i}^t\left( x_{i}^t,{\bar{u}}_{i}^t\right) =c_{i}x_{i}^t+\min _{y_{i}^t\ge x_i^t}\left[ K^t\left( u_{i}^t\right) +G_{i}^t\left( y_i^t\right) , G_i^t\left( x_i^t\right) \right] . \end{aligned}$$
(23)
$$\begin{aligned}&K^t\left( u_{i}^t\right) c_{i}x_{i}^t +G_{i}^t(\gamma )\\&\quad =K^t\left( u_{i}^t\right) + c_i u_{i}^t + p {\mathbb {E}} \{ \max \left( 0,\left( \gamma \omega _{i}^t\right) \right) \} \\&\qquad + h {\mathbb {E}} \{ \max \left( 0,\gamma \omega _{i}^t\right) \} \\&\quad =K^t\left( u_{i}^t\right) + c_i \left( \gamma  x_{i}^t \right) \\&\qquad +p {\mathbb {E}} \{ \max (0,(\gamma \omega _{i}^t))\} + h {\mathbb {E}} \{ \max \left( 0,\gamma \omega _{i}^t\right) \} \\&\quad = K^t(u_{i}^t) + c_i \left( \gamma  x_{i}^t \right) + h {\mathbf {E}}_h(\gamma ) + p {\mathbf {E}}_s(\gamma ). \end{aligned}$$
$$\begin{aligned} G^t_{i}(\gamma ) = c_i ( \gamma  x_{i}^t ) + h \underbrace{\sum \nolimits _{\omega =0}^\gamma (\gamma \omega ) \phi ^t_{\omega }}_{{\mathbf {E}}_h(\gamma )} + p \underbrace{\sum \nolimits _{\omega =\gamma +1}^\infty (\omega  \gamma ) \phi ^t_{\omega }}_{{\mathbf {E}}_s(\gamma )}. \end{aligned}$$
$$\begin{aligned} \frac{\text {d} }{\text {d}\gamma } G^t_{i}(\gamma )&:= G^t_{i}(\gamma +1)  G^t_{i}(\gamma ) \\&= c_i \left( \gamma +1  x_{i}^t\right) + h \sum \nolimits _{\omega =0}^{\gamma +1}(\gamma +1 \omega ) \phi ^t_{\omega } \\&\quad + p \sum \nolimits _{\omega =\gamma +2}^\infty (\omega  \gamma 1) \phi ^t_{\omega } \\&\quad \, c_i ( \gamma  x_{i}^t)  h \sum \nolimits _{\omega =0}^\gamma (\gamma \omega ) \phi ^t_{\omega }  p \sum \nolimits _{\omega =\gamma +1}^\infty (\omega  \gamma ) \phi ^t_{\omega }. \end{aligned}$$
$$\begin{aligned} \frac{\text {d}}{\text {d}\gamma } G^t_{i}(\gamma )&= c_i \left( \gamma +1  x_{i}^t\right) + h \left[ \sum \nolimits _{\omega =0}^{\gamma }(\gamma \omega ) \phi ^t_{\omega } + \sum \nolimits _{\omega =0}^{\gamma }\phi ^t_{\omega }\right] \\&\quad + p \left[ \sum \nolimits _{\omega =\gamma +1}^\infty (\omega  \gamma ) \phi ^t_{\omega }  \sum \nolimits _{\omega =\gamma +1}^\infty \phi ^t_{\omega }\right]  c_i \left( \gamma  x_{i}^t\right) \\&\quad  h \sum \nolimits _{\omega =0}^\gamma (\gamma \omega ) \phi ^t_{\omega }  p \sum \nolimits _{\omega =\gamma +1}^\infty (\omega  \gamma ) \phi ^t_{\omega } \\&= c_i + h \sum \nolimits _{\omega =0}^\gamma \phi ^t_\omega  p \sum \nolimits _{\omega =\gamma +1}^\infty \phi ^t_{\omega }\\&= c_i + h \varPhi ^t_\omega [\gamma ]  p \left( 1  \varPhi ^t_\omega [\gamma ]\right) . \end{aligned}$$
$$\begin{aligned}&\sum \nolimits _{\omega =0}^{\gamma +1 }(\gamma +1 \omega ) \phi ^t_{\omega } = \sum \nolimits _{\omega =0}^{\gamma }(\gamma +1 \omega ) \phi ^t_{\omega } \nonumber \\&\quad = \sum \nolimits _{\omega =0}^{\gamma }(\gamma \omega ) \phi ^t_{\omega } + \sum \nolimits _{\omega =0}^{\gamma } \phi ^t_{\omega }, \nonumber \\&\qquad \sum \nolimits _{\omega =\gamma +2}^\infty (\omega  \gamma 1) \phi ^t_{\omega }= \sum \nolimits _{\omega =\gamma +1}^\infty (\omega  \gamma 1) \phi ^t_{\omega } \nonumber \\&\quad = \sum \nolimits _{\omega =\gamma +1}^\infty (\omega  \gamma ) \phi ^t_{\omega }  \sum \nolimits _{\omega =\gamma +1}^\infty \phi ^t_{\omega }. \end{aligned}$$
(24)
$$\begin{aligned}&\min _\gamma \, \{\gamma  \, \frac{\text {d}}{\text {d}\gamma } G^t_{i}(\gamma ) \ge 0\} \\&\quad = \min _\gamma \,\{ \gamma  \, c_i + h \varPhi ^t_\omega [\gamma ]  p (1  \varPhi ^t_\omega [\gamma ]) \ge 0\}. \end{aligned}$$
$$\begin{aligned} S_i^t = \arg \min _\gamma \Big \{\gamma  \, \varPhi ^t_\omega [\gamma ] \ge \frac{c_i + p }{h +p} \Big \}. \end{aligned}$$
(25)
$$\begin{aligned} c_{i}x_{i}^t +G_{i}^t\left( x_{i}^t\right)&= p {\mathbb {E}} \{ \max \left( 0,\left( x_{i}^t\omega _{i}^t\right) \right) \}\nonumber \\&\quad +\, h {\mathbb {E}} \{ \max \left( 0,x_{i}^t \omega _{i}^t\right) \} \nonumber \\&= h {\mathbf {E}}_h\left( x_{i}^t\right) + p {\mathbf {E}}_s\left( x_{i}^t\right) . \end{aligned}$$
(26)
$$\begin{aligned} \begin{array}{ll} s_i^t:= \arg \min _{x_{i}^t} \left\{ x_{i}^t  \, c_{i}x_{i}^t +G_{i}^t\left( x_{i}^t\right) \le K^t\left( u_{i}^t\right) c_{i}S_{i}^t +G_{i}^t\left( S_{i}^t\right) \right\} . \end{array} \end{aligned}$$
$$\begin{aligned} s_i^t&:= \arg \min _{x_{i}^t} \left\{ x_{i}^t  \, h {\mathbf {E}}_h\left( x_{i}^t\right) + p {\mathbf {E}}_s(x_{i}^t)\right. \nonumber \\&\left. \le K^t\left( u_{i}^t\right) +c_i \left( S_{i}^t  x_{i}^t\right) + h \mathbf{E}_h\left( S_{i}^t\right) + p {\mathbf {E}}_s\left( S_{i}^t\right) \right\} . \end{aligned}$$
(27)
Once thresholds are obtained, we implement the control \(u_i^t\) which is given byThe resulting dynamics is then
$$\begin{aligned} u_i^t=\mu (x^t)=\left\{ \begin{array}{ll} S_i^tx^t, &{} \qquad \hbox {if} \quad x^t < s_i^t,\\ 0, &{} \qquad \hbox {if} \quad x^t \ge s_i^t. \end{array}\right. \end{aligned}$$
(28)
$$\begin{aligned} x_i^{t+1}=\left\{ \begin{array}{lll} S_i^t\omega _i^t, &{} \qquad \hbox {if} \quad x_i^t < s_i^t,\\ x_i^t\omega ^t, &{} \qquad \hbox {if} \quad x_i^t \ge s_i^t. \end{array} \right. \end{aligned}$$
(29)
6 Numerical Analysis
We consider an example where the demand \(\omega ^t \in \varOmega := \{0,1,2\}\) and is uniformly distributed, namely after introducing the notation \(\phi _\omega \) to indicate the probability that \(\omega ^t = \omega \), we have \(\phi _\omega =\frac{1}{3}\) for \( \omega = 0,1,2\).
Assume that the proportional purchase cost is \(c=1\), the shortage cost is \(p=10\), and the holding cost is \(h=2\). In the case of single stage optimization, we have that the order up to level is given byFrom the above, we obtain \(S=2\). Indeed for \(\gamma =2\), we haveDifferently, for \(\gamma =1\) it holdsand thereforeAs for the reorder level s, we haveWe show next that we have \(s=1\). Actually, for \(x=1\) we obtainwhich is satisfied by any \(K^t \ge 1\).
$$\begin{aligned} S = \arg \min _\gamma \Big \{\gamma  \, \varPhi ^t_\omega [\gamma ] \ge \frac{c + p }{h +p} \Big \}. \end{aligned}$$
$$\begin{aligned} \varPhi ^t_\omega [2] =1 \ge \frac{c + p }{h +p} = \frac{3}{4}.\end{aligned}$$
$$\begin{aligned}\varPhi ^t_\omega [1] = \frac{2}{3} \not \ge \frac{c + p }{h +p} = \frac{3}{4},\end{aligned}$$
$$\begin{aligned}S=\arg \min _\gamma \Big \{\gamma  \, \varPhi ^t_\omega [\gamma ] \ge \frac{c + p }{h +p} \Big \} =2.\end{aligned}$$
$$\begin{aligned} s&:= \arg \min _{x} \Big \{x  \, h {\mathbf {E}}_h(x) + p {\mathbf { E}}_s(x) \\&\le K^t +c ( S  x ) + h {\mathbf {E}}_h(S) + p {\mathbf {E}}_s(S) \Big \}. \end{aligned}$$
$$\begin{aligned}&h {\mathbf {E}}_h(1) + p {\mathbf {E}}_s(1) = h \frac{1}{3} + p \frac{1}{3} = 4 \\&\quad \le K^t +c + h {\mathbf {E}}_h(2) + p {\mathbf {E}}_s(2)\\&\quad = K^t + c + h {\mathbf {E}}_h(2) =K^t + 3, \end{aligned}$$
For \(x=0\), we havewhich is satisfied by any \(K^t < 6\). For any \(K^t < 6\), we then haveWe can conclude then that for any \(K^t\) such that \(1 \le K^t < 6\) we have the reorder level \(s=1\) and the orderupto level \(S=2\).
$$\begin{aligned}&h {\mathbf {E}}_h(0) + p {\mathbf {E}}_s(0) = p {\mathbf {E}}_s(0) = 10 \\&\quad \nleq K^t + 2 c + h {\mathbf {E}}_h(2) + p {\mathbf {E}}_s(2)\\&\quad = K^t + 2c + h {\mathbf {E}}_h(2) =K^t + 4, \end{aligned}$$
$$\begin{aligned} \begin{array}{ll}s:= \arg \min _{x} \Big \{x  \, h {\mathbf {E}}_h(x) + p {\mathbf {E}}_s(x)\\ \qquad \le K^t +c ( S  x ) + h {\mathbf {E}}_h(S) + p {\mathbf {E}}_s(S) \Big \}=1. \end{array} \end{aligned}$$
Then, from (29) the microscopic dynamics is defined in the bounded support \(\{1,0,1,2\}\), namely \(x^t \in \{1,0,1,2\}\) for all \(t\ge 0\) and is given byFigure 3 displays the time plot of the microscopic dynamics for a single player. In other words, the plot shows the inventory level (the state) of a player. The player’s inventory is for most of the time in state 0 and 1, which is in accordance with the greater values of the distribution in those states.
$$\begin{aligned} x^{t+1}=\left\{ \begin{array}{ll} 2\omega ^t, &{} \qquad \hbox {if} \quad x^t =1, 0,\\ x^t\omega ^t, &{} \qquad \hbox {if} \quad x^t = 1,2. \end{array}\right. \end{aligned}$$
(30)
×
In the following example, we consider a larger instance involving five agents, where the demand of each agent \(w^t \in \varOmega := \{0,1,\ldots ,20\}\) and is uniformly distributed.
Assume the same purchase, shortage, and holding costs as in the previous example and consider a transportation cost \(K = 120\), which will be divided among the active agents at each time \(t \in [0,50]\).
Figure 4 shows the relation between the inventory levels and the transportation costs that each player is willing to pay in case of reordering as well as the minimum number of active agents in case of replenishment for any inventory level. It is possible to see that the inventory has an inverse relation with the transportation cost and an increase relation with the number of active agents. This means that if the inventory level of agent i is higher, the agent is willing to pay less in case of reordering and hence, it is expected to require a large number of active agents to coordinate with.
×
×
×
The last two figures (Figs. 5, 6) display the inventory level of the five players over time. In Fig. 5, it is possible to see the moment in time when it is most convenient that the players coordinate for replenishment. On the other hand, Fig. 6 exhibits the relation of the inventory level and the number of active agents at each time. It is clear that the agents reorder when its inventory level is lower or equal to the threshold s, which also depends on the number of active agents, and they reorder up to the upper threshold \(S=15\).
7 Conclusions and Future Works
We first developed an abstraction in the form of a dynamic coordination game model where each player’s dynamics is a scalar inventory model characterized by a controlled input and an uncontrolled output. The players have to pay a share of the activation cost to control their dynamics at a given time. First, we showed that if the retailers are rational players, then they benefit from using threshold strategies where the threshold is on the number of active players. We then turned to obtain an explicit expressions for the lower and upper thresholds under specific circumstances. A main key direction for future works is to explore the feasibility of the proposed coordination scheme in multivector energy systems (heat, gas, power) with special focus on coalitional bidding in decentralized energy trade. The ultimate goal is to investigate the benefits of aggregating independent wind power producers.
Open AccessThis article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.