Published in: Dynamic Games and Applications 4/2020

Open Access 09-04-2019

A Dynamic Game Approach for Demand-Side Management: Scheduling Energy Storage with Forecasting Errors

Authors: Matthias Pilz, Luluwah Al-Fagih


Smart metering infrastructure allows for two-way communication and power transfer. Based on this promising technology, we propose a demand-side management (DSM) scheme for a residential neighbourhood of prosumers. Its core is a discrete time dynamic game to schedule individually owned home energy storage. The system model includes an advanced battery model, local generation of renewable energy, and forecasting errors for demand and generation. We derive a closed-form solution for the best response problem of a player and construct an iterative algorithm to solve the game. Empirical analysis shows exponential convergence towards the Nash equilibrium. A comparison of a DSM scheme with a static game reveals the advantages of the dynamic game approach. We provide an extensive analysis on the influence of the forecasting error on the outcome of the game. A key result demonstrates that our approach is robust even in the worst-case scenario. This grants considerable gains for the utility company organising the DSM scheme and its participants.
This work was supported by the Doctoral Training Alliance (DTA) Energy.
This article is part of the topical collection “Dynamic Games for Smart Energy Systems”.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

1 Introduction

Climate change poses a serious threat to the global ecosystem. To limit the increase in global average temperatures, it is critical to restrict greenhouse gas emissions. Currently, burning fossil fuels accounts for the largest share of CO\(_2\) emissions by humans into the atmosphere. Renewable energy sources, such as wind and solar, have a much smaller carbon footprint and should be employed instead [16]. Due to the intermittent nature of these sources, their integration into the power system can be a challenging task. Our research investigates possibilities for more efficient and environmentally friendly access to electricity by means of energy storage and renewable energy generation.
The concept rests upon the implementation of a technologically advanced power grid. In contrast to the current power grid, this smart grid features two-way communication and power transfer between the utility company (UC) and individual households [7]. Its decentralised nature is expressed through distributed generation and storage of energy, with individual households capable of doing both. These households are called prosumers (a combination of producer and consumer). Moreover, the deployment of smart meters allows households to accurately measure electricity demands in real time. This permits the implementation of demand-side management (DSM) schemes. Within such schemes, the UC incentivises users to avoid consumption during peak hours by means of dynamic pricing tariffs. These tariffs determine the price per energy unit based on the aggregated load of all users (cf. [4, 10, 24]). The resulting flatter load profile will eventually allow the UC to reduce investments into fast ramping technologies, which would otherwise be needed.
In [4, 9, 11, 24, 28], consumers react to these price incentives by rescheduling their appliances, thus potentially interfering with their habits. Among them, [4, 9, 24] additionally model the usage of energy storage systems. All of these studies aim at a reduction in the peak-to-average ratio (PAR) of the aggregated load, since achieving this eventually translates into financial benefits for the participants. The methods of choice to obtain the desired schedules are almost always based on game-theoretic concepts. Only [9] deviates by using convex optimisation. Since the DSM scheme directly influences the routines of the users, their comfort levels play an important role. For instance, Yaagoubi et al. [28] found that when acceptable comfort levels are preserved, the savings on the energy bill are reduced by more than half compared to the optimum. Note that all these studies share the idea of scheduling the usage of appliances and batteries in a day-ahead manner.
Day-ahead scheduling that does not interfere with the users can solely be realised through energy storage systems. Nguyen et al. [12] and Pilz and Al-Fagih [17] followed this approach and showed that considerable gains are achievable without interrupting the habits of the consumers. Nguyen et al. [12] put their focus on developing a distributed algorithm, while Pilz and Al-Fagih [17] implemented an advanced battery model, providing insight into how specific battery characteristics influence the participation behaviour and thus the outcome of the game.
This work builds on these previous results and extends the approach of Pilz and Al-Fagih [17] in two directions. Firstly, we introduce a more sophisticated underlying game structure for the DSM scheme, namely a discrete time dynamic game. Within this formulation, the action space is continuous, in contrast to the discrete options available in [17]. As a consequence, the outcome for the players improves as they can make more fine-grained decisions. Another advantage is that this allows for the derivation of a best response strategy and thus does not require a computationally expensive search for the best response. Secondly, we analyse the influence of the forecasting error for demand and energy generation on the scheduling outcome. In order to assure the stability of the grid, a real-world application requires the mechanism to be resilient against possible errors in the predictions, as they will undoubtedly occur.
Our contributions are as follows:
We introduce a novel discrete time dynamic game for energy storage scheduling among prosumers in the smart grid. The closed-form solution to the best response problem is derived by means of a dynamic programming approach. The ensuing iterative algorithm converges quickly towards the Nash equilibrium. Direct comparison with similar approaches, i.e. [12, 17, 28], reveals the superiority in terms of both achieved PAR reduction and computational costs.
A complete day-ahead DSM scheme, consisting of prosumers with realistically modelled batteries, local renewable energy sources, and forecasting errors for demand and generation is simulated. In contrast to previous works which merely simulate individual days, our scheduling period covers a full year. The length of the simulation allows for an in-depth analysis of the influence of the forecasting errors as well as the impact of the number of participants in the DSM scheme.
We show that the proposed dynamic game approach is robust with respect to the forecasting errors, even in the worst-case scenario. The respective results exhibit only small deviations in the PAR reduction outcomes compared to runs with accurate predictions, and hardly any influence on the financial benefits for the DSM participants.
For the first time, a comparison of how different compositions of neighbourhoods perform in the DSM scheme is presented. We find that a community consisting of a mix of consumer types can achieve best results. This is furthermore supported by an extensive analysis of scenarios with randomly generated battery and generation parameters, indicating the overall robustness of our game-theoretic approach.
This paper is organised as follows. In Sect. 2, we give an overview of the system, provide details of the DSM protocol, introduce the battery and the renewable energy model, and explain the pricing tariff. Section 3 contains detailed information about the dynamic game. Furthermore, it includes the derivation of the best response solution and the description of the iterative algorithm. The simulation parameters and the data sets for demand and generation data are presented in the beginning of Sect. 4. Then, we compare our approach with the static game approach of Pilz and Al-Fagih [17], show the influence of the forecasting errors, investigate the neighbourhood composition, and analyse the robustness by simulating a large number of randomly generated scenarios. Section 5 concludes the paper and points out future research directions.

2 System Model—A Smart Grid Neighbourhood

In this section, we lay the basis for the formulation of the battery scheduling game presented in Sect. 3. We introduce the concept of a smart grid neighbourhood that participates in a demand-side management (DSM) programme to reduce its electricity bills. Each of the participants is equipped with an individually owned lithium-ion battery in addition to a photovoltaic (PV) cell which generates electricity. Models for both the battery and the PV cell are stated in detail. Moreover, we clarify the specific smart meter infrastructure that is necessary to implement the DSM programme, as well as the role of the single utility company (UC) running this programme.

2.1 Neighbourhood and Demand-Side Management Programme

Consider a residential neighbourhood comprised of M houses. Each of these is equipped with a smart meter. Smart meters are capable of measuring electricity consumption accurately and at a higher frequency than the usual monthly or quarterly readings. Furthermore, these devices can communicate directly with the utility company. This eventually allows for the implementation of the DSM programme and also eliminates the need for on-site readings. For our proposed model, we assume that we are able to obtain readings at regular intervals. Based on the reading frequency, we split each day into T discrete intervals and denote the set of all intervals by \({\mathcal {T}}\).
We assume that the M houses are served by the same UC. In order to incentivise consumers to participate in the DSM scheme, the UC offers them a specific pricing scheme, which eventually reduces their electricity bills. Details can be found in Sect. 2.3. Let us denote the set of households who participate in the DSM programme by \({\mathcal {N}}\subset {\mathcal {M}}\), where \({\mathcal {M}}\) is the set of all households in the neighbourhood. The total number of participants is \(N=|{\mathcal {N}}|\). Besides the different pricing scheme, the participants of the DSM possess their own battery storage system and have solar panels installed. An overview of the neighbourhood is shown in Fig. 1.
The DSM scheme can be seen as a protocol that is executed repeatedly. In our study, the protocol is run once per day. Note that this is a completely automated process run by our scheduling software selma (short for: Scheduler for Electricity in Local MArkets), which needs to be installed on a consumer access device given to each participant of the scheme. The algorithm to obtain the schedules is based on a discrete time dynamic game, which will be introduced in Sect. 3.1.
Before the start of each scheduling period, selma forecasts the demand of the respective household for each interval \(t\in {\mathcal {T}}\) of the upcoming day. This information is sent to the UC. The smart meters of non-participants are not able to forecast their own demand. Thus, the UC performs the forecasting step for these households, based on historically collected data. Eventually, forecasted demand curves are aggregated and the information is sent to each DSM participant. Note that no information about individual neighbours is shared, but only aggregated information. This provides anonymity to all consumers.
Based on this input, the households play a dynamic non-cooperative game (cf. Sect. 3). The outcome of the game is a set of schedules, one for each household, which specify how they can make best use of their battery system. The households will follow these schedules throughout the day, even if their actual demand differs from the forecasted one. In Sect. 4, we investigate the influence of the forecasting error and show the robustness of the approach even in the worst-case scenario. At the end of the scheduling period, the electricity costs for each consumer are calculated based on the agreed pricing terms and the protocol starts over again.

2.2 Individual Households

Households that participate in the DSM scheme are equipped with a lithium-ion battery and PV cells. In this subsection, we introduce the battery model and clarify how the battery can be used. Moreover, details on the PV system are provided. Finally, we clarify the terminology of demand, net demand, and load of a household based on the usage of their battery and PV cells.

2.2.1 Battery Model and Decision Variables

In this paper, we employ the same battery model as used in [17]. This includes charging, discharging, and self-discharging characteristics of a lithium-ion battery. In fact, the same model may also be applied for lead–acid battery systems (but not nickel-based batteries due to their different charging behaviour). As all our simulations are based on a real-world lithium-ion battery system, in the following we will only refer to them as such.
Charging. Lithium-ion batteries are charged in a two-stage process [22]. In the first stage, the state of charge (SOC) increases linearly. This stage is called the ‘constant current’ (CC) stage, with a charging rate limited by \(\rho ^+>0\). In the second stage, i.e. the ‘constant voltage’ (CV) stage, the effective charging rate levels off exponentially towards the point where the SOC reaches the nominal maximum capacity \(s_{\max }\) of the battery. The point of transition from the first stage to the second is indicated by a SOC \(s^*\) and an associated time \(t^*\), which need to be specified for the respective battery. During both stages, we additionally consider losses due to the specific charging efficiency \(\eta ^+\) with \(0\le \eta ^+\le 1\). Furthermore, certain losses occur from the hybrid inverter (cf. Fig. 1), modelled by \(\eta _{\text {inv}}\) with \(0\le \eta _{\text {inv}}\le 1\). The hybrid inverter transforms the direct current from either the battery or PV into alternating current at usable voltage and frequency for the household appliances. It also works in the reverse direction to charge the battery.
To obtain an insight into how the households can make use of their battery system, let us look at a specific example (cf. Fig. 2a). Given a certain value for the SOC, e.g. \(s'\), we can associate a time \(t'\) and thus specify a point on the charging curve.
Within the next interval of length \({\varDelta } t\), the decision variable \(a^+\) of how much to charge the battery will lie in \({\mathcal {H}}^+\left( s'\right) =\left\{ a^+| h^+\left( s',a^+\right) \le 0 \right\} \), with
In other words, \(a^+\) is limited by \(0<a^+\le \phi ^+\left( s'\right) <s_{\max }-s'\). We use the notation above to comply with the one shown in [13]. The upper limit \(\phi ^+\left( s'\right) \) follows from the charging curve described above,
$$\begin{aligned} \phi ^+\left( s'\right) = {\left\{ \begin{array}{ll} \rho ^+{\Delta } t &{}\quad \text {if\;CC\;charged} \\ s_{\max }\gamma _1\exp \left[ -\frac{{\Delta } t}{\gamma _2}\right] &{}\quad \text {if\;CV\;charged} \end{array}\right. }\ , \end{aligned}$$
where \(\gamma _1, \gamma _2\) are defined such that the charging curve is smooth at the transition point \((t^*,s^*)\). The discrepancy between the grey-shaded area and the charging curve in Fig. 2(a) results from an imperfect charging efficiency. In fact, based on the decision variable \(a^+\) the SOC of the battery changes according to the charging transition equation
$$\begin{aligned} s\left( t'+{\Delta } t\right) = s\left( t'\right) + \eta _{\text {inv}}\;\eta ^+ a^+\ . \end{aligned}$$
Discharging and Self-Discharging. We model the discharging behaviour of lithium-ion batteries by a linear decrease in the SOC. Here, the slope is given by the discharging rate \(\rho ^-<0\). In order to account for the usual sharp drop-off of the discharging rate at low capacities, discharging is prohibited below a minimum SOC \(s_{\min }\). Again, we also consider losses due to the specific discharging efficiency \(\eta ^-\) with \(0\le \eta ^-\le 1\) and the hybrid inverter.
In Fig. 2b, a specific example is given to clarify how a user can discharge their battery. Within the respective interval, the decision variable \(a^-\) of how much to discharge the battery will lie in \({\mathcal {H}}^-\left( s'\right) =\left\{ a^-| h^-\left( s',a^-\right) \le 0 \right\} \), with
In other words, \(a^-\) is limited by \(s_{\min }-s'<\phi ^-\left( s'\right) \le a^- < 0\) and
$$\begin{aligned} \phi ^-\left( s'\right) = \rho ^-{\Delta } t\;\eta _{\text {inv}}\;\eta ^-. \end{aligned}$$
The dependency on \(s'\) in (5) is implicitly given by the fact that we cannot go lower than \(s_{\min }\). Note that \(\phi ^-\) also depends on the efficiency parameter, such that the actual amount taken from the battery in correspondence with the decision variable \(a^-\) (grey-shaded area in Fig. 2b) is given by the discharging transition equation
$$\begin{aligned} s\left( t'+{\Delta } t\right) = s\left( t'\right) + \frac{a^-}{\eta _{\text {inv}}\;\eta ^-}. \end{aligned}$$
In the following subsection, we will see that \(\phi ^-\) is additionally limited by the demand of the specific household, i.e. one can only discharge as much as is needed to run all appliances.
Whenever the battery is neither charging nor discharging, it will be subject to self-discharging. We model this type of behaviour with an exponential decline. This case corresponds to the decision variable \(a = 0\). The respective self-discharging transition equation is given by
$$\begin{aligned} s\left( t'+{\Delta } t\right) = s\left( t'\right) \cdot \left( 1 + {\bar{\rho }}\right) ^{{\Delta } t} \end{aligned}$$
where \({\bar{\rho }}<0\) is the self-discharging rate.
For later usage (cf. Sect. 3.1), we summarise the transition equations for charging, discharging, and self-discharging into a single transition equation f, i.e.
$$\begin{aligned} s(t+{\Delta } t) = f\left( s(t),a\right) = {\left\{ \begin{array}{ll} s(t) + \eta _{\text {inv}}\;\eta ^+ a, &{}\quad a>0 \\ s(t) + {a}/{(\eta _{\text {inv}}\;\eta ^-)}, &{}\quad a<0 \\ s(t)\cdot \left( 1 + {\bar{\rho }}\right) ^{{\varDelta } t}, &{}\quad a=0 \end{array}\right. }. \end{aligned}$$
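The combined transition equation (8) can be sketched in a few lines of Python. This is only an illustration of the three cases; the function name and the numerical values in the example calls are assumptions made here and are not taken from the paper.

```python
def transition(soc, a, dt, eta_inv, eta_plus, eta_minus, rho_bar):
    """Return the SOC after one interval of length dt, following the
    three cases of the combined transition equation (8)."""
    if a > 0:
        # Charging: inverter and charging losses reduce what is stored.
        return soc + eta_inv * eta_plus * a
    if a < 0:
        # Discharging: losses mean more energy leaves the battery
        # than the amount |a| that is delivered.
        return soc + a / (eta_inv * eta_minus)
    # Idle: exponential self-discharge with rate rho_bar < 0.
    return soc * (1.0 + rho_bar) ** dt


# Illustrative values only (hypothetical battery parameters).
params = dict(dt=1.0, eta_inv=0.9, eta_plus=0.9, eta_minus=0.9, rho_bar=-0.001)
after_charging = transition(5.0, 1.0, **params)
after_discharging = transition(5.0, -0.81, **params)
after_idle = transition(5.0, 0.0, **params)
```

Note that the sketch does not enforce the admissible range of the decision variable; it assumes that the caller passes a value of a that already respects the battery restrictions.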
Furthermore, we combine the restrictions of the decision variable due to the battery restrictions for charging and discharging, i.e.

2.2.2 PV Model

We model the solar panel as an additional source of electricity besides the grid connection. The output of the nth household’s PV system during interval t is denoted by \(w^t_n\). It can serve two purposes: (1) direct usage by household appliances and (2) charging the battery. Whereas direct usage is influenced by the efficiency of the hybrid inverter, charging the battery does not require any inversion and thus only depends on the charging efficiency of the battery.
An important parameter of the PV installation is the nominal kilowatt peak \(kW\!p\) of the system. It is a measure of the size of the system and denotes the maximum output that can be expected under standardised conditions. A PV system which operates at its maximum capacity, e.g. \(kW\!p =3\;\hbox {kW}\), for one hour will produce \(3\;\hbox {kWh}\). Note that identifying the optimal size of the PV installation does not fall within the scope of this article. An approximated scale is obtained from Zhang and Grijalva [29] and Olaszi and Ladanyi [15] (cf. Sect. 4.1).

2.2.3 Demand, Net Demand, and Load

We define the demand \({\bar{d}}^t_m \ge 0\) of a household \(m\in {\mathcal {M}}\) as the amount of electricity that is needed to run all its appliances during the time interval \(t\in {\mathcal {T}}\). Thus, the total daily demand schedule can be written as \({\bar{d}}_m=\left( {\bar{d}}^0_m,\ldots ,{\bar{d}}^{T-1}_m\right) \). Throughout the paper, we assume that the demand cannot be shifted. Thus our approach is fully non-intrusive and does not influence the behaviour of the user.
Combining the demand \({\bar{d}}^t_n\) of a household \(n\in {\mathcal {N}}\) with the generated electricity \(w_n^t\) from the solar panel gives the net demand
$$\begin{aligned} d^t_n = {\bar{d}}^t_n-\eta _{\text {inv}}\;w^t_n\ , \end{aligned}$$
where \(\eta _{\text {inv}}\) is the efficiency of the inverter (cf. Fig. 1). Theoretically, this value can be smaller than zero, i.e. when the effective generation is larger than the demand in the specific interval. Practically, we ensure \(d^t_n \ge 0\) by storing all excess energy directly in the battery. For households \(m\not \in {\mathcal {N}}\), that do not participate in the DSM scheme, the net demand is identical to the demand.
Let \(l_m^t\) denote the load, i.e. the amount of energy drawn from the grid by household \(m\in {\mathcal {M}}\) during interval \(t\in {\mathcal {T}}\). For households which do not participate in the DSM scheme, the load equals their demand. For the others, the load depends on the decision \(a_n^t\) taken at the specific interval. In other words, it combines the net energy demand with the amount of energy that is charged or discharged by the battery
$$\begin{aligned} l_n^t = d^t_n + a^{t}_n\ , \end{aligned}$$
where \(\max \left\{ -d^t_n, \phi ^-\right\} \le a^{t}_n \le \phi ^+\). The lower boundary expresses the fact that one cannot discharge more than is actually needed to fulfil the net demand, while at the same time all battery restrictions remain valid. Due to this condition and (10), we ensure that \(l_m^t\ge 0\) for all \(m\in {\mathcal {M}}\) and all intervals \(t\in {\mathcal {T}}\). We write \(l_m=\left( l^0_m,\ldots ,l^{T-1}_m\right) \) for the schedule of loads of a specific household. Furthermore, we can calculate the total load on the grid for interval t by
$$\begin{aligned} L^t = \sum _{m\in {\mathcal {M}}} l_m^t\ . \end{aligned}$$
Similarly, we define the average aggregated load of all households other than n during time interval t by
$$\begin{aligned} L^t_{-n} = \frac{1}{M-1}\sum _{m\in {\mathcal {M}}\setminus n} l_m^t\ . \end{aligned}$$
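The quantities of this subsection can be computed for a single interval as in the following minimal sketch. All function names and the example numbers are hypothetical; the bounds phi_minus and phi_plus are assumed to be supplied by the battery model of Sect. 2.2.1.

```python
def net_demand(demand, pv_output, eta_inv):
    """Net demand (10): demand minus the PV output after inverter losses."""
    return demand - eta_inv * pv_output


def clamp_action(action, net, phi_minus, phi_plus):
    """Restrict a battery decision to max{-net, phi_minus} <= a <= phi_plus."""
    lower = max(-net, phi_minus)
    return min(max(action, lower), phi_plus)


def load(net, action):
    """Load (11): net demand plus the energy charged (a > 0) or discharged (a < 0)."""
    return net + action


def total_load(loads):
    """Total load (12) on the grid for one interval."""
    return sum(loads)


def average_load_of_others(loads, n):
    """Average load (13) of all households except the one with index n."""
    return (sum(loads) - loads[n]) / (len(loads) - 1)


# Illustrative interval with one participant and two further households.
d = net_demand(2.0, 1.0, eta_inv=0.9)
a = clamp_action(-3.0, d, phi_minus=-2.0, phi_plus=1.5)
all_loads = [load(d, a), 1.3, 1.5]
```

In the example, the requested discharge exceeds the net demand, so the decision is clamped to the negative net demand and the resulting load of the participant is zero rather than negative.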

2.2.4 Forecasting Errors

The DSM protocol states that households send a forecast of their net demand to the UC. This depends on the demand as well as the electricity generated by the solar panel. Both variables will introduce errors that need to be accounted for. In this paper, we consider the worst-case scenario. [3] gives a comprehensive overview of the current techniques for short-term demand forecasting. They specifically investigate how combining forecasts obtained from an integrated auto-regressive moving average, an artificial neural network, and a similar day approach can improve the short-term load forecast. From [3], we obtain an upper limit for the forecasting error \(\epsilon _d\), expressed as a percentage of the actual demand. Similarly, [5] gives an insight into 24-hour PV power output prediction. The forecasting error \(\epsilon _w\) is also given as a percentage of the actual generation.
The worst-case scenario arises when these two errors carry opposing signs and are correlated between all the participants. This becomes clear from (10), since the two contributions to the net demand enter with different signs. Intuitively, it makes sense that in the worst case the forecasted net demand is smaller than the actual one: a forecasted net demand that is too small disguises the incentive to make use of the battery system. By the same argument, the worst-case solar forecast is higher than the actual generation, since it might imply a sufficient SOC of the battery when in reality more charging would have been necessary.
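One way to picture the worst-case scenario is the sketch below. It assumes that both errors are applied as relative deviations at their upper limits, with the same sign for every interval and household; this reading, the function name, and the numbers are assumptions made for illustration and are not taken from the paper.

```python
def worst_case_net_forecast(actual_demand, actual_pv, eps_d, eps_w, eta_inv):
    """Forecasted net demand per interval when the demand is under-forecast
    by eps_d and the PV generation is over-forecast by eps_w."""
    forecast = []
    for d, w in zip(actual_demand, actual_pv):
        d_hat = d * (1.0 - eps_d)   # demand forecast too low
        w_hat = w * (1.0 + eps_w)   # generation forecast too high
        forecast.append(d_hat - eta_inv * w_hat)  # same form as (10)
    return forecast


# Hypothetical two-interval example with 10 % demand and 20 % PV error.
forecast = worst_case_net_forecast([2.0, 1.5], [1.0, 0.0],
                                   eps_d=0.1, eps_w=0.2, eta_inv=0.9)
```

Both deviations push the forecasted net demand below the actual net demand, which is why they reinforce rather than cancel each other. The sketch does not model the storage of excess generation that keeps the net demand non-negative.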

2.3 The Utility Company

Throughout the paper, we assume a single utility company (UC) serves all the consumers in the neighbourhood. The UC runs a DSM scheme in order to reshape the load profile. To be more precise, they want to achieve a flatter profile such that investments into fast ramping technology, which is needed to deliver peak demand, can be reduced. The incentive for the users to limit consumption during peak hours is given by a dynamic pricing tariff: the cost per energy unit is calculated separately for each interval and depends on the aggregated load of all users in the neighbourhood. Following [11, 12, 17, 28], we employ a quadratic cost function \(g^t\):
$$\begin{aligned} g^t(y) = c_2\cdot y^2 + c_1\cdot y + c_0,\quad t\in {\mathcal {T}}, \end{aligned}$$
where y is the aggregated load at time t given by \(L^t\) and the coefficients satisfy \(c_2>0\), \(c_1\ge 0\), and \(c_0\ge 0\). Similar to [11, 17, 24], we employ a proportional billing scheme, where each participant of the DSM scheme pays for their share of the consumption, i.e. the electricity bill \(B_n\) is given by
$$\begin{aligned} B_n = -{\varOmega }_n \sum _{t\in {\mathcal {T}}} g^t\quad \forall n\in {\mathcal {N}}, \end{aligned}$$
$$\begin{aligned} {\varOmega }_n = \frac{\sum _{t}l_n^t}{\sum _{t}\sum _{k}l_k^t}. \end{aligned}$$
For households that do not participate in the DSM scheme, a standard fixed-price tariff is employed, i.e.
$$\begin{aligned} B_m = p \sum _{t}l^t_m\quad \forall m\in {\mathcal {M}}\setminus {\mathcal {N}}. \end{aligned}$$
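The pricing rules can be illustrated with the following sketch. It reports the magnitude of a participant's proportional cost share, whereas (15) as printed carries a leading minus sign. The set over which the share in (16) is normalised is taken here to be whatever list of households is passed in, which is an assumption; all names and numbers are hypothetical.

```python
def interval_cost(total_load, c2, c1, c0):
    """Quadratic cost (14) for one interval, evaluated at the aggregated load."""
    return c2 * total_load ** 2 + c1 * total_load + c0


def consumption_share(own_loads, group_loads):
    """Share (16): own daily consumption over the daily consumption of the group."""
    return sum(own_loads) / sum(sum(l) for l in group_loads)


def proportional_cost(own_loads, group_loads, total_loads, c2, c1, c0):
    """Magnitude of a participant's share of the total daily cost."""
    daily_cost = sum(interval_cost(L, c2, c1, c0) for L in total_loads)
    return consumption_share(own_loads, group_loads) * daily_cost


def fixed_price_bill(loads, p):
    """Fixed-price bill (17) of a household outside the DSM scheme."""
    return p * sum(loads)


# Two households over two intervals (illustrative values).
group = [[1.0, 1.0], [1.0, 3.0]]
totals = [sum(col) for col in zip(*group)]   # aggregated load per interval
cost_first = proportional_cost(group[0], group, totals, c2=1.0, c1=0.0, c0=0.0)
```

Because the cost is quadratic in the aggregated load, the second interval in the example contributes four times as much as the first, which is the incentive to shift load away from peaks.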

3 Dynamic Battery Scheduling Game

In this section, we formulate the non-cooperative dynamic game between the households that possess individual energy storage and photovoltaic (PV) installations. To do so, we introduce the relevant notation and relate it to their respective ‘real-world’ meaning according to our system (cf. Sect. 2). Furthermore, the notion of a Nash equilibrium (NE) is defined and an important result concerning the link between the NE for the whole game and the NE for a subgame is provided. Subsequently, a dynamic programming algorithm is presented from which we derive a closed-form expression of the best response, i.e. the best decision a player can make in response to fixed decisions of other players. Eventually, we use this result to construct an iterative algorithm that computes a NE of the game.

3.1 Definitions and Game Formulation

Formally, the game belongs to the category of discrete time dynamic games (cf. [13]), where players make their decisions sequentially in stages. These stages directly correspond to the daily intervals introduced in Sect. 2.1. For each stage, we define a state of the game, i.e. the current state-of-charge (SOC) of all batteries, representing the configuration of the overall system. Furthermore, we define a transition equation that models the evolution of this state based on the decisions of the players. In other words, the players will choose actions that are directly related to their battery usage, which in turn depends on the state of the game. We consider a game with open-loop information structure, which means that the initial state of the game is known by all players. In this game, players want to minimise their energy bill, i.e. maximise their utility function, which depends not only on their own decisions but also on the decisions of all other players. In a nutshell, we have:
Definition 1
Our discrete time dynamic game with open-loop information structure consists of the following components:
(1) A set of players, i.e. participating households (cf. Sect. 2.1), \({\mathcal {N}} = \{1,2,\ldots ,n,\ldots ,N\}\), where N denotes the number of players.
(2) A set of stages, i.e. intervals (cf. Sect. 2.1), \({\mathcal {T}} = \{0,1,\ldots ,t,\ldots ,T-1\}\), where T denotes the number of stages and thus the number of decisions a player can make in the game.
(3) Scalar state variables \(s_n^t\in {\mathcal {S}}_n\subset \mathbb {R}\) denoting the SOC of the nth player’s battery at stage \(t\in {\mathcal {T}}\cup \{T\}\). Collectively, we denote the state variables of all players at stage t by \(s^t:=\left( s_1^t,s_2^t,\ldots ,s_N^t\right) \in {\mathcal {S}}:={\mathcal {S}}_1\times {\mathcal {S}}_2\times \cdots \times {\mathcal {S}}_N\subset \mathbb {R}^N\). In the open-loop information structure, it is assumed that the initial state \(s^0\) is known to all players \(n\in {\mathcal {N}}\).
(4) Scalar decision variables \(a_n^t\in {\mathcal {H}}_n^t\left( s_n^t\right) \subset {\mathcal {A}}_n\subset \mathbb {R}\) (for the definition of \({\mathcal {H}}_n^t\) see item (5)) denoting the usage of the battery of the nth player at time \(t\in {\mathcal {T}}\). Collectively, we denote the decision variables of all players at stage t by \(a^t:=\left( a_1^t,a_2^t,\dots ,a_N^t\right) \in {\mathcal {A}}:={\mathcal {A}}_1\times {\mathcal {A}}_2\times \cdots \times {\mathcal {A}}_N\subset \mathbb {R}^N.\) Furthermore, we define the schedule of battery usage of an individual player \(n\in {\mathcal {N}}\) as the collection of all its decisions in the stages of the game, i.e. \(a_n:=\left( a_n^0, a_n^1,\dots ,a_n^{T-1}\right) \). A strategy profile is denoted by \(a:=\left( a_1,a_2,\dots ,a_N\right) \).
(5) A set of admissible decisions \({\mathcal {H}}_n\left( s_n^0\right) := \left\{ a_n~|~h_n^t\left( s_n^t,a_n^t\right) \le 0,\ t\in {\mathcal {T}}\right\} \subset \mathbb {R}^T\) for the nth player. The function \(h_n^t\left( s_n^t,a_n^t\right) \) has been defined in (9) in Sect. 2.2.1, capturing the restrictions posed on the battery. We denote \({\mathcal {H}}_n^t\left( s_n^t\right) := \left\{ a_n^t~|~h_n^t\left( s_n^t,a_n^t\right) \le 0\right\} \subset \mathbb {R}\).
(6) A state transition equation
$$\begin{aligned} s_n^{t+1} = f_n^t\left( s_n^t, a_n^t\right) ,\quad t\in {\mathcal {T}},\quad n\in {\mathcal {N}}, \end{aligned}$$
governing the state variables \(\left\{ s^t \right\} _{t=0}^T\). The function \(f_n^t\left( s_n^t, a_n^t\right) \) is the discretised version of the transition equation (8) defined in Sect. 2.2.1, showing how a decision of the player influences the state of its battery for the upcoming stage.
(7) A stage-additive utility function
$$\begin{aligned} U_n\left( s_n^0, \left( a_n, a_{-n}\right) \right) =-g_n^T\left( s_n^T\right) - \sum _{t=0}^{T-1}g_n^t\left( s_n^t, \left( a_n^t, a_{-n}^t\right) \right) \end{aligned}$$
for the nth player, where \(a_{-n}:=\left( a_1,a_2,\dots ,a_{n-1},a_{n+1},\dots ,a_N\right) \) denotes the decisions of all other players. The function \(g_n^t\left( s_n^t, \left( a_n^t, a_{-n}^t\right) \right) \) has been defined in (14) in Sect. 2.3 and captures the costs to the nth player at the tth stage. Note that the utility function depends only on the initial state variable \(s_n^0\), since the subsequent states \(s_n^t\) are determined by (18). The function
$$\begin{aligned} g_n^T\left( s_n^T\right) = s_n^T \end{aligned}$$
expresses a penalty for the nth player that is incurred by ending up in state \(s_n^T\), i.e. its SOC, at the end of the scheduling period.
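We represent the decision problem of the nth player as the following optimisation problem:
$$\begin{aligned} G_n\left( a_{-n}\right) :\quad&\max _{a_n}\ U_n\left( s_n^0, \left( a_n, a_{-n}\right) \right) \\ \text {s.t.}\quad&s_n^{t+1} = f_n^t\left( s_n^t, a_n^t\right) ,\quad t\in {\mathcal {T}},\\&a_n\in {\mathcal {H}}_n\left( s_n^0\right) . \end{aligned}$$
Moreover, the game is referred to as \(\left\{ G_1,G_2,\ldots ,G_N\right\} \).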
Definition 2
A strategy profile \({\hat{a}}=\left( {\hat{a}}_1,\dots ,{\hat{a}}_N\right) \) is a Nash equilibrium for the game \(\left\{ G_1,\ldots ,G_N\right\} \) if and only if for all players \(n\in {\mathcal {N}}\) we have
$$\begin{aligned} U_n\left( s_n^0,\left( {\hat{a}}_n,{\hat{a}}_{-n}\right) \right) \ge U_n\left( s_n^0,\left( a_n,{\hat{a}}_{-n}\right) \right) ,\quad \forall a_n\in {\mathcal {H}}_n\left( s_n^0\right) . \end{aligned}$$
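The equilibrium condition can be probed numerically by testing unilateral deviations. The sketch below checks only a finite list of candidate strategies per player, so it can refute a profile but cannot prove that it is a Nash equilibrium over the continuous set of admissible decisions. The utility function used in the example is a hypothetical two-player toy and not the utility of the battery game.

```python
def is_nash_on_candidates(profile, utility, candidates):
    """Return False if some player can strictly improve its utility by
    switching to one of its candidate strategies while the others stay fixed."""
    for n in range(len(profile)):
        current = utility(n, profile)
        for alternative in candidates[n]:
            deviated = list(profile)
            deviated[n] = alternative
            if utility(n, deviated) > current:
                return False
    return True


def toy_utility(n, profile):
    """Toy two-player utility: player n wants to match half of the other's choice."""
    other = profile[1 - n]
    return -(profile[n] - 0.5 * other) ** 2


grid = [-1.0, -0.5, 0.0, 0.5, 1.0]
candidates = [grid, grid]
equilibrium_ok = is_nash_on_candidates([0.0, 0.0], toy_utility, candidates)
deviation_found = not is_nash_on_candidates([1.0, 1.0], toy_utility, candidates)
```

In the toy game, the profile (0, 0) survives all tested deviations, while at (1, 1) either player gains by moving to 0.5.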

3.2 Analysis of the Game

In order to analyse the game \(\left\{ G_1,\ldots ,G_N\right\} \), we follow the dynamic programming (DP) idea by Nie et al. [13]. To do so, we introduce notation for subproblems of (21). Furthermore, we show an important result about Nash equilibria for these subproblems, which constitutes the basis for the DP algorithm. Applying the general algorithm eventually leads us to an analytic formulation of the nth player’s best response \({\hat{a}}_n\), given the strategies \(a_{-n}\) of other players at stage t of a T-stage game.

3.2.1 Subgame Formulation

For subproblems that concern only the decisions taken from stage \(t'\) onwards, we write:
$$\begin{aligned}&s_n^{t',T-1}:=\left( s_n^{t'},\dots ,s_n^{T-1}\right) ,\quad s^{t',T-1}:=\left( s^{t'},\dots ,s^{T-1}\right) \\&a_n^{t',T-1}:=\left( a_n^{t'},\dots ,a_n^{T-1}\right) ,\quad a^{t',T-1}:=\left( a^{t'},\dots ,a^{T-1}\right) \\&U_n^{T-t'}\left( s_n^{t'}, \left( a_n^{t',T-1}, a_{-n}^{t',T-1}\right) \right) =-g_n^T\left( s_n^T\right) - \sum _{\tau =t'}^{T-1}g_n^\tau \left( s_n^\tau , \left( a_n^\tau , a_{-n}^\tau \right) \right) \\&{\mathcal {H}}_n^{t',T-1}\left( s_n^{t'}\right) := \left\{ a_n^{t',T-1}~|~h_n^\tau \left( s_n^\tau ,a_n^\tau \right) \le 0, \tau =t',t'+1,\dots ,T-1\right\} . \end{aligned}$$
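For \(t'\in {\mathcal {T}}\), we define a subproblem \(G_n^{T-t'}\) of the nth player as the optimisation problem of maximising \(U_n^{T-t'}\) over \(a_n^{t',T-1}\in {\mathcal {H}}_n^{t',T-1}\left( s_n^{t'}\right) \), given the decisions \(a_{-n}^{t',T-1}\) of the other players. The corresponding subgame is referred to as \(\left\{ G_1^{T-t'},G_2^{T-t'},\ldots ,G_N^{T-t'}\right\} \).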
Theorem 1
Let \({\hat{a}}=\left( {\hat{a}}^{0},\dots ,{\hat{a}}^{T-1}\right) \) constitute a Nash equilibrium for the game \(\left\{ G_1, \dots , G_N\right\} \) with the corresponding trajectories of states \({\hat{s}}=\left( {\hat{s}}^{0},\dots ,{\hat{s}}^{T}\right) \). Consider the subgame \(\left\{ G_1^{T-t},\dots ,G_N^{T-t}\right\} \) for each \(t\in {\mathcal {T}}\). Then, the truncated strategy \({\hat{a}}^{t,T-1}=\left( {\hat{a}}^{t},{\hat{a}}^{t+1},\dots ,{\hat{a}}^{T-1}\right) \) comprises a Nash equilibrium for the subgame \(\left\{ G_1^{T-t},\dots ,G_N^{T-t}\right\} \).
The proof can be found in ‘Appendix A.1’. \(\square \)

3.2.2 The DP Algorithm and Derivation of the Best Response Solution

Based on the results of the previous subsection, we can formulate a DP algorithm (Algorithm 1) to find the solution to the decision problem \(G_n(a_{-n})\) (21), i.e. the optimal decision for the nth player given the decisions \(a_{-n}\) of the other players. Let us apply Algorithm 1 to obtain the solution of the decision problem \(G_n(a_{-n})\) (21) in closed form. Note that both for loops (lines 1 and 3) are treated implicitly by keeping \(s^T\) and \(s^t\) unspecified throughout the computations.
Given the total scheduling length T, the aggregated decisions \(a_{-n}\) of all other players, and the initial SOC \(s^0\) of the batteries, at the first step (\(t=T\)) we set \(V^0_n(s^T_n)=s^T_n\) according to (20). With this, we enter the while loop (line 2) which overwrites t to now represent \(t=T-1\). We solve for the best decision \({\hat{a}}^{T-1}_n\) by solving the following problem:
where we made use of the transition equation (18) to rewrite \(V^0_n\). The solution is computed as
$$\begin{aligned} {\hat{a}}^{T-1}_n = -s_n^{T-1}, \end{aligned}$$
and subsequently we have
$$\begin{aligned} V^1_n = -c_2\left( d_n^{T-1} -s_n^{T-1} + L_{-n}^{T-1}\right) ^2 - c_1\left( d_n^{T-1} -s_n^{T-1} + L_{-n}^{T-1}\right) - c_0. \end{aligned}$$
With this, the first step is done and we again overwrite t to now represent \(t=T-2\). In this stage, we solve the following problem:
$$\begin{aligned} \begin{aligned} {\hat{a}}^{T-2}_n&= \mathop {{\mathrm{argmax}}}\limits _{a_n^{T-2}}\ -g_n^{T-2}\left( s_n^{T-2}, \left( a_n^{T-2}, a_{-n}^{T-2}\right) \right) \\&\quad - c_2\left( d_n^{T-1} - \left[ s_n^{T-2}+a_n^{T-2}\right] + L_{-n}^{T-1}\right) ^2\\&\quad - c_1\left( d_n^{T-1} - \left[ s_n^{T-2}+a_n^{T-2}\right] + L_{-n}^{T-1}\right) - c_0. \end{aligned} \end{aligned}$$
The solution is computed as
$$\begin{aligned} {\hat{a}}^{T-2}_n = \frac{1}{2}\left( d_n^{T-1}-d_n^{T-2} - s_n^{T-2} + L_{-n}^{T-1} - L_{-n}^{T-2} \right) , \end{aligned}$$
from which we obtain
$$\begin{aligned} \begin{aligned} V^2_n&= -\frac{c_2}{2}\left( d_n^{T-1}+d_n^{T-2} - s_n^{T-2} + L_{-n}^{T-1} + L_{-n}^{T-2} \right) ^2 \\&\quad - c_1\left( d_n^{T-1}+d_n^{T-2} - s_n^{T-2} + L_{-n}^{T-1} + L_{-n}^{T-2} \right) - 2c_0, \end{aligned} \end{aligned}$$
finalising the second step. This procedure can be repeated for all subsequent steps. As the equations increase quickly in size, it is impractical to quote them here. Fortunately, our calculations revealed recurring patterns which all the solutions seem to follow. Eventually, the solution for an arbitrary stage t of the T-stage dynamic game can be written in the closed form (24). Note that during the derivation the nonlinear battery constraints are not strictly considered. Similar to the forecasting errors (cf. Sect. 3.3), these are taken into account in our simulation when the equilibrium schedules are actually executed. Furthermore, we want to highlight that the optimal decision (24) for the nth player at time t depends only on the current and future forecasted net demand data, the current SOC of the battery, and the average load (13) on the grid caused by all the other households. This is reminiscent of an alternative and elegant mean-field-type approach [6, 8] in which each player reacts directly to an aggregated signal from the group.
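The two backward steps above can be checked numerically. The following is a minimal sketch under stated assumptions, not the authors' expression (24): it assumes that the stage cost is quadratic in the load \(d_n^t + a_n^t + L_{-n}^t\) (a form inferred from \(V^1_n\), not quoted from (14)), that the battery is emptied at the end of the horizon as in the first step, and that all other battery constraints are ignored. The function names and the numbers are made up for illustration. Under these assumptions the best decision at any stage equalises the load over the remaining stages, which reproduces \({\hat{a}}^{T-1}_n=-s_n^{T-1}\) and the expression for \({\hat{a}}^{T-2}_n\).

```python
def stage_cost(load, c2=0.03125, c1=1.0, c0=0.0):
    """Assumed stage cost, quadratic in the load d + a + L_others."""
    return c2 * load ** 2 + c1 * load + c0


def best_response(t, soc, d, load_others):
    """Equal-load decision at stage t, assuming the battery is empty at stage T.

    d[tau] is the forecast net demand of the player and load_others[tau] the
    load caused by the other players; both lists have length T.
    """
    base = [d[tau] + load_others[tau] for tau in range(t, len(d))]
    target = (sum(base) - soc) / len(base)  # common load level for stages t..T-1
    return target - base[0]


def total_cost(actions, d, load_others):
    """Sum of the assumed stage costs along a schedule."""
    return sum(stage_cost(d[t] + a + load_others[t]) for t, a in enumerate(actions))


def grid_search_first_action(s0, d, load_others, lo=-5.0, hi=5.0, steps=1000):
    """Brute-force the first action of a two-stage problem.

    The second action is fixed to -(s0 + a0), i.e. the battery is emptied,
    mirroring the first backward step in the text.
    """
    best_a, best_cost = None, float("inf")
    for i in range(steps + 1):
        a0 = lo + (hi - lo) * i / steps
        cost = total_cost([a0, -(s0 + a0)], d, load_others)
        if cost < best_cost:
            best_a, best_cost = a0, cost
    return best_a


# Made-up numbers for a two-stage problem (stages T-2 and T-1).
d = [1.0, 2.0]
load_others = [10.0, 14.0]
s0 = 3.0

# Expression for the decision at stage T-2 as given in the text.
closed_form = 0.5 * (d[1] - d[0] - s0 + load_others[1] - load_others[0])
print(closed_form,
      best_response(0, s0, d, load_others),
      grid_search_first_action(s0, d, load_others))
```

For these numbers the expression from the text, the equal-load rule, and the brute-force search all return a charge of 1.0 at stage \(T-2\), after which the battery is emptied at stage \(T-1\).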

3.3 The Algorithm and Execution of NE Schedules

Similar to [17], we make use of a best response algorithm (cf. Algorithm 2) to find the solution to the game. Whereas in [17] an extensive search for optimal schedules \({\hat{a}}_n\) was performed, here we can compute the best response for each stage (line 3) analytically by means of (24) and concatenate the results to obtain the optimal schedule \({\hat{a}}_n\) in response to \(a_{-n}\). Performing this computation for each player \(n\in {\mathcal {N}}\) (line 2) results in a new strategy profile a. We iterate this (line 1) as long as ‘there exists a player n for whom \(a_n\) is not a best response to \(a_{-n}\)’. In the actual implementation, this check is done by comparing the current strategy profile with the one obtained from the previous iteration. If the profile has not changed, up to machine precision, an equilibrium is reached and \({\hat{a}}=\left( {\hat{a}}_n,{\hat{a}}_{-n}\right) \) constitutes the Nash equilibrium. This iterative approach is a type of cobweb method [2], for which convergence is not theoretically guaranteed in every scenario. An analysis of the convergence behaviour is performed in Sect. 4.2.
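The loop described above can be sketched as follows. This is an illustration only and not the authors' Algorithm 2 or their software: it assumes that players are updated one after another using the most recent profile, that the best response is the unconstrained equal-load rule that follows from a cost quadratic in the total load, and that \(L_{-n}^t\) is the summed load of the other players. All names and the tolerance are hypothetical.

```python
import math


def best_response_schedule(s0, d_n, load_others):
    """Assumed unconstrained best response: flatten own demand plus others' load."""
    T = len(d_n)
    schedule, soc = [], s0
    for t in range(T):
        base = [d_n[tau] + load_others[tau] for tau in range(t, T)]
        target = (sum(base) - soc) / (T - t)  # common load level from t onwards
        a_t = target - base[0]
        schedule.append(a_t)
        soc += a_t  # transition: next SOC = current SOC + decision
    return schedule


def solve_game(s0, demand, tol=1e-12, max_sweeps=100):
    """Iterate best responses until the profile stops changing (L2 norm)."""
    N, T = len(demand), len(demand[0])
    actions = [[0.0] * T for _ in range(N)]
    for sweep in range(1, max_sweeps + 1):
        previous = [row[:] for row in actions]
        for n in range(N):
            load_others = [
                sum(demand[m][t] + actions[m][t] for m in range(N) if m != n)
                for t in range(T)
            ]
            actions[n] = best_response_schedule(s0[n], demand[n], load_others)
        diff = math.sqrt(sum((a - b) ** 2
                             for row, old in zip(actions, previous)
                             for a, b in zip(row, old)))
        if diff <= tol:
            return actions, sweep
    raise RuntimeError("best-response iteration did not converge")


# Made-up two-player, three-stage example.
demand = [[1.0, 3.0, 2.0], [2.0, 1.0, 4.0]]
actions, sweeps = solve_game([1.0, 0.5], demand)
loads = [sum(demand[n][t] + actions[n][t] for n in range(2)) for t in range(3)]
print(sweeps, loads)
```

In this simplified setting the aggregated load becomes flat after very few sweeps. The model in the paper is richer (battery constraints, PV generation, billing), so the iteration counts reported in Sect. 4.2 are not expected to match this toy example.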
Based on the definition of an NE, no household can benefit from unilaterally deviating from its respective schedule. Nonetheless, we have to keep in mind that the schedule is based on forecasted demand and renewable generation. Whenever either the demand or the generation does not match the forecasted value, it might no longer be possible to follow this NE schedule strictly. In the analysis in the subsequent sections, we assume that every individual always seeks to be as close as possible to their determined NE schedule. To illustrate the idea: imagine that the NE schedule of household n requires it to discharge an amount x in a certain interval. Due to a forecasting error for the renewable generation, the battery has not been charged fully and the amount x can thus not be delivered. In this case, the household discharges as much as possible during this interval. The deviation from the NE will decrease the benefit in terms of PAR reductions and achieved savings for the consumer. Anticipating the results, we want to highlight that in the following section we show that the solution is robust with respect to these deviations and gives considerable improvements in comparison with other approaches in the literature.
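The rule of staying as close as possible to the NE schedule can be illustrated with a short sketch. It assumes that decisions are energy amounts in kWh with positive values meaning charging, and it enforces only the SOC bounds; the efficiencies and charging rates of the advanced battery model are ignored here. The function name is hypothetical.

```python
def execute_action(planned, soc, s_min=0.0, s_max=13.5):
    """Return the feasible decision closest to the planned one.

    planned: scheduled change of the SOC in kWh (negative = discharge)
    soc:     actual SOC in kWh before the interval
    """
    lowest = s_min - soc    # largest possible discharge
    highest = s_max - soc   # largest possible charge
    return max(lowest, min(highest, planned))


# The schedule asks for a discharge of 3 kWh, but only 2 kWh are stored.
print(execute_action(-3.0, 2.0))
```

In the example the household delivers the 2 kWh it actually has, which is the situation described in the text.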

4 Results and Discussion

In this section, we firstly summarise important simulation parameters and introduce the specific data sets for electricity demand and generation from the photovoltaic (PV) installation. After analysing the convergence behaviour of the iteration algorithm, we compare the game-theoretic approach introduced in this manuscript (cf. Sect. 3.1) with a simpler non-cooperative static game, revealing the advantages of the dynamic treatment. Subsequently, the analysis of how the participation rate of the DSM scheme and the forecasting errors influence the scheduling outcome is shown. Finally, we consider the influence of the composition of the neighbourhood on the peak-to-average ratio (PAR) reduction. This is an important measurement of the effectiveness of the DSM scheme. We consider the PAR of the aggregated electricity load (12) over the respective scheduling period. It is defined by
$$\begin{aligned} \text {PAR} = T\cdot \frac{\max _{t\in {\mathcal {T}}}L^t}{\sum _{t\in {\mathcal {T}}}L^t}\ . \end{aligned}$$
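As a small illustration of this definition, the PAR of a list of aggregated loads \(L^t\) can be computed as below. The function name and the example loads are made up.

```python
def peak_to_average_ratio(loads):
    """PAR = T * max(L^t) / sum(L^t) for a list of T aggregated loads."""
    total = sum(loads)
    if total <= 0:
        raise ValueError("the total load must be positive")
    return len(loads) * max(loads) / total


print(peak_to_average_ratio([2.0, 2.0, 2.0]))  # flat profile
print(peak_to_average_ratio([1.0, 2.0, 3.0]))  # profile with a peak
```

A perfectly flat profile gives the smallest possible value, PAR = 1, and any peak raises the ratio above 1.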

4.1 The Simulation Setup

In the real-world application, the smart meter of individual households collects data about electricity demand and generation from the available PV installation. As specified in Sect. 2.1, the demand-side management (DSM) protocol requires participants to send forecasts of the demand and generation to the utility company. These forecasts are based on historically collected data. In order to run our simulations, we omit this forecasting step and rather make use of two publicly available data sets.
Demand data The demand data stem from the OpenEI data set [27]. It contains 365 days of simulated hourly data for households in TMY3-locations in the USA [14]. The building models used for this simulation can be found in [26]. Based on an additional survey, all buildings are put into one of three different categories. They differ with respect to their overall consumption. Following [27], we refer to them as LOW, BASE, and HIGH consumers. For all simulation runs, we picked the same \(M=25\) households, in close vicinity to each other, to represent our neighbourhood. With respect to their consumption categories, we have seven LOW, nine BASE, and nine HIGH users.
PV data Data for the PV generation are based on real-world measurements [19] in the UK. They contain hourly values for days between September 2013 and October 2014. Note that latitude and climate zone of the measurement location are similar to the ones of the demand data. Under the assumption that the weather for all households in the neighbourhood is the same, we use data from the same site for each of them. An estimate for the \(kW\!p\) value is obtained from looking at the highest hourly output in the course of a whole year. Its value is \(w_{\max }=3.7\;\hbox {kWh}\), which is why we assume \(kW\!p\approx 4\;\hbox {kW}\). We account for different sizes of PV installations by scaling the data set with a household specific factor \(p_{n}\). About 6% of the collected data were corrupted. We set all these values to \(w=0.0\;\hbox {kWh}\). This does not pose any problem for our simulation results, but can be seen as realistic failures of the installation.
Table 1
Battery parameters

| Parameter | Value |
| --- | --- |
| \(\eta ^+\) | |
| \(\eta ^-\) | |
| \(\eta _{\text {inv}}\) | |
| \(\rho ^+\) | \(5.0\;\hbox {kW/h}\) |
| \(\rho ^-\) | \(-\,7.0\;\hbox {kW/h}\) |
| \({\bar{\rho }}\) | |
| \(s_{\max }\) | \(13.5\;\hbox {kWh}\) |
| \(s_{\min }\) | \(0.0\;\hbox {kWh}\) |
| | \(9.46\;\hbox {kWh}\) |

Parameters for a Tesla-inspired [25] home battery storage system
Battery and pricing parameters The parameters of the battery are based on the Tesla Powerwall 2 [25] data sheet. The choice to employ this battery system is motivated by two reasons: (1) the same battery was used in [17], allowing for a direct comparison of the results; (2) a non-extensive analysis of different battery systems showed that the Tesla Powerwall 2 qualifies as a representative of state-of-the-art technology. Please see ‘Appendix A.2’ for more details. A summary of the battery parameters is given in Table 1. The data sheet only specifies the round-trip efficiency \(\eta = \eta ^+\cdot \eta ^-\) of the battery. Without loss of generality, we assume that charging and discharging contribute equally, yielding \(\eta ^+ = \eta ^- = \sqrt{0.918}\).
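The equal split of the round-trip efficiency can be reproduced in a few lines. The variable names are illustrative; the only input is the round-trip value 0.918 stated above.

```python
import math

round_trip = 0.918  # round-trip efficiency eta = eta_plus * eta_minus

# Assumption from the text: charging and discharging contribute equally.
eta_charge = eta_discharge = math.sqrt(round_trip)

print(round(eta_charge, 3))
```

Rounded to three decimals this gives 0.958 for each direction.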
For the parameters in the cost function (14), we use \(c_2 = 0.03125\;\)$/MW\(^2\), \(c_1=1.0\;\)$/MW, and \(c_0=0\), following other studies [17, 20]. This allows our results to be compared directly with those studies. In ‘Appendix A.3’, the influence of these coefficients on the potential savings for the households is analysed.

4.2 Convergence Behaviour of the Algorithm

Let us provide some insight into the convergence behaviour of Algorithm 2. The condition that needs to be fulfilled to declare equilibrium is stated as ‘there exists no player n for whom his current action \(a_n\) is not a best response to the actions \(a_{-n}\) of the other players’ (cf. Algorithm 2, line 1). Within our specific implementation, selma, the stopping criterion is based on the L2 difference between the action profiles of two consecutive iterations, i.e. when this difference is smaller than or equal to \(10^{-15}\), the algorithm breaks out of the loop. The action profile of each iteration also determines the energy bills of the participants.
Results In Fig. 3, the absolute change in the average bill \(B={1}/{N}\sum _{n}B_n\) (cf. (15)) is shown for a randomly selected day of the simulation shown in Sect. 4.4. To cover the large scale of different changes, a logarithmic representation is chosen. The respective sign of the change is then expressed in the colour of the bar.
Figure 4 shows how the number of average iterations per day depends on the number of participants in the DSM scheme. The values are again taken from the simulations in Sect. 4.4.
Discussion The results give evidence of a correctly working iteration algorithm (cf. Algorithm 2). From Fig. 3, we see that between any two consecutive iterations, the absolute change in the average electricity bill is monotonically decreasing. Furthermore, we observe that the rate of this decrease is almost linear in the semi-logarithmic plot, hinting towards an exponential relationship.
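A near-linear decay in a semi-logarithmic plot can be quantified with a least-squares line through \(\log _{10}\) of the absolute changes. The sketch below uses made-up numbers, not the data behind Fig. 3, and the function name is hypothetical. A slope of \(-1\) per iteration would mean that the absolute change shrinks by a factor of 10 in every iteration.

```python
import math


def log_linear_fit(changes):
    """Least-squares slope and intercept of log10(|change|) versus iteration index."""
    ys = [math.log10(abs(c)) for c in changes]
    xs = list(range(len(ys)))
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


# Synthetic changes with alternating sign that shrink by a factor of 10 per iteration.
synthetic_changes = [0.5 * (-0.1) ** k for k in range(6)]
slope, intercept = log_linear_fit(synthetic_changes)
contraction_factor = 10 ** slope
print(slope, contraction_factor)
```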
Due to the exponential convergence towards a Nash equilibrium, only a few iterations are needed to obtain the equilibrium schedules. The specific number of iterations depends on the number of participants taking part in the DSM scheme. This behaviour is comparable to that shown in [12]. Figure 4 shows that the average number of iterations increases monotonically with the number of participants. Moreover, the variation in the number of iterations across individual scheduling periods is small, as shown by the standard deviation. This is a strong result, as it shows that the convergence properties are insensitive to different demand data of the individual participants. During experimentation with the code, more than one million games were solved, all of which converged to a Nash equilibrium.
The small number of iterations directly translates to small computational times and thus does not hinder a real-world application. Typical 365-day simulation runs take about \(30\;\)s on a single core of an i7-3770S CPU and require less than 1 GB of memory. Note that in the real-life scenario, the scheduling process is initiated once before the scheduling period and only needs to calculate the equilibrium schedules for the upcoming day. In summary, we expect no difficulties in implementing a DSM scheme based on our scheduling software selma.

4.3 Comparison Between a Static and a Dynamic DSM Scheme

In [17], a similar DSM scheme to the one described in Sect. 2.1 was examined. Both are based on a battery scheduling game for households of a neighbourhood served by the same utility company (UC). Their main difference is the underlying game that determines the schedules for the upcoming day. Whereas in this paper we employ a discrete time dynamic game, Pilz and Al-Fagih [17] made use of a simpler non-cooperative static game in which players were only able to choose between four discrete options for each interval. For a more thorough description please see [17]. For the sake of comparison, none of the households is equipped with PV cells.
In this subsection, we compare the two approaches with respect to their success in reducing the PAR of the aggregated load. To this end, the same parameters for each household and also the same demand data are used. Households do not have the capability of on-site generation, but are equipped with the same batteries (cf. Table 1). The upcoming day is divided into \(T=12\) intervals, and we assume \(N=M=25\), i.e. every household takes part in the DSM scheme. As in [17], we simulate full weeks by using the state-of-charge (SOC) values of the batteries at the end of the scheduling period as the initial configuration for the following one.
Results Figures 5a, b show the aggregated load curves achieved by the DSM schemes for forecasts given by week 12 and week 38 of the demand data set [27], respectively. For completeness, we also simulated week 25 and week 51 as done in [17]. A summary of the achieved results is given in Table 2.
On average, a 14% and 32% decrease in the PAR value was achieved by the static and the dynamic games, respectively. To understand the differences of the outcomes, we explicitly look at the schedules that are obtained in the NE of the respective games. Figure 6 shows these schedules exemplarily for day 5 of week 38 (Fig. 5b, cf. Figure 3 in [17]) together with the aggregated load and aggregated SOC above it. Each row illustrates the equilibrium schedule of one household.
Discussion Comparing the aggregated load curves (cf. Fig. 5) shows that a DSM scheme based on a dynamic game can achieve an almost flat profile. Nevertheless, depending on the given data, the outcome of the scheduling is subject to a finite-horizon effect. Empirically, we observe peaks and troughs at the end of the scheduling period if the demand for the final interval is lower than the average demand of the whole day. This indicates that the starting time of the DSM scheme has an influence on the achievable outcome. Nonetheless, this parameter is fixed through the DSM scheme protocol, thus asking for alternative solutions to the finite-horizon effect. Future work will aim to eliminate the influence of the starting time altogether.
In Table 2, we observe that on average the dynamic game reduces the PAR value more than twice as much as the static game. However, with respect to the individual weeks the static game shows a smaller standard deviation of 0.04 and thus seems to be more consistent. Its achieved reductions are all between 10.4 and 15.3%, while the range of reductions by the DSM scheme with the dynamic game is 23.9–40.9%. The difference with respect to the standard deviations is again owed to the finite-horizon effect. It is also present in the case with the static game, but, due to the generally worse outcome, does not alter it as much as the results of the dynamic scheduling game.
We can further understand the differences between the static and dynamic game from Fig. 6. The restriction to four discrete options for each interval in the static case, i.e. (1) remain idle, (2) charge half interval, (3) charge full interval, and (4) use battery, results in a majority of intervals where the battery remains idle. This is because of a lack of incentive to charge the battery by the two given amounts. In the dynamic game, players can choose to charge their battery from a continuous spectrum of decisions in a given interval. This difference becomes most apparent when looking at the aggregated SOC of all participants. Whereas the maximal SOC in the static case is approximately 64 kWh, almost twice as much (120 kWh) is charged in the dynamic case. In summary, it shows that the increased flexibility of the dynamic game is better suited to minimise the PAR of the aggregated load.
Table 2
PAR comparison

| | Reference | Static | Dynamic |
| --- | --- | --- | --- |
| Week 12 | 1.623 (0.005) | 1.374 (0.070) | 1.013 \((<0.001)\) |
| Week 25 | 1.574 (0.033) | 1.410 (0.035) | 1.198 (0.016) |
| Week 38 | 1.685 (0.031) | 1.439 (0.080) | 1.231 (0.015) |
| Week 51 | 1.718 (0.037) | 1.468 (0.082) | 1.015 (0.001) |
| \(\mu \) | 1.650 (0.064) | 1.423 (0.040) | 1.114 (0.117) |

Peak-to-average ratios calculated as the average over the individual days of week 12, week 25, week 38, and week 51 for the case without storage system (Reference) and both underlying games of the DSM scheme. \(\mu \) gives the average over all 4 weeks. Static: game employed in [17]; dynamic: game described in Sect. 3.1. The values in parentheses represent the standard deviation
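The percentages quoted in the discussion can be recomputed from the weekly mean values in Table 2. The sketch below is only a consistency check; the dictionary layout and function name are illustrative. It assumes that the bracketed values in the \(\mu \) row are sample standard deviations of the four weekly means, which is what reproduces them.

```python
from statistics import mean, stdev

# Weekly mean PAR values from Table 2: (reference, static, dynamic).
par = {
    "Week 12": (1.623, 1.374, 1.013),
    "Week 25": (1.574, 1.410, 1.198),
    "Week 38": (1.685, 1.439, 1.231),
    "Week 51": (1.718, 1.468, 1.015),
}


def reduction(reference, value):
    """Relative PAR reduction in percent."""
    return 100.0 * (reference - value) / reference


static_red = {w: reduction(r, s) for w, (r, s, _) in par.items()}
dynamic_red = {w: reduction(r, d) for w, (r, _, d) in par.items()}

reference_mean = mean(v[0] for v in par.values())
static_mean = mean(v[1] for v in par.values())
dynamic_mean = mean(v[2] for v in par.values())
static_std = stdev(v[1] for v in par.values())
dynamic_std = stdev(v[2] for v in par.values())

for week in par:
    print(week, round(static_red[week], 1), round(dynamic_red[week], 1))
print(round(reduction(reference_mean, static_mean)),
      round(reduction(reference_mean, dynamic_mean)))
```

The weekly reductions span 10.4 to 15.3% for the static game and 23.9 to 40.9% for the dynamic game, and the reductions of the four-week means are about 14% and 32%, in line with the values stated in the text.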
Note that all these comparisons allow for strong conclusions as they are based on the identical data set [27] and all the other parameters, such as the number of players N and the number of time intervals T, are chosen to be the same. Nevertheless, comparisons with other results in the literature are possible: compared to the work by Nguyen et al. [12], a better PAR reduction is achieved while the number of iterations needed to obtain the equilibrium solution is lower by two orders of magnitude. Similarly, the PAR reduction of Yaagoubi et al. [28] is worse than the one achieved by the approach shown in this manuscript. As they schedule not only the battery but also shift other household appliances, a comparison of the computational costs is not appropriate.

4.4 Influence of Participation Rate and Forecasting Errors

The question of how many participants are needed to obtain considerable gains in terms of PAR reduction and savings is important. Moreover, within this subsection the robustness with respect to the forecasting errors (cf. Sect. 2.2.4) is shown. To do so, we assume the forecasting error for the demand to be \(\epsilon _d = 8\%\) for every household. Such an error could be obtained from a forecast performed by an artificial neural network [3] and is approximately 2.5 times higher than the error of the best forecast reported there. This is independent of whether the household participates in the DSM scheme or not. The forecasting error for the solar generation is set to \(\epsilon _w = 10\%\) in accordance with [5, 21]. Rana et al. [21] make use of a neural network and clustering of weather data to forecast half-hourly solar power output for the upcoming day. Note that only participants of the DSM scheme are equipped with PV cells and thus subject to this forecasting error. The values are taken to represent a worst-case scenario. Consequently, any real-world scheduling result should fall in the interval between the worst-case outcome and the respective outcome without any forecasting error.
We simulate a full year and average over the obtained PAR values for the individual days. All participants are equipped with a lithium-ion battery (cf. Table 1) and a solar cell. The size of the PV installation depends on the user’s category. For LOW, BASE, and HIGH consumers, we use \(p_n=0.3\), \(p_n=0.5\), and \(p_n=0.7\), respectively. Starting with all 25 households taking part in the DSM scheme, we eliminate three users, i.e. one randomly selected from each consumer category, in each subsequent run. Non-participants still exhibit the specified forecasting error for their demand.
Results Figure 7 shows the reduction in the PAR value over the rate of participating consumers for the scenarios with and without forecasting errors. It includes not only the mean values, but also the standard deviation. Note that we slightly shifted the results for both runs along the abscissa to increase readability. An additional axis on the left indicates the absolute PAR values. Whereas the PAR reduction is the interest of the UC, the financial rewards, i.e. savings off the energy bill, are the interests of the participants of the DSM scheme. Figure 8 shows the average saving per day for all participants both with and without forecasting error. For further insight, it also illustrates the difference between the two curves.
Discussion Although a worst-case scenario is simulated, the outcome with respect to PAR reduction (cf. Fig. 7) and electricity bill (cf. Fig. 8) reduction shows considerable gains for the UC and the participants of the DSM scheme.
Without forecasting error, the PAR reduction monotonically improves with the proportion of the participants. This stands in contrast to the results shown in [24], where a minimum is reached at a medium participation rate. In comparison with other studies, such as [9, 12], we conclude that our dynamic game performs as well as their respective scheduling approaches. At 100% participation rate, a reduction of \(-33.3\%\) (5.8%) is achieved, in agreement with the results shown in Sect. 4.3. It should be noted that a perfectly flat load profile corresponds to an approximately \(-40\%\) reduction of the PAR. Thus, the outcome is close to the theoretical optimum. When looking at the standard deviation, we observe that it is lowest for the simulation run with 52% participation rate and increases towards both ends of the spectrum. On the lower end of participation rate, the fluctuations of the PAR value for different days are just an artefact of the data set in use. Small numbers of participants do not have enough influence on the overall neighbourhood to change this. For large participation rates, the PAR value is considerably reduced. The increase in the standard deviation for these runs stems directly from the finite-horizon effect already discussed in Sect. 4.3.
The results for runs with forecasting errors follow the results without errors closely. Figure 9 shows how the forecasting error affects the achieved PAR values for three selected participation rates when transitioning from the perfect forecast to the worst-case scenario as depicted in Fig. 7. For low participation rates, the difference is negligible but starts to increase when more households participate in the DSM scheme. Nevertheless, even in the worst-case scenario, a reduction of \(-27.8\%\) (8.9%) is achieved at 100% participation rate (cf. Fig. 7). With respect to the standard deviation, we again recognise similarities to the runs without forecasting errors. Smallest variations in the PAR reduction are obtained for participation rates around 50%, while we again see increasing variations at high participation rates. Here, the increase is distinctly larger than in the other runs. The reason behind this difference is directly explained by the forecasting error. As more participants join the DSM scheme, the absolute amount of deviation from the actual demand and production is increasing.
It is worth noting that the result for a participation rate of \(76\%\), i.e. a reduction of \(-27.7\%\) (4.4%) (cf. Fig. 7), is very promising from a practical point of view. The UC might not be able to convince everybody to participate in the DSM scheme, but can still gain reductions in the PAR value close to what is achievable at maximum participation.
In [18], it is investigated what happens when no forecast is calculated. Rather than calculating the forecast, the demand data of the current day are used as the input to the dynamic game that determines the schedules for the next day. It turns out that this approach is oversimplified and results in distinctly worse outcomes (cf. [18, Figure 5]), i.e. it shows the importance of a suitable forecasting mechanism.
The savings that participants of the DSM scheme can gain increase monotonically with the share of participants. Furthermore, we observe that the variations between different participants are negligible. This is due to the particular proportional billing scheme employed in the DSM scheme (cf. Sect. 2.3). It ensures fairness in the sense that LOW and HIGH consumers can gain equally by signing up for the DSM scheme. The difference between runs with and without forecasting errors reveals that the forecasting error does not influence the bill reduction to a great extent. Since the two curves are almost indistinguishable to the naked eye, the difference is shown in the same plot (cf. Fig. 8). It becomes clear that the difference actually decreases for larger numbers of participants.
This highlights that the dynamic scheduling game ensures robust and beneficial results for the participants of the DSM scheme, even in the worst-case scenario.

4.5 Consumer-Type Dependency

Results The results in Sects. 4.3 and 4.4 are all based on a neighbourhood consisting of a mix of the three different consumer types (LOW, BASE, and HIGH). Figure 10 shows the possible PAR reductions for mono-type neighbourhoods. To allow for comparison, \(M=25\) is kept constant. Furthermore, we use the same forecasting errors of \(\epsilon _d=8\%\) and \(\epsilon _w=10\%\) for the demand and renewable energy generation, respectively (cf. Sect. 4.4). All the simulations consider a scheduling period of a full year.
We also calculated the average savings that are achieved by the participants of the DSM scheme. These results are presented in Fig. 11 together with the reference of a mixed neighbourhood (cf. Fig. 8) with forecasting errors.
Discussion When comparing different compositions of neighbourhoods, we can gain further insight into the conditions for which the DSM scheme works most efficiently. At first glance, Fig. 10 reveals that given a low rate of participants in the scheme, the actual type of consumer is not crucial. Figure 12 shows the difference between the respective results for mono-type neighbourhoods and a mixed neighbourhood. A closer look shows that mono-LOW communities are always worse in reducing the PAR value of the aggregated load than any of the other ones. The results in terms of both the mean PAR reduction and the standard deviation get even worse with more than two-thirds of households participating in the scheme. Similar observations for mono-type neighbourhoods are found in [24].
For both mono-BASE and mono-HIGH neighbourhoods, it can be observed that they perform better (\(<1\)%) in an interval of medium participation rate than the mixed neighbourhood. Nevertheless, at \(N=M\) the obtained PAR reduction is smaller by 1.8% and 4.5%, respectively. Considering the variation of these mean PAR reduction values, it becomes clear that it is most beneficial to have a mixed-consumer neighbourhood.
Figure 11 shows the average bill reduction for the participants of the DSM scheme for different participation rates. Generally, they show the same behaviour already observed in Fig. 8. The influence of the proportionality factor in the billing scheme (15) is clearly visible. Although the mixed-consumer neighbourhood achieves better PAR reduction, the average savings are almost identical to a mono-BASE neighbourhood. A neighbourhood that purely consists of HIGH consumers can save about 11% off the energy bill and is consistently most rewarding for the participants independent of the participation rate.

4.6 Robustness

Results In order to evaluate the robustness of the game-theoretic approach, we simulate a large set of scenarios with randomly generated parameters. The parameters under consideration are the size of the PV array of the individual households, the maximum capacity of the lithium-ion battery, and the charging and discharging efficiencies. The change in capacity and efficiency can either be interpreted as an ageing effect of the Tesla Powerwall 2, or as using batteries from different manufacturers for different participating households. An overview of the range of these parameters is shown in Table 3. All other parameters are kept constant at the values used throughout the other experiments, i.e. \(M=25\), \(T=24\), \(\epsilon _d=8\%\), \(\epsilon _w=10\%\), and all other battery parameters as in Table 1, to allow for a fair comparison. For each set of random parameters, we consider a simulation period of one year. Figure 13 shows the results for various participation rates for 2048 years each.
Table 3
Parameter ranges

| Parameter | Robustness study (uniformly drawn from) | Other studies (fixed at) |
| --- | --- | --- |
| \(w_{\max }\) (kWh) | [0.0, 8.0] | |
| \(s_{\max }\) (kWh) | [2.8, 13.5] | |
| \(\eta ^+\) | [0.900, 0.958] | |
| \(\eta ^-\) | [0.900, 0.958] | |

Instead of using the same battery and (scaled) PV installation for each participant of the DSM scheme, the parameters are drawn uniformly from the given ranges in the simulation runs to analyse the robustness of the scheduling performance
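Drawing such a random scenario can be sketched as follows, assuming that each participating household receives its own independent draw from the ranges in Table 3. The dictionary keys, function names, and seeding are illustrative and do not describe the authors' implementation.

```python
import random

# Ranges from Table 3 (robustness study).
PARAMETER_RANGES = {
    "w_max_kwh": (0.0, 8.0),        # size of the PV array
    "s_max_kwh": (2.8, 13.5),       # maximum battery capacity
    "eta_charge": (0.900, 0.958),   # charging efficiency
    "eta_discharge": (0.900, 0.958) # discharging efficiency
}


def draw_household(rng):
    """Draw one set of parameters uniformly from the given ranges."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in PARAMETER_RANGES.items()}


def draw_neighbourhood(n_participants, seed=None):
    """Draw independent parameter sets for all participating households."""
    rng = random.Random(seed)
    return [draw_household(rng) for _ in range(n_participants)]


households = draw_neighbourhood(25, seed=1)
print(households[0])
```

Passing a seed makes a scenario reproducible, which is convenient when the same random neighbourhood has to be simulated for several participation rates.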
Discussion The median PAR value over all the simulated years decreases monotonically with an increase in the participation rate. At 100% participation rate, a median PAR value of 1.198 is achieved, which corresponds to a reduction of \(-28\%\) when compared to the scenario without the DSM scheme.
When comparing the median PAR values over 2048 years and random parameters with the ones obtained over one year and fixed parameters (cf. Fig. 7), it becomes clear that the former are slightly worse, differing by \(<2.5\%\). This was expected for the following reason: prior studies (not shown), in which only one of the four parameters was randomly changed at a time, revealed that the maximum capacity \(s_{\max }\) of the battery has the biggest influence on the PAR reduction. Since all of the randomly generated scenarios have the previously analysed scenario as an upper limit in terms of the available storage capacity, the decreased performance can be explained by the loss of effective flexibility.
Note that the same iteration statistics as shown in Fig. 4 are observed independent of the chosen parameters. With a total of \(>6.7\) million simulated days, this provides further confidence that the iteration algorithm works correctly and converges reliably.
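The order of magnitude of this number can be checked with a short calculation. Assuming 365 days per simulated year (an assumption of this estimate), the 2048 years simulated per participation rate yield the first line below. The number of participation rates is not stated in this section; nine is merely the smallest count consistent with the quoted total:
$$\begin{aligned} 2048 \times 365&= 747{,}520 \text { days per participation rate}, \\ 9 \times 747{,}520&= 6{,}727{,}680 > 6.7\times 10^6 . \end{aligned}$$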
All in all, we see that with the more realistic assumption of differing battery and solar installations the DSM scheme performs well and shows inherent robustness.

5 Conclusion

In this paper, we propose a demand-side management (DSM) scheme based on a discrete time dynamic game. Its purpose is to reduce the peak-to-average ratio (PAR) of the aggregated electricity load by scheduling the usage of individually owned (lithium-ion) energy storage systems. The utility company running the scheme incentivises users to take part by offering fair financial benefits. To ensure realistic outcomes, an advanced battery model is employed. Furthermore, the integration of local energy generation in the form of photovoltaic cells is taken into account.
The DSM scheme is suitable for real-world implementation for five reasons. Firstly, it is based on a complete model of the neighbourhood including storage systems, local energy generation, and, crucially, forecasting errors of both demand and generation. Secondly, the computation of schedules for the upcoming period is cheap and requires only a small amount of memory. This was achieved by deriving a closed-form solution for the best response problem of an individual player. The ensuing iterative algorithm is observed empirically to converge exponentially towards a Nash equilibrium and thus obtains the strategy profiles for one scheduling period in a fraction of a second. Thirdly, the resulting schedules are robust with respect to the worst-case forecasting errors. Whereas the error weakens the effect of the PAR reduction by \(\le 5.5\%\), the corresponding savings off the energy bill for the participants of the scheme are hardly changed. Fourthly, we provide evidence that a neighbourhood that consists of various types of consumers performs best in such a DSM scheme. Since a mixed community is more probable than a mono-type community, this is a promising result. Finally, simulations with randomly generated parameters for battery and photovoltaic installation have provided insight into the expected outcomes of the DSM scheme for batteries of different age and performance. The effective loss of flexibility weakens the PAR reduction by \(<2.5\%\), showing the robustness of the DSM scheme.
A direct and in-depth comparison of a DSM scheme with an underlying static game revealed the advantages of the dynamic game approach. Players are overall more active and thus able to achieve distinctly better results. Further comparisons with the literature in terms of PAR reduction and computational costs show the superiority of our approach.
Future Work: In future work, we plan to corroborate our results with an even more sophisticated approach to treat the uncertainties caused by the forecasts of demand and renewable energy generation. There are two main approaches that can be considered: (1) robust (finite) game theory [1], which can be seen as a generalised version of a Bayesian game; (2) a two-step approach that first determines day-ahead schedules and then refines them throughout the day by using the most recent data. The latter might be realised in a sliding window framework, which would potentially eliminate the finite-horizon effects that can be encountered in our solutions. Furthermore, one could also model the risks associated with the uncertainties directly in the utility function by means of the conditional value-at-risk measure.
Whereas the current approach investigates a rather small community of households, it is also worth exploring the other end of the spectrum, where the number of players becomes large. The method of choice here is mean field game theory, in which the behaviour of the system is examined in the limit of an infinite number of players.


This work was supported by the Doctoral Training Alliance (DTA) Energy. The authors want to thank Jean-Christophe Nebel and Eckhard Pfluegel for helpful discussions. Furthermore, the authors want to thank the anonymous reviewers for their constructive feedback that helped to improve the manuscript.

Compliance with Ethical Standards

Authors’ Contributions

MP and LA-F conceived and designed the system; MP performed the theoretical analysis; MP implemented the software; MP performed the simulations; MP analysed the data; MP wrote the paper.

Conflict of interest

The authors declare no conflict of interest.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit


Proof of Theorem 1

We prove the theorem by contradiction. Suppose \({\hat{a}}^{t,T-1}\) is not a Nash equilibrium of the subgame \(\left\{ G_1^{T-t},\ldots ,G_N^{T-t}\right\} \). Then, for some \(n\in {\mathcal {N}}\), there must exist another strategy \({\bar{a}}_n^{t,T-1}\) with the corresponding sequence of states \(\left\{ {\bar{s}}_n^\tau \right\} _{\tau =t}^T\) such that
$$\begin{aligned} U_n^{T-t}\left( \left( {\bar{s}}_n^t,{\hat{s}}_{-n}^t\right) , \left( {\bar{a}}_n^{t,T-1}, {\hat{a}}_{-n}^{t,T-1}\right) \right) > U_n^{T-t}\left( {\hat{s}}^t, {\hat{a}}^{t,T-1}\right) . \end{aligned}$$
Therefore, we obtain
$$\begin{aligned} U_n\left( s_n^0, \left( {\hat{a}}^{0,t-1}, \left( {\bar{a}}_n^{t,T-1}, {\hat{a}}_{-n}^{t,T-1}\right) \right) \right)&= U_n^{T-t}\left( \left( {\bar{s}}_n^t,{\hat{s}}_{-n}^t\right) , \left( {\bar{a}}_n^{t,T-1}, {\hat{a}}_{-n}^{t,T-1}\right) \right) \\&\quad - \sum _{\tau =0}^{t-1}g_n^\tau \left( {\hat{s}}_n^\tau , \left( {\hat{a}}_n^\tau , {\hat{a}}_{-n}^\tau \right) \right) \\&> U_n^{T-t}\left( {\hat{s}}^t, {\hat{a}}^{t,T-1}\right) - \sum _{\tau =0}^{t-1}g_n^\tau \left( {\hat{s}}_n^\tau , \left( {\hat{a}}_n^\tau , {\hat{a}}_{-n}^\tau \right) \right) \\&= U_n\left( {\hat{s}}^0,{\hat{a}}^{0,T-1}\right) = U_n\left( {\hat{s}}^0,{\hat{a}}\right) . \end{aligned}$$
This contradicts our assumption that \({\hat{a}}\) is a Nash equilibrium of the game \(\left\{ G_1, \dots , G_N\right\} \). Consequently, the supposition about \({\hat{a}}^{t,T-1}\) is false. Thus, \({\hat{a}}^{t,T-1}\) indeed constitutes a Nash equilibrium of the subgame \(\left\{ G_1^{T-t},\dots ,G_N^{T-t}\right\} \). \(\square \)

Battery Justification

Various companies produce home energy storage systems. To name just a few, there are Mercedes, Tesla, BMW, Nissan, and Powervault. Some of them specialise in second-life batteries taken from their electric cars, while others (such as Tesla) produce batteries specifically for this purpose. As most manufacturers provide technical data sheets, we were able to run our simulations for the demand-side management scheme with batteries from different manufacturers. The results in Fig. 14 stem from scenarios with 76% participation rate, forecasting errors as used in Sect. 4.4, and all participants within a scenario using the exact same battery model. This is not intended as a comparison of different systems, but rather to show that the battery used throughout the paper (also employed in [17]) can be taken as a representative of state-of-the-art technology. In this particular simulation run, it achieves a peak-to-average ratio reduction similar to the best in the field. Also, the savings off the energy bill are close to those of the best competitors.

Influence of Pricing Parameters

Throughout the paper, a quadratic cost function (14) with constant coefficients was used based on previous studies [17, 20]. In this section, the specific choice of these coefficients is justified and their influence on the obtainable savings for the consumers is analysed. In order to get a thorough overview, a wide range of coefficients was considered, i.e. \(c_2\in [10^{-4}, 10^6]\), and \(c_1\in [10^{-3}, 10^5]\). Note that for simplicity the constant term is kept at \(c_0=0\). The average savings over a simulation period of one year for \(N/M = 100\%\), including forecasting errors as discussed in Sect. 4.1, are shown in Fig. 15. We can observe three regions in this representation: (1) a plateau, where the relative savings are equal to \(\approx 17\%\), (2) a transitional region where the savings are diminishing, and (3) a second plateau of negative savings (\(\approx -1.5\%\)), i.e. the consumers actually have to pay more given these pricing coefficients. The previously used pricing coefficients, which lead to an average saving of 10% per year, are located in the second region. This means that by changing the pricing coefficients an even higher cost reduction could be achieved. It can be observed that, as long as the quadratic term in the cost function is dominant, the consumers can expect a financial gain from playing the game. Only when \(c_2 \ll c_1\) is the advantage lost.
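The role of the quadratic term can be illustrated with a toy calculation. The sketch below assumes that the cost function (14) has the form \(c_2L^2+c_1L+c_0\) per time interval, applied to an aggregated load \(L\); the function names and the two load profiles are hypothetical. It ignores battery losses, forecasting errors, and the billing scheme (15), so it does not reproduce the numbers or the negative plateau in Fig. 15. It only shows why flattening a load of fixed total energy pays off when the quadratic term dominates and hardly matters when \(c_2 \ll c_1\).

```python
def period_cost(load, c2, c1, c0=0.0):
    """Total cost of a load profile, assuming c2*L**2 + c1*L + c0 per interval."""
    return sum(c2 * L * L + c1 * L + c0 for L in load)


def relative_saving(baseline, scheduled, c2, c1):
    """Fraction of the baseline cost saved by the scheduled profile."""
    base = period_cost(baseline, c2, c1)
    return (base - period_cost(scheduled, c2, c1)) / base


# Hypothetical aggregated loads with the same total energy (8 units).
peaky = [1.0, 1.0, 1.0, 5.0]
flat = [2.0, 2.0, 2.0, 2.0]

# Quadratic term dominant: flattening the load reduces the cost noticeably.
print(relative_saving(peaky, flat, c2=1.0, c1=0.0))

# Linear term dominant (c2 << c1): the saving almost vanishes.
print(relative_saving(peaky, flat, c2=1e-4, c1=1.0))
```

Because the linear term depends only on the total energy, which both profiles share, any saving in this toy setting comes from the quadratic term alone.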
While these results are all based on simulations in the presence of forecasting errors, we can report that without forecasting errors the resulting savings differ by \(<1\%\) for each of the considered sets of coefficients in the represented range, following the results shown in Fig. 8. This can be explained by looking at the effect the forecasting error has on the actual load curve of the households. In Sect. 4.4, we saw that the PAR of the aggregated load changes by only a few per cent when introducing inaccurately forecasted demand and generation data. A rough estimate shows that changing the load by x per cent could also be interpreted as a change in the coefficients \(c_1\) and \(c_2\) by the same amount and twice the amount, respectively. Within such a narrow margin, the function plotted in Fig. 15 is indeed almost flat.
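This rough estimate can be made explicit under the assumption that the cost function (14) has the form \(g(L)=c_2L^2+c_1L\) (with \(c_0=0\) as above), where \(L\) denotes the load and \(x\) the relative change of the load written as a fraction:
$$\begin{aligned} g\left( (1+x)L\right)&= c_2(1+x)^2L^2 + c_1(1+x)L \\&= \left( 1+2x+x^2\right) c_2L^2 + (1+x)\,c_1L \\&\approx (1+2x)\,c_2L^2 + (1+x)\,c_1L \quad \text {for } |x|\ll 1. \end{aligned}$$
To first order in \(x\), the scaled load therefore acts like the original load with \(c_1\) changed by the factor \(1+x\) and \(c_2\) changed by the factor \(1+2x\).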
As of today, the forecasting module is not included in selma. For this study, we assume this information to be given and refer the reader to [3, 5] for details on demand forecasting.
Later, we will see that the solutions/schedules require the players to deplete their battery towards the end of the scheduling period (cf. finite-horizon effect) to achieve maximum utility. This means that, as long as none of the players deviates from their respective schedule, this knowledge is implicitly shared.
We make use of \(T=24\) for all simulations, if not stated otherwise.

References

Bahn O, Haurie A, Malhamé R (2009) A stochastic control/game approach to the optimal timing of climate policies. In: Filar J, Haurie A (eds) Uncertainty and environmental decision making. International series in operations research & management science, vol 138. Springer, Boston, MA
Celik B, Roche R, Bouquain D, Miraoui A (2017) Coordinated neighborhood energy sharing using game theory and multi-agent systems. In: 2017 IEEE Manchester PowerTech. Manchester, pp 1–6
Huang MY, Malhame RP, Caines PE (2006) Large population stochastic dynamic games: closed-loop McKean–Vlasov systems and the Nash certainty equivalence principle. Commun Inf Syst 6(3):221–252
Pilz M, Nebel JC, Al-Fagih L (2018) A practical approach to energy scheduling: a game worth playing? In: IEEE Conference, ISGT 2018 Europe
Rana M, Koprinska I, Agelidis VG (2016) Solar power forecasting using weather type clustering and ensembles of neural networks. In: 2016 International joint conference on neural networks (IJCNN). Vancouver, BC, pp 4962–4969
Shoham Y, Leyton-Brown K (2009) Multiagent systems, 1st edn. Cambridge University Press, Cambridge

Publisher: Springer US
Print ISSN: 2153-0785
Electronic ISSN: 2153-0793