2.1 Data
The empirical evaluation of this work is based on mobile phone and epidemiological data. We analysed an anonymised set of mobile phone data collected by Orange Côte d’Ivoire. It consists of billing information of about 8 million mobile phone users (i.e., 35% of the country’s population), collected between February and October 2014 in Ivory Coast, for a total of about 4.5 billion records. Mobile phone operators continuously collect such data for billing purposes and to improve the operation of their cellular networks. Every time a person uses a phone (makes a call, sends an SMS or goes online), a Call Data Record (CDR) is generated. The record contains the caller and callee IDs, timestamp, duration and type of communication, as well as an identifier of the cellular tower that handled the call. The approximate spatio-temporal trajectory of a mobile phone and its user can be reconstructed by linking the CDRs associated with that phone with the geographic location of the cellular towers that handled the calls.
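As an illustration of the trajectory reconstruction described above, the following minimal sketch groups records by user, orders them in time and replaces each tower identifier with its coordinates. The field names (`caller_id`, `timestamp`, `tower_id`), the tower table and the toy records are hypothetical and do not reflect the actual schema of the dataset.

```python
from collections import defaultdict


def build_trajectories(cdrs, tower_locations):
    """Return {user: [(timestamp, (lat, lon)), ...]} sorted by timestamp.

    `cdrs` is an iterable of dicts with hypothetical keys
    'caller_id', 'timestamp' and 'tower_id'.
    """
    by_user = defaultdict(list)
    for rec in cdrs:
        loc = tower_locations.get(rec["tower_id"])
        if loc is None:
            continue  # skip records whose tower has no known location
        by_user[rec["caller_id"]].append((rec["timestamp"], loc))
    # Sorting the (timestamp, location) tuples orders each trajectory in time.
    return {user: sorted(points) for user, points in by_user.items()}


# Toy data (hypothetical values, for illustration only).
tower_locations = {"T1": (5.35, -4.02), "T2": (6.82, -5.28)}
cdrs = [
    {"caller_id": "u1", "timestamp": 20, "tower_id": "T2"},
    {"caller_id": "u1", "timestamp": 10, "tower_id": "T1"},
    {"caller_id": "u2", "timestamp": 15, "tower_id": "T1"},
]
trajectories = build_trajectories(cdrs, tower_locations)
```

The resulting trajectories are only approximate, since a location is observed only when the phone is used.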
As far as the epidemiological data is concerned, in order to place our results in a more realistic context, we consider a scenario modelled using parameter values estimated from the Ebola outbreak in Sierra Leone in 2014 [26] (Table 1). This type of modelling can be used for analysing different “what-if” scenarios and for devising mitigation strategies. It is worth noting that we present the results for a worst-case scenario, projecting the most severe form of an Ebola epidemic.
Table 1 Ebola-specific parameter values
| Parameter | Value |
| β | 0.45 |
| σ | 0.18 |
| γ | 0.2 |
| ρ | 0.48 |
2.2 Disease spread spatial model
In order to describe the countrywide-scale infectious disease spread, where individuals change location over time, we use a meta-population model. This framework has traditionally provided an attractive approach to epidemics modelling. In fact, a meta-population model allows modellers to include a realistic contact structure, and to reflect the spatial separation of the sub-populations (i.e., the contact rate might vary with spatial separation). The intuition behind meta-population models is that a natural population occupying any considerable area will be composed of a number n of local populations (i.e., sub-populations), which interact and exchange individuals between them, because of their movement, through a given mobility network [27]. The nodes of such a network are the geographical areas connected according to a well-defined adjacency matrix M (i.e., mobility matrix) of dimension n by n. The element \(m_{ij}\) represents the probability per unit of time that an individual chosen at random in an area i will travel to an area j.
We compute this quantity using the CDR dataset. Given users’ movement trajectories, we estimate the probability of moving between antenna locations. A possible approach is to use a Markovian model as proposed in [4]. The estimation of the probability of movement is described by Eq. (1):
$$ m_{ij} = \frac{\sum_{u}M_{ij}^{u}}{\sum_{u}\sum_{k}M_{ik}^{u}}, $$
(1)
where \(M_{ij}^{u}\) is the number of times an individual u moves from an area i to an area j. Daily location and movement are then aggregated to measure transitions among 508 Ivorian administrative regions called sub-prefectures.
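The estimator in Eq. (1) can be sketched as follows. This is a minimal illustration, assuming that each user's trajectory has already been reduced to an ordered sequence of area indices, that consecutive observations in the same area count as a transition from i to i, and that an area with no observed departures keeps its population in place. These choices, the function name and the toy data are assumptions made for the example.

```python
def mobility_matrix(area_sequences, n):
    """Estimate m_ij of Eq. (1) from per-user sequences of area indices.

    counts[i][j] accumulates sum_u M_ij^u; each row is then divided by
    sum_u sum_k M_ik^u so that it sums to one.
    """
    counts = [[0] * n for _ in range(n)]
    for seq in area_sequences.values():
        for i, j in zip(seq, seq[1:]):  # consecutive observations
            counts[i][j] += 1
    matrix = []
    for i, row in enumerate(counts):
        total = sum(row)
        if total == 0:
            # Assumption: with no observed departures, individuals stay put.
            matrix.append([1.0 if j == i else 0.0 for j in range(n)])
        else:
            matrix.append([c / total for c in row])
    return matrix


# Toy example with three areas and two users (hypothetical data).
area_sequences = {"u1": [0, 1, 1, 0], "u2": [0, 0, 1]}
M = mobility_matrix(area_sequences, 3)
```

Each row of the result is a probability distribution over destinations, which is the property needed when M is later combined with the epidemic dynamics.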
Within each geographic area, sub-populations may be in contact and may change their health state according to the disease dynamics. By doing so, the system will evolve under the action of two processes, namely disease contagion and the mobility of individuals.
To model the process of disease transmission we consider the SEIR epidemiological model. Thus, in each node of the spatial network, SEIR dynamics take place over a population of size \(N_{i}(t)\) (the number of individuals located in an area i at time t). With respect to the infection progress, individuals located in a given area i are partitioned into \(S_{i}(t)\), \(E_{i}(t)\), \(I_{i}(t)\) and \(R_{i}(t)\), denoting the number of susceptible, exposed, infected and recovered individuals at time t. Hence, at each time t, a person is either susceptible, exposed, infected or recovered (i.e., \(S_{i}(t)+E_{i}(t)+I_{i}(t)+R_{i}(t) = N_{i}(t)\)) and, as the SEIR process takes place, individuals change state as follows. A susceptible individual becomes exposed to the disease with probability \(\beta I/N\), with β being the product of the contact rate and the contagion probability. An exposed individual becomes infected at rate σ. An infected individual can then recover at rate γ or, alternatively, die before recovering because of infection-induced mortality, with probability ρ [25].
As stated above, simultaneously with the contagion process, individuals move according to the mobility matrix. So as time passes, \(N_{i}(t)\) changes according to the number of individuals who have entered and who have left the node (i.e., geographical area) i, and the number of births and deaths. In order to combine the two interdependent processes and study their effect on the evolution of the system, we use the approach proposed by Lima et al. [4], based on a product between the transpose of the mobility matrix (M) and the state variable vectors (S, E, I, R). Overall, the system can be described by the system of Eqs. (2):
$$\begin{aligned}& S_{i} (t+1 ) = \sum_{j=1}^{n} m_{ji} \biggl[S_{j}(t) + \nu- \beta\frac{S_{j}(t)}{N_{j}(t)}I_{j}(t) -\mu S_{j}(t) \biggr], \\ & E_{i} (t+1 ) = \sum_{j=1}^{n} m_{ji} \biggl[E_{j}(t) + \beta \frac{S_{j}(t)}{N_{j}(t)}I_{j}(t) -\sigma E_{j}(t) -\mu E_{j}(t) \biggr], \\& I_{i} (t+1 ) = \sum_{j=1}^{n} m_{ji} \biggl[I_{j}(t)+ \sigma E_{j}(t) - \frac{\mu+ \gamma}{1-\rho}I_{j}(t) \biggr], \\& R_{i} (t+1 ) = \sum_{j=1}^{n} m_{ji} \bigl[R_{j}(t) +\gamma I_{j}(t) -\mu R_{j}(t) \bigr], \end{aligned}$$
(2)
where the expressions inside brackets describe the evolution of the disease according to the SEIR model, and the matrix product accounts for individuals moving between meta-populations. At each time step, individuals can change both state and location within the spatial network. Note that this model also takes into account birth and mortality rates: these are modelled through the population-level birth rate (ν) and the per capita natural death rate (μ).
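One time step of Eqs. (2) can be sketched as below. This is a minimal illustration in plain Python rather than the implementation used in this work: the function name, the two-area mobility matrix and the initial conditions are hypothetical, and ν and μ are set to zero in the example only for simplicity. The disease parameters are those of Table 1.

```python
def seir_step(S, E, I, R, M, beta, sigma, gamma, rho, nu=0.0, mu=0.0):
    """Apply one step of Eqs. (2): local SEIR update, then mobility."""
    n = len(S)
    s_loc, e_loc, i_loc, r_loc = [], [], [], []
    for j in range(n):
        N = S[j] + E[j] + I[j] + R[j]
        new_exposed = beta * S[j] * I[j] / N if N > 0 else 0.0
        # Bracketed terms of Eqs. (2), evaluated in area j.
        s_loc.append(S[j] + nu - new_exposed - mu * S[j])
        e_loc.append(E[j] + new_exposed - sigma * E[j] - mu * E[j])
        i_loc.append(I[j] + sigma * E[j] - (mu + gamma) / (1 - rho) * I[j])
        r_loc.append(R[j] + gamma * I[j] - mu * R[j])

    def move(x):
        # x_i(t+1) = sum_j m_ji * x_j, i.e. the product with M transposed.
        return [sum(M[j][i] * x[j] for j in range(n)) for i in range(n)]

    return move(s_loc), move(e_loc), move(i_loc), move(r_loc)


# Hypothetical two-area example using the parameter values of Table 1.
M = [[0.9, 0.1], [0.2, 0.8]]
S0, E0, I0, R0 = [990.0, 1000.0], [0.0, 0.0], [10.0, 0.0], [0.0, 0.0]
S1, E1, I1, R1 = seir_step(S0, E0, I0, R0, M,
                           beta=0.45, sigma=0.18, gamma=0.2, rho=0.48)
```

Because each row of M sums to one, the mobility step only redistributes individuals across areas and leaves the total in each compartment unchanged.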
2.2.1 Geographic-based targeting
First, we consider spatial targeting. We approached this problem as the identification of influential spreaders within a complex spatial network. Traditional approaches to identifying the most efficient spreaders in a network of interactions through which spreading processes take place have been based on centrality measures such as the degree, eigenvector centrality or k-shell [28–30]. These measures, although effective in identifying the most influential nodal positions in a network, are rarely accurate in quantifying the spreading power of a given node, particularly for nodes that are not highly influential [31]. This is because they are not able to capture and represent the dynamic processes that take place in the networked system under consideration (see for example the discussion in [32]).
Fortunately, it has been shown that various approaches are effective in measuring a node’s influence in disease spreading processes. Here, in particular, we consider accessibility, which has been shown to be effective in quantifying the relationship between structure and spreading dynamics [33]. More specifically, this concept was introduced to quantify the efficiency of communications among nodes in a complex network. Several definitions of accessibility have been proposed. Our goal is to measure the possibility of interactions within an area. Thus, as suggested by Hansen [34], we are interested in quantifying the inward accessibility, that is, for a given node i, the frequency of access to i from all the other nodes of the network. For this reason, in order to quantify accessibility we adopted the place rank [35] measure. Place rank is a flow-based accessibility measure, which uses origin-destination information to estimate the accessibility of a location within a geographic network. It is based on an intuition similar to that underlying Google PageRank, i.e., the accessibility of a certain area is related to the probability of visiting it. For each node (area) of a network, it is determined by the number of people moving to it. The contribution of the people coming from a certain area is in turn a function of the accessibility of that area, and so on. More precisely, place rank is computed by the iterative algorithm presented below:
$$\begin{aligned}& P_{i,t} = \frac{R_{i,t}}{O_{i}}, \end{aligned}$$
(3)
$$\begin{aligned}& E_{ij,t} = E_{ij,t-1}*P_{i,t-1}, \end{aligned}$$
(4)
$$\begin{aligned}& R_{j,t} = \sum_{i=1}^{I} E_{ij,t}, \end{aligned}$$
(5)
$$\begin{aligned}& R_{i,t} = R_{j,t}^{T}, \\& \text{if }R_{i,t} = R_{i,t-1},\text{stop};\quad \text{else: Eq.~(3)} \end{aligned}$$
(6)
where \(P_{i,t}\) is the power of the contribution of each person leaving i at iteration t; \(E_{ij,t}\) is the weighted origin-destination table, i.e. the weighted number of people leaving i to reach j; \(R_{j,t}\) is the place rank for zone j at iteration t; \(O_{i}\) is the number of people originating from i; I is the total number of zones i within the network.
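The iteration in Eqs. (3)–(6) can be sketched as follows. This is a minimal illustration under stated assumptions, not a reference implementation of place rank: it assumes that the raw origin-destination counts are reweighted by the current power at every iteration (one possible reading of Eq. (4)), that the initial rank of a zone equals its raw inflow, and that the equality test of Eq. (6) is replaced by a numerical tolerance. The function name and the toy origin-destination table are hypothetical.

```python
def place_rank(od, tol=1e-9, max_iter=1000):
    """Iterate Eqs. (3)-(6) on an origin-destination table od[i][j]."""
    n = len(od)
    origins = [sum(row) for row in od]  # O_i: people originating from i
    # Assumption: initial ranks are the unweighted inflows of each zone.
    rank = [float(sum(od[i][j] for i in range(n))) for j in range(n)]
    for _ in range(max_iter):
        # Eq. (3): power of each person leaving zone i.
        power = [rank[i] / origins[i] if origins[i] else 0.0
                 for i in range(n)]
        # Eq. (4): origin-destination table weighted by the origin's power.
        weighted = [[od[i][j] * power[i] for j in range(n)]
                    for i in range(n)]
        # Eq. (5): new rank of zone j is its weighted inflow.
        new_rank = [sum(weighted[i][j] for i in range(n))
                    for j in range(n)]
        # Eq. (6): stop when the ranks no longer change (within tol).
        if max(abs(a - b) for a, b in zip(new_rank, rank)) < tol:
            return new_rank
        rank = new_rank
    return rank


# Hypothetical table for three zones: od[i][j] people travel from i to j.
od = [[0, 10, 0],
      [5, 0, 5],
      [10, 0, 0]]
ranks = place_rank(od)
```

Under these assumptions the update is a power iteration on the row-normalised flows, so when every zone has at least one departure the sum of the ranks stays equal to the total number of trips.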
2.2.2 Individual-based targeting
We are aware that curbing the spread of a disease in an entire geographical region might be restrictive and somewhat difficult to implement. Thus, as a further improvement of the targeting process, we consider the “spreading power” of a single person based on their mobility profiles. We investigate the effect of specific spatial behavioural indexes, linked to users’ mobility, on the identification of individuals at highest risk.
Studying human mobility and its relationships with people’s daily activities might yield important insights into our understanding of human spatial behaviour. In the past decade, human mobility has attracted considerable attention in several disciplines. One of the main findings is related to the spatial heterogeneity of human movement (see for example [13, 36, 37]). We consider the diversity of travel histories and mobility profiles, and try to link it to the heterogeneity of infectiousness levels. We propose to take into consideration the risk of infectiousness/infection of the population given individuals’ travel behaviour. The rationale is that the higher the mobility of an individual, the higher the probability of getting infected and, if infected, of infecting other individuals.
To this end, we analyse existing mobile phone-based mobility measures and study their correlation with the contagion risk of individuals. A significant body of literature has focussed on the characterisation of human mobility patterns as derived from CDR data [13, 36, 38, 39], resulting in the definition of several indicators for individual mobility. These indicators relate, to a certain extent, to different dimensions of mobility. In this work, we focus on measures that represent individual mobility from three critical perspectives: the spatial range (as measured by the radius of gyration), the spatial regularity (as measured by the movement entropy) and the percentage of time spent at home.
As an additional index for the quantification of contagion risk, we considered the hybrid Progmosis risk model proposed by Lima et al. [40], which leverages both the mobility behaviour of single individuals and the epidemic dynamics itself.
We now discuss these indicators in more detail: