Open Access | Research | 01.12.2020
Node-weighted centrality: a new way of centrality hybridization
Published in: Computational Social Networks, Issue 1/2020
Abstract
Centrality measures have proven to be a salient computational science tool for analyzing networks over the last two to three decades, aiding many problems in computer science, economics, physics, and sociology. With the increasing complexity and diversity of network analysis problems, there is a need to modify the existing traditional centrality measures. Weighted centrality measures usually consider weights on the edges and assume the weights on the nodes to be uniform. One of the main reasons for this assumption is the hardness and challenge of mapping nodes to their corresponding weights. In this paper, we propose a way to overcome this limitation by hybridizing the traditional centrality measures. The hybridization is done by taking one centrality measure as a mapping function to generate weights on the nodes and then using these node weights in another centrality measure for better complex ranking.
Introduction
Centrality measures are an important tool in social and complex network analysis to quantify the eminence of nodes. These measures remain invariant under isomorphic transformations of a network [2]. By definition, a centrality measure is a quantification of the structural importance of a node based on its location, connectivity, or any other structural property. They have been used not only by network scientists, but also by biologists, sociologists, physicists, psychologists, and economists over time. Several measures have been coined in the literature. The most popular centrality measures for network analysis (traditional measures) are degree, closeness, betweenness, and eigenvector centrality. Readers are referred to the books by Jackson [3] and Brandes and Erlebach [2] for a detailed survey of centrality indices and their applications. There exist several other measures which either extend or generalize these traditional measures or limit them to a restricted application. Moreover, various variants of these centrality measures have been proposed which consider a set of nodes and compute its collective centrality, called group centrality [4]. Yet another direction is to combine various centrality measures to achieve better results for answering more complex problems. Such measures are termed hybrid centralities, and a few of them are summarized in the section on related work.
In the last two decades, a major portion of the interdisciplinary work evolved from simply applying these measures to extract information from underlying network data. Each of the proposed centrality measures ranks nodes in a network based on a specific structural attribute, making them application specific. For this reason, choosing an appropriate centrality for a given application has been a key issue in network analysis. A major portion of the research in this direction is concerned with selecting the best of the available measures for a particular application or defining a new measure that outperforms the existing ones.
Most of the centrality measures proposed in the literature were first defined for unweighted graphs, i.e., all nodes and all edges were assumed homogeneous at the start of centrality computation. We refer to these measures as unweighted centrality measures. After realizing the existence and understanding the necessity and importance of weights on the edges, centrality measures for unweighted graphs were extended to edge-weighted centrality measures. These measures take weights on the edges into consideration for ranking the nodes while analyzing networks, but still assume equal weights on the nodes. A substantial part of present-day research on the analysis of weighted networks considers only edge weights to determine the topological significance of nodes [5].
We call a network with non-uniform weights on both edges and nodes a fully weighted network. Several such networks surround us. The weights on the nodes in fully weighted networks can be understood as some mapping of the characteristics or attributes of the nodes to real values. At times, these weights can also depend on the structure around the nodes. Let us understand the possibility of weights on the nodes with the help of some popular networks around us. Since weights on the edges have already been discussed above, here we only discuss the existence of non-uniform weights on the nodes, using the following examples. Beyond these, several other kinds of networks also fall into the category of fully weighted networks. It should be noted that in any network, it is highly unlikely that all nodes (actors/devices/places/persons) share the same characteristics. One basic reason to ignore the weights on nodes is the hardness and complexity of figuring out a non-trivial mapping from node attributes to real values. Another reason is to avoid complicating the analysis process, but ignoring node weights can sometimes lead to a wrong analysis of the given network. We show through toy examples that unweighted and edge-weighted centrality measures may exhibit limitations in correctly analyzing fully weighted networks. Therefore, it becomes essential to also consider weights on nodes for a better analysis of fully weighted networks. For this reason, the previously defined unweighted and edge-weighted centrality measures need to be upgraded so that weights on the nodes can also be taken into account.
1. Friendship/social networks: in this type of network, nodes represent persons and edges represent friendship relations between the considered set of persons. Here, weights on the nodes can be understood as a mapping of wealth, power, education level, or some other attribute of the persons. Notably, two persons with identical attributes are highly unlikely to exist, and therefore each person's attributes can be mapped to a different real value based on an application-specific mapping.
2. Public transit networks: road networks, train networks, metro rail networks, and airline networks are some of the major public transit networks. In these networks, nodes represent locations (places) and edges represent direct travel connections between the locations. Here, weights on the nodes can be understood as a mapping of the population, frequency of commuters, development status, popularity for tourism, etc., of the location represented by that node.
3. Communication networks: in communication networks, nodes represent communicating devices and links represent direct (wired or wireless) connections between these devices. Weights on the nodes in such networks can be understood as the role, cost, or location of these devices.
4. Co-authorship networks: in such networks, nodes represent authors and edges represent the co-authorship relation, i.e., at least one article was written by the authors across each edge as co-authors. Here, weights on nodes can represent the popularity, h-index, citations, awards, designation, etc., of the authors.
Therefore, while considering weights on the nodes, there can be two possible extensions of the unweighted and edge-weighted centrality measures:

Node-weighted centrality measures: consider only the node weights for the analysis while taking all edge weights as one.

Fully weighted centrality measures: consider both types of weights, edge weights and node weights, for the analysis.

In most of the studies done so far on fully weighted networks, while considering weighted edges, weights on the nodes were completely ignored. Meanwhile, only a little work has considered weights on the nodes in real-world networks [6-9]. The main contribution of this paper is to motivate the analysis of networks while considering the weights on the nodes. To overcome the challenges in determining the weights on the nodes, we propose to use appropriate centrality measures to generate the weights and then use these weights in node-weighted centrality measures to analyze a given network. We give two applications based on this principle of hybridization.
In the next section, we provide the basic notations and definitions of traditional unweighted centrality measures. Two toy examples pointing out and motivating the necessity of considering weights on the nodes are discussed after that. Next, node-weighted centrality measures and related works are summarized. Afterwards, new hybridizations of centrality measures are introduced for solving two complex computational problems in networks, based on the definitions of node-weighted centrality measures. The experimental comparison between the newly proposed hybrid centrality measures and the traditional centrality measures on real-world networks is contained in the respective sections on the two problems. Finally, we conclude and discuss prospective future directions.
This paper is an expanded and extended version of results that appeared in CSoNet 2019 [1]. The conference version, titled "Hybrid Centrality Measures for Service Coverage Problem", discussed a hybridization of centrality measures to solve the service coverage problem based on the formulation of node-weighted centrality measures. In this version, we extend this type of hybridization to another problem, the spread maximization problem. Accordingly, the structure of the paper has been revised, and this version focuses on motivating the hybridization of centrality measures based on the formulation of node-weighted centrality measures. Applying the proposed measures to more than one problem exhibits the potential of this kind of hybridization and motivates several open directions in this area.
Unweighted centrality measures
This section first defines the basic notations used throughout the paper. Then, we briefly summarize the traditional centrality measures: degree, closeness, betweenness, and eigenvector. We also briefly describe harmonic centrality, which is highly correlated with closeness centrality.
Let \(G=(V,E)\) be an undirected unweighted network, where V is the set of nodes and E is the set of links. Let the number of nodes, i.e., the cardinality of V, be \(|V|=n\), and the number of links, i.e., the cardinality of E, be \(|E|=m\). Let A be the \(n\times n\) adjacency matrix of network G, where entry \(a_{ij}\) denotes whether there is a link between node i and node j: \(a_{ij}=1\) if there exists a link between node i and node j, and \(a_{ij}=0\) otherwise. Let \(d_{ij}\) be the geodesic distance, i.e., the length of a shortest path, between node i and node j. Next, we briefly discuss some of the widely used centrality measures.
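To make the notation concrete, the geodesic distances \(d_{ij}\) in an unweighted graph can be computed by breadth-first search from each node. The following minimal Python sketch (the four-node adjacency list is a hypothetical toy graph, not one from the paper) illustrates this:

```python
from collections import deque

# Hypothetical toy network as an adjacency list; node names are illustrative.
graph = {
    "A": ["B", "C"],
    "B": ["A", "C", "D"],
    "C": ["A", "B"],
    "D": ["B"],
}

def bfs_distances(graph, source):
    """Geodesic distances d_{source,v} to every reachable node via BFS."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

print(bfs_distances(graph, "A"))  # {'A': 0, 'B': 1, 'C': 1, 'D': 2}
```

Running BFS once per node gives all pairwise distances in O(nm) time, which is the building block for the distance-based measures below.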

Degree: Freeman's [10] degree centrality considers a node's importance to be proportional to its degree, the number of links incident to that node. Mathematically, the degree centrality of a node u is
$$\begin{aligned} \text{DC}(u)=\sum \limits _{v\in V{\setminus }\{u\}} a_{uv}. \end{aligned}$$
Degree centrality can be normalized by dividing the above expression by \(n-1\). In a social network, the degree centrality of a node represents that node's popularity. A high-degree node has many followers/friends, which shows the strength of the node. Low-degree nodes are actors who are nearly isolated from the population of the network and are the least popular.

Closeness: Freeman's [10] closeness centrality considers a node's importance to be inversely proportional to the sum of its distances to other nodes. Mathematically, the closeness centrality of a node u can be represented as
$$\begin{aligned} \text{CC}(u)=\frac{1}{\sum \nolimits _{v\in V{\setminus }\{u\}} d_{uv}}. \end{aligned}$$
Closeness centrality can be normalized by multiplying the above expression by \(n-1\). The concept of closeness centrality was first given by Freeman [10] for social networks, but it has existed for a long time as the status of a node [11]. The closeness centrality of a node represents the node's average distance, i.e., the expectation of how much time a piece of information that started flowing from any node in the network will take to reach that node, or vice versa. A node with high closeness centrality gets updated very early when some information is spreading, while nodes with low closeness centrality have to wait longer to receive the flowing information.

Harmonic: harmonic centrality [5, 12] considers a node's importance in a network to be proportional to the sum of the inverses of its distances from other nodes. Mathematically, the harmonic centrality of a node u is
$$\begin{aligned} {\text{HC}}(u)=\sum \limits _{v\in V{\setminus }\{u\}} \frac{1}{ d_{uv}}. \end{aligned}$$
Freeman's [10] closeness centrality is not applicable to disconnected networks. Harmonic centrality is highly correlated with closeness centrality [12] and works well on disconnected and directed networks.

Decay: this centrality works on the same principle as harmonic centrality, but in place of penalizing the contribution of nodes linearly, it does so exponentially [3]. Mathematically, the decay centrality of a node u is
$$\begin{aligned} \text{DKC}(u)=\sum \limits _{v\in V{\setminus }\{u\}} \delta ^{d_{uv}}, \end{aligned}$$
where \(\delta\), called the decay parameter, lies between 0 and 1.

Betweenness: Freeman's [10] betweenness centrality considers a node's importance to be proportional to the number of times that node occurs on the shortest paths between all possible pairs of nodes in a network. Mathematically, the betweenness centrality of a node u is
$$\begin{aligned} \text{BC}(u)=\sum \limits _{s,t\in V: |\{s,t,u\}|=3}\frac{\sigma _{\text{st}}(u)}{\sigma _{\text{st}}}, \end{aligned}$$
where \(\sigma _{\text{st}}\) is the total number of shortest paths from node s to node t, and \(\sigma _{\text{st}}(u)\) is the number of those paths passing through node u. Betweenness centrality can be normalized by dividing by \(\frac{(n-1)(n-2)}{2}\). The concept of betweenness centrality was first proposed by Anthonisse [13] and Freeman [14] independently. The betweenness centrality of a node represents the node's brokerage power, i.e., control over the flow passing through the network, under the assumption that information flows through shortest paths. A node with high betweenness centrality controls a major fraction of the flow passing through the network, while a node with low betweenness centrality has nearly no such control.

Eigenvector: Bonacich's [15] eigenvector centrality considers a node's importance in a network to be proportional to the sum of the importance of its neighboring nodes. Mathematically, the eigenvector centrality of a node u can be written as
$$\begin{aligned} \text{EC}(u)=\frac{1}{\lambda }\sum \limits _{v\in V{\setminus }\{u\}}( a_{uv} \cdot \text{EC}(v)), \end{aligned}$$
where \(\lambda\) is the largest eigenvalue of the adjacency matrix A. The eigenvector centrality of a node represents the power of the node's neighbors in the network. A node with high eigenvector centrality is directly connected to important nodes in the network, and vice versa.
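The first four of these measures follow directly from BFS distances. The following Python sketch (a minimal illustration on a hypothetical four-node graph, not one of the paper's example networks) computes degree, closeness, harmonic, and decay centrality:

```python
from collections import deque

# Hypothetical toy graph as an adjacency list (illustrative only).
graph = {
    "A": ["B", "C"],
    "B": ["A", "C", "D"],
    "C": ["A", "B"],
    "D": ["B"],
}

def bfs_distances(graph, source):
    """Shortest-path distances from source in an unweighted graph."""
    dist = {source: 0}
    q = deque([source])
    while q:
        u = q.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def degree(graph, u):
    """DC(u): number of links incident to u."""
    return len(graph[u])

def closeness(graph, u):
    """CC(u): reciprocal of the sum of distances to all other nodes."""
    d = bfs_distances(graph, u)
    return 1.0 / sum(d[v] for v in d if v != u)

def harmonic(graph, u):
    """HC(u): sum of reciprocals of distances to all other nodes."""
    d = bfs_distances(graph, u)
    return sum(1.0 / d[v] for v in d if v != u)

def decay(graph, u, delta=0.5):
    """DKC(u): contributions decay exponentially with distance."""
    d = bfs_distances(graph, u)
    return sum(delta ** d[v] for v in d if v != u)

print(degree(graph, "B"))    # 3
print(harmonic(graph, "B"))  # 3.0
```

Node B is adjacent to all other nodes in this toy graph, so its harmonic centrality equals its degree; for peripheral node D the two diverge.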
Toy examples
Consider the example networks given in Fig. 1. Let us assume that the first network in Fig. 1 is a road network connecting the cities of a state, where the size of a node represents its weight (in this case, population). City F is highly populated, while the population in the other cities is significantly lower, several times smaller than that of city F. The average-sum facility location problem asks for a node in a given network that is at the minimum average distance to all other nodes. Now, if we attempt to solve the average-sum facility location problem to install a facility for the whole population of the state, the solution is the node with the highest closeness centrality, assuming an equal population in each city. By the definition of closeness centrality, the answer is city C, but in reality this city is not really suitable for the whole population. The answer is city F, where most of the population of the state already resides. Now, suppose we are tackling a problem where the goal is to find a city from which the maximum population is exactly one hop away. Degree centrality seems to solve this problem in unweighted networks, but here it fails: degree centrality ranks city C as the most central, but because of the non-uniform distribution of population, city E is the correct answer.
Next, let us assume that the second example network in Fig. 1 is a communication network where the size of a node denotes its weight (in this case, we can assume importance/vitality). We assume that communication happens through shortest paths and that the importance of a communication is a function of the importance of the nodes between which the communication is happening. Here, we assume that the function multiplies the importance values of the communicating nodes. Now, the goal is to compute the node which controls the most important communications in the network. Applying betweenness centrality, a tool directly used for this type of problem, yields node C, neglecting the importance attribute. In reality, node J is the answer, because it controls the communication between the most important nodes in the given network. We note that the existing centrality measures neglect node weights and, due to this, are prone to analyzing fully weighted networks incorrectly. Therefore, there is a need to upgrade the current definitions of unweighted and edge-weighted centrality measures.
Nodeweighted centrality
This section summarizes centrality measures that take into consideration the weights on the nodes while giving equal priority to the edges in a given network. These fall in the category of node-weighted centrality measures. We restrict ourselves to these here to avoid the complexity of including weights on both edges and nodes. Once these measures are understood, we recommend that readers combine them with the edge-weighted centrality measures given in [5] to derive definitions for fully weighted centrality measures. Recall \(G=(V,E)\) as an undirected unweighted graph. We add an extra element, weights on the nodes, defined as a function \(W: V \rightarrow R\), where R is the set of real numbers. Let \(W_x\) be the weight given to node x. We do not use \(W_x\) directly in the definitions; in place of it, we use a function f of \(W_x\) (or a function of \(W_x\) and \(W_y\), depending on the number of parameters) without loss of generality. This gives us the flexibility to tune the function of the node weights according to our needs. In this paper, we take \(f(W_x) = W_x\) for simplicity. To normalize the new centrality measures, we divide by the maximum possible value that any node can score in the numerator of the formulas given below.

Eigenvector centrality is a measure for which it is still open how to include the effect of node weights. Even if we start the eigenvector centrality computation with a vector comprising the weights on the nodes, by the time convergence occurs, we arrive at the same eigenvector as the solution every time. This is because the computation of this particular centrality depends only on the adjacency matrix, and the eigenvector is a property of that matrix alone. One way around this is to follow the idea given in [9], where the \(\beta\)th power of the weight of each node is multiplied by its unweighted centrality to compute a node-weighted centrality. Here, \(\beta\) takes a value from the interval \([-1, 1]\). We start with degree centrality.

Node-weighted degree centrality: in [6], weights on nodes are considered and the definition of degree centrality is modified to accommodate the node weights. Abbasi and Hossain [6] considered centrality scores as weights on the nodes. Following it, the node-weighted degree centrality of a node u is calculated as
$$\begin{aligned} \frac{\sum \nolimits _{v\in V {\setminus } \{u\}} (f (W_v)\cdot a_{u,v})}{ \sum \limits _{x\in V} f(W_x) }. \end{aligned}$$
This measure assigns higher importance to nodes that are in the immediate neighborhood of highly weighted nodes. The next two measures extend the consideration to all nodes when computing the eminence of a node.

Node-weighted closeness/harmonic centrality: to target wider applicability, we define node-weighted harmonic centrality (which can also be used in the case of closeness centrality, as the two are highly correlated). The node-weighted harmonic centrality of a node u in a network is defined as
$$\begin{aligned} \frac{f(W_u)+\sum \nolimits _{v\in V{\setminus }\{u\}}\frac{f(W_v)}{d_{u,v}+1}}{\sum \nolimits _{x\in V}f(W_x)}. \end{aligned}$$
This measure (NWCC) depends on two factors: the weight of the node u under consideration, and the effective weights of the other nodes relative to their distances from node u. It assigns a higher value to nodes that have high weights and are close to other nodes with high weights. We refer to this measure as the harmonically attenuated node-weighted centrality measure.

Node-weighted decay: the node-weighted decay of a node u in a network is defined as
$$\begin{aligned} \frac{f(W_u)+\sum \nolimits _{v\in V{\setminus }\{u\}} (\delta ^{d_{u,v}}\cdot f(W_v))}{\sum \nolimits _{x\in V} f(W_x)}, \end{aligned}$$
where \(\delta\) lies between 0 and 1. Here also, the computation of importance depends on the same two factors as in NWCC, but the contribution of the weights of other nodes decays exponentially with distance. Node-weighted decay assigns a higher value to nodes that have high weights and are very close to other nodes with high weights. We refer to this measure as the exponentially attenuated node-weighted centrality measure.

Node-weighted betweenness centrality: in [16], a factor denoting the importance of communication between a pair of nodes is multiplied into the core formula while computing betweenness centrality. The node-weighted betweenness centrality of a node u is defined as
$$\begin{aligned} \frac{\sum \nolimits _{s,t\in V: |\{s,t,u\}|=3}f(W_s,W_t)\cdot \frac{\sigma _{\text{st}}(u)}{\sigma _{\text{st}}}}{\sum \nolimits _{x,y\in V: |\{x,y,u\}|=3}f(W_x,W_y)}, \end{aligned}$$
where \(f(W_x , W_y)\) maps the weights given on nodes x and y to a real value denoting the importance of the flow between x and y.
In this paper, we do not explore these measures in the experimental section; we plan a thorough application-oriented study of them in the near future. We present the definitions for the sake of completeness with respect to the major traditional centrality measures.
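The node-weighted harmonic and decay formulas above can be evaluated directly once the BFS distances are known. The following Python sketch (with \(f(W_x)=W_x\), as assumed in the paper; the three-node path and its weights are purely illustrative) implements both:

```python
from collections import deque

# Hypothetical path graph with illustrative node weights (e.g., populations).
graph = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
weights = {"A": 1.0, "B": 2.0, "C": 1.0}

def bfs_distances(graph, source):
    """Shortest-path distances from source in an unweighted graph."""
    dist = {source: 0}
    q = deque([source])
    while q:
        u = q.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def nw_harmonic(graph, weights, u):
    """Node-weighted harmonic centrality with f(W) = W."""
    d = bfs_distances(graph, u)
    num = weights[u] + sum(weights[v] / (d[v] + 1) for v in graph if v != u)
    return num / sum(weights.values())  # normalize by total weight

def nw_decay(graph, weights, u, delta=0.5):
    """Node-weighted decay centrality with f(W) = W."""
    d = bfs_distances(graph, u)
    num = weights[u] + sum((delta ** d[v]) * weights[v] for v in graph if v != u)
    return num / sum(weights.values())

print(nw_harmonic(graph, weights, "B"))  # 0.75
```

The heavy middle node B scores 0.75, while the lighter endpoint A scores 7/12, reflecting both B's weight and its proximity to the remaining weight.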
Related work
Centrality measures are tools to find the application-specific importance of a node. The unweighted centrality measures mentioned earlier are the most widely used, but for complex problems and applications, these measures are insufficient. In such cases, a combination of these centrality measures produces a better analysis than using them individually. In a recent study [6], the authors proposed a new set of hybrid centralities: degree-degree, degree-closeness, and degree-betweenness. They noticed that the newly defined centrality measures differ from the traditional ones. On a real-world co-authorship network, they found that all the newly defined hybrid centrality measures are significantly correlated with authors' performance measures (citation counts and h-index). A similar study [7] on weighted collaboration networks proposed three new sets of collaborative centrality measures, built from the traditional collaborative centrality measures for an author node (h-index, a-index, g-index). The newly defined measures were highly correlated with the traditional performance measures of scholars (publication count, citation count, h-index, a-index, g-index). Zhang et al. [17] proposed a new hybrid centrality measure for a node's importance in satellite communication networks. This measure is also based on the combination of closeness and betweenness centrality, but their measure punishes the betweenness importance with a different factor. Qiao et al. [18] proposed a hybrid page scoring algorithm based on degree centrality, betweenness centrality, closeness centrality, and the PageRank algorithm. Lee and Djauhari [19] proposed a hybrid centrality measure that is a linear combination of the degree, closeness, betweenness, and eigenvector centrality measures; the proposed measures were used to find the most significant or influential stocks.
A hybrid centrality based on the linear combination of degree centrality and cohesion centrality was proposed by Qiu et al. [20] and further used for community detection by LiQing et al. [21]. Wang et al. [22] proposed a hybrid centrality measure based on the combination of degree, the degree of neighbors, and betweenness. In another study, Buechel and Buskens [23] analyzed a hybrid centrality model as a combination of extended closeness (a variation of closeness for disconnected networks), betweenness, and degree measures. None of the above studies attempts to solve the service coverage problem.
The service coverage problem
This section covers the first of the two applications considered in this paper, the service coverage problem (SCP). We present two new hybrid measures to solve SCP, a type of complex facility location problem. We define these measures for networks without weights on the edges. They can be further extended to edge-weighted versions of the hybrid centrality measures following the ideas for edge-weighted centrality measures given in [5].
Flow networks are networks where information, people, commodities, etc., move from one node to another while traversing the links. Each node starts with a limited bandwidth of resources for conveying the flow, and these resources are prone to collapse/degrade/reduce over time due to the continuous participation of the node in fostering the flow. Consequently, such networks require uninterrupted maintenance/replenishment service on a regular basis for proper functioning. Keeping this in mind, service facilities are installed at nodes to meet the service demands of other nodes from time to time.
After a service request is made by some node, for more resources or for reviving the node after a failure, a service team is sent from the service center to meet the demand. The response-time is defined as the time between the request for service and the start of the service work on the demanding node. The response-time depends on the distance between the node requesting the service and the node hosting the service station, under the assumption that the service centers have sufficient resources to meet the demands of other nodes. The response-efficiency of a node is inversely correlated with the response-time, i.e., when the response-time is lowest, the node is maximally response-efficient and most suitable for installing service stations. A node with a higher response-time has a smaller response-efficiency and is not an appropriate choice for installing service stations.
Given an unweighted network, the objective of the service coverage problem (SCP) is to find the best-suited node at which to install a service station such that the expected response-time is minimized. In other words, the goal is to find the node with the highest expected response-efficiency, i.e., the service should reach faster those nodes that request service more often than nodes with moderate or seldom demands. Therefore, service centers should be installed closer to the nodes with frequent demands.
Randomly choosing a node at which to install the service facility is certainly not the best solution, because the randomly picked node may be very far from the nodes with higher demand. In such a case, if a node with a higher load fails, severe damage may already have been caused before the maintenance service reaches it. Thus, we require a measure to evaluate the suitability of nodes as candidates for installing stations covering service requests.
Betweenness centrality is used to predict the importance of nodes in a number of flow-based networks, e.g., power grid networks, public-transit networks, gas line networks, and communication networks. The betweenness scores of the nodes in such networks have been treated as the load on the nodes in much of the literature. Several models for understanding and replicating cascading phenomena have been proposed [24-26]. A few of these models observe that the failure of a node with a high load may cause a severe cascading failure and hence result in the breakdown of the whole network system. After a node fails, the best way to reduce the damage is to recover the failed node as soon as possible. We cannot prevent a node from failing, but we can certainly put a maintenance mechanism on some of the nodes in the network.
Formulation
In the service coverage problem in flow networks, the load of the nodes (when no other information is provided) can be assumed to be proportional to the betweenness centrality of the nodes. This is because a node with a large flow through it is expected to degrade faster than other nodes, and if such a node shuts down, a large amount of flow will be affected. Thus, we take the probability of node j requesting a maintenance service over a fixed time-interval as
$$\begin{aligned} \text{Pr[Node}\; j\; \text{requests\; service]} =\frac{\text{BC}(j)}{\sum _{k\in V}{\text{BC}(k)}}, \end{aligned}$$
where BC(j) is the betweenness centrality of node j. Let \(X_{ij}\) be the response-efficiency of a node i in meeting a service demand from node j if the service station is installed at node i. Certainly, \(X_{ij}\) depends on the distance between the node requesting the service (node j) and the node with the service station (node i). Harmonic decay can be considered in applications where the nodes are more robust and the service requirement is not urgent, while exponential decay may be a more appropriate simplified model in applications like real-time systems, where service requests need to be met urgently.

If the response-efficiency decays harmonically in a given application, i.e., a node is maximally efficient (efficiency value = 1) for itself, half efficient (efficiency value = 0.5) for each of its neighbors, and so on, then we can set the response-efficiency to \(X_{ij}=\frac{1}{d_{i,j}+1}\). Recall that \(d_{u,v}\) denotes the distance, i.e., the length of a shortest path, from node u to node v in the given network. Let \(\chi _i\) be the total response-efficiency of node i if the service station is installed at node i. Then,
$$\begin{aligned} \chi _i=\sum _{j\in V} X_{ij}\cdot \text{Pr[Node}\; j\; \text{requests\; service]}. \end{aligned}$$
The expected response-efficiency of node i, \(E[\chi _i]\), is computed by taking the expectation over that node's response-efficiency in servicing all the nodes of the network:
$$\begin{aligned} E[\chi _i]=\sum _{j\in V} \frac{1}{d_{i,j}+1}\cdot \frac{\text{BC}(j)}{\sum _{k\in V}{\text{BC}(k)}}. \end{aligned}$$
In some complex systems, the response-time plays a very crucial role in the functionality of the network, and the response-efficiency decays exponentially. In such networks, the response-efficiency decays faster, i.e., a node is maximally efficient (efficiency value = 1) for itself, while the efficiency decays by a multiplicative factor \(\alpha\) (efficiency value = \(\alpha\)) for each of its neighbors, and so on. Here, \(\alpha\) is a severity factor that lies in the interval [0, 1]. Then we can set the response-efficiency to \(X_{ij}=\alpha ^{d_{i,j}}\). Similar to the previous analysis, the expected efficiency of a node i as a service station, when efficiency decays exponentially by a factor \(\alpha\), is \(E[\chi _i]=\sum _{j\in V} \alpha ^{d_{i,j}}\cdot \frac{\text{BC}(j)}{\sum _{k\in V}{\text{BC}(k)}}\).

Let m be the number of links and n the number of nodes in a given network. The overall time to compute the proposed hybrid centrality measures is O(mn) in unweighted graphs and \(O(mn+n^2\log n)\) in weighted graphs, due to the time complexity of computing betweenness centrality [27]. Efficient algorithms for updating and estimating betweenness centrality in dynamic and large networks are discussed in [28-30]. We define the following hybrid centrality measures based on the above formulation:

Harmonically attenuatedbetweenness centrality (HABC): this measure is based on the harmonic attenuation of importance with respect to the distance. The harmonically attenuatedbetweenness centrality of a node u is defined as,where BC(u) is the unweighted betweenness centrality of node u. This hybrid measure assigns a value that is calculated based on an averaging function of distances and betweenness centrality. This measure assigns a higher value to the nodes that are of high betweenness and closer to other high betweenness central nodes. This measure solves the SCP problem in flow networks where the responseefficiency decays harmonically.$$\begin{aligned} \text{HABC}(u)=\frac{\text{BC}(u)+\sum _{v\in V{\setminus }\{u\}} \frac{\text{BC}(v)}{d_{u,v}+1}}{\sum _{w\in V} \text{BC}(w)}, \end{aligned}$$

Exponentially attenuatedbetweenness centrality (EABC): this measure is based on an exponential attenuation in the power of distance. The exponentially attenuatedbetweenness centrality of a node u is defined as,where BC(u) is the betweenness centrality of node i. \(\alpha\) is the attenuation factor that is used in the exponential function to the power of distance. It is used to sharply punish the service time. This hybrid measure assigns a value that is calculated based on betweenness centrality of nodes and an exponential function in the power of distances. This measure assigns a higher value to the nodes that are of high betweenness and very close to other high betweenness central nodes. This centrality measure is a sharp attenuation variant of harmonically attenuateddegree centrality. This measure solves the service coverage problem when the service is required on a very urgent basis.$$\begin{aligned} \text{EABC}(u)=\frac{\text{BC}(u)+\sum _{v\in V{\setminus }\{u\}} \alpha ^{d_{u,v}}\cdot \text{BC}(v)}{\sum _{w\in V} \text{BC}(w)}, \end{aligned}$$
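Both measures reduce to one betweenness computation followed by one BFS per node, since the \(\text{BC}(u)\) term is just the \(d=0\) case of the attenuated sum. The paper's experiments use the NetworkX library; the following is a minimal pure-Python sketch with illustrative function names, not the authors' code:

```python
from collections import deque

def betweenness(adj):
    """Unnormalized betweenness centrality via Brandes' algorithm
    (undirected, unweighted graph given as {node: [neighbors]})."""
    bc = dict.fromkeys(adj, 0.0)
    for s in adj:
        stack, pred = [], {v: [] for v in adj}
        sigma = dict.fromkeys(adj, 0.0); sigma[s] = 1.0
        dist = dict.fromkeys(adj, -1); dist[s] = 0
        q = deque([s])
        while q:                                  # BFS counting shortest paths
            v = q.popleft(); stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1; q.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]; pred[w].append(v)
        delta = dict.fromkeys(adj, 0.0)
        while stack:                              # back-propagate dependencies
            w = stack.pop()
            for v in pred[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return {v: c / 2 for v, c in bc.items()}      # undirected: halve

def bfs_distances(adj, s):
    """Hop distances from s; unreachable nodes are simply absent."""
    dist = {s: 0}; q = deque([s])
    while q:
        v = q.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1; q.append(w)
    return dist

def habc(adj):
    """HABC(u) = [BC(u) + sum_{v != u} BC(v)/(d(u,v)+1)] / sum_w BC(w);
    since d(u,u) = 0, the BC(u) term folds into the sum."""
    bc = betweenness(adj)
    total = sum(bc.values()) or 1.0               # guard: all-zero BC
    return {u: sum(bc[v] / (d + 1) for v, d in bfs_distances(adj, u).items()) / total
            for u in adj}

def eabc(adj, alpha=0.5):
    """EABC(u) = [BC(u) + sum_{v != u} alpha^d(u,v) * BC(v)] / sum_w BC(w)."""
    bc = betweenness(adj)
    total = sum(bc.values()) or 1.0
    return {u: sum(alpha ** d * bc[v] for v, d in bfs_distances(adj, u).items()) / total
            for u in adj}
```

On a 5-node path graph, for instance, the middle node maximizes both scores, matching the intuition that HABC and EABC favor nodes that are high-betweenness and close to other high-betweenness nodes.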
Experimental results for service coverage problem
In this section, we discuss the simulation results on various real-world networks. First, we describe the experimental setup. Then, we list the data sets used for experimentation. Next, we compare the traditional centrality measures, degree (DC), closeness (CC), betweenness (BC), and eigenvector (EC), with the proposed hybrid centrality measures [HABC, EABC (\(\alpha =0.5\)), EABC (\(\alpha =0.75\))] on the considered networks. The experiments are performed on a Linux system with 8 GB RAM and an Intel Core i5-8250U CPU at 1.6 GHz. The implementation is done in Python 3.6.7 with the help of the NetworkX library.
Considered real-world data sets: the proposed solution to the service coverage problem discussed in this paper hybridizes betweenness centrality with closeness/harmonic and decay centrality. Betweenness centrality is first used to compute the load on each node; we have therefore selected 21 real-world flow networks. We provide a brief summary of these networks in Table 1; see [31‐39] for detailed descriptions. We have considered various types of transport networks, energy networks, an Internet peer-to-peer network, etc. The columns of Table 1 consist of the names of the network instances, the number of nodes (n), the number of edges (m), the average degree (Avg. deg.), the density of the network, the average clustering coefficient (\(\hat{C}\)), the degree assortativity, the size of the maximum clique, and the network type, respectively.
Table 1
Considered real-world networks
Instance name  n  m  Avg. deg.  Density  \(\hat{C}\)  Assortativity (D)  Max clique  Network type 

Chicago Road [36]  1467  1298  1.7696  0.0012  0.0001  \(-\) 0.5049  2  Transport network 
Chilean Power Grid [38]  347  444  2.5591  0.0074  0.0865  \(-\) 0.0773  4  Energy network 
Euroroad [36]  1174  1417  2.414  0.0021  0.0167  0.1267  3  Transport network 
London Metro [34]  266  308  2.3158  0.0087  0.0363  0.1531  3  Transport network 
Madrid Metro [34]  209  240  2.2967  0.011  0.0056  0.1776  3  Transport network 
Mexico Metro [34]  147  164  2.2313  0.0153  0.0034  \(-\) 0.1105  3  Transport network 
Minnesota Road [31]  2642  3303  2.5004  0.0009  0.016  \(-\) 0.1848  3  Transport network 
Moscow Metro [34]  134  156  2.3284  0.0175  0.0174  0.4492  3  Transport network 
New York Metro [34]  433  475  2.194  0.0051  0.0173  \(-\) 0.0297  3  Transport network 
Oldenburg Road [32]  6105  7029  2.3027  0.0004  0.0108  0.0554  3  Transport network 
Openflights [34]  2939  15677  10.6683  0.0036  0.4526  0.0509  22  Transport network 
Osaka Metro [34]  108  123  2.2778  0.0213  0.0001  0.1965  2  Transport network 
p2pGnutella08 [37]  6301  20777  6.5948  0.001  0.0109  0.0356  5  Internet peer-to-peer network 
Paris Metro [34]  299  356  2.3813  0.008  0.0204  \(\) 0.0297  3  Transport network 
as20000102 [37]  6474  13895  4.2926  0.0007  0.2522  \(-\) 0.1704  10  Autonomous systems graph 
Seoul Metro [34]  392  437  2.2296  0.0057  0.006  0.0313  3  Transport network 
Shanghai Metro [34]  148  158  2.1351  0.0145  0.0029  \(-\) 0.0318  3  Transport network 
Tokyo Metro [34]  217  262  2.4147  0.0112  0.0237  0.1983  3  Transport network 
US Air [35]  332  2126  12.8072  0.0387  0.6252  \(-\) 0.2079  22  Transport network 
US Power Grid [39]  4941  6594  2.6691  0.0005  0.0801  0.0035  6  Energy network 
NRPG data [33]  246  373  3.0325  0.0124  0.1071  \(-\) 0.0932  3  Energy network 
Simulation: we have conducted simulations to evaluate the performance of the traditional centrality measures and the hybrid measures proposed in this paper; the results are summarized in Table 2. The performance of a centrality measure is evaluated by computing the expected response time, i.e., the average distance of the service-requesting nodes from the top central node as per that measure. In Table 2, we emphasize in bold the best expected response time for each network instance of Table 1 when a service maintenance center is installed on the node picked according to the considered centrality measures.
Table 2
Expected response-time of the proposed hybrid measures and the traditional centrality measures when the most central node is made the service center
Instance name  BC  DC  CC  EC  HABC  EABC (\(\alpha =0.5\))  EABC (\(\alpha =0.75\)) 

Chicago Road  0.49  0.49  0.49  0.49  0.49  0.49  0.49 
Chilean Power Grid  0.29  0.33  0.29  0.33  0.29  0.29  0.29 
Euroroad  0.88  0.88  0.89  1.05  0.88  0.88  0.88 
London Metro  0.83  0.83  0.80  0.83  0.80  0.80  0.80 
Madrid Metro  0.46  0.46  0.44  0.66  0.46  0.46  0.42 
Mexico Metro  0.50  0.50  0.54  0.50  0.50  0.50  0.54 
Minnesota Road  2.29  3.21  2.15  3.11  3.29  2.29  3.45 
Moscow Metro  0.40  0.40  0.37  0.39  0.39  0.39  0.37 
New York Metro  1.11  1.3  1.12  1.3  1.11  1.11  1.11 
Oldenburg Road  1.66  1.66  1.68  2.64  1.66  1.66  1.66 
Openflights  0.16  0.16  0.16  0.16  0.16  0.16  0.16 
Osaka Metro  0.48  0.48  0.44  0.48  0.48  0.48  0.42 
p2pGnutella08  0.29  0.27  0.27  0.27  0.27  0.27  0.27 
Paris Metro  0.55  0.55  0.55  0.56  0.51  0.51  0.51 
as20000102  0.12  0.12  0.12  0.12  0.12  0.12  0.12 
Seoul Metro  1.06  1.25  1.06  1.23  1.02  1.02  1.02 
Shanghai Metro  0.53  0.53  0.64  0.53  0.64  0.53  0.64 
Tokyo Metro  0.57  0.57  0.57  0.57  0.57  0.57  0.49 
US Air  0.10  0.10  0.10  0.10  0.10  0.10  0.10 
US Power Grid  0.92  1.34  0.83  2.12  0.83  0.83  0.83 
NRPG  0.30  0.45  0.30  0.37  0.30  0.30  0.30 
Betweenness centrality has been used as one of the best measures to map loads onto nodes in flow networks [24]. A node with a higher load requests services more frequently; therefore, we take the probability of a node requesting a service to be proportional to its betweenness centrality. The expected response time is computed over \(\left\lceil n/10 \right\rceil\) service requests, where n is the number of nodes in the respective real-world network. The first column of the table contains the labels of the considered real-world network instances. The next columns list the expected response time under the traditional centrality measures (BC, CC, DC, EC) and the proposed hybrid measures (HABC, \(\hbox {EABC}(\alpha =0.5)\), and \(\hbox {EABC} (\alpha =0.75)\)).
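The evaluation protocol above can be sketched as a small Monte-Carlo routine. The trial count, seed, and the hand-filled BC values in the usage below are illustrative assumptions, not the paper's settings:

```python
import math
import random
from collections import deque

def bfs_distances(adj, s):
    """Hop distances from s in an unweighted graph {node: [neighbors]}."""
    dist = {s: 0}; q = deque([s])
    while q:
        v = q.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1; q.append(w)
    return dist

def expected_response_time(adj, center, bc, trials=2000, seed=42):
    """Estimate the expected response time when `center` hosts the service:
    each trial draws ceil(n/10) requests, node j requesting with probability
    BC(j)/sum_k BC(k), and averages the requesters' distance to the center."""
    rng = random.Random(seed)
    dist = bfs_distances(adj, center)
    nodes = list(adj)
    weights = [bc[v] for v in nodes]
    k = math.ceil(len(nodes) / 10)
    total = 0.0
    for _ in range(trials):
        requests = rng.choices(nodes, weights=weights, k=k)
        total += sum(dist[j] for j in requests) / k
    return total / trials
```

On the 5-node path graph with (unnormalized) BC values (0, 3, 4, 3, 0), the estimate for center 2 converges to \(0.3\cdot 1 + 0.4\cdot 0 + 0.3\cdot 1 = 0.6\).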
It is evident from Table 2 that no traditional measure consistently finds an ideal service center, while at least one of the proposed hybrid centrality measures is always able to minimize the average response time on every considered network. In some cases, such as Madrid Metro, Paris Metro, Osaka Metro, Seoul Metro, and Tokyo Metro, the proposed measures even outperform all the traditional measures.
Node ranks: here, we discuss the ranking of nodes by the various centrality measures. Table 3 lists the top three central nodes on the considered real-world networks of Table 1 using the proposed hybrid measures and the traditional centrality measures. Due to space constraints, only the top three central nodes are presented in the paper. The first column of the table contains the labels of the considered real-world network instances. The following columns, in groups of three, list the top three central nodes using the proposed hybrid measures (HABC, \(\hbox {EABC}(\alpha =0.5)\), and \(\hbox {EABC} (\alpha =0.75)\)) and the traditional centrality measures (BC, CC, DC, EC).
Table 3
Ranking list of the top three central nodes using the proposed hybrid measures and the traditional centrality measures on the considered real-world networks given in Table 1
Instance name  HABC  EABC (\(\alpha =0.5)\)  EABC (\(\alpha =0.75)\)  BC  CC  DC  EC  

R1  R2  R3  R1  R2  R3  R1  R2  R3  R1  R2  R3  R1  R2  R3  R1  R2  R3  R1  R2  R3  
Chicago Road  1156  1157  1147  1156  1157  1147  1156  1157  1147  1156  1157  1147  1156  1157  1147  552  922  10  1156  1147  1150 
Chilean Power Grid  178  105  147  178  105  147  178  105  147  178  105  108  178  105  147  108  6  178  108  264  29 
Euroroad  402  284  277  402  284  277  402  284  277  402  284  277  401  402  403  284  7  39  7  43  454 
London Metro  115  29  12  115  29  12  115  29  12  12  115  180  115  29  214  12  164  305  214  115  219 
Madrid Metro  80  162  146  80  162  146  162  80  146  80  146  162  162  80  235  80  15  9  15  54  40 
Mexico Metro  26  70  23  26  70  78  70  26  54  26  70  25  70  26  23  26  127  61  26  70  23 
Minnesota Road  300  328  280  1821  2069  300  280  328  300  1821  2069  2063  1356  1820  1109  2418  35  32  1927  1930  1987 
Moscow Metro  59  71  25  59  71  25  71  59  73  25  74  59  73  71  25  24  59  21  59  25  73 
New York Metro  424  191  410  424  191  410  424  191  410  424  191  148  84  424  410  273  228  410  273  228  188 
Oldenburg Road  831  714  731  831  714  711  831  714  803  831  714  593  714  731  712  831  1657  1571  1677  1683  1691 
Openflights  53  65  57  53  57  65  53  57  65  53  65  1576  53  65  57  53  65  59  53  59  65 
Osaka Metro  25  70  95  25  70  95  70  95  25  25  99  70  95  70  25  25  54  42  25  81  54 
p2pGnutella08  123  367  127  123  367  127  123  127  367  5831  1317  424  123  127  1317  123  127  367  367  123  145 
Paris Metro  150  57  235  150  57  235  150  57  235  57  150  106  57  150  235  57  177  246  186  150  235 
as20000102  2  10  7  2  10  7  2  10  7  2  7  10  2  10  7  2  10  7  2  10  7 
Seoul Metro  220  480  357  220  444  391  220  480  357  220  444  391  220  480  357  254  444  391  103  254  125 
Shanghai Metro  136  16  113  136  16  213  136  113  16  16  136  65  136  113  97  16  136  220  16  137  138 
Tokyo Metro  150  85  53  150  85  53  85  150  60  150  53  85  150  60  85  150  53  120  150  120  36 
US Air  118  261  201  118  201  47  118  201  47  118  8  261  118  261  67  118  261  255  118  261  255 
US Power Grid  1378  1365  2685  1378  1365  2685  1378  1365  1678  651  559  1365  1378  1678  2944  2847  602  932  4422  4436  4419 
NRPG data  233  235  238  233  235  238  233  235  238  70  235  213  233  238  70  194  65  56  239  213  215 
Finding only the most central node might not be useful when some constraint prevents the top central node from being turned into a service facility. In such cases, identifying the top-k potential centers is important. The experimental results make it evident that the proposed measures are genuine hybridizations of closeness centrality and betweenness centrality.
We have emphasized the top-ranking node in bold for a few entries in the table to exhibit this phenomenon. For some networks, the top-ranking node as per HABC is the same as per BC but not as per CC, and vice versa. The rankings on two networks (Minnesota Road and Moscow Metro) also provide evidence that the top-ranking nodes under the proposed centrality measures differ from those under closeness and betweenness centrality. In addition, another network (Openflights) shows that the proposed measures also rank nodes differently from each other.
It is thus evident that no single standard centrality measure is a perfect substitute for our proposed centrality measures. As our measures follow directly from the requirements of the service coverage problem, the existing centrality measures are not sufficient to solve the problem adequately.
We present Spearman’s rank correlation coefficient, Kendall’s rank correlation coefficient, and Fagin’s intersection metric [40] to evaluate the correlation between the traditional centrality measures and the proposed hybrid centrality measures in Tables 4, 5 and 6, respectively. For our purposes, we compute Fagin’s intersection for the top 1000 ranks in each network; if a network has at most 1000 nodes, the intersection is computed over all its nodes. The first column of the tables contains the labels of the considered real-world network instances. The next four columns comprise the rank correlation between HABC and the traditional centrality measures (BC, CC, DC, EC). Similarly, the following four columns contain the rank correlation values between \(\hbox {EABC}(\alpha =0.5)\) and BC, CC, DC, EC, respectively. The last four columns give the rank correlation between \(\hbox {EABC}(\alpha =0.75)\) and the traditional centrality measures.
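The two ranking-comparison statistics used here can be sketched as below. This is a simplified version that assumes all centrality scores are distinct (no tie correction, unlike library routines such as scipy.stats.spearmanr); the Fagin metric averages the fractional overlap of the two top-d prefixes over depths d = 1..k:

```python
def spearman_rho(scores_a, scores_b):
    """Spearman's rank correlation: rho = 1 - 6*sum(d^2) / (n(n^2 - 1)).
    Assumes distinct scores, i.e., no tie correction."""
    nodes = list(scores_a)
    def rank(scores):
        order = sorted(nodes, key=lambda v: -scores[v])  # best rank = 0
        return {v: i for i, v in enumerate(order)}
    ra, rb = rank(scores_a), rank(scores_b)
    n = len(nodes)
    d2 = sum((ra[v] - rb[v]) ** 2 for v in nodes)
    return 1 - 6 * d2 / (n * (n * n - 1))

def fagin_intersection(list_a, list_b, k):
    """Fagin's intersection metric between two ranked lists:
    |A_d ∩ B_d| / d averaged over depths d = 1..k."""
    return sum(len(set(list_a[:d]) & set(list_b[:d])) / d
               for d in range(1, k + 1)) / k
```

Identical rankings give 1.0 under both metrics; fully reversed rankings give \(-1.0\) for Spearman but can still score an intermediate Fagin value, which is why a high top-1000 intersection coexists with near-zero rank correlations in the tables below.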
Table 4
Spearman’s rank correlation between the proposed and the traditional centrality measures on the considered real-world networks given in Table 1
Instance name  HABC  EABC (\(\alpha =0.5)\)  EABC (\(\alpha =0.75)\)  

BC  CC  DC  EC  BC  CC  DC  EC  BC  CC  DC  EC  
Chicago Road  0.2087  0.3688  0.2393  0.2027  0.2040  0.3437  0.2351  0.2024  0.2035  0.3241  0.1985  0.1515 
Chilean Power Grid  0.0691  \(-\) 0.0620  0.0132  0.0057  0.0224  0.0274  0.0506  0.0163  0.0929  0.0867  \(-\) 0.0200  0.0850 
Euroroad  0.7707  0.0144  0.2019  \(-\) 0.0010  0.7985  0.0190  0.1675  \(-\) 0.0174  0.7656  0.0214  0.1834  \(-\) 0.0227 
London Metro  0.1625  0.1450  0.0296  0.0312  \(-\) 0.0291  0.0323  0.0917  0.0636  0.0432  0.1496  \(-\) 0.0837  0.0324 
Madrid Metro  0.0540  0.1561  \(-\) 0.0912  0.0149  0.0284  0.0891  \(-\) 0.0949  0.1213  0.0109  0.0814  \(-\) 0.1316  0.0049 
Mexico Metro  \(-\) 0.0598  0.2043  0.0428  0.1145  0.1267  \(-\) 0.0479  0.0516  0.1401  \(-\) 0.0037  0.2043  \(-\) 0.0716  0.1522 
Minnesota Road  \(-\) 0.0068  \(-\) 0.0358  \(-\) 0.0777  \(-\) 0.1648  0.0286  \(-\) 0.0221  0.0075  \(-\) 0.0393  0.0015  \(-\) 0.0552  0.0345  \(-\) 0.1444 
Moscow Metro  0.0824  0.0474  0.1245  0.0570  \(-\) 0.1814  \(-\) 0.1215  \(-\) 0.2326  0.0627  0.1012  0.0468  0.0331  \(-\) 0.1045 
New York Metro  0.0386  0.0865  \(-\) 0.0061  0.0760  0.0599  0.0394  \(-\) 0.0382  \(-\) 0.0222  0.0083  \(-\) 0.0139  \(-\) 0.0651  \(-\) 0.0156 
Oldenburg Road  0.0098  0.1332  0.0094  \(-\) 0.0222  0.0075  0.0593  0.0202  0.0169  0.0113  0.0991  0.0039  \(-\) 0.0083 
Openflights  0.0018  0.1090  0.0263  0.0799  0.0055  0.1417  0.0318  0.0898  0.0032  0.1305  0.0200  0.0618 
Osaka Metro  0.0110  0.1728  0.0262  0.1440  0.0482  0.1662  0.0402  0.1398  0.0795  0.1031  \(-\) 0.0713  0.0656 
p2pGnutella08  0.0537  0.0760  0.0880  0.0491  0.0636  0.0634  0.0775  0.0485  0.0660  0.0529  0.0754  0.0561 
as20000102  0.1327  0.5878  0.1837  0.4513  0.1408  0.5881  0.1810  0.4423  0.1309  0.6273  0.1912  0.4392 
Seoul Metro  0.0520  \(-\) 0.0055  \(-\) 0.0048  \(-\) 0.0389  0.0257  0.0143  0.0480  0.0775  \(-\) 0.0427  0.0109  \(-\) 0.0643  0.0136 
Shanghai Metro  0.1104  \(-\) 0.0165  0.1168  0.0977  0.0721  0.1124  0.0326  0.1617  0.0014  0.0922  0.1809  0.0610 
US Air  0.0855  0.1707  0.0522  0.0608  0.1171  0.2246  0.0059  0.0674  0.0503  0.2372  0.0243  0.1065 
US Power Grid  0.0450  0.0831  0.0830  0.0437  0.0350  0.0279  0.0345  0.0038  0.0436  0.0598  0.0622  0.0436 
Paris Metro  \(-\) 0.1077  0.0421  \(-\) 0.0625  0.1169  0.0302  0.0846  \(-\) 0.0271  0.0351  0.0175  0.1305  \(-\) 0.0914  0.0309 
Tokyo Metro  0.1693  0.0920  0.0064  0.0460  0.0555  0.0883  0.0958  0.0511  0.0646  0.1071  \(-\) 0.0794  0.1039 
NRPG data  0.0118  0.1334  \(-\) 0.0693  0.0934  0.0445  0.0975  \(-\) 0.1389  0.1395  \(-\) 0.0867  0.1659  \(-\) 0.1322  0.0936 
Average correlation  0.0902  0.1192  0.0444  0.0694  0.0811  0.0966  0.0305  0.0858  0.0744  0.1268  0.0094  0.0574 
Standard deviation  0.1729  0.1441  0.0907  0.1152  0.1802  0.1485  0.1043  0.1042  0.1701  0.1432  0.1069  0.1139 
The Spearman’s and Kendall’s rank correlation coefficients in Tables 4 and 5 make it evident that the rankings produced by the proposed hybrid measures differ from those of the traditional centrality measures. The average correlation coefficient between the proposed and the traditional measures, although positive, is very small, and the standard deviation exceeds the average for most of the rank correlations, owing to several negative correlation values. Based on the average rank correlation values, HABC is best correlated with CC and then with BC among the traditional measures. \(\hbox {EABC}(\alpha =0.75)\) exhibits a similar pattern, as its decay rate is slower than that of \(\hbox {EABC}(\alpha =0.5)\). \(\hbox {EABC}(\alpha =0.5)\) is best correlated with CC and then with EC. The proposed measures are least correlated with DC. The high Fagin’s intersection values are due to the ranks being restricted to the top 1000. Therefore, the existing traditional measures do not provide a solution to the service coverage problem; the top-ranked nodes under the proposed hybrid centrality measures are more appropriate for this application.
Table 5
Kendall’s rank correlation coefficient (Kendall’s tau) between the proposed and the traditional centrality measures on the considered real-world networks given in Table 1
Instance name  HABC  EABC (\(\alpha =0.5)\)  EABC (\(\alpha =0.75)\)  

BC  CC  DC  EC  BC  CC  DC  EC  BC  CC  DC  EC  
Chicago Road  0.2402  0.2919  0.2531  0.1375  0.2493  0.2762  0.2601  0.1197  0.2375  0.2669  0.2267  0.1024 
Chilean Power Grid  0.0461  \(-\) 0.0379  0.0089  0.0045  0.0149  0.0197  0.0318  0.0119  0.0621  0.0651  \(-\) 0.0153  0.0556 
Euroroad  0.7126  0.0099  0.1404  \(-\) 0.0004  0.7461  0.0127  0.1168  \(-\) 0.0115  0.7025  0.0147  0.1278  \(-\) 0.0149 
London Metro  0.1074  0.0972  0.0173  0.0220  \(-\) 0.0200  0.0241  0.0617  0.0445  0.0286  0.1022  \(-\) 0.0568  0.0241 
Madrid Metro  0.0334  0.1058  \(-\) 0.0609  0.0098  0.0215  0.0595  \(-\) 0.0647  0.0846  0.0097  0.0558  \(-\) 0.0885  0.0077 
Mexico Metro  \(-\) 0.0445  0.1386  0.0264  0.0724  0.0860  \(-\) 0.0351  0.0338  0.1026  \(-\) 0.0040  0.1376  \(-\) 0.0566  0.1047 
Minnesota Road  \(-\) 0.0047  \(-\) 0.0216  \(-\) 0.0432  \(-\) 0.1205  0.0190  \(-\) 0.0143  0.0052  \(-\) 0.0264  0.0012  \(-\) 0.0370  0.0232  \(-\) 0.0999 
Moscow Metro  0.0553  0.0315  0.0809  0.0376  \(-\) 0.1274  \(-\) 0.0807  \(-\) 0.1592  0.0439  0.0690  0.0349  0.0250  \(-\) 0.0722 
New York Metro  0.0259  0.0582  \(-\) 0.0040  0.0501  0.0399  0.0265  \(-\) 0.0254  \(-\) 0.0143  0.0067  \(-\) 0.0095  \(-\) 0.0429  \(-\) 0.0094 
Oldenburg Road  0.0066  0.0905  0.0067  \(-\) 0.0146  0.0051  0.0401  0.0136  0.0118  0.0074  0.0671  0.0029  \(-\) 0.0056 
Openflights  0.0053  0.0797  0.0217  0.0571  0.0077  0.1029  0.0249  0.0645  0.0061  0.0936  0.0184  0.0443 
Osaka Metro  0.0097  0.1222  0.0194  0.1038  0.0301  0.1177  0.0280  0.1000  0.0568  0.0723  \(-\) 0.0519  0.0561 
p2pGnutella08  0.0364  0.0514  0.0595  0.0329  0.0433  0.0427  0.0524  0.0326  0.0447  0.0360  0.0513  0.0379 
as20000102  0.0909  0.4913  0.1318  0.3756  0.0963  0.4917  0.1303  0.3696  0.0900  0.5191  0.1367  0.3672 
Seoul Metro  0.0349  \(-\) 0.0052  \(-\) 0.0031  \(-\) 0.0280  0.0165  0.0090  0.0305  0.0527  \(-\) 0.0281  0.0056  \(-\) 0.0439  0.0112 
Shanghai Metro  0.0761  \(-\) 0.0097  0.0798  0.0689  0.0506  0.0735  0.0182  0.1158  0.0029  0.0671  0.1129  0.0421 
US Air  0.0568  0.1184  0.0345  0.0401  0.0782  0.1584  0.0051  0.0473  0.0339  0.1685  0.0186  0.0717 
US Power Grid  0.0304  0.0559  0.0558  0.0294  0.0234  0.0186  0.0229  0.0025  0.0292  0.0406  0.0418  0.0289 
Paris Metro  \(-\) 0.0721  0.0269  \(-\) 0.0417  0.0795  0.0186  0.0589  \(-\) 0.0180  0.0217  0.0119  0.0928  \(-\) 0.0620  0.0184 
Tokyo Metro  0.1150  0.0610  0.0063  0.0311  0.0384  0.0608  0.0626  0.0315  0.0445  0.0742  \(-\) 0.0559  0.0707 
NRPG data  0.0069  0.0888  \(-\) 0.0416  0.0598  0.0283  0.0660  \(-\) 0.0889  0.0972  \(-\) 0.0596  0.1109  \(-\) 0.0914  0.0625 
Average correlation  0.0747  0.0878  0.0356  0.0499  0.0698  0.0728  0.0258  0.0620  0.0644  0.0942  0.0105  0.0430 
Standard deviation  0.1594  0.1167  0.0729  0.0911  0.1684  0.1205  0.0833  0.0830  0.1572  0.1172  0.0835  0.0895 
Table 6
Fagin’s intersection metric between the proposed and the traditional centrality measures on the considered real-world networks given in Table 1
Instance name  HABC  EABC (\(\alpha =0.5)\)  EABC (\(\alpha =0.75)\)  

BC  CC  DC  EC  BC  CC  DC  EC  BC  CC  DC  EC  
Chicago Road  0.6487  0.8122  0.6106  0.8187  0.6639  0.7809  0.6286  0.7873  0.6308  0.8073  0.5992  0.816 
Chilean Power Grid  0.7007  0.9529  0.6461  0.7418  0.7105  0.9212  0.6553  0.7361  0.6912  0.966  0.6398  0.7416 
Euroroad  0.9986  0.5866  0.6477  0.4884  0.9992  0.5869  0.648  0.4888  0.9983  0.5863  0.6477  0.4882 
London Metro  0.6801  0.9379  0.6831  0.8843  0.7217  0.8795  0.7013  0.8351  0.6582  0.9546  0.6729  0.8954 
Madrid Metro  0.7071  0.9205  0.6645  0.7385  0.749  0.8651  0.6862  0.6978  0.6809  0.9407  0.6542  0.7437 
Mexico Metro  0.7546  0.9159  0.6332  0.9014  0.7898  0.8676  0.6409  0.858  0.7289  0.9326  0.618  0.9006 
Minnesota Road  0.2845  0.1173  0.3375  0.0239  0.6407  0.2839  0.3661  0.1631  0.3023  0.1223  0.3434  0.0615 
Moscow Metro  0.7809  0.9413  0.7758  0.9056  0.7967  0.9193  0.7825  0.8875  0.7615  0.9596  0.7769  0.9083 
New York Metro  0.704  0.9081  0.6248  0.7317  0.7949  0.7963  0.6488  0.658  0.7004  0.9022  0.6253  0.7257 
Oldenburg Road  0.493  0.8842  0.1815  0.1423  0.7121  0.6117  0.2158  0.1234  0.5207  0.8129  0.1877  0.129 
Openflights  0.5825  0.9675  0.7762  0.8194  0.5798  0.9674  0.7753  0.8207  0.5765  0.9632  0.7704  0.8194 
Osaka Metro  0.7505  0.9264  0.6979  0.869  0.7758  0.8949  0.7085  0.8427  0.7134  0.958  0.6749  0.8659 
p2pGnutella08  0.4044  0.926  0.5036  0.8427  0.3999  0.9203  0.4997  0.8474  0.4049  0.9291  0.5057  0.8394 
as20000102  0.3204  0.9779  0.4107  0.8947  0.3181  0.9792  0.4085  0.8932  0.3245  0.9823  0.4143  0.8864 
Seoul Metro  0.6927  0.9099  0.6274  0.709  0.7945  0.794  0.6263  0.641  0.6883  0.9047  0.6269  0.7098 
Shanghai Metro  0.7429  0.8944  0.6196  0.7672  0.7999  0.8264  0.6534  0.7504  0.7165  0.92  0.6016  0.7629 
US Air  0.8097  0.9562  0.8614  0.8836  0.8056  0.9541  0.8574  0.8823  0.8026  0.951  0.8538  0.8786 
US Power Grid  0.5087  0.8876  0.2427  0.1926  0.5869  0.7553  0.2679  0.1789  0.5025  0.8659  0.2449  0.1949 
Paris Metro  0.7242  0.9392  0.6342  0.9106  0.7554  0.8876  0.6523  0.8702  0.7059  0.955  0.6258  0.9207 
Tokyo Metro  0.7012  0.932  0.6779  0.8424  0.7305  0.8956  0.69  0.8181  0.6758  0.9439  0.6631  0.843 
NRPG data  0.8047  0.9426  0.6834  0.7305  0.8181  0.917  0.694  0.7396  0.7833  0.9572  0.6749  0.7213 
Average correlation  0.6569  0.8684  0.5971  0.7066  0.7116  0.8240  0.6098  0.6914  0.6461  0.8721  0.5915  0.7073 
Standard deviation  0.1715  0.1903  0.1716  0.2655  0.1480  0.1613  0.1659  0.2452  0.1629  0.1929  0.1671  0.2621 
Spread maximization in complex contagion
In this section, we present the second application. Contagions are phenomena of disease/behavior/trend/information/idea spreading across networks, and they can be classified as simple or complex. A simple contagion can be transmitted by a single infected individual to another; a complex contagion requires multiple exposures to infect an individual. Unlike simple contagions such as diseases, studies such as [41] and [42] indicate that the spread of ideas, trends, behavior, influence, and information in a social network is more accurately modeled as a complex contagion. A recent study [43] on real-world networks noted that influence maximization is not driven by the nodes in higher cores. We first consider complex contagions in a linear threshold diffusion model setting [44, 45]. Linear threshold models are among the classical diffusion models. The linear threshold model considered in this paper assumes that a fixed proportion of an individual’s neighbors need to be infected in order to transmit the infection to that individual; it is also referred to as the relative threshold model. Other models consider thresholds that are deterministic yet distinct for each individual; yet another deterministic variant fixes the threshold to a constant number independent of the neighborhood size. In the stochastic setting, uniformly random thresholds drawn from an interval are assigned to individuals.
We also consider two stochastic models of diffusion: independent cascade and stochastic linear threshold. In the independent cascade model, every edge in the network is associated with a random value representing the probability of diffusion across that edge. The stochastic linear threshold model is the linear threshold model in the stochastic setting explained above.
The total number of people infected can depend on where the contagion starts, the thresholds of individual nodes, etc. There can be multiple sources in a network that can start a complex contagion. This matters especially for complex contagions, which have a hard time jumping across communities [46]. Hence, identifying an ideal origin of a trend/behavior/idea for a contagion can be helpful if we want the contagion to spread to the maximum number of people, which has real-world benefits, for instance in social marketing campaigns.
In this section, we consider the problem of finding a seed node that spreads a behavior/advertisement/idea/trend or influence to the maximum number of nodes, referred to as the spread maximization problem. In a complex contagion, a single node might not have the strength even to start the propagation of the infection; therefore, it is assumed that the neighborhood of the seed node is infected together with the seed at time step zero.
Formulation
In this section, we formulate hybrid centrality measures for the above-stated problem of spread maximization in complex contagions. When choosing a source of infection, several factors come to mind. The most important is that the node must have a high enough degree to expose multiple individuals to the infection. It also helps if the node is close to other high-degree nodes, so the infection can spread further. Hence, to measure the potential of a node to act as a top spreader, we use its degree as its node weight. Leveraging a hybridization similar to the one used in the previous section, we present the following two hybrid measures to identify ideal sources for the spread of complex contagions in the relative threshold model:

Harmonically attenuated degree centrality (HADC): in this measure, we attenuate the node weights harmonically with respect to distance. Formally, for a node u, HADC(u) can be expressed as
$$\begin{aligned} \text{HADC}(u)=\frac{\text{DC}(u) +\sum _{v\in V{\setminus }\{u\}} \frac{\text{DC}(v)}{d_{u,v}+1}}{2m}, \end{aligned}$$
where DC(v) is the degree centrality of node v and m is the number of links in the graph.

Exponentially attenuated degree centrality (EADC): in this measure, we attenuate the node weights exponentially with respect to distance. Formally, for a node u, EADC(u) can be expressed as
$$\begin{aligned} \text{EADC}(u)=\frac{\text{DC}(u)+\sum _{v\in V{\setminus }\{u\}} \alpha ^{d_{u,v}}\cdot \text{DC}(v)}{2m}, \end{aligned}$$
where \(\alpha\) is a parameter that lies between 0 and 1, m is the number of links in the graph, and DC(v) is the degree centrality of node v.

Let m be the number of links and n the number of nodes in a given network. The overall time to compute the proposed hybrid centrality measures is O(mn) in unweighted and \(O(nm + n^2\log n)\) in weighted graphs, due to the time complexity of computing all-pairs shortest paths. In the real world, each node has its own threshold, which may be deterministic or stochastic in nature and varies from individual to individual. Finding the best spreader node takes O(mn) time if the thresholds of all nodes are fixed to some constant value or proportion; yet, for a different threshold value, the best spreader may change and requires recomputation. The proposed measures do not require the threshold value to be known beforehand. The simulation results in the next section show that these measures consistently perform best in most cases across different threshold values.
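A minimal sketch of the two definitions, assuming an undirected graph stored as an adjacency dict (function names are illustrative; unreachable nodes contribute nothing, matching the attenuation terms vanishing at infinite distance):

```python
from collections import deque

def bfs_distances(adj, s):
    """Hop distances from s; unreachable nodes are simply absent."""
    dist = {s: 0}; q = deque([s])
    while q:
        v = q.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1; q.append(w)
    return dist

def hadc(adj):
    """HADC(u) = [DC(u) + sum_{v != u} DC(v)/(d(u,v)+1)] / 2m;
    the DC(u) term is the d = 0 case, and 2m is the sum of all degrees."""
    two_m = sum(len(nbrs) for nbrs in adj.values())
    return {u: sum(len(adj[v]) / (d + 1)
                   for v, d in bfs_distances(adj, u).items()) / two_m
            for u in adj}

def eadc(adj, alpha=0.5):
    """EADC(u) = [DC(u) + sum_{v != u} alpha^d(u,v) * DC(v)] / 2m."""
    two_m = sum(len(nbrs) for nbrs in adj.values())
    return {u: sum(alpha ** d * len(adj[v])
                   for v, d in bfs_distances(adj, u).items()) / two_m
            for u in adj}
```

On a 5-node star graph, for example, the hub maximizes both scores: it has the highest degree and is at distance one from every other node.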
Experimental results for spread maximization problem
In this section, we discuss the simulation results for the spread maximization problem on various real-world networks. First, we describe the experimental setup. Then, we list the data sets used for experimentation. Next, we compare the traditional centrality measures, degree (DC), closeness (CC), betweenness (BC), and eigenvector (EC), with the proposed hybrid centrality measures (HADC, \(\hbox {EADC}(\alpha =0.25)\), \(\hbox {EADC}(\alpha =0.5)\), \(\hbox {EADC}(\alpha =0.75)\)) on the considered networks.
Considered real-world data sets: the proposed solution to the spread maximization problem discussed in this paper hybridizes degree centrality with closeness/harmonic and decay centrality. Since the considered problem concerns the spread of trends, behaviors, influence, etc., we have picked 15 moderate-size real-world social networks for simulation. We provide a brief summary of these networks in Table 7; see [31, 36] for detailed descriptions. The columns of Table 7 consist of the names of the network instances, the number of nodes (n), the number of edges (m), the average degree (Avg. deg.), the density of the network, the average clustering coefficient (\(\hat{C}\)), the degree assortativity, and the size of the maximum clique, respectively.
Table 7
Considered real-world social networks
Instance name  n  m  Avg. deg.  Density  \(\hat{C}\)  Assortativity (D)  Max clique 

socgplus [31]  23,628  39,242  3.3217  0.0001  0.0037  \(-\) 0.387  7 
Twitter Lists [36]  23,370  33,101  2.8328  0.0001  0.0215  \(-\) 0.4741  7 
sochamsterer [31]  2426  16,630  13.7098  0.0057  0.2314  0.0474  25 
socpagesgovernment [31]  7057  89,455  25.3521  0.0036  0.2238  0.0294  28 
socpagespolitician [31]  5908  41,729  14.1263  0.0024  0.3011  0.0184  21 
socpagesshows [31]  3892  17,262  8.8705  0.0023  0.5906  0.5605  57 
socwikielec [31]  7118  107,071  30.0846  0.0042  0.1255  \(-\) 0.0514  17 
socadvogato [31]  6551  51,332  15.6715  0.0024  0.0925  \(-\) 0.0526  19 
socpagespublicfigure [31]  11,565  67,114  11.6064  0.001  0.1666  0.2024  21 
socanybeat [31]  12,645  67,053  10.6055  0.0008  0.0217  \(-\) 0.1219  25 
socpagessport [31]  13,866  86,858  12.5282  0.0009  0.1292  \(-\) 0.0268  29 
socepinions [31]  26,588  100,120  7.5312  0.0003  0.0892  0.0572  16 
socpagesmedia [31]  27,917  206,259  14.7766  0.0005  0.114  0.0221  31 
socpagescompany [31]  14,113  52,310  7.413  0.0005  0.1532  0.0135  21 
socgamesecRO [31]  41,773  125,826  6.0243  0.0001  0.0753  0.114  7 
Simulation: We have conducted simulations evaluating the performance of the various traditional centrality measures and the hybrid measures proposed in this paper for the spread maximization problem; the results are summarized in Tables 8, 9, 10 and 11. The simulation is carried out as follows. The top-ranked node for every measure is taken as the source of the infection. The source and its neighborhood are infected at time \(t=0\). At each time step, the infection is propagated as per the diffusion model being used. The contagion ends when it is unable to infect any susceptible node in a time step. The total number of nodes infected by the end of the contagion is the spread-score of the source node in the simulation. Thus, the performance of a centrality measure is evaluated by computing the spread-score of the top-most central node as per that measure.
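The simulation loop described above can be sketched for the deterministic linear threshold model as follows. This is a minimal illustration under our own representation (adjacency dict of sets, hypothetical function name); a node becomes infected once the infected fraction of its neighbours reaches the threshold.

```python
def spread_score(adj, seed, threshold):
    """Deterministic linear-threshold diffusion from a single seed.

    adj: dict mapping each node to the set of its neighbours (undirected).
    seed: the top-ranked node; it and its neighbourhood start infected.
    threshold: fraction of infected neighbours needed to infect a node.
    Returns the total number of infected nodes when the contagion stops.
    """
    infected = {seed} | adj[seed]  # source and its neighbourhood at t = 0
    while True:
        newly = {
            v for v in adj
            if v not in infected and adj[v]
            and len(adj[v] & infected) / len(adj[v]) >= threshold
        }
        if not newly:  # contagion ends: no susceptible node can be infected
            break
        infected |= newly
    return len(infected)
```

On the path graph a–b–c–d with seed a, the whole path is infected at threshold 0.3, while only {a, b} are infected at threshold 0.6, mirroring how spread-scores shrink in Tables 8 through 11 as the threshold grows.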
Table 8
Number of infected nodes when the top-ranked node is chosen as the source of the infection for threshold = 30%
Instance name  BC  CC  DC  EC  HADC  EADC (\(\alpha =0.25\))  EADC (\(\alpha =0.5\))  EADC (\(\alpha =0.75\)) 

socgplus  14,951  14,951  14,951  14,951  14,951  14,951  14,951  14,951 
Twitter Lists  484  1180  553  539  1180  1180  1180  1180 
sochamsterer  1879  1879  1879  1879  1879  1879  1879  1879 
socpagesgovernment  4316  4316  4316  3695  4316  4316  4316  4316 
socpagespolitician  2185  2185  946  674  2185  2185  2185  2185 
socpagesshows  1283  1283  1242  210  1283  279  1283  1283 
socwikielec  7061  7061  7061  7061  7061  7061  7061  7061 
socadvogato  5014  5014  5014  5014  5014  5014  5014  5014 
socpages  10,095  10,095  10,029  1485  10,184  1485  10,184  10,184 
socanybeat  12,538  12,538  12,538  12,538  12,538  12,538  12,538  12,538 
socpagessport  13,739  13,739  13,739  13,739  13,739  13,739  13,739  13,739 
socepinions  2094  3106  25,283  25,283  25,283  25,283  25,283  3106 
socpagesmedia  26,982  26,981  26,981  26,981  26,981  26,981  26,981  26,981 
socpagescompany  1065  12,432  12,432  163  12,432  12,432  12,432  12,432 
socgamesecRO  219  219  554  555  219  554  219  219 
Table 9
Number of infected nodes when the top-ranked node is chosen as the source of the infection for threshold = 40%
Instance name  BC  CC  DC  EC  HADC  EADC (\(\alpha =0.25\))  EADC (\(\alpha =0.5\))  EADC (\(\alpha =0.75\)) 

socgplus  14,556  14,556  14,556  14,556  14,556  14,556  14,556  14,556 
Twitter Lists  473  912  452  522  912  912  912  912 
sochamsterer  1666  1666  1666  1666  1666  1666  1666  1666 
socpagesgovernment  3372  3372  3372  1446  3372  3372  3372  3372 
socpagespolitician  552  552  652  387  552  552  552  552 
socpagesshows  297  297  299  141  297  196  297  297 
socwikielec  7061  7061  7061  7061  7061  7061  7061  7061 
socadvogato  4667  4667  4667  4667  4667  4667  4667  4667 
socpages  1143  1143  7825  1328  1223  1328  1223  1223 
socanybeat  12,287  12,287  12,287  12,287  12,287  12,287  12,287  12,287 
socpagessport  1082  1082  1082  1082  1082  1082  1082  1082 
socepinions  1383  1824  22,581  22,581  22,581  22,581  22,581  1824 
socpagesmedia  1101  1189  1694  1694  1694  1694  1694  1694 
socpagescompany  467  561  561  121  561  561  561  561 
socgamesecRO  129  129  180  175  129  180  129  129 
Table 10
Number of infected nodes when the top-ranked node is chosen as the source of the infection for threshold = 50%
Instance name  BC  CC  DC  EC  HADC  EADC (\(\alpha =0.25\))  EADC (\(\alpha =0.5\))  EADC (\(\alpha =0.75\)) 

socgplus  11,482  11,482  11,482  11,482  11,482  11,482  11,482  11,482 
Twitter Lists  473  850  452  522  850  850  850  850 
sochamsterer  1091  1091  1091  1091  1091  1091  1091  1091 
socpagesgovernment  2631  2631  2631  1094  2631  2631  2631  2631 
socpagespolitician  483  483  640  380  483  483  483  483 
socpagesshows  218  218  216  137  218  172  218  218 
socwikielec  7061  7061  7061  7061  7061  7061  7061  7061 
socadvogato  4584  4584  4584  4584  4584  4584  4584  4584 
socpagespublicfigure  905  905  958  1242  831  1242  831  831 
socanybeat  12,221  12,221  12,221  12,221  12,221  12,221  12,221  12,221 
socpagessport  837  837  837  837  837  837  837  837 
socepinions  1310  1715  1709  1709  1709  1709  1709  1715 
socpagesmedia  911  984  1159  1159  1159  1159  1159  1159 
socpagescompany  417  492  492  101  492  492  492  492 
socgamesecRO  124  124  158  159  124  158  124  124 
Table 11
Number of infected nodes when the top-ranked node is chosen as the source of the infection for threshold = 60%
Instance name  BC  CC  DC  EC  HADC  EADC (\(\alpha =0.25\))  EADC (\(\alpha =0.5\))  EADC (\(\alpha =0.75\)) 

socgplus  8741  8741  8741  8741  8741  8741  8741  8741 
Twitter Lists  402  633  417  500  633  633  633  633 
sochamsterer  567  567  567  567  567  567  567  567 
socpagesgovernment  1712  1712  1712  944  1712  1712  1712  1712 
socpagespolitician  334  334  558  340  334  334  334  334 
socpagesshows  161  161  161  124  161  150  161  161 
socwikielec  5241  5241  5241  5241  5241  5241  5241  5241 
socadvogato  1795  1795  1795  1795  1795  1795  1795  1795 
socpagespublicfigure  498  498  570  1062  508  1062  508  508 
socanybeat  10,130  10,130  10,130  10,130  10,130  10,130  10,130  10,130 
socpagessport  633  633  633  633  633  633  633  633 
socepinions  897  1107  981  981  981  981  981  1107 
socpagesmedia  640  724  863  863  863  863  863  863 
socpagescompany  275  326  326  95  326  326  326  326 
socgamesecRO  101  101  119  121  101  119  101  101 
For the deterministic threshold model, the spread-scores of the top-most central nodes according to the traditional and proposed hybrid centrality measures in all considered real-world networks, for threshold values in the interval (0, 1) with a step size of 0.1, are plotted in Fig. 2. The figure shows the number of infections on the y-axis and the threshold values on the x-axis. As a specific example, consider the plot for Twitter Lists. The number of infections has a large dip when the threshold increases from 0.1 to 0.2. The worst-performing metric at the threshold of 0.2 is EC. When increasing the threshold from 0.2 to 0.3, we see another dip in the number of infections, with DC and BC performing almost equally, while BC was superior at the 0.2 threshold. CC and the proposed centrality measures outperform the other measures across the rest of the thresholds. The main aim of these plots is to identify the interval of threshold values over which the spread-scores of the various centrality measures differ the most. It is clear from the plots that for most of the considered real-world networks, the spread-scores vary mostly in the subinterval [0.3, 0.6]. However, the spread-scores of the different measures are hard to compare from these plots alone. Therefore, the results of the simulation for threshold values 0.3, 0.4, 0.5, and 0.6, i.e., thresholds of 30%, 40%, 50% and 60%, are presented in Tables 8, 9, 10, and 11, respectively. In each table, the first column shows the network name, the second column the spread-score for betweenness centrality, the third column for closeness centrality, the fourth column for degree centrality and the fifth column for eigenvector centrality. The second half of each table, from the sixth column onward, shows the scores of the proposed hybrid measures: the sixth column shows the spread-score for HADC, the seventh column for EADC with \(\alpha =0.25\), the eighth column for EADC with \(\alpha =0.5\) and the ninth column for EADC with \(\alpha =0.75\).
We have emphasized in bold the best spread-score in Tables 8, 9, 10, 11, 12 and 13 for each network instance given in Table 7 over the various models, when the seed node is picked according to the considered centrality measures. As seen in Table 8, at a lower threshold, CC performs better than the other traditional measures for most networks. However, as we increase the threshold, DC starts performing better than the other traditional measures, as shown in Table 9. Since our measures are a hybrid of both, they are able to maintain high scores across the thresholds. For example, in Table 9 for Twitter Lists, the proposed measures have results similar to CC; in the same table, for socpagesmedia and socgamesecRO, the results are similar to DC. It is evident from the tables that the proposed measures, being a hybridization of DC and CC, perform at par with the better-performing traditional measure across various thresholds. In some cases, the proposed measures even outperform all the traditional measures.
Apart from the deterministic linear threshold model, we have also evaluated the performance of the proposed measures as well as the traditional measures using stochastic diffusion models, namely independent cascade (IC) and stochastic linear threshold (LT). The results of these experiments are shown in Tables 13 and 12, respectively. The results shown in the tables are averages taken over 100 independent iterations of the diffusion models. From Table 12, we see that HADC and EADC with \(\alpha =0.75\) yield the highest expected number of infections across most networks. For some networks, such as socpagespublicfigure, they outperform all of the traditional measures. In the network Twitter Lists, CC is the best-performing traditional measure, while in the network socpagesmedia, DC and EC are the best-performing traditional measures. In both cases, HADC and EADC with \(\alpha =0.75\) perform optimally, as they are a hybridization of CC and DC. In the case of the IC model, as shown in Table 13, we see that no measure performs consistently across the networks. As none of the traditional measures performs consistently across the networks, a hybridization of them may lead to suboptimal results.
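The IC averaging described above can be sketched as follows. This is our own minimal illustration: the per-edge infection probability p, seeding with the single top-ranked node, and the function name are assumptions, since the text only states that results are averaged over 100 independent iterations.

```python
import random

def ic_spread(adj, seed, p=0.1, runs=100, rng=None):
    """Expected spread of one seed under the independent cascade model.

    Each newly infected node gets exactly one chance to infect each
    susceptible neighbour, succeeding independently with probability p.
    The spread is averaged over `runs` independent simulations.
    """
    rng = rng or random.Random(0)  # fixed seed for reproducibility
    total = 0
    for _ in range(runs):
        infected = {seed}
        frontier = [seed]
        while frontier:  # cascade proceeds in discrete time steps
            nxt = []
            for u in frontier:
                for v in adj[u]:
                    if v not in infected and rng.random() < p:
                        infected.add(v)
                        nxt.append(v)
            frontier = nxt
        total += len(infected)
    return total / runs
```

Because every edge fires independently, the expected infection counts vary smoothly between measures, which is consistent with the tightly clustered values in Table 13.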
Table 12
Expected number of infected nodes when the top-ranked node is chosen as the source of the infection for the stochastic linear threshold model
Instance name  BC  CC  DC  EC  HADC  EADC (\(\alpha =0.25\))  EADC (\(\alpha =0.5\))  EADC (\(\alpha =0.75\)) 

socgplus  11,990.96  11,990.96  11,990.96  11,990.96  11,990.96  11,990.96  11,990.96  11,990.96 
Twitter Lists  671.22  1298.42  697.3  592.26  1298.42  1298.42  1298.42  1298.42 
sochamsterer  1242.72  1242.72  1242.72  1242.72  1242.72  1242.72  1242.72  1242.72 
socpagesgovernment  4027.29  4027.29  4027.29  2916.12  4027.29  4027.29  4027.29  4027.29 
socpagespolitician  1907.98  1907.98  1123.45  724.17  1907.98  1907.98  1907.98  1907.98 
socpagesshows  748.47  748.47  743.79  253.3  748.47  347.11  748.47  748.47 
socwikielec  6623.02  6623.02  6623.02  6623.02  6623.02  6623.02  6623.02  6623.02 
socadvogato  3741.1  3741.1  3741.1  3741.1  3741.1  3741.1  3741.1  3741.1 
socpagespublicfigure  4415.09  4415.09  3928.15  2025.33  4535.9  2025.33  4535.9  4535.9 
socanybeat  11,394.31  11,394.31  11,394.31  11,394.31  11,394.31  11,394.31  11,394.31  11,394.31 
socpagessport  4064.69  4064.69  4064.69  4064.69  4064.69  4064.69  4064.69  4064.69 
socepinions  6660.87  8353.97  7469.31  7469.31  7469.31  7469.31  7469.31  8353.97 
socpagesmedia  7927.39  10,118.52  10,916.34  10,916.34  10,916.34  10,916.34  10,916.34  10,916.34 
socpagescompany  1644.77  2429.62  2429.62  161.55  2429.62  2429.62  2429.62  2429.62 
socgamesecRO  1101.8  1101.8  1293.95  1235.01  1101.8  1293.95  1101.8  1101.8 
Table 13
Expected number of infected nodes when the top-ranked node is chosen as the source of the infection for the independent cascade model
Instance name  BC  CC  DC  EC  HADC  EADC (\(\alpha =0.25\))  EADC (\(\alpha =0.5\))  EADC (\(\alpha =0.75\)) 

socgplus  14,487.1  14,472.36  14,481.01  14,480.63  14,472.88  14,480.8  14,487.43  14,477.27 
Twitter Lists  11,989.12  11,955.72  11,992.85  11,964.66  11,906.41  11,917.2  11,916.64  11,908.19 
sochamsterer  937.39  937.74  938.05  936.18  937.11  937.69  936.21  937.34 
socpagesgovernment  3148.14  3149.21  3148.54  3077.98  3150.89  3152.66  3140.53  3144.39 
socpagespolitician  2771.62  2769.67  2640.67  2586.06  2770.33  2771.98  2769.76  2774.51 
socpagesshows  1437.09  1442.95  1431.11  1308.85  1444.45  1330.98  1445.35  1446.3 
socwikielec  2060.71  2062.82  2061.5  2061.38  2061.32  2061.22  2061.67  2061.75 
socadvogato  2508.59  2504.19  2505.09  2503.53  2504.48  2506.25  2505.2  2503.19 
socpagespublicfigure  4951.71  4949.31  4955.46  4792.01  4938.83  4792.36  4946.97  4940.21 
socanybeat  5989.9  5992.17  5990.56  5989.8  5989.71  5991.55  5994.93  5991.79 
socpagessport  7147.25  7147.66  7142.17  7149.91  7146.72  7150  7149.07  7143.33 
socepinions  8131.74  8083.85  7971.35  7966.32  7972.66  7963.08  7967.07  8074.81 
socpagesmedia  13,272.46  13,217.94  13,071.14  13,069.32  13,068.38  13,071.39  13,067.68  13,075.13 
socpagescompany  7085.8  7027.44  7036.72  6951.13  7038.37  7033.11  7033.63  7030.34 
socgamesecRO  23,789.15  23,782.03  23,757.48  23,762.35  23,795.5  23,748.52  23,793.17  23,783.46 
We present the Spearman’s rank correlation coefficient, Kendall’s rank correlation coefficient and Fagin’s intersection metric [40] to evaluate the correlation between the traditional centrality measures and the proposed hybrid centrality measures in Tables 14, 15 and 16, respectively. For our purposes, we compute Fagin’s intersection metric over the top-1000 ranks in the network; if the number of nodes is at most 1000, it is computed over all nodes in the network. In each table, the first column shows the network name. The next four columns show the respective correlation between HADC and the traditional measures, namely BC, CC, DC and EC. Columns 6 to 9 show the respective correlation between EADC with \(\alpha =0.25\) and BC, CC, DC and EC. Finally, columns 10 to 13 show the respective correlation between EADC with \(\alpha =0.75\) and BC, CC, DC and EC.
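Fagin’s intersection metric averages, over depths i = 1, …, k, the fractional overlap of the two top-i sets of the rankings being compared. A minimal sketch (the function name and list-based interface are ours):

```python
def fagin_intersection(rank_a, rank_b, k=1000):
    """Fagin's intersection metric between two rankings.

    rank_a, rank_b: lists of node ids ordered from most to least central.
    Averages the fractional overlap of the two top-i sets over depths
    i = 1..k; the result is 1.0 iff the top-k rankings agree as sets
    at every depth.
    """
    k = min(k, len(rank_a), len(rank_b))  # cap at the shorter ranking
    seen_a, seen_b = set(), set()
    overlap_sum = 0.0
    for i in range(1, k + 1):
        seen_a.add(rank_a[i - 1])
        seen_b.add(rank_b[i - 1])
        overlap_sum += len(seen_a & seen_b) / i
    return overlap_sum / k
```

Because only set overlap at each depth matters, the metric is insensitive to order within the top-k, which is one reason the values in Table 16 run higher than the Spearman and Kendall coefficients in Tables 14 and 15.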
Table 14
Spearman’s rank correlation between the proposed and the traditional centrality measures on the considered real-world networks given in Table 7
Instance name  HADC  \(\hbox {EADC}(\alpha =0.25)\)  \(\hbox {EADC}(\alpha =0.75)\)  

BC  CC  DC  EC  BC  CC  DC  EC  BC  CC  DC  EC  
socgplus  0.2877  0.17  0.1634  0.0384  0.2871  0.2403  0.1921  0.173  0.2725  0.172  0.1522  0.0793 
Twitter Lists  0.0645  0.0315  0.0898  \(-\) 0.0034  0.0639  0.0169  0.099  0.0837  0.0727  0.0661  0.102  \(-\) 0.0006 
sochamsterer  0.6595  0.7512  0.5876  0.6986  0.6472  0.7308  0.5757  0.6972  0.6636  0.7626  0.5742  0.6985 
socpagesgovernment  0.2753  0.296  0.3775  0.2483  0.2745  0.2877  0.3669  0.2698  0.2533  0.2903  0.3761  0.2698 
socpagespolitician  0.2079  0.197  0.2818  0.181  0.222  0.2039  0.3028  0.1982  0.182  0.1879  0.2883  0.1859 
socpagesshows  0.1713  0.199  0.274  0.1608  0.2047  0.2167  0.3236  0.1685  0.1708  0.1917  0.2544  0.1736 
socwikielec  \(-\) 0.0191  0.0974  \(-\) 0.0035  0.064  \(-\) 0.0015  0.082  0.0027  0.0796  0.0003  0.0938  0.0087  0.0562 
socadvogato  0.4419  0.4772  0.399  0.47  0.4399  0.4628  0.3957  0.4805  0.4412  0.466  0.3936  0.4725 
socpages  0.2768  0.2789  0.3643  0.2843  0.2943  0.2847  0.4008  0.2936  0.2763  0.2787  0.3667  0.2821 
socanybeat  0.2006  0.249  0.1857  0.1463  0.1947  0.2019  0.1701  0.1928  0.2  0.2239  0.1968  0.1547 
socpagessport  0.2465  0.2275  0.2731  0.2186  0.2707  0.2239  0.2965  0.2188  0.2439  0.2202  0.2808  0.2081 
socepinions  0.5778  0.7117  0.6392  0.7523  0.5796  0.7237  0.6466  0.7593  0.58  0.7151  0.6388  0.7506 
socpagesmedia  0.2499  0.2552  0.3333  0.2567  0.2414  0.2564  0.3372  0.257  0.2411  0.2528  0.3287  0.2521 
socpagescompany  0.2085  0.2026  0.2759  0.1866  0.2353  0.2012  0.3035  0.1905  0.2208  0.184  0.2797  0.186 
socwikiVote  0.0427  0.0754  0.033  0.1052  0.0553  0.0186  0.0237  0.0325  0.0444  0.0915  0.0764  0.0259 
socgamesecRO  0.1859  0.1727  0.2366  0.1531  0.1956  0.1738  0.2495  0.1644  0.1821  0.1596  0.2358  0.1644 
Average correlation  0.25  0.27  0.28  0.25  0.26  0.27  0.29  0.27  0.25  0.27  0.28  0.25 
Standard deviation  0.179  0.2052  0.175  0.2169  0.173  0.2078  0.1746  0.2074  0.1775  0.2051  0.1673  0.2183 
Table 15
Kendall’s rank correlation between the proposed and the traditional centrality measures on the considered real-world networks given in Table 7
Instance name  HADC  EADC (\(\alpha =0.25)\)  EADC (\(\alpha =0.75)\)  

BC  CC  DC  EC  BC  CC  DC  EC  BC  CC  DC  EC  
socgplus  0.1793  0.2636  0.0768  0.1842  0.2025  0.2167  0.1011  0.2355  0.1839  0.2248  0.0778  0.1804 
Twitter Lists  0.0607  0.0461  0.0450  0.0188  0.0563  0.0635  0.0497  \(-\) 0.0099  0.0582  0.0874  0.0487  0.0208 
sochamsterer  0.4476  0.5573  0.4136  0.4890  0.4483  0.5265  0.4016  0.4805  0.4567  0.5652  0.4068  0.4906 
socpagesgovernment  0.1699  0.1924  0.2295  0.1636  0.1680  0.1900  0.2392  0.1764  0.1749  0.1821  0.2331  0.1516 
socpagespolitician  0.1352  0.1429  0.1958  0.1225  0.1315  0.1314  0.2033  0.1258  0.1278  0.1268  0.1837  0.1106 
socpagesshows  0.1094  0.1164  0.1808  0.1192  0.1262  0.1222  0.2122  0.1116  0.1094  0.1274  0.1838  0.0982 
socwikielec  \(-\) 0.0064  0.0609  0.0120  0.0358  \(-\) 0.0036  0.0497  0.0079  0.0427  0.0063  0.0334  0.0106  0.0476 
socadvogato  0.3235  0.3516  0.2779  0.3425  0.3255  0.3522  0.2835  0.3432  0.3260  0.3403  0.2767  0.3453 
socpagespublicfigure  0.1880  0.1894  0.2480  0.1820  0.2010  0.1924  0.2586  0.1868  0.1986  0.1915  0.2473  0.1846 
socanybeat  0.1470  0.2268  0.1045  0.0746  0.1431  0.2182  0.1275  0.0775  0.1449  0.2153  0.1064  0.0729 
socpagessport  0.1655  0.1482  0.1949  0.1383  0.1623  0.1485  0.1973  0.1481  0.1651  0.1488  0.1865  0.1466 
socepinions  0.4176  0.5375  0.4658  0.5739  0.4232  0.5336  0.4650  0.5785  0.4167  0.5363  0.4685  0.5720 
socpagesmedia  0.1666  0.1631  0.2204  0.1669  0.1689  0.1680  0.2239  0.1756  0.1731  0.1696  0.2200  0.1710 
socpagescompany  0.1400  0.1294  0.1836  0.1207  0.1397  0.1294  0.1898  0.1198  0.1455  0.1344  0.1833  0.1148 
socgamesecRO  0.1232  0.1093  0.1590  0.1010  0.1273  0.1142  0.1618  0.1026  0.1280  0.1116  0.1576  0.1023 
Average correlation  0.1845  0.2157  0.2005  0.1889  0.1880  0.2104  0.2082  0.1930  0.1877  0.2130  0.1994  0.1873 
Standard deviation  0.1227  0.1550  0.1230  0.1587  0.1233  0.1484  0.1191  0.1604  0.1229  0.1539  0.1224  0.1592 
Table 16
Fagin’s intersection metric between the proposed and the traditional centrality measures on the considered real-world networks given in Table 7
Instance name  HADC  \(\hbox {EADC}(\alpha =0.25)\)  \(\hbox {EADC}(\alpha =0.75)\)  

BC  CC  DC  EC  BC  CC  DC  EC  BC  CC  DC  EC  
socgplus  0.3873  0.9108  0.6125  0.7472  0.3751  0.8643  0.6025  0.7626  0.3938  0.9309  0.6151  0.7385 
Twitter Lists  0.5074  0.9006  0.3708  0.2617  0.5237  0.7576  0.4133  0.2822  0.5052  0.9161  0.3659  0.2599 
sochamsterer  0.6296  0.9318  0.7721  0.8624  0.6266  0.909  0.7797  0.8847  0.6317  0.9428  0.7679  0.8499 
socpagesgovernment  0.3962  0.8576  0.6462  0.6835  0.3802  0.7979  0.6706  0.7401  0.4024  0.8853  0.6307  0.6543 
socpagespolitician  0.4003  0.7377  0.5264  0.5629  0.3317  0.5951  0.6021  0.6447  0.4166  0.7894  0.4984  0.5246 
socpagesshows  0.3818  0.8164  0.6155  0.7474  0.351  0.7154  0.6712  0.766  0.3912  0.8449  0.5841  0.7176 
socwikielec  0.6856  0.9279  0.8281  0.9099  0.6869  0.92  0.8406  0.9223  0.6841  0.9296  0.8184  0.8993 
socadvogato  0.6855  0.9322  0.7789  0.8504  0.6866  0.9156  0.7907  0.867  0.6842  0.9396  0.7724  0.8416 
socpagespublicfigure  0.4086  0.4934  0.7131  0.563  0.2974  0.3305  0.7534  0.7474  0.4504  0.5715  0.6575  0.478 
socanybeat  0.489  0.9049  0.7134  0.8538  0.4863  0.893  0.7156  0.863  0.4909  0.9122  0.7118  0.8467 
socpagessport  0.554  0.9609  0.5484  0.7228  0.5641  0.9344  0.5653  0.7328  0.551  0.9646  0.5438  0.7191 
socepinions  0.4379  0.7928  0.7482  0.7065  0.4086  0.739  0.7541  0.7644  0.4508  0.8176  0.7389  0.6782 
socpagesmedia  0.5299  0.8232  0.6674  0.8256  0.5213  0.7871  0.6852  0.8472  0.5327  0.8369  0.6599  0.8169 
socpagescompany  0.5619  0.95  0.5353  0.2957  0.5618  0.9123  0.5564  0.3019  0.5614  0.9566  0.531  0.2948 
socgamesecRO  0.5634  0.9371  0.4429  0.4533  0.567  0.8887  0.4712  0.4845  0.5634  0.9401  0.4402  0.4483 
Average correlation  0.5079  0.8585  0.6346  0.6697  0.4912  0.7973  0.6581  0.7074  0.5140  0.8785  0.6224  0.6512 
Standard deviation  0.1056  0.1203  0.1309  0.2032  0.1276  0.1617  0.1220  0.1998  0.0994  0.1007  0.1300  0.2049 
From the Spearman’s and Kendall’s rank correlation coefficients, we observe that our measures have a small positive correlation with the traditional measures. The correlation is small enough to conclude that the proposed measures rank nodes differently from the traditional measures. The correlations are positive and almost the same for all traditional measures, which indicates that our measures share some characteristics with them and are able to switch between traditional measures to perform optimally across networks and thresholds. Fagin’s intersection metric attains higher values because it considers only the top-1000 ranked nodes.
Conclusion
In this paper, we have proposed new hybrid centrality measures based on the closeness (harmonic) and decay measures. The first hybridization is used to solve the service coverage problem in flow networks, where the demand for services is assumed to be proportional to the betweenness centrality. The proposed measures can also be used in another application that requires installing a facility farthest from failure-prone nodes; the solution in this case is the node with minimum expected response-efficiency. The second hybridization attempts to find the most suitable node for spreading information to the maximum fraction of the population under a complex contagion. The proposed hybridizations of centrality measures for both applications are based on the formulation of node-weighted centrality measures. The experimental results on several real-world networks show that the proposed measures perform relatively better than the individual traditional measures and also rank nodes differently.
Although the reference centrality measures considered in this paper are chosen based on the nature of the considered applications, the framework allows using other measures as the requirements dictate. Analyzing the proposed measures on various other real-world networks, and hybridizing other application-specific measures in place of betweenness and degree centrality, are possible future directions. The analysis of real-world networks where non-uniform weights are given at the nodes, using node-weighted centrality measures, is another open direction. Yet another interesting direction is using hybrid centrality measures in multilayer networks [47]; there, we may use one layer of the network to compute node weights to be used for hybridization in the other layers.
Acknowledgements
The authors would like to thank the referees for comments that helped with the presentation of some of the material.
Open Access
This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Competing interests
The authors declare that they have no competing interests.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.