Maximizing the Spread of Influence via Generalized Degree Discount

Xiaojie Wang; Xue Zhang; Chengli Zhao; Dongyun Yi

doi:10.1371/journal.pone.0164393

Abstract

It is a crucial and fundamental issue to identify a small subset of influential spreaders that can control the spreading process in networks. In previous studies, a degree-based heuristic called DegreeDiscount has been shown to effectively identify multiple influential spreaders and has severed as a benchmark method. However, the basic assumption of DegreeDiscount is not adequate, because it treats all the nodes equally without any differences. To consider a general situation in real world networks, a novel heuristic method named GeneralizedDegreeDiscount is proposed in this paper as an effective extension of original method. In our method, the status of a node is defined as a probability of not being influenced by any of its neighbors, and an index generalized discounted degree of one node is presented to measure the expected number of nodes it can influence. Then the spreaders are selected sequentially upon its generalized discounted degree in current network. Empirical experiments are conducted on four real networks, and the results show that the spreaders identified by our approach are more influential than several benchmark methods. Finally, we analyze the relationship between our method and three common degree-based methods.

Citation: Wang X, Zhang X, Zhao C, Yi D (2016) Maximizing the Spread of Influence via Generalized Degree Discount. PLoS ONE 11(10): e0164393. https://doi.org/10.1371/journal.pone.0164393

Editor: Peter Csermely, Semmelweis University, HUNGARY

Received: June 16, 2016; Accepted: September 23, 2016; Published: October 12, 2016

Copyright: © 2016 Wang et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All relevant data are within the paper and its Supporting Information files.

Funding: This work is partially supported by HPCL open fund of NUDT (201403-01), Postgraduate Innovation Fund of Hunan province (CX2015B010), and Postgraduate Innovation Fund of NUDT (B150203).

Competing interests: The authors have declared that no competing interests exist.

Introduction

In complex networks, models and methods for propagation behavior are always of great theoretical and practical importance. Consider a scenario in advertising: a small IT company develops a cool online application and it wants to let more people know their product. However, the funds for advertising are limited. An economical way to advertise is to deliver the product to a small group of initial users(or spreaders) who are willing to advertise the product by word of mouth. This is often referred to as influence maximization. Theoretically, influence maximization in networks is a specific problem about how to effectively identify a small subset of nodes and maximize their spreading influence. Although much work has been done on measuring the influence of a single node [1–6], methods that can effectively identify multiple influential spreaders are still lacking.

The pioneers of this research area are Domingos and Richardson [7, 8] who studied influence maximization as an algorithmic problem and developed a probabilistic method. Kempe, Kleinberg and Tardos [9] also made a significant contribution to this field. They showed that the problem is an NP-hard discrete optimization problem, and proposed a greedy strategy to select the spreaders that could achieve an approximation guarantee of 63%. Unfortunately, their greedy method encountered a serious drawback in computing efficiency, which limited its wide usage in large-scale networks. Leskovec et al. [10] demonstrated that many realistic influence maximization problems exhibit a property of “submodularity”, and they proposed a Cost-Effective Lazy Forward(CELF) method to improve the efficiency of the greedy method. Narayanam et al. [11] analyzed the Shapley value concept from cooperative game theory and proposed ShaPley value-based Influential Nodes(SPIN). Zhao et al. [12] attempted to find effective multiple spreaders by generalizing the idea of the coloring problem in graph theory to complex networks. He et al. [13] suggested a novel method to identify multiple spreaders from communities in a balanced way. Zhang et al. [14] presented an iterative method named VoteRank to identify a set of decentralized spreaders. Chen et al. [15] decomposed the local topological structure of nodes and proposed a DegreeDiscount heuristic. Numerical experiments showed that DegreeDiscount could nearly match the performance of the greedy method, while the computational complexity of the former one was quite low. However, all nodes were treated equally in DegreeDiscount, which was a little oversimplified and might reduce the performance of the algorithm.

In this paper, we depict the status of nodes more concisely as a probabilistic score, and propose the GeneralizedDegreeDiscount heuristic. We discuss the computational efficiency of our method and demonstrate that the complexity is linearly correlated with the network scale, which makes our method efficient and scalable to large-scale networks. Experiments are performed on several real networks, and the results show that our method can outperform some centrality-based methods.

Materials and Methods

Intuition and theory

Degree is a basic centrality index in the research area of complex networks. It is well known that a node with a higher degree can influence more nodes than a node with a lower degree. Some researches in sociology have shown that selecting nodes with the highest degree as spreaders can result in better spreading influence than many other methods. However, in some recent studies, the authors argued that nodes with the highest degree might not always be the most influential ones. Though the effectiveness of the Degree is questionable, the low computational complexity of this strategy results in its widespread use in many business fields. In this section, we try to enhance the performance of this method by using several heuristic strategies.

Let node v be a neighbor of node u. Suppose u has been selected as a spreader. When considering the selection of v as a new spreader, one should formulate a method for calculating the contribution of edge to the degree of node v. It cannot be counted as 1, as has been done previously. As u has been selected as a spreader: (i) it is no longer necessary for v to influence u. (ii) u may also influence v with some probability, which further weakens the potential influence of v. Based on these considerations, we explore several heuristic strategies, in which the spreaders are selected one by one.

Degree-based heuristics.

DegreeDistance [16] takes a naive approach to avoid the relative influence between adjacent nodes. For example, if a node has been selected as a spreader, we can ignore its neighbors and consider other nodes. DegreeDistance defines a candidate set C and a distance threshold d_td. At first, all the nodes are in the candidate set C. In each round, a node v with maximum degree in C is selected as a spreader, and the nodes within a distance d_td to v are removed from C. The procedure ends when all the spreaders have been selected.

In Ref [15], Chen et al. proposed two degree-based heuristics, SingleDiscount and DegreeDiscount. SingleDiscount considers a simple adaptive strategy. In each round, a node u with the maximum degree is selected as a spreader. Then, for each v ∈ Γ(u), we do not count when calculating its degree. In other words, the degree of v will be discounted by 1. This type of degree is named as discount degree, and is denoted by sd_v (1) where d_v denotes the original degree of v, and t_v denotes the number of v’s neighbors who have already been selected as spreaders.

DegreeDiscount is specifically designed for the independent cascade model. For a specific spreading probability p, DegreeDiscount attempts to conduct a deeper analysis of the local structure of the nodes. Suppose that we want to calculate the potential spreading ability of node v. Let the spreading probability be p. When p is small, the multi-hop neighbors of v can be ignored, and only the nearest neighbors are counted toward the degree. Let u ∈ Γ(v) be a spreader neighbor of v. Obviously, the probability that v is directly influenced by u is p. As a result, u will not only contribute nothing to v, but also weakens the spreading ability of v.

When calculating the potential spreading ability of v, DegreeDiscount ignores the differences of v’s neighbors. As all the neighbors are treated equally, the diagram of v and its neighbors Γ(v) can be mapped into a star-like subgraph structure. Let Star(v) be the subgraph considered here, and let the edges in the subgraph be the edges incident to v. Let d_v be the degree of node v, t_v be the number of spreader neighbors of v, and p be the spreading probability.

As the candidate node v has t_v spreader neighbors, the probability that v is influenced by these neighbors is . In this situation, selecting v as a new spreader may not bring any additional influence. In the opposite situation, selecting v will contribute to the spreading process by v itself and its normal neighbors. The former term can influence 1 node(v itself), and the latter can influence d_v − t_v nodes(normal neighbors) with probability p. Together, the expected number of nodes influenced by v is (2)

Under the first order of Taylor expansion, when p is small, the left term can be approximated by 1 − t_v p + o(t_v p). After further simplification, the whole equation becomes (3)

Then, the discounted degree of v can be defined as (4)

Note that in the original equation Eq (2), dd_v is always non-negative. However, in the simplified form Eq (4), dd_v may be negative with some special parameters. In this situation, we manually set dd_v to be 0. Fig 1 depicts the local topology considered by DegreeDiscount. In this toy model, d_v = 4, t_v = 1, and (5)

Download:

Fig 1. The diagram of DegreeDiscount.

Nodes filled by gray denote the selected spreaders, and others denote normal nodes.

https://doi.org/10.1371/journal.pone.0164393.g001

Generalized Degree Discount.

As DegreeDiscount models all the neighbors in Γ(v) equally, it ignores the differences among them. Take an extreme case as example, let s, t ∈ Γ(v) be two normal neighbors of node v. If all the neighbors of s itself are spreaders and all the neighbors of t itself are normal nodes, they should not be treated equally. Obviously, the probability that s is influenced by its own neighbors is far larger than t. When calculating the potential contribution of s, t towards v, the latter one should be given more weight. To make the original model more precisely, we propose the GeneralizedDegreeDiscount.

Similar to the analysis in DegreeDiscount, the probability that node v is not influenced by its spreader neighbors is . If v is not influenced by any of those neighbors, selecting v will enhance the total influence by v itself and its d_v − t_v normal neighbors. For any normal neighbor w ∈ Γ(v), the probability that w is not influenced by its own neighbors is also . In other words, w will bring additional influence to v with probability p. Together, the expected number of nodes that will be influenced by v is (6) where the summation is over all d_v − t_v normal neighbors of node v.

Departing from the conduction in DegreeDiscount, here we consider the second-order Taylor expansion for the left term and the first-order expansion for the right term: (7)

After further simplification, the equation becomes (8)

Then, the generalized discounted degree of v can be defined as (9)

Similar to the situation in DegreeDiscount, the simplified equation of gdd_v may also be less than zero. In our real implementations, we set gdd_v = 0 in this situation. Fig 2 depicts the local topology considered by GeneralizedDegreeDiscount. In this toy model, d_v = 4 and t_v = 1. Note that the summation is over all normal neighbors, i.e., {g, i, j} in this figure. As t_g = 2, t_i = 1 and t_j = 1, the generalized discounted degree of v is (10)

Download:

Fig 2. The diagram of GeneralizedDegreeDiscount.

Nodes filled by gray denote the selected spreaders, and others denote normal nodes.

https://doi.org/10.1371/journal.pone.0164393.g002

Compared to the formulation of DegreeDiscount(Eq (4)), the formulation of GeneralizedDegreeDiscount(Eq (9)) adds the last two terms. As the consideration of the latter one is deeper than the former one, GeneralizedDegreeDiscount should be more effective than DegreeDiscount. However, the difference between them is not so significant in real situations. In reality, a small fraction of spreaders must be selected and their influences broadcast with low probability. Because the number of spreaders is not large, usually t_v ≪ d_v for all nodes. Thus, in Eq (9), the fourth and fifth terms are smaller than the third term, which makes GeneralizedDegreeDiscount just similar to DegreeDiscount. In the Results section, we will compare the two methods numerically.

Computational efficiency

The GeneralizedDegreeDiscount is implemented in Algorithm 1. If we want to select l spreaders, the algorithm must run for l rounds. Let N be the number of nodes and 〈k〉 be the average degree. In each round, the selection scheme costs O(N), the neighbor finding scheme costs O(〈k〉²), and for each of those neighbors, the updating process costs O(〈k〉). Then, the total time cost of the algorithm is O(l(N + 〈k〉² + 〈k〉² ⋅ 〈k〉)) ≈ O(l(N + 〈k〉³)). In many networks, the average degree is far less than the number of nodes: 〈k〉 ≪ N. Thus the time cost of GeneralizedDegreeDiscount will be nearly O(lN), which is just linearly correlated with the scale of the network.

Algorithm 1 GeneralizedDegreeDiscount(G, l, p)

Initialize S = ∅

for all v ∈ V do

gdd_v = d_v

t_v = 0

end for

// iteratively select l spreaders

for i = 1 to l do

// select the node with the maximum generalized discount degree gdd_v

select u = arg max_v {gdd_v|v ∈ V∖S}

S = S ∪ {u}

NB = ∅

// find the nearest and next nearest neighbors of u and update t_v for v ∈ Γ(u)

for all v ∈ Γ(u) do

NB = NB ∪ {v}

t_v = t_v + 1

for all w ∈ Γ(v) do

NB = NB ∪ {w}

end for

// update gdd_v for all v ∈ NB

for all v ∈ NB do

sum_tw = 0

for all w ∈ Γ(v) do

if w ∉ S then

sum_tw = sum_tw + t_w

end if

end for

if gdd_v < 0 then

gdd_v = 0

end if

end for

return S

Results and Discussion

To evaluate the performance, we simulate the experiments using the Susceptible-Infected-Recovered(SIR) model. The SIR model was originally proposed as a model of the dynamics of the spread of disease. Due to the similarities between epidemic transmission and the spread of information, we use SIR to measure the spreading influence of individual nodes. In the SIR model, a node may assume one of three states(susceptible, infected and recovered). Specifically, susceptible individuals S in the model is analogous to individuals who are not aware of the information. Infected individuals I can be analogous to information carriers who are willing to spread information to their neighbors. Recovered individuals R are those who had previously received the information but later lost interest. To better simulate the real-world spreading process, we use the SIR model with limited contact [17]. At each time step, each infected node will randomly select a neighbor to contact, and will transmit the disease to its neighbor with probability p if the neighbor is susceptible. After the transmission process, the infected node will become a recovered node with probability q. The effective spreading rate λ is defined as p/q. When there are no infected nodes, the process stops, and we use the fraction of recovered nodes to measure the spreading influence.

Data Description

To evaluate the influences of different groups of spreaders selected by various methods, we conduct the experiments on the following four networks from different fields.

Enron [18]: An email communication network which covers all the email communication within Enron Corporation. Nodes in the network are email addresses and edges represent the email communications among them.
Cond-mat [19]: A collaboration network of scientists posting preprints to the condensed matter archive at arxiv.org between January 1, 1995 and March 31, 2005. Nodes in the network represent the scientists and edges represent the collaborations among them.
Gnutella [20]: A snapshot of the Gnutella peer-to-peer file sharing network at August 31 2002. Nodes represent hosts in the Gnutella network topology and edges represent connections between the Gnutella hosts.
Epinions [21]: A who-trust-whom online social network of the general consumer review site Epinions.com. All the trust relationships interact and form the Web of Trust, which is then combined with review ratings to determine which reviews are shown to the users.

For simplicity, we treat all the networks as undirected and unweighted, and discard self-loops and multiple links. Only the largest connected component of each network is considered. A Brief overview of the networks is shown in Table 1.

Download:

Table 1. The basic topological features of four real networks.

N and M are the numbers of nodes and edges. 〈k〉 is the average degree. d_max denotes the network diameter and 〈d〉 denotes the average shortest path length. r and cc are the assortative coefficient and clustering coefficient, respectively.

https://doi.org/10.1371/journal.pone.0164393.t001

Benchmark methods

In complex networks, many centrality indexes have been defined to measure the importance of nodes and links. It is believed that nodes with higher centrality are more influential than common nodes. Accordingly, one naive solution for the influence maximization problem is to select the top − l nodes with the highest centrality indexes. In this paper, we use centrality-based methods as the benchmark methods.

Degree is a basic local centrality index for nodes. The higher degree a node has, the more important it is. In a social network, a person with more followers or friends is likely to have a larger influence.

Betweenness [22, 23] measures the extent to which a node is located on the shortest paths between pairs of nodes in networks. (11) where σ_st denotes the number of shortest paths between a pair (s, t) of nodes, and σ_st(v) denotes the number of shortest paths between any pair of nodes that pass through v.

Closeness [24] is an evaluation of the geometric location of nodes. (12) where d(v, u) denotes the distance between nodes v and u. Some researchers have proposed other definitions of closeness to measure the locations of nodes [25, 26].

PageRank [27] evaluates the status of nodes in the random walking process in networks, which is also a core algorithm in the many search engines. (13) where N denotes the total number of nodes, Γ(v) denotes the set of neighbors of v, denotes the out-degree of node u, and d is a dumping factor. In real implementations, we set d = 0.85.

Coreness [28] is a well-established centrality index that focuses on the structure of networks. Kitsak et al. found that the most efficient spreaders are those located within the core of the network as identified by the k-shell decomposition analysis. The decomposition runs in an iterative way. Nodes are assigned to k shells according to their remaining degrees, which are obtained by the successive pruning of nodes with degrees smaller than k_s. However, the performance of k-shell decomposition is not stable, and many studies have sought to enhance its effectiveness [29–31]. Recently, Liu et al. [32] analyzed the structure of core-like groups in networks, and improved the accuracy of the k-shell decomposition by filtering out the redundant links. In Ref [33], Lü et al. discovered an important relation among degree, H-index and coreness. By constructing a suitable operator, they proved that degree, H-index and coreness were the initial, intermediate and steady states of a special sequences, respectively.

Effectiveness

We use the SIR model to compare the effectiveness of GeneralizedDegreeDiscount with DegreeDistance, SingleDiscount, DegreeDiscount and several centrality-based methods discussed before. In each implementation, a fraction of the nodes is selected as spreaders, and the information spreads according to the SIR process described above. The spreading influence is used to measure the effectiveness of the methods. For each method, the SIR process is repeated many times to ensure the stability of the results.

Fig 3 shows the numerical results of nine methods on four networks. The proposed GeneralizedDegreeDiscount outperforms all other methods on all four networks for almost all fraction of spreaders. Especially, as the fraction of spreaders increases, our method shows better and better performance. The only exception is in Enron network, when the fraction of spreaders is small, the performance of GeneralizedDegreeDiscount is slightly worse than DegreeDiscount. Compared with DegreeDiscount, the performance of GeneralizedDegreeDiscount is consistently better. One promising phenomenon observed in our method is that as the fraction of spreaders becomes larger, the performance differences becomes more significant. Numerical results confirm that GeneralizedDegreeDiscount is indeed a effective extension of DegreeDiscount. In all networks, Coreness and Closeness perform the worst among all methods. In Ref [31], Liu et al. found that nodes in high shells may not be influential because of the existence of core-like groups: groups of nodes that link very locally within themselves. For nodes in the core-like groups, the Coreness cannot reflect their location importance in the network, which reduces the accuracy of the k-shell decomposition process. Moreover, if nodes in the highest shell tend to links with one another, their influence areas may overlap significantly. Obviously, selecting those nodes as spreaders may cause a large fraction of the network to overlooked. The situation for Closeness is similar: nodes with high closeness values often distribute closely with one another.

Download:

Fig 3. The spreading influence of nine methods on four networks with different fractions of spreaders.

The parameters are λ = 1.1, q = 1/〈k〉 for all networks, and all results are obtained by averaging over 200 implementations of the SIR model.

https://doi.org/10.1371/journal.pone.0164393.g003

In addition, we test the validity of our method with different effective spreading rates. We fix the fraction of spreaders to be 1% of the scale of the networks and vary the effective spreading rate λ. The results are shown in Fig 4. As in the previous experiments, the proposed GeneralizedDegreeDiscount shows a clear advantage in maximizing the spreading influence over all networks under various effective spreading rates. Even if the effective spreading rates are below 1.0 in four networks, GeneralizedDegreeDiscount performs better than other methods.

Download:

Fig 4. The spreading influence of nine methods on four networks under different effective spreading rates.

The parameters are λ = 1.1, q = 1/〈k〉 for all networks, and all results are obtained by averaging over 200 implementations of the SIR model.

https://doi.org/10.1371/journal.pone.0164393.g004

Obviously, GeneralizedDegreeDiscount is an adaptive method which recalculates the gdd_v during each step of the spreaders selection processes, while the centrality-based benchmark methods are not. In this part, more comparisons are done among our methods and adaptive versions of Degree, Betweenness and Closeness. To make them adaptive, a simple node-removing process is conducted: in each iteration, the node with the maximum centrality is selected as a spreader, and then we remove it from the network and recalculates the new centrality. The whole process ends until all the spreaders are selected. In fact, the adaptive version of Degree is the same as SingleDiscount. Fig 5 shows the numerical results. Unlike the previous results, when considering the top spreaders with low effective rate, our GeneralizedDegreeDiscount does not performs well. Especially in Gnutella network, the performance of GeneralizedDegreeDiscount is worse than Betweenness-adaptive and Closeness-adaptive. As the clustering coefficient of Gnutella is so small, the spreaders selection process in the early iteration of GeneralizedDegreeDiscount is just similar to Degree, which may limit the performance of our method. In Figs 3 and 4, it can also be seen that the performance differences between GeneralizedDegreeDiscount and other methods are not so remarkable under small number of spreaders and low effective spreading rate. How to identify multiple influential spreaders in networks with low clustering coefficients is a challenging problem, and we leave it in the future.

Download:

Fig 5. The spreading influence of GeneralizedDegreeDiscount and three adaptive centrality-based methods under different effective spreading rates.

The numbers of spreaders are 100 in all networks, and the results are obtained by averaging over 200 implementations of the SIR model.

https://doi.org/10.1371/journal.pone.0164393.g005

Relations with other methods

In this subsection, we perform numerical comparisons among four degree-based methods: Degree, SingleDiscount, DegreeDiscount and GeneralizedDegreeDiscount. Though DegreeDistance is also a degree-based method, we do not consider it because there is no clear formulation to describe this method. The mathematical formulations of the four are listed below.

Degree
SingleDiscount
DegreeDiscount
GeneralizedDegreeDiscount

Obviously, the complexity of these methods increases one by one. These formulations indicate that DegreeDiscount has more terms in common with GeneralizedDegreeDiscount than the other two methods. In an extreme case, when p = 0, GeneralizedDegreeDiscount is exact the same as DegreeDiscount. To better clarify the difference between the methods, we set the fraction of spreaders to be 1% and calculate the similarity(the fraction of commonly selected spreaders) between GeneralizedDegreeDiscount and other methods. Fig 6 shows the results for the four networks. In all the networks, GeneralizedDegreeDiscount shows the best similarity with DegreeDiscount, normal similarity with SingleDiscount, and the worst similarity with Degree.

Download:

Fig 6. The similarities between GeneralizedDegreeDiscount and Degree, SingleDiscount, DegreeDiscount.

The fraction of spreaders is 1%.

https://doi.org/10.1371/journal.pone.0164393.g006

Conclusion

In this paper, we propose a novel degree-based heuristic, GeneralizedDegreeDiscount, which selects multiple spreaders and maximizes their spreading influence. In our method, when evaluating the potential influence of a candidate node v, the way in which its neighbors are treated depends on whether it has been selected as a spreader or not. Taking both of the situations into consideration, GeneralizedDegreeDiscount uses a heuristic scheme to evaluate the potential influence of all individuals in the network.

We analyze the computational complexity of our method and show that it is just linearly correlated with the network scale. Then, the performance of our method is evaluated in four real networks from different fields. Results show that our method outperforms several centrality-based methods and other heuristic methods in all cases, no matter how many spreaders we choose to select or what the effective spreading rate is.

The theoretical analysis about influence maximization problem is still lacking. Although it has long been proven that there is a strong connection between the spreading process and the percolation process [34, 35], few researches have discussed the relationship between influence maximization and percolation. Recently, Morone and Makse pointed out that the influence maximization problem could be mapped onto optimal percolation problem in random networks [36], which might shed light on a new trend of future researches [37, 38]. Besides, we have witnessed the rapid development of theories and methods for temporal networks [39, 40]. Further researches on the influence maximization problem in temporal networks may also be a promising direction [41, 42].

Supporting Information

S1 File. Enron.

Data of Enron email communication network.

https://doi.org/10.1371/journal.pone.0164393.s001

(RAR)

S2 File. Cond-mat.

Data of Cond-mat collaboration network.

https://doi.org/10.1371/journal.pone.0164393.s002

(RAR)

S3 File. Gnutella.

Data of Gnutella peer-to-peer network.

https://doi.org/10.1371/journal.pone.0164393.s003

(RAR)

S4 File. Epinions.

Data of Epinions social network.

https://doi.org/10.1371/journal.pone.0164393.s004

(RAR)

Acknowledgments

The authors would like to thank Yanyuan Su and Yirun Ruan for useful comments. This work is partially supported by the major pre-research project of National University of Defense Technology, HPCL open fund of National University of Defense Technology(201403-01), Postgraduate Innovation Fund of Hunan province(CX2015B010), and Postgraduate Innovation Fund of NUDT(B150203).

Author Contributions

Conceptualization: XJW.
Formal analysis: XJW XZ.
Funding acquisition: DYY.
Methodology: XJW.
Project administration: DYY.
Software: XJW.
Supervision: CLZ.
Validation: XJW.
Writing – original draft: XJW.
Writing – review & editing: XZ.

References

1. Yang J, Yao C, Ma W, Chen G. A study of the spreading scheme for viral marketing based on a complex network model. Physica A: Statistical Mechanics and its Applications. 2010;389(4):859–870.
- View Article
- Google Scholar
2. Chen D, Lü L, Shang MS, Zhang YC, Zhou T. Identifying influential nodes in complex networks. Physica A: Statistical Mechanics and its Applications. 2012;391(4):1777–1787.
- View Article
- Google Scholar
3. Li Q, Zhou T, Lü L, Chen D. Identifying influential spreaders by weighted LeaderRank. Physica A: Statistical Mechanics and its Applications. 2014;404(24):47–55.
- View Article
- Google Scholar
4. Lü L, Zhang YC, Yeung CH, Zhou T. Leaders in Social Networks, the Delicious Case. PLoS ONE. 2011;6:e21202. pmid:21738620
- View Article
- PubMed/NCBI
- Google Scholar
5. Chen DB, Gao H, Lü L, Zhou T. Identifying Influential Nodes in Large-Scale Directed Networks: The Role of Clustering. PLoS ONE. 2013;8:e77455. pmid:24204833
- View Article
- PubMed/NCBI
- Google Scholar
6. Chen DB, Xiao R, Zeng A, Zhang YC. Path diversity improves the identification of influential spreaders. Epl. 2013;104(6):5580–5596.
- View Article
- Google Scholar
7. Domingos P, Richardson M. Mining the network value of customers. In: Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining. ACM; 2001. p. 57–66.
8. Richardson M, Domingos P. Mining knowledge-sharing sites for viral marketing. In: Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining. ACM; 2002. p. 61–70.
9. Kempe D, Kleinberg J, Tardos É. Maximizing the spread of influence through a social network. In: Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining. ACM; 2003. p. 137–146.
10. Leskovec J, Krause A, Guestrin C, Faloutsos C, VanBriesen J, Glance N. Cost-effective outbreak detection in networks. In: Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM; 2007. p. 420–429.
11. Narayanam R, Narahari Y. A shapley value-based approach to discover influential nodes in social networks. Automation Science and Engineering, IEEE Transactions on. 2011;8(1):130–147.
- View Article
- Google Scholar
12. Zhao XY, Huang B, Tang M, Zhang HF, Chen DB. Identifying effective multiple spreaders by coloring complex networks. EPL (Europhysics Letters). 2015;108(6):68005.
- View Article
- Google Scholar
13. He JL, Fu Y, Chen DB. A Novel Top-k Strategy for Influence Maximization in Complex Networks with Community Structure. PloS one. 2015;10(12):e0145283. pmid:26682706
- View Article
- PubMed/NCBI
- Google Scholar
14. Zhang JX, Chen DB, Dong Q, Zhao ZD. Identifying a set of influential spreaders in complex networks. Scientific Reports. 2016;6:27823. pmid:27296252
- View Article
- PubMed/NCBI
- Google Scholar
15. Chen W, Wang Y, Yang S. Efficient influence maximization in social networks. In: Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM; 2009. p. 199–208.
16. Sheikhahmadi A, Nematbakhsh MA, Shokrollahi A. Improving detection of influential nodes in complex networks. Physica A: Statistical Mechanics and its Applications. 2015;436:833–845.
- View Article
- Google Scholar
17. Yang R, Wang BH, Ren J, Bai WJ, Shi ZW, Wang WX, et al. Epidemic spreading on heterogeneous networks with identical infectivity. Physics Letters A. 2007;364(3):189–193.
- View Article
- Google Scholar
18. Leskovec J, Lang KJ, Dasgupta A, Mahoney MW. Community structure in large networks: Natural cluster sizes and the absence of large well-defined clusters. Internet Mathematics. 2009;6(1):29–123.
- View Article
- Google Scholar
19. Newman ME. The structure of scientific collaboration networks. Proceedings of the National Academy of Sciences. 2001;98(2):404–409.
- View Article
- Google Scholar
20. Leskovec J, Kleinberg J, Faloutsos C. Graph evolution: Densification and shrinking diameters. ACM Transactions on Knowledge Discovery from Data (TKDD). 2007;1(1):2.
- View Article
- Google Scholar
21. Richardson M, Agrawal R, Domingos P. Trust management for the semantic web. In: The Semantic Web-ISWC 2003. Springer; 2003. p. 351–368. https://doi.org/10.1007/978-3-540-39718-2_23
22. Freeman LC. A set of measures of centrality based on betweenness. Sociometry. 1977; p. 35–41.
23. Freeman LC. Centrality in social networks conceptual clarification. Social networks. 1979;1(3):215–239.
- View Article
- Google Scholar
24. Bavelas A. Communication patterns in task-oriented groups. Journal of the acoustical society of America. 1950;22:725.
- View Article
- Google Scholar
25. Dangalchev C. Residual closeness in networks. Physica A: Statistical Mechanics and its Applications. 2006;365(2):556–564.
- View Article
- Google Scholar
26. Holme P, Kim BJ, Yoon CN, Han SK. Attack vulnerability of complex networks. Physical Review E. 2002;65(5):056109.
- View Article
- Google Scholar
27. Brin S, Page L. Reprint of: The anatomy of a large-scale hypertextual web search engine. Computer networks. 2012;56(18):3825–3833.
- View Article
- Google Scholar
28. Kitsak M, Gallos LK, Havlin S, Liljeros F, Muchnik L, Stanley HE, et al. Identification of influential spreaders in complex networks. Nature Physics. 2010;6(11):888–893.
- View Article
- Google Scholar
29. Zeng A, Zhang CJ. Ranking spreaders by decomposing complex networks. Physics Letters A. 2013;377(14):1031–1035.
- View Article
- Google Scholar
30. Liu JG, Ren ZM, Guo Q. Ranking the spreading influence in complex networks. Physica A: Statistical Mechanics and its Applications. 2013;392(18):4154–4159.
- View Article
- Google Scholar
31. Liu Y, Tang M, Zhou T, Do Y. Core-like groups result in invalidation of identifying super-spreader by k-shell decomposition. Scientific Reports. 2015;5:9602. pmid:25946319
- View Article
- PubMed/NCBI
- Google Scholar
32. Liu Y, Tang M, Zhou T, Do Y. Improving the accuracy of the k-shell method by removing redundant links: From a perspective of spreading dynamics. Scientific Reports. 2015;5:13172–13172. pmid:26277903
- View Article
- PubMed/NCBI
- Google Scholar
33. Lü L, Zhou T, Zhang QM, Stanley HE. The H-index of a network node and its relation to degree and coreness. Nature communications. 2016;7:10168. pmid:26754161
- View Article
- PubMed/NCBI
- Google Scholar
34. Pastorsatorras R, Vespignani A. Immunization of complex networks. Physical Review E. 2002;65:36104.
- View Article
- Google Scholar
35. Newman MEJ. Spread of epidemic disease on networks. Physical Review E. 2002;66(1).
- View Article
- Google Scholar
36. Morone F, Makse HA. Influence maximization in complex networks through optimal percolation. Nature. 2015;524(7563):65. pmid:26131931
- View Article
- PubMed/NCBI
- Google Scholar
37. Ji S, Lü L, Yeung CH, Hu Y. Effective spreading from multiple leaders identified by percolation in social networks. arXiv preprint arXiv:150804294. 2015;.
38. Hu Y, Ji S, Feng L, Yuliang J. Quantify and Maximise Global Viral Influence Through Local Network Information. arXiv preprint arXiv:150903484. 2015;.
39. Holme P, Saramäki J. Temporal networks. Physics reports. 2012;519(3):97–125.
- View Article
- Google Scholar
40. Holme P. Modern temporal network theory: a colloquium. The European Physical Journal B. 2015;88(9):1–30.
- View Article
- Google Scholar
41. Ren G, Wang X. Epidemic spreading in time-varying community networks. Chaos: An Interdisciplinary Journal of Nonlinear Science. 2014;24(2):023116.
- View Article
- Google Scholar
42. Starnini M, Machens A, Cattuto C, Barrat A, Pastor-Satorras R. Immunization strategies for epidemic processes in time-varying contact networks. Journal of theoretical biology. 2013;337:89–100. pmid:23871715
- View Article
- PubMed/NCBI
- Google Scholar

[ref1] 1. Yang J, Yao C, Ma W, Chen G. A study of the spreading scheme for viral marketing based on a complex network model. Physica A: Statistical Mechanics and its Applications. 2010;389(4):859–870.
View Article
Google Scholar

[2] View Article

[3] Google Scholar

[ref2] 2. Chen D, Lü L, Shang MS, Zhang YC, Zhou T. Identifying influential nodes in complex networks. Physica A: Statistical Mechanics and its Applications. 2012;391(4):1777–1787.
View Article
Google Scholar

[5] View Article

[6] Google Scholar

[ref3] 3. Li Q, Zhou T, Lü L, Chen D. Identifying influential spreaders by weighted LeaderRank. Physica A: Statistical Mechanics and its Applications. 2014;404(24):47–55.
View Article
Google Scholar

[8] View Article

[9] Google Scholar

[ref4] 4. Lü L, Zhang YC, Yeung CH, Zhou T. Leaders in Social Networks, the Delicious Case. PLoS ONE. 2011;6:e21202. pmid:21738620
View Article
PubMed/NCBI
Google Scholar

[11] View Article

[12] PubMed/NCBI

[13] Google Scholar

[ref5] 5. Chen DB, Gao H, Lü L, Zhou T. Identifying Influential Nodes in Large-Scale Directed Networks: The Role of Clustering. PLoS ONE. 2013;8:e77455. pmid:24204833
View Article
PubMed/NCBI
Google Scholar

[15] View Article

[16] PubMed/NCBI

[17] Google Scholar

[ref6] 6. Chen DB, Xiao R, Zeng A, Zhang YC. Path diversity improves the identification of influential spreaders. Epl. 2013;104(6):5580–5596.
View Article
Google Scholar

[19] View Article

[20] Google Scholar

[ref7] 7. Domingos P, Richardson M. Mining the network value of customers. In: Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining. ACM; 2001. p. 57–66.

[ref8] 8. Richardson M, Domingos P. Mining knowledge-sharing sites for viral marketing. In: Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining. ACM; 2002. p. 61–70.

[ref9] 9. Kempe D, Kleinberg J, Tardos É. Maximizing the spread of influence through a social network. In: Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining. ACM; 2003. p. 137–146.

[ref10] 10. Leskovec J, Krause A, Guestrin C, Faloutsos C, VanBriesen J, Glance N. Cost-effective outbreak detection in networks. In: Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM; 2007. p. 420–429.

[ref11] 11. Narayanam R, Narahari Y. A shapley value-based approach to discover influential nodes in social networks. Automation Science and Engineering, IEEE Transactions on. 2011;8(1):130–147.
View Article
Google Scholar

[26] View Article

[27] Google Scholar

[ref12] 12. Zhao XY, Huang B, Tang M, Zhang HF, Chen DB. Identifying effective multiple spreaders by coloring complex networks. EPL (Europhysics Letters). 2015;108(6):68005.
View Article
Google Scholar

[29] View Article

[30] Google Scholar

[ref13] 13. He JL, Fu Y, Chen DB. A Novel Top-k Strategy for Influence Maximization in Complex Networks with Community Structure. PloS one. 2015;10(12):e0145283. pmid:26682706
View Article
PubMed/NCBI
Google Scholar

[32] View Article

[33] PubMed/NCBI

[34] Google Scholar

[ref14] 14. Zhang JX, Chen DB, Dong Q, Zhao ZD. Identifying a set of influential spreaders in complex networks. Scientific Reports. 2016;6:27823. pmid:27296252
View Article
PubMed/NCBI
Google Scholar

[36] View Article

[37] PubMed/NCBI

[38] Google Scholar

[ref15] 15. Chen W, Wang Y, Yang S. Efficient influence maximization in social networks. In: Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM; 2009. p. 199–208.

[ref16] 16. Sheikhahmadi A, Nematbakhsh MA, Shokrollahi A. Improving detection of influential nodes in complex networks. Physica A: Statistical Mechanics and its Applications. 2015;436:833–845.
View Article
Google Scholar

[41] View Article

[42] Google Scholar

[ref17] 17. Yang R, Wang BH, Ren J, Bai WJ, Shi ZW, Wang WX, et al. Epidemic spreading on heterogeneous networks with identical infectivity. Physics Letters A. 2007;364(3):189–193.
View Article
Google Scholar

[44] View Article

[45] Google Scholar

[ref18] 18. Leskovec J, Lang KJ, Dasgupta A, Mahoney MW. Community structure in large networks: Natural cluster sizes and the absence of large well-defined clusters. Internet Mathematics. 2009;6(1):29–123.
View Article
Google Scholar

[47] View Article

[48] Google Scholar

[ref19] 19. Newman ME. The structure of scientific collaboration networks. Proceedings of the National Academy of Sciences. 2001;98(2):404–409.
View Article
Google Scholar

[50] View Article

[51] Google Scholar

[ref20] 20. Leskovec J, Kleinberg J, Faloutsos C. Graph evolution: Densification and shrinking diameters. ACM Transactions on Knowledge Discovery from Data (TKDD). 2007;1(1):2.
View Article
Google Scholar

[53] View Article

[54] Google Scholar

[ref21] 21. Richardson M, Agrawal R, Domingos P. Trust management for the semantic web. In: The Semantic Web-ISWC 2003. Springer; 2003. p. 351–368. https://doi.org/10.1007/978-3-540-39718-2_23

[ref22] 22. Freeman LC. A set of measures of centrality based on betweenness. Sociometry. 1977; p. 35–41.

[ref23] 23. Freeman LC. Centrality in social networks conceptual clarification. Social networks. 1979;1(3):215–239.
View Article
Google Scholar

[58] View Article

[59] Google Scholar

[ref24] 24. Bavelas A. Communication patterns in task-oriented groups. Journal of the acoustical society of America. 1950;22:725.
View Article
Google Scholar

[61] View Article

[62] Google Scholar

[ref25] 25. Dangalchev C. Residual closeness in networks. Physica A: Statistical Mechanics and its Applications. 2006;365(2):556–564.
View Article
Google Scholar

[64] View Article

[65] Google Scholar

[ref26] 26. Holme P, Kim BJ, Yoon CN, Han SK. Attack vulnerability of complex networks. Physical Review E. 2002;65(5):056109.
View Article
Google Scholar

[67] View Article

[68] Google Scholar

[ref27] 27. Brin S, Page L. Reprint of: The anatomy of a large-scale hypertextual web search engine. Computer networks. 2012;56(18):3825–3833.
View Article
Google Scholar

[70] View Article

[71] Google Scholar

[ref28] 28. Kitsak M, Gallos LK, Havlin S, Liljeros F, Muchnik L, Stanley HE, et al. Identification of influential spreaders in complex networks. Nature Physics. 2010;6(11):888–893.
View Article
Google Scholar

[73] View Article

[74] Google Scholar

[ref29] 29. Zeng A, Zhang CJ. Ranking spreaders by decomposing complex networks. Physics Letters A. 2013;377(14):1031–1035.
View Article
Google Scholar

[76] View Article

[77] Google Scholar

[ref30] 30. Liu JG, Ren ZM, Guo Q. Ranking the spreading influence in complex networks. Physica A: Statistical Mechanics and its Applications. 2013;392(18):4154–4159.
View Article
Google Scholar

[79] View Article

[80] Google Scholar

[ref31] 31. Liu Y, Tang M, Zhou T, Do Y. Core-like groups result in invalidation of identifying super-spreader by k-shell decomposition. Scientific Reports. 2015;5:9602. pmid:25946319
View Article
PubMed/NCBI
Google Scholar

[82] View Article

[83] PubMed/NCBI

[84] Google Scholar

[ref32] 32. Liu Y, Tang M, Zhou T, Do Y. Improving the accuracy of the k-shell method by removing redundant links: From a perspective of spreading dynamics. Scientific Reports. 2015;5:13172–13172. pmid:26277903
View Article
PubMed/NCBI
Google Scholar

[86] View Article

[87] PubMed/NCBI

[88] Google Scholar

[ref33] 33. Lü L, Zhou T, Zhang QM, Stanley HE. The H-index of a network node and its relation to degree and coreness. Nature communications. 2016;7:10168. pmid:26754161
View Article
PubMed/NCBI
Google Scholar

[90] View Article

[91] PubMed/NCBI

[92] Google Scholar

[ref34] 34. Pastorsatorras R, Vespignani A. Immunization of complex networks. Physical Review E. 2002;65:36104.
View Article
Google Scholar

[94] View Article

[95] Google Scholar

[ref35] 35. Newman MEJ. Spread of epidemic disease on networks. Physical Review E. 2002;66(1).
View Article
Google Scholar

[97] View Article

[98] Google Scholar

[ref36] 36. Morone F, Makse HA. Influence maximization in complex networks through optimal percolation. Nature. 2015;524(7563):65. pmid:26131931
View Article
PubMed/NCBI
Google Scholar

[100] View Article

[101] PubMed/NCBI

[102] Google Scholar

[ref37] 37. Ji S, Lü L, Yeung CH, Hu Y. Effective spreading from multiple leaders identified by percolation in social networks. arXiv preprint arXiv:150804294. 2015;.

[ref38] 38. Hu Y, Ji S, Feng L, Yuliang J. Quantify and Maximise Global Viral Influence Through Local Network Information. arXiv preprint arXiv:150903484. 2015;.

[ref39] 39. Holme P, Saramäki J. Temporal networks. Physics reports. 2012;519(3):97–125.
View Article
Google Scholar

[106] View Article

[107] Google Scholar

[ref40] 40. Holme P. Modern temporal network theory: a colloquium. The European Physical Journal B. 2015;88(9):1–30.
View Article
Google Scholar

[109] View Article

[110] Google Scholar

[ref41] 41. Ren G, Wang X. Epidemic spreading in time-varying community networks. Chaos: An Interdisciplinary Journal of Nonlinear Science. 2014;24(2):023116.
View Article
Google Scholar

[112] View Article

[113] Google Scholar

[ref42] 42. Starnini M, Machens A, Cattuto C, Barrat A, Pastor-Satorras R. Immunization strategies for epidemic processes in time-varying contact networks. Journal of theoretical biology. 2013;337:89–100. pmid:23871715
View Article
PubMed/NCBI
Google Scholar

[115] View Article

[116] PubMed/NCBI

[117] Google Scholar

Figures

Abstract

Introduction

Materials and Methods

Intuition and theory

Degree-based heuristics.

Generalized Degree Discount.

Computational efficiency

Results and Discussion

Data Description

Benchmark methods

Effectiveness

Relations with other methods

Conclusion

Supporting Information

S1 File. Enron.

S2 File. Cond-mat.

S3 File. Gnutella.

S4 File. Epinions.

Acknowledgments

Author Contributions

References