nach oben

Peer-to-Peer Networking and Applications

Erschienen in:

Open Access 19.09.2018

Identifying influential nodes in complex networks based on Neighbours and edges

verfasst von: Zengzhen Shao, Shulei Liu, Yanyu Zhao, Yanxiu Liu

Erschienen in: Peer-to-Peer Networking and Applications | Ausgabe 6/2019

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config

KI-gestützte Suche

Patentsuche

Aus

Abstract

Identifying the influential nodes is one of the research focuses in network information mining. Many centrality measures used to evaluate influence abilities of nodes can’t balance between high accuracy and low time complexity. The NL centrality based on the neighbors and importance of edges is proposed which considers the second-degree neighbor’s impact on the influence of a node and utilizes the connectivity and unsubstitutability of edge to distinguish topological position of a node. In order to evaluate the accuracy of NL centrality, the SIR model is used to simulate the process of virus propagation in four real-world networks. Experiment results of monotonicity, validity and efficiency demonstrate that the NL centrality has a competitive performance in distinguishing the influence of nodes and it is suitable for large-scale networks because of the high efficiency in computation.

This article is part of the Topical Collection: Special Issue on Fog/Edge Networking for Multimedia Applications

Guest Editors: Yong Jin, Hang Shen, Daniele D'Agostino, Nadjib Achir, and James Nightingale

The original version of this article was revised due to an Open Access cancellation after publication.

A correction to this article is available online at https://doi.org/10.1007/s12083-018-0686-5.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

1 Introduction

The problem of identifying influential nodes in complex networks has become a research focus owing to its wide applications [1] such as accelerating the speed of information with the help of influential individuals [2‐4], preventing the spread of a disease [5‐7] and identifying the key nodes in protein and brain networks to diagnose a disease [8‐10], etc.

Measuring the influence of nodes, link prediction and recommendation are the core problems in the field of network information mining and identifying influential nodes is the basis of the others. So far, researchers have proposed a variety of methods such as degree [11], betweenness [12], closeness [13], K-shell [14] and LC [15], etc. According to the considered information, these measures can be divided into three categories: local measure, semi-local measure and global measure [16]. Centralities in the category of local measures are straightforward and low time complexity in quantifying node importance because they only consider the number of nearest neighbors. Conversely, global measures have higher accuracy because the information of holistic network is used to determine the importance of a node. However, identifying influential nodes in complex networks relies on multiple computations by using global measures, so global measures are not suitable for large scale networks. Considering the low accuracy of local measures and the high time complexity of global measures, some centralities in the category of semi-local measures are proposed which are more valid by using competitive information and can maintain a relatively low time complexity. Researchers cast about for more stable and efficient characteristics to determine the node’s importance recently [17]. DIL centrality [18] is one of the novel measures. DIL centrality innovatively uses the importance of edges to identify the influential nodes, it portrays the importance of edges from the perspective of connectivity ability and alternative index and can efficiently find the most influential bridge nodes. From the point of considered information, DIL centrality belongs to local measure. Although DIL centrality is more exact than other local measures, it still has a relatively large defect of ranking all the nodes with degree of 1 as same.

In this paper, the NL centrality belonging to semi-local measures is proposed to identify the influential nodes in networks which considers semi-local structure and topological position simultaneously. The semi-local structure of a node is defined as the number of neighbors within 2-steps. Because considering second-degree neighbors of a node has better effect on ranking influential nodes. At same time, we assume that the topological position of a node is related to the edges which connected to the node. If an edge attached has strong connectivity, the edge occupies an important position in the topology. Correspondingly, the position of the two nodes connected to the edge is more important. To obtain the influence ability of node, the SIR model is used [19] to simulate the process of virus propagation in four complex networks. In SIR model, the influence ability of a node is defined as the average propagation range. The experimental result of monotonicity, accuracy and time complexity show that NL centrality has a better performance in ranking influential nodes. At the same time, the results prove the the rationality of hypothesis, which predicates that we can find some new measures by considering the semi-local structure and topological position of a node.

Many centrality measures are devised to rank nodes according to their influence ability. All these measures have their own points and shortcomings. Local measures such as degree and DIL centralities are simple. But, they are low accuracy when used alone because they neglect the global information about networks. In contrast, global measures such as betweenness centrality, closeness centrality and K-shell centrality can achieve high accuracy and computational complexity in ranking nodes, so they are not suitable for large scale networks. In recent years, some researchers proposed several semi-local centralities as a balance between high accuracy and low time complexity. Under the premise of controlling computational complexity, semi-local measures consider the multiple characteristics of the nodes as much as possible. Some of the most popular centralities in this category are LC [15] and LSC [20] centralities. Several symbols in this paper are shown in Table 1.

Table 1

List of important symbols

Symbol	Description	Illustration
V	Set of nodes	V = {v₁, v₂, …, v_n}
N	Size of set V
E	Set of edges	E = {e₁₂, e₁₃, …, e_ij}, i ≠ j
M	Size of set E
G	An unweighted and undirected network	G = (V, E)
Г(v₁)	Set of the nearest neighbors of node v	Г(v₁) = {v₂, v₃, …}
<k>	Average degree of nodes
e_uv	The edge is connected to node u and node v	e_uv ∈ E
$ {I}_{e_{uv}} $	Importance of edge e_uv	$ {I}_{e_{12}} $=12
ks	An integer value which means the position of a node	ks = 1
β	Spreading probabilities of SIR model
β_c	Epidemic threshold	β_c = < k > /(<k² > − < k>)
τ	Kendall’s tau correlation coefficient
σ	Average number of recovered nodes when spreading stops
M(∙)	The monotonicity of the corresponding method

G = (V, E) is a graph which has n = |V| nodes and m = ∣ E∣ edges. It also can be defined as an adjacent matrix A = {a_uv} ∈ R^{n, n}, where a_uv = 1 if node u and node v are directly connected. If not,a_uv = 0. Degree centrality is the most basic method which belongs to the category of local centrality. It is defined as:

$$ {C}_D(v)=\sum \limits_{v\in V}{a}_{uv}. $$

(1)

A relatively novel method in this category is DIL centrality [18]. Most central measures ignore the the edges when evaluating the influence of a node, however, each edge may have potential meaning to network structures [21‐23]. So DIL uses the degree value and importance of edges to identify influential nodes. The weight of edge is defined as its connectivity and fungibility in DIL centrality. The DIL centrality of node v can be defined as:

$$ {C}_{DIL}(v)={k}_v+\sum \limits_{u\in \Gamma (v)}{I_{e_{uv}}}^{\ast}\frac{k_v-1}{k_v+{k}_u-2}, $$

(2)

where k_v denotes the degree of node v and $ {I}_{e_{uv}}=\frac{C}{\theta } $ is defined as the importance of edge e_uv. DIL portrays the importance of edge from the perspective of connectivity and fungibility which can be expressed as C and θ severally. C = (k_u − t − 1) ∗ (k_v − t − 1),$ \theta =\frac{t}{2}+1 $, where t is the number of the triangle whose one side is e_uv.

Closeness centrality [13] counts among the global measures which characterizes the control of nodes on the network flow spread in the shortest path [24, 25] and is defined as:

$$ {C}_C(v)=\frac{1}{\sum_{u\in V}{d}_{uv}}, $$

(3)

where d_uv denotes the distance of shortest path from node u to node v and u ≠ v. The larger the closeness centrality of a node is, the closer the neighbors link.The time complexities of closeness centrality is O(n³) because of the process of finding the shortest path between all the nodes. Closeness centrality flows the premise that information is always propagating along the shortest path. But in real life, the spread of grippe, news and so on is not necessarily through the shortest path.

Kitsak et al. [14] firstly proposed the K-shell centrality. They found that the node with a core position were more important than the node in the margin of the network. The decomposition process of K-shell centrality needs to continuously delete nodes from the network. At first, the nodes which have only one nearest neighbors are removed and the degree values of residual nodes are updated. After that, if there are some nodes with only one edge left, those are also removed till the degree values of all remain nodes are over 1. The removed nodes are considered to be at the extreme periphery and those ks value are assigned to 1. Next, repeat the above operation for all nodes whose degree value are two. The decomposition process will not end until all nodes in the network are deleted. By process of the K-shell centrality, the influence ability of nodes can be ordered by the value of ks. However, K-shell centrality assign the same ks value to a many nodes with different influence. In order to further improve the ability of K-shell centrality to distinguish influential nodes, some improvements are gradually proposed. For examole, Zeng et al. [26] found that some nodes with large degree are seriously underestimated. Considering the exhausted degree in the network decomposition, they propose a Mixed Degree Decomposition method which improved the K-shell centrality. Wang Z et al. [27, 28] utilized the iterations in the process of K-shell decomposition to assign the node position with decimal rather than integer. Compared with K-shell centrality, it can further distinguish the influential nodes. At the same time, Lin et al. [29] proposed INK method based on neighbors to identify the node influence with largest k-core values.

With the popularity of the mobile Internet, the scale of social network is consistently expanding. Hence, measures with low time complexity and high accuracy for this challenging task are needed. Semi-local measures are proposed for this reason. LC centrality is the most important measure which uses the semi-local structure to identify the influence ability of node. LC centrality of node v is proposed as a trade-off between local and global measures. LC centrality can be defined as:

$$ {C}_L(v)=\sum \limits_{u\in \Gamma (v)}\sum \limits_{w\in \Gamma (u)}\varphi (w), $$

(4)

where φ(w) is the number of the nearest and the next nearest neighbors of node w. LC centrality outperforms local measures because it considers the information from fourth-degree neighbors. But it neglects the effect of topological position upon the influence of a node. On this basis, Gao et al. [20] proposed local structural centrality (LSC) based on the topological connections among the neighbors.

The fundamental reason for the different importance of node is the heterogeneity of network structure. Researchers have proposed a number of important node mining methods that provide alternatives from different perspectives. In addition to the above tree categories, some measures identify the influence ability of node from the perspective of network connectivity, node deletion, node efficiency and so on. For example, PageRank [30] and LeaderRank [31] based on random walk are used to sort nodes according to the link relationship between web pages. Cao et al. [32] used D-S evidence theory to integrate several influential factors effectively, etc. Ranking nodes in complex networks is a valuable research topic, and many scholars have made their efforts.

3 Our method

In fact, there is a threshold value of considered neighbors. Liu et al. [33] found that, considering the neighbors within 2-step would get a balance between low time complexity and high accuracy. When considering more steps of neighbour, the improvement of ranking accuracy is not obvious and even negative. But LC centrality considers the neighbors of a node within third-degree even fourth-degree which has a certain negative impact on accuracy and time complexity. Taking node 1 and node 13 as examples. According to Eq. (4), the computing process of LC centrality for node 13 isC_L(v₁₃) = Q(v₂₃) + Q(v₁₇) + Q(v₁₆) + Q(v₁₅), Q(v₁₇) = φ(v₁₈) + φ(v₁₃) + φ(v₁₆) + φ(v₁₉). Node 16, 17, 19 and 20 are the first-degree neighbors of node 18, node 13, 15 and 24 are the second-degree neighbors of node 18. Therefore, when calculating the LC centrality of node 13, the information about fourth-degree neighbor (Node 24) is used. Although LC centrality considers more information, its accuracy is not proportional to the steps of considered nodes. Node 1 and node 13 in Fig. 1 are a good illustration of this problem. The infected numbers of node 1 and node 13 under different spreading probabilities (β) are shown in Table 2. The number of infected nodes is averaged through 1000 simulated propagation processes by SIR model. As shown in Table 2, infected number of node 1 is larger than that of node 13 with different spreading probabilities, demonstrating that node 1 is more important than node 13. But C_L(v₁) = 115, C_L(v₁₃) = 123. The conclusion drew by LC centrality is contrary to that came by SIR model. The ranked nodes by LC centrality in Fig. 1 are listed in Table 3. Therefore, we can draw the conclusion that the multi-step neighbors considered by LC centrality have a negative effect on its accuracy. However, it is undeniable that the accuracy of semi-local centrality is better than that of local measures.

Table 2

The infected numbers of node 1 and node 13 under different spreading probabilities

Node	0.05	0.1	0.15	0.2	0.25	0.3
1	0.41	0.77	1.12	1.77	2.32	2.73
13	0.35	0.46	1.03	1.13	2.14	2.69

Table 3

The ranked nodes by LC centrality

V	C_L(v)	Rank	v	C_L(v)	Rank	v	C_L(v)	Rank	v	C_L(v)	Rank
13	115	1	15	79	7	7	57	9	21	19	13
16	106	2	2	75	8	8	57	9	9	19	13
1	104	3	3	75	8	14	53	10	10	19	13
17	99	4	23	75	8	19	53	10	11	19	13
4	93	5	5	57	9	20	30	11	12	19	13
18	80	6	6	57	9	22	25	12	24	9	14

Based on the hypothesis that the influence ability of a node is related to the semi-local information as well as its topological position, a novel NL centrality based on the second-degree neighbors and the importance of edges is proposed. Compared with LC centrality, the semi-local structure is defined as the information about second-degree neighbors of a node in NL centrality. Because considering the neighbors of a node within 2-steps can balance the two aspects of accuracy and efficiency while the performance is not obviously improve or even decreased when considering more steps of neighbors in most cases [33]. Using the semi-local structure instead of the global information, it is not have to know the global information and the accuracy of measure is ensured while reducing the time complexity. On the other hand, the importance of the edge connected to a node is used to represent its topological position. If an edge attached to a node with strong connectivity, the edge occupies an important position in the topology. Correspondingly, the position of the two nodes connected to the edge is more important. To verify the hypothesis, NL centrality is compared with other centralities in the experimental part.

The NL centrality of node v is defined as:

$$ {C}_{NL}(v)=\sum \limits_{u\in \Gamma (v)}\left(\varphi (u)+{I_{e_{uv}}}^{\ast}\frac{k_v-1}{k_v+{k}_u-2}\right), $$

(5)

where Г(v) is the set of the nearest neighbors of node v, φ(u) is the number of neighbors within 2-steps of node u,$ {I}_{e_{uv}} $ is the importance of edge e_uv defined in (2) and k_v is the degree value of node v.

4 Experiments

4.1 Datasets

In experiments, four complex networks are used, including: (1) Dolphin [34] communication network which has 62 nodes and 159 edges. (2) Word [35] adjacency network of a book. A node means a noun or an adjective and the edge indicates that two words occur in adjacent positions. (3) Email [36] communication network. The node is a receiver or sender and edge represents email exchange. (4)Grid [37]. A node is a generator, transformator or substation. The edge represents a power supply line in real life. In the process of experiment, all four real-world networks are regarded as undirected and unweight networks.

Table 4 shows the specific properties of those networks including number of nodes (n), number of edges (m), average degree (<k>), maximum degree (k_max), assortativity coefficient (ε), clustering coefficient (cl), epidemic threshold (β_c) and infection probability (β) of SIR model used in section 5.3.

Table 4

Properties of the real networks

Network	n	m	<k>	k_max	ε	cl	β_c	β
Dolphins	62	159	5.1290	12	−0.0436	0.2590	0.1723	0.18
Word	112	425	7.59	49	−0.129	0.173	0.078	0.08
Email	1113	5451	9.6222	71	0.0782	0.2202	0.0565	0.06
Grid	4941	6594	2.6691	19	0.0034	0.103	0.126	0.13

4.2 The SIR model

The SIR model [19, 38] is the most classical epidemic model, where S represents the susceptible state, I means the infected state and R represents recovered state. The nodes in the state of infected can infect their susceptible neighbors, at the same time, the infected nodes turn into the state of recovered and cannot sicken. To get the authentic influence ability of node, SIR model is used to simulate the transmission process of virus in networks. Originally only one node is in the infected state and the remaining nodes are susceptible. Every time the infected nodes infect their susceptible neighbors with probability β which is defined as infection probability. Until all the nodes are susceptible or recovered, the propagation process stops. The proportion of recovery nodes is considered to be the influential capability of the original node. In this paper, the diffusion was simulated for 100 times and the average number of recovered nodes is used as the influence of a node.

4.3 Evaluations

In order to distinguish the importance of all nodes, each node should be assigned with a unique index by centrality measures. The proportion of repetitive elements in a sequence is called the monotonicity of the sequence. To quantify the monotonicity of different ranking methods, M(R) [39, 40] is used and it is defined as:

$$ M(R)={\left[1-\frac{\sum_{r\in R}{N}_r\left({N}_r-1\right)}{N\left(N-1\right)}\right]}^2, $$

(6)

where N is the size of the list R and N_r is the number of nodes whose rank indexes are r. Because of N_r ≤ N, the value of M(R) is between 0 and 1.0. The larger the value of M(R) is, the more monotonous the ranking method performs. When the value of M(R) is 1.0, the ranking results of elements in list R differ from each other. When the value of M(R) is 0, all nodes are in the same rank.

In order to evaluate the accuracy of centrality measures, the correlation coefficient (τ) [41] is employed to quantify the correctness. Consider two ranking results R₁ and R₂, the number of their elements is N.$ \left({r}_i^1,{r}_i^2\right) $and $ \left({r}_j^1,{r}_j^2\right) $ are randomly selected pair from R₁ and R₂, respectively. If they satisfy $ {r}_i^1>{r}_i^2 $and$ {r}_j^1>{r}_j^2 $ or $ {r}_i^1<{r}_i^2 $and$ {r}_j^1<{r}_j^2 $, the$ \left({r}_i^1,{r}_i^2\right) $and $ \left({r}_j^1,{r}_j^2\right) $ are said to be concordant. If $ {r}_i^1>{r}_i^2 $and$ {r}_j^1<{r}_j^2 $ or $ {r}_j^1<{r}_j^2 $and$ {r}_j^1>{r}_j^2 $, they are considered to be discordant.

Kendall’s coefficient of concordance (τ) is a statistical value that calculates the correlation of multiple rank variables. The Kendall’s coefficient of concordance τ is defined as:

$$ \uptau \left(\mathrm{R}\right)=\frac{n_c-{n}_d}{0.5\ast N\ast \left(N-1\right)}, $$

(7)

where n_c is the number of concordant pair and n_d is the number of discordant pair. The accuracy can be evaluated by using the Kendall coefficient to calculate the correlation between the ranking results obtained from the propagation process and the ranking results obtained by centrality calculation.

5 Results

In experiments, the NL centrality is compared with other five centrality measures which belong to various categories including DC (degree centrality), KS (K-shell centrality), CC (closeness centrality), LC (local centrality) and DIL (DIL centrality). We use different methods to analyze the performance of NL centrality from three aspects: monotonicity, accuracy and efficiency.

5.1 Monotonicity

Firstly, we use schematic network in Fig.1 to illustrate the monotonicity of NL centrality. Some well-known measures are chosen for comparison.Table 5 shows the ranking results identified by DC, KS, CC, LC, DIL and NL. In Table 5, the “rank” column denotes the ranking result, while the other columns represent the node number of schematic network corresponding to different measures. Taking the first row as an example, the ranking result calculated by the degree centrality of node 1 is 1, so we put the node number 1 in the first row of the DC column. The nodes with same ranking results are placed in the one row and the “Others” refers to the remaining nodes.

Table 5

Ranking results of the schematic network generated by different measures

Rank	DC	KS	CC	LC	DIL	NL
1	1	13,14,15,16,17,18,19,23	4	1	1	1
2	13,16,17,18	Others	1,23	16	13	4
3	2,3,4,14,15,23		13	13	23	23
4	19,20		14	17	4	13
5	Others		2,3	4	18	2,3
6			16	18	14	18
7			22	15	16	16
8			17	2,3,23	15	15
9			5,6,7,8,15	5,6,7,8	2	14
10			18	14,19	3	17
11			21	20	17	5,6,7,8
12			9,10,11,12,19	22	20	20
13			20	9,10,11,12,21	19	19
14			24	24	Others	Others
15						24

From Table 5, we can find that NL centrality divides the importance of nodes in Fig. 1 into 15 ranks and most of them only have one node. If there are multiple node numbers in a row, the method corresponding to this column considers these nodes as equally important. K-shell centrality rank the importance of nodes into two sequences, which means that almost all of nodes are equally important. This is obviously unreasonable. Moreover, Closeness, LC and DIL centralities have the same performance. It’s remarkable that DIL centrality in the category of local measure can achieve the same effect as closeness centrality in the category of global measures, illustrating that DIL algorithm is correct in considering edges to identify influential nodes. By and large, the NL centrality is the most effective measure in distinguishing the influence ability of nodes, because only NL centrality divides 24 nodes into 15 ordered sequences. Other competitive methods such as closeness, LC and DIL centralities also have a good performance.

Secondly, the monotonicity of NL centrality is further compared by using four real networks. The values of M(∙) are compared in Table 6 where the “Network” column indicates the names of several real networks and other columns show the monotonicity of the corresponding method. From Table 6, we can find that the value of M(NL) is approximate to 1.0 in all four networks which is markedly larger than the values of M(DC) and M(KS). Therefore, NL centrality has strongest ability to distinguish different influence of nodes. Table 6 also shows that monotonicity of degree centrality presents great changes on different networks because it only utilizes local information. And K-shell centrality has a limited ability to distinguish the position of node because it often assigns the same values for many nodes, causing the monotonicity of all four networks to be unsatisfactory. The experimental results also reveal that NL centrality has a relatively stable monotonicity performance with the expansion of network.

Table 6

The result of monotonicity

Network	M(DC)	M(KS)	M(CC)	M(LC)	M(DIL)	M(NL)
Dolphins	0.904812	0.613961	0.986779	0.997885	0.980962	0.998942
Word	0.930663	0.773971	0.991795	0.999839	0.9926	0.999839
Email	0.943967	0.917951	0.999412	0.999929	0.981253	0.999944
Grid	0.866198	0.811454	0.995714	0.993049	0.978278	0.994511

5.2 Correctness

The correlations between the numbers of infected nodes (σ) and the ranking indices given by centralities are plotted in Fig. 2. In the SIR model, the β should be moderate. If β is much smaller than epidemic threshold (β_c), the epidemic only spread in a very small region and the number of recovered nodes cannot be measured. On the contrary, if the value of β is much larger than the epidemic threshold, the epidemic can easily outbreak over almost entire network, causing every node can infect all other nodes. In order to achieve the appropriate range of influence, the value of β should be slightly larger than β_c where β_c is epidemic threshold. The values of β_c and β are presented in Table 4. In Fig. 2, σ is the average number of recovered nodes when spreading stops, the abscissa indicates the ranking indices given by centrality measure. Figure 3 shows that the NL centrality is significantly correlated with number of recovered nodes and has the best performance in Word and Email network. Besides, the traditional measures such as the K-shell and closeness centralities have little correlation with the influence capability and the result calculated by centrality measure.

In order to compare the validity of centrality measures, the value of τ is used to evaluate the accuracy of centrality measures in determining influential nodes. A statistical value τ is used to calculate the correlation of the ranking list generated by centrality measures and SIR model respectively. The comparative results are shown in Fig. 3. In Fig. 3, we can find that NL centrality nearly has the maximum value of τ in four real networks, especially in Grid, which means that the NL centrality outperforms other centrality measure upon most occasions. According to the result, degree centrality have a best performance when the infection probabilities are small in all four networks. We know that when β is much smaller than β_c, the node can only infect their neighbors. So, the influence ability of a node depends on the number of its neighbors, so when β too small, local measures have the best performance. In fact, that is the reason why DIL centrality performs better than NL centrality when infection probabilities β is less than 0.05 in the Dolphins network. When β grows larger, NL centrality can identify influential nodes more accurately. For example, the τ of NL centrality is maximum When β is larger than 0.04 in Word. A similar situation can also be found in Fig. 3b. The results of four networks sufficiently attest that the NL centrality has a good performance on identifying influential nodes within the overwhelming majority of β.

5.3 Efficiency

The scale of current network is consistently increasing. Most traditional measures cannot balance the two aspects of high accuracy and low time complexity. Hence, effective and universal measures are needed to solve the problem. The categories and time complexities of different centrality measures are discussed and the results are shown in Table 7. From the Table 7, we can observe that the time complexities of DC and KS are O(n) which are the lowest, but they do not perform well as shown in Fig. 3. The time complexity of NL is O(n < k>²) which is equal to that of LC and DIL, but the NL centrality performs obviously better than others in terms of monotonicity and correctness. Table 7 indicates that NL centrality has low time complexity and can be used in large-scale networks.

Table 7

The time complexity of six measures

Method	Category	Time complexity
DC	Local measure	O(n)
KS	Global measure	O(n)
CC	Global measure	O(n³) or O(n²log n + nm)
LC	Semi-local measure	O(n < k>²)
DIL	Local measure	O(n < k>²)
NL	Semi-local measure	O(n < k>²)

6 Conclusions

How to find the most influential nodes in a large scale network by quantitative analysis, or to evaluate the influence of a node to other or more nodes is one of the important problems to be solved in the complex network research. In this paper, the NL centrality is proposed to find influential nodes in complex networks which considers semi-local structure and topological position simultaneously. Using the semi-local structure instead of the global information, the accuracy of measure is ensured while reducing the time complexity. On the other hand, the importance of the edge connected to a node is used to represent its topological position. We assume that if an edge attached to a node with strong connectivity, the edge occupies an important position in the topology. Correspondingly, the position of the two nodes connected to the edge is more important. In experiment, we apply the NL centrality on real networks and use the SIR model to simulate the spreading process. Compared with degree, DIL, closeness, LC and K-shell centralities, NL centrality can give excellent performance in terms of monotonicity, correctness and time complexity on four real networks. The experiments show that NL centrality can balance on high accuracy and low time complexity. At the same time, we also prove the hypothesis that the importance of a node is not only related to the semi-local structure, but also its topological position.

Although NL centrality performs well than other well-known measures, it ignores the dynamics of the network. Especially, the structure of time-varying network is changed and all kinds of indicators of nodes are dynamic. So, the importance of nodes changes at any time and the dynamic properties of the network are the most important factors. Whether NL centrality is applicable to dynamic networks is the focus of our next research.

Acknowledgements

This work was supported by China Postdoctoral Science Foundation (No. 2016 M592697) and Key Science and Technology Project of Shandong Province of China (No. 2014GGH201022).

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Vorheriger Artikel Time delay estimation algorithm based on virtual Array and MUSIC for single sensor system

Nächster Artikel Correction to: Identifying influential nodes in complex networks based on Neighbours and edges

poDe DM, Solé-Ribalta A, Omodei E, Gómez S, Arenas A (2015) Ranking in interconnected multilayer networks reveals versatile nodes. Nat Commun 6:6868

Sheikhahmadi A, Nematbakhsh MA, Shokrollahi A (2015) Improving detection of influential nodes in complex networks. Physica A 436:833–845

Hinz O, Schulze C, Takac C (2014) New product adoption in social networks: why direction matters. J Bus Res 67(1):2836–2844

Liu JG, Ren ZM, Guo Q, Wang BH (2013) Node importance ranking of complex networks. Acta Phys Sin 62(17):178901–178901

Cai Y, Li Y, Cao Y, Li W, Zeng X (2017) Modeling and impact analysis of interdependent characteristics on cascading failures in smart grids. Int J Electr Power Energy Syst 89:106–114

Mieghem PV, Omic J, Kooij R (2009) Virus spread in networks. IEEE/ACM Trans Networking 17(1):1–14

Lloyd AL, May RM (2001) How viruses spread among computers and people. Science 292(5520):1316–1317

Sporns O (2013) Structure and function of complex brain networks. Dialogues Clin Neurosci 15(3):247–262

Moussa MN, Vechlekar CD, Burdette JH, Steen MR, Hugenschmidt CE, Laurienti PJ (2011) Changes in cognitive state alter human functional brain networks. Front Hum Neurosci 5(12):83

10.

Alvarez MJ, Shen Y, Giorgi FM, Lachmann A, Ding BB, Ye BH (2016) Functional characterization of somatic mutations in cancer using network-based inference of protein activity. Nat Genet 48(8):838–847

11.

Freeman LC (1978) Centrality in social networks conceptual clarification. Soc Networks 1(3):215–239

12.

Freeman LC (1977) A set of measures of centrality based on betweenness. Sociometry 40(1):35–41

13.

Sabidussi G (1966) The centrality index of a graph. Psychometrika 31(4):581–603MathSciNetMATH

14.

Kitsak M, Gallos LK, Havlin S, Liljeros F, Muchnik L, Stanley HE, Makse HA (2010) Identification of influential spreaders in complex networks. Nat Phys 6(11):888–893

15.

Chen D, Lü L, Shang MS, Zhang YC, Zhou T (2012) Identifying influential nodes in complex networks. Physica A 391(4):1777–1787

16.

Lü L, Chen D, Ren XL, Zhang QM, Zhang YC, Zhou T (2016) Vital nodes identification in complex networks. Phys Rep 650:1–63MathSciNet

17.

Berahmand K, Bouyer A, Samadi N (2018) A new centrality measure based on the negative and positive effects of clustering coefficient for identifying influential spreaders in complex networks. Chaos, Solitons Fractals 110:41–54MATH

18.

Liu J, Xiong Q, Shi W, Shi X, Wang K (2016) Evaluating the importance of nodes in complex networks. Physica A 452:209–219

19.

Li M, Zhang R, Hu R, Yang F, Yao Y, Yuan Y (2018) Identifying and ranking influential spreaders in complex networks by combining a local-degree sum and the clustering coefficient. Int J Mod Phys B 32(6):1850118MATH

20.

Gao S, Ma J, Chen Z, Wang G, Xing C (2014) Ranking the spreading ability of nodes in complex networks based on local structure. Physica A 403(6):130–147MATH

21.

Wang J, Hou X, Li K, Din Y (2017) A novel weight neighborhood centrality algorithm for identifying influential spreaders in complex networks. Physica A 475:88–105

22.

Mirzasoleiman B, Babaei M, Jalili M, Safari M (2011) Cascaded failures in weighted networks. Phys Rev E Stat Nonlinear Soft Matter Phys 84(2):046114

23.

Zhang CJ, Zeng A (2014) Network skeleton for synchronization: identifying redundant connections. Physica A 402(10):180–185MathSciNetMATH

24.

Guimerà R, Díaz-Guilera A, Vega-Redondo F, Cabrales A, Arenas A (2002) Optimal network topologies for local search with congestion. Phys Rev Lett 9(24):248701

25.

Yan G, Zhou T, Hu B, Fu ZQ, Wang BH (2006) Efficient routing on complex networks. Phys Rev E Stat Nonlinear Soft Matter Phys 73(2):046108

26.

Zeng A, Zhang CJ (2013) Ranking spreaders by decomposing complex networks. Phys Lett A 377(14):1031–1035

27.

Wang Z, Zhao Y, Xi J, Du C (2016) Fast ranking influential nodes in complex networks using a k-shell iteration factor. Physica A 461:171–181

28.

Wang Z, Du C, Fan J, Xing Y (2017) Ranking influential nodes in social networks based on node position and neighborhood. Neurocomputing 260:466–477

29.

Lin JH, Guo Q, Dong WZ, Tang LY, Liu JG (2014) Identifying the node spreading influence with largest k -core values. Phys Lett A 378(45):3279–3284MATH

30.

Franceschet M (2010) Pagerank: standing on the shoulders of giants. Commun ACM 54(6):92–101

31.

Li Q, Zhou T, Lü L, Chen D (2014) Identifying influential spreaders by weighted leaderrank. Physica A 404(24):47–55MathSciNetMATH

32.

Gao C, Wei D, Hu Y, Mahadevan S, Deng Y (2013) A modified evidential methodology of identifying influential nodes in weighted networks. Physica A S392(21):5490–5500MathSciNetMATH

33.

Liu Y, Tang M, Zhou T, Do Y (2016) Identify influential spreaders in complex networks, the role of neighborhood. Physica A 452(3):289–298

34.

Bian T, Deng Y (2017) A new evidential methodology of identifying influential nodes in complex networks. Chaos, Solitons Fractals 103(2):101–110MathSciNetMATH

35.

Bao ZK, Ma C, Xiang BB, Zhang HF (2016) Identification of influential nodes in complex networks: method from spreading probability viewpoint. Physica A 468MATH

36.

Ma Q, Ma J (2017) Identifying and ranking influential spreaders in complex networks with consideration of spreading probability. Physica A 465:312–330

37.

Watts DJ, Strogatz SH (2011) Collective dynamics of 'small-world' networks. Nature 1998:440–442MATH

38.

Castellano C, Pastorsatorras R (2010) Thresholds for epidemic spreading in networks. Phys Rev Lett 105(21):218701

39.

Knight WR (1966) A computer method for calculating kendall's tau with ungrouped data. Publ Am Stat Assoc 61(314):436–439MATH

40.

Yang F, Zhang R, Yang Z, Hu R, Li M, Yuan Y et al (2017) Identifying the most influential spreaders in complex networks by an extended local k-shell sum.Int. J. Mod Phys C 28(01):925–214MathSciNet

41.

Zhu C, Wang X, Zhu L (2017) A novel method of evaluating key nodes in complex networks. Chaos, Solitons Fractals 96:43–50MATH

Titel: Identifying influential nodes in complex networks based on Neighbours and edges
verfasst von: Zengzhen Shao
Shulei Liu
Yanyu Zhao
Yanxiu Liu
Publikationsdatum: 19.09.2018
Verlag: Springer US
Erschienen in: Peer-to-Peer Networking and Applications / Ausgabe 6/2019
Print ISSN: 1936-6442
Elektronische ISSN: 1936-6450
DOI: https://doi.org/10.1007/s12083-018-0681-x

V	C_L(v)	Rank	v	C_L(v)	Rank	v	C_L(v)	Rank	v	C_L(v)	Rank
13	115	1	15	79	7	7	57	9	21	19	13
16	106	2	2	75	8	8	57	9	9	19	13
1	104	3	3	75	8	14	53	10	10	19	13
17	99	4	23	75	8	19	53	10	11	19	13
4	93	5	5	57	9	20	30	11	12	19	13
18	80	6	6	57	9	22	25	12	24	9	14

V	C_L(v)	Rank	v	C_L(v)	Rank	v	C_L(v)	Rank	v	C_L(v)	Rank
13	115	1	15	79	7	7	57	9	21	19	13
16	106	2	2	75	8	8	57	9	9	19	13
1	104	3	3	75	8	14	53	10	10	19	13
17	99	4	23	75	8	19	53	10	11	19	13
4	93	5	5	57	9	20	30	11	12	19	13
18	80	6	6	57	9	22	25	12	24	9	14

Springer Professional

Abstract

Publisher’s Note

1 Introduction

2 Related works

3 Our method

4 Experiments

4.1 Datasets

4.2 The SIR model

4.3 Evaluations

5 Results

5.1 Monotonicity

5.2 Correctness

5.3 Efficiency

6 Conclusions

Acknowledgements

Publisher’s Note

Weitere Artikel der Ausgabe 6/2019

UAV-assisted wireless relay networks for mobile offloading and trajectory optimization

Heterogeneous mobile network access control technology based on mutual trust mechanism

Cyber-physical battlefield perception systems based on machine learning technology for data delivery

Fog computing support scheme based on fusion of location service and privacy preservation for QoS enhancement

Distributed consensus of networked markov jump multi-agent systems with mode-dependent event-triggered communications and switching topologies

Guest editorials: Special issue on fog/edge networking for multimedia applications

V	C_L(v)	Rank	v	C_L(v)	Rank	v	C_L(v)	Rank	v	C_L(v)	Rank
13	115	1	15	79	7	7	57	9	21	19	13
16	106	2	2	75	8	8	57	9	9	19	13
1	104	3	3	75	8	14	53	10	10	19	13
17	99	4	23	75	8	19	53	10	11	19	13
4	93	5	5	57	9	20	30	11	12	19	13
18	80	6	6	57	9	22	25	12	24	9	14