research-article

Perturb and combine to identify influential spreaders in real-world networks

Authors:
Antoine J.-P. Tixier

École Polytechnique, Palaiseau, France

École Polytechnique, Palaiseau, France
View Profile

,
Maria Evgenia G. Rossi

École Polytechnique, Palaiseau, France

École Polytechnique, Palaiseau, France
View Profile

,
Fragkiskos D. Malliaros

Centrale-Supélec, Gif-sur-Yvette, France

Centrale-Supélec, Gif-sur-Yvette, France
View Profile

,
Jesse Read

École Polytechnique, Palaiseau, France

École Polytechnique, Palaiseau, France
View Profile

,
Michalis Vazirgiannis

École Polytechnique, Palaiseau, France

École Polytechnique, Palaiseau, France
View Profile

ASONAM '19: Proceedings of the 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and MiningAugust 2019Pages 73–80https://doi.org/10.1145/3341161.3342866

Published:15 January 2020Publication History

ASONAM '19: Proceedings of the 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining

Pages 73–80

ABSTRACT

Some of the most effective influential spreader detection algorithms are unstable to small perturbations of the network structure. Inspired by bagging in Machine Learning, we propose the first Perturb and Combine (P&C) procedure for networks. It (1) creates many perturbed versions of a given graph, (2) applies a node scoring function separately to each graph, and (3) combines the results. Experiments conducted on real-world networks of various sizes with the k-core, generalized k-core, and PageRank algorithms reveal that P&C brings substantial improvements. Moreover, this performance boost can be obtained at almost no extra cost through parallelization. Finally, a bias-variance analysis suggests that P&C works mainly by reducing bias, and that therefore, it should be capable of improving the performance of all vertex scoring functions, including stable ones. An extended version of this paper is provided by [1].

References

A. J.-P. Tixier, M.-E. G. Rossi, F. D. Malliaros, J. Read, and M. Vazirgiannis, "Perturb and combine to identify influential spreaders in real-world networks," arXiv preprint arXiv:1807.09586, 2018.Google Scholar
F. Hoppensteadt, Mathematical Theories of Populations: Demographics, Genetics, and Epidemics. Siam, 1975, vol. 20.Google Scholar
J. Leskovec, L. A. Adamic, and B. A. Huberman, "The dynamics of viral marketing," ACM Transactions on the Web (TWEB), vol. 1, no. 1, p. 5, 2007.Google ScholarDigital Library
S. Pei, L. Muchnik, J. S. Andrade Jr, Z. Zheng, and H. A. Makse, "Searching for superspreaders of information in real-world social media," Scientific reports, vol. 4, p. 5547, 2014.Google ScholarCross Ref
K. Balog, L. Azzopardi, and M. De Rijke, "Formal models for expert finding in enterprise corpora," in Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval. ACM, 2006, pp. 43--50.Google Scholar
A. Tixier, F. Malliaros, and M. Vazirgiannis, "A graph degeneracy-based approach to keyword extraction," in Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 2016, pp. 1860--1870.Google Scholar
M. Kitsak, L. K. Gallos, S. Havlin, F. Liljeros, L. Muchnik, H. E. Stanley, and H. A. Makse, "Identification of influential spreaders in complex networks," Nature physics, vol. 6, no. 11, pp. 888--893, 2010.Google ScholarCross Ref
G. F. De Arruda, A. L. Barbieri, P. M. Rodríguez, F. A. Rodrigues, Y. Moreno, and L. da Fontoura Costa, "Role of centrality for the identification of influential spreaders in complex networks," Physical Review E, vol. 90, no. 3, p. 032812, 2014.Google ScholarCross Ref
F. D. Malliaros, M.-E. G. Rossi, and M. Vazirgiannis, "Locating influential nodes in complex networks," Scientific reports, vol. 6, p. 19307, 2016.Google ScholarCross Ref
F. Malliaros, C. Giatsidis, A. Papadopoulos, and M. Vazirgiannis, "The core decomposition of networks: Theory, algorithms and applications," 2019.Google Scholar
S. B. Seidman, "Network structure and minimum degree," Social networks, vol. 5, no. 3, pp. 269--287, 1983.Google ScholarCross Ref
V. Batagelj and M. Zaveršnik, "Generalized cores," arXiv preprint cs/0202039, 2002.Google Scholar
A. Adiga and A. K. S. Vullikanti, "How robust is the core of a network?" in Joint European Conference on Machine Learning and Knowledge Discovery in Databases. Springer, 2013, pp. 541--556.Google Scholar
A. V. Goltsev, S. N. Dorogovtsev, and J. F. F. Mendes, "k-core (bootstrap) percolation on complex networks: Critical phenomena and nonlocal effects," Physical Review E, vol. 73, no. 5, p. 056101, 2006.Google ScholarCross Ref
L. Breiman, "Bias, variance, and arcing classifiers," 1996.Google Scholar
L. Breiman, "Bagging predictors," Machine learning, vol. 24, no. 2, pp. 123--140, 1996.Google ScholarCross Ref
L. Breiman, "Random forests," Machine learning, vol. 45, no. 1, pp. 5--32, 2001.Google ScholarDigital Library
P. Erdös and A. Rényi, "On the evolution of random graphs," Publ. Math. Inst. Hung. Acad. Sci, vol. 5, pp. 17--61, 1960.Google Scholar
F. Chung and L. Lu, "The average distances in random graphs with given expected degrees," Proceedings of the National Academy of Sciences, vol. 99, no. 25, pp. 15 879--15 882, 2002.Google ScholarCross Ref
I. C. Ipsen and R. S. Wills, "Mathematical properties and analysis of googleâĂ&Zacute;s pagerank."Google Scholar
A. Y. Ng, A. X. Zheng, and M. I. Jordan, "Link analysis, eigenvectors and stability," in International Joint Conference on Artificial Intelligence, vol. 17, no. 1. LAWRENCE ERLBAUM ASSOCIATES LTD, 2001, pp. 903--910.Google Scholar
L. Page, S. Brin, R. Motwani, and T. Winograd, "The pagerank citation ranking: Bringing order to the web." Stanford InfoLab, Tech. Rep., 1999.Google Scholar
J. Leskovec and A. Krevl, "SNAP Datasets: Stanford large network dataset collection," http://snap.stanford.edu/data, Jun. 2014.Google Scholar
B. Klimt and Y. Yang, "The enron corpus: A new dataset for email classification research," ECML '04: Proceedings of the 15th European Conference on Machine Learning, pp. 217--226, 2004.Google Scholar
W. O. Kermack and A. G. McKendrick, "Contributions to the mathematical theory of epidemics. ii. the problem of endemicity," vol. 138, no. 834. JSTOR, 1932, pp. 55--83.Google Scholar
D. Chakrabarti, Y. Wang, C. Wang, J. Leskovec, and C. Faloutsos, "Epidemic thresholds in real networks," ACM Transactions on Information and System Security (TISSEC), vol. 10, no. 4, pp. 1:1--1:26, 2008.Google ScholarDigital Library
R. Mihalcea and P. Tarau, "TextRank: bringing order into texts," in Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing (EMNLP). Association for Computational Linguistics, 2004.Google Scholar
A. Tixier, K. Skianis, and M. Vazirgiannis, "Gowvis: a web application for graph-of-words-based text visualization and summarization," Proceedings of ACL-2016 System Demonstrations, pp. 151--156, 2016.Google Scholar
A. Hulth, "Improved automatic keyword extraction given more linguistic knowledge," in Proceedings of the 2003 Conference on Empirical Methods in Natural Language Processing (EMNLP). Association for Computational Linguistics, 2003, pp. 216--223.Google Scholar
F. Rousseau and M. Vazirgiannis, "Main core retention on graph-of-words for single-document keyword extraction," in European Conference on Information Retrieval. Springer, 2015, pp. 382--393.Google Scholar
K. Järvelin and J. Kekäläinen, "Cumulated gain-based evaluation of ir techniques," ACM Transactions on Information Systems (TOIS), vol. 20, no. 4, pp. 422--446, 2002.Google ScholarDigital Library
Y. Grandvalet, "Bagging equalizes influence," Machine Learning, vol. 55, pp. 251--270, 2004.Google ScholarDigital Library
Y. Raviv and N. Intrator, "Bootstrapping with noise: An effective regularization technique," Connection Science, vol. 8, no. 3-4, pp. 355--372, 1996.Google ScholarCross Ref
B. W. Silverman, Density estimation for statistics and data analysis. Routledge, 1986.Google ScholarCross Ref
D. Zügner, A. Akbarnejad, and S. Günnemann, "Adversarial attacks on neural networks for graph data," arXiv preprint arXiv:1805.07984, 2018.Google Scholar
G. Ghoshal and A.-L. Barabási, "Ranking stability and super-stable nodes in complex networks," Nature communications, vol. 2, p. 394, 2011.Google ScholarCross Ref
A. Krizhevsky, I. Sutskever, and G. E. Hinton, "Imagenet classification with deep convolutional neural networks," in Advances in neural information processing systems, 2012, pp. 1097--1105.Google Scholar
T. Mikolov, K. Chen, G. Corrado, and J. Dean, "Efficient estimation of word representations in vector space," arXiv preprint arXiv:1301.3781, 2013.Google Scholar

Recommendations

Activity recognition in rehabilitation training based on ensemble stochastic configuration networks
Abstract
Rehabilitation training for patients with limb activity dysfunction and sub-healthy state has gradually shifted from therapies to strategies with remote assistance. Stochastic configuration networks (SCNs) are characterized by a structure that ...
Read More
Equimatchable Graphs on Surfaces

A graph G is equimatchable if each matching in G is a subset of a maximum-size matching and it is factor critical if G-v has a perfect matching for each vertex v of G. It is known that any 2-connected equimatchable graph is either bipartite or factor ...
Read More
Advancing Ensemble Learning Performance through data transformation and classifiers fusion in granular computing context
Highlights
- We proposed a new ensemble learning framework in the setting of granular computing.
Abstract
Classification is a special type of machine learning tasks, which is essentially achieved by training a classifier that can be used to classify new instances. In order to train a high performance classifier, it is crucial to extract ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Published in
ASONAM '19: Proceedings of the 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining
August 2019
1228 pages
ISBN:9781450368681
DOI:10.1145/3341161
Editors:
Francesca Spezzano
Boise State University
,
Wei Chen
Microsoft Research, China
,
Xiaokui Xiao
National University of Singapore, Singapore
Copyright © 2019 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 15 January 2020
Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
bagging
bootstrap
degeneracy
ensemble learning
influential spreader detection
machine learning
perturb and combine
Qualifiers
- research-article
Conference

Acceptance Rates
ASONAM '19 Paper Acceptance Rate41of286submissions,14%Overall Acceptance Rate116of549submissions,21%
More
Upcoming Conference
KDD '24

Sponsor:

sigkdd

sigkdd

The 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining

August 25 - 29, 2024

Barcelona , Spain
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 5
  Total Citations
  View Citations
- 109
  Total Downloads
- Downloads (Last 12 months)10
- Downloads (Last 6 weeks)0
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Perturb and combine to identify influential spreaders in real-world networks

ASONAM '19: Proceedings of the 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining

ABSTRACT

References

Cited By

Recommendations

Activity recognition in rehabilitation training based on ensemble stochastic configuration networks

Equimatchable Graphs on Surfaces

Advancing Ensemble Learning Performance through data transformation and classifiers fusion in granular computing context

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Acceptance Rates

Upcoming Conference

Funding Sources

Other Metrics

Article Metrics

Other Metrics

Cited By

PDF Format

eReader

Digital Edition

Caption

Perturb and combine to identify influential spreaders in real-world networks

ASONAM '19: Proceedings of the 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining

ABSTRACT

References

Cited By

Recommendations

Activity recognition in rehabilitation training based on ensemble stochastic configuration networks

Equimatchable Graphs on Surfaces

Advancing Ensemble Learning Performance through data transformation and classifiers fusion in granular computing context

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Acceptance Rates

Upcoming Conference

Funding Sources

Article Metrics

Other Metrics

PDF Format

eReader

Digital Edition

Share this Publication link

Share on Social Media