Skip to main content
Top
Published in:

01-12-2016 | Original Article

Structure-preserving sparsification methods for social networks

Authors: Michael Hamann, Gerd Lindner, Henning Meyerhenke, Christian L. Staudt, Dorothea Wagner

Published in: Social Network Analysis and Mining | Issue 1/2016

Log in

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

Sparsification reduces the size of networks while preserving structural and statistical properties of interest. Various sparsifying algorithms have been proposed in different contexts. We contribute the first systematic conceptual and experimental comparison of edge sparsification methods on a diverse set of network properties. It is shown that they can be understood as methods for rating edges by importance and then filtering globally or locally by these scores. We show that applying a local filtering technique improves the preservation of all kinds of properties. In addition, we propose a new sparsification method (Local Degree) which preserves edges leading to local hub nodes. All methods are evaluated on a set of social networks from Facebook, Google+, Twitter and LiveJournal with respect to network properties including diameter, connected components, community structure, multiple node centrality measures and the behavior of epidemic simulations. To assess the preservation of the community structure, we also include experiments on synthetically generated networks with ground truth communities. Experiments with our implementations of the sparsification methods (included in the open-source network analysis tool suite NetworKit) show that many network properties can be preserved down to about 20 % of the original set of edges for sparse graphs with a reasonable density. The experimental results allow us to differentiate the behavior of different methods and show which method is suitable with respect to which property. While our Local Degree method is best for preserving connectivity and short distances, other newly introduced local variants are best for preserving the community structure.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Appendix
Available only for authorised users
Literature
go back to reference Ahmed NK, Neville J, Kompella R (2014) Network sampling: from static to streaming graphs. ACM Trans Knowl Discov Data (TKDD) 8(2):7 Ahmed NK, Neville J, Kompella R (2014) Network sampling: from static to streaming graphs. ACM Trans Knowl Discov Data (TKDD) 8(2):7
go back to reference Batson J, Spielman DA, Srivastava N, Teng SH (2013) Spectral sparsification of graphs: theory and algorithms. Commun ACM 56(8):87–94CrossRef Batson J, Spielman DA, Srivastava N, Teng SH (2013) Spectral sparsification of graphs: theory and algorithms. Commun ACM 56(8):87–94CrossRef
go back to reference Costa LdF, Oliveira ON Jr, Travieso G, Rodrigues FA, Villas Boas PR, Antiqueira L, Viana MP, Correa Rocha LE (2011) Analyzing and modeling real-world phenomena with complex networks: a survey of applications. Adv Phys 60(3):329–412CrossRef Costa LdF, Oliveira ON Jr, Travieso G, Rodrigues FA, Villas Boas PR, Antiqueira L, Viana MP, Correa Rocha LE (2011) Analyzing and modeling real-world phenomena with complex networks: a survey of applications. Adv Phys 60(3):329–412CrossRef
go back to reference Ebbes P, Huang Z, Rangaswamy A, Thadakamalla HP, Unit ORGB (2008) Sampling large-scale social networks: insights from simulated networks. In: 18th annual workshop on information technologies and systems, Paris, France, Citeseer Ebbes P, Huang Z, Rangaswamy A, Thadakamalla HP, Unit ORGB (2008) Sampling large-scale social networks: insights from simulated networks. In: 18th annual workshop on information technologies and systems, Paris, France, Citeseer
go back to reference Fortunato S, Barthelemy M (2007) Resolution limit in community detection. Proc Natl Acad Sci 104(1):36–41CrossRef Fortunato S, Barthelemy M (2007) Resolution limit in community detection. Proc Natl Acad Sci 104(1):36–41CrossRef
go back to reference Fortunato S, Boguñá M, Flammini A, Menczer F (2008) Approximating pagerank from in-degree. In: Aiello W, Broder A, Janssen J, Milios E (eds) Algorithms and models for the web-graph, Springer, Berlin, pp 59–71 Fortunato S, Boguñá M, Flammini A, Menczer F (2008) Approximating pagerank from in-degree. In: Aiello W, Broder A, Janssen J, Milios E (eds) Algorithms and models for the web-graph, Springer, Berlin, pp 59–71
go back to reference Geisberger R, Sanders P, Schultes D (2008) Better approximation of betweenness centrality. In: ALENEX, SIAM, pp 90–100 Geisberger R, Sanders P, Schultes D (2008) Better approximation of betweenness centrality. In: ALENEX, SIAM, pp 90–100
go back to reference John E, Safro I (2016) Single-and multi-level network sparsification by algebraic distance. arXiv:160105527 (arXiv preprint) John E, Safro I (2016) Single-and multi-level network sparsification by algebraic distance. arXiv:​160105527 (arXiv preprint)
go back to reference Keeling M, Rohani P (2008) Modeling infectious diseases in humans and animals. Princeton University Press, PrincetonMATH Keeling M, Rohani P (2008) Modeling infectious diseases in humans and animals. Princeton University Press, PrincetonMATH
go back to reference Leskovec J, Faloutsos C (2006) Sampling from large graphs. In: Proceedings of the 12th ACM SIGKDD international conference on knowledge discovery and data mining, ACM, New York, NY, USA, KDD ’06, pp 631–636. doi:10.1145/1150402.1150479 Leskovec J, Faloutsos C (2006) Sampling from large graphs. In: Proceedings of the 12th ACM SIGKDD international conference on knowledge discovery and data mining, ACM, New York, NY, USA, KDD ’06, pp 631–636. doi:10.​1145/​1150402.​1150479
go back to reference Leskovec J, Mcauley JJ (2012) Learning to discover social circles in ego networks. In: Pereira F, Burges C, Bottou L, Weinberger K (eds) Advances in neural information processing systems 25. Curran Associates Inc, New York, pp 539–547 Leskovec J, Mcauley JJ (2012) Learning to discover social circles in ego networks. In: Pereira F, Burges C, Bottou L, Weinberger K (eds) Advances in neural information processing systems 25. Curran Associates Inc, New York, pp 539–547
go back to reference Lindner G, Staudt CL, Hamann M, Meyerhenke H, Wagner D (2015) Structure-preserving sparsification of social networks. In: Pei J, Silvestri F, Tang J (eds) Proceedings of the 2015 IEEE/ACM international conference on advances in social networks analysis and mining, ASONAM 2015, Paris, France, August 25–28, 2015, ACM, pp 448–454. doi:10.1145/2808797.2809313 Lindner G, Staudt CL, Hamann M, Meyerhenke H, Wagner D (2015) Structure-preserving sparsification of social networks. In: Pei J, Silvestri F, Tang J (eds) Proceedings of the 2015 IEEE/ACM international conference on advances in social networks analysis and mining, ASONAM 2015, Paris, France, August 25–28, 2015, ACM, pp 448–454. doi:10.​1145/​2808797.​2809313
go back to reference Nick B, Lee C, Cunningham P, Brandes U (2013) Simmelian backbones: Amplifying hidden homophily in facebook networks. In: Proceedings of the 2013 IEEE/ACM international conference on advances in social networks analysis and mining, ACM, New York, NY, USA, ASONAM ’13, pp 525–532. doi:10.1145/2492517.2492569 Nick B, Lee C, Cunningham P, Brandes U (2013) Simmelian backbones: Amplifying hidden homophily in facebook networks. In: Proceedings of the 2013 IEEE/ACM international conference on advances in social networks analysis and mining, ACM, New York, NY, USA, ASONAM ’13, pp 525–532. doi:10.​1145/​2492517.​2492569
go back to reference Nocaj A, Ortmann M, Brandes U (2014) Untangling hairballs—from 3 to 14 degrees of separation. In: Duncan CA, Symvonis A (eds) Graph Drawing—22nd international symposium, GD 2014, Würzburg, Germany, September 24-26, 2014 Revised Selected Papers, Lecture Notes in Computer Science, vol 8871. Springer, Berlin, pp 101–112. doi:10.1007/978-3-662-45803-7_9 Nocaj A, Ortmann M, Brandes U (2014) Untangling hairballs—from 3 to 14 degrees of separation. In: Duncan CA, Symvonis A (eds) Graph Drawing—22nd international symposium, GD 2014, Würzburg, Germany, September 24-26, 2014 Revised Selected Papers, Lecture Notes in Computer Science, vol 8871. Springer, Berlin, pp 101–112. doi:10.​1007/​978-3-662-45803-7_​9
go back to reference Ortmann M, Brandes U (2014) Triangle listing algorithms: Back from the diversion. In: McGeoch CC, Meyer U (eds) 2014 Proceedings of the Sixteenth Workshop on Algorithm Engineering and Experiments, ALENEX 2014, Portland, Oregon, USA, January 5, 2014, SIAM, pp 1–8. doi:10.1137/1.9781611973198.1 Ortmann M, Brandes U (2014) Triangle listing algorithms: Back from the diversion. In: McGeoch CC, Meyer U (eds) 2014 Proceedings of the Sixteenth Workshop on Algorithm Engineering and Experiments, ALENEX 2014, Portland, Oregon, USA, January 5, 2014, SIAM, pp 1–8. doi:10.​1137/​1.​9781611973198.​1
go back to reference Saha T, Rangwala H, Domeniconi C (2013) Sparsification and sampling of networks for collective classification. In: Greenberg AM, Kennedy WG, Bos ND (eds) Social Computing, Behavioral-cultural modeling and prediction. Springer, Berlin, pp 293–302 Saha T, Rangwala H, Domeniconi C (2013) Sparsification and sampling of networks for collective classification. In: Greenberg AM, Kennedy WG, Bos ND (eds) Social Computing, Behavioral-cultural modeling and prediction. Springer, Berlin, pp 293–302
go back to reference Salathé M, Kazandjieva M, Lee JW, Levis P, Feldman MW, Jones JH (2010) A high-resolution human contact network for infectious disease transmission. Proc Natl Acad Sci 107(51):22020–22025CrossRef Salathé M, Kazandjieva M, Lee JW, Levis P, Feldman MW, Jones JH (2010) A high-resolution human contact network for infectious disease transmission. Proc Natl Acad Sci 107(51):22020–22025CrossRef
go back to reference Satuluri V, Parthasarathy S, Ruan Y (2011) Local graph sparsification for scalable clustering. In: Proceedings of the 2011 ACM SIGMOD international conference on management of data, ACM, New York, NY, USA, SIGMOD ’11, pp 721–732. doi:10.1145/1989323.1989399 Satuluri V, Parthasarathy S, Ruan Y (2011) Local graph sparsification for scalable clustering. In: Proceedings of the 2011 ACM SIGMOD international conference on management of data, ACM, New York, NY, USA, SIGMOD ’11, pp 721–732. doi:10.​1145/​1989323.​1989399
go back to reference Traud AL, Mucha PJ, Porter MA (2012) Social structure of facebook networks. Phys A Stat Mech Appl 391(16):4165–4180CrossRef Traud AL, Mucha PJ, Porter MA (2012) Social structure of facebook networks. Phys A Stat Mech Appl 391(16):4165–4180CrossRef
go back to reference Vinh NX, Epps J, Bailey J (2009) Information theoretic measures for clusterings comparison: is a correction for chance necessary? In: Proceedings of the 26th annual international conference on machine learning, ACM, New York, NY, USA, ICML ’09, pp 1073–1080. doi:10.1145/1553374.1553511 Vinh NX, Epps J, Bailey J (2009) Information theoretic measures for clusterings comparison: is a correction for chance necessary? In: Proceedings of the 26th annual international conference on machine learning, ACM, New York, NY, USA, ICML ’09, pp 1073–1080. doi:10.​1145/​1553374.​1553511
go back to reference Wang Y, Chakrabarti D, Wang C, Faloutsos C (2003) Epidemic spreading in real networks: an eigenvalue viewpoint. In: Proceedings of the 22nd international symposium on reliable distributed systems, 2003, IEEE, pp 25–34 Wang Y, Chakrabarti D, Wang C, Faloutsos C (2003) Epidemic spreading in real networks: an eigenvalue viewpoint. In: Proceedings of the 22nd international symposium on reliable distributed systems, 2003, IEEE, pp 25–34
go back to reference Yang J, Leskovec J (2012) Defining and evaluating network communities based on ground-truth. In: Proceedings of the ACM SIGKDD workshop on mining data semantics, ACM, p 3 Yang J, Leskovec J (2012) Defining and evaluating network communities based on ground-truth. In: Proceedings of the ACM SIGKDD workshop on mining data semantics, ACM, p 3
Metadata
Title
Structure-preserving sparsification methods for social networks
Authors
Michael Hamann
Gerd Lindner
Henning Meyerhenke
Christian L. Staudt
Dorothea Wagner
Publication date
01-12-2016
Publisher
Springer Vienna
Published in
Social Network Analysis and Mining / Issue 1/2016
Print ISSN: 1869-5450
Electronic ISSN: 1869-5469
DOI
https://doi.org/10.1007/s13278-016-0332-2

Premium Partner