Skip to main content
Erschienen in: Social Network Analysis and Mining 1/2016

01.12.2016 | Original Article

Increasing sampling efficiency for the fixed degree sequence model with phase transitions

verfasst von: Christian Brugger, André Lucas Chinazzo, Alexandre Flores John, Christian De Schryver, Norbert Wehn, Wolfgang Schlauch, Katharina Anna Zweig

Erschienen in: Social Network Analysis and Mining | Ausgabe 1/2016

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Real-world network data is often very noisy and contains erroneous or missing edges. These superfluous and missing edges can be identified statistically by assessing the number of common neighbors of the two incident nodes. To evaluate whether this number of common neighbors, the so-called co-occurrence, is statistically significant, a comparison with the expected co-occurrence in a suitable random graph model is required. For networks with a skewed degree distribution, including most real-world networks, it is known that the fixed degree sequence model (FDSM), which maintains the degrees of nodes, is favorable over using simplified graph models that are based on an independence assumption. However, the use of a FDSM requires sampling from the space of all graphs with the given degree sequence and measuring the co-occurrence of each pair of nodes in each of the samples, since there is no known closed formula known for this statistic. While there exist log-linear approaches such as Markov chain Monte Carlo sampling, the computational complexity still depends on the length of the Markov chain and the number of samples, which is significant in large-scale networks. In this article, we show based on ground truth data for different data sets that there are various phase transition-like tipping points that enable us to choose a comparatively low number of samples and to reduce the length of the Markov chains without reducing the quality of the significance test. As a result, the computational effort can be reduced by an order of magnitudes. Furthermore, we present and evaluate practically usable strategies for speeding up the randomization process of input graphs and heuristics for phase transition-based computation stopping.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Fußnoten
1
The 100k MovieLens data set, available from http://​grouplens.​org/​datasets/​movielens/​.
 
Literatur
Zurück zum Zitat Berger A, Müller-Hannemann M (2010) Uniform sampling of digraphs with a fixed degree sequence. In: Thilikos DM (ed) Graph theoretic concepts in computer science, vol 6410. Springer, Heidelberg, pp 220–231 Berger A, Müller-Hannemann M (2010) Uniform sampling of digraphs with a fixed degree sequence. In: Thilikos DM (ed) Graph theoretic concepts in computer science, vol 6410. Springer, Heidelberg, pp 220–231
Zurück zum Zitat Brugger C, Chinazzo AL, John AF, De Schryver C, Wehn N, Spitz A, Zweig KA (2015) Exploiting phase transitions for the efficient sampling of the fixed degree sequence model. In: Proceedings of the 2015 IEEE/ACM international conference on advances in social networks analysis and mining (ASONAM), ASONAM’15. IEEE, ACM, New York, NY, pp 308–313, August 2015 Brugger C, Chinazzo AL, John AF, De Schryver C, Wehn N, Spitz A, Zweig KA (2015) Exploiting phase transitions for the efficient sampling of the fixed degree sequence model. In: Proceedings of the 2015 IEEE/ACM international conference on advances in social networks analysis and mining (ASONAM), ASONAM’15. IEEE, ACM, New York, NY, pp 308–313, August 2015
Zurück zum Zitat Del Genio CI, Kim H, Toroczkai Z, Bassler KE (2010) Efficient and exact sampling of simple graphs with given arbitrary degree sequence. PloS one 5(4):e10012CrossRef Del Genio CI, Kim H, Toroczkai Z, Bassler KE (2010) Efficient and exact sampling of simple graphs with given arbitrary degree sequence. PloS one 5(4):e10012CrossRef
Zurück zum Zitat Gionis A et al (2007) Assessing data mining results via swap randomization. ACM Trans Knowl Discov Data 1(3), Article No 14 Gionis A et al (2007) Assessing data mining results via swap randomization. ACM Trans Knowl Discov Data 1(3), Article No 14
Zurück zum Zitat Gionis A, Mannila H, Mielikäinen T, Tsaparas P (2006) Assessing data mining results via swap randomization. In: Proceedings of the twelfth ACM SIGKDD international conference on knowledge discovery and data mining (KDD’06) Gionis A, Mannila H, Mielikäinen T, Tsaparas P (2006) Assessing data mining results via swap randomization. In: Proceedings of the twelfth ACM SIGKDD international conference on knowledge discovery and data mining (KDD’06)
Zurück zum Zitat Horvát E-Á, Zweig KA (2013) A fixed degree sequence model for the one-mode projection of multiplex bipartite graphs. Soc Netw Anal Min 4:164 Horvát E-Á, Zweig KA (2013) A fixed degree sequence model for the one-mode projection of multiplex bipartite graphs. Soc Netw Anal Min 4:164
Zurück zum Zitat Jerrum M, Sinclair A, Vigoda E (2004) A polynomial-time approximation algorithm for the permanent of a matrix with nonnegative entries. JACM 51(4):671–697MathSciNetCrossRefMATH Jerrum M, Sinclair A, Vigoda E (2004) A polynomial-time approximation algorithm for the permanent of a matrix with nonnegative entries. JACM 51(4):671–697MathSciNetCrossRefMATH
Zurück zum Zitat Kannan R, Tetali P, Vempala S (1999) Simple markov-chain algorithms for generating bipartite graphs and tournaments. Random Struct Algorithms 14(4):293–308MathSciNetCrossRefMATH Kannan R, Tetali P, Vempala S (1999) Simple markov-chain algorithms for generating bipartite graphs and tournaments. Random Struct Algorithms 14(4):293–308MathSciNetCrossRefMATH
Zurück zum Zitat Milo R, Itzkovitz S, Kashtan N, Levitt R, Alon U (2004) Response to comment on “Network motifs: simple building blocks of complex networks” and “Superfamilies of evolved and designed networks”. Science 305:1107dCrossRef Milo R, Itzkovitz S, Kashtan N, Levitt R, Alon U (2004) Response to comment on “Network motifs: simple building blocks of complex networks” and “Superfamilies of evolved and designed networks”. Science 305:1107dCrossRef
Zurück zum Zitat Uhlmann S, Mannsperger H, Zhang JD, Horvat E-Á, Schmidt C, Küblbeck M, Ward A, Tschulena U, Zweig K, Korf U, Wiemann S, Sahin Ö (2012) Global miRNA regulation of a local protein network: case study with the EGFR-driven cell cycle network in breast cancer. Mol Syst Biol 8:570CrossRef Uhlmann S, Mannsperger H, Zhang JD, Horvat E-Á, Schmidt C, Küblbeck M, Ward A, Tschulena U, Zweig K, Korf U, Wiemann S, Sahin Ö (2012) Global miRNA regulation of a local protein network: case study with the EGFR-driven cell cycle network in breast cancer. Mol Syst Biol 8:570CrossRef
Zurück zum Zitat Zweig KA (2010) How to forget the second side of the story: a new method for the one-mode projection of bipartite graphs. In: Proceedings of the 2010 international conference on advances in social networks analysis and mining ASONAM 2010, pp 200–207 Zweig KA (2010) How to forget the second side of the story: a new method for the one-mode projection of bipartite graphs. In: Proceedings of the 2010 international conference on advances in social networks analysis and mining ASONAM 2010, pp 200–207
Zurück zum Zitat Zweig KA (2011) Good versus optimal: why network analytic methods need more systematic evaluation. Central Eur J Comput Sci 1:137–153 Zweig KA (2011) Good versus optimal: why network analytic methods need more systematic evaluation. Central Eur J Comput Sci 1:137–153
Zurück zum Zitat Zweig KA, Kaufmann M (2011) A systematic approach to the one-mode projection of bipartite graphs. Soc Netw Anal Min 1(3):187–218CrossRef Zweig KA, Kaufmann M (2011) A systematic approach to the one-mode projection of bipartite graphs. Soc Netw Anal Min 1(3):187–218CrossRef
Metadaten
Titel
Increasing sampling efficiency for the fixed degree sequence model with phase transitions
verfasst von
Christian Brugger
André Lucas Chinazzo
Alexandre Flores John
Christian De Schryver
Norbert Wehn
Wolfgang Schlauch
Katharina Anna Zweig
Publikationsdatum
01.12.2016
Verlag
Springer Vienna
Erschienen in
Social Network Analysis and Mining / Ausgabe 1/2016
Print ISSN: 1869-5450
Elektronische ISSN: 1869-5469
DOI
https://doi.org/10.1007/s13278-016-0407-0

Weitere Artikel der Ausgabe 1/2016

Social Network Analysis and Mining 1/2016 Zur Ausgabe

Premium Partner