Skip to main content
Erschienen in: Social Network Analysis and Mining 1/2014

01.12.2014 | Original Article

Estimating network parameters using random walks

verfasst von: Colin Cooper, Tomasz Radzik, Yiannis Siantos

Erschienen in: Social Network Analysis and Mining | Ausgabe 1/2014

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Sampling from large graphs is an area of great interest, especially since the emergence of huge structures such as Online Social Networks and the World Wide Web (WWW). The large scale properties of a network can be summarized in terms of parameters of the underlying graph, such as the total number of vertices, edges and triangles. However, the large si ze of these networks makes it computationally expensive to obtain such structural properties of the underlying graph by exhaustive search. If we can estimate these properties by taking small but representative samples from the network, then size is no longer such a problem. In this paper we present a general framework to estimate network properties using random walks. These methods work under the assumption we are able to obtain local characteristics of a vertex during each step of the random walk, for example the number of its neighbours, and their labels. As examples of this approach, we present practical methods to estimate the total number of edges/links m, number of vertices/nodes n and number of connected triads of vertices (triangles) t in graphs. We also give a general method to count any type of small connected subgraph, of which vertices, edges and triangles are specific examples. Additionally we present experimental estimates for nmt we obtained using our methods on real or synthetic networks. The synthetic networks were random graphs with power-law degree distributions and designed to have a large number of triangles. We used these graphs as they tend to correspond to the structure of large online networks. The real networks were samples of the WWW and social networks obtained from the SNAP database. In order to test that the methods are indeed practical, the total number of steps made by the walk was limited to at most the size n of the network. In fact the estimates appear to converge to the correct value at a lower number of steps, indicating that our proposed methods are feasible in practice.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Literatur
Zurück zum Zitat Barabási A, Albert R (1999) Emergence of scaling in random networks. Science 286(5439):509–512 Barabási A, Albert R (1999) Emergence of scaling in random networks. Science 286(5439):509–512
Zurück zum Zitat Bollobás B, Riordan O (2002) Mathematical results on scale-free graphs. In: Bornholdt S, Schuster H (eds) Handbook of graphs and networks. Wiley-VCH, Weinheim, pp 1–32 Bollobás B, Riordan O (2002) Mathematical results on scale-free graphs. In: Bornholdt S, Schuster H (eds) Handbook of graphs and networks. Wiley-VCH, Weinheim, pp 1–32
Zurück zum Zitat Bollobás B, Riordan O, Spencer J, Tusnády G (2001) The degree sequence of a scale-free random graph process. Random Struct Algorithms 18:279–290CrossRefMATH Bollobás B, Riordan O, Spencer J, Tusnády G (2001) The degree sequence of a scale-free random graph process. Random Struct Algorithms 18:279–290CrossRefMATH
Zurück zum Zitat Broder AZ, Kumar R, Maghoul F, Raghavan P, Rajagopalan S, Stata R, Tomkins A, Wiener JL (2000) Graph structure in the web. Comput Netw 33(1–6):309–320CrossRef Broder AZ, Kumar R, Maghoul F, Raghavan P, Rajagopalan S, Stata R, Tomkins A, Wiener JL (2000) Graph structure in the web. Comput Netw 33(1–6):309–320CrossRef
Zurück zum Zitat Cheeger J (1971) A lower bound for the smallest eigenvalue of the laplacian. Probl Anal 195–199 (papers dedicated to Salomon Bochner, 1969) Cheeger J (1971) A lower bound for the smallest eigenvalue of the laplacian. Probl Anal 195–199 (papers dedicated to Salomon Bochner, 1969)
Zurück zum Zitat Cooper C (2002) Classifying special interest groups in web graphs. In: RANDOM 2002: Randomization and Approximation Techniques in Computer Science, pp 263–275 Cooper C (2002) Classifying special interest groups in web graphs. In: RANDOM 2002: Randomization and Approximation Techniques in Computer Science, pp 263–275
Zurück zum Zitat Cooper C (2006) The age specific degree distribution of web-graphs. Comb Probab Comput 15:637–661CrossRefMATH Cooper C (2006) The age specific degree distribution of web-graphs. Comb Probab Comput 15:637–661CrossRefMATH
Zurück zum Zitat Cooper C, Frieze A (2003) A general model web graphs. Random Struct Algorithms 22(3):311–335 Cooper C, Frieze A (2003) A general model web graphs. Random Struct Algorithms 22(3):311–335
Zurück zum Zitat Cooper C, Frieze AM (2008) Random walks on random graphs. In: Cheng MX (ed) NanoNet. Lecture notes of the institute for computer sciences, social informatics and telecommunications engineering, vol 3. Springer, Berlin, pp 95–106 Cooper C, Frieze AM (2008) Random walks on random graphs. In: Cheng MX (ed) NanoNet. Lecture notes of the institute for computer sciences, social informatics and telecommunications engineering, vol 3. Springer, Berlin, pp 95–106
Zurück zum Zitat Cooper C, Radzik T, Siantos Y (2012) Estimating network parameters using random walks. In: CASoN. IEEE, pp 33–40 Cooper C, Radzik T, Siantos Y (2012) Estimating network parameters using random walks. In: CASoN. IEEE, pp 33–40
Zurück zum Zitat Cooper C, Radzik T, Siantos Y (2012) A fast algorithm to find all high degree vertices in graphs with a power law degree sequence. In: WAW 2012 Proceedings. LNCS, vol 7323. Springer, Berlin, pp 165–178 Cooper C, Radzik T, Siantos Y (2012) A fast algorithm to find all high degree vertices in graphs with a power law degree sequence. In: WAW 2012 Proceedings. LNCS, vol 7323. Springer, Berlin, pp 165–178
Zurück zum Zitat Drinea E, Enachescu M, Mitzenmacher M (2001) Technical report: variations on random graph models for the web. Tech. rep., Harvard University, Department of Computer Science Drinea E, Enachescu M, Mitzenmacher M (2001) Technical report: variations on random graph models for the web. Tech. rep., Harvard University, Department of Computer Science
Zurück zum Zitat Ganesh A, Kermarrec A, Le Merrer E, Massouli L (2007) Peer counting and sampling in overlay networks based on random walks. Distrib Comput 20:267–278. doi:10.1007/s00446-007-0027-z Ganesh A, Kermarrec A, Le Merrer E, Massouli L (2007) Peer counting and sampling in overlay networks based on random walks. Distrib Comput 20:267–278. doi:10.​1007/​s00446-007-0027-z
Zurück zum Zitat Gjoka M, Kurant M, Butts CT, Markopoulou A (2009) A walk in facebook: uniform sampling of users in online social networks. CoRR. abs/0906.0060 Gjoka M, Kurant M, Butts CT, Markopoulou A (2009) A walk in facebook: uniform sampling of users in online social networks. CoRR. abs/0906.0060
Zurück zum Zitat Hastings WK (1970) Monte carlo sampling methods using markov chains and their applications. Biometrika 57(1):97–109CrossRefMATH Hastings WK (1970) Monte carlo sampling methods using markov chains and their applications. Biometrika 57(1):97–109CrossRefMATH
Zurück zum Zitat Katzir L, Liberty E, Somekh O (2011) Estimating sizes of social networks via biased sampling. In: Srinivasan S, Ramamritham K, Kumar A, Ravindra MP, Bertino V, Kumar R (eds.) WWW. ACM, pp 597–606 Katzir L, Liberty E, Somekh O (2011) Estimating sizes of social networks via biased sampling. In: Srinivasan S, Ramamritham K, Kumar A, Ravindra MP, Bertino V, Kumar R (eds.) WWW. ACM, pp 597–606
Zurück zum Zitat Leskovec J, Backstrom L, Kumar R, Tomkins A (2008) Microscopic evolution of social networks. In: Proceeding of the 14th ACM SIGKDD international conference on knowledge discovery and data mining, KDD’08, ACM, New York, NY, USA, pp 462–470. doi:10.1145/1401890.1401948 Leskovec J, Backstrom L, Kumar R, Tomkins A (2008) Microscopic evolution of social networks. In: Proceeding of the 14th ACM SIGKDD international conference on knowledge discovery and data mining, KDD’08, ACM, New York, NY, USA, pp 462–470. doi:10.​1145/​1401890.​1401948
Zurück zum Zitat Leskovec J, Lang KJ, Dasgupta A, Mahoney MW (2008) Community structure in large networks: Natural cluster sizes and the absence of large well-defined clusters. CoRR. abs/0810.1355 Leskovec J, Lang KJ, Dasgupta A, Mahoney MW (2008) Community structure in large networks: Natural cluster sizes and the absence of large well-defined clusters. CoRR. abs/0810.1355
Zurück zum Zitat Lovász L (1996) Random walks on graphs: a survey. Bolyai Soc Math Stud 2:353–397 Lovász L (1996) Random walks on graphs: a survey. Bolyai Soc Math Stud 2:353–397
Zurück zum Zitat Massoulié L, Merrer EL, Kermarrec AM, Ganesh AJ (2006) Peer counting and sampling in overlay networks: random walk methods. In: Ruppert E, Malkhi D (eds.) PODC. ACM, pp 123–132 Massoulié L, Merrer EL, Kermarrec AM, Ganesh AJ (2006) Peer counting and sampling in overlay networks: random walk methods. In: Ruppert E, Malkhi D (eds.) PODC. ACM, pp 123–132
Zurück zum Zitat Metropolis N, Rosenbluth AW, Rosenbluth MN, Teller AH, Teller E (1953) Equation of state calculations by fast computing machines. J Chem Phys 21(6):1087–1092CrossRef Metropolis N, Rosenbluth AW, Rosenbluth MN, Teller AH, Teller E (1953) Equation of state calculations by fast computing machines. J Chem Phys 21(6):1087–1092CrossRef
Metadaten
Titel
Estimating network parameters using random walks
verfasst von
Colin Cooper
Tomasz Radzik
Yiannis Siantos
Publikationsdatum
01.12.2014
Verlag
Springer Vienna
Erschienen in
Social Network Analysis and Mining / Ausgabe 1/2014
Print ISSN: 1869-5450
Elektronische ISSN: 1869-5469
DOI
https://doi.org/10.1007/s13278-014-0168-6

Weitere Artikel der Ausgabe 1/2014

Social Network Analysis and Mining 1/2014 Zur Ausgabe

Premium Partner