Skip to main content

2023 | OriginalPaper | Buchkapitel

Reconstructing Degree Distribution and Triangle Counts from Edge-Sampled Graphs

verfasst von : Naomi A. Arnold, Raúl J. Mondragón, Richard G. Clegg

Erschienen in: Complex Networks and Their Applications XI

Verlag: Springer International Publishing

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Often, due to prohibitively large size or to limits to data collecting APIs, it is not possible to work with a complete network dataset and sampling is required. A type of sampling which is consistent with Twitter API restrictions is uniform edge sampling. In this paper, we propose a methodology for the recovery of two fundamental network properties from an edge-sampled network: the degree distribution and the triangle count (we estimate the totals for the network and the counts associated with each edge). We use a Bayesian approach and show a range of methods for constructing a prior which does not require assumptions about the original network. Our approach is tested on two synthetic and two real datasets with diverse degree and triangle count distributions.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Fußnoten
1
In experimental runs, the native binomial functions introduced numerical inaccuracies for large powers of \((1-p)\). Therefore, an equivalent evaluation of binomial probabilities using the log-gamma function and laws of logs was used in practice.
 
Literatur
1.
Zurück zum Zitat Ahmed, N.K., Neville, J., Kompella, R.: Network sampling: From static to streaming graphs. ACM Trans. Knowl. Discov. Data (2013) Ahmed, N.K., Neville, J., Kompella, R.: Network sampling: From static to streaming graphs. ACM Trans. Knowl. Discov. Data (2013)
2.
Zurück zum Zitat Antunes, N., Guo, T., Pipiras, V.: Sampling methods and estimation of triangle count distributions in large networks. Netw. Sci. (2021) Antunes, N., Guo, T., Pipiras, V.: Sampling methods and estimation of triangle count distributions in large networks. Netw. Sci. (2021)
3.
Zurück zum Zitat Arnold, N.: Studying evolving complex networks. Ph.D. thesis, Queen Mary University of London (2021) Arnold, N.: Studying evolving complex networks. Ph.D. thesis, Queen Mary University of London (2021)
4.
Zurück zum Zitat Barabási, A.-L., Albert, R.: Emergence of scaling in random networks. Science (1999) Barabási, A.-L., Albert, R.: Emergence of scaling in random networks. Science (1999)
5.
Zurück zum Zitat Bhattacharya, B.B., Das, S., Mukherjee, S.: Motif estimation via subgraph sampling: the fourth-moment phenomenon. Annals Stat. 50(2), 987–1011 (2022)CrossRefMATH Bhattacharya, B.B., Das, S., Mukherjee, S.: Motif estimation via subgraph sampling: the fourth-moment phenomenon. Annals Stat. 50(2), 987–1011 (2022)CrossRefMATH
6.
Zurück zum Zitat Bianconi, G.: Grand canonical ensembles of sparse networks and Bayesian inference. Entropy (2022) Bianconi, G.: Grand canonical ensembles of sparse networks and Bayesian inference. Entropy (2022)
7.
Zurück zum Zitat Chen, Q., Chang, H., Govindan, R., Jamin, S.: The origin of power laws in internet topologies revisited. In: Proceedings of IEEE Computing and Communication Societies (2002) Chen, Q., Chang, H., Govindan, R., Jamin, S.: The origin of power laws in internet topologies revisited. In: Proceedings of IEEE Computing and Communication Societies (2002)
8.
Zurück zum Zitat Erdős, P., Rényi, A., et al.: On the evolution of random graphs. Publ. Math. Inst. Hung. Acad. Sci. (1960) Erdős, P., Rényi, A., et al.: On the evolution of random graphs. Publ. Math. Inst. Hung. Acad. Sci. (1960)
9.
Zurück zum Zitat Feld, S.L.: Why your friends have more friends than you do. Am. J. Sociol. (1991) Feld, S.L.: Why your friends have more friends than you do. Am. J. Sociol. (1991)
10.
Zurück zum Zitat Frank, O.: Statistical inference in graphs. Ph.D. thesis, Foa Repro Stockholm (1971) Frank, O.: Statistical inference in graphs. Ph.D. thesis, Foa Repro Stockholm (1971)
11.
Zurück zum Zitat Ganguly, A., Kolaczyk, E.D.: Estimation of vertex degrees in a sampled network. In: Asilomar Conference on Signals, Systems, and Computers (2017) Ganguly, A., Kolaczyk, E.D.: Estimation of vertex degrees in a sampled network. In: Asilomar Conference on Signals, Systems, and Computers (2017)
12.
Zurück zum Zitat Katzir, L., Liberty, E., Somekh, O.: Estimating sizes of social networks via biased sampling. In: Proceedings on International Conference on World Wide Web (2011) Katzir, L., Liberty, E., Somekh, O.: Estimating sizes of social networks via biased sampling. In: Proceedings on International Conference on World Wide Web (2011)
13.
Zurück zum Zitat Klusowski JM, Wu, J.: Counting motifs with graph sampling. In: Conference on Learning Theory, pp. 1966–2011. PMLR (2018) Klusowski JM, Wu, J.: Counting motifs with graph sampling. In: Conference on Learning Theory, pp. 1966–2011. PMLR (2018)
14.
Zurück zum Zitat Leskovec, J., Faloutsos, C.: Sampling from large graphs. In: Proceedings of the International Conference on Knowledge Discovery and Data Mining (2006) Leskovec, J., Faloutsos, C.: Sampling from large graphs. In: Proceedings of the International Conference on Knowledge Discovery and Data Mining (2006)
15.
Zurück zum Zitat Morstatter, F., Pfeffer, J., Liu, H., Carley, K.: Is the sample good enough? comparing data from Twitter’s streaming API with Twitter’s firehose. In: Proceedings of the International AAAI Conference on Web and Social Media (2013) Morstatter, F., Pfeffer, J., Liu, H., Carley, K.: Is the sample good enough? comparing data from Twitter’s streaming API with Twitter’s firehose. In: Proceedings of the International AAAI Conference on Web and Social Media (2013)
16.
Zurück zum Zitat Newman, M.E.: The structure of scientific collaboration networks. Proceedings of the National Academy of Sciences (2001) Newman, M.E.: The structure of scientific collaboration networks. Proceedings of the National Academy of Sciences (2001)
17.
Zurück zum Zitat Newman, M.E.: Network structure from rich but noisy data. Nat. Phys. 14(6), 542–545 (2018)CrossRef Newman, M.E.: Network structure from rich but noisy data. Nat. Phys. 14(6), 542–545 (2018)CrossRef
18.
Zurück zum Zitat Stefani, L.D., Epasto, A., Riondato, M., Upfal, E.: Triest: counting local and global triangles in fully dynamic streams with fixed memory size. ACM Trans. Knowl. Discov. Data (TKDD) (2017) Stefani, L.D., Epasto, A., Riondato, M., Upfal, E.: Triest: counting local and global triangles in fully dynamic streams with fixed memory size. ACM Trans. Knowl. Discov. Data (TKDD) (2017)
19.
Zurück zum Zitat Stumpf, M.P., Wiuf, C., May, R.M.: Subnets of scale-free networks are not scale-free: sampling properties of networks. PNAS (2005) Stumpf, M.P., Wiuf, C., May, R.M.: Subnets of scale-free networks are not scale-free: sampling properties of networks. PNAS (2005)
20.
Zurück zum Zitat Tsourakakis, C.E., Kang, U., Miller, G.L., Faloutsos, C.: Doulion: counting triangles in massive graphs with a coin. In: Proceedings International Conference on Knowledge Discovery and Data Mining (2009) Tsourakakis, C.E., Kang, U., Miller, G.L., Faloutsos, C.: Doulion: counting triangles in massive graphs with a coin. In: Proceedings International Conference on Knowledge Discovery and Data Mining (2009)
22.
Zurück zum Zitat Watts, D.J., Strogatz, S.H.: Collective dynamics of ‘small-world’ networks. Nature (1998) Watts, D.J., Strogatz, S.H.: Collective dynamics of ‘small-world’ networks. Nature (1998)
23.
Zurück zum Zitat Young, J.-G., Cantwell, G.T., Newman, M.: Bayesian inference of network structure from unreliable data. J. Complex Netw. (2020) Young, J.-G., Cantwell, G.T., Newman, M.: Bayesian inference of network structure from unreliable data. J. Complex Netw. (2020)
24.
Zurück zum Zitat Zhang, Y., Kolaczyk, E.D., Spencer, B.D.: Estimating network degree distributions under sampling: an inverse problem, with applications to monitoring social media networks. Annals Appl. Stat. (2015) Zhang, Y., Kolaczyk, E.D., Spencer, B.D.: Estimating network degree distributions under sampling: an inverse problem, with applications to monitoring social media networks. Annals Appl. Stat. (2015)
Metadaten
Titel
Reconstructing Degree Distribution and Triangle Counts from Edge-Sampled Graphs
verfasst von
Naomi A. Arnold
Raúl J. Mondragón
Richard G. Clegg
Copyright-Jahr
2023
DOI
https://doi.org/10.1007/978-3-031-21131-7_23

Premium Partner