Skip to main content
Erschienen in: Social Network Analysis and Mining 1/2022

01.12.2022 | Original Article

Constant community identification in million-scale networks

verfasst von: Anjan Chowdhury, Sriram Srinivasan, Sanjukta Bhowmick, Animesh Mukherjee, Kuntal Ghosh

Erschienen in: Social Network Analysis and Mining | Ausgabe 1/2022

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

The inherently stochastic nature of community detection in real-world complex networks poses an important challenge in assessing the accuracy of the results. In order to eliminate the algorithmic and implementation artifacts, it is necessary to identify the groups of vertices that are always clustered together, independent of the community detection algorithm used. Such groups of vertices are called constant communities. Current approaches for finding constant communities are very expensive and do not scale to large networks. In this paper, we use binary edge classification to find constant communities. The key idea is to classify edges based on whether they form a constant community or not. We present two methods for edge classification. The first is a GCN-based semi-supervised approach that we term Line-GCN. The second is an unsupervised approach based on image thresholding methods. Neither of these methods requires explicit detection of communities and can thus scale to very large networks of the order of millions of vertices. Both of our semi-supervised and unsupervised results on real-world graphs demonstrate that the constant communities obtained by our method have higher F1-scores and comparable or higher NMI scores than other state-of-the-art baseline methods for constant community detection. While the training step of Line-GCN can be expensive, the unsupervised algorithm is 10 times faster than the baseline methods. For larger networks, the baseline methods cannot complete, whereas all of our algorithms can find constant communities in a reasonable amount of time. Finally, we also demonstrate that our methods are robust under noisy conditions. We use three different, well-studied noise models to add noise to the networks and show that our results are mostly stable.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Literatur
Zurück zum Zitat Akiba T, Sano S, Yanase T, Ohta T, Koyama M (2019) Optuna: A next-generation hyperparameter optimization framework. In: Proceedings of the 25rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining Akiba T, Sano S, Yanase T, Ohta T, Koyama M (2019) Optuna: A next-generation hyperparameter optimization framework. In: Proceedings of the 25rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Zurück zum Zitat Amin J, Sharif M, Yasmin M, Fernandes SL (2020) A distinctive approach in brain tumor detection and classification using mri. Pattern Recogn Lett 139:118–127CrossRef Amin J, Sharif M, Yasmin M, Fernandes SL (2020) A distinctive approach in brain tumor detection and classification using mri. Pattern Recogn Lett 139:118–127CrossRef
Zurück zum Zitat Asokan A, Anitha J (2019) Change detection techniques for remote sensing applications: a survey. Earth Sci Inf 12(2):143–160CrossRef Asokan A, Anitha J (2019) Change detection techniques for remote sensing applications: a survey. Earth Sci Inf 12(2):143–160CrossRef
Zurück zum Zitat Blondel VD, Guillaume J-L, Lambiotte R, Lefebvre E (2008) Fast unfolding of communities in large networks. J. of Stat. Mech. (10), 10008 Blondel VD, Guillaume J-L, Lambiotte R, Lefebvre E (2008) Fast unfolding of communities in large networks. J. of Stat. Mech. (10), 10008
Zurück zum Zitat Bruna J, Li X (2017) Community detection with graph neural networks. stat 1050, 27 Bruna J, Li X (2017) Community detection with graph neural networks. stat 1050, 27
Zurück zum Zitat Cai L, Li J, Wang J, Ji S (2021) Line graph neural networks for link prediction. IEEE Transactions on Pattern Analysis and Machine Intelligence Cai L, Li J, Wang J, Ji S (2021) Line graph neural networks for link prediction. IEEE Transactions on Pattern Analysis and Machine Intelligence
Zurück zum Zitat Chakraborty T, Srinivasan S, Ganguly N, Bhowmick S, Mukherjee A (2013) Constant communities in complex networks. Sci Rep 3(1):1–9CrossRef Chakraborty T, Srinivasan S, Ganguly N, Bhowmick S, Mukherjee A (2013) Constant communities in complex networks. Sci Rep 3(1):1–9CrossRef
Zurück zum Zitat Chakraborty T, Srinivasan S, Ganguly N, Mukherjee A, Bhowmick S (2014) On the permanence of vertices in network communities., 1396–1405 Chakraborty T, Srinivasan S, Ganguly N, Mukherjee A, Bhowmick S (2014) On the permanence of vertices in network communities., 1396–1405
Zurück zum Zitat Chouhan SS, Kaul A, Singh UP (2018) Soft computing approaches for image segmentation: a survey. Multimed Tools Appl 77(21):28483–28537CrossRef Chouhan SS, Kaul A, Singh UP (2018) Soft computing approaches for image segmentation: a survey. Multimed Tools Appl 77(21):28483–28537CrossRef
Zurück zum Zitat Chowdhury A, Srinivasan S, Bhowmick S, Mukherjee A, Ghosh K (2021) Constant community identification in million scale networks using image thresholding algorithms. In: Proceedings of the 2021 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining. ASONAM ’21, pp. 116–120. Association for Computing Machinery, New York, NY, USA . https://doi.org/10.1145/3487351.3488350 Chowdhury A, Srinivasan S, Bhowmick S, Mukherjee A, Ghosh K (2021) Constant community identification in million scale networks using image thresholding algorithms. In: Proceedings of the 2021 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining. ASONAM ’21, pp. 116–120. Association for Computing Machinery, New York, NY, USA . https://​doi.​org/​10.​1145/​3487351.​3488350
Zurück zum Zitat Defferrard M, Bresson X, Vandergheynst P (2016) Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering Defferrard M, Bresson X, Vandergheynst P (2016) Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering
Zurück zum Zitat dos Anjos A, Reza Shahbazkia H (2008) Bi-level Image Thresholding - A Fast Method. In: Proceedings of the First International Conference on Bio-inspired Systems and Signal Processing - Volume 2: BIOSIGNALS, (BIOSTEC 2008), pp. 70–76 . https://doi.org/10.5220/0001064300700076. INSTICC dos Anjos A, Reza Shahbazkia H (2008) Bi-level Image Thresholding - A Fast Method. In: Proceedings of the First International Conference on Bio-inspired Systems and Signal Processing - Volume 2: BIOSIGNALS, (BIOSTEC 2008), pp. 70–76 . https://​doi.​org/​10.​5220/​0001064300700076​. INSTICC
Zurück zum Zitat Garcia-Lamont F, Cervantes J, López A, Rodriguez L (2018) Segmentation of images by color features: A survey. Neurocomputing 292:1–27CrossRef Garcia-Lamont F, Cervantes J, López A, Rodriguez L (2018) Segmentation of images by color features: A survey. Neurocomputing 292:1–27CrossRef
Zurück zum Zitat Hagenbuchner FSMGACTM, Monfardini G (2009) The graph neural network model. Appl Netw Sci 20(1):61–80 Hagenbuchner FSMGACTM, Monfardini G (2009) The graph neural network model. Appl Netw Sci 20(1):61–80
Zurück zum Zitat Huang Deng-Yuan WCH, Ta-Wei Lin (2011) Automatic multilevel thresholding based on two-stage otsu’s method with cluster determination by valley estimation. Int J Innovative Comput, Inf Control 7(10):5631–5644 Huang Deng-Yuan WCH, Ta-Wei Lin (2011) Automatic multilevel thresholding based on two-stage otsu’s method with cluster determination by valley estimation. Int J Innovative Comput, Inf Control 7(10):5631–5644
Zurück zum Zitat Hu Z, Dong Y, Wang K, Chang K-W, Sun Y (2020) Gpt-gnn: Generative pre-training of graph neural networks. In: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1857–1867 Hu Z, Dong Y, Wang K, Chang K-W, Sun Y (2020) Gpt-gnn: Generative pre-training of graph neural networks. In: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1857–1867
Zurück zum Zitat Iqbal Z, Khan MA, Sharif M, Shah JH (2018) An automated detection and classification of citrus plant diseases using image processing techniques: a review. Comput and Electron Agric 153:12–32CrossRef Iqbal Z, Khan MA, Sharif M, Shah JH (2018) An automated detection and classification of citrus plant diseases using image processing techniques: a review. Comput and Electron Agric 153:12–32CrossRef
Zurück zum Zitat Liao R, Li Y, Song Y, Wang S, Hamilton W, Duvenaud DK, Urtasun R, Zemel R (2019) Efficient graph generation with graph recurrent attention networks. Advances in Neural Information Processing Systems 32 Liao R, Li Y, Song Y, Wang S, Hamilton W, Duvenaud DK, Urtasun R, Zemel R (2019) Efficient graph generation with graph recurrent attention networks. Advances in Neural Information Processing Systems 32
Zurück zum Zitat Li Y, Vinyals O, Dyer C, Pascanu R, Battaglia P (2018) Learning Deep Generative Models of Graphs Li Y, Vinyals O, Dyer C, Pascanu R, Battaglia P (2018) Learning Deep Generative Models of Graphs
Zurück zum Zitat Li B, Xia Y, Xie S, Wu L, Qin T (2021) Distance-enhanced graph neural network for link prediction. ICML 2021 Workshop on Computational Biology Li B, Xia Y, Xie S, Wu L, Qin T (2021) Distance-enhanced graph neural network for link prediction. ICML 2021 Workshop on Computational Biology
Zurück zum Zitat Luo L, Fang Y, Cao X, Zhang X, Zhang W (2021) Detecting communities from heterogeneous graphs: A context path-based graph neural network model. In: Proceedings of the 30th ACM International Conference on Information & Knowledge Management, pp. 1170–1180 Luo L, Fang Y, Cao X, Zhang X, Zhang W (2021) Detecting communities from heterogeneous graphs: A context path-based graph neural network model. In: Proceedings of the 30th ACM International Conference on Information & Knowledge Management, pp. 1170–1180
Zurück zum Zitat Mahdy LN, Ezzat KA, Elmousalami HH, Ella HA, Hassanien AE (2020) Automatic x-ray covid-19 lung image classification system based on multi-level thresholding and support vector machine. MedRxiv Mahdy LN, Ezzat KA, Elmousalami HH, Ella HA, Hassanien AE (2020) Automatic x-ray covid-19 lung image classification system based on multi-level thresholding and support vector machine. MedRxiv
Zurück zum Zitat Moradan A, Draganov A, Mottin D, Assent I (2021) Ucode: Unified community detection with graph convolutional networks. arXiv preprint arXiv:2112.14822 Moradan A, Draganov A, Mottin D, Assent I (2021) Ucode: Unified community detection with graph convolutional networks. arXiv preprint arXiv:​2112.​14822
Zurück zum Zitat Mueller TT, Paetzold JC, Prabhakar C, Usynin D, Rueckert D, Kaissis G (2022) Differentially private graph classification with gnns. arXiv preprint arXiv:2202.02575 Mueller TT, Paetzold JC, Prabhakar C, Usynin D, Rueckert D, Kaissis G (2022) Differentially private graph classification with gnns. arXiv preprint arXiv:​2202.​02575
Zurück zum Zitat Poulin V, Théberge F (2019) Ensemble clustering for graphs: comparisons and applications. Appl Netw Sci 4(1):51CrossRef Poulin V, Théberge F (2019) Ensemble clustering for graphs: comparisons and applications. Appl Netw Sci 4(1):51CrossRef
Zurück zum Zitat Rocklin M (2015) Dask: Parallel computation with blocked algorithms and task scheduling. In: Proceedings of the 14th Python in Science Conference . Citeseer Rocklin M (2015) Dask: Parallel computation with blocked algorithms and task scheduling. In: Proceedings of the 14th Python in Science Conference . Citeseer
Zurück zum Zitat Rosvall M, Bergstrom CT (2008) Maps of random walks on complex networks reveal community structure. PNAS 105(4):1118–1123CrossRef Rosvall M, Bergstrom CT (2008) Maps of random walks on complex networks reveal community structure. PNAS 105(4):1118–1123CrossRef
Zurück zum Zitat Souravlas S, Anastasiadou S, Katsavounis S (2021) A survey on the recent advances of deep community detection. Appl Sci 11(16):7179CrossRef Souravlas S, Anastasiadou S, Katsavounis S (2021) A survey on the recent advances of deep community detection. Appl Sci 11(16):7179CrossRef
Zurück zum Zitat Sun J, Zheng W, Zhang Q, Xu Z (2021) Graph neural network encoding for community detection in attribute networks. IEEE Transactions on Cybernetics Sun J, Zheng W, Zhang Q, Xu Z (2021) Graph neural network encoding for community detection in attribute networks. IEEE Transactions on Cybernetics
Zurück zum Zitat Wu Z, Pan S, Chen F, Long G, Zhang C, Yu PS (2019) A comprehensive survey on graph neural networks. IEEE transact neural netw learn syst 32(1):4–24MathSciNetCrossRef Wu Z, Pan S, Chen F, Long G, Zhang C, Yu PS (2019) A comprehensive survey on graph neural networks. IEEE transact neural netw learn syst 32(1):4–24MathSciNetCrossRef
Zurück zum Zitat Zhang M, Chen Y (2018) Link prediction based on graph neural networks. In: Proceedings of the 32nd International Conference on Neural Information Processing Systems. NIPS’18, pp. 5171–5181. Curran Associates Inc., Red Hook, NY, USA Zhang M, Chen Y (2018) Link prediction based on graph neural networks. In: Proceedings of the 32nd International Conference on Neural Information Processing Systems. NIPS’18, pp. 5171–5181. Curran Associates Inc., Red Hook, NY, USA
Metadaten
Titel
Constant community identification in million-scale networks
verfasst von
Anjan Chowdhury
Sriram Srinivasan
Sanjukta Bhowmick
Animesh Mukherjee
Kuntal Ghosh
Publikationsdatum
01.12.2022
Verlag
Springer Vienna
Erschienen in
Social Network Analysis and Mining / Ausgabe 1/2022
Print ISSN: 1869-5450
Elektronische ISSN: 1869-5469
DOI
https://doi.org/10.1007/s13278-022-00895-8

Weitere Artikel der Ausgabe 1/2022

Social Network Analysis and Mining 1/2022 Zur Ausgabe

Premium Partner