Skip to main content

2019 | OriginalPaper | Buchkapitel

DPSCAN: Structural Graph Clustering Based on Density Peaks

verfasst von : Changfa Wu, Yu Gu, Ge Yu

Erschienen in: Database Systems for Advanced Applications

Verlag: Springer International Publishing

Aktivieren Sie unsere intelligente Suche um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Structural graph clustering is one of the fundamental problems in managing and analyzing graph data. The structural clustering algorithm SCAN is successfully used in many applications because it obtains not only clusters but also hubs and outliers. However, the results of SCAN heavily depend on two sensitive parameters, \(\epsilon \) and \(\mu \), which may result in loss of accuracy and efficiency. In this paper, we propose a novel Density Peak-based Structural Clustering Algorithm for Networks (DPSCAN). Specifically, DPSCAN clusters vertices based on the structural similarity and the structural dependency between vertices and their neighbors, without tuning parameters. Through theoretical analysis, we prove that DPSCAN can detect meaningful clusters, hubs and outliers. In addition, to improve the efficiency of DPSCAN, we propose a new index structure named DP-Index, so that each vertex needs to be visited only once. Finally, we conduct comprehensive experimental studies on real and synthetic graphs, which demonstrate that our new approach outperforms the state-of-the-art approaches.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literatur
1.
Zurück zum Zitat Bai, X., Yang, P., Shi, X.: An overlapping community detection algorithm based on density peaks. Neurocomputing 226, 7–15 (2017)CrossRef Bai, X., Yang, P., Shi, X.: An overlapping community detection algorithm based on density peaks. Neurocomputing 226, 7–15 (2017)CrossRef
2.
Zurück zum Zitat Chang, L., Li, W., Qin, L., Zhang, W., Yang, S.: pSCAN: fast and exact structural graph clustering. IEEE Trans. Knowl. Data Eng. 29(2), 387–401 (2017)CrossRef Chang, L., Li, W., Qin, L., Zhang, W., Yang, S.: pSCAN: fast and exact structural graph clustering. IEEE Trans. Knowl. Data Eng. 29(2), 387–401 (2017)CrossRef
3.
Zurück zum Zitat Clauset, A., Newman, M.E., Moore, C.: Finding community structure in very large networks. Phys. Rev. E 70(6), 066111 (2004)CrossRef Clauset, A., Newman, M.E., Moore, C.: Finding community structure in very large networks. Phys. Rev. E 70(6), 066111 (2004)CrossRef
4.
Zurück zum Zitat Cordasco, G., Gargano, L.: Community detection via semi-synchronous label propagation algorithms. In: 2010 IEEE International Workshop on Business Applications of Social Network Analysis (BASNA), pp. 1–8. IEEE (2010) Cordasco, G., Gargano, L.: Community detection via semi-synchronous label propagation algorithms. In: 2010 IEEE International Workshop on Business Applications of Social Network Analysis (BASNA), pp. 1–8. IEEE (2010)
5.
Zurück zum Zitat Falkowski, T., Barth, A., Spiliopoulou, M.: Studying community dynamics with an incremental graph mining algorithm. In: AMCIS 2008 Proceedings, p. 29 (2008) Falkowski, T., Barth, A., Spiliopoulou, M.: Studying community dynamics with an incremental graph mining algorithm. In: AMCIS 2008 Proceedings, p. 29 (2008)
6.
Zurück zum Zitat Girvan, M., Newman, M.E.: Community structure in social and biological networks. Proc. Natl. Acad. Sci. 99(12), 7821–7826 (2002)MathSciNetCrossRef Girvan, M., Newman, M.E.: Community structure in social and biological networks. Proc. Natl. Acad. Sci. 99(12), 7821–7826 (2002)MathSciNetCrossRef
7.
Zurück zum Zitat Gong, S., Zhang, Y., Yu, G.: Clustering stream data by exploring the evolution of density mountain. Proc. VLDB Endowment 11(4), 393–405 (2017)CrossRef Gong, S., Zhang, Y., Yu, G.: Clustering stream data by exploring the evolution of density mountain. Proc. VLDB Endowment 11(4), 393–405 (2017)CrossRef
8.
Zurück zum Zitat Huang, J., Sun, H., Han, J., Deng, H., Sun, Y., Liu, Y.: Shrink: a structural clustering algorithm for detecting hierarchical communities in networks. In: Proceedings of the 19th ACM International Conference on Information and Knowledge Management, pp. 219–228. ACM (2010) Huang, J., Sun, H., Han, J., Deng, H., Sun, Y., Liu, Y.: Shrink: a structural clustering algorithm for detecting hierarchical communities in networks. In: Proceedings of the 19th ACM International Conference on Information and Knowledge Management, pp. 219–228. ACM (2010)
9.
Zurück zum Zitat Lancichinetti, A., Fortunato, S., Radicchi, F.: Benchmark graphs for testing community detection algorithms. Phys. Rev. E 78(4), 046110 (2008)CrossRef Lancichinetti, A., Fortunato, S., Radicchi, F.: Benchmark graphs for testing community detection algorithms. Phys. Rev. E 78(4), 046110 (2008)CrossRef
10.
Zurück zum Zitat Lim, S., Ryu, S., Kwon, S., Jung, K., Lee, J.G.: LinkSCAN*: overlapping community detection using the link-space transformation. In: 2014 IEEE 30th International Conference on Data Engineering (ICDE), pp. 292–303. IEEE (2014) Lim, S., Ryu, S., Kwon, S., Jung, K., Lee, J.G.: LinkSCAN*: overlapping community detection using the link-space transformation. In: 2014 IEEE 30th International Conference on Data Engineering (ICDE), pp. 292–303. IEEE (2014)
11.
Zurück zum Zitat Newman, M.E.: Finding community structure in networks using the eigenvectors of matrices. Phys. Rev. E 74(3), 036104 (2006)MathSciNetCrossRef Newman, M.E.: Finding community structure in networks using the eigenvectors of matrices. Phys. Rev. E 74(3), 036104 (2006)MathSciNetCrossRef
12.
Zurück zum Zitat Onizuka, M., Fujimori, T., Shiokawa, H.: Graph partitioning for distributed graph processing. Data Sci. Eng. 2(1), 94–105 (2017)CrossRef Onizuka, M., Fujimori, T., Shiokawa, H.: Graph partitioning for distributed graph processing. Data Sci. Eng. 2(1), 94–105 (2017)CrossRef
14.
Zurück zum Zitat Raghavan, U.N., Albert, R., Kumara, S.: Near linear time algorithm to detect community structures in large-scale networks. Phys. Rev. E 76(3), 036106 (2007)CrossRef Raghavan, U.N., Albert, R., Kumara, S.: Near linear time algorithm to detect community structures in large-scale networks. Phys. Rev. E 76(3), 036106 (2007)CrossRef
15.
Zurück zum Zitat Rand, W.M.: Objective criteria for the evaluation of clustering methods. J. Am. Stat. Assoc. 66(336), 846–850 (1971)CrossRef Rand, W.M.: Objective criteria for the evaluation of clustering methods. J. Am. Stat. Assoc. 66(336), 846–850 (1971)CrossRef
16.
Zurück zum Zitat Rodriguez, A., Laio, A.: Clustering by fast search and find of density peaks. Science 344(6191), 1492–1496 (2014)CrossRef Rodriguez, A., Laio, A.: Clustering by fast search and find of density peaks. Science 344(6191), 1492–1496 (2014)CrossRef
17.
Zurück zum Zitat Shao, J., Han, Z., Yang, Q., Zhou, T.: Community detection based on distance dynamics. In: Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1075–1084. ACM (2015) Shao, J., Han, Z., Yang, Q., Zhou, T.: Community detection based on distance dynamics. In: Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1075–1084. ACM (2015)
18.
Zurück zum Zitat Shi, J., Malik, J.: Normalized cuts and image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 22(8), 888–905 (2000)CrossRef Shi, J., Malik, J.: Normalized cuts and image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 22(8), 888–905 (2000)CrossRef
19.
Zurück zum Zitat Shiokawa, H., Fujiwara, Y., Onizuka, M.: SCAN++: efficient algorithm for finding clusters, hubs and outliers on large-scale graphs. Proc. VLDB Endowment 8(11), 1178–1189 (2015)CrossRef Shiokawa, H., Fujiwara, Y., Onizuka, M.: SCAN++: efficient algorithm for finding clusters, hubs and outliers on large-scale graphs. Proc. VLDB Endowment 8(11), 1178–1189 (2015)CrossRef
20.
Zurück zum Zitat Strehl, A., Ghosh, J.: Cluster ensembles-a knowledge reuse framework for combining multiple partitions. J. Mach. Learn. Res. 3, 583–617 (2002)MathSciNetMATH Strehl, A., Ghosh, J.: Cluster ensembles-a knowledge reuse framework for combining multiple partitions. J. Mach. Learn. Res. 3, 583–617 (2002)MathSciNetMATH
21.
Zurück zum Zitat Sun, H., Huang, J., Han, J., Deng, H., Zhao, P., Feng, B.: gSkeletonClu: density-based network clustering via structure-connected tree division or agglomeration. In: 2010 IEEE 10th International Conference on Data Mining (ICDM), pp. 481–490. IEEE (2010) Sun, H., Huang, J., Han, J., Deng, H., Zhao, P., Feng, B.: gSkeletonClu: density-based network clustering via structure-connected tree division or agglomeration. In: 2010 IEEE 10th International Conference on Data Mining (ICDM), pp. 481–490. IEEE (2010)
22.
Zurück zum Zitat Wen, D., Qin, L., Zhang, Y., Chang, L., Lin, X.: Efficient structural graph clustering: an index-based approach. Proc. VLDB Endowment 11(3), 243–255 (2017)CrossRef Wen, D., Qin, L., Zhang, Y., Chang, L., Lin, X.: Efficient structural graph clustering: an index-based approach. Proc. VLDB Endowment 11(3), 243–255 (2017)CrossRef
23.
Zurück zum Zitat Xu, X., Yuruk, N., Feng, Z., Schweiger, T.A.: SCAN: a structural clustering algorithm for networks. In: Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 824–833. ACM (2007) Xu, X., Yuruk, N., Feng, Z., Schweiger, T.A.: SCAN: a structural clustering algorithm for networks. In: Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 824–833. ACM (2007)
24.
Zurück zum Zitat Zhou, K., Martin, A., Pan, Q., Liu, Z.: SELP: semi-supervised evidential label propagation algorithm for graph data clustering. Int. J. Approximate Reasoning 92, 139–154 (2018)MathSciNetCrossRef Zhou, K., Martin, A., Pan, Q., Liu, Z.: SELP: semi-supervised evidential label propagation algorithm for graph data clustering. Int. J. Approximate Reasoning 92, 139–154 (2018)MathSciNetCrossRef
Metadaten
Titel
DPSCAN: Structural Graph Clustering Based on Density Peaks
verfasst von
Changfa Wu
Yu Gu
Ge Yu
Copyright-Jahr
2019
DOI
https://doi.org/10.1007/978-3-030-18579-4_37