Skip to main content
Top
Published in: World Wide Web 2/2020

07-11-2019

Towards k-vertex connected component discovery from large networks

Authors: Yuan Li, Guoren Wang, Yuhai Zhao, Feida Zhu, Yubao Wu

Published in: World Wide Web | Issue 2/2020

Log in

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

In many real life network-based applications such as social relation analysis, Web analysis, collaborative network, road network and bioinformatics, the discovery of components with high connectivity is an important problem. In particular, k-edge connected component (k-ECC) has recently been extensively studied to discover disjoint components. Yet many real scenarios present more needs and challenges for overlapping components. In this paper, we propose a k-vertex connected component (k-VCC) model, which is much more cohesive, and thus supports overlapping between components very well. To discover k-VCCs, we propose three frameworks including top-down, bottom-up and hybrid frameworks. The top-down framework is first developed to find the exact k-VCCs by dividing the whole network. To further reduce the high computational cost for input networks of large sizes, a bottom-up framework is then proposed to locally identify the seed subgraphs, and obtain the heuristic k-VCCs by expanding and merging these seed subgraphs. Finally, the hybrid framework takes advantages of the above two frameworks. It exploits the results of bottom-up framework to construct the well-designed mixed graph and then discover the exact k-VCCs by contracting the mixed graph in a top-down way. Because the size of mixed graph is smaller than the original network, the hybrid framework runs much faster than the top-down framework. Comprehensive experimental are conducted on large real and synthetic networks and demonstrate the efficiency and effectiveness of the proposed exact and heuristic approaches.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Literature
1.
go back to reference Adamcsek, B., Palla, G., Farkas, I., Derényi, I., Vicsek, T.: Cfinder: Locating cliques and overlapping modules in biological networks. Bioinformatics 22(8), 1021–1023 (2006)CrossRef Adamcsek, B., Palla, G., Farkas, I., Derényi, I., Vicsek, T.: Cfinder: Locating cliques and overlapping modules in biological networks. Bioinformatics 22(8), 1021–1023 (2006)CrossRef
2.
go back to reference Akiba, T., Iwata, Y., Yoshida, Y.: Linear-time enumeration of maximal k-edge-connected subgraphs in large networks by random contraction. In: CIKM, pp. 909–918 (2013) Akiba, T., Iwata, Y., Yoshida, Y.: Linear-time enumeration of maximal k-edge-connected subgraphs in large networks by random contraction. In: CIKM, pp. 909–918 (2013)
3.
go back to reference Batagelj, V., Zaversnik, M.: An o (m) algorithm for cores decomposition of networks. arXiv:cs/0310049 (2003) Batagelj, V., Zaversnik, M.: An o (m) algorithm for cores decomposition of networks. arXiv:cs/​0310049 (2003)
4.
go back to reference Berlowitz, D., Cohen, S., Kimelfeld, B.: Efficient enumeration of maximal k-plexes. In: SIGMOD, pp. 431–444 (2015) Berlowitz, D., Cohen, S., Kimelfeld, B.: Efficient enumeration of maximal k-plexes. In: SIGMOD, pp. 431–444 (2015)
5.
go back to reference Böger, C.A., Chen, M.H., Tin, A., Olden, M., Köttgen, A., de Boer, I.H., Fuchsberger, C., O’Seaghdha, C.M., Pattaro, C., Teumer, A., et al: Cubn is a gene locus for albuminuria. J. Am. Soc. Nephrol. 22(3), 555–570 (2011)CrossRef Böger, C.A., Chen, M.H., Tin, A., Olden, M., Köttgen, A., de Boer, I.H., Fuchsberger, C., O’Seaghdha, C.M., Pattaro, C., Teumer, A., et al: Cubn is a gene locus for albuminuria. J. Am. Soc. Nephrol. 22(3), 555–570 (2011)CrossRef
6.
go back to reference Chang, L., Yu, J.X., Qin, L., Lin, X., Liu, C., Liang, W.: Efficiently computing k-edge connected components via graph decomposition. In: SIGMOD, pp. 205–216 (2013) Chang, L., Yu, J.X., Qin, L., Lin, X., Liu, C., Liang, W.: Efficiently computing k-edge connected components via graph decomposition. In: SIGMOD, pp. 205–216 (2013)
7.
go back to reference Chang, L., Lin, X., Qin, L., Yu, J.X., Zhang, W.: Index-based optimal algorithms for computing steiner components with maximum connectivity. In: SIGMOD, pp. 459–474. ACM (2015) Chang, L., Lin, X., Qin, L., Yu, J.X., Zhang, W.: Index-based optimal algorithms for computing steiner components with maximum connectivity. In: SIGMOD, pp. 459–474. ACM (2015)
8.
go back to reference Chen, L.Y., Zhao, W.H., Tian, W., Guo, J., Jiang, F., Jin, L.J., Sun, Y.X., Chen, K.M., An, L.L., Li, G., et al: Stk39 is an independent risk factor for male hypertension in Han Chinese. Int. J. Cardiol. 154(2), 122–127 (2012)CrossRef Chen, L.Y., Zhao, W.H., Tian, W., Guo, J., Jiang, F., Jin, L.J., Sun, Y.X., Chen, K.M., An, L.L., Li, G., et al: Stk39 is an independent risk factor for male hypertension in Han Chinese. Int. J. Cardiol. 154(2), 122–127 (2012)CrossRef
9.
go back to reference Cheng, J., Ke, Y., Chu, S., Özsu, M. T.: Efficient core decomposition in massive networks. In: ICDE, pp. 51–62 (2011) Cheng, J., Ke, Y., Chu, S., Özsu, M. T.: Efficient core decomposition in massive networks. In: ICDE, pp. 51–62 (2011)
10.
go back to reference Christophides, V., Karvounarakis, G., Plexousakis, D., Scholl, M., Tourtounis, S.: Optimizing taxonomic semantic Web queries using labeling schemes. Web Semantics: Science Services and Agents on the World Wide Web 1(2), 207–228 (2004)CrossRef Christophides, V., Karvounarakis, G., Plexousakis, D., Scholl, M., Tourtounis, S.: Optimizing taxonomic semantic Web queries using labeling schemes. Web Semantics: Science Services and Agents on the World Wide Web 1(2), 207–228 (2004)CrossRef
11.
go back to reference Cohen, J.: Trusses: Cohesive subgraphs for social network analysis. National Security Agency Technical Report 16 (2008) Cohen, J.: Trusses: Cohesive subgraphs for social network analysis. National Security Agency Technical Report 16 (2008)
12.
go back to reference Consortium, W.T.C.C., et al.: Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature 447(7145), 661 (2007)CrossRef Consortium, W.T.C.C., et al.: Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature 447(7145), 661 (2007)CrossRef
13.
go back to reference Conte, A., Firmani, D., Mordente, C., Patrignani, M., Torlone, R.: Fast enumeration of large k-plexes. In: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 115–124. ACM (2017) Conte, A., Firmani, D., Mordente, C., Patrignani, M., Torlone, R.: Fast enumeration of large k-plexes. In: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 115–124. ACM (2017)
14.
go back to reference Cui, W., Xiao, Y., Wang, H., Wang, W.: Local search of communities in large graphs. In: SIGMOD, pp. 991–1002 (2014) Cui, W., Xiao, Y., Wang, H., Wang, W.: Local search of communities in large graphs. In: SIGMOD, pp. 991–1002 (2014)
15.
go back to reference Diestel, R.: Graph theory. Grad Texts in Math (2005) Diestel, R.: Graph theory. Grad Texts in Math (2005)
16.
go back to reference Esfahanian, A.H., Louis Hakimi, S.: On computing the connectivities of graphs and digraphs. Networks 14(2), 355–366 (1984)MathSciNetCrossRef Esfahanian, A.H., Louis Hakimi, S.: On computing the connectivities of graphs and digraphs. Networks 14(2), 355–366 (1984)MathSciNetCrossRef
17.
19.
go back to reference Gregory, S.: Finding overlapping communities in networks by label propagation. J. Phys. 12(10), 103018 (2010) Gregory, S.: Finding overlapping communities in networks by label propagation. J. Phys. 12(10), 103018 (2010)
20.
go back to reference Hariharan, R., Kavitha, T., Panigrahi, D., Bhalgat, A.: An o (mn) gomory-hu tree construction algorithm for unweighted graphs. In: ACM Symposium on Theory of Computing, pp. 605–614 (2007) Hariharan, R., Kavitha, T., Panigrahi, D., Bhalgat, A.: An o (mn) gomory-hu tree construction algorithm for unweighted graphs. In: ACM Symposium on Theory of Computing, pp. 605–614 (2007)
21.
go back to reference Hu, J., Wu, X., Cheng, R., Luo, S., Fang, Y.: Querying minimal steiner maximum-connected subgraphs in large graphs. In: CIKM, pp. 1241–1250. ACM (2016) Hu, J., Wu, X., Cheng, R., Luo, S., Fang, Y.: Querying minimal steiner maximum-connected subgraphs in large graphs. In: CIKM, pp. 1241–1250. ACM (2016)
22.
go back to reference Huang, X., Cheng, H., Qin, L., Tian, W., Yu, J.X.: Querying k-truss community in large and dynamic graphs. In: SIGMOD, pp. 1311–1322 (2014) Huang, X., Cheng, H., Qin, L., Tian, W., Yu, J.X.: Querying k-truss community in large and dynamic graphs. In: SIGMOD, pp. 1311–1322 (2014)
23.
go back to reference Huang, X., Lu, W., Lakshmanan, L.V.: Truss decomposition of probabilistic graphs: Semantics and algorithms. In: ACM Proc. of SIGMOD, pp. 77–90. ACM (2016) Huang, X., Lu, W., Lakshmanan, L.V.: Truss decomposition of probabilistic graphs: Semantics and algorithms. In: ACM Proc. of SIGMOD, pp. 77–90. ACM (2016)
24.
go back to reference Kane, V., Mohanty, S.: A lower bound on the number of vertices of a graph. Proc. Am. Math. Soc. 72(1), 211–212 (1978)MathSciNetCrossRef Kane, V., Mohanty, S.: A lower bound on the number of vertices of a graph. Proc. Am. Math. Soc. 72(1), 211–212 (1978)MathSciNetCrossRef
25.
go back to reference Kargar, M., An, A.: Keyword search in graphs: Finding r-cliques. PVLDB 4 (10), 681–692 (2011) Kargar, M., An, A.: Keyword search in graphs: Finding r-cliques. PVLDB 4 (10), 681–692 (2011)
26.
go back to reference Lappas, T., Liu, K., Terzi, E.: Finding a team of experts in social networks. In: KDD, pp. 467–476. ACM (2009) Lappas, T., Liu, K., Terzi, E.: Finding a team of experts in social networks. In: KDD, pp. 467–476. ACM (2009)
27.
go back to reference Lee, C., Reid, F., McDaid, A., Hurley, N.: Detecting highly overlapping community structure by greedy clique expansion. arXiv:1002.1827(2010) Lee, C., Reid, F., McDaid, A., Hurley, N.: Detecting highly overlapping community structure by greedy clique expansion. arXiv:1002.​1827(2010)
28.
go back to reference Li, Y., Zhao, Y., Wang, G., Zhu, F., Wu, Y., Shi, S.: Effective k-vertex connected component detection in large-scale networks. In: International Conference on Database Systems for Advanced Applications, pp. 404–421. Springer (2017) Li, Y., Zhao, Y., Wang, G., Zhu, F., Wu, Y., Shi, S.: Effective k-vertex connected component detection in large-scale networks. In: International Conference on Database Systems for Advanced Applications, pp. 404–421. Springer (2017)
29.
go back to reference Li, L., Zheng, K., Wang, S., Hua, W., Zhou, X.: Go slow to go fast: Minimal on-road time route scheduling with parking facilities using historical trajectory. VLDB J. 27(3), 321–345 (2018)CrossRef Li, L., Zheng, K., Wang, S., Hua, W., Zhou, X.: Go slow to go fast: Minimal on-road time route scheduling with parking facilities using historical trajectory. VLDB J. 27(3), 321–345 (2018)CrossRef
30.
go back to reference Lian, D., Zheng, K., Ge, Y., Cao, L., Chen, E., Xie, X.: Geomf++: Scalable location recommendation via joint geographical modeling and matrix factorization. ACM Trans. Inf. Syst. (TOIS) 36(3), 33 (2018)CrossRef Lian, D., Zheng, K., Ge, Y., Cao, L., Chen, E., Xie, X.: Geomf++: Scalable location recommendation via joint geographical modeling and matrix factorization. ACM Trans. Inf. Syst. (TOIS) 36(3), 33 (2018)CrossRef
31.
go back to reference Lim, S., Ryu, S., Kwon, S., Jung, K., Lee, J.G.: Linkscan*: Overlapping community detection using the link-space transformation. In: ICDE, pp. 292–303. IEEE (2014) Lim, S., Ryu, S., Kwon, S., Jung, K., Lee, J.G.: Linkscan*: Overlapping community detection using the link-space transformation. In: ICDE, pp. 292–303. IEEE (2014)
32.
go back to reference Liu, G., Liu, Y., Zheng, K., Liu, A., Li, Z., Wang, Y., Zhou, X.: Mcs-gpm: Multi-constrained simulation based graph pattern matching in contextual social graphs. IEEE Trans. Knowl. Data Eng. 30(6), 1050–1064 (2017)CrossRef Liu, G., Liu, Y., Zheng, K., Liu, A., Li, Z., Wang, Y., Zhou, X.: Mcs-gpm: Multi-constrained simulation based graph pattern matching in contextual social graphs. IEEE Trans. Knowl. Data Eng. 30(6), 1050–1064 (2017)CrossRef
33.
go back to reference Mokken, R.J.: Cliques, clubs and clans. Quality & Quantity 13(2), 161–173 (1979)CrossRef Mokken, R.J.: Cliques, clubs and clans. Quality & Quantity 13(2), 161–173 (1979)CrossRef
34.
go back to reference Molloy, M., Reed, B.: The size of the giant component of a random graph with a given degree sequence. Comb. Probab. Comput. 7(03), 295–305 (1998)MathSciNetCrossRef Molloy, M., Reed, B.: The size of the giant component of a random graph with a given degree sequence. Comb. Probab. Comput. 7(03), 295–305 (1998)MathSciNetCrossRef
35.
go back to reference Palla, G., Derényi, I., Farkas, I., Vicsek, T.: Uncovering the overlapping community structure of complex networks in nature and society. Nature 435(7043), 814–818 (2005)CrossRef Palla, G., Derényi, I., Farkas, I., Vicsek, T.: Uncovering the overlapping community structure of complex networks in nature and society. Nature 435(7043), 814–818 (2005)CrossRef
36.
go back to reference Pattillo, J., Youssef, N., Butenko, S.: On clique relaxation models in network analysis. Eur. J. Oper. Res. 226(1), 9–18 (2013)MathSciNetCrossRef Pattillo, J., Youssef, N., Butenko, S.: On clique relaxation models in network analysis. Eur. J. Oper. Res. 226(1), 9–18 (2013)MathSciNetCrossRef
37.
go back to reference Shan, J., Shen, D., Nie, T., Kou, Y., Yu, G.: Searching overlapping communities for group query. World Wide Web 19(6), 1179–1202 (2016)CrossRef Shan, J., Shen, D., Nie, T., Kou, Y., Yu, G.: Searching overlapping communities for group query. World Wide Web 19(6), 1179–1202 (2016)CrossRef
38.
go back to reference Slavin, T.P., Feng, T., Schnell, A., Zhu, X., Elston, R.C.: Two-marker association tests yield new disease associations for coronary artery disease and hypertension. Human Gen. 130(6), 725–733 (2011)CrossRef Slavin, T.P., Feng, T., Schnell, A., Zhu, X., Elston, R.C.: Two-marker association tests yield new disease associations for coronary artery disease and hypertension. Human Gen. 130(6), 725–733 (2011)CrossRef
39.
go back to reference Sozio, M., Gionis, A.: The community-search problem and how to plan a successful cocktail party. In: SIGKDD, pp. 939–948 (2010) Sozio, M., Gionis, A.: The community-search problem and how to plan a successful cocktail party. In: SIGKDD, pp. 939–948 (2010)
41.
go back to reference Sun, H., Huang, J., Bai, Y., Zhao, Z., Jia, X., He, F., Li, Y.: Efficient k-edge connected component detection through an early merging and splitting strategy. Knowl.-Based Syst. 111, 63–72 (2016)CrossRef Sun, H., Huang, J., Bai, Y., Zhao, Z., Jia, X., He, F., Li, Y.: Efficient k-edge connected component detection through an early merging and splitting strategy. Knowl.-Based Syst. 111, 63–72 (2016)CrossRef
42.
go back to reference Wang, J., Cheng, J.: Truss decomposition in massive networks. PVLDB 5(9), 812–823 (2012) Wang, J., Cheng, J.: Truss decomposition in massive networks. PVLDB 5(9), 812–823 (2012)
43.
go back to reference Wang, N., Zhang, J., Tan, K.L., Tung, A.K.: On triangulation-based dense neighborhood graph discovery. PVLDB 4(2), 58–68 (2010) Wang, N., Zhang, J., Tan, K.L., Tung, A.K.: On triangulation-based dense neighborhood graph discovery. PVLDB 4(2), 58–68 (2010)
44.
go back to reference Wang, Y., O’Connell, J.R., McArdle, P.F., Wade, J.B., Dorff, S.E., Shah, S.J., Shi, X., Pan, L., Rampersaud, E., Shen, H., et al.: Whole-genome association study identifies stk39 as a hypertension susceptibility gene. Proc. Natl. Acad. Sci. 106(1), 226–231 (2009)CrossRef Wang, Y., O’Connell, J.R., McArdle, P.F., Wade, J.B., Dorff, S.E., Shah, S.J., Shi, X., Pan, L., Rampersaud, E., Shen, H., et al.: Whole-genome association study identifies stk39 as a hypertension susceptibility gene. Proc. Natl. Acad. Sci. 106(1), 226–231 (2009)CrossRef
45.
go back to reference Wu, Y., Jin, R., Li, J., Zhang, X.: Robust local community detection: On free rider effect and its elimination. PVLDB 8(7), 798–809 (2015) Wu, Y., Jin, R., Li, J., Zhang, X.: Robust local community detection: On free rider effect and its elimination. PVLDB 8(7), 798–809 (2015)
46.
go back to reference Wu, Y., Jin, R., Zhu, X., Zhang, X.: Finding dense and connected subgraphs in dual networks. In: ICDE, pp. 915–926 (2015) Wu, Y., Jin, R., Zhu, X., Zhang, X.: Finding dense and connected subgraphs in dual networks. In: ICDE, pp. 915–926 (2015)
47.
go back to reference Wu, Y., Zhu, X., Li, L., Fan, W., Jin, R., Zhang, X.: Mining dual networks: Models, algorithms and applications. TKDD 10(4), 40 (2016)CrossRef Wu, Y., Zhu, X., Li, L., Fan, W., Jin, R., Zhang, X.: Mining dual networks: Models, algorithms and applications. TKDD 10(4), 40 (2016)CrossRef
48.
go back to reference Yang, J., Leskovec, J.: Defining and evaluating network communities based on ground-truth. In: ICDM, pp. 745–754 (2012) Yang, J., Leskovec, J.: Defining and evaluating network communities based on ground-truth. In: ICDM, pp. 745–754 (2012)
49.
go back to reference Zeng, Z., Wang, J., Zhou, L., Karypis, G.: Coherent closed quasi-clique discovery from large dense graph databases. In: KDD, pp. 797–802 (2006) Zeng, Z., Wang, J., Zhou, L., Karypis, G.: Coherent closed quasi-clique discovery from large dense graph databases. In: KDD, pp. 797–802 (2006)
50.
go back to reference Zhao, Y., Zheng, K., Li, Y., Su, H., Liu, J., Zhou, X.: Destination-aware task assignment in spatial crowdsourcing: A worker decomposition approach. IEEE Transactions on Knowledge and Data Engineering (2019) Zhao, Y., Zheng, K., Li, Y., Su, H., Liu, J., Zhou, X.: Destination-aware task assignment in spatial crowdsourcing: A worker decomposition approach. IEEE Transactions on Knowledge and Data Engineering (2019)
51.
go back to reference Zheng, K., Zheng, Y., Yuan, N.J., Shang, S., Zhou, X.: Online discovery of gathering patterns over trajectories. IEEE Trans. Knowl. Data Eng. 26(8), 1974–1988 (2013)CrossRef Zheng, K., Zheng, Y., Yuan, N.J., Shang, S., Zhou, X.: Online discovery of gathering patterns over trajectories. IEEE Trans. Knowl. Data Eng. 26(8), 1974–1988 (2013)CrossRef
52.
go back to reference Zheng, B., Su, H., Hua, W., Zheng, K., Zhou, X., Li, G.: Efficient clue-based route search on road networks. IEEE Trans. Knowl. Data Eng. 29 (9), 1846–1859 (2017)CrossRef Zheng, B., Su, H., Hua, W., Zheng, K., Zhou, X., Li, G.: Efficient clue-based route search on road networks. IEEE Trans. Knowl. Data Eng. 29 (9), 1846–1859 (2017)CrossRef
53.
go back to reference Zheng, K., Zhao, Y., Lian, D., Zheng, B., Liu, G., Zhou, X.: Reference-based framework for spatio-temporal trajectory compression and query processing. IEEE Transactions on Knowledge and Data Engineering (2019) Zheng, K., Zhao, Y., Lian, D., Zheng, B., Liu, G., Zhou, X.: Reference-based framework for spatio-temporal trajectory compression and query processing. IEEE Transactions on Knowledge and Data Engineering (2019)
54.
go back to reference Zhou, R., Liu, C., Yu, J.X., Liang, W., Chen, B., Li, J.: Finding maximal k-edge-connected subgraphs from a large graph. In: EDBT, pp. 480–491 (2012) Zhou, R., Liu, C., Yu, J.X., Liang, W., Chen, B., Li, J.: Finding maximal k-edge-connected subgraphs from a large graph. In: EDBT, pp. 480–491 (2012)
Metadata
Title
Towards k-vertex connected component discovery from large networks
Authors
Yuan Li
Guoren Wang
Yuhai Zhao
Feida Zhu
Yubao Wu
Publication date
07-11-2019
Publisher
Springer US
Published in
World Wide Web / Issue 2/2020
Print ISSN: 1386-145X
Electronic ISSN: 1573-1413
DOI
https://doi.org/10.1007/s11280-019-00725-6

Other articles of this Issue 2/2020

World Wide Web 2/2020 Go to the issue

Premium Partner