Skip to main content
Erschienen in: The VLDB Journal 2/2017

31.12.2016 | Regular Paper

I/O efficient ECC graph decomposition via graph reduction

verfasst von: Long Yuan, Lu Qin, Xuemin Lin, Lijun Chang, Wenjie Zhang

Erschienen in: The VLDB Journal | Ausgabe 2/2017

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

The problem of computing k-edge connected components (k-\(\mathsf {ECC}\)s) of a graph G for a specific k is a fundamental graph problem and has been investigated recently. In this paper, we study the problem of \(\mathsf {ECC}\) decomposition, which computes the k-\(\mathsf {ECC}\)s of a graph G for all possible k values. \(\mathsf {ECC}\) decomposition can be widely applied in a variety of applications such as graph-topology analysis, community detection, Steiner Component Search, and graph visualization. A straightforward solution for \(\mathsf {ECC}\) decomposition is to apply the existing k-\(\mathsf {ECC}\) computation algorithm to compute the k-\(\mathsf {ECC}\)s for all k values. However, this solution is not applicable to large graphs for two challenging reasons. First, all existing k-\(\mathsf {ECC}\) computation algorithms are highly memory intensive due to the complex data structures used in the algorithms. Second, the number of possible k values can be very large, resulting in a high computational cost when each k value is independently considered. In this paper, we address the above challenges, and study I/O efficient \(\mathsf {ECC}\) decomposition via graph reduction. We introduce two elegant graph reduction operators which aim to reduce the size of the graph loaded in memory while preserving the connectivity information of a certain set of edges to be computed for a specific k. We also propose three novel I/O efficient algorithms, \(\mathsf {Bottom}\)-\(\mathsf {Up}\), \(\mathsf {Top}\)-\(\mathsf {Down}\), and \(\mathsf {Hybrid}\), that explore the k values in different orders to reduce the redundant computations between different k values. We analyze the I/O and memory costs for all proposed algorithms. In addition, we extend our algorithm to build an efficient index for Steiner Component Search. We show that our index can be used to perform Steiner Component Search in optimal I/Os when only the node information of the graph is allowed to be loaded in memory. In our experiments, we evaluate our algorithms using seven real large datasets with various graph properties, one of which contains 1.95 billion edges. The experimental results show that our proposed algorithms are scalable and efficient.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literatur
1.
Zurück zum Zitat Abello, J., Resende, M.G., Sudarsky, S.: Massive quasi-clique detection. In: Latin American Symposium on Theoretical Informatics, pp. 598–612 (2002) Abello, J., Resende, M.G., Sudarsky, S.: Massive quasi-clique detection. In: Latin American Symposium on Theoretical Informatics, pp. 598–612 (2002)
2.
Zurück zum Zitat Aggarwal, A., Vitter, J., et al.: The input/output complexity of sorting and related problems. Commun. ACM 31(9), 1116–1127 (1988)MathSciNetCrossRef Aggarwal, A., Vitter, J., et al.: The input/output complexity of sorting and related problems. Commun. ACM 31(9), 1116–1127 (1988)MathSciNetCrossRef
3.
Zurück zum Zitat Agrawal, R., Rajagopalan, S., Srikant, R., Xu, Y.: Mining newsgroups using networks arising from social behavior. In: Proceedings of WWW, pp. 529–535 (2003) Agrawal, R., Rajagopalan, S., Srikant, R., Xu, Y.: Mining newsgroups using networks arising from social behavior. In: Proceedings of WWW, pp. 529–535 (2003)
4.
Zurück zum Zitat Akiba, T., Iwata, Y., Yoshida, Y.: Linear-time enumeration of maximal k-edge-connected subgraphs in large networks by random contraction. In: Proceedings CIKM, pp. 909–918 (2013) Akiba, T., Iwata, Y., Yoshida, Y.: Linear-time enumeration of maximal k-edge-connected subgraphs in large networks by random contraction. In: Proceedings CIKM, pp. 909–918 (2013)
5.
Zurück zum Zitat Alvarez-Hamelin, J.I., Dall’Asta, L., Barrat, A., Vespignani, A.: Large scale networks fingerprinting and visualization using the k-core decomposition. In: Advances in Neural Information Processing Systems, pp. 41–50 (2005) Alvarez-Hamelin, J.I., Dall’Asta, L., Barrat, A., Vespignani, A.: Large scale networks fingerprinting and visualization using the k-core decomposition. In: Advances in Neural Information Processing Systems, pp. 41–50 (2005)
6.
Zurück zum Zitat Alvarez-Hamelin, J.I., Dall’Asta, L., Barrat, A., Vespignani, A.: How the k-core decomposition helps in understanding the internet topology. In: ISMA Workshop on the Internet Topology, vol. 1 (2006) Alvarez-Hamelin, J.I., Dall’Asta, L., Barrat, A., Vespignani, A.: How the k-core decomposition helps in understanding the internet topology. In: ISMA Workshop on the Internet Topology, vol. 1 (2006)
7.
Zurück zum Zitat Alvarez-Hamelin, J.I., Dall’Asta, L., Barrat, A., Vespignani, A.: K-core decomposition of internet graphs: hierarchies, self-similarity and measurement biases. Netw. Heterog. Media 3(2), 371–393 (2008)MathSciNetCrossRefMATH Alvarez-Hamelin, J.I., Dall’Asta, L., Barrat, A., Vespignani, A.: K-core decomposition of internet graphs: hierarchies, self-similarity and measurement biases. Netw. Heterog. Media 3(2), 371–393 (2008)MathSciNetCrossRefMATH
8.
Zurück zum Zitat Carmi, S., Havlin, S., Kirkpatrick, S., Shavitt, Y., Shir, E.: A model of internet topology using k-shell decomposition. Proc. Natl Acad. Sci. 104(27), 11150–11154 (2007)CrossRef Carmi, S., Havlin, S., Kirkpatrick, S., Shavitt, Y., Shir, E.: A model of internet topology using k-shell decomposition. Proc. Natl Acad. Sci. 104(27), 11150–11154 (2007)CrossRef
9.
Zurück zum Zitat Chang, L., Lin, X., Qin, L., Yu, J.X., Zhang, W.: Index-based optimal algorithms for computing Steiner components with maximum connectivity. In: Proceedings of the SIGMOD, pp. 459–474 (2015) Chang, L., Lin, X., Qin, L., Yu, J.X., Zhang, W.: Index-based optimal algorithms for computing Steiner components with maximum connectivity. In: Proceedings of the SIGMOD, pp. 459–474 (2015)
10.
Zurück zum Zitat Chang, L., Yu, J.X., Qin, L., Lin, X., Liu, C., Liang, W.: Efficiently computing k-edge connected components via graph decomposition. In: Proceedings of the SIGMOD, pp. 205–216 (2013) Chang, L., Yu, J.X., Qin, L., Lin, X., Liu, C., Liang, W.: Efficiently computing k-edge connected components via graph decomposition. In: Proceedings of the SIGMOD, pp. 205–216 (2013)
11.
Zurück zum Zitat Chen, J., Yuan, B.: Detecting functional modules in the yeast protein-protein interaction network. Bioinformatics 22(18), 2283–2290 (2006)CrossRef Chen, J., Yuan, B.: Detecting functional modules in the yeast protein-protein interaction network. Bioinformatics 22(18), 2283–2290 (2006)CrossRef
12.
Zurück zum Zitat Cheng, J., Ke, Y., Chu, S., Ozsu. M.T.: Efficient core decomposition in massive networks. In: Proceedings of the ICDE, pp. 51–62 (2011) Cheng, J., Ke, Y., Chu, S., Ozsu. M.T.: Efficient core decomposition in massive networks. In: Proceedings of the ICDE, pp. 51–62 (2011)
13.
Zurück zum Zitat Cheng, J., Zhu, L., Ke, Y., Chu, S.: Fast algorithms for maximal clique enumeration with limited memory. In: Proceedings of the SIGKDD, pp. 1240–1248 (2012) Cheng, J., Zhu, L., Ke, Y., Chu, S.: Fast algorithms for maximal clique enumeration with limited memory. In: Proceedings of the SIGKDD, pp. 1240–1248 (2012)
14.
15.
Zurück zum Zitat Hu, X., Tao, Y., Chung, C.: Massive graph triangulation. In: Proceedings of the SIGMOD, pp. 325–336 (2013) Hu, X., Tao, Y., Chung, C.: Massive graph triangulation. In: Proceedings of the SIGMOD, pp. 325–336 (2013)
16.
Zurück zum Zitat Jia, Y., Hoberock, J., Garland, M., Hart, J.: On the visualization of social and other scale-free networks. IEEE Trans. Vis. Comput. Graph. 14(6), 1285–1292 (2008)CrossRef Jia, Y., Hoberock, J., Garland, M., Hart, J.: On the visualization of social and other scale-free networks. IEEE Trans. Vis. Comput. Graph. 14(6), 1285–1292 (2008)CrossRef
17.
Zurück zum Zitat Luce, R.D.: Connectivity and generalized cliques in sociometric group structure. Psychometrika 15(2), 169–190 (1950)MathSciNetCrossRef Luce, R.D.: Connectivity and generalized cliques in sociometric group structure. Psychometrika 15(2), 169–190 (1950)MathSciNetCrossRef
18.
Zurück zum Zitat Luce, R.D., Perry, A.D.: A method of matrix analysis of group structure. Psychometrika 14(2), 95–116 (1949)MathSciNetCrossRef Luce, R.D., Perry, A.D.: A method of matrix analysis of group structure. Psychometrika 14(2), 95–116 (1949)MathSciNetCrossRef
19.
Zurück zum Zitat Magnanti, T.L., Raghavan, S.: Strong formulations for network design problems with connectivity requirements. Networks 45(2), 61–79 (2005)MathSciNetCrossRefMATH Magnanti, T.L., Raghavan, S.: Strong formulations for network design problems with connectivity requirements. Networks 45(2), 61–79 (2005)MathSciNetCrossRefMATH
20.
Zurück zum Zitat Matsuda, H., Ishihara, T., Hashimoto, A.: Classifying molecular sequences using a linkage graph with their pairwise similarities. Theor. Comput. Sci. 210(2), 305–325 (1999)MathSciNetCrossRefMATH Matsuda, H., Ishihara, T., Hashimoto, A.: Classifying molecular sequences using a linkage graph with their pairwise similarities. Theor. Comput. Sci. 210(2), 305–325 (1999)MathSciNetCrossRefMATH
21.
Zurück zum Zitat Nagamochi, H., Ibaraki, T.: A linear-time algorithm for finding a sparse k-connected spanning subgraph of a k-connected graph. Algorithmica 7(1–6), 583–596 (1992)MathSciNetCrossRefMATH Nagamochi, H., Ibaraki, T.: A linear-time algorithm for finding a sparse k-connected spanning subgraph of a k-connected graph. Algorithmica 7(1–6), 583–596 (1992)MathSciNetCrossRefMATH
22.
Zurück zum Zitat Pei, J., Jiang, D., Zhang, A.: On mining cross-graph quasi-cliques. In: Proceedings of the SIGKDD, pp. 228–238 (2005) Pei, J., Jiang, D., Zhang, A.: On mining cross-graph quasi-cliques. In: Proceedings of the SIGKDD, pp. 228–238 (2005)
24.
Zurück zum Zitat Spirin, V., Mirny, L.A.: Protein complexes and functional modules in molecular networks. Proc. Natl. Acad. Sci. 100(21), 12123–12128 (2003)CrossRef Spirin, V., Mirny, L.A.: Protein complexes and functional modules in molecular networks. Proc. Natl. Acad. Sci. 100(21), 12123–12128 (2003)CrossRef
25.
Zurück zum Zitat Wang, J., Cheng, J.: Truss decomposition in massive networks. PVLDB 5(9), 812–823 (2012) Wang, J., Cheng, J.: Truss decomposition in massive networks. PVLDB 5(9), 812–823 (2012)
26.
Zurück zum Zitat Wang, N., Zhang, J., Tan, K., Tung, A.K.H.: On triangulation-based dense neighborhood graphs discovery. PVLDB 4(2), 58–68 (2010) Wang, N., Zhang, J., Tan, K., Tung, A.K.H.: On triangulation-based dense neighborhood graphs discovery. PVLDB 4(2), 58–68 (2010)
27.
Zurück zum Zitat White, D.R., Harary, F.: The cohesiveness of blocks in social networks: node connectivity and conditional density. Sociol. Methodol. 31(1), 305–359 (2001)CrossRef White, D.R., Harary, F.: The cohesiveness of blocks in social networks: node connectivity and conditional density. Sociol. Methodol. 31(1), 305–359 (2001)CrossRef
28.
Zurück zum Zitat Yan, X., Zhou, X., Han, J.: Mining closed relational graphs with connectivity constraints. In: Proceedings of the SIGKDD, pp. 324–333 (2005) Yan, X., Zhou, X., Han, J.: Mining closed relational graphs with connectivity constraints. In: Proceedings of the SIGKDD, pp. 324–333 (2005)
29.
Zurück zum Zitat Yuan, L., Qin, L., Lin, X., Chang, L., Zhang, W.: Diversified top-k clique search. VLDB J. 25(2), 171–196 (2016)CrossRef Yuan, L., Qin, L., Lin, X., Chang, L., Zhang, W.: Diversified top-k clique search. VLDB J. 25(2), 171–196 (2016)CrossRef
30.
Zurück zum Zitat Yuan, L., Qin, L., Lin, X., Chang, L., Zhang, W.: I/O efficient ecc graph decomposition via graph reduction. PVLDB 9(7), 516–527 (2016) Yuan, L., Qin, L., Lin, X., Chang, L., Zhang, W.: I/O efficient ecc graph decomposition via graph reduction. PVLDB 9(7), 516–527 (2016)
31.
Zurück zum Zitat Zhang, Y., Parthasarathy, S.: Extracting analyzing and visualizing triangle k-core motifs within networks. In: Proceedings of the ICDE, pp. 1049–1060 (2012) Zhang, Y., Parthasarathy, S.: Extracting analyzing and visualizing triangle k-core motifs within networks. In: Proceedings of the ICDE, pp. 1049–1060 (2012)
32.
Zurück zum Zitat Zhang, Z., Yu, J.X., Qin, L., Chang, L., Lin, X.: I/O efficient: computing SCCs in massive graphs. In Proceedings of the SIGMOD, pp. 245–270 (2013) Zhang, Z., Yu, J.X., Qin, L., Chang, L., Lin, X.: I/O efficient: computing SCCs in massive graphs. In Proceedings of the SIGMOD, pp. 245–270 (2013)
33.
Zurück zum Zitat Zhang, Z., Yu, J.X., Qin, L., Shang, Z.: Divide & conquer: I/O efficient depth-first search. In: Proceedings of the SIGMOD, pp. 445–458 (2015) Zhang, Z., Yu, J.X., Qin, L., Shang, Z.: Divide & conquer: I/O efficient depth-first search. In: Proceedings of the SIGMOD, pp. 445–458 (2015)
34.
Zurück zum Zitat Zhou, R., Liu, C., Yu, J.X., Liang, W., Chen, B., Li, J.: Finding maximal k-edge-connected subgraphs from a large graph. In: Proceedings of the EDBT, pp. 480–491 (2012) Zhou, R., Liu, C., Yu, J.X., Liang, W., Chen, B., Li, J.: Finding maximal k-edge-connected subgraphs from a large graph. In: Proceedings of the EDBT, pp. 480–491 (2012)
Metadaten
Titel
I/O efficient ECC graph decomposition via graph reduction
verfasst von
Long Yuan
Lu Qin
Xuemin Lin
Lijun Chang
Wenjie Zhang
Publikationsdatum
31.12.2016
Verlag
Springer Berlin Heidelberg
Erschienen in
The VLDB Journal / Ausgabe 2/2017
Print ISSN: 1066-8888
Elektronische ISSN: 0949-877X
DOI
https://doi.org/10.1007/s00778-016-0451-4

Weitere Artikel der Ausgabe 2/2017

The VLDB Journal 2/2017 Zur Ausgabe