Abstract
The problem of computing k-edge connected components (k-ECCs) of a graph G for a specific k is a fundamental graph problem and has been investigated recently. In this paper, we study the problemof ECC decomposition, which computes the k-ECCs of a graph G for all k values. ECC decomposition can be widely applied in a variety of applications such as graph-topology analysis, community detection, Steiner component search, and graph visualization. A straightforward solution for ECC decomposition is to apply the existing k-ECC computation algorithm to compute the k-ECCs for all k values. However, this solution is not applicable to large graphs for two challenging reasons. First, all existing k-ECC computation algorithms are highly memory intensive due to the complex data structures used in the algorithms. Second, the number of possible k values can be very large, resulting in a high computational cost when each k value is independently considered. In this paper, we address the above challenges, and study I/O efficient ECC decomposition via graph reduction. We introduce two elegant graph reduction operators which aim to reduce the size of the graph loaded in memory while preserving the connectivity information of a certain set of edges to be computed for a specific k. We also propose three novel I/O efficient algorithms, Bottom-Up, Top-Down, and Hybrid, that explore the k values in different orders to reduce the redundant computations between different k values. We analyze the I/O and memory costs for all proposed algorithms. In our experiments, we evaluate our algorithms using seven real large datasets with various graph properties, one of which contains 1.95 billion edges. The experimental results show that our proposed algorithms are scalable and efficient.
- J. Abello, M. G. Resende, and S. Sudarsky. Massive quasi-clique detection. In LATIN 2002: Theoretical Informatics, pages 598--612. 2002.Google ScholarCross Ref
- A. Aggarwal, J. Vitter, et al. The input/output complexity of sorting and related problems. Communications of the ACM, 31(9):1116--1127, 1988.Google ScholarDigital Library
- C. Aggarwal, Y. Xie, and P. S. Yu. Gconnect: A connectivity index for massive disk-resident graphs. PVLDB, 2(1):862--873, 2009.Google ScholarDigital Library
- R. Agrawal, S. Rajagopalan, R. Srikant, and Y. Xu. Mining newsgroups using networks arising from social behavior. In Proc. of WWW'03, pages 529--535, 2003.Google ScholarDigital Library
- T. Akiba, Y. Iwata, and Y. Yoshida. Linear-time enumeration of maximal k-edge-connected subgraphs in large networks by random contraction. In Proc. of CIKM'13, pages 909--918, 2013.Google ScholarDigital Library
- J. I. Alvarez-Hamelin, L. Dall'Asta, A. Barrat, and A. Vespignani. K-core decomposition of internet graphs: Hierarchies, self-similarity and measurement biases. Networks and Heterogeneous Media, 3(2), 2008.Google Scholar
- S. Carmi, S. Havlin, S. Kirkpatrick, Y. Shavitt, and E. Shir. A model of internet topology using k-shell decomposition. Proceedings of the National Academy of Sciences, 104(27):11150--11154, 2007.Google ScholarCross Ref
- L. Chang, X. Lin, L. Qin, J. X. Yu, and W. Zhang. Index-based optimal algorithms for computing Steiner components with maximum connectivity. In Proc. of SIGMOD'15, pages 459--474, 2015.Google ScholarDigital Library
- L. Chang, J. X. Yu, L. Qin, X. Lin, C. Liu, and W. Liang. Efficiently computing k-edge connected components via graph decomposition. In Proc. of SIGMOD'13, pages 205--216, 2013.Google ScholarDigital Library
- J. Chen and B. Yuan. Detecting functional modules in the yeast protein--protein interaction network. Bioinformatics, 22(18):2283--2290, 2006.Google ScholarDigital Library
- J. Cheng, Y. Ke, S. Chu, and M. T. Ozsu. Efficient core decomposition in massive networks. In Proc. of ICDE'11, pages 51--62, 2011.Google ScholarDigital Library
- J. Cheng, Y. Ke, A. W.-C. Fu, J. X. Yu, and L. Zhu. Finding maximal cliques in massive networks. ACM Transactions on Database Systems, 36(4):21, 2011.Google ScholarDigital Library
- J. Cheng, L. Zhu, and Y. Ke. Fast algorithms for maximal clique enumeration with limited memory. In Proc. of SIGKDD'12, pages 1240--1248, 2012.Google ScholarDigital Library
- S. Chu and J. Cheng. Triangle listing in massive networks. ACM Transactions on Knowledge Discovery from Data, 6(4):17, 2012.Google Scholar
- J. Cohen. Trusses: Cohesive subgraphs for social network analysis. National Security Agency Technical Report, page 16, 2008.Google Scholar
- X. Hu, Y. Tao, and C. Chung. Massive graph triangulation. In Proc. of SIGMOD'13, pages 325--336, 2013.Google ScholarDigital Library
- R. D. Luce. Connectivity and generalized cliques in sociometric group structure. Psychometrika, 15(2):169--190, 1950.Google ScholarCross Ref
- R. D. Luce and A. D. Perry. A method of matrix analysis of group structure. Psychometrika, 14(2):95--116, 1949.Google ScholarCross Ref
- T. L. Magnanti and S. Raghavan. Strong formulations for network design problems with connectivity requirements. Networks, 45(2), 2005.Google Scholar
- H. Nagamochi and T. Ibaraki. A linear-time algorithm for finding a sparse k-connected spanning subgraph of a k-connected graph. Algorithmica, 1992.Google Scholar
- J. Pei, D. Jiang, and A. Zhang. On mining cross-graph quasi-cliques. In Proc. of SIGKDD'05, pages 228--238, 2005.Google ScholarDigital Library
- S. B. Seidman. Network structure and minimum degree. Social Networks, 5(3):269--287, 1983.Google ScholarCross Ref
- J. Wang and J. Cheng. Truss decomposition in massive networks. PVLDB, 5(9):812--823, 2012.Google ScholarDigital Library
- N. Wang, J. Zhang, K. Tan, and A. K. H. Tung. On triangulation-based dense neighborhood graphs discovery. PVLDB, 4(2):58--68, 2010.Google ScholarDigital Library
- D. R. White and F. Harary. The cohesiveness of blocks in social networks: Node connectivity and conditional density. Sociological Methodology, 31(1), 2001.Google Scholar
- X. Yan, X. Zhou, and J. Han. Mining closed relational graphs with connectivity constraints. In Proc. of SIGKDD'05, pages 324--333, 2005.Google ScholarDigital Library
- Y. Zhang and S. Parthasarathy. Extracting analyzing and visualizing triangle k-core motifs within networks. In Proc. of ICDE'12, pages 1049--1060, 2012.Google ScholarDigital Library
- Z. Zhang, L. Qin, and J. X. Yu. Contract & expand: I/O efficient SCCs computing. In Proc. of ICDE'14, pages 208--219, 2014.Google ScholarCross Ref
- Z. Zhang, J. X. Yu, L. Qin, L. Chang, and X. Lin. I/O efficient: Computing SCCs in massive graphs. In Proc. of SIGMOD'13, pages 245--270, 2013.Google ScholarDigital Library
- Z. Zhang, J. X. Yu, L. Qin, and Z. Shang. Divide & conquer: I/O efficient depth-first search. In Proc. of SIGMOD'15, pages 445--458, 2015.Google ScholarDigital Library
- F. Zhao and A. K. Tung. Large scale cohesive subgraphs discovery for social network visual analysis. PVLDB, 6(2):85--96, 2012.Google ScholarDigital Library
- R. Zhou, C. Liu, J. X. Yu, and W. Liang. Finding maximal k-edge-connected subgraphs from a large graph. In Proc. of EDBT'12, pages 480--491, 2012.Google ScholarDigital Library
Index Terms
- I/O efficient ECC graph decomposition via graph reduction
Recommendations
Efficiently computing k-edge connected components via graph decomposition
SIGMOD '13: Proceedings of the 2013 ACM SIGMOD International Conference on Management of DataEfficiently computing k-edge connected components in a large graph, G = (V, E), where V is the vertex set and E is the edge set, is a long standing research problem. It is not only fundamental in graph analysis but also crucial in graph search ...
I/O efficient ECC graph decomposition via graph reduction
The problem of computing k-edge connected components (k-$$\mathsf {ECC}$$ECCs) of a graph G for a specific k is a fundamental graph problem and has been investigated recently. In this paper, we study the problem of $$\mathsf {ECC}$$ECC decomposition, ...
Graph Decomposition is NP-Complete: A Complete Proof of Holyer's Conjecture
An H-decomposition of a graph G=(V,E) is a partition of E into subgraphs isomorphic to H. Given a fixed graph H, the H-decomposition problem is to determine whether an input graph G admits an H-decomposition.In 1980, Holyer conjectured that H-...
Comments