ABSTRACT
This paper studies I/O-efficient algorithms for settling the classic triangle listing problem, whose solution is a basic operator in dealing with many other graph problems. Specifically, given an undirected graph G, the objective of triangle listing is to find all the cliques involving 3 vertices in G. The problem has been well studied in internal memory, but remains an urgent difficult challenge when G does not fit in memory, rendering any algorithm to entail frequent I/O accesses. Although previous research has attempted to tackle the challenge, the state-of-the-art solutions rely on a set of crippling assumptions to guarantee good performance. Motivated by this, we develop a new algorithm that is provably I/O and CPU efficient at the same time, without making any assumption on the input G at all. The algorithm uses ideas drastically different from all the previous approaches, and outperformed the existing competitors by a factor over an order of magnitude in our extensive experimentation.
- A. Aggarwal and J. S. Vitter. The input/output complexity of sorting and related problems. CACM, 31(9):1116--1127, 1988. Google ScholarDigital Library
- N. Alon and J. H. Spencer. The Probabilistic Methods. Wiley, New York, 2nd edition, 2000.Google Scholar
- E. Bakshy, I. Rosenn, C. Marlow, and L. A. Adamic. The role of social networks in information diffusion. CoRR, 2012.Google ScholarDigital Library
- V. Batagelj and M. Zaversnik. Short cycle connectivity. Discrete Mathematics, 307:310--318, 2007.Google ScholarDigital Library
- D. Chakrabarti, Y. Zhan, and C. Faloutsos. R-MAT: A recursive model for graph mining. In SDM, 2004.Google ScholarCross Ref
- J. Cheng, Y. Ke, A. W.-C. Fu, J. X. Yu, and L. Zhu. Finding maximal cliques in massive networks. TODS, 36(4):21, 2011. Google ScholarDigital Library
- N. Chiba and T. Nishizeki. Arboricity and subgraph listing algorithms. SIAM J. of Comp., 14(1):210--223, 1985. Google ScholarDigital Library
- S. Chu and J. Cheng. Triangle listing in massive networks and its applications. In SIGKDD, pages 672--680, 2011. Google ScholarDigital Library
- S. Chu and J. Cheng. Triangle listing in massive networks. TKDD, 6(4):17, 2012. Google ScholarDigital Library
- J. Cohen. Graph twiddling in a mapreduce world. Computing in Science and Engineering, 11(4):29--41, 2009. Google ScholarDigital Library
- R. Dementiev. Algorithm engineering for large data sets hardware, software, algorithms. PhD thesis, Saarland University, 2006.Google Scholar
- D. Eppstein and E. S. Spiro. The h-index of a graph and its application to dynamic subgraph statistics. In WADS, pages 278--289, 2009. Google ScholarDigital Library
- J. Hellings, G. H. L. Fletcher, and H. J. Haverkort. Efficient external-memory bisimulation on dags. In SIGMOD, pages 553--564, 2012. Google ScholarDigital Library
- A. Itai and M. Rodeh. Finding a minimum circuit in a graph. SIAM J. of Comp., 7(4):413--423, 1978.Google ScholarCross Ref
- M. Latapy. Main-memory triangle computations for very large (sparse (power-law)) graphs. TCC, 407(1-3):458--473, 2008. Google ScholarDigital Library
- M. C. Lin, F. J. Soulignac, and J. L. Szwarcfiter. Arboricity, h-index, and dynamic algorithms. TCC, 426:75--90, 2012. Google ScholarDigital Library
- B. Menegola. An external memory algorithm for listing triangles. Technical report, Universidade Federal do Rio Grande do Sul, 2010.Google Scholar
- C. S. J. A. Nash-Williams. Decomposition of finite graphs into forests. Journal of the London Mathematical Society, 39(1):12, 1964.Google ScholarCross Ref
- T. Schank. Algorithmic Aspects of Triangle-Based Network Analysis. PhD thesis, Universitat Karlsruhe, Fakultat fur Informatik, 2007.Google Scholar
- T. Schank and D. Wagner. Finding, counting and listing all triangles in large graphs, an experimental study. In Workshop on Experimental Algorithms (WEA), pages 606--609, 2005. Google ScholarDigital Library
- C.-H. Tai, P. S. Yu, D.-N. Yang, and M.-S. Chen. Privacy-preserving social network publication against friendship attacks. In SIGKDD, pages 1262--1270, 2011. Google ScholarDigital Library
- J. Wang and J. Cheng. Truss decomposition in massive networks. PVLDB, 5(9):812--823, 2012. Google ScholarDigital Library
- N. Wang, J. Zhang, K.-L. Tan, and A. K. H. Tung. On triangulation-based dense neighborhood graph discovery. PVLDB, 4(2):58--68, 2010. Google ScholarDigital Library
- D. J. Watts and S. H. Strogatz. Collective dynamics of 'small-world' networks. Nature, 393:440--442, 1998.Google ScholarCross Ref
- P. Zhao, C. C. Aggarwal, and M. Wang. gsketch: On query estimation in graph streams. PVLDB, 5(3):193--204, 2011. Google ScholarDigital Library
Index Terms
- Massive graph triangulation
Recommendations
Acyclically 4-colorable triangulations
An acyclic k-coloring of a graph is a proper vertex k-coloring such that every bichromatic subgraph, induced by this coloring, contains no cycles. A graph is acyclically k-colorable if it has an acyclic k-coloring. In this paper, we prove that every ...
Some forbidden subgraph conditions for a graph to have a k-contractible edge
Special issue: Combinatorics 2000An edge of a k-connected graph is said to be k-contractible if the contraction of the edge results in a k-connected graph. Let K"n^- stand for the graph obtained from K"n by removing one edge. Let G be a k-connected graph (k>=5). It is known that if ...
Graph Decomposition is NP-Complete: A Complete Proof of Holyer's Conjecture
An H-decomposition of a graph G=(V,E) is a partition of E into subgraphs isomorphic to H. Given a fixed graph H, the H-decomposition problem is to determine whether an input graph G admits an H-decomposition.In 1980, Holyer conjectured that H-...
Comments