ABSTRACT
The number of triangles in a graph is a fundamental metric widely used in social network analysis, link classification and recommendation, and more. In these applications, modern graphs of interest tend to both large and dynamic. This paper presents the design and implementation of a fast parallel algorithm for estimating the number of triangles in a massive undirected graph whose edges arrive as a stream. Our algorithm is designed for shared-memory multicore machines and can make efficient use of parallelism and the memory hierarchy. We provide theoretical guarantees on performance and accuracy, and our experiments on real-world datasets show accurate results and substantial speedups compared to an optimized sequential implementation.
- Z. Bar-Yossef, R. Kumar, and D. Sivakumar. Reductions in streaming algorithms, with an application to counting triangles in graphs. In Proc. ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 623--632, 2002. Google ScholarDigital Library
- L. Becchetti, P. Boldi, C. Castillo, and A. Gionis. Efficient semi-streaming algorithms for local triangle counting in massive graphs. In Proc. ACM Conference on Knowledge Discovery and Data Mining (KDD), pages 16--24, 2008. Google ScholarDigital Library
- J. W. Berry, L. Fosvedt, D. Nordman, C. A. Phillips, and A. G. Wilson. Listing triangles in expected linear time on power law graphs with exponent at least 7/3. Technical report, Sandia National Laboratories, 2011.Google Scholar
- G. E. Blelloch, P. B. Gibbons, and H. V. Simhadri. Low depth cache-oblivious algorithms. In SPAA'10, pages 189--199. ACM, 2010. Google ScholarDigital Library
- G. E. Blelloch, J. T. Fineman, P. B. Gibbons, and H. V. Simhadri. Scheduling irregular parallel computations on hierarchical caches. In SPAA'11, pages 355--366. ACM, 2011. Google ScholarDigital Library
- L. S. Buriol, G. Frahling, S. Leonardi, and C. Sohler. Estimating clustering indexes in data streams. In Proc. European Symposium on Algorithms (ESA), pages 618--632, 2007. Google ScholarDigital Library
- N. Chiba and T. Nishizeki. Arboricity and subgraph listing algorithms. SIAM Journal on computing, 14: 210--223, 1985. Google ScholarDigital Library
- S. Chu and J. Cheng. Triangle listing in massive networks and its applications. In Knowledge Data and Discovery (KDD), pages 672--680, 2011. Google ScholarDigital Library
- J. Cohen. Graph twiddling in a mapreduce world. Computing in Science and Engineering, 11: 29--41, 2009. Google ScholarDigital Library
- J.-P. Eckmann and E. Moses. Curvature of co-links uncovers hidden thematic layers in the world wide web. Proceedings of the National Academy of Sciences, 99 (9): 5825--5829, 2002. 10.1073/pnas.032093399.Google ScholarCross Ref
- M. Frigo, C. E. Leiserson, H. Prokop, and S. Ramachandran. Cache-oblivious algorithms. In FOCS, 1999. Google ScholarDigital Library
- Intel Corp. Intel Cilk Plus, 2013. http://cilkplus.org/.Google Scholar
- M. Jha, C. Seshadhri, and A. Pinar. From the birthday paradox to a practical sublinear space streaming algorithm for triangle counting. CoRR, abs/1212.2264, 2012.Google Scholar
- H. Jowhari and M. Ghodsi. New streaming algorithms for counting triangles in graphs. In Proc. 11th Annual International Conference Computing and Combinatorics (COCOON), pages 710--716, 2005. Google ScholarDigital Library
- D. M. Kane, K. Mehlhorn, T. Sauerwald, and H. Sun. Counting arbitrary subgraphs in data streams. In Proc. International Colloquium on Automata, Languages, and Programming (ICALP), pages 598--609, 2012. Google ScholarDigital Library
- M. N. Kolountzakis, G. L. Miller, R. Peng, and C. E. Tsourakakis. Efficient triangle counting in large graphs via degree-based vertex partitioning. In WAW, pages 15--24, 2010.Google ScholarCross Ref
- K. Kutzkov and R. Pagh. On the streaming complexity of computing local clustering coefficients. In Proceedings of 6th ACM conference on Web Search and Data Mining (WSDM), 2013. Google ScholarDigital Library
- M. Latapy. Main-memory triangle computations for very large (sparse (power-law)) graphs. Theoretical Computer Science, 407: 458--473, 2008. Google ScholarDigital Library
- J. Leskovec. Stanford large network dataset collection. http://snap.stanford.edu/data/index.html, 2012. Accessed Dec 5, 2012.Google Scholar
- M. Manjunath, K. Mehlhorn, K. Panagiotou, and H. Sun. Approximate counting of cycles in streams. In Proc. European Symposium on Algorithms (ESA), pages 677--688, 2011. Google ScholarDigital Library
- M. E. J. Newman. The structure and function of complex networks. SIAM REVIEW, 45: 167--256, 2003.Google ScholarDigital Library
- R. Pagh and C. E. Tsourakakis. Colorful triangle counting and a mapreduce implementation. Inf. Process. Lett., 112 (7): 277--281, 2012. Google ScholarDigital Library
- A. Pavan, K. Tangwongsan, S. Tirthapura, and K.-L. Wu. Counting and sampling triangles from a graph stream. PVLDB, 6 (14), 2013.Google Scholar
- T. Schank and D. Wagner. Finding, counting and listing all triangles in large graphs, an experimental study. In Workshop on Experimental and Efficient Algorithms (WEA), pages 606--609, 2005. Google ScholarDigital Library
- J. Shun, G. E. Blelloch, J. T. Fineman, P. B. Gibbons, A. Kyrola, H. V. Simhadri, and K. Tangwongsan. Brief announcement: the problem based benchmark suite. In Proc. ACM Symposium on Parallelism in Algorithms and Architectures (SPAA), pages 68--70, 2012. Google ScholarDigital Library
- S. Suri and S. Vassilvitskii. Counting triangles and the curse of the last reducer. In Proc. 20th International Conference on World Wide Web (WWW), pages 607--614, 2011. Google ScholarDigital Library
- K. Tangwongsan, A. Pavan, and S. Tirthapura. Parallel triangle counting in massive streaming graphs. CoRR, abs/1308, 2013.Google Scholar
- C. E. Tsourakakis, U. Kang, G. L. Miller, and C. Faloutsos. Doulion: counting triangles in massive graphs with a coin. In Proc. ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pages 837--846, 2009. Google ScholarDigital Library
- C. E. Tsourakakis, P. Drineas, E. Michelakis, I. Koutis, and C. Faloutsos. Spectral counting of triangles via element-wise sparsification and triangle-based link recommendation. Social Netw. Analys. Mining, 1 (2): 75--81, 2011.Google ScholarCross Ref
- S. Wasserman and K. Faust. Social Network Analysis. Cambridage University Press, 1994.Google ScholarCross Ref
Index Terms
- Parallel triangle counting in massive streaming graphs
Recommendations
Distributed, Shared-Memory Parallel Triangle Counting
PASC '18: Proceedings of the Platform for Advanced Scientific Computing ConferenceTriangles are the most basic non-trivial subgraphs. Triangle counting is used in a number of different applications, including social network mining, cyber security, and spam detection. In general, triangle counting algorithms are readily parallelizable,...
A 2D Parallel Triangle Counting Algorithm for Distributed-Memory Architectures
ICPP '19: Proceedings of the 48th International Conference on Parallel ProcessingTriangle counting is a fundamental graph analytic operation that is used extensively in network science and graph mining. As the size of the graphs that needs to be analyzed continues to grow, there is a requirement in developing scalable algorithms for ...
Parallel community detection for massive graphs
PPAM'11: Proceedings of the 9th international conference on Parallel Processing and Applied Mathematics - Volume Part ITackling the current volume of graph-structured data requires parallel tools. We extend our work on analyzing such massive graph data with the first massively parallel algorithm for community detection that scales to current data sizes, scaling to ...
Comments