skip to main content
10.1145/2505515.2505741acmconferencesArticle/Chapter ViewAbstractPublication PagescikmConference Proceedingsconference-collections
research-article

Parallel triangle counting in massive streaming graphs

Published:27 October 2013Publication History

ABSTRACT

The number of triangles in a graph is a fundamental metric widely used in social network analysis, link classification and recommendation, and more. In these applications, modern graphs of interest tend to both large and dynamic. This paper presents the design and implementation of a fast parallel algorithm for estimating the number of triangles in a massive undirected graph whose edges arrive as a stream. Our algorithm is designed for shared-memory multicore machines and can make efficient use of parallelism and the memory hierarchy. We provide theoretical guarantees on performance and accuracy, and our experiments on real-world datasets show accurate results and substantial speedups compared to an optimized sequential implementation.

References

  1. Z. Bar-Yossef, R. Kumar, and D. Sivakumar. Reductions in streaming algorithms, with an application to counting triangles in graphs. In Proc. ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 623--632, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. L. Becchetti, P. Boldi, C. Castillo, and A. Gionis. Efficient semi-streaming algorithms for local triangle counting in massive graphs. In Proc. ACM Conference on Knowledge Discovery and Data Mining (KDD), pages 16--24, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. J. W. Berry, L. Fosvedt, D. Nordman, C. A. Phillips, and A. G. Wilson. Listing triangles in expected linear time on power law graphs with exponent at least 7/3. Technical report, Sandia National Laboratories, 2011.Google ScholarGoogle Scholar
  4. G. E. Blelloch, P. B. Gibbons, and H. V. Simhadri. Low depth cache-oblivious algorithms. In SPAA'10, pages 189--199. ACM, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. G. E. Blelloch, J. T. Fineman, P. B. Gibbons, and H. V. Simhadri. Scheduling irregular parallel computations on hierarchical caches. In SPAA'11, pages 355--366. ACM, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. L. S. Buriol, G. Frahling, S. Leonardi, and C. Sohler. Estimating clustering indexes in data streams. In Proc. European Symposium on Algorithms (ESA), pages 618--632, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. N. Chiba and T. Nishizeki. Arboricity and subgraph listing algorithms. SIAM Journal on computing, 14: 210--223, 1985. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. S. Chu and J. Cheng. Triangle listing in massive networks and its applications. In Knowledge Data and Discovery (KDD), pages 672--680, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. J. Cohen. Graph twiddling in a mapreduce world. Computing in Science and Engineering, 11: 29--41, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. J.-P. Eckmann and E. Moses. Curvature of co-links uncovers hidden thematic layers in the world wide web. Proceedings of the National Academy of Sciences, 99 (9): 5825--5829, 2002. 10.1073/pnas.032093399.Google ScholarGoogle ScholarCross RefCross Ref
  11. M. Frigo, C. E. Leiserson, H. Prokop, and S. Ramachandran. Cache-oblivious algorithms. In FOCS, 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Intel Corp. Intel Cilk Plus, 2013. http://cilkplus.org/.Google ScholarGoogle Scholar
  13. M. Jha, C. Seshadhri, and A. Pinar. From the birthday paradox to a practical sublinear space streaming algorithm for triangle counting. CoRR, abs/1212.2264, 2012.Google ScholarGoogle Scholar
  14. H. Jowhari and M. Ghodsi. New streaming algorithms for counting triangles in graphs. In Proc. 11th Annual International Conference Computing and Combinatorics (COCOON), pages 710--716, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. D. M. Kane, K. Mehlhorn, T. Sauerwald, and H. Sun. Counting arbitrary subgraphs in data streams. In Proc. International Colloquium on Automata, Languages, and Programming (ICALP), pages 598--609, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. M. N. Kolountzakis, G. L. Miller, R. Peng, and C. E. Tsourakakis. Efficient triangle counting in large graphs via degree-based vertex partitioning. In WAW, pages 15--24, 2010.Google ScholarGoogle ScholarCross RefCross Ref
  17. K. Kutzkov and R. Pagh. On the streaming complexity of computing local clustering coefficients. In Proceedings of 6th ACM conference on Web Search and Data Mining (WSDM), 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. M. Latapy. Main-memory triangle computations for very large (sparse (power-law)) graphs. Theoretical Computer Science, 407: 458--473, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. J. Leskovec. Stanford large network dataset collection. http://snap.stanford.edu/data/index.html, 2012. Accessed Dec 5, 2012.Google ScholarGoogle Scholar
  20. M. Manjunath, K. Mehlhorn, K. Panagiotou, and H. Sun. Approximate counting of cycles in streams. In Proc. European Symposium on Algorithms (ESA), pages 677--688, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. M. E. J. Newman. The structure and function of complex networks. SIAM REVIEW, 45: 167--256, 2003.Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. R. Pagh and C. E. Tsourakakis. Colorful triangle counting and a mapreduce implementation. Inf. Process. Lett., 112 (7): 277--281, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. A. Pavan, K. Tangwongsan, S. Tirthapura, and K.-L. Wu. Counting and sampling triangles from a graph stream. PVLDB, 6 (14), 2013.Google ScholarGoogle Scholar
  24. T. Schank and D. Wagner. Finding, counting and listing all triangles in large graphs, an experimental study. In Workshop on Experimental and Efficient Algorithms (WEA), pages 606--609, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. J. Shun, G. E. Blelloch, J. T. Fineman, P. B. Gibbons, A. Kyrola, H. V. Simhadri, and K. Tangwongsan. Brief announcement: the problem based benchmark suite. In Proc. ACM Symposium on Parallelism in Algorithms and Architectures (SPAA), pages 68--70, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. S. Suri and S. Vassilvitskii. Counting triangles and the curse of the last reducer. In Proc. 20th International Conference on World Wide Web (WWW), pages 607--614, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. K. Tangwongsan, A. Pavan, and S. Tirthapura. Parallel triangle counting in massive streaming graphs. CoRR, abs/1308, 2013.Google ScholarGoogle Scholar
  28. C. E. Tsourakakis, U. Kang, G. L. Miller, and C. Faloutsos. Doulion: counting triangles in massive graphs with a coin. In Proc. ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pages 837--846, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. C. E. Tsourakakis, P. Drineas, E. Michelakis, I. Koutis, and C. Faloutsos. Spectral counting of triangles via element-wise sparsification and triangle-based link recommendation. Social Netw. Analys. Mining, 1 (2): 75--81, 2011.Google ScholarGoogle ScholarCross RefCross Ref
  30. S. Wasserman and K. Faust. Social Network Analysis. Cambridage University Press, 1994.Google ScholarGoogle ScholarCross RefCross Ref

Index Terms

  1. Parallel triangle counting in massive streaming graphs

        Recommendations

        Comments

        Login options

        Check if you have access through your login credentials or your institution to get full access on this article.

        Sign in
        • Published in

          cover image ACM Conferences
          CIKM '13: Proceedings of the 22nd ACM international conference on Information & Knowledge Management
          October 2013
          2612 pages
          ISBN:9781450322638
          DOI:10.1145/2505515

          Copyright © 2013 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 27 October 2013

          Permissions

          Request permissions about this article.

          Request Permissions

          Check for updates

          Qualifiers

          • research-article

          Acceptance Rates

          CIKM '13 Paper Acceptance Rate143of848submissions,17%Overall Acceptance Rate1,861of8,427submissions,22%

          Upcoming Conference

        PDF Format

        View or Download as a PDF file.

        PDF

        eReader

        View online with eReader.

        eReader