skip to main content
10.1145/3485447.3512167acmconferencesArticle/Chapter ViewAbstractPublication PageswwwConference Proceedingsconference-collections
research-article

Lightning Fast and Space Efficient k-clique Counting

Published:25 April 2022Publication History

ABSTRACT

K-clique counting is a fundamental problem in network analysis which has attracted much attention in recent years. Computing the count of k-cliques in a graph for a large k (e.g., k = 8) is often intractable as the number of k-cliques increases exponentially w.r.t. (with respect to) k. Existing exact k-clique counting algorithms are often hard to handle large dense graphs, while sampling-based solutions either require a huge number of samples or consume very high storage space to achieve a satisfactory accuracy. To overcome these limitations, we propose a new framework to estimate the number of k-cliques which integrates both the exact k-clique counting technique and two novel color-based sampling techniques. The key insight of our framework is that we only apply the exact algorithm to compute the k-clique counts in the sparse regions of a graph, and use the proposed sampling-based techniques to estimate the number of k-cliques in the dense regions of the graph. Specifically, we develop two novel dynamic programming based k-color set sampling techniques to efficiently estimate the k-clique counts, where a k-color set contains k nodes with k different colors. Since a k-color set is often a good approximation of a k-clique in the dense regions of a graph, our sampling-based solutions are extremely efficient and accurate. Moreover, the proposed sampling techniques are space efficient which use near-linear space w.r.t. graph size. We conduct extensive experiments to evaluate our algorithms using 8 real-life graphs. The results show that our best algorithm is at least one order of magnitude faster than the state-of-the-art sampling-based solutions (with the same relative error 0.1%) and can be up to three orders of magnitude faster than the state-of-the-art exact algorithm on large graphs.

References

  1. Mohammad Almasri, Izzat El Hajj, Rakesh Nagi, Jinjun Xiong, and Wen-Mei W. Hwu. 2021. K-Clique Counting on GPUs. CoRR abs/2104.13209(2021).Google ScholarGoogle Scholar
  2. Noga Alon, Raphael Yuster, and Uri Zwick. 1994. Color-coding: a new method for finding simple paths, cycles and other small subgraphs within large graphs. In STOC.Google ScholarGoogle Scholar
  3. Balabhaskar Balasundaram and Sergiy Butenko. 2006. Graph Domination, Coloring and Cliques in Telecommunications. In Handbook of Optimization in Telecommunications. Springer, 865–890.Google ScholarGoogle Scholar
  4. Vladimir Batagelj and Matjaz Zaversnik. 2003. An O(m) Algorithm for Cores Decomposition of Networks. CoRR cs.DS/0310049(2003).Google ScholarGoogle Scholar
  5. Luca Becchetti, Paolo Boldi, Carlos Castillo, and Aristides Gionis. 2008. Efficient semi-streaming algorithms for local triangle counting in massive graphs. In KDD.Google ScholarGoogle Scholar
  6. Austin R. Benson, David F. Gleich, and Jure Leskovec. 2016. Higher-order organization of complex networks. Science 353, 6295 (2016).Google ScholarGoogle Scholar
  7. J. W. Berry, B. Hendrickson, R. A. Laviolette, and C. A. Phillips. 2011. Tolerating the Community Detection Resolution Limit with Edge Weighting. Physical Review E Statistical Nonlinear & Soft Matter Physics 83, 5(2011), 056119.Google ScholarGoogle ScholarCross RefCross Ref
  8. Marco Bressan, Flavio Chierichetti, Ravi Kumar, Stefano Leucci, and Alessandro Panconesi. 2018. Motif Counting Beyond Five Nodes. ACM Trans. Knowl. Discov. Data 12, 4 (2018), 48:1–48:25.Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Marco Bressan, Stefano Leucci, and Alessandro Panconesi. 2019. Motivo: Fast Motif Counting via Succinct Color Coding and Adaptive Sampling. Proc. VLDB Endow. 12, 11 (2019), 1651–1663.Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. S Ronald Burt. 2004. Structural holes and good ideas. Amer. J. Sociology 110, 2 (2004), 349–399.Google ScholarGoogle ScholarCross RefCross Ref
  11. Keren Censor-Hillel, Yi-Jun Chang, François Le Gall, and Dean Leitersdorf. 2021. Tight Distributed Listing of Cliques. In SODA.Google ScholarGoogle Scholar
  12. Lijun Chang and Lu Qin. 2019. Cohesive Subgraph Computation Over Large Sparse Graphs. In ICDE.Google ScholarGoogle Scholar
  13. Norishige Chiba and Takao Nishizeki. 1985. Arboricity and Subgraph Listing Algorithms. SIAM J. Comput. 14, 1 (1985), 210–223.Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. Shumo Chu and James Cheng. 2011. Triangle listing in massive networks and its applications. In KDD.Google ScholarGoogle Scholar
  15. Maximilien Danisch, Oana Balalau, and Mauro Sozio. 2018. Listing k-cliques in Sparse Real-World Graphs. In WWW.Google ScholarGoogle Scholar
  16. Talya Eden, Dana Ron, and C. Seshadhri. 2018. On approximating the number of k-cliques in sublinear time. In STOC.Google ScholarGoogle Scholar
  17. Talya Eden, Dana Ron, and C. Seshadhri. 2020. Faster sublinear approximation of the number of k-cliques in low-arboricity graphs. In SODA.Google ScholarGoogle Scholar
  18. Katherine Faust. 2010. A puzzle concerning triads in social networks: Graph constraints and the triad census. Soc. Networks 32, 3 (2010), 221–233.Google ScholarGoogle ScholarCross RefCross Ref
  19. Irene Finocchi, Marco Finocchi, and Emanuele G. Fusco. 2015. Clique Counting in MapReduce: Algorithms and Experiments. ACM J. Exp. Algorithmics 20 (2015), 1.7:1–1.7:20.Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Lukas Gianinazzi, Maciej Besta, Yannick Schaffner, and Torsten Hoefler. 2021. Parallel Algorithms for Finding Large Cliques in Sparse Graphs. In SPAA.Google ScholarGoogle Scholar
  21. William Hasenplaugh, Tim Kaler, Tao B. Schardl, and Charles E. Leiserson. 2014. Ordering heuristics for parallel graph coloring. In SPAA.Google ScholarGoogle Scholar
  22. Lin Hu, Lei Zou, and Yu Liu. 2021. Accelerating Triangle Counting on GPU. In SIGMOD.Google ScholarGoogle Scholar
  23. Shweta Jain and C. Seshadhri. 2017. A Fast and Provable Method for Estimating Clique Counts Using Turán’s Theorem. In WWW.Google ScholarGoogle Scholar
  24. Shweta Jain and C. Seshadhri. 2020. The Power of Pivoting for Exact Clique Counting. In WSDM.Google ScholarGoogle Scholar
  25. Shweta Jain and C. Seshadhri. 2020. Provably and Efficiently Approximating Near-cliques using the Turán Shadow: PEANUTS. In WWW ’20: The Web Conference 2020, Taipei, Taiwan, April 20-24, 2020. 1966–1976.Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. Madhav Jha, C. Seshadhri, and Ali Pinar. 2015. Path Sampling: A Fast and Provable Method for Estimating 4-Vertex Subgraph Counts. In WWW.Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. Matthieu Latapy. 2008. Main-memory triangle computations for very large (sparse (power-law)) graphs. Theor. Comput. Sci. 407, 1-3 (2008), 458–473.Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. Ronghua Li, Sen Gao, Lu Qin, Guoren Wang, Weihua Yang, and Jeffrey Xu Yu. 2020. Ordering Heuristics for k-clique Listing. Proc. VLDB Endow. 13, 11 (2020), 2536–2548.Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. Kazuhisa Makino and Takeaki Uno. 2004. New Algorithms for Enumerating All Maximal Cliques. In 9th Scandinavian Workshop on Algorithm Theory.Google ScholarGoogle Scholar
  30. David W. Matula and Leland L. Beck. 1983. Smallest-Last Ordering and clustering and Graph Coloring Algorithms. J. ACM 30, 3 (1983), 417–427.Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. R. Milo, S. Shen-Orr, S. Itzkovitz, N. Kashtan, D Chklovskii, and U. Alon. 2010. Network Motifs: Simple Building Blocks of Complex Networks. Science 298, 5594 (2010), 763–764.Google ScholarGoogle Scholar
  32. Mark Ortmann and Ulrik Brandes. 2014. Triangle Listing Algorithms: Back from the Diversion. In ALENEX.Google ScholarGoogle Scholar
  33. Noujan Pashanasangi and C. Seshadhri. 2020. Efficiently Counting Vertex Orbits of All 5-vertex Subgraphs, by EVOKE. In WSDM.Google ScholarGoogle Scholar
  34. Ali Pinar, C. Seshadhri, and Vaidyanathan Vishal. 2017. ESCAPE: Efficiently Counting All 5-Vertex Subgraphs. In WWW.Google ScholarGoogle Scholar
  35. Natasa Przulj, Derek G. Corneil, and Igor Jurisica. 2004. Modeling interactome: scale-free or geometric?Bioinform. 20, 18 (2004), 3508–3515.Google ScholarGoogle Scholar
  36. Mahmudur Rahman, Mansurul Alam Bhuiyan, and Mohammad Al Hasan. 2014. Graft: An Efficient Graphlet Counting Method for Large Graph Analysis. IEEE Trans. Knowl. Data Eng. 26, 10 (2014), 2466–2478.Google ScholarGoogle ScholarCross RefCross Ref
  37. Ahmet Erdem Sariyüce, C. Seshadhri, Ali Pinar, and Ümit V. Çatalyürek. 2015. Finding the Hierarchy of Dense Subgraphs using Nucleus Decompositions. In WWW.Google ScholarGoogle Scholar
  38. Comandur Seshadhri and Srikanta Tirthapura. 2019. Scalable Subgraph Counting: The Methods Behind The Madness. In WWW.Google ScholarGoogle Scholar
  39. Bintao Sun, Maximilien Danisch, T.-H. Hubert Chan, and Mauro Sozio. 2020. KClist++: A Simple Algorithm for Finding k-Clique Densest Subgraphs in Large Graphs. Proc. VLDB Endow. 13, 10 (2020), 1628–1640.Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. Ancy Sarah Tom, Narayanan Sundaram, Nesreen K. Ahmed, Shaden Smith, Stijn Eyerman, Midhunchandra Kodiyath, Ibrahim Hur, Fabrizio Petrini, and George Karypis. 2017. Exploring optimizations on shared-memory platforms for parallel triangle counting algorithms. In HPEC.Google ScholarGoogle Scholar
  41. Etsuji Tomita, Akira Tanaka, and Haruhisa Takahashi. 2006. The worst-case time complexity for generating all maximal cliques and computational experiments. Theor. Comput. Sci. 363, 1 (2006), 28–42.Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. Charalampos E. Tsourakakis. 2015. The K-clique Densest Subgraph Problem. In WWW.Google ScholarGoogle Scholar
  43. Charalampos E. Tsourakakis, U Kang, Gary L. Miller, and Christos Faloutsos. 2009. DOULION: counting triangles in massive graphs with a coin. In KDD.Google ScholarGoogle Scholar
  44. Pinghui Wang, Junzhou Zhao, Xiangliang Zhang, Zhenguo Li, Jiefeng Cheng, John C. S. Lui, Don Towsley, Jing Tao, and Xiaohong Guan. 2018. MOSS-5: A Fast Method of Approximating Counts of 5-Node Graphlets in Large Graphs. IEEE Trans. Knowl. Data Eng. 30, 1 (2018), 73–86.Google ScholarGoogle ScholarCross RefCross Ref
  45. Hao Yin, Austin R. Benson, and Jure Leskovec. 2017. Higher-order clustering in networks. Physical Review E 97, 5 (2017), 052306.Google ScholarGoogle ScholarCross RefCross Ref
  46. Long Yuan, Lu Qin, Xuemin Lin, Lijun Chang, and Wenjie Zhang. 2017. Effective and Efficient Dynamic Graph Coloring. Proc. VLDB Endow. 11, 3 (2017), 338–351.Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Lightning Fast and Space Efficient k-clique Counting
      Index terms have been assigned to the content through auto-classification.

      Recommendations

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in
      • Published in

        cover image ACM Conferences
        WWW '22: Proceedings of the ACM Web Conference 2022
        April 2022
        3764 pages
        ISBN:9781450390965
        DOI:10.1145/3485447

        Copyright © 2022 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 25 April 2022

        Permissions

        Request permissions about this article.

        Request Permissions

        Check for updates

        Qualifiers

        • research-article
        • Research
        • Refereed limited

        Acceptance Rates

        Overall Acceptance Rate1,899of8,196submissions,23%

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader

      HTML Format

      View this article in HTML Format .

      View HTML Format