ABSTRACT
K-clique counting is a fundamental problem in network analysis which has attracted much attention in recent years. Computing the count of k-cliques in a graph for a large k (e.g., k = 8) is often intractable as the number of k-cliques increases exponentially w.r.t. (with respect to) k. Existing exact k-clique counting algorithms are often hard to handle large dense graphs, while sampling-based solutions either require a huge number of samples or consume very high storage space to achieve a satisfactory accuracy. To overcome these limitations, we propose a new framework to estimate the number of k-cliques which integrates both the exact k-clique counting technique and two novel color-based sampling techniques. The key insight of our framework is that we only apply the exact algorithm to compute the k-clique counts in the sparse regions of a graph, and use the proposed sampling-based techniques to estimate the number of k-cliques in the dense regions of the graph. Specifically, we develop two novel dynamic programming based k-color set sampling techniques to efficiently estimate the k-clique counts, where a k-color set contains k nodes with k different colors. Since a k-color set is often a good approximation of a k-clique in the dense regions of a graph, our sampling-based solutions are extremely efficient and accurate. Moreover, the proposed sampling techniques are space efficient which use near-linear space w.r.t. graph size. We conduct extensive experiments to evaluate our algorithms using 8 real-life graphs. The results show that our best algorithm is at least one order of magnitude faster than the state-of-the-art sampling-based solutions (with the same relative error 0.1%) and can be up to three orders of magnitude faster than the state-of-the-art exact algorithm on large graphs.
- Mohammad Almasri, Izzat El Hajj, Rakesh Nagi, Jinjun Xiong, and Wen-Mei W. Hwu. 2021. K-Clique Counting on GPUs. CoRR abs/2104.13209(2021).Google Scholar
- Noga Alon, Raphael Yuster, and Uri Zwick. 1994. Color-coding: a new method for finding simple paths, cycles and other small subgraphs within large graphs. In STOC.Google Scholar
- Balabhaskar Balasundaram and Sergiy Butenko. 2006. Graph Domination, Coloring and Cliques in Telecommunications. In Handbook of Optimization in Telecommunications. Springer, 865–890.Google Scholar
- Vladimir Batagelj and Matjaz Zaversnik. 2003. An O(m) Algorithm for Cores Decomposition of Networks. CoRR cs.DS/0310049(2003).Google Scholar
- Luca Becchetti, Paolo Boldi, Carlos Castillo, and Aristides Gionis. 2008. Efficient semi-streaming algorithms for local triangle counting in massive graphs. In KDD.Google Scholar
- Austin R. Benson, David F. Gleich, and Jure Leskovec. 2016. Higher-order organization of complex networks. Science 353, 6295 (2016).Google Scholar
- J. W. Berry, B. Hendrickson, R. A. Laviolette, and C. A. Phillips. 2011. Tolerating the Community Detection Resolution Limit with Edge Weighting. Physical Review E Statistical Nonlinear & Soft Matter Physics 83, 5(2011), 056119.Google ScholarCross Ref
- Marco Bressan, Flavio Chierichetti, Ravi Kumar, Stefano Leucci, and Alessandro Panconesi. 2018. Motif Counting Beyond Five Nodes. ACM Trans. Knowl. Discov. Data 12, 4 (2018), 48:1–48:25.Google ScholarDigital Library
- Marco Bressan, Stefano Leucci, and Alessandro Panconesi. 2019. Motivo: Fast Motif Counting via Succinct Color Coding and Adaptive Sampling. Proc. VLDB Endow. 12, 11 (2019), 1651–1663.Google ScholarDigital Library
- S Ronald Burt. 2004. Structural holes and good ideas. Amer. J. Sociology 110, 2 (2004), 349–399.Google ScholarCross Ref
- Keren Censor-Hillel, Yi-Jun Chang, François Le Gall, and Dean Leitersdorf. 2021. Tight Distributed Listing of Cliques. In SODA.Google Scholar
- Lijun Chang and Lu Qin. 2019. Cohesive Subgraph Computation Over Large Sparse Graphs. In ICDE.Google Scholar
- Norishige Chiba and Takao Nishizeki. 1985. Arboricity and Subgraph Listing Algorithms. SIAM J. Comput. 14, 1 (1985), 210–223.Google ScholarDigital Library
- Shumo Chu and James Cheng. 2011. Triangle listing in massive networks and its applications. In KDD.Google Scholar
- Maximilien Danisch, Oana Balalau, and Mauro Sozio. 2018. Listing k-cliques in Sparse Real-World Graphs. In WWW.Google Scholar
- Talya Eden, Dana Ron, and C. Seshadhri. 2018. On approximating the number of k-cliques in sublinear time. In STOC.Google Scholar
- Talya Eden, Dana Ron, and C. Seshadhri. 2020. Faster sublinear approximation of the number of k-cliques in low-arboricity graphs. In SODA.Google Scholar
- Katherine Faust. 2010. A puzzle concerning triads in social networks: Graph constraints and the triad census. Soc. Networks 32, 3 (2010), 221–233.Google ScholarCross Ref
- Irene Finocchi, Marco Finocchi, and Emanuele G. Fusco. 2015. Clique Counting in MapReduce: Algorithms and Experiments. ACM J. Exp. Algorithmics 20 (2015), 1.7:1–1.7:20.Google ScholarDigital Library
- Lukas Gianinazzi, Maciej Besta, Yannick Schaffner, and Torsten Hoefler. 2021. Parallel Algorithms for Finding Large Cliques in Sparse Graphs. In SPAA.Google Scholar
- William Hasenplaugh, Tim Kaler, Tao B. Schardl, and Charles E. Leiserson. 2014. Ordering heuristics for parallel graph coloring. In SPAA.Google Scholar
- Lin Hu, Lei Zou, and Yu Liu. 2021. Accelerating Triangle Counting on GPU. In SIGMOD.Google Scholar
- Shweta Jain and C. Seshadhri. 2017. A Fast and Provable Method for Estimating Clique Counts Using Turán’s Theorem. In WWW.Google Scholar
- Shweta Jain and C. Seshadhri. 2020. The Power of Pivoting for Exact Clique Counting. In WSDM.Google Scholar
- Shweta Jain and C. Seshadhri. 2020. Provably and Efficiently Approximating Near-cliques using the Turán Shadow: PEANUTS. In WWW ’20: The Web Conference 2020, Taipei, Taiwan, April 20-24, 2020. 1966–1976.Google ScholarDigital Library
- Madhav Jha, C. Seshadhri, and Ali Pinar. 2015. Path Sampling: A Fast and Provable Method for Estimating 4-Vertex Subgraph Counts. In WWW.Google ScholarDigital Library
- Matthieu Latapy. 2008. Main-memory triangle computations for very large (sparse (power-law)) graphs. Theor. Comput. Sci. 407, 1-3 (2008), 458–473.Google ScholarDigital Library
- Ronghua Li, Sen Gao, Lu Qin, Guoren Wang, Weihua Yang, and Jeffrey Xu Yu. 2020. Ordering Heuristics for k-clique Listing. Proc. VLDB Endow. 13, 11 (2020), 2536–2548.Google ScholarDigital Library
- Kazuhisa Makino and Takeaki Uno. 2004. New Algorithms for Enumerating All Maximal Cliques. In 9th Scandinavian Workshop on Algorithm Theory.Google Scholar
- David W. Matula and Leland L. Beck. 1983. Smallest-Last Ordering and clustering and Graph Coloring Algorithms. J. ACM 30, 3 (1983), 417–427.Google ScholarDigital Library
- R. Milo, S. Shen-Orr, S. Itzkovitz, N. Kashtan, D Chklovskii, and U. Alon. 2010. Network Motifs: Simple Building Blocks of Complex Networks. Science 298, 5594 (2010), 763–764.Google Scholar
- Mark Ortmann and Ulrik Brandes. 2014. Triangle Listing Algorithms: Back from the Diversion. In ALENEX.Google Scholar
- Noujan Pashanasangi and C. Seshadhri. 2020. Efficiently Counting Vertex Orbits of All 5-vertex Subgraphs, by EVOKE. In WSDM.Google Scholar
- Ali Pinar, C. Seshadhri, and Vaidyanathan Vishal. 2017. ESCAPE: Efficiently Counting All 5-Vertex Subgraphs. In WWW.Google Scholar
- Natasa Przulj, Derek G. Corneil, and Igor Jurisica. 2004. Modeling interactome: scale-free or geometric?Bioinform. 20, 18 (2004), 3508–3515.Google Scholar
- Mahmudur Rahman, Mansurul Alam Bhuiyan, and Mohammad Al Hasan. 2014. Graft: An Efficient Graphlet Counting Method for Large Graph Analysis. IEEE Trans. Knowl. Data Eng. 26, 10 (2014), 2466–2478.Google ScholarCross Ref
- Ahmet Erdem Sariyüce, C. Seshadhri, Ali Pinar, and Ümit V. Çatalyürek. 2015. Finding the Hierarchy of Dense Subgraphs using Nucleus Decompositions. In WWW.Google Scholar
- Comandur Seshadhri and Srikanta Tirthapura. 2019. Scalable Subgraph Counting: The Methods Behind The Madness. In WWW.Google Scholar
- Bintao Sun, Maximilien Danisch, T.-H. Hubert Chan, and Mauro Sozio. 2020. KClist++: A Simple Algorithm for Finding k-Clique Densest Subgraphs in Large Graphs. Proc. VLDB Endow. 13, 10 (2020), 1628–1640.Google ScholarDigital Library
- Ancy Sarah Tom, Narayanan Sundaram, Nesreen K. Ahmed, Shaden Smith, Stijn Eyerman, Midhunchandra Kodiyath, Ibrahim Hur, Fabrizio Petrini, and George Karypis. 2017. Exploring optimizations on shared-memory platforms for parallel triangle counting algorithms. In HPEC.Google Scholar
- Etsuji Tomita, Akira Tanaka, and Haruhisa Takahashi. 2006. The worst-case time complexity for generating all maximal cliques and computational experiments. Theor. Comput. Sci. 363, 1 (2006), 28–42.Google ScholarDigital Library
- Charalampos E. Tsourakakis. 2015. The K-clique Densest Subgraph Problem. In WWW.Google Scholar
- Charalampos E. Tsourakakis, U Kang, Gary L. Miller, and Christos Faloutsos. 2009. DOULION: counting triangles in massive graphs with a coin. In KDD.Google Scholar
- Pinghui Wang, Junzhou Zhao, Xiangliang Zhang, Zhenguo Li, Jiefeng Cheng, John C. S. Lui, Don Towsley, Jing Tao, and Xiaohong Guan. 2018. MOSS-5: A Fast Method of Approximating Counts of 5-Node Graphlets in Large Graphs. IEEE Trans. Knowl. Data Eng. 30, 1 (2018), 73–86.Google ScholarCross Ref
- Hao Yin, Austin R. Benson, and Jure Leskovec. 2017. Higher-order clustering in networks. Physical Review E 97, 5 (2017), 052306.Google ScholarCross Ref
- Long Yuan, Lu Qin, Xuemin Lin, Lijun Chang, and Wenjie Zhang. 2017. Effective and Efficient Dynamic Graph Coloring. Proc. VLDB Endow. 11, 3 (2017), 338–351.Google ScholarDigital Library
Index Terms
- Lightning Fast and Space Efficient k-clique Counting
Recommendations
Parallel K-clique counting on GPUs
ICS '22: Proceedings of the 36th ACM International Conference on SupercomputingCounting k-cliques in a graph is an important problem in graph analysis with many applications such as community detection and graph partitioning. Counting k-cliques is typically done by traversing search trees starting at each vertex in the graph. ...
Efficient Biclique Counting in Large Bipartite Graphs
PACMMODA (p,q)-biclique is a complete subgraph (X,Y) that |X|=p, |Y|=q. Counting (p,q)-bicliques in bipartite graphs is an important operator for many bipartite graph analysis applications. However, getting the count of (p,q)-bicliques for large p and q (e.g., ...
Clique-transversal sets and clique-coloring in planar graphs
Let G=(V,E) be a graph. A clique-transversal setD is a subset of vertices of G such that D meets all cliques of G, where a clique is defined as a complete subgraph maximal under inclusion and having at least two vertices. The clique-transversal number, ...
Comments