Editorial Notes
Computationally Reproducible. The experimental results of this paper were reproduced by a SIGMOD Review Committee and were found to support the central results reported in the paper. Details of the review process are found here:
http://db-reproducibility.seas.harvard.edu/#process
ABSTRACT
The problem of enumerating (i.e., generating) all maximal cliques in a graph has received extensive treatment, due to the plethora of applications in various areas such as data mining, bioinformatics, network analysis and community detection. However, requiring the enumerated subgraphs to be full cliques is too restrictive in common real-life scenarios where "almost cliques" are equally useful. Hence, the notion of a k-plex, a clique relaxation that allows every node to be "missing" k neighbors, has been introduced. But this seemingly minor relaxation casts existing algorithms for clique enumeration inapplicable, for inherent reasons. This paper presents the first provably efficient algorithms, both for enumerating the maximal k-plexes and for enumerating the maximal connected k-plexes. Our algorithms run in polynomial delay for a constant k and incremental FPT delay when k is a parameter. The importance of such algorithms is in the areas mentioned above, as well as in new applications. Extensive experimentation over both real and synthetic datasets shows the efficiency of our algorithms, and their scalability with respect to graph size, density and choice of k, as well as their clear superiority over the state-of-the-art.
- E. A. Akkoyunlu. The enumeration of maximal cliques of large graphs. SIAM J. Comput., 2(1):1--6, 1973.Google ScholarDigital Library
- J. Bailey, T. Manoukian, and K. Ramamohanarao. A fast algorithm for computing hypergraph transversals and its application in mining emerging patterns. In Proceedings of the 3rd IEEE International Conference on Data Mining (ICDM 2003), 19--22 December 2003, Melbourne, Florida, USA, pages 485--488, 2003. Google ScholarDigital Library
- B. Balasundaram, S. Butenko, and I. V. Hicks. Clique relaxations in social network analysis: The maximum k-plex problem. Oper. Res., 59(1):133--142, Jan. 2011. Google ScholarDigital Library
- H. Bernard, P. D. Killworth, and L. Sailer. Informant accuracy in social network data iv: a comparison of clique-level structure in behavioral and cognitive network data. Social Networks, 2(3):191 -- 218, 19791980.Google ScholarCross Ref
- K. Biswas, V. Muthukkumarasamy, and E. Sithirasenan. Maximal clique based clustering scheme for wireless sensor networks. In 2013 IEEE Eighth International Conference on Intelligent Sensors, Sensor Networks and Information Processing, Melbourne, Australia, April 2--5, 2013, pages 237--241, 2013.Google ScholarCross Ref
- V. Boginski, S. Butenko, and P. M. Pardalos. Statistical analysis of financial networks. Computational Statistics and Data Analysis, 48(2):431 -- 443, 2005.Google ScholarCross Ref
- E. Boros, K. M. Elbassioni, V. Gurvich, and L. Khachiyan. An efficient incremental algorithm for generating all maximal independent sets in hypergraphs of bounded dimension. Parallel Processing Letters, 10(4):253--266, 2000.Google ScholarCross Ref
- E. Boros, K. M. Elbassioni, V. Gurvich, and L. Khachiyan. Generating maximal independent sets for hypergraphs with bounded edge-intersections. In LATIN: Theoretical Informatics, 6th Latin American Symposium, Buenos Aires, Argentina, April 5--8, 2004, Proceedings, pages 488--498, 2004.Google Scholar
- C. Bron and J. Kerbosch. Algorithm 457: Finding all cliques of an undirected graph. Commun. ACM, 16(9):575--577, Sept. 1973. Google ScholarDigital Library
- L. Chang, J. X. Yu, and L. Qin. Fast maximal cliques enumeration in sparse graphs. Algorithmica, 66(1):173--186, 2013.Google ScholarCross Ref
- J. Cheng, Y. Ke, A. W.-C. Fu, J. X. Yu, and L. Zhu. Finding maximal cliques in massive networks. ACM Trans. Database Syst., 36(4):21:1--21:34, Dec. 2011. Google ScholarDigital Library
- S. Cohen, B. Kimelfeld, and Y. Sagiv. Generating all maximal induced subgraphs for hereditary and connected-hereditary graph properties. J. Comput. Syst. Sci., 74(7):1147--1159, 2008. Google ScholarDigital Library
- T. Eiter, G. Gottlob, and K. Makino. New results on monotone dualization and generating hypergraph transversals. SIAM J. Comput., 32(2):514--537, 2003. Google ScholarDigital Library
- D. Eppstein, M. Löffler, and D. Strash. Listing all maximal cliques in sparse graphs in near-optimal time. In O. Cheong, K.-Y. Chwa, and K. Park, editors, Algorithms and Computation, volume 6506 of Lecture Notes in Computer Science, pages 403--414. Springer Berlin Heidelberg, 2010.Google ScholarCross Ref
- D. Eppstein and D. Strash. Listing all maximal cliques in large sparse real-world graphs. In P. Pardalos and S. Rebennack, editors, Experimental Algorithms, volume 6630 of Lecture Notes in Computer Science, pages 364--375. Springer Berlin Heidelberg, 2011. Google ScholarDigital Library
- G. Gottlob. Deciding monotone duality and identifying frequent itemsets in quadratic logspace. In Proceedings of the 32nd ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, PODS 2013, New York, NY, USA - June 22 - 27, 2013, pages 25--36, 2013. Google ScholarDigital Library
- E. Harley, A. Bonner, and N. Goodman. Uniform integration of genome mapping data using intersection graphs. Bioinformatics, 17(6):487--494, 2001.Google ScholarCross Ref
- D. S. Johnson, C. H. Papadimitriou, and M. Yannakakis. On generating all maximal independent sets. Inf. Process. Lett., 27(3):119--123, 1988. Google ScholarDigital Library
- B. Kimelfeld and Y. Sagiv. Finding and approximating top-k answers in keyword proximity search. In Proceedings of the Twenty-Fifth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, June 26--28, 2006, Chicago, Illinois, USA, pages 173--182, 2006. Google ScholarDigital Library
- I. Koch. Enumerating all connected maximal common subgraphs in two graphs. Theor. Comput. Sci., 250(1--2):1--30, Jan. 2001. Google ScholarDigital Library
- E. L. Lawler. A procedure for computing the k best solutions to discrete optimization problems and its application to the shortest path problem. Management Science, 18(7):401--405, 1972.Google ScholarDigital Library
- J. M. Lewis and M. Yannakakis. The node-deletion problem for hereditary properties is np-complete. J. Comput. Syst. Sci., 20(2):219--230, 1980.Google ScholarCross Ref
- D. Liben-Nowell and J. M. Kleinberg. The link-prediction problem for social networks. JASIST, 58(7):1019--1031, 2007. Google ScholarDigital Library
- S. Mohseni-Zadeh, P. Brézellec, and J.-L. Risler. Cluster-c, an algorithm for the large-scale clustering of protein sequences based on the extraction of maximal cliques. Computational Biology and Chemistry, 28(3):211 -- 218, 2004. Google ScholarDigital Library
- J. Moon and L. Moser. On cliques in graphs. Israel Journal of Mathematics, 3(1):23--28, 1965.Google ScholarCross Ref
- H. Moser, R. Niedermeier, and M. Sorge. Algorithms and experiments for clique relaxations--finding maximum s-plexes. In J. Vahrenhold, editor, Experimental Algorithms, volume 5526 of Lecture Notes in Computer Science, pages 233--244. Springer Berlin Heidelberg, 2009. Google ScholarDigital Library
- K. Murakami and T. Uno. Efficient algorithms for dualizing large-scale hypergraphs. Discrete Applied Mathematics, 170:83--94, 2014. Google ScholarDigital Library
- K. G. Murty. An algorithm for ranking all the assignments in order of increasing cost. Operations Research, 16(3):682--687, 1968.Google ScholarDigital Library
- G. Palla, I. Derenyi, and T. Vicsek. Uncovering the overlapping community structure of complex networks in nature and society. Nature, 435(7043):814--818, 2005.Google ScholarCross Ref
- J. Pattillo, N. Youssef, and S. Butenko. On clique relaxation models in network analysis. European Journal of Operational Research, 226(1):9--18, 2013.Google ScholarCross Ref
- J. L. Pattillo. Mathematical Foundations and Algorithms for Clique Relaxations in Networks. PhD thesis, College Station, TX, USA, 2011. AAI3500375. Google ScholarDigital Library
- S. Seidman and B. Foster. A graph theoretic generalization of the clique concept. Journal of Mathematical Sociology, 6:139--154, 1978.Google ScholarCross Ref
- SNAP: Stanford network analysis platform. htto://snap.stanford.edu.Google Scholar
- E. Tomita, A. Tanaka, and H. Takahashi. The worst-case time complexity for generating all maximal cliques and computational experiments. Theor. Comput. Sci., 363(1):28--42, Oct. 2006. Google ScholarDigital Library
- S. Wasserman and K. Faust. Social Network Analysis: Methods and Applications. Cambridge University Press, 1994.Google ScholarCross Ref
- B. Wu and X. Pei. A parallel algorithm for enumerating all the maximal phk -plexes. In Emerging Technologies in Knowledge Discovery and Data Mining, PAKDD 2007, International Workshops, Nanjing, China, May 22--25, 2007, Revised Selected Papers, pages 476--483, 2007. Google ScholarDigital Library
- B. Yan and S. Gregory. Detecting communities in networks by merging cliques. In IEEE International Conference on Intelligent Computing and Intelligent Systems, 2009.Google ScholarCross Ref
- M. Yannakakis. Algorithms for acyclic database schemes. In VLDB, pages 82--94, 1981. Google ScholarDigital Library
- J. Y. Yen. Finding the k shortest loopless paths in a network. Management Science, 17:712--716, 1971.Google ScholarDigital Library
- Z. Zou, J. Li, H. Gao, and S. Zhang. Finding top-k maximal cliques in an uncertain graph. In Proceedings of the 26th International Conference on Data Engineering, ICDE 2010, March 1--6, 2010, Long Beach, California, USA, pages 649--652, 2010.Google ScholarCross Ref
Index Terms
- Efficient Enumeration of Maximal k-Plexes
Recommendations
Scaling Up Maximal k-plex Enumeration
CIKM '22: Proceedings of the 31st ACM International Conference on Information & Knowledge ManagementFinding all maximal k-plexes on networks is a fundamental research problem in graph analysis due to many important applications, such as community detection, biological graph analysis, and so on. A k-plex is a subgraph in which every vertex is adjacent ...
Efficient enumeration of non-isomorphic distance-hereditary graphs and related graphs
AbstractWe investigate the enumeration problems for the class of distance-hereditary graphs and related graph classes. We first give an enumeration algorithm for the class of distance-hereditary graphs using the recent framework for enumerating every non-...
On acyclically 4-colorable maximal planar graphs
An acyclic coloring of a graph is a proper coloring of the graph, for which every cycle uses at least three colors. Let G4 be the set of maximal planar graphs of minimum degree 4, such that each graph in G4 contains exactly four odd-vertices and the ...
Comments