skip to main content
10.1145/2723372.2746478acmconferencesArticle/Chapter ViewAbstractPublication PagesmodConference Proceedingsconference-collections
research-article
Results Reproduced / v1.1

Efficient Enumeration of Maximal k-Plexes

Authors Info & Claims
Published:27 May 2015Publication History

Editorial Notes

Computationally Reproducible. The experimental results of this paper were reproduced by a SIGMOD Review Committee and were found to support the central results reported in the paper. Details of the review process are found here:

http://db-reproducibility.seas.harvard.edu/#process

ABSTRACT

The problem of enumerating (i.e., generating) all maximal cliques in a graph has received extensive treatment, due to the plethora of applications in various areas such as data mining, bioinformatics, network analysis and community detection. However, requiring the enumerated subgraphs to be full cliques is too restrictive in common real-life scenarios where "almost cliques" are equally useful. Hence, the notion of a k-plex, a clique relaxation that allows every node to be "missing" k neighbors, has been introduced. But this seemingly minor relaxation casts existing algorithms for clique enumeration inapplicable, for inherent reasons. This paper presents the first provably efficient algorithms, both for enumerating the maximal k-plexes and for enumerating the maximal connected k-plexes. Our algorithms run in polynomial delay for a constant k and incremental FPT delay when k is a parameter. The importance of such algorithms is in the areas mentioned above, as well as in new applications. Extensive experimentation over both real and synthetic datasets shows the efficiency of our algorithms, and their scalability with respect to graph size, density and choice of k, as well as their clear superiority over the state-of-the-art.

References

  1. E. A. Akkoyunlu. The enumeration of maximal cliques of large graphs. SIAM J. Comput., 2(1):1--6, 1973.Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. J. Bailey, T. Manoukian, and K. Ramamohanarao. A fast algorithm for computing hypergraph transversals and its application in mining emerging patterns. In Proceedings of the 3rd IEEE International Conference on Data Mining (ICDM 2003), 19--22 December 2003, Melbourne, Florida, USA, pages 485--488, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. B. Balasundaram, S. Butenko, and I. V. Hicks. Clique relaxations in social network analysis: The maximum k-plex problem. Oper. Res., 59(1):133--142, Jan. 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. H. Bernard, P. D. Killworth, and L. Sailer. Informant accuracy in social network data iv: a comparison of clique-level structure in behavioral and cognitive network data. Social Networks, 2(3):191 -- 218, 19791980.Google ScholarGoogle ScholarCross RefCross Ref
  5. K. Biswas, V. Muthukkumarasamy, and E. Sithirasenan. Maximal clique based clustering scheme for wireless sensor networks. In 2013 IEEE Eighth International Conference on Intelligent Sensors, Sensor Networks and Information Processing, Melbourne, Australia, April 2--5, 2013, pages 237--241, 2013.Google ScholarGoogle ScholarCross RefCross Ref
  6. V. Boginski, S. Butenko, and P. M. Pardalos. Statistical analysis of financial networks. Computational Statistics and Data Analysis, 48(2):431 -- 443, 2005.Google ScholarGoogle ScholarCross RefCross Ref
  7. E. Boros, K. M. Elbassioni, V. Gurvich, and L. Khachiyan. An efficient incremental algorithm for generating all maximal independent sets in hypergraphs of bounded dimension. Parallel Processing Letters, 10(4):253--266, 2000.Google ScholarGoogle ScholarCross RefCross Ref
  8. E. Boros, K. M. Elbassioni, V. Gurvich, and L. Khachiyan. Generating maximal independent sets for hypergraphs with bounded edge-intersections. In LATIN: Theoretical Informatics, 6th Latin American Symposium, Buenos Aires, Argentina, April 5--8, 2004, Proceedings, pages 488--498, 2004.Google ScholarGoogle Scholar
  9. C. Bron and J. Kerbosch. Algorithm 457: Finding all cliques of an undirected graph. Commun. ACM, 16(9):575--577, Sept. 1973. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. L. Chang, J. X. Yu, and L. Qin. Fast maximal cliques enumeration in sparse graphs. Algorithmica, 66(1):173--186, 2013.Google ScholarGoogle ScholarCross RefCross Ref
  11. J. Cheng, Y. Ke, A. W.-C. Fu, J. X. Yu, and L. Zhu. Finding maximal cliques in massive networks. ACM Trans. Database Syst., 36(4):21:1--21:34, Dec. 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. S. Cohen, B. Kimelfeld, and Y. Sagiv. Generating all maximal induced subgraphs for hereditary and connected-hereditary graph properties. J. Comput. Syst. Sci., 74(7):1147--1159, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. T. Eiter, G. Gottlob, and K. Makino. New results on monotone dualization and generating hypergraph transversals. SIAM J. Comput., 32(2):514--537, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. D. Eppstein, M. Löffler, and D. Strash. Listing all maximal cliques in sparse graphs in near-optimal time. In O. Cheong, K.-Y. Chwa, and K. Park, editors, Algorithms and Computation, volume 6506 of Lecture Notes in Computer Science, pages 403--414. Springer Berlin Heidelberg, 2010.Google ScholarGoogle ScholarCross RefCross Ref
  15. D. Eppstein and D. Strash. Listing all maximal cliques in large sparse real-world graphs. In P. Pardalos and S. Rebennack, editors, Experimental Algorithms, volume 6630 of Lecture Notes in Computer Science, pages 364--375. Springer Berlin Heidelberg, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. G. Gottlob. Deciding monotone duality and identifying frequent itemsets in quadratic logspace. In Proceedings of the 32nd ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, PODS 2013, New York, NY, USA - June 22 - 27, 2013, pages 25--36, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. E. Harley, A. Bonner, and N. Goodman. Uniform integration of genome mapping data using intersection graphs. Bioinformatics, 17(6):487--494, 2001.Google ScholarGoogle ScholarCross RefCross Ref
  18. D. S. Johnson, C. H. Papadimitriou, and M. Yannakakis. On generating all maximal independent sets. Inf. Process. Lett., 27(3):119--123, 1988. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. B. Kimelfeld and Y. Sagiv. Finding and approximating top-k answers in keyword proximity search. In Proceedings of the Twenty-Fifth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, June 26--28, 2006, Chicago, Illinois, USA, pages 173--182, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. I. Koch. Enumerating all connected maximal common subgraphs in two graphs. Theor. Comput. Sci., 250(1--2):1--30, Jan. 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. E. L. Lawler. A procedure for computing the k best solutions to discrete optimization problems and its application to the shortest path problem. Management Science, 18(7):401--405, 1972.Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. J. M. Lewis and M. Yannakakis. The node-deletion problem for hereditary properties is np-complete. J. Comput. Syst. Sci., 20(2):219--230, 1980.Google ScholarGoogle ScholarCross RefCross Ref
  23. D. Liben-Nowell and J. M. Kleinberg. The link-prediction problem for social networks. JASIST, 58(7):1019--1031, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. S. Mohseni-Zadeh, P. Brézellec, and J.-L. Risler. Cluster-c, an algorithm for the large-scale clustering of protein sequences based on the extraction of maximal cliques. Computational Biology and Chemistry, 28(3):211 -- 218, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. J. Moon and L. Moser. On cliques in graphs. Israel Journal of Mathematics, 3(1):23--28, 1965.Google ScholarGoogle ScholarCross RefCross Ref
  26. H. Moser, R. Niedermeier, and M. Sorge. Algorithms and experiments for clique relaxations--finding maximum s-plexes. In J. Vahrenhold, editor, Experimental Algorithms, volume 5526 of Lecture Notes in Computer Science, pages 233--244. Springer Berlin Heidelberg, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. K. Murakami and T. Uno. Efficient algorithms for dualizing large-scale hypergraphs. Discrete Applied Mathematics, 170:83--94, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. K. G. Murty. An algorithm for ranking all the assignments in order of increasing cost. Operations Research, 16(3):682--687, 1968.Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. G. Palla, I. Derenyi, and T. Vicsek. Uncovering the overlapping community structure of complex networks in nature and society. Nature, 435(7043):814--818, 2005.Google ScholarGoogle ScholarCross RefCross Ref
  30. J. Pattillo, N. Youssef, and S. Butenko. On clique relaxation models in network analysis. European Journal of Operational Research, 226(1):9--18, 2013.Google ScholarGoogle ScholarCross RefCross Ref
  31. J. L. Pattillo. Mathematical Foundations and Algorithms for Clique Relaxations in Networks. PhD thesis, College Station, TX, USA, 2011. AAI3500375. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. S. Seidman and B. Foster. A graph theoretic generalization of the clique concept. Journal of Mathematical Sociology, 6:139--154, 1978.Google ScholarGoogle ScholarCross RefCross Ref
  33. SNAP: Stanford network analysis platform. htto://snap.stanford.edu.Google ScholarGoogle Scholar
  34. E. Tomita, A. Tanaka, and H. Takahashi. The worst-case time complexity for generating all maximal cliques and computational experiments. Theor. Comput. Sci., 363(1):28--42, Oct. 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. S. Wasserman and K. Faust. Social Network Analysis: Methods and Applications. Cambridge University Press, 1994.Google ScholarGoogle ScholarCross RefCross Ref
  36. B. Wu and X. Pei. A parallel algorithm for enumerating all the maximal phk -plexes. In Emerging Technologies in Knowledge Discovery and Data Mining, PAKDD 2007, International Workshops, Nanjing, China, May 22--25, 2007, Revised Selected Papers, pages 476--483, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. B. Yan and S. Gregory. Detecting communities in networks by merging cliques. In IEEE International Conference on Intelligent Computing and Intelligent Systems, 2009.Google ScholarGoogle ScholarCross RefCross Ref
  38. M. Yannakakis. Algorithms for acyclic database schemes. In VLDB, pages 82--94, 1981. Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. J. Y. Yen. Finding the k shortest loopless paths in a network. Management Science, 17:712--716, 1971.Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. Z. Zou, J. Li, H. Gao, and S. Zhang. Finding top-k maximal cliques in an uncertain graph. In Proceedings of the 26th International Conference on Data Engineering, ICDE 2010, March 1--6, 2010, Long Beach, California, USA, pages 649--652, 2010.Google ScholarGoogle ScholarCross RefCross Ref

Index Terms

  1. Efficient Enumeration of Maximal k-Plexes

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in
    • Published in

      cover image ACM Conferences
      SIGMOD '15: Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data
      May 2015
      2110 pages
      ISBN:9781450327589
      DOI:10.1145/2723372

      Copyright © 2015 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 27 May 2015

      Permissions

      Request permissions about this article.

      Request Permissions

      Check for updates

      Qualifiers

      • research-article

      Acceptance Rates

      SIGMOD '15 Paper Acceptance Rate106of415submissions,26%Overall Acceptance Rate785of4,003submissions,20%

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader