research-article

Finding maximal cliques in massive networks

Authors:
James Cheng

Nanyang Technological University, Singapore

Nanyang Technological University, Singapore
View Profile

,
Yiping Ke

The Chinese University of Hong Kong, New Territories, Hong Kong

The Chinese University of Hong Kong, New Territories, Hong Kong
View Profile

,
Ada Wai-Chee Fu

The Chinese University of Hong Kong, New Territories, Hong Kong

The Chinese University of Hong Kong, New Territories, Hong Kong
View Profile

,
Jeffrey Xu Yu

The Chinese University of Hong Kong, New Territories, Hong Kong

The Chinese University of Hong Kong, New Territories, Hong Kong
View Profile

,
Linhong Zhu

Institute for Infocomm Research, Singapore

Institute for Infocomm Research, Singapore
View Profile

Authors Info & Claims

ACM Transactions on Database Systems Volume 36 Issue 4Article No.: 21pp 1–34https://doi.org/10.1145/2043652.2043654

Published:19 December 2011Publication History

ACM Transactions on Database Systems

Abstract

Maximal clique enumeration is a fundamental problem in graph theory and has important applications in many areas such as social network analysis and bioinformatics. The problem is extensively studied; however, the best existing algorithms require memory space linear in the size of the input graph. This has become a serious concern in view of the massive volume of today's fast-growing networks. We propose a general framework for designing external-memory algorithms for maximal clique enumeration in large graphs. The general framework enables maximal clique enumeration to be processed recursively in small subgraphs of the input graph, thus allowing in-memory computation of maximal cliques without the costly random disk access. We prove that the set of cliques obtained by the recursive local computation is both correct (i.e., globally maximal) and complete. The subgraph to be processed each time is defined based on a set of base vertices that can be flexibly chosen to achieve different purposes. We discuss the selection of the base vertices to fully utilize the available memory in order to minimize I/O cost in static graphs, and for update maintenance in dynamic graphs. We also apply our framework to design an external-memory algorithm for maximum clique computation in a large graph.

References

Abu-Khzam, F. N., Baldwin, N. E., Langston, M. A., and Samatova, N. F. 2005. On the relative efficiency of maximal clique enumeration algorithms, with applications to high-throughput computational biology. In Proceedings of the International Conference on Research Trends in Science and Technology.Google Scholar
Akkoyunlu, E. A. 1973. The enumeration of maximal cliques of large graphs. SIAM J. Comput. 2, 1, 1--6.Google ScholarDigital Library
Bernard, H. R., Killworth, P. D., and Sailer, L. 1979. Informant accuracy in social network data iv: a comparison of clique-level structure in behavioral and cognitive network data. Social Netw. 2, 3, 191--218.Google ScholarCross Ref
Berry, N. M., Ko, T. H., Moy, T., Smrcka, J., Turnley, J., and Wu, B. 2004. Emergent clique formation in terrorist recruitment. In Proceedings of the AAAI-04 Workshop on Agent Organizations: Theory and Practice.Google Scholar
Boginski, V., Butenko, S., and Pardalos, P. M. 2005. Statistical analysis of financial networks. Comput. Statist. Data Anal. 48, 2, 431--443.Google ScholarCross Ref
Bomze, I. M., Budinich, M., Pardalos, P. M., and Pelillo, M. 1999. The maximum clique problem. In Handbook of Combinatorial Optimization. Kluwer Academic Publishers, 1--74.Google Scholar
Bron, C. and Kerbosch, J. 1973. Algorithm 457: finding all cliques of an undirected graph. Comm. ACM 16, 9, 575--577. Google ScholarDigital Library
Byskov, J. M. 2003. Algorithms for k-colouring and finding maximal independent sets. In Proceedings of the Symposium on Discrete Algorithms (SODA). 456--457. Google ScholarDigital Library
Cazals, F. and Karande, C. 2008. A note on the problem of reporting maximal cliques. Theor. Comput. Sci. 407, 1-3, 564--568. Google ScholarDigital Library
Cheng, J., Ke, Y., Fu, A. W.-C., Yu, J. X., and Zhu, L. 2010. Finding maximal cliques in massive networks by h*-graph. In Proceedings of the SIGMOD International Conference on Management of Data. 447--458. Google ScholarDigital Library
Creamer, G., Rowe, R., Hershkop, S., and Stolfo, S. J. 2007. Segmentation and automated social hierarchy detection through email network analysis. In Proceedings of WebKDD/SNA-KDD. 40--58. Google ScholarDigital Library
Dorogovtsev, S. N. and Mendesand, J. F. F. 2003. Evolution of Networks: From Biological Nets to the Internet and www. Oxford University Press. Google ScholarDigital Library
Du, N., Wu, B., Xu, L., Wang, B., and Xin, P. 2009. Parallel algorithm for enumerating maximal cliques in complex network. In Mining Complex Data, 207--221.Google Scholar
Faloutsos, M., Faloutsos, P., and Faloutsos, C. 1999. On power-law relationships of the internet topology. In Proceedings of the ACM SIGCOMM Conference on Applications, Technologies, Architectures and Protocols for Computer Communications (SIGCOMM). 251--262. Google ScholarDigital Library
Faust, K. and Wasserman, S. 1995. Social Network Analysis: Methods and Applications. Cambridge University Press.Google Scholar
Gouda, K. and Zaki, M. J. 2001. Efficiently mining maximal frequent itemsets. In Proceedings of the International Conference on Data Mining (ICDM). 163--170. Google ScholarDigital Library
Kilby, P., Slaney, J. K., Thiébaux, S., and Walsh, T. 2006. Estimating search tree size. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI). Google ScholarDigital Library
Knuth, D. E. 1975. Estimating the efficiency of backtrack programs. Math. Comput. 29, 129, 121--136.Google Scholar
Koch, I. 2001. Enumerating all connected maximal common subgraphs in two graphs. Theor. Comput. Sci. 250, 1-2, 1--30. Google ScholarDigital Library
Kose, F., Weckwerth, W., Linke, T., and Fiehn, O. 2001. Visualizing plant metabolomic correlation networks using clique-metabolite matrices. Bioinf. 17, 12, 1198--1208.Google ScholarCross Ref
Makino, K. and Uno, T. 2004. New algorithms for enumerating all maximal cliques. In Proceedings of the Scandinavian Workshop on Algorithms Theory (SWAT). 260--272.Google Scholar
Modani, N. and Dey, K. 2008. Large maximal cliques enumeration in sparse graphs. In Proceedings of the International Conference on Information and Knowledge Management (CIKM). 1377--1378. Google ScholarDigital Library
Mohseni-Zadeh, S., Brézellec, P., and Risler, J.-L. 2004. Cluster-c, an algorithm for the large-scale clustering of protein sequences based on the extraction of maximal cliques. Comput. Biol. Chemist. 28, 3, 211--218. Google ScholarDigital Library
Newman, M. E. J. 2003. The structure and function of complex networks. SIAM Rev. 45, 167--256.Google ScholarDigital Library
Schmidt, M. C., Samatova, N. F., Thomas, K., and Park, B.-H. 2009. A scalable, parallel algorithm for maximal clique enumeration. J. Parall. Distrib. Comput. 69, 4, 417--428. Google ScholarDigital Library
Stix, V. 2004. Finding all maximal cliques in dynamic graphs. Comput. Optimiz. Appl. 27, 173--186. Google ScholarDigital Library
Tomita, E. and Seki, T. 2003. An efficient branch-and-bound algorithm for finding a maximum clique. Discr. Math. Theoret. Comput. Sci. 2731, 278--289. Google ScholarDigital Library
Tomita, E., Tanaka, A., and Takahashi, H. 2006. The worst-case time complexity for generating all maximal cliques and computational experiments. Theor. Comput. Sci. 363, 1, 28--42. Google ScholarDigital Library
Tsukiyama, S., Ide, M., Ariyoshi, H., and Shirakawa, I. 1977. A new algorithm for generating all the maximal independent sets. SIAM J. Comput. 6, 3, 505--517.Google ScholarDigital Library
Wan, L., Wu, B., Du, N., Ye, Q., and Chen, P. 2006. A new algorithm for enumerating all maximal cliques in complex network. In Proceedings of the International Conference on Advanced Data Mining and Applications (ADMA). 606--617. Google ScholarDigital Library
Zhang, B., Park, B.-H., Karpinets, T. V., and Samatova, N. F. 2008. From pull-down data to protein interaction networks and complexes with biological relevance. Bioinf. 24, 7, 979--986. Google ScholarDigital Library

Index Terms

Finding maximal cliques in massive networks
1. Mathematics of computing
  1. Discrete mathematics
    1. Graph theory
      1. Graph algorithms

Recommendations

Finding maximal cliques in massive networks by H*-graph
SIGMOD '10: Proceedings of the 2010 ACM SIGMOD International Conference on Management of data

Maximal clique enumeration (MCE) is a fundamental problem in graph theory and has important applications in many areas such as social network analysis and bioinformatics. The problem is extensively studied; however, the best existing algorithms require ...
Read More
Mining maximal cliques from a large graph using MapReduce

We consider Maximal Clique Enumeration (MCE) from a large graph. A maximal clique is perhaps the most fundamental dense substructure in a graph, and MCE is an important tool to discover densely connected subgraphs, with numerous applications to data ...
Read More
Coloring the Maximal Cliques of Graphs

In this paper we are concerned with the so-called clique-colorations of a graph, that is, colorations of the vertices so that no maximal clique is monochromatic. On one hand, it is known to be NP-complete to decide whether a perfect graph is 2-clique-...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

Published in

ACM Transactions on Database Systems Volume 36, Issue 4
December 2011
271 pages
ISSN:0362-5915
EISSN:1557-4644
DOI:10.1145/2043652
Issue’s Table of Contents

Copyright © 2011 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 19 December 2011
- Accepted: 1 April 2011
- Revised: 1 February 2011
- Received: 1 October 2010
Published in tods Volume 36, Issue 4

Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
H*-graph
Maximal clique enumeration
dynamic graphs
h-index
massive networks
scale-free networks
Qualifiers
- research-article
- Research
- Refereed
Conference
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 111
  Total Citations
  View Citations
- 1,022
  Total Downloads
- Downloads (Last 12 months)26
- Downloads (Last 6 weeks)2
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Finding maximal cliques in massive networks

ACM Transactions on Database Systems

Abstract

References

Cited By

Index Terms

Recommendations

Finding maximal cliques in massive networks by H*-graph

Mining maximal cliques from a large graph using MapReduce

Coloring the Maximal Cliques of Graphs

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Funding Sources

Other Metrics

Article Metrics

Other Metrics

Cited By

PDF Format

eReader

Digital Edition

Caption

Finding maximal cliques in massive networks

ACM Transactions on Database Systems

Abstract

References

Cited By

Index Terms

Recommendations

Finding maximal cliques in massive networks by H*-graph

Mining maximal cliques from a large graph using MapReduce

Coloring the Maximal Cliques of Graphs

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Funding Sources

Article Metrics

Other Metrics

PDF Format

eReader

Digital Edition

Share this Publication link

Share on Social Media