Skip to main content
Erschienen in: The Journal of Supercomputing 8/2019

12.02.2019

Listing all maximal cliques in large graphs on vertex-centric model

verfasst von: Assia Brighen, Hachem Slimani, Abdelmounaam Rezgui, Hamamache Kheddouci

Erschienen in: The Journal of Supercomputing | Ausgabe 8/2019

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Maximal Clique Enumeration (MCE), which consists to enumerate all maximal complete subgraphs in a given graph, is a fundamental problem in graph theory, and it is used in several applications. It is one of Karp’s 21 NP-complete problems. In the literature, this problem has been widely studied. One of the most notable, efficient, successful and extensively used solutions is the Bron–Kerbosch (BK) algorithm. The latter is a sequential algorithm which is able to enumerate all maximal cliques in an undirected graph without duplication. Furthermore, it is used to solve large problems such as maximum clique problem, community detection and graph clustering. However, for large graphs, sequential algorithms are slow and do not scale well. Thus, processing efficiently this kind of graphs needs to develop distributed algorithms under parallel and distributed platforms or large graph mining frameworks. In this setting, we propose new efficient distributed algorithms for maximal clique enumerating based on the vertex-centric model. These algorithms use the BK algorithm principle to deal with the MCE problem on large cluster. The proposed algorithms are implemented in Giraph and evaluated by using real-world graphs and computer-generated benchmark networks. Our experiments on a Hadoop cluster show that the proposed algorithms can effectively process a variety of large real-world and computer-generated graphs and scale well with increasing the dataset size and the number of nodes in the cluster. Furthermore, the proposed algorithms are provably work-efficient compared with other algorithms including BK algorithm.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Literatur
2.
Zurück zum Zitat Avery C, Kunz C (2011) Giraph: large-scale graph processing infrastructure on Hadoop. In: Proceedings of the 2011 Hadoop Summit, Santa Clara Avery C, Kunz C (2011) Giraph: large-scale graph processing infrastructure on Hadoop. In: Proceedings of the 2011 Hadoop Summit, Santa Clara
5.
Zurück zum Zitat Chen Q, Fang Ch, Wang Z, Suo B, Li Z, Ives ZG (2016) Parallelizing maximal clique enumeration over graph data. In: DASFAA’2016 Proceedings, Part II, of the 21st International Conference on Database Systems for Advanced Applications, vol 9643, pp 249–264. https://doi.org/10.1007/978-3-319-32049-6_16 Chen Q, Fang Ch, Wang Z, Suo B, Li Z, Ives ZG (2016) Parallelizing maximal clique enumeration over graph data. In: DASFAA’2016 Proceedings, Part II, of the 21st International Conference on Database Systems for Advanced Applications, vol 9643, pp 249–264. https://​doi.​org/​10.​1007/​978-3-319-32049-6_​16
6.
Zurück zum Zitat Cheng J, Zhu L, Ke Y, Chu S (2012) Fast algorithms for maximal clique enumeration with limited memory. In: KDD’12 Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp 1240-1248. https://doi.org/10.1145/2339530.2339724 Cheng J, Zhu L, Ke Y, Chu S (2012) Fast algorithms for maximal clique enumeration with limited memory. In: KDD’12 Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp 1240-1248. https://​doi.​org/​10.​1145/​2339530.​2339724
8.
Zurück zum Zitat Ching A, Edunov S, Kabiljo M, Logothetis D, Muthukrishnan S (2015) One trillion edges: graph processing at facebook-scale. In: Proceedings of the 41st International Conference on Very Large Data Bases, Kohala Coast, Hawaii vol 8(12), pp 1804–1815. https://doi.org/10.14778/2824032.2824077 Ching A, Edunov S, Kabiljo M, Logothetis D, Muthukrishnan S (2015) One trillion edges: graph processing at facebook-scale. In: Proceedings of the 41st International Conference on Very Large Data Bases, Kohala Coast, Hawaii vol 8(12), pp 1804–1815. https://​doi.​org/​10.​14778/​2824032.​2824077
9.
Zurück zum Zitat Conte A, Virgilio RD, Maccioni A, Patrignani M, Torlone R (2016) Finding all maximal cliques in very large social networks. In: Proceedings of the 19th International Conference on Extending Database Technology, EDBT 2016, Bordeaux, France, pp 173–184. https://doi.org/10.5441/002/edbt.2016.18 Conte A, Virgilio RD, Maccioni A, Patrignani M, Torlone R (2016) Finding all maximal cliques in very large social networks. In: Proceedings of the 19th International Conference on Extending Database Technology, EDBT 2016, Bordeaux, France, pp 173–184. https://​doi.​org/​10.​5441/​002/​edbt.​2016.​18
12.
Zurück zum Zitat Dean J, Ghemawat S (2004) MapReduce: simplified data processing on large clusters. In: OSDI’04, The 6th Symposium on Operating System Design and Implementation, vol 6, California, USA. pp 137–150 Dean J, Ghemawat S (2004) MapReduce: simplified data processing on large clusters. In: OSDI’04, The 6th Symposium on Operating System Design and Implementation, vol 6, California, USA. pp 137–150
13.
Zurück zum Zitat Doekemeijer N, Varbanescu AL (2014) A Survey of parallel graph processing frameworks. Technical report, Delft University of Technology, Report number PDS-2014-003 Doekemeijer N, Varbanescu AL (2014) A Survey of parallel graph processing frameworks. Technical report, Delft University of Technology, Report number PDS-2014-003
14.
Zurück zum Zitat Du N, Bin W, Liutong X, Bai W, Xin P (2006) A parallel algorithm for enumerating all maximal cliques in complex network. In: Proceedings of the Sixth IEEE International Conference on Data Mining—Workshops, Hong Kong, China, pp 320–324. https://doi.org/10.1109/ICDMW.2006.17 Du N, Bin W, Liutong X, Bai W, Xin P (2006) A parallel algorithm for enumerating all maximal cliques in complex network. In: Proceedings of the Sixth IEEE International Conference on Data Mining—Workshops, Hong Kong, China, pp 320–324. https://​doi.​org/​10.​1109/​ICDMW.​2006.​17
15.
Zurück zum Zitat Elshawi R, Batarfi O, Fayoumi A, Barnawi A, Sakr S (2015) Big graph processing systems: state-of-the-art and open challenges. In: Big Data Computing Service and Applications (BigDataService), 2015 IEEE First International Conference on Big Data Computing Service and Applications, pp 24–33. https://doi.org/10.1109/BigDataService.2015.11 Elshawi R, Batarfi O, Fayoumi A, Barnawi A, Sakr S (2015) Big graph processing systems: state-of-the-art and open challenges. In: Big Data Computing Service and Applications (BigDataService), 2015 IEEE First International Conference on Big Data Computing Service and Applications, pp 24–33. https://​doi.​org/​10.​1109/​BigDataService.​2015.​11
17.
Zurück zum Zitat Eppstein D, Loffler M, Strash D (2010) Listing all maximal cliques in sparse graphs in near-optimal time. In: Cheong O, Chwa KY, Park K (eds) Algorithms and Computation. ISAAC 2010. Lecture Notes in Computer Science, vol 6506, Springer, Berlin, Heidelberg. pp 403–414. https://doi.org/10.1007/978-3-642-17517-6_36 Eppstein D, Loffler M, Strash D (2010) Listing all maximal cliques in sparse graphs in near-optimal time. In: Cheong O, Chwa KY, Park K (eds) Algorithms and Computation. ISAAC 2010. Lecture Notes in Computer Science, vol 6506, Springer, Berlin, Heidelberg. pp 403–414. https://​doi.​org/​10.​1007/​978-3-642-17517-6_​36
20.
Zurück zum Zitat Golumbic MC (1980) Algorithmic graph theory and perfect graphs. Academic Press, New YorkMATH Golumbic MC (1980) Algorithmic graph theory and perfect graphs. Academic Press, New YorkMATH
21.
Zurück zum Zitat Gonzalez JE, Low Y, Gu H, Bickson D, Guestrin C (2012) PowerGraph: distributed graph-parallel computation on natural graphs. In: OSDI’12 Proceedings of the 10th USENIX Conference on Operating Systems Design and Implementation, CA, USA, pp 17–30 Gonzalez JE, Low Y, Gu H, Bickson D, Guestrin C (2012) PowerGraph: distributed graph-parallel computation on natural graphs. In: OSDI’12 Proceedings of the 10th USENIX Conference on Operating Systems Design and Implementation, CA, USA, pp 17–30
22.
Zurück zum Zitat Guo Y, Biczak M, Varbanescu AL, Iosup A, Martella C, Willke TL (2014) How well do graph-processing platforms perform? An empirical performance evaluation and analysis. In: 2014 IEEE 28th International Parallel and Distributed Processing Symposium, pp 395–404. https://doi.org/10.1109/IPDPS.2014.49 Guo Y, Biczak M, Varbanescu AL, Iosup A, Martella C, Willke TL (2014) How well do graph-processing platforms perform? An empirical performance evaluation and analysis. In: 2014 IEEE 28th International Parallel and Distributed Processing Symposium, pp 395–404. https://​doi.​org/​10.​1109/​IPDPS.​2014.​49
23.
26.
33.
Zurück zum Zitat Karp RM (1972) Reducibility among combinatorial problems. In: Miller RE, Thatcher JW (eds) Complexity of computer computations. Plenum Press, New York, pp 85–104CrossRef Karp RM (1972) Reducibility among combinatorial problems. In: Miller RE, Thatcher JW (eds) Complexity of computer computations. Plenum Press, New York, pp 85–104CrossRef
34.
Zurück zum Zitat Koichi S, Arisaka M, Koshino H, Aoki A, Iwata S, Uno T, Satoh H (2014) Chemical structure elucidation from 13C NMR chemical shifts: efficient data processing using bipartite matching and maximal clique algorithms. J Chem Inf Model 54(4):1027–1035. https://doi.org/10.1021/ci400601c CrossRef Koichi S, Arisaka M, Koshino H, Aoki A, Iwata S, Uno T, Satoh H (2014) Chemical structure elucidation from 13C NMR chemical shifts: efficient data processing using bipartite matching and maximal clique algorithms. J Chem Inf Model 54(4):1027–1035. https://​doi.​org/​10.​1021/​ci400601c CrossRef
35.
Zurück zum Zitat Lancichinetti A, Fortunato S, Radicchi F (2008) Benchmark graphs for testing community detection algorithms. Phys Rev E 78(4):046110CrossRef Lancichinetti A, Fortunato S, Radicchi F (2008) Benchmark graphs for testing community detection algorithms. Phys Rev E 78(4):046110CrossRef
38.
Zurück zum Zitat Lu L, Gu Y, Grossman R (2010) dMaximalCliques: a distributed algorithm for enumerating all maximal cliques and maximal clique distribution. In: ICDMW’10 Proceedings of the 2010 IEEE International Conference on Data Mining Workshops, pp 1320–1327. https://doi.org/10.1109/ICDMW.2010.13 Lu L, Gu Y, Grossman R (2010) dMaximalCliques: a distributed algorithm for enumerating all maximal cliques and maximal clique distribution. In: ICDMW’10 Proceedings of the 2010 IEEE International Conference on Data Mining Workshops, pp 1320–1327. https://​doi.​org/​10.​1109/​ICDMW.​2010.​13
40.
Zurück zum Zitat Malewicz G, Austern MH, Bik AJC, Dehnert JC, Horn I, Leiser N, Czajkowski G (2010) Pregel: a system for large-scale graph processing. In: SIGMOD’10 Proceedings of the 2010 ACM SIGMOD International Conference on Management of data, pp 135–146, Indiana, USA. https://doi.org/10.1145/1807167.1807184 Malewicz G, Austern MH, Bik AJC, Dehnert JC, Horn I, Leiser N, Czajkowski G (2010) Pregel: a system for large-scale graph processing. In: SIGMOD’10 Proceedings of the 2010 ACM SIGMOD International Conference on Management of data, pp 135–146, Indiana, USA. https://​doi.​org/​10.​1145/​1807167.​1807184
41.
Zurück zum Zitat Martella C, Shaposhnik R, Logothetis D (2015) Practical graph analytics with apache giraph. Apress, BerkelyCrossRef Martella C, Shaposhnik R, Logothetis D (2015) Practical graph analytics with apache giraph. Apress, BerkelyCrossRef
46.
Zurück zum Zitat Sakr S (2013) Processing large-scale graph data: A guide to current technology. IBM Developerworks Sakr S (2013) Processing large-scale graph data: A guide to current technology. IBM Developerworks
50.
Zurück zum Zitat Shrawak P, Kagzi T, Singh AP, Dobariya B, Lokhande P, Alhat BR (2017) Robotic algorithm development. IJCSIT 8(1):116–119 Shrawak P, Kagzi T, Singh AP, Dobariya B, Lokhande P, Alhat BR (2017) Robotic algorithm development. IJCSIT 8(1):116–119
58.
59.
Zurück zum Zitat Xin RS, Crankshaw D, Dave A, Gonzalez JE, Franklin MJ, Stoica I (2014) GraphX: Unifying data-parallel and graph-parallel analytics. In arXiv preprint arXiv:1402.2394 Xin RS, Crankshaw D, Dave A, Gonzalez JE, Franklin MJ, Stoica I (2014) GraphX: Unifying data-parallel and graph-parallel analytics. In arXiv preprint arXiv:​1402.​2394
60.
Zurück zum Zitat Xin RS, Gonzalez JE, Franklin MJ, Stoica I (2013) GraphX: a resilient distributed graph system on spark. In: GRADES’13 First International Workshop on Graph Data Management Experiences and Systems Article No. 2, New York, USA. https://doi.org/10.1145/2484425.2484427 Xin RS, Gonzalez JE, Franklin MJ, Stoica I (2013) GraphX: a resilient distributed graph system on spark. In: GRADES’13 First International Workshop on Graph Data Management Experiences and Systems Article No. 2, New York, USA. https://​doi.​org/​10.​1145/​2484425.​2484427
63.
Zurück zum Zitat Yuan P, Zhang W, Xie C, Jin H, Liu L, Lee K (2014) Fast iterative graph computation: a path centric approach. In: SC’14 Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, pp 401–412. https://doi.org/10.1109/SC.2014.38 Yuan P, Zhang W, Xie C, Jin H, Liu L, Lee K (2014) Fast iterative graph computation: a path centric approach. In: SC’14 Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, pp 401–412. https://​doi.​org/​10.​1109/​SC.​2014.​38
Metadaten
Titel
Listing all maximal cliques in large graphs on vertex-centric model
verfasst von
Assia Brighen
Hachem Slimani
Abdelmounaam Rezgui
Hamamache Kheddouci
Publikationsdatum
12.02.2019
Verlag
Springer US
Erschienen in
The Journal of Supercomputing / Ausgabe 8/2019
Print ISSN: 0920-8542
Elektronische ISSN: 1573-0484
DOI
https://doi.org/10.1007/s11227-019-02770-4

Weitere Artikel der Ausgabe 8/2019

The Journal of Supercomputing 8/2019 Zur Ausgabe