Skip to main content
Top

2017 | OriginalPaper | Chapter

K-Clique-Graphs for Dense Subgraph Discovery

Authors : Giannis Nikolentzos, Polykarpos Meladianos, Yannis Stavrakas, Michalis Vazirgiannis

Published in: Machine Learning and Knowledge Discovery in Databases

Publisher: Springer International Publishing

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

Finding dense subgraphs in a graph is a fundamental graph mining task, with applications in several fields. Algorithms for identifying dense subgraphs are used in biology, in finance, in spam detection, etc. Standard formulations of this problem such as the problem of finding the maximum clique of a graph are hard to solve. However, some tractable formulations of the problem have also been proposed, focusing mainly on optimizing some density function, such as the degree density and the triangle density. However, maximization of degree density usually leads to large subgraphs with small density, while maximization of triangle density does not necessarily lead to subgraphs that are close to being cliques.
In this paper, we introduce the k-clique-graph densest subgraph problem, \(k \ge 3\), a novel formulation for the discovery of dense subgraphs. Given an input graph, its k-clique-graph is a new graph created from the input graph where each vertex of the new graph corresponds to a k-clique of the input graph and two vertices are connected with an edge if they share a common \(k - 1\)-clique. We define a simple density function, the k-clique-graph density, which gives compact and at the same time dense subgraphs, and we project its resulting subgraphs back to the input graph. In this paper, we focus on the triangle-graph densest subgraph problem obtained for \(k = 3\). To optimize the proposed function, we provide an exact algorithm. Furthermore, we present an efficient greedy approximation algorithm that scales well to larger graphs.
We evaluate the proposed algorithms on real datasets and compare them with other algorithms in terms of the size and the density of the extracted subgraphs. The results verify the ability of the proposed algorithms in finding high-quality subgraphs in terms of size and density. Finally, we apply the proposed method to the important problem of keyword extraction from textual documents. Code related to this chapter is available at: https://​github.​com/​giannisnik/​k-clique-graphs-dense-subgraphs.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Appendix
Available only for authorised users
Literature
1.
go back to reference Alvarez-Hamelin, J.I., Dall’Asta, L., Barrat, A., Vespignani, A.: Large scale networks fingerprinting and visualization using the k-core decomposition. In: NIPS 2005, pp. 41–50 (2005) Alvarez-Hamelin, J.I., Dall’Asta, L., Barrat, A., Vespignani, A.: Large scale networks fingerprinting and visualization using the k-core decomposition. In: NIPS 2005, pp. 41–50 (2005)
3.
go back to reference Angel, A., Koudas, N., Sarkas, N., Srivastava, D., Svendsen, M., Tirthapura, S.: Dense subgraph maintenance under streaming edge weight updates for real-time story identification. VLDB J. 23(2), 175–199 (2014)CrossRef Angel, A., Koudas, N., Sarkas, N., Srivastava, D., Svendsen, M., Tirthapura, S.: Dense subgraph maintenance under streaming edge weight updates for real-time story identification. VLDB J. 23(2), 175–199 (2014)CrossRef
5.
6.
go back to reference Bader, G.D., Hogue, C.W.: An automated method for finding molecular complexes in large protein interaction networks. BMC Bioinform. 4(1), 1 (2003)CrossRef Bader, G.D., Hogue, C.W.: An automated method for finding molecular complexes in large protein interaction networks. BMC Bioinform. 4(1), 1 (2003)CrossRef
7.
go back to reference Balalau, O.D., Bonchi, F., Chan, T., Gullo, F., Sozio, M.: Finding subgraphs with maximum total density and limited overlap. In: WSDM 2015, pp. 379–388 (2015) Balalau, O.D., Bonchi, F., Chan, T., Gullo, F., Sozio, M.: Finding subgraphs with maximum total density and limited overlap. In: WSDM 2015, pp. 379–388 (2015)
9.
go back to reference Bron, C., Kerbosch, J.: Algorithm 457: finding all cliques of an undirected graph. Commun. ACM 16(9), 575–577 (1973)CrossRefMATH Bron, C., Kerbosch, J.: Algorithm 457: finding all cliques of an undirected graph. Commun. ACM 16(9), 575–577 (1973)CrossRefMATH
10.
go back to reference Buehrer, G., Chellapilla, K.: A scalable pattern mining approach to web graph compression with communities. In: WSDM 2008, pp. 95–106 (2008) Buehrer, G., Chellapilla, K.: A scalable pattern mining approach to web graph compression with communities. In: WSDM 2008, pp. 95–106 (2008)
12.
go back to reference Chen, J., Saad, Y.: Dense subgraph extraction with application to community detection. TKDE 24(7), 1216–1230 (2012) Chen, J., Saad, Y.: Dense subgraph extraction with application to community detection. TKDE 24(7), 1216–1230 (2012)
13.
go back to reference Chiba, N., Nishizeki, T.: Arboricity and subgraph listing algorithms. In: SICOMP 1985, vol. 14, no. 1, pp. 210–223 (1985) Chiba, N., Nishizeki, T.: Arboricity and subgraph listing algorithms. In: SICOMP 1985, vol. 14, no. 1, pp. 210–223 (1985)
14.
go back to reference Du, X., Jin, R., Ding, L., Lee, V.E., Thornton Jr., J.H.: Migration motif: a spatial-temporal pattern mining approach for financial markets. In: KDD 2009, pp. 1135–1144 (2009) Du, X., Jin, R., Ding, L., Lee, V.E., Thornton Jr., J.H.: Migration motif: a spatial-temporal pattern mining approach for financial markets. In: KDD 2009, pp. 1135–1144 (2009)
15.
go back to reference Feige, U.: Approximating maximum clique by removing subgraphs. In: SIDMA 2004, vol. 18, no. 2, pp. 219–225 (2004) Feige, U.: Approximating maximum clique by removing subgraphs. In: SIDMA 2004, vol. 18, no. 2, pp. 219–225 (2004)
17.
go back to reference Fratkin, E., Naughton, B.T., Brutlag, D.L., Batzoglou, S.: MotifCut: regulatory motifs finding with maximum density subgraphs. Bioinformatics 22(14), e150–e157 (2006)CrossRef Fratkin, E., Naughton, B.T., Brutlag, D.L., Batzoglou, S.: MotifCut: regulatory motifs finding with maximum density subgraphs. Bioinformatics 22(14), e150–e157 (2006)CrossRef
18.
go back to reference Gibson, D., Kumar, R., Tomkins, A.: Discovering large dense subgraphs in massive graphs. In: VLDB 2005, pp. 721–732 (2005) Gibson, D., Kumar, R., Tomkins, A.: Discovering large dense subgraphs in massive graphs. In: VLDB 2005, pp. 721–732 (2005)
19.
go back to reference Goldberg, A.V.: Finding a maximum density subgraph. Technical report, University of California Berkeley (1984) Goldberg, A.V.: Finding a maximum density subgraph. Technical report, University of California Berkeley (1984)
20.
go back to reference Håstad, J.: Clique is hard to approximate within \(n^{1-\epsilon }\). In: FOCS 1996, pp. 627–636 (1996) Håstad, J.: Clique is hard to approximate within \(n^{1-\epsilon }\). In: FOCS 1996, pp. 627–636 (1996)
21.
go back to reference Itai, A., Rodeh, M.: Finding a minimum circuit in a graph. In: SICOMP 1978, vol. 7, no. 4, pp. 413–423 (1978) Itai, A., Rodeh, M.: Finding a minimum circuit in a graph. In: SICOMP 1978, vol. 7, no. 4, pp. 413–423 (1978)
24.
go back to reference Lee, V.E., Ruan, N., Jin, R., Aggarwal, C.: A survey of algorithms for dense subgraph discovery. In: Managing and Mining Graph Data, pp. 303–336 (2010) Lee, V.E., Ruan, N., Jin, R., Aggarwal, C.: A survey of algorithms for dense subgraph discovery. In: Managing and Mining Graph Data, pp. 303–336 (2010)
25.
go back to reference Meladianos, P., Nikolentzos, G., Rousseau, F., Stavrakas, Y., Vazirgiannis, M.: Degeneracy-based real-time sub-event detection in Twitter stream. In: ICWSM 2015, pp. 248–257 (2015) Meladianos, P., Nikolentzos, G., Rousseau, F., Stavrakas, Y., Vazirgiannis, M.: Degeneracy-based real-time sub-event detection in Twitter stream. In: ICWSM 2015, pp. 248–257 (2015)
26.
go back to reference Mihalcea, R., Tarau, P.: TextRank: bringing order into texts. In: EMNLP 2004, pp. 404–411 (2004) Mihalcea, R., Tarau, P.: TextRank: bringing order into texts. In: EMNLP 2004, pp. 404–411 (2004)
28.
go back to reference Orlin, J.B.: A faster strongly polynomial time algorithm for submodular function minimization. Math. Program. 118(2), 237–251 (2009)MathSciNetCrossRefMATH Orlin, J.B.: A faster strongly polynomial time algorithm for submodular function minimization. Math. Program. 118(2), 237–251 (2009)MathSciNetCrossRefMATH
31.
32.
go back to reference Sozio, M., Gionis, A.: The community-search problem and how to plan a successful cocktail party. In: KDD 2010, pp. 939–948 (2010) Sozio, M., Gionis, A.: The community-search problem and how to plan a successful cocktail party. In: KDD 2010, pp. 939–948 (2010)
33.
go back to reference Tixier, A.J.P., Malliaros, F.D., Vazirgiannis, M.: A graph degeneracy-based approach to keyword extraction. In: EMNLP 2016 (2016) Tixier, A.J.P., Malliaros, F.D., Vazirgiannis, M.: A graph degeneracy-based approach to keyword extraction. In: EMNLP 2016 (2016)
34.
go back to reference Tsourakakis, C.: The k-clique densest subgraph problem. In: WWW 2015, pp. 1122–1132 (2015) Tsourakakis, C.: The k-clique densest subgraph problem. In: WWW 2015, pp. 1122–1132 (2015)
35.
go back to reference Tsourakakis, C., Bonchi, F., Gionis, A., Gullo, F., Tsiarli, M.: Denser than the densest subgraph: extracting optimal quasi-cliques with quality guarantees. In: KDD 2013, pp. 104–112 (2013) Tsourakakis, C., Bonchi, F., Gionis, A., Gullo, F., Tsiarli, M.: Denser than the densest subgraph: extracting optimal quasi-cliques with quality guarantees. In: KDD 2013, pp. 104–112 (2013)
36.
go back to reference Wang, N., Zhang, J., Tan, K.L., Tung, A.K.: On triangulation-based dense neighborhood graph discovery. VLDB Endow. 4(2), 58–68 (2010)CrossRef Wang, N., Zhang, J., Tan, K.L., Tung, A.K.: On triangulation-based dense neighborhood graph discovery. VLDB Endow. 4(2), 58–68 (2010)CrossRef
Metadata
Title
K-Clique-Graphs for Dense Subgraph Discovery
Authors
Giannis Nikolentzos
Polykarpos Meladianos
Yannis Stavrakas
Michalis Vazirgiannis
Copyright Year
2017
DOI
https://doi.org/10.1007/978-3-319-71249-9_37

Premium Partner