Skip to main content
Erschienen in: Data Mining and Knowledge Discovery 3/2013

01.11.2013

An efficiently computable subgraph pattern support measure: counting independent observations

verfasst von: Yuyi Wang, Jan Ramon, Thomas Fannes

Erschienen in: Data Mining and Knowledge Discovery | Ausgabe 3/2013

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Graph support measures are functions measuring how frequently a given subgraph pattern occurs in a given database graph. An important class of support measures relies on overlap graphs. A major advantage of overlap-graph based approaches is that they combine anti-monotonicity with counting the occurrences of a subgraph pattern which are independent according to certain criteria. However, existing overlap-graph based support measures are expensive to compute. In this paper, we propose a new support measure which is based on a new notion of independence. We show that our measure is the solution to a sparse linear program, which can be computed efficiently using interior point methods. We study the anti-monotonicity and other properties of this new measure, and relate it to the statistical power of a sample of embeddings in a network. We show experimentally that, in contrast to earlier overlap-graph based proposals, our support measure makes it feasible to mine subgraph patterns in large networks.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literatur
Zurück zum Zitat Agrawal R, Imielinski T, Swami A (1993) Mining association rules between sets of items in large databases. In: Proceedings of SIGMOD’93, Washington DC, pp 207–216 Agrawal R, Imielinski T, Swami A (1993) Mining association rules between sets of items in large databases. In: Proceedings of SIGMOD’93, Washington DC, pp 207–216
Zurück zum Zitat Berlingerio M, Bonchi F, Bringmann B, Gionis A (2009) Mining graph evolution rules. In: Proceedings of ECML/PKDD’09, Bled, pp 115–130 Berlingerio M, Bonchi F, Bringmann B, Gionis A (2009) Mining graph evolution rules. In: Proceedings of ECML/PKDD’09, Bled, pp 115–130
Zurück zum Zitat Boyd S, Vandenberghe L (2004) Convex optimization. Cambridge University Press, CambridgeMATH Boyd S, Vandenberghe L (2004) Convex optimization. Cambridge University Press, CambridgeMATH
Zurück zum Zitat Bringmann B, Nijssen S (2008) What is frequent in a single graph? In: Proceedings of PAKDD’08, Osaka, pp 858–863 Bringmann B, Nijssen S (2008) What is frequent in a single graph? In: Proceedings of PAKDD’08, Osaka, pp 858–863
Zurück zum Zitat Calders T, Ramon J, Dyck DV (2011) All normalized anti-monotonic overlap graph measures are bounded. Data Min Knowl Discov 23(3):503–548MathSciNetMATHCrossRef Calders T, Ramon J, Dyck DV (2011) All normalized anti-monotonic overlap graph measures are bounded. Data Min Knowl Discov 23(3):503–548MathSciNetMATHCrossRef
Zurück zum Zitat Chakrabarti D, Faloutsos C (2006) Graph mining: laws, generators, and algorithms. ACM Comput Surv 38(1):1–69CrossRef Chakrabarti D, Faloutsos C (2006) Graph mining: laws, generators, and algorithms. ACM Comput Surv 38(1):1–69CrossRef
Zurück zum Zitat Chan T, Chang KL, Raman R (2009) An SDP primal-dual algorithm for approximating the Lovsz-theta function. In: Proceedings of the IEEE ISIT’09, pp 2808–2812 Chan T, Chang KL, Raman R (2009) An SDP primal-dual algorithm for approximating the Lovsz-theta function. In: Proceedings of the IEEE ISIT’09, pp 2808–2812
Zurück zum Zitat Dreweke A, Wörlein M, Fischer I, Schell D, Meinl Th, Philippsen M (2007) Graph-based procedural abstraction. In: Proceedings of the international symposium on code generation and optimization’07, San Jose, pp 259–270 Dreweke A, Wörlein M, Fischer I, Schell D, Meinl Th, Philippsen M (2007) Graph-based procedural abstraction. In: Proceedings of the international symposium on code generation and optimization’07, San Jose, pp 259–270
Zurück zum Zitat Fiedler M, Borgelt C (2007) Support computation for mining frequent subgraphs in a single graph. In: Proceedings of the workshop on mining and learning with graphs (MLG’07), Firenze Fiedler M, Borgelt C (2007) Support computation for mining frequent subgraphs in a single graph. In: Proceedings of the workshop on mining and learning with graphs (MLG’07), Firenze
Zurück zum Zitat Feige U, Goldwasser S, Lovász L, Safra S, Szegedy M (1991) Approximating clique is almost NP-complete. In: FOCS IEEE Computer Society, pp 2–12 Feige U, Goldwasser S, Lovász L, Safra S, Szegedy M (1991) Approximating clique is almost NP-complete. In: FOCS IEEE Computer Society, pp 2–12
Zurück zum Zitat Garey MR, Johnson DS (1979) Computers and intractibility, a guide to the theory of NP-completeness. W. H. Freeman and Company, New York Garey MR, Johnson DS (1979) Computers and intractibility, a guide to the theory of NP-completeness. W. H. Freeman and Company, New York
Zurück zum Zitat Gjoka M, Kurant M, Butts C, Markopoulou A (2010) Walking in facebook: a case study of unbiased sampling of OSNs. In: Proceedings of IEEE INFOCOM’10, San Diego, pp 1–9 Gjoka M, Kurant M, Butts C, Markopoulou A (2010) Walking in facebook: a case study of unbiased sampling of OSNs. In: Proceedings of IEEE INFOCOM’10, San Diego, pp 1–9
Zurück zum Zitat Kibriya A, Ramon J (2012) Nearly exact mining of frequent trees in large networks. In: Proceedings of ECML-PKDD 2012, Bristol, pp 426–441 Kibriya A, Ramon J (2012) Nearly exact mining of frequent trees in large networks. In: Proceedings of ECML-PKDD 2012, Bristol, pp 426–441
Zurück zum Zitat Klein PN, Lu H (1996) Efficient approximation algorithms for semidefinite programs arising from MAX CUT and COLORING. In: Proceedings of ACM STOC’96, pp 338–347 Klein PN, Lu H (1996) Efficient approximation algorithms for semidefinite programs arising from MAX CUT and COLORING. In: Proceedings of ACM STOC’96, pp 338–347
Zurück zum Zitat Knuth DE (1994) The sandwich theorem. Electron J Comb 1:1–48 Knuth DE (1994) The sandwich theorem. Electron J Comb 1:1–48
Zurück zum Zitat Kuramochi M, Karypis G (2005) Finding frequent subgraph patterns in a large sparse graph. Data Mining Knowl Discov 11(3):243–271MathSciNetCrossRef Kuramochi M, Karypis G (2005) Finding frequent subgraph patterns in a large sparse graph. Data Mining Knowl Discov 11(3):243–271MathSciNetCrossRef
Zurück zum Zitat Lovász L (1979) On the Shannon capacity of a graph. IEEE Trans Inf Theory 25(1):1–7MATHCrossRef Lovász L (1979) On the Shannon capacity of a graph. IEEE Trans Inf Theory 25(1):1–7MATHCrossRef
Zurück zum Zitat Luigi P, Pasquale F, Carlo S, Mario V (2004) A subgraph isomorphism algorithm for matching large graphs. IEEE Trans Pattern Anal Mach Intell 26(10):1367–1372CrossRef Luigi P, Pasquale F, Carlo S, Mario V (2004) A subgraph isomorphism algorithm for matching large graphs. IEEE Trans Pattern Anal Mach Intell 26(10):1367–1372CrossRef
Zurück zum Zitat Schrijver A (1979) A comparison of the Delsarte and Lovász bounds. IEEE Trans Inf Theory 25:425–429 Schrijver A (1979) A comparison of the Delsarte and Lovász bounds. IEEE Trans Inf Theory 25:425–429
Zurück zum Zitat Vanetik N, Gudes E, Shimony SE (2002) Computing frequent graph subgraph patterns from semistructured data. In: Proceeding of the IEEE international conference on data mining (ICDM’02), Maebashi, pp 458–465 Vanetik N, Gudes E, Shimony SE (2002) Computing frequent graph subgraph patterns from semistructured data. In: Proceeding of the IEEE international conference on data mining (ICDM’02), Maebashi, pp 458–465
Zurück zum Zitat Wang Y, Ramon J (2012) An efficiently computable support measure for frequent subgraph pattern mining. In: Proceedings of ECML-PKDD 2012, Bristol, pp 362–377 Wang Y, Ramon J (2012) An efficiently computable support measure for frequent subgraph pattern mining. In: Proceedings of ECML-PKDD 2012, Bristol, pp 362–377
Metadaten
Titel
An efficiently computable subgraph pattern support measure: counting independent observations
verfasst von
Yuyi Wang
Jan Ramon
Thomas Fannes
Publikationsdatum
01.11.2013
Verlag
Springer US
Erschienen in
Data Mining and Knowledge Discovery / Ausgabe 3/2013
Print ISSN: 1384-5810
Elektronische ISSN: 1573-756X
DOI
https://doi.org/10.1007/s10618-013-0318-x

Weitere Artikel der Ausgabe 3/2013

Data Mining and Knowledge Discovery 3/2013 Zur Ausgabe

Premium Partner