Skip to main content
Erschienen in: Journal of Intelligent Information Systems 1/2013

01.02.2013

NODAR: mining globally distributed substructures from a single labeled graph

verfasst von: Aya Hellal, Lotfi Ben Romdhane

Erschienen in: Journal of Intelligent Information Systems | Ausgabe 1/2013

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Data mining in structured and semi-structured data focuses on frequent data values. However, in graph data mining, the focus is on common specific topologies. Graph mining, although its ubiquity, is a difficult task since it requires subgraph isomorphism which is known to be NP-complete. In order to effectively prune the search space and thereby save computational time, a graph mining algorithm requires that the support measure of a pattern to be no greater than that of its subpatterns. This property of the support measure is referred to in the literature as the down-closure, anti-monotonicity or admissibility. Unfortunately, when mining a single labeled graph, simply counting the occurrences of a graph pattern may not have the down-closure property. For this, most existing approaches mine frequent substructures in a set of labeled graphs (called also the transactional setting) and few efforts have been devoted to mining frequent globally distributed substructures in a single labeled graph. In this paper, we propose a graph mining algorithm, called NODAR(Non-Overlapping embeDding based grAph mineR), for computing common and globally distributed substructures in a single labeled graph. NODAR adopts the Depth-First Search (DFS) strategy and is based on the SMNOES (Size of Maximum Non Overlapping Embedding Set) as support measure. The core idea of NODAR is to automatically extract frequent subpatterns; and thus without frequency computation thanks to the down-closure property of SMNOES. By adopting this strategy in the computation of frequent substructures, NODAR reduces the number of subgraph isomorphism tests needed to compute pattern frequencies. Experimental results on monograph and transactional graph databases; and comparison with well-known probabilistic and exact algorithms; prove the efficacy of NODAR.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literatur
Zurück zum Zitat Borgelt, C., & Berthold, M. R. (2002). Mining molecular fragments: Finding relevant substructures of molecules. In ICDM ’02: Proceedings of the 2002 IEEE international conference on data mining (p. 51). Washington, DC, USA: IEEE Computer Society.CrossRef Borgelt, C., & Berthold, M. R. (2002). Mining molecular fragments: Finding relevant substructures of molecules. In ICDM ’02: Proceedings of the 2002 IEEE international conference on data mining (p. 51). Washington, DC, USA: IEEE Computer Society.CrossRef
Zurück zum Zitat Chittimoori, R. N., Holder, L. B., & Cook, D. J. (1999). Applying the subdue substructure discovery system to the chemical toxicity domain. In Proceedings of the twelfth international Florida artificial intelligence research society conference (pp. 90–94). AAAI Press. Chittimoori, R. N., Holder, L. B., & Cook, D. J. (1999). Applying the subdue substructure discovery system to the chemical toxicity domain. In Proceedings of the twelfth international Florida artificial intelligence research society conference (pp. 90–94). AAAI Press.
Zurück zum Zitat Cook, D. J., & Holder, L. B. (2007). Mining graph data. New York: Wiley.MATH Cook, D. J., & Holder, L. B. (2007). Mining graph data. New York: Wiley.MATH
Zurück zum Zitat Geamsakul, W., Matsuda, T., Yoshida, T., Motoda, H., & Washio, T. (2003). Classifier construction by graph-based induction for graph-structured data. In PAKDD’03: Proceedings of the 7th Pacific-Asia conference on advances in knowledge discovery and data mining (pp. 52–62). Berlin: Springer. Geamsakul, W., Matsuda, T., Yoshida, T., Motoda, H., & Washio, T. (2003). Classifier construction by graph-based induction for graph-structured data. In PAKDD’03: Proceedings of the 7th Pacific-Asia conference on advances in knowledge discovery and data mining (pp. 52–62). Berlin: Springer.
Zurück zum Zitat Ghazizadeh, S., & Chawathe, S. S. (2002). Seus: Structure extraction using summaries. In Discovery science (pp. 71–85). Ghazizadeh, S., & Chawathe, S. S. (2002). Seus: Structure extraction using summaries. In Discovery science (pp. 71–85).
Zurück zum Zitat Gudes, E., Shimony, S. E., & Vanetik, N. (2006). Discovering frequent graph patterns using disjoint paths. IEEE Transactions on Knowledge and Data Engineering, 18, 1441–1456.CrossRef Gudes, E., Shimony, S. E., & Vanetik, N. (2006). Discovering frequent graph patterns using disjoint paths. IEEE Transactions on Knowledge and Data Engineering, 18, 1441–1456.CrossRef
Zurück zum Zitat Huan, J., Wang, W., Prins, J., & Yang, J. (2004). Spin: Mining maximal frequent subgraphs from graph databases. In Proceedings of the tenth ACM SIGKDD international conference on knowledge discovery and data mining, KDD ’04 (pp. 581–586). New York: ACM.CrossRef Huan, J., Wang, W., Prins, J., & Yang, J. (2004). Spin: Mining maximal frequent subgraphs from graph databases. In Proceedings of the tenth ACM SIGKDD international conference on knowledge discovery and data mining, KDD ’04 (pp. 581–586). New York: ACM.CrossRef
Zurück zum Zitat Inokuchi, A., Washio, T., & Motoda, H. (2000). An apriori-based algorithm for mining frequent substructures from graph data. In Proceedings of the 4th European conference on principles of data mining and knowledge discovery, PKDD ’00 (pp. 13–23). London: Springer. Inokuchi, A., Washio, T., & Motoda, H. (2000). An apriori-based algorithm for mining frequent substructures from graph data. In Proceedings of the 4th European conference on principles of data mining and knowledge discovery, PKDD ’00 (pp. 13–23). London: Springer.
Zurück zum Zitat Jiang, X., Xiong, H., Wang, C., & Tan, A.-H. (2009). Mining globally distributed frequent subgraphs in a single labeled graph. Data & Knowledge Engineering, 68(10), 1034–1058.CrossRef Jiang, X., Xiong, H., Wang, C., & Tan, A.-H. (2009). Mining globally distributed frequent subgraphs in a single labeled graph. Data & Knowledge Engineering, 68(10), 1034–1058.CrossRef
Zurück zum Zitat Kuramochi, M., & Karypis, G. (2001). Frequent subgraph discovery. In ICDM ’01: Proceedings of the 2001 IEEE international conference on data mining (pp. 313–320). Washington, DC: IEEE Computer Society.CrossRef Kuramochi, M., & Karypis, G. (2001). Frequent subgraph discovery. In ICDM ’01: Proceedings of the 2001 IEEE international conference on data mining (pp. 313–320). Washington, DC: IEEE Computer Society.CrossRef
Zurück zum Zitat Kuramochi, M., & Karypis, G. (2004). An efficient algorithm for discovering frequent subgraphs. IEEE Transactions on Knowledge and Data Engineering, 16(9), 1038–1051.CrossRef Kuramochi, M., & Karypis, G. (2004). An efficient algorithm for discovering frequent subgraphs. IEEE Transactions on Knowledge and Data Engineering, 16(9), 1038–1051.CrossRef
Zurück zum Zitat Nijssen, S., & Kok, J. N. (2004). A quickstart in frequent structure mining can make a difference. In Proceedings of the tenth ACM SIGKDD international conference on knowledge discovery and data mining, KDD ’04 (pp. 647–652). New York: ACM.CrossRef Nijssen, S., & Kok, J. N. (2004). A quickstart in frequent structure mining can make a difference. In Proceedings of the tenth ACM SIGKDD international conference on knowledge discovery and data mining, KDD ’04 (pp. 647–652). New York: ACM.CrossRef
Zurück zum Zitat Schreiber, F., & Schwöbbermeyer, H. (2005). Frequency concepts and pattern detection for the analysis of motifs in networks. In Transactions on computational systems biology III (pp. 89–104). Berlin: Springer.CrossRef Schreiber, F., & Schwöbbermeyer, H. (2005). Frequency concepts and pattern detection for the analysis of motifs in networks. In Transactions on computational systems biology III (pp. 89–104). Berlin: Springer.CrossRef
Zurück zum Zitat Vanetik, N., Shimony, S. E., & Gudes, E. (2006). Support measures for graph data*. Data Mining and Knowledge Discovery, 13, 243–260.MathSciNetMATHCrossRef Vanetik, N., Shimony, S. E., & Gudes, E. (2006). Support measures for graph data*. Data Mining and Knowledge Discovery, 13, 243–260.MathSciNetMATHCrossRef
Zurück zum Zitat Wörlein, M., Dreweke, E., Meinl, T., & Fischer, I. (2006). Edgar: The embedding-based graph miner. In Proceedings of the international workshop on mining and learning with graphs (MLG 2006 (pp. 221–228). Wörlein, M., Dreweke, E., Meinl, T., & Fischer, I. (2006). Edgar: The embedding-based graph miner. In Proceedings of the international workshop on mining and learning with graphs (MLG 2006 (pp. 221–228).
Zurück zum Zitat Yan, X., & Han, J. (2002). gspan: Graph-based substructure pattern mining. In ICDM ’02: Proceedings of the 2002 IEEE international conference on data mining (p. 721). Washington, DC: IEEE Computer Society. Yan, X., & Han, J. (2002). gspan: Graph-based substructure pattern mining. In ICDM ’02: Proceedings of the 2002 IEEE international conference on data mining (p. 721). Washington, DC: IEEE Computer Society.
Metadaten
Titel
NODAR: mining globally distributed substructures from a single labeled graph
verfasst von
Aya Hellal
Lotfi Ben Romdhane
Publikationsdatum
01.02.2013
Verlag
Springer US
Erschienen in
Journal of Intelligent Information Systems / Ausgabe 1/2013
Print ISSN: 0925-9902
Elektronische ISSN: 1573-7675
DOI
https://doi.org/10.1007/s10844-012-0213-8

Weitere Artikel der Ausgabe 1/2013

Journal of Intelligent Information Systems 1/2013 Zur Ausgabe