Skip to main content

2015 | OriginalPaper | Buchkapitel

A Graph-Based Concept Discovery Method for n-Ary Relations

verfasst von : Nazmiye Ceren Abay, Alev Mutlu, Pinar Karagoz

Erschienen in: Big Data Analytics and Knowledge Discovery

Verlag: Springer International Publishing

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Concept discovery is a multi-relational data mining task for inducing definitions of a specific relation in terms of other relations in the data set. Such learning tasks usually have to deal with large search spaces and hence have efficiency and scalability issues. In this paper, we present a hybrid approach that combines association rule mining methods and graph-based approaches to cope with these issues. The proposed method inputs the data in relational format, converts it into a graph representation, and traverses the graph to find the concept descriptors. Graph traversal and pruning are guided based on association rule mining techniques. The proposed method distinguishes from the state-of-the art methods as it can work on n-ary relations, it uses path finding queries to extract concepts and can handle numeric values. Experimental results show that the method is superior to the state-of-the art methods in terms of accuracy and the coverage of the induced concept descriptors and the running time.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Fußnoten
2
Although the running time of the proposed method and that of CRIS are not directly comparable, as the experiments are conducted on different computers, we provide them to provide intuition. In addition, these results are obtained from a recent study hence the configurations of computers are expected to be similar.
 
Literatur
1.
Zurück zum Zitat Džeroski, S., De Raedt, L.: Multi-relational data mining: the current frontiers. SIGKDD Explor. Newsl. 5(1), 100–101 (2003)CrossRef Džeroski, S., De Raedt, L.: Multi-relational data mining: the current frontiers. SIGKDD Explor. Newsl. 5(1), 100–101 (2003)CrossRef
2.
Zurück zum Zitat Doncescu, A., Waissman, J., Richard, G., Roux, G.: Characterization of bio-chemical signals by inductive logic programming. Knowl.-Based Syst. 15(1–2), 129–137 (2002)CrossRef Doncescu, A., Waissman, J., Richard, G., Roux, G.: Characterization of bio-chemical signals by inductive logic programming. Knowl.-Based Syst. 15(1–2), 129–137 (2002)CrossRef
3.
Zurück zum Zitat Turcotte, M., Muggleton, S., Sternberg, M.J.E.: Generating protein three-dimensional fold signatures using inductive logic programming. Comput. and Chem. 26(1), 57–64 (2002)CrossRef Turcotte, M., Muggleton, S., Sternberg, M.J.E.: Generating protein three-dimensional fold signatures using inductive logic programming. Comput. and Chem. 26(1), 57–64 (2002)CrossRef
4.
Zurück zum Zitat Dzeroski, S., Jacobs, N., Molina, M., Moure, C., Muggleton, S., Laer, W.V.: Detecting traffic problems with ILP. In: Page, D.L. (ed.) ILP 1998. LNCS, vol. 1446, pp. 281–290. Springer, Heidelberg (1998) Dzeroski, S., Jacobs, N., Molina, M., Moure, C., Muggleton, S., Laer, W.V.: Detecting traffic problems with ILP. In: Page, D.L. (ed.) ILP 1998. LNCS, vol. 1446, pp. 281–290. Springer, Heidelberg (1998)
5.
Zurück zum Zitat Muggleton, S., De Raedt, L.: Inductive logic programming: theory and methods. J. Logic Program. 19, 629–679 (1994)CrossRef Muggleton, S., De Raedt, L.: Inductive logic programming: theory and methods. J. Logic Program. 19, 629–679 (1994)CrossRef
6.
Zurück zum Zitat Duboc, A.L., Paes, A., Zaverucha, G.: Using the bottom clause and mode declarations in FOL theory revision from examples. Mach. Learn. 76(1), 73–107 (2009)CrossRef Duboc, A.L., Paes, A., Zaverucha, G.: Using the bottom clause and mode declarations in FOL theory revision from examples. Mach. Learn. 76(1), 73–107 (2009)CrossRef
7.
Zurück zum Zitat Cook, D.J., Holder, L.B.: Substructure discovery using minimum description length and background knowledge. J. Artif. Intell. Res. (JAIR) 1, 231–255 (1994) Cook, D.J., Holder, L.B.: Substructure discovery using minimum description length and background knowledge. J. Artif. Intell. Res. (JAIR) 1, 231–255 (1994)
8.
Zurück zum Zitat Ball, T., Larus, J.R.: Efficient path profiling. In: Melvin, S.W., Beaty, S. (eds.) Proceedings of the 29th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 29, Paris, France, 2–4 December 1996, pp. 46–57. ACM/IEEE Computer Society (1996) Ball, T., Larus, J.R.: Efficient path profiling. In: Melvin, S.W., Beaty, S. (eds.) Proceedings of the 29th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 29, Paris, France, 2–4 December 1996, pp. 46–57. ACM/IEEE Computer Society (1996)
9.
Zurück zum Zitat Lavrac, N., Dzeroski, S.: Inductive Logic Programming: Techniques and Applications, vol. 10001. Routledge, New York (1993) Lavrac, N., Dzeroski, S.: Inductive Logic Programming: Techniques and Applications, vol. 10001. Routledge, New York (1993)
10.
Zurück zum Zitat Gao, Z., Zhang, Z., Huang, Z.: Learning relations by path finding and simultaneous covering. In: 2009 WRI World Congress on Computer Science and Information Engineering, vol. 5, pp. 539–543. IEEE (2009) Gao, Z., Zhang, Z., Huang, Z.: Learning relations by path finding and simultaneous covering. In: 2009 WRI World Congress on Computer Science and Information Engineering, vol. 5, pp. 539–543. IEEE (2009)
11.
Zurück zum Zitat Ong, I.M., de Castro Dutra, I., Page, D.L., Santos Costa, V.: Mode directed path finding. In: Gama, J., Camacho, R., Brazdil, P.B., Jorge, A.M., Torgo, L. (eds.) ECML 2005. LNCS (LNAI), vol. 3720, pp. 673–681. Springer, Heidelberg (2005) CrossRef Ong, I.M., de Castro Dutra, I., Page, D.L., Santos Costa, V.: Mode directed path finding. In: Gama, J., Camacho, R., Brazdil, P.B., Jorge, A.M., Torgo, L. (eds.) ECML 2005. LNCS (LNAI), vol. 3720, pp. 673–681. Springer, Heidelberg (2005) CrossRef
12.
Zurück zum Zitat Mutlu, A., Karagoz, P.: A hybrid graph-based method for concept rule discovery. In: Bellatreche, L., Mohania, M.K. (eds.) DaWaK 2013. LNCS, vol. 8057, pp. 327–338. Springer, Heidelberg (2013) CrossRef Mutlu, A., Karagoz, P.: A hybrid graph-based method for concept rule discovery. In: Bellatreche, L., Mohania, M.K. (eds.) DaWaK 2013. LNCS, vol. 8057, pp. 327–338. Springer, Heidelberg (2013) CrossRef
13.
Zurück zum Zitat Holder, L.B., Cook, D.J., Djoko, S., et al.: Substucture discovery in the subdue system. In: KDD workshop, pp.169–180 (1994) Holder, L.B., Cook, D.J., Djoko, S., et al.: Substucture discovery in the subdue system. In: KDD workshop, pp.169–180 (1994)
14.
Zurück zum Zitat Gonzalez, J., Holder, L., Cook, D.J.: Application of graph-based concept learning to the predictive toxicology domain. In: Proceedings of the Predictive Toxicology Challenge Workshop (2001) Gonzalez, J., Holder, L., Cook, D.J.: Application of graph-based concept learning to the predictive toxicology domain. In: Proceedings of the Predictive Toxicology Challenge Workshop (2001)
15.
Zurück zum Zitat Karunaratne, T., Böstrom, H.: Differ: a propositionalization approach for learning from structured data. Mutagenesis 80(88.86), 76–92 (2006) Karunaratne, T., Böstrom, H.: Differ: a propositionalization approach for learning from structured data. Mutagenesis 80(88.86), 76–92 (2006)
16.
Zurück zum Zitat Kavurucu, Y., Senkul, P., Toroslu, I.H.: Concept discovery on relational databases: new techniques for search space pruning and rule quality improvement. Knowl.-Based Syst. 23(8), 743–756 (2010)CrossRef Kavurucu, Y., Senkul, P., Toroslu, I.H.: Concept discovery on relational databases: new techniques for search space pruning and rule quality improvement. Knowl.-Based Syst. 23(8), 743–756 (2010)CrossRef
17.
Zurück zum Zitat Richards, B.L., Mooney, R.J.: Learning relations by pathfinding. In: AAAI, pp. 50–55 (1992) Richards, B.L., Mooney, R.J.: Learning relations by pathfinding. In: AAAI, pp. 50–55 (1992)
18.
Zurück zum Zitat Robinson, I., Webber, J., Eifrem, E.: Graph Databases. O’Reilly Media Inc., Sebastopol (2013) Robinson, I., Webber, J., Eifrem, E.: Graph Databases. O’Reilly Media Inc., Sebastopol (2013)
19.
Zurück zum Zitat Schling, B.: The Boost C++ Libraries. XML Press (2011) Schling, B.: The Boost C++ Libraries. XML Press (2011)
20.
Zurück zum Zitat Goutte, C., Gaussier, É.: A probabilistic interpretation of precision, recall and F-score, with implication for evaluation. In: Losada, D.E., Fernández-Luna, J.M. (eds.) ECIR 2005. LNCS, vol. 3408, pp. 345–359. Springer, Heidelberg (2005) CrossRef Goutte, C., Gaussier, É.: A probabilistic interpretation of precision, recall and F-score, with implication for evaluation. In: Losada, D.E., Fernández-Luna, J.M. (eds.) ECIR 2005. LNCS, vol. 3408, pp. 345–359. Springer, Heidelberg (2005) CrossRef
21.
Zurück zum Zitat Larson, J., Michalski, R.S.: Inductive inference of vl decision rules. ACM SIGART Bull. 63, 38–44 (1977)CrossRef Larson, J., Michalski, R.S.: Inductive inference of vl decision rules. ACM SIGART Bull. 63, 38–44 (1977)CrossRef
22.
Zurück zum Zitat Srinivasan, A., Muggleton, S.H., Sternberg, M.J., King, R.D.: Theories for mutagenicity: a study in first-order and feature-based induction. Artif. Intell. 85(1), 277–299 (1996)CrossRef Srinivasan, A., Muggleton, S.H., Sternberg, M.J., King, R.D.: Theories for mutagenicity: a study in first-order and feature-based induction. Artif. Intell. 85(1), 277–299 (1996)CrossRef
23.
Zurück zum Zitat Srinivasan, A., King, R.D., Muggleton, S.H., Sternberg, M.J.: The predictive toxicology evaluation challenge. In: IJCAI Citeseer, vol. 1, pp. 4–9 (1997) Srinivasan, A., King, R.D., Muggleton, S.H., Sternberg, M.J.: The predictive toxicology evaluation challenge. In: IJCAI Citeseer, vol. 1, pp. 4–9 (1997)
Metadaten
Titel
A Graph-Based Concept Discovery Method for n-Ary Relations
verfasst von
Nazmiye Ceren Abay
Alev Mutlu
Pinar Karagoz
Copyright-Jahr
2015
DOI
https://doi.org/10.1007/978-3-319-22729-0_30