Published in: Progress in Artificial Intelligence 3/2019

13.05.2019 | Regular Paper

WordificationMI: multi-relational data mining through multiple-instance propositionalization

Authors: Luis A. Quintero-Domínguez, Carlos Morell, Sebastián Ventura

Abstract

Multi-relational data mining (MRDM) looks for patterns in a relational database. One of the established approaches to MRDM is propositionalization, which transforms a relational database into a simpler representation, commonly a single table. Another approach that has proven effective for learning problems involving one-to-many relationships in the data is multiple-instance learning. In this paper, we propose a new technique for transforming relational data, called WordificationMI, which takes advantage of the strengths of multiple-instance learning. The proposal builds on the bag-of-words representation introduced in the Wordification methodology but, unlike Wordification, it transforms a relational database into a multiple-instance representation. Additionally, we propose a feature selection method, named MICHI (\(\chi_{\mathrm{MI}}^{2}\)), for reducing the dimensionality of the datasets obtained with WordificationMI. We also present an empirical evaluation with ten relational databases and four learning techniques, which shows the effectiveness of the proposed methods.
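
To make the idea of a multiple-instance, bag-of-words view of relational data more concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a hypothetical toy database with a main table (customers) and one secondary table (orders) in a one-to-many relationship, and it assumes that each related secondary row becomes one instance made of "table_attribute_value" word items. All table names, attribute names, and helper functions are invented for illustration.

```python
from collections import Counter

# Hypothetical toy database: one main table and one secondary table (one-to-many).
customers = [
    {"id": 1, "label": "good"},
    {"id": 2, "label": "bad"},
]
orders = [
    {"customer_id": 1, "product": "book", "channel": "web"},
    {"customer_id": 1, "product": "pen", "channel": "shop"},
    {"customer_id": 2, "product": "book", "channel": "shop"},
]


def row_to_words(table, row, skip):
    """Turn one row into 'table_attribute_value' word items (assumed naming scheme)."""
    return [f"{table}_{attr}_{value}" for attr, value in row.items() if attr not in skip]


def to_bags(main_rows, secondary_rows, fk):
    """Build one bag per main-table row; each related secondary row is one instance."""
    bags = {}
    for main in main_rows:
        instances = [
            Counter(row_to_words("orders", row, skip={fk}))  # word counts of one row
            for row in secondary_rows
            if row[fk] == main["id"]
        ]
        bags[main["id"]] = {"label": main["label"], "instances": instances}
    return bags


bags = to_bags(customers, orders, fk="customer_id")
for bag_id, bag in bags.items():
    print(bag_id, bag["label"], [dict(inst) for inst in bag["instances"]])
```

In this sketch, customer 1 yields a bag with two instances and customer 2 a bag with one, so the one-to-many structure is kept instead of being flattened into a single feature vector per customer. A multiple-instance learner would then be trained on the labeled bags.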

Footnotes
1
All databases used here were obtained from https://relational.fit.cvut.cz except IMDb, which was provided by the authors of Wordification.
 
Metadata
Title: WordificationMI: multi-relational data mining through multiple-instance propositionalization
Authors: Luis A. Quintero-Domínguez, Carlos Morell, Sebastián Ventura
Publication date: 13.05.2019
Publisher: Springer Berlin Heidelberg
Published in: Progress in Artificial Intelligence, Issue 3/2019
Print ISSN: 2192-6352
Electronic ISSN: 2192-6360
DOI: https://doi.org/10.1007/s13748-019-00186-y
