Skip to main content
Erschienen in: Journal of Intelligent Information Systems 2/2014

01.04.2014

A method for reduction of examples in relational learning

verfasst von: Ondřej Kuželka, Andrea Szabóová, Filip Železný

Erschienen in: Journal of Intelligent Information Systems | Ausgabe 2/2014

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Feature selection methods often improve the performance of attribute-value learning. We explore whether also in relational learning, examples in the form of clauses can be reduced in size to speed up learning without affecting the learned hypothesis. To this end, we introduce the notion of safe reduction: a safely reduced example cannot be distinguished from the original example under the given hypothesis language bias. Next, we consider the particular, rather permissive bias of bounded treewidth clauses. We show that under this hypothesis bias, examples of arbitrary treewidth can be reduced efficiently. We evaluate our approach on four data sets with the popular system Aleph and the state-of-the-art relational learner nFOIL. On all four data sets we make learning faster in the case of nFOIL, achieving an order-of-magnitude speed up on one of the data sets, and more accurate in the case of Aleph.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Anhänge
Nur mit Berechtigung zugänglich
Fußnoten
1
In this paper we follow the conventions of Atserias et al. (2007). In other works, what we call k-consistency is known as strong k+1-consistency (Rossi et al. 2006).
 
2
This is not always the case. For example, we have a decision procedure for x-subsumption w.r.t. the set of all clauses – the ordinary 𝜃-subsumption which is decidable but NP-hard.
 
3
Note again the terminology used in this paper following Atserias et al. (2007). In CSP-literature, it is often common to call 2-consistency what we call 1-consistency.
 
4
We are applying 𝜃 also to e because e need not be ground.
 
5
A first-order interpretation consists of several components, one them is the domain of discourse, and another is a function ϕ which maps constants to elements of the domain of discourse. We assume w.l.o.g that there is a constant c d for every element d of the domain of discourse.
 
Literatur
Zurück zum Zitat Appice, A., Ceci,M., Rawles, S., Flach, P.A. (2004). Redundant feature elimination for multi-class problems. In ICML (vol. 69). Appice, A., Ceci,M., Rawles, S., Flach, P.A. (2004). Redundant feature elimination for multi-class problems. In ICML (vol. 69).
Zurück zum Zitat Atserias, A., Bulatov, A., Dalmau, V. (2007). On the power of k-consistency. In Proceedings of ICALP-2007 (pp. 266–271). Atserias, A., Bulatov, A., Dalmau, V. (2007). On the power of k-consistency. In Proceedings of ICALP-2007 (pp. 266–271).
Zurück zum Zitat Beeri, C., Fagin, R., Maier, D., Yannakakis, M. (1983). On the desirability of acyclic database schemes. Journal of ACM, 30(3), 479–513.CrossRefMATHMathSciNet Beeri, C., Fagin, R., Maier, D., Yannakakis, M. (1983). On the desirability of acyclic database schemes. Journal of ACM, 30(3), 479–513.CrossRefMATHMathSciNet
Zurück zum Zitat Bodlaender, H.L., & Mohring, R.H. (1993). The pathwidth and treewidth of cographs. SIAM Journal of Discrete Methematics, 6, 238–255.MathSciNet Bodlaender, H.L., & Mohring, R.H. (1993). The pathwidth and treewidth of cographs. SIAM Journal of Discrete Methematics, 6, 238–255.MathSciNet
Zurück zum Zitat Courcelle, B. (1990). The monadic second-order logic of graphs. i. recognizable sets of finite graphs. Information and Computation, 85(1), 12–75.CrossRefMATHMathSciNet Courcelle, B. (1990). The monadic second-order logic of graphs. i. recognizable sets of finite graphs. Information and Computation, 85(1), 12–75.CrossRefMATHMathSciNet
Zurück zum Zitat De Raedt, L. (2008). Logical and relational learning. New York: Springer. De Raedt, L. (2008). Logical and relational learning. New York: Springer.
Zurück zum Zitat Dechter, R. (2003). Constraint processing. San Francisco: Morgan Kaufmann. Dechter, R. (2003). Constraint processing. San Francisco: Morgan Kaufmann.
Zurück zum Zitat Fagin, R. (1983). Degrees of acyclicity for hypergraphs and relational database schemes. Journal of the ACM, 30(3), 514–550.CrossRefMATHMathSciNet Fagin, R. (1983). Degrees of acyclicity for hypergraphs and relational database schemes. Journal of the ACM, 30(3), 514–550.CrossRefMATHMathSciNet
Zurück zum Zitat Feder, T., & Vardi, M.Y. (1998). The computational structure of monotone monadic snp and constraint satisfaction: a study through datalog and group theory. SIAM Journal on Computing, 28(1), 57–104.CrossRefMATHMathSciNet Feder, T., & Vardi, M.Y. (1998). The computational structure of monotone monadic snp and constraint satisfaction: a study through datalog and group theory. SIAM Journal on Computing, 28(1), 57–104.CrossRefMATHMathSciNet
Zurück zum Zitat Freuder, E.C. (1990). Complexity of k-tree structured constraint satisfaction problems. In Proceedings of the eighth national conference on artificial intelligence (vol. 1, pp. 4–9). AAAI’90: AAAI Press. Freuder, E.C. (1990). Complexity of k-tree structured constraint satisfaction problems. In Proceedings of the eighth national conference on artificial intelligence (vol. 1, pp. 4–9). AAAI’90: AAAI Press.
Zurück zum Zitat Hastie, T., Tibshirani, R., Friedman, J. (2001). The elements of statistical learning: data mining, inference, and prediction. New York: Springer. Hastie, T., Tibshirani, R., Friedman, J. (2001). The elements of statistical learning: data mining, inference, and prediction. New York: Springer.
Zurück zum Zitat Helma, C., King, R.D., Kramer, S., Srinivasan, A. (2001). The predictive toxicology challenge 2000–2001. Bioinformatics, 17(1), 107–108.CrossRef Helma, C., King, R.D., Kramer, S., Srinivasan, A. (2001). The predictive toxicology challenge 2000–2001. Bioinformatics, 17(1), 107–108.CrossRef
Zurück zum Zitat Krogel, M.A., Rawles, S., Železný, F., Flach, P., Lavrac, N., Wrobel, S. (2003). Comparative evaluation of approaches to propositionalization. In ILP. Springer. Krogel, M.A., Rawles, S., Železný, F., Flach, P., Lavrac, N., Wrobel, S. (2003). Comparative evaluation of approaches to propositionalization. In ILP. Springer.
Zurück zum Zitat Kuželka, O., & Železný, F. (2009). Block-wise construction of acyclic relational features with monotone irreducibility and relevancy properties. In ICML 2009: the 26th International Conference on Machine Learning. Kuželka, O., & Železný, F. (2009). Block-wise construction of acyclic relational features with monotone irreducibility and relevancy properties. In ICML 2009: the 26th International Conference on Machine Learning.
Zurück zum Zitat Kuželka, O., Železný, F. (2011a). Block-wise construction of tree-like relational features with monotone reducibility and redundancy. Machine Learning, 83, 163–192.CrossRefMATHMathSciNet Kuželka, O., Železný, F. (2011a). Block-wise construction of tree-like relational features with monotone reducibility and redundancy. Machine Learning, 83, 163–192.CrossRefMATHMathSciNet
Zurück zum Zitat Kuželka, O., Železný, F. (2011b). Seeing the world through homomorphism: An experimental study on reducibility of examples. In ILP’10: Inductive logic programming (pp. 138–145). Kuželka, O., Železný, F. (2011b). Seeing the world through homomorphism: An experimental study on reducibility of examples. In ILP’10: Inductive logic programming (pp. 138–145).
Zurück zum Zitat Kuželka, O., Szabóová, A., Železný, F. (2013a). Bounded least general generalization. In ILP’12: inductive logic programming. Kuželka, O., Szabóová, A., Železný, F. (2013a). Bounded least general generalization. In ILP’12: inductive logic programming.
Zurück zum Zitat Kuželka, O., Szabóová, A., Železný, F. (2013b). Reducing examples in relational learning with bounded-treewidth hypotheses. In New frontiers in mining complex patterns (pp. 17–32). Kuželka, O., Szabóová, A., Železný, F. (2013b). Reducing examples in relational learning with bounded-treewidth hypotheses. In New frontiers in mining complex patterns (pp. 17–32).
Zurück zum Zitat Landwehr, N., Kersting, K., Raedt, L.D. (2007). Integrating naïve bayes and FOIL. Journal of Machine Learning Research, 8, 481–507.MATH Landwehr, N., Kersting, K., Raedt, L.D. (2007). Integrating naïve bayes and FOIL. Journal of Machine Learning Research, 8, 481–507.MATH
Zurück zum Zitat Lavrač, N., Gamberger, D., Jovanoski, V. (1999). A study of relevance for learning in deductive databases. Journal of Logic Programming, 40(2/3), 215–249.CrossRefMATHMathSciNet Lavrač, N., Gamberger, D., Jovanoski, V. (1999). A study of relevance for learning in deductive databases. Journal of Logic Programming, 40(2/3), 215–249.CrossRefMATHMathSciNet
Zurück zum Zitat Liu, H.,Motoda, H., Setiono, R., Zhao, Z. (2010). Feature selection: an ever evolving frontier in data mining. Journal of Machine Learning Research - Proceedings Track, 10, 4–13. Liu, H.,Motoda, H., Setiono, R., Zhao, Z. (2010). Feature selection: an ever evolving frontier in data mining. Journal of Machine Learning Research - Proceedings Track, 10, 4–13.
Zurück zum Zitat Maloberti, J., & Sebag, M. (2004). Fast theta-subsumption with constraint satisfaction algorithms. Machine Learning, 55(2), 137–174.CrossRefMATH Maloberti, J., & Sebag, M. (2004). Fast theta-subsumption with constraint satisfaction algorithms. Machine Learning, 55(2), 137–174.CrossRefMATH
Zurück zum Zitat Muggleton, S. (1995). Inverse entailment and Progol. New Generation Computing, Special Issue on Inductive Logic Programming, 13(3–4), 245–286.CrossRef Muggleton, S. (1995). Inverse entailment and Progol. New Generation Computing, Special Issue on Inductive Logic Programming, 13(3–4), 245–286.CrossRef
Zurück zum Zitat Nassif, H., Al-Ali, H., Khuri, S., Keirouz, W., Page, D. (2009). An inductive logic programming approach to validate hexose biochemical knowledge. In: Proceedings of the 19th international conference on ILP (pp. 149–165). Leuven. Nassif, H., Al-Ali, H., Khuri, S., Keirouz, W., Page, D. (2009). An inductive logic programming approach to validate hexose biochemical knowledge. In: Proceedings of the 19th international conference on ILP (pp. 149–165). Leuven.
Zurück zum Zitat Nienhuys-Cheng, S.H., de Wolf, R., (eds.) (1997). Foundations of inductive logic programming. Lecture Notes in Computer Science (vol. 1228). Springer. Nienhuys-Cheng, S.H., de Wolf, R., (eds.) (1997). Foundations of inductive logic programming. Lecture Notes in Computer Science (vol. 1228). Springer.
Zurück zum Zitat Plotkin, G. (1970). A note on inductive generalization. Edinburgh: Edinburgh University Press. Plotkin, G. (1970). A note on inductive generalization. Edinburgh: Edinburgh University Press.
Zurück zum Zitat Rossi, F., van Beek, P., Walsh T., (Eds.) (2006). Handbook of constraint programming. New York: Elsevier. Rossi, F., van Beek, P., Walsh T., (Eds.) (2006). Handbook of constraint programming. New York: Elsevier.
Zurück zum Zitat Žaková, M., Železný, F., Garcia-Sedano, J., Tissot, C.M., Lavrač, N., Křemen, P., Molina, J. (2007). Relational data mining applied to virtual engineering of product designs. In ILP06, LNAI (vol. 4455, pp. 439–453). Springer. Žaková, M., Železný, F., Garcia-Sedano, J., Tissot, C.M., Lavrač, N., Křemen, P., Molina, J. (2007). Relational data mining applied to virtual engineering of product designs. In ILP06, LNAI (vol. 4455, pp. 439–453). Springer.
Metadaten
Titel
A method for reduction of examples in relational learning
verfasst von
Ondřej Kuželka
Andrea Szabóová
Filip Železný
Publikationsdatum
01.04.2014
Verlag
Springer US
Erschienen in
Journal of Intelligent Information Systems / Ausgabe 2/2014
Print ISSN: 0925-9902
Elektronische ISSN: 1573-7675
DOI
https://doi.org/10.1007/s10844-013-0294-z

Weitere Artikel der Ausgabe 2/2014

Journal of Intelligent Information Systems 2/2014 Zur Ausgabe

Premium Partner