Skip to main content
Top
Published in: Journal of Intelligent Information Systems 2/2014

01-04-2014

A method for reduction of examples in relational learning

Authors: Ondřej Kuželka, Andrea Szabóová, Filip Železný

Published in: Journal of Intelligent Information Systems | Issue 2/2014

Log in

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

Feature selection methods often improve the performance of attribute-value learning. We explore whether also in relational learning, examples in the form of clauses can be reduced in size to speed up learning without affecting the learned hypothesis. To this end, we introduce the notion of safe reduction: a safely reduced example cannot be distinguished from the original example under the given hypothesis language bias. Next, we consider the particular, rather permissive bias of bounded treewidth clauses. We show that under this hypothesis bias, examples of arbitrary treewidth can be reduced efficiently. We evaluate our approach on four data sets with the popular system Aleph and the state-of-the-art relational learner nFOIL. On all four data sets we make learning faster in the case of nFOIL, achieving an order-of-magnitude speed up on one of the data sets, and more accurate in the case of Aleph.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Appendix
Available only for authorised users
Footnotes
1
In this paper we follow the conventions of Atserias et al. (2007). In other works, what we call k-consistency is known as strong k+1-consistency (Rossi et al. 2006).
 
2
This is not always the case. For example, we have a decision procedure for x-subsumption w.r.t. the set of all clauses – the ordinary 𝜃-subsumption which is decidable but NP-hard.
 
3
Note again the terminology used in this paper following Atserias et al. (2007). In CSP-literature, it is often common to call 2-consistency what we call 1-consistency.
 
4
We are applying 𝜃 also to e because e need not be ground.
 
5
A first-order interpretation consists of several components, one them is the domain of discourse, and another is a function ϕ which maps constants to elements of the domain of discourse. We assume w.l.o.g that there is a constant c d for every element d of the domain of discourse.
 
Literature
go back to reference Appice, A., Ceci,M., Rawles, S., Flach, P.A. (2004). Redundant feature elimination for multi-class problems. In ICML (vol. 69). Appice, A., Ceci,M., Rawles, S., Flach, P.A. (2004). Redundant feature elimination for multi-class problems. In ICML (vol. 69).
go back to reference Atserias, A., Bulatov, A., Dalmau, V. (2007). On the power of k-consistency. In Proceedings of ICALP-2007 (pp. 266–271). Atserias, A., Bulatov, A., Dalmau, V. (2007). On the power of k-consistency. In Proceedings of ICALP-2007 (pp. 266–271).
go back to reference Beeri, C., Fagin, R., Maier, D., Yannakakis, M. (1983). On the desirability of acyclic database schemes. Journal of ACM, 30(3), 479–513.CrossRefMATHMathSciNet Beeri, C., Fagin, R., Maier, D., Yannakakis, M. (1983). On the desirability of acyclic database schemes. Journal of ACM, 30(3), 479–513.CrossRefMATHMathSciNet
go back to reference Bodlaender, H.L., & Mohring, R.H. (1993). The pathwidth and treewidth of cographs. SIAM Journal of Discrete Methematics, 6, 238–255.MathSciNet Bodlaender, H.L., & Mohring, R.H. (1993). The pathwidth and treewidth of cographs. SIAM Journal of Discrete Methematics, 6, 238–255.MathSciNet
go back to reference Courcelle, B. (1990). The monadic second-order logic of graphs. i. recognizable sets of finite graphs. Information and Computation, 85(1), 12–75.CrossRefMATHMathSciNet Courcelle, B. (1990). The monadic second-order logic of graphs. i. recognizable sets of finite graphs. Information and Computation, 85(1), 12–75.CrossRefMATHMathSciNet
go back to reference De Raedt, L. (2008). Logical and relational learning. New York: Springer. De Raedt, L. (2008). Logical and relational learning. New York: Springer.
go back to reference Dechter, R. (2003). Constraint processing. San Francisco: Morgan Kaufmann. Dechter, R. (2003). Constraint processing. San Francisco: Morgan Kaufmann.
go back to reference Feder, T., & Vardi, M.Y. (1998). The computational structure of monotone monadic snp and constraint satisfaction: a study through datalog and group theory. SIAM Journal on Computing, 28(1), 57–104.CrossRefMATHMathSciNet Feder, T., & Vardi, M.Y. (1998). The computational structure of monotone monadic snp and constraint satisfaction: a study through datalog and group theory. SIAM Journal on Computing, 28(1), 57–104.CrossRefMATHMathSciNet
go back to reference Freuder, E.C. (1990). Complexity of k-tree structured constraint satisfaction problems. In Proceedings of the eighth national conference on artificial intelligence (vol. 1, pp. 4–9). AAAI’90: AAAI Press. Freuder, E.C. (1990). Complexity of k-tree structured constraint satisfaction problems. In Proceedings of the eighth national conference on artificial intelligence (vol. 1, pp. 4–9). AAAI’90: AAAI Press.
go back to reference Hastie, T., Tibshirani, R., Friedman, J. (2001). The elements of statistical learning: data mining, inference, and prediction. New York: Springer. Hastie, T., Tibshirani, R., Friedman, J. (2001). The elements of statistical learning: data mining, inference, and prediction. New York: Springer.
go back to reference Helma, C., King, R.D., Kramer, S., Srinivasan, A. (2001). The predictive toxicology challenge 2000–2001. Bioinformatics, 17(1), 107–108.CrossRef Helma, C., King, R.D., Kramer, S., Srinivasan, A. (2001). The predictive toxicology challenge 2000–2001. Bioinformatics, 17(1), 107–108.CrossRef
go back to reference Krogel, M.A., Rawles, S., Železný, F., Flach, P., Lavrac, N., Wrobel, S. (2003). Comparative evaluation of approaches to propositionalization. In ILP. Springer. Krogel, M.A., Rawles, S., Železný, F., Flach, P., Lavrac, N., Wrobel, S. (2003). Comparative evaluation of approaches to propositionalization. In ILP. Springer.
go back to reference Kuželka, O., & Železný, F. (2009). Block-wise construction of acyclic relational features with monotone irreducibility and relevancy properties. In ICML 2009: the 26th International Conference on Machine Learning. Kuželka, O., & Železný, F. (2009). Block-wise construction of acyclic relational features with monotone irreducibility and relevancy properties. In ICML 2009: the 26th International Conference on Machine Learning.
go back to reference Kuželka, O., Železný, F. (2011a). Block-wise construction of tree-like relational features with monotone reducibility and redundancy. Machine Learning, 83, 163–192.CrossRefMATHMathSciNet Kuželka, O., Železný, F. (2011a). Block-wise construction of tree-like relational features with monotone reducibility and redundancy. Machine Learning, 83, 163–192.CrossRefMATHMathSciNet
go back to reference Kuželka, O., Železný, F. (2011b). Seeing the world through homomorphism: An experimental study on reducibility of examples. In ILP’10: Inductive logic programming (pp. 138–145). Kuželka, O., Železný, F. (2011b). Seeing the world through homomorphism: An experimental study on reducibility of examples. In ILP’10: Inductive logic programming (pp. 138–145).
go back to reference Kuželka, O., Szabóová, A., Železný, F. (2013a). Bounded least general generalization. In ILP’12: inductive logic programming. Kuželka, O., Szabóová, A., Železný, F. (2013a). Bounded least general generalization. In ILP’12: inductive logic programming.
go back to reference Kuželka, O., Szabóová, A., Železný, F. (2013b). Reducing examples in relational learning with bounded-treewidth hypotheses. In New frontiers in mining complex patterns (pp. 17–32). Kuželka, O., Szabóová, A., Železný, F. (2013b). Reducing examples in relational learning with bounded-treewidth hypotheses. In New frontiers in mining complex patterns (pp. 17–32).
go back to reference Landwehr, N., Kersting, K., Raedt, L.D. (2007). Integrating naïve bayes and FOIL. Journal of Machine Learning Research, 8, 481–507.MATH Landwehr, N., Kersting, K., Raedt, L.D. (2007). Integrating naïve bayes and FOIL. Journal of Machine Learning Research, 8, 481–507.MATH
go back to reference Lavrač, N., Gamberger, D., Jovanoski, V. (1999). A study of relevance for learning in deductive databases. Journal of Logic Programming, 40(2/3), 215–249.CrossRefMATHMathSciNet Lavrač, N., Gamberger, D., Jovanoski, V. (1999). A study of relevance for learning in deductive databases. Journal of Logic Programming, 40(2/3), 215–249.CrossRefMATHMathSciNet
go back to reference Liu, H.,Motoda, H., Setiono, R., Zhao, Z. (2010). Feature selection: an ever evolving frontier in data mining. Journal of Machine Learning Research - Proceedings Track, 10, 4–13. Liu, H.,Motoda, H., Setiono, R., Zhao, Z. (2010). Feature selection: an ever evolving frontier in data mining. Journal of Machine Learning Research - Proceedings Track, 10, 4–13.
go back to reference Maloberti, J., & Sebag, M. (2004). Fast theta-subsumption with constraint satisfaction algorithms. Machine Learning, 55(2), 137–174.CrossRefMATH Maloberti, J., & Sebag, M. (2004). Fast theta-subsumption with constraint satisfaction algorithms. Machine Learning, 55(2), 137–174.CrossRefMATH
go back to reference Muggleton, S. (1995). Inverse entailment and Progol. New Generation Computing, Special Issue on Inductive Logic Programming, 13(3–4), 245–286.CrossRef Muggleton, S. (1995). Inverse entailment and Progol. New Generation Computing, Special Issue on Inductive Logic Programming, 13(3–4), 245–286.CrossRef
go back to reference Nassif, H., Al-Ali, H., Khuri, S., Keirouz, W., Page, D. (2009). An inductive logic programming approach to validate hexose biochemical knowledge. In: Proceedings of the 19th international conference on ILP (pp. 149–165). Leuven. Nassif, H., Al-Ali, H., Khuri, S., Keirouz, W., Page, D. (2009). An inductive logic programming approach to validate hexose biochemical knowledge. In: Proceedings of the 19th international conference on ILP (pp. 149–165). Leuven.
go back to reference Nienhuys-Cheng, S.H., de Wolf, R., (eds.) (1997). Foundations of inductive logic programming. Lecture Notes in Computer Science (vol. 1228). Springer. Nienhuys-Cheng, S.H., de Wolf, R., (eds.) (1997). Foundations of inductive logic programming. Lecture Notes in Computer Science (vol. 1228). Springer.
go back to reference Plotkin, G. (1970). A note on inductive generalization. Edinburgh: Edinburgh University Press. Plotkin, G. (1970). A note on inductive generalization. Edinburgh: Edinburgh University Press.
go back to reference Rossi, F., van Beek, P., Walsh T., (Eds.) (2006). Handbook of constraint programming. New York: Elsevier. Rossi, F., van Beek, P., Walsh T., (Eds.) (2006). Handbook of constraint programming. New York: Elsevier.
go back to reference Žaková, M., Železný, F., Garcia-Sedano, J., Tissot, C.M., Lavrač, N., Křemen, P., Molina, J. (2007). Relational data mining applied to virtual engineering of product designs. In ILP06, LNAI (vol. 4455, pp. 439–453). Springer. Žaková, M., Železný, F., Garcia-Sedano, J., Tissot, C.M., Lavrač, N., Křemen, P., Molina, J. (2007). Relational data mining applied to virtual engineering of product designs. In ILP06, LNAI (vol. 4455, pp. 439–453). Springer.
Metadata
Title
A method for reduction of examples in relational learning
Authors
Ondřej Kuželka
Andrea Szabóová
Filip Železný
Publication date
01-04-2014
Publisher
Springer US
Published in
Journal of Intelligent Information Systems / Issue 2/2014
Print ISSN: 0925-9902
Electronic ISSN: 1573-7675
DOI
https://doi.org/10.1007/s10844-013-0294-z

Other articles of this Issue 2/2014

Journal of Intelligent Information Systems 2/2014 Go to the issue

Premium Partner