Top

Journal of Intelligent Information Systems

Published in:

01-04-2014

A method for reduction of examples in relational learning

Authors: Ondřej Kuželka, Andrea Szabóová, Filip Železný

Published in: Journal of Intelligent Information Systems | Issue 2/2014

Activate our intelligent search to find suitable subject content or patents.

search-config

AI-assisted search

Off

Abstract

Feature selection methods often improve the performance of attribute-value learning. We explore whether also in relational learning, examples in the form of clauses can be reduced in size to speed up learning without affecting the learned hypothesis. To this end, we introduce the notion of safe reduction: a safely reduced example cannot be distinguished from the original example under the given hypothesis language bias. Next, we consider the particular, rather permissive bias of bounded treewidth clauses. We show that under this hypothesis bias, examples of arbitrary treewidth can be reduced efficiently. We evaluate our approach on four data sets with the popular system Aleph and the state-of-the-art relational learner nFOIL. On all four data sets we make learning faster in the case of nFOIL, achieving an order-of-magnitude speed up on one of the data sets, and more accurate in the case of Aleph.

previous article Semantic subgroup explanations

next article A contribution to the discovery of multidimensional patterns in healthcare trajectories

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

über 102.000 Bücher
über 537 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Finance + Banking
Management + Führung
Marketing + Vertrieb
Maschinenbau + Werkstoffe
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

inform now

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 390 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Maschinenbau + Werkstoffe

Jetzt Wissensvorsprung sichern!

inform now

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 340 Zeitschriften

aus folgenden Fachgebieten:

Bauwesen + Immobilien
Business IT + Informatik
Finance + Banking
Management + Führung
Marketing + Vertrieb
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

inform now

Available only for authorised users

In this paper we follow the conventions of Atserias et al. (2007). In other works, what we call k-consistency is known as strong k+1-consistency (Rossi et al. 2006).

This is not always the case. For example, we have a decision procedure for x-subsumption w.r.t. the set of all clauses – the ordinary 𝜃-subsumption which is decidable but NP-hard.

Note again the terminology used in this paper following Atserias et al. (2007). In CSP-literature, it is often common to call 2-consistency what we call 1-consistency.

We are applying 𝜃 also to e because e need not be ground.

A first-order interpretation consists of several components, one them is the domain of discourse, and another is a function ϕ which maps constants to elements of the domain of discourse. We assume w.l.o.g that there is a constant c _d for every element d of the domain of discourse.

Appice, A., Ceci,M., Rawles, S., Flach, P.A. (2004). Redundant feature elimination for multi-class problems. In ICML (vol. 69).

Atserias, A., Bulatov, A., Dalmau, V. (2007). On the power of k-consistency. In Proceedings of ICALP-2007 (pp. 266–271).

Beeri, C., Fagin, R., Maier, D., Yannakakis, M. (1983). On the desirability of acyclic database schemes. Journal of ACM, 30(3), 479–513.CrossRefMATHMathSciNet

Bodlaender, H.L., & Mohring, R.H. (1993). The pathwidth and treewidth of cographs. SIAM Journal of Discrete Methematics, 6, 238–255.MathSciNet

Courcelle, B. (1990). The monadic second-order logic of graphs. i. recognizable sets of finite graphs. Information and Computation, 85(1), 12–75.CrossRefMATHMathSciNet

De Raedt, L. (1997).) Logical settings for concept-learning. Artificial Intelligence, 95(1), 187–201.CrossRefMATHMathSciNet

De Raedt, L. (2008). Logical and relational learning. New York: Springer.

Dechter, R. (2003). Constraint processing. San Francisco: Morgan Kaufmann.

Erickson, J. (2009). CS 598: Computational topology, course notes, University of Illinois at Urbana-Champaign. http://compgeom.cs.uiuc.edu/~jeffe/teaching/comptop/.

Fagin, R. (1983). Degrees of acyclicity for hypergraphs and relational database schemes. Journal of the ACM, 30(3), 514–550.CrossRefMATHMathSciNet

Feder, T., & Vardi, M.Y. (1998). The computational structure of monotone monadic snp and constraint satisfaction: a study through datalog and group theory. SIAM Journal on Computing, 28(1), 57–104.CrossRefMATHMathSciNet

Freuder, E.C. (1990). Complexity of k-tree structured constraint satisfaction problems. In Proceedings of the eighth national conference on artificial intelligence (vol. 1, pp. 4–9). AAAI’90: AAAI Press.

Hastie, T., Tibshirani, R., Friedman, J. (2001). The elements of statistical learning: data mining, inference, and prediction. New York: Springer.

Helma, C., King, R.D., Kramer, S., Srinivasan, A. (2001). The predictive toxicology challenge 2000–2001. Bioinformatics, 17(1), 107–108.CrossRef

Krogel, M.A., Rawles, S., Železný, F., Flach, P., Lavrac, N., Wrobel, S. (2003). Comparative evaluation of approaches to propositionalization. In ILP. Springer.

Kuželka, O., & Železný, F. (2009). Block-wise construction of acyclic relational features with monotone irreducibility and relevancy properties. In ICML 2009: the 26th International Conference on Machine Learning.

Kuželka, O., Železný, F. (2011a). Block-wise construction of tree-like relational features with monotone reducibility and redundancy. Machine Learning, 83, 163–192.CrossRefMATHMathSciNet

Kuželka, O., Železný, F. (2011b). Seeing the world through homomorphism: An experimental study on reducibility of examples. In ILP’10: Inductive logic programming (pp. 138–145).

Kuželka, O., Szabóová, A., Železný, F. (2013a). Bounded least general generalization. In ILP’12: inductive logic programming.

Kuželka, O., Szabóová, A., Železný, F. (2013b). Reducing examples in relational learning with bounded-treewidth hypotheses. In New frontiers in mining complex patterns (pp. 17–32).

Landwehr, N., Kersting, K., Raedt, L.D. (2007). Integrating naïve bayes and FOIL. Journal of Machine Learning Research, 8, 481–507.MATH

Lavrač, N., Gamberger, D., Jovanoski, V. (1999). A study of relevance for learning in deductive databases. Journal of Logic Programming, 40(2/3), 215–249.CrossRefMATHMathSciNet

Liu, H.,Motoda, H., Setiono, R., Zhao, Z. (2010). Feature selection: an ever evolving frontier in data mining. Journal of Machine Learning Research - Proceedings Track, 10, 4–13.

Mackworth, A. (1977). Consistency in networks of relations. Artificial Intelligence, 8(1), 99–118.CrossRefMATHMathSciNet

Maloberti, J., & Sebag, M. (2004). Fast theta-subsumption with constraint satisfaction algorithms. Machine Learning, 55(2), 137–174.CrossRefMATH

Muggleton, S. (1995). Inverse entailment and Progol. New Generation Computing, Special Issue on Inductive Logic Programming, 13(3–4), 245–286.CrossRef

Nassif, H., Al-Ali, H., Khuri, S., Keirouz, W., Page, D. (2009). An inductive logic programming approach to validate hexose biochemical knowledge. In: Proceedings of the 19th international conference on ILP (pp. 149–165). Leuven.

Nienhuys-Cheng, S.H., de Wolf, R., (eds.) (1997). Foundations of inductive logic programming. Lecture Notes in Computer Science (vol. 1228). Springer.

Plotkin, G. (1970). A note on inductive generalization. Edinburgh: Edinburgh University Press.

Rossi, F., van Beek, P., Walsh T., (Eds.) (2006). Handbook of constraint programming. New York: Elsevier.

Žaková, M., Železný, F., Garcia-Sedano, J., Tissot, C.M., Lavrač, N., Křemen, P., Molina, J. (2007). Relational data mining applied to virtual engineering of product designs. In ILP06, LNAI (vol. 4455, pp. 439–453). Springer.

Title: A method for reduction of examples in relational learning
Authors: Ondřej Kuželka
Andrea Szabóová
Filip Železný
Publication date: 01-04-2014
Publisher: Springer US
Published in: Journal of Intelligent Information Systems / Issue 2/2014
Print ISSN: 0925-9902
Electronic ISSN: 1573-7675
DOI: https://doi.org/10.1007/s10844-013-0294-z

Springer Professional

Abstract

Please log in to get access to your license.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Springer Professional "Technik"

Springer Professional "Wirtschaft"

Other articles of this Issue 2/2014

Finding the most descriptive substructures in graphs with discrete and numeric labels

Link classification with probabilistic graphs

Semantic subgroup explanations

Hierarchical object-driven action rules

Mining complex patterns

A contribution to the discovery of multidimensional patterns in healthcare trajectories

Premium Partner