
2012 | Book

Foundations of Rule Learning

Authors: Johannes Fürnkranz, Dragan Gamberger, Nada Lavrač

Publisher: Springer Berlin Heidelberg

Book Series: Cognitive Technologies


About this Book

Rules – the clearest, best-explored, and best-understood form of knowledge representation – are particularly important for data mining, as they offer the best tradeoff between human and machine understandability. This book presents the fundamentals of rule learning as investigated in classical machine learning and modern data mining. It introduces a feature-based view as a unifying framework for propositional and relational rule learning, thus bridging the gap between attribute-value learning and inductive logic programming, and providing complete coverage of the most important elements of rule learning.

The book can be used as a textbook for teaching machine learning, as well as a comprehensive reference to research in the field of inductive rule learning. As such, it targets students, researchers and developers of rule learning algorithms, presenting the fundamental rule learning concepts in sufficient breadth and depth to enable the reader to understand, develop and apply rule learning techniques to real-world data.

Table of Contents

Frontmatter
Chapter 1. Machine Learning and Data Mining
Abstract
Machine learning and data mining are research areas of computer science whose rapid development is due to advances in data analysis research, the growth of the database industry, and the resulting market need for methods capable of extracting valuable knowledge from large data stores.
Johannes Fürnkranz, Dragan Gamberger, Nada Lavrač
Chapter 2. Rule Learning in a Nutshell
Abstract
This chapter gives a brief overview of inductive rule learning and may therefore serve as a guide through the rest of the book. Later chapters will expand upon the material presented here and discuss advanced approaches, whereas this chapter only presents the core concepts. The chapter describes the basic covering algorithm together with search heuristics and rule quality criteria, illustrates classification rule learning on simple propositional learning problems, shows how to use the learned rules for classifying new instances, and introduces the basic evaluation criteria and methodology for rule-set evaluation.
Johannes Fürnkranz, Dragan Gamberger, Nada Lavrač
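The covering (separate-and-conquer) loop mentioned in this abstract can be sketched compactly. The following is a minimal illustration, not the book's actual algorithm: rules are represented as dicts of attribute-value tests, and `find_best_rule` is a hypothetical stand-in for any single-rule learner (here a simple greedy one that scores tests by covered positives minus covered negatives).

```python
def rule_covers(rule, example):
    """A rule (dict of attribute -> required value) covers an example
    if every test in its body is satisfied."""
    return all(example.get(attr) == val for attr, val in rule.items())

def find_best_rule(positives, negatives):
    """Hypothetical greedy learner: repeatedly add the single test that
    best separates positives from negatives, until no negative example
    remains covered (or no tests are left)."""
    rule = {}
    pos, neg = list(positives), list(negatives)
    while neg:
        candidates = {(a, v) for ex in pos for a, v in ex.items()
                      if a not in rule}
        if not candidates:
            break
        def score(test):
            a, v = test
            p = sum(1 for ex in pos if ex.get(a) == v)
            n = sum(1 for ex in neg if ex.get(a) == v)
            return p - n
        a, v = max(candidates, key=score)
        rule[a] = v
        # keep only the examples still covered by the refined rule
        pos = [ex for ex in pos if ex.get(a) == v]
        neg = [ex for ex in neg if ex.get(a) == v]
    return rule

def covering(positives, negatives):
    """Learn rules one at a time, removing the covered positive
    examples after each round (the 'separate' step)."""
    rules, remaining = [], list(positives)
    while remaining:
        rule = find_best_rule(remaining, negatives)
        covered = [ex for ex in remaining if rule_covers(rule, ex)]
        if not covered:          # safeguard: stop if no progress
            break
        rules.append(rule)
        remaining = [ex for ex in remaining if not rule_covers(rule, ex)]
    return rules
```

On a toy dataset where all positives share `shape = circle`, a single rule suffices and the loop terminates after one iteration.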
Chapter 3. Formal Framework for Rule Analysis
Abstract
This chapter provides a formal framework which enables the analysis of the key elements of rule learning: features, search heuristics for feature and rule evaluation, and heuristics/constraints for rule overfitting avoidance. After a brief introduction, Sect. 3.2 discusses receiver operating characteristic (ROC) analysis, the basis of coverage spaces which will serve as an analytic tool for visualizing the behavior of rule learning algorithms and their key elements. The coverage space will then be formally introduced in Sect. 3.3, and its relation to the ROC space will be discussed. Finally, in Sect. 3.4, we show how typical rule learning algorithms behave in coverage space.
Johannes Fürnkranz, Dragan Gamberger, Nada Lavrač
Chapter 4. Features
Abstract
Rule learning systems use features as the main building blocks for rules. A feature can be a simple attribute-value test or a test for the validity of a complex domain knowledge relationship. Most existing concept learning systems generate features in the rule construction process. In contrast, this chapter shows that the separation of the feature construction and rule construction process has several theoretical and practical advantages. In particular, explicit usage of features enables a unifying framework of both propositional and relational rule learning. We demonstrate procedures for generating a set of simple features that—in domains with no contradictory examples—enable the construction of complete and consistent rule sets, and do not include obviously irrelevant features.
Johannes Fürnkranz, Dragan Gamberger, Nada Lavrač
Chapter 5. Relational Features
Abstract
While typical data mining approaches find patterns/models from data stored in a single data table, relational data mining and inductive logic programming approaches (Džeroski & Lavrač, 2001; Lavrač & Džeroski, 1994a) find patterns/models from data stored in more complex data structures, such as graphs, multiple tables, etc., involving multiple relations. This chapter shows how to construct relational features and how to derive a covering table from such complex data structures.
Johannes Fürnkranz, Dragan Gamberger, Nada Lavrač
Chapter 6. Learning Single Rules
Abstract
This chapter describes algorithms and strategies for constructing single rules in the concept learning framework. The starting point in Sect. 6.2 is a generic algorithm for finding the best rule which searches the hypothesis space for a rule that optimizes some quality criterion. Section 6.3 presents alternative search algorithms (heuristic search algorithms including hill-climbing and beam search, as well as exhaustive search algorithms).
Johannes Fürnkranz, Dragan Gamberger, Nada Lavrač
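The beam search strategy named in this abstract generalizes hill-climbing by keeping the `b` best partial rules at each refinement step instead of only one. The sketch below is an illustrative assumption, not the book's implementation: rules are dicts of attribute-value tests, refinement adds one test, and candidates are ranked by rule precision.

```python
def refinements(rule, examples):
    """All rules obtained by adding one attribute-value test
    observed in the given examples."""
    tests = {(a, v) for ex in examples for a, v in ex.items()}
    return [dict(rule, **{a: v}) for a, v in tests if a not in rule]

def rule_precision(rule, pos, neg):
    """Fraction of covered examples that are positive."""
    p = sum(all(ex.get(a) == v for a, v in rule.items()) for ex in pos)
    n = sum(all(ex.get(a) == v for a, v in rule.items()) for ex in neg)
    return p / (p + n) if p + n else 0.0

def beam_search(pos, neg, beam_width=3, max_len=3):
    beam = [{}]                                  # the universal (empty-body) rule
    best = ({}, rule_precision({}, pos, neg))
    for _ in range(max_len):
        # refine every rule in the beam, then keep the b best candidates
        candidates = [r for rule in beam for r in refinements(rule, pos)]
        if not candidates:
            break
        candidates.sort(key=lambda r: rule_precision(r, pos, neg),
                        reverse=True)
        beam = candidates[:beam_width]
        top = rule_precision(beam[0], pos, neg)
        if top > best[1]:
            best = (beam[0], top)
    return best[0]
```

With `beam_width=1` this degenerates to hill-climbing; exhaustive search corresponds to an unbounded beam.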
Chapter 7. Rule Evaluation Measures
Abstract
This chapter gives an overview of rule evaluation measures that are commonly used as search heuristics in rule learning. The behavior of various heuristics is analyzed by visualizing their dynamics in the coverage space (cf. Chap.​ 3).
Johannes Fürnkranz, Dragan Gamberger, Nada Lavrač
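A few of the standard rule evaluation heuristics surveyed in the rule learning literature can be written directly as functions of the coverage counts: `p` and `n` are the positive and negative examples covered by a rule, `P` and `N` the totals in the training set. The formulas below are the commonly cited textbook versions, given here for illustration.

```python
def precision(p, n):
    """Fraction of covered examples that are positive."""
    return p / (p + n) if p + n else 0.0

def laplace(p, n):
    """Laplace estimate: smooths precision toward 1/2."""
    return (p + 1) / (p + n + 2)

def m_estimate(p, n, P, N, m=2.0):
    """m-estimate: smooths precision toward the class prior P/(P+N)."""
    return (p + m * P / (P + N)) / (p + n + m)

def weighted_relative_accuracy(p, n, P, N):
    """WRA: trades off coverage against the gain in precision
    over the default precision P/(P+N)."""
    coverage = (p + n) / (P + N)
    return coverage * (precision(p, n) - P / (P + N))
```

Note that for a balanced class distribution (`P == N`) the m-estimate with `m = 2` coincides with the Laplace estimate, which is one way the coverage-space analysis relates these heuristics.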
Chapter 8. Learning Rule Sets
Abstract
The main shortcoming of algorithms for learning a single rule is the lack of expressiveness of a single conjunctive rule. In this chapter we present approaches that construct sets of rules. These sets can be constructed by iterative usage of the algorithms for constructing single rules presented in Chap. 6.
Johannes Fürnkranz, Dragan Gamberger, Nada Lavrač
Chapter 9. Pruning of Rules and Rule Sets
Abstract
Overfitting is a key problem for all learning algorithms. It describes the phenomenon that a very good fit of the learned rule set to the training data may result in inaccurate predictions on unseen data. Typically, this problem occurs when the hypothesis space is too large. Prepruning and post-pruning are two standard techniques for avoiding overfitting: prepruning deals with it during learning, while post-pruning addresses the problem after an overfitting rule set has been learned.
Johannes Fürnkranz, Dragan Gamberger, Nada Lavrač
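Post-pruning in the reduced-error-pruning style mentioned here can be sketched as follows. This is a hedged illustration, not the book's algorithm: rules are dicts of attribute-value tests, and conditions are greedily deleted from each rule as long as accuracy on a separate pruning set does not decrease.

```python
def covers(rule, example):
    return all(example.get(a) == v for a, v in rule.items())

def accuracy(rules, pos, neg):
    """Predict positive iff any rule covers the example."""
    tp = sum(any(covers(r, ex) for r in rules) for ex in pos)
    tn = sum(not any(covers(r, ex) for r in rules) for ex in neg)
    return (tp + tn) / (len(pos) + len(neg))

def post_prune(rules, prune_pos, prune_neg):
    """Greedily drop one condition at a time from each rule while
    accuracy on the pruning set does not decrease."""
    rules = [dict(r) for r in rules]
    for i, rule in enumerate(rules):
        improved = True
        while improved and len(rule) > 1:
            improved = False
            base = accuracy(rules, prune_pos, prune_neg)
            for attr in list(rule):
                trial = {a: v for a, v in rule.items() if a != attr}
                rules[i] = trial
                if accuracy(rules, prune_pos, prune_neg) >= base:
                    rule = trial        # keep the simpler rule
                    improved = True
                    break
                rules[i] = rule         # revert the deletion
    return rules
```

In the usage below, the overfitted condition `b = 1` is dropped because the shorter rule classifies the pruning set at least as accurately.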
Chapter 10. Beyond Concept Learning
Abstract
So far, we have mostly assumed a concept learning framework, where the learner's task is to learn a rule set describing the target concept from a set of positive and negative examples for this concept. In this chapter, we discuss approaches that extend this framework. We start with multiclass problems, which commonly occur in practice, and discuss the most popular methods for handling them: one-against-all classification and pairwise classification. We also discuss error-correcting output codes as a general framework for reducing multiclass problems to binary classification. As many prediction problems have complex, structured output variables, we also present label ranking and show how a generalization of pairwise classification can address this problem and related problems such as multilabel, hierarchical, and ordered classification. General ranking problems, in particular methods for optimizing the area under the ROC curve, are also addressed in this chapter. Finally, we briefly review rule learning approaches to regression and clustering.
Johannes Fürnkranz, Dragan Gamberger, Nada Lavrač
Chapter 11. Supervised Descriptive Rule Learning
Abstract
This chapter presents subgroup discovery (SD) and some of the related supervised descriptive rule induction techniques, including contrast set mining (CSM) and emerging pattern mining (EPM). These descriptive rule learning techniques are presented in a unifying framework named supervised descriptive rule learning. All these techniques aim at discovering patterns in the form of rules induced from labeled data. This chapter contributes to the understanding of these techniques by presenting a unified terminology and by explaining the apparent differences between the learning tasks as variants of a single supervised descriptive rule learning task. It also shows that the various rule learning heuristics used in CSM, EPM, and SD algorithms all aim at optimizing a trade-off between rule coverage and precision.
Johannes Fürnkranz, Dragan Gamberger, Nada Lavrač
Chapter 12. Selected Applications
Abstract
Rule learning is among the oldest machine learning techniques and has been used in numerous applications. When optimal prediction quality of the induced models is the primary goal, rule learning is not necessarily the best choice; for such applications one may often expect better results from complex models constructed by more modern machine learning approaches such as support vector machines or random forests. The key quality of rule learning has been – and still is – its inherent simplicity and the human understandability of the induced models and patterns. Research results clearly demonstrate that rule learning, together with decision tree learning, is an essential part of the knowledge discovery process.
Johannes Fürnkranz, Dragan Gamberger, Nada Lavrač
Backmatter
Metadata
Title
Foundations of Rule Learning
Authors
Johannes Fürnkranz
Dragan Gamberger
Nada Lavrač
Copyright Year
2012
Publisher
Springer Berlin Heidelberg
Electronic ISBN
978-3-540-75197-7
Print ISBN
978-3-540-75196-0
DOI
https://doi.org/10.1007/978-3-540-75197-7