
1997 | Book

Inductive Logic Programming

7th International Workshop, ILP-97, Prague, Czech Republic, September 17–20, 1997, Proceedings

Edited by: Nada Lavrač, Sašo Džeroski

Publisher: Springer Berlin Heidelberg

Book series: Lecture Notes in Computer Science


About this book

This book constitutes the refereed proceedings of the 7th International Workshop on Inductive Logic Programming, ILP-97, held in Prague, Czech Republic, in September 1997.
The volume presents revised versions of nine long papers and 17 short papers accepted after a thorough reviewing process. Also included are three invited papers by Usama Fayyad, Jean-François Puget, and Georg Gottlob. Among the topics addressed are various logic programming issues, natural language processing, speech processing, abductive learning, data mining, knowledge discovery, and relational database systems.

Table of Contents

Frontmatter
Knowledge discovery in databases: An overview
Abstract
Data Mining and Knowledge Discovery in Databases (KDD) promise to play an important role in the way people interact with databases, especially decision support databases where analysis and exploration operations are essential. Inductive logic programming can potentially play some key roles in KDD. This is an extended abstract for an invited talk at the conference. In the talk, we define the basic notions in data mining and KDD, define the goals, present motivation, and give a high-level definition of the KDD process and how it relates to data mining. We then focus on data mining methods. Basic coverage of a sampling of methods is provided to illustrate the methods and how they are used. We cover a case study of a successful application in science data analysis: the classification and cataloging of a major astronomy sky survey covering 2 billion objects in the northern sky. The system can outperform humans as well as classical computational analysis tools in astronomy on the task of recognizing faint stars and galaxies. We also cover the problem of scaling clustering to a large catalog database of billions of objects. We conclude with a listing of research challenges and outline areas where ILP could play some important roles in KDD.
Usama Fayyad
On the complexity of some Inductive Logic Programming problems
Abstract
The bounded ILP-consistency problem for function-free Horn clauses is described as follows. Given sets E+ and E− of function-free ground Horn clauses and an integer k polynomial in |E+| + |E−|, does there exist a function-free Horn clause C with no more than k literals such that C subsumes each element in E+ and C does not subsume any element in E−? It is shown that this problem is Σ₂ᴾ-complete (restated compactly below). We derive some related results on the complexity of ILP and discuss the usefulness of such complexity results.
Georg Gottlob, Nicola Leone, Francesco Scarcello
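For reference, the decision problem just described can be stated compactly; the following only restates what the abstract already says.

```latex
% Bounded ILP-consistency for function-free Horn clauses
\textbf{Instance:} finite sets $E^{+}$, $E^{-}$ of function-free ground Horn clauses,
and an integer $k$ polynomial in $|E^{+}| + |E^{-}|$.\\
\textbf{Question:} is there a function-free Horn clause $C$ with at most $k$ literals
such that $C$ subsumes every clause in $E^{+}$ and no clause in $E^{-}$?\\
\textbf{Result:} this problem is $\Sigma_{2}^{P}$-complete.
```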
Inductive logic programming and constraint logic programming (abstract)
Jean-François Puget
Learning phonetic rules in a speech recognition system
Abstract
Current speech recognition systems can be categorized into two broad classes: the knowledge-based approach and the stochastic one. In this paper we present a rule-based method for the recognition of Hungarian vowels. A spectrogram model was used as a front-end module, and some acoustic features (e.g. locations, intensities and shapes of local maxima) were extracted from the spectrograms using a genetic algorithm. On the basis of these features we developed a rule set for the recognition of isolated Hungarian vowels. These rules, represented by Prolog clauses, were refined by the IMPUT Inductive Logic Programming method.
Zoltán Alexin, János Csirik, Tibor Gyimóthy, Mark Jelasity, László Tóth
Cautious induction in inductive logic programming
Abstract
Many top-down Inductive Logic Programming systems use a greedy, covering approach to construct hypotheses. This paper presents an alternative, cautious approach, known as cautious induction. We conjecture that cautious induction can allow better hypotheses to be found, with respect to some hypothesis quality criteria. This conjecture is supported by the presentation of an algorithm called OILS, and by a complexity analysis and empirical comparison of OILS with the Progol system. The results are encouraging and demonstrate the applicability of cautious induction to problems with noisy datasets, and to problems which require large, complex hypotheses to be learnt.
Simon Anthony, Alan M. Frisch
Generating numerical literals during refinement
Abstract
Despite the rapid emergence and success of Inductive Logic Programming, problems still surround number handling: problems directly inherited from the choice of logic programs as the representation language. Our conjecture is that a generalisation of the representation language to Constraint Logic Programs provides an effective solution to this problem. We support this claim with the presentation of an algorithm called NUM, to which a top-down refinement operator can delegate the task of finding numerical literals. NUM can handle equations, inequations and disequations in a uniform way, and, furthermore, provides more generality than competing approaches since numerical literals are not required to cover all the positive examples available.
Simon Anthony, Alan M. Frisch
Lookahead and discretization in ILP
Abstract
We present and evaluate two methods for improving the performance of ILP systems. One of them is discretization of numerical attributes, based on Fayyad and Irani's work [9], but adapted and extended in such a way that it can cope with some aspects of discretization that only occur in relational learning problems (when indeterminate literals occur). The second technique is lookahead. It is a well-known problem in ILP that a learner cannot always assess the quality of a refinement without knowing which refinements will be enabled afterwards, i.e. without looking ahead in the refinement lattice. We present a simple method for specifying when lookahead is to be used, and what kind of lookahead is interesting. Both the discretization and lookahead techniques are evaluated experimentally. The results show that both techniques improve the quality of the induced theory, while computational costs are acceptable. (A minimal sketch of entropy-based discretization follows below.)
Hendrik Blockeel, Luc De Raedt
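For orientation only, here is a minimal Python sketch of recursive entropy-based discretization in the general style of Fayyad and Irani. It covers just the plain attribute-value case, uses a simple gain threshold in place of the MDL stopping rule, and includes none of the relational extensions (such as indeterminate literals) that the paper introduces.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_split(pairs):
    """pairs: sorted list of (value, label).
    Return (weighted_entropy, cut_point, left_pairs, right_pairs) or None."""
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue                      # no boundary between equal values
        left, right = pairs[:i], pairs[i:]
        score = (len(left) * entropy([l for _, l in left])
                 + len(right) * entropy([l for _, l in right])) / len(pairs)
        if best is None or score < best[0]:
            cut = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (score, cut, left, right)
    return best

def discretize(values, labels, min_gain=0.05):
    """Return a sorted list of cut points for a numeric attribute.
    A simple gain threshold stands in for Fayyad and Irani's MDL stopping rule."""
    def rec(pairs):
        split = best_split(pairs)
        if split is None:
            return []
        score, cut, left, right = split
        if entropy([l for _, l in pairs]) - score < min_gain:
            return []
        return rec(left) + [cut] + rec(right)
    return rec(sorted(zip(values, labels)))
```

For example, `discretize([1, 2, 3, 10, 11, 12], ['a', 'a', 'a', 'b', 'b', 'b'])` returns the single cut point `[6.5]`.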
Data mining via ILP: The application of Progol to a database of enantioseparations
Abstract
As far as this author is aware, this is the first paper to describe the application of Progol to enantioseparations. A scheme is proposed for data mining a relational database of published enantioseparations using Progol. The application of the scheme is described and a preliminary assessment of the usefulness of the resulting generalisations is made using their accuracy, size, ease of interpretation and chemical justification.
Christopher H. Bryant
Part-of-speech tagging using Progol
Abstract
A system for ‘tagging’ words with their part-of-speech (POS) tags is constructed. The system has two components: a lexicon containing the set of possible POS tags for a given word, and rules which use a word's context to eliminate possible tags for a word. The Inductive Logic Programming (ILP) system Progol is used to induce these rules in the form of definite clauses. The final theory contained 885 clauses. For background knowledge, Progol uses a simple grammar, where the tags are terminals and predicates such as nounp (noun phrase) are non-terminals. Progol was altered to allow the caching of information about clauses generated during the induction process, which greatly increased efficiency. The system achieved a per-word accuracy of 96.4% on known words drawn from sentences without quotation marks. This is on a par with other tagging systems induced from the same data [5, 2, 4], which all have accuracies in the range 96–97%. The per-sentence accuracy was 49.5%.
James Cussens
Maximum Entropy modeling with Clausal Constraints
Abstract
We present the learning system Maccent which addresses the novel task of stochastic MAximum ENTropy modeling with Clausal Constraints. The maximum entropy method is a Bayesian method based on the principle that the target stochastic model should be as uniform as possible, subject to known constraints. Maccent incorporates clausal constraints that are based on the evaluation of Prolog clauses in examples represented as Prolog programs. We build on an existing maximum-likelihood approach to maximum entropy modeling, which we upgrade along two dimensions: (1) Maccent can handle larger search spaces, due to a partial ordering defined on the space of clausal constraints, and (2) Maccent uses a richer first-order logic format. In comparison with other inductive logic programming systems, Maccent seems to be the first that explicitly constructs a conditional probability distribution p(C|I) based on an empirical distribution p̃(C|I), where p(C|I) and p̃(C|I) denote the induced and observed probability, respectively, of an instance I belonging to a class C (the general form of such a model is recalled below). First experiments indicate Maccent may be useful for prediction, and for classification in cases where the induced model should be combined with other stochastic information sources.
Luc Dehaspe
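As background for the abstract above, a conditional maximum entropy model with binary, clause-based features typically takes the following standard form; this is the generic formulation, not necessarily Maccent's exact one.

```latex
% Feature f_j(C, I) = 1 iff the j-th clausal constraint holds for class C on instance I.
p_\lambda(C \mid I) \;=\; \frac{1}{Z_\lambda(I)}\,
  \exp\Big(\sum_{j} \lambda_j\, f_j(C, I)\Big),
\qquad
Z_\lambda(I) \;=\; \sum_{C'} \exp\Big(\sum_{j} \lambda_j\, f_j(C', I)\Big),
% with the weights \lambda_j chosen so that each feature's expectation under the
% model matches its expectation under the empirical distribution \tilde{p}.
```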
Mining association rules in multiple relations
Abstract
The application of algorithms for efficiently generating association rules has so far been restricted to cases where the information is put together in a single relation. We describe how this restriction can be overcome through the combination of the available algorithms with standard techniques from the field of inductive logic programming. We present the system Warmr, which extends Apriori [2] to mine association rules in multiple relations (a schematic sketch of the level-wise search is given below). We apply Warmr to the natural language processing task of mining part-of-speech tagging rules in a large corpus of English, and discuss how the space of interesting association rules over multiple relations (ARMRs) can be further constrained.
Luc Dehaspe, Luc De Raedt
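A small, hypothetical Python sketch of the level-wise (Apriori-style) search over conjunctive queries that the abstract describes. The query representation and the coverage test are placeholders; Warmr itself evaluates Prolog queries under a declarative language bias, which is not reproduced here.

```python
from itertools import combinations

def frequent_queries(examples, literals, covers, min_support):
    """Level-wise search for frequent conjunctive queries over several relations.

    examples    -- identifiers of the objects the queries are about
    literals    -- candidate literals that may be conjoined into a query
    covers      -- covers(query, example) -> bool; in Warmr this would be
                   evaluation of a Prolog query against the example
    min_support -- minimal fraction of examples a query must cover
    """
    def support(query):
        return sum(covers(query, e) for e in examples) / len(examples)

    frequent = []
    level = [frozenset([lit]) for lit in literals]
    while level:
        kept = [q for q in level if support(q) >= min_support]
        frequent.extend(kept)
        prev = set(kept)
        # Candidate generation: extend each frequent query by one literal, then
        # prune candidates with an infrequent sub-query (Apriori-style pruning).
        candidates = {q | {lit} for q in kept for lit in literals if lit not in q}
        level = [c for c in candidates
                 if all(frozenset(sub) in prev
                        for sub in combinations(c, len(c) - 1))]
    return frequent
```

Association rules would then be read off pairs of frequent queries Q1 ⊂ Q2, with confidence support(Q2) / support(Q1).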
Using logical decision trees for clustering
Abstract
A novel first-order clustering system, called C 0.5, is presented. It inherits its logical decision tree formalism from the TILDE system, but instead of using class information to guide the search, it employs the principles of instance-based learning in order to perform clustering. Various experiments are discussed, which show the promise of the approach.
Luc De Raedt, Hendrik Blockeel
Induction of Slovene nominal paradigms
Abstract
The paper presents results of using FOIDL, an inductive logic programming system, to learn the inflectional paradigms of Slovene nouns. FOIDL learns first-order decision lists, defined as ordered lists of clauses; it has previously been tested on the problem of inducing rules for forming the past tense of English verbs. Slovene, unlike English, has rich inflectional morphology, and the paper reports the results of applying FOIDL over a large lexicon of Slovene word-forms to induce rules for the synthesis and analysis of the full inflectional paradigms of Slovene nouns.
Sašo Džeroski, Tomaž Erjavec
Normal forms for inductive logic programming
Abstract
In this paper we study induction of unrestricted clausal theories from interpretations. First, we show that in the propositional case induction from complete evidence can be seen as an equivalence-preserving transformation from DNF to CNF. From this we conclude that induction is essentially a process of determining what is false in the domain of discourse. We then proceed by investigating dual normal forms for evidence and hypotheses in predicate logic. We define evidence normal form (ENF), which is Skolemised existential DNF under a Consistent Naming Assumption. Because ENF is incomplete, in the sense that it does not have the expressive power of clausal logic, ENF evidence requires the identification of Skolem terms. The approach is partly implemented in the Primus system.
Peter A. Flach
On a sufficient condition for the existence of most specific hypothesis in Progol
Abstract
In this paper, we give a sufficient condition for the existence of the most specific hypothesis (MSH) in Progol. Muggleton [2] showed that for any first order theory (background knowledge) B and a single clause (a positive example) E, there exists the most specific hypothesis bot(B, E) which satisfies B ∧ ¬E ⊨ ¬bot(B, E), and that any hypothesis H satisfying B ∧ H ⊨ E entails bot(B, E), assuming that hypotheses are all single clauses (this property is restated in symbols below). Yamamoto [8] gave a counterexample and indicated that Muggleton's proof contains an error. He also gave a sufficient condition under which the MSH exists. In this paper, we give another and more realistic sufficient condition to guarantee the existence of the MSH.
Koichi Furukawa, Tomoko Murakami, Ken Ueno, Tomonobu Ozaki, Keiko Shimazu
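In symbols, the claimed property of the most specific hypothesis bot(B, E) reads as follows; this restates the abstract, with the caveat (also from the abstract) that the second implication fails in general and holds only under suitable conditions on B and E.

```latex
B \land \lnot E \;\models\; \lnot \mathit{bot}(B, E),
\qquad\text{and, for any single-clause hypothesis } H,\qquad
B \land H \models E \;\Longrightarrow\; H \models \mathit{bot}(B, E).
```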
Induction of logic programs with more than one recursive clause by analyzing saturations
Abstract
This paper describes a bottom-up ILP algorithm called MRI, which induces recursive programs with one or more recursive clauses from a few examples. It analyzes saturations using path structures, which express streams of terms processed by predicates and were originally introduced by Idestam-Almquist. We introduce extension and difference of path structures. Recursive clauses can be expressed as a difference among path structures. The paper also shows experimental results.
Mitsue Furusawa, Nobuhiro Inuzuka, Hirohisa Seki, Hidenori Itoh
A logical framework for graph theoretical decision tree learning
Abstract
We present a logical approach to graph theoretical learning that is based on using alphabetic substitutions for modelling graph morphisms. A classified graph is represented by a definite clause that possesses variables of the sort node for representing nodes and atoms for representing the edges. In contrast to the standard logical semantics, different node variables are assumed to denote different objects. The use of an alphabetical subsumption relation (α-subsumption) implies that the least generalization of clauses (α-generalization) has different properties than Plotkin's least generalization (lgg). We present a method for constructing optimal α-generalizations from Plotkin's least generalization. The developed framework is used in the relational decision tree algorithm TRITOP.
Peter Geibel, Fritz Wysotzki
Learning with abduction
Abstract
We investigate how abduction and induction can be integrated into a common learning framework through the notion of Abductive Concept Learning (ACL). ACL is an extension of Inductive Logic Programming (ILP) to the case in which both the background and the target theory are abductive logic programs and where an abductive notion of entailment is used as the coverage relation. In this framework, it is then possible to learn with incomplete information about the examples by exploiting the hypothetical reasoning of abduction. The paper presents the basic framework of ACL with its main characteristics. An algorithm for an intermediate version of ACL is developed by suitably extending the top-down ILP method and integrating this with an abductive proof procedure for Abductive Logic Programming (ALP). A prototype system has been developed and applied to learning problems with incomplete information.
A. C. Kakas, F. Riguzzi
Systematic Predicate Invention in Inductive Logic Programming
Abstract
We propose in this paper a new approach for learning predicate definitions from examples and from an initial theory. The particularity of this approach consists in inventing both a new predicate symbol and a specification for this predicate at most steps of the learning process. The specifications that are built are incomplete and imprecise, which is modelled by introducing the notion of σ-interpretation. At the end of the learning task, some invented predicates are removed by unfolding techniques. The remaining predicates either enable the learned program to be simplified, or are defined by recursive programs; in the second case, the program could not have been learned without inventing these predicates. The method has been implemented in a system, called SPILP, which has been successfully tested for inventing predicates which simplify the learned programs as well as for inventing recursively defined predicates. Let us point out that the introduction of σ-interpretations gives us a general framework for dealing with imprecise specifications and that SPILP can work even when the target concepts are also incompletely defined by σ-interpretations.
Lionel Martin, Christel Vrain
Learning programs in the event calculus
Abstract
The event calculus is a formalism for reasoning about actions and change in dynamic systems. It has been used in diverse areas including planning and communications protocol specification. Writing event calculus programs requires the construction of domain specific axioms (DSAs), a programming task which is non-trivial and one that hinders the broader use of the event calculus. This work demonstrates that such axioms can be learned from temporal observations using Inductive Logic Programming (ILP) techniques, in particular theory completion. The theory of logical back-propagation as a mechanism for theory completion is described, and its implementation in the ILP system Progol is used here. These techniques were used to investigate learning DSAs for the traditional AI blocks world. In the experiments Progol, utilising logical back-propagation, learned correct DSAs. These results provide encouragement and highlight the possibility of discovering causal relationships from data in temporal databases, as well as learning the domain specific knowledge necessary in the development of plans.
Stephen Moyle, Stephen Muggleton
Distance between Herbrand interpretations: A measure for approximations to a target concept
Abstract
We can use a metric to measure the differences between elements in a domain or subsets of that domain (i.e. concepts). Which particular metric should be chosen depends on the kind of difference we want to measure. The well-known Euclidean metric on ℝⁿ and its generalizations are often used for this purpose, but such metrics are not always suitable for concepts whose elements have some structure different from real numbers. For example, in (Inductive) Logic Programming a concept is often expressed as an Herbrand interpretation of some first-order language. Every element in an Herbrand interpretation is a ground atom, which has a tree structure. We start by defining a metric d on the set of expressions (ground atoms and ground terms), motivated by the structure and complexity of the expressions and the symbols used therein. This metric induces the Hausdorff metric h on the set of all sets of ground atoms, which allows us to measure the distance between Herbrand interpretations (a minimal sketch of this two-level construction is given below). We then give some necessary and some sufficient conditions for an upper bound of h between two given Herbrand interpretations, by considering the elements in their symmetric difference.
Shan-Hwei Nienhuys-Cheng
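A small, hypothetical Python sketch of the two-level construction described above: a distance on ground expressions that respects their tree structure, lifted to finite sets of ground atoms by the Hausdorff construction. The particular term distance used here (a simple damped recursive scheme) is only illustrative and is not claimed to be the metric d defined in the paper.

```python
def term_distance(s, t):
    """Distance between two ground terms/atoms, represented as
    ('functor', [arg, ...]) tuples, with constants as ('c', []).
    Illustrative scheme: identical symbols recurse with damping, otherwise 1."""
    (f, s_args), (g, t_args) = s, t
    if f != g or len(s_args) != len(t_args):
        return 1.0
    if not s_args:
        return 0.0
    n = len(s_args)
    return sum(term_distance(a, b) for a, b in zip(s_args, t_args)) / (2 * n)

def hausdorff(A, B, d):
    """Hausdorff distance between two finite non-empty sets under point distance d."""
    def directed(X, Y):
        return max(min(d(x, y) for y in Y) for x in X)
    return max(directed(A, B), directed(B, A))

# Two tiny Herbrand interpretations over the atoms p/1 and q/1
I1 = {('p', [('a', [])]), ('q', [('a', [])])}
I2 = {('p', [('b', [])]), ('q', [('a', [])])}
print(hausdorff(I1, I2, term_distance))   # 0.5: p(a) vs p(b) differ in one argument
```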
Realizing Progol by forward reasoning
Abstract
The current implementation of the coverage check for evaluating candidate hypotheses in the A*-like search in Progol is based on backward reasoning of Prolog, but it contains some kinds of redundancy. In this paper, we propose an alternative algorithm based on forward reasoning of an extended MGTP (Model Generation Theorem Prover). Since this alternative can remove the redundant computations, we can expect to realize a more efficient search process.
Tomonobu Ozaki, Koichi Furukawa, Tomoko Murakami, Ken Ueno
Probabilistic first-order classification
Abstract
We discuss the problem of classification using first-order hypotheses. This paper proposes an enhancement of classification based on the naive Bayesian scheme that is able to overcome the conditional independence assumption (recalled for reference below). Several experiments were conducted, involving artificial and real-world domains, both propositional and relational. The results indicate that the classification performance of propositional learners is reached when the richer first-order knowledge representation is not mandatory; this holds also in the domains where such representation is more convenient. Our framework can also benefit from the use of hypotheses describing negative information, in which case the classification becomes more noise resistant.
Uroš Pompe, Igor Kononenko
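For reference, the conditional independence assumption referred to above is the one underlying the standard naive Bayesian classifier; the paper's first-order enhancement itself is not reproduced here.

```latex
% Naive Bayes: attributes a_1, ..., a_n are assumed conditionally independent
% given the class c, so classification uses
P(c \mid a_1, \dots, a_n) \;\propto\; P(c)\,\prod_{i=1}^{n} P(a_i \mid c).
```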
Learning Horn definitions with equivalence and membership queries
Abstract
A Horn definition is a set of Horn clauses with the same head literal. In this paper, we consider learning non-recursive, function-free first-order Horn definitions. We show that this class is exactly learnable from equivalence and membership queries. It follows then that this class is PAC learnable using examples and membership queries. Our results have been shown to be applicable to learning efficient goal-decomposition rules in planning domains.
Chandra Reddy, Prasad Tadepalli
Using abstraction schemata in inductive logic programming
Abstract
We consider inductive logic programming guided by a language bias called abstraction schemata which enables us to specify a partial structure for a target program. Specifically, to improve the efficiency of such learning, we discuss a class of programs for which it is possible to devise a learning algorithm capable of identifying and pruning unpromising uses of the schemata. This identification process includes the bias shift problem: how to decide whether a hypothesis space contains no correct program with respect to a given example specification. For solving this problem, a required property of hypothesis spaces is discovered. This result yields a class of programs that are beyond the representational capabilities of previous approaches — most notably, non-trivial programs with local variables.
Ken Sadohara, Makoto Haraguchi
Distance induction in first order logic
Abstract
A distance on the problem domain allows one to tackle some typical goals of machine learning, e.g. classification or conceptual clustering, via robust data analysis algorithms (e.g. k-nearest neighbors or k-means).
A method for building a distance on first-order logic domains is presented in this paper. The distance is constructed from examples expressed as definite or constrained clauses, via a two-step process: a set of d hypotheses is first learnt from the training examples. These hypotheses serve as new descriptors of the problem domain: they induce a mapping π from the problem domain onto the space of integers ℕ^d. The distance between any two examples E and F is finally defined as the Euclidean distance between π(E) and π(F). The granularity of this hypothesis-driven distance (HDD) is controlled via the user-supplied parameter d.
The relevance of a HDD is evaluated from the predictive accuracy of the k-NN classifier based on this distance (a minimal sketch of this scheme is given below). Preliminary experiments demonstrate the potential of distance induction in terms of predictive accuracy, computational cost, and tolerance to noise.
Michèle Sebag
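A minimal Python sketch of the hypothesis-driven distance described above, under the simplifying assumption that each of the d learnt hypotheses acts as a boolean coverage test on an example. The names and the 0/1 embedding are hypothetical; the actual system learns its hypotheses from definite or constrained clauses rather than taking them as given.

```python
import math
from collections import Counter

def hdd_embedding(example, hypotheses):
    """Map an example into N^d: coordinate i is 1 if hypothesis i covers it.
    `hypotheses` is a list of boolean predicates standing in for learnt clauses."""
    return [int(h(example)) for h in hypotheses]

def hdd_distance(e1, e2, hypotheses):
    """Euclidean distance between the embeddings pi(e1) and pi(e2)."""
    v1, v2 = hdd_embedding(e1, hypotheses), hdd_embedding(e2, hypotheses)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(v1, v2)))

def knn_classify(example, training_set, hypotheses, k=3):
    """k-nearest-neighbour classification with the hypothesis-driven distance.
    `training_set` is a list of (example, label) pairs."""
    neighbours = sorted(training_set,
                        key=lambda pair: hdd_distance(example, pair[0], hypotheses))[:k]
    return Counter(label for _, label in neighbours).most_common(1)[0][0]
```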
Carcinogenesis predictions using ILP
Abstract
Obtaining accurate structural alerts for the causes of chemical cancers is a problem of great scientific and humanitarian value. This paper follows up on earlier research that demonstrated the use of Inductive Logic Programming (ILP) for prediction on the related problem of mutagenic activity amongst nitroaromatic molecules. Here we are concerned with predicting carcinogenic activity in rodent bioassays using data from the U.S. National Toxicology Program conducted by the National Institute of Environmental Health Sciences. The 330 chemicals used here are significantly more diverse than those in the previous study, and form the basis for obtaining Structure-Activity Relationships (SARs) relating molecular structure to cancerous activity in rodents. We describe the use of the ILP system Progol to obtain SARs from this data. The rules obtained from Progol are comparable in accuracy to those from expert chemists, and more accurate than most state-of-the-art toxicity prediction methods. The rules can also be interpreted to give clues about the biological and chemical mechanisms of carcinogenesis, and make use of those learnt by Progol for mutagenesis. Finally, we present details of, and predictions for, an ongoing international blind trial aimed specifically at comparing prediction methods. This trial provides ILP algorithms an opportunity to participate at the leading edge of scientific discovery.
A. Srinivasan, R. D. King, S. H. Muggleton, M. J. E. Sternberg
Discovery of first-order regularities in a relational database using offline candidate determination
Abstract
In this paper, we present an algorithm for the discovery of first-order clauses holding in a relational database, in the framework of the nonmonotonic ILP setting [1]. The algorithm adopts the principle of the offline candidate determination algorithm used for mining association rules in large transaction databases [4]. Analogous to the measures used in mining association rules, we define a support and a confidence measure as acceptance criteria for discovered hypothesis clauses (one plausible reading of these measures is sketched below).
The algorithm has been implemented in C with an interface to the relational database management system INGRES. We present and discuss the results of an experiment in the KRK domain.
Irene Weber
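By analogy with association-rule mining, the support and confidence of a discovered clause can be defined as follows; this is one plausible reading of the measures named in the abstract, not necessarily the paper's exact definitions.

```latex
% For a clause  H \leftarrow B  evaluated over the substitutions \theta that ground
% its variables against the database:
\mathrm{support}(H \leftarrow B) =
  \frac{\bigl|\{\theta : B\theta \text{ and } H\theta \text{ hold in the database}\}\bigr|}
       {\bigl|\{\theta\}\bigr|},
\qquad
\mathrm{confidence}(H \leftarrow B) =
  \frac{\bigl|\{\theta : B\theta \text{ and } H\theta \text{ hold}\}\bigr|}
       {\bigl|\{\theta : B\theta \text{ holds}\}\bigr|}.
```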
Which hypotheses can be found with inverse entailment?
Abstract
In this paper we give a completeness theorem for the inductive inference rule inverse entailment, proposed by Muggleton. Our main result is that a hypothesis clause H can be derived from an example E under a background theory B with inverse entailment iff H subsumes E relative to B in Plotkin's sense (stated in symbols below). The theory B can be any clausal theory, and the example E can be any clause which is neither a tautology nor implied by B. The derived hypothesis H is a clause which is not always definite. In order to prove the result we give a declarative semantics for arbitrary consistent clausal theories, and show that SB-resolution, which was originally introduced by Plotkin, is a complete procedural semantics. The completeness is shown as an extension of the completeness theorem of SLD-resolution. We also show that every hypothesis H derived with saturant generalization, proposed by Rouveirol, must subsume E w.r.t. B in Buntine's sense. Moreover, we show that saturant generalization can be obtained from inverse entailment by imposing some restriction on it.
Akihiro Yamamoto
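The main result can be written as a single equivalence; the statement below only repeats the abstract in symbols.

```latex
% B: any clausal theory; E: any clause that is neither a tautology nor implied by B.
H \text{ is derivable from } E \text{ under } B \text{ by inverse entailment}
\;\Longleftrightarrow\;
H \text{ subsumes } E \text{ relative to } B \text{ (in Plotkin's sense)}.
```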
Backmatter
Metadata
Title
Inductive Logic Programming
Edited by
Nada Lavrač
Sašo Džeroski
Copyright year
1997
Publisher
Springer Berlin Heidelberg
Electronic ISBN
978-3-540-69587-5
Print ISBN
978-3-540-63514-7
DOI
https://doi.org/10.1007/3-540-63514-9