
2008 | Book


Rule Extraction from Support Vector Machines

Edited by: Joachim Diederich

Publisher: Springer Berlin Heidelberg

Book series: Studies in Computational Intelligence

Included in: Springer Professional "Wirtschaft+Technik", Springer Professional "Technik", Springer Professional "Wirtschaft"


About this book

Support vector machines (SVMs) are one of the most active research areas in machine learning. SVMs have shown good performance in a number of applications, including text and image classification. However, the learning capability of SVMs comes at a cost: an inherent inability to explain, in a comprehensible form, the process by which a learning result was reached. Hence, the situation is similar to neural networks, where the apparent lack of an explanation capability has led to various approaches aiming at extracting symbolic rules from neural networks. For SVMs to gain a wider degree of acceptance in fields such as medical diagnosis and security-sensitive areas, it is desirable to offer an explanation capability. User explanation is often a legal requirement, because it is necessary to explain how a decision was reached or why it was made. This book provides an overview of the field and introduces a number of different approaches to extracting rules from support vector machines developed by key researchers. In addition, successful applications are outlined and future research opportunities are discussed. The book is an important reference for researchers and graduate students, and since it provides an introduction to the topic, it will be important in the classroom as well. Because of the significance of both SVMs and user explanation, the book is of relevance to data mining practitioners and data analysts.

Table of contents

Frontmatter

Introduction

Rule Extraction from Support Vector Machines: An Introduction

Rule extraction from support vector machines (SVMs) follows in the footsteps of the earlier effort to obtain human-comprehensible rules from artificial neural networks (ANNs) in order to explain “how” a decision was made or “why” a certain result was achieved. Hence, much of the motivation for the field of rule extraction from support vector machines carries over from the now established area of rule extraction from neural networks. This introduction aims at outlining the significance of extracting rules from SVMs and it will investigate in detail what it means to explain the decision-making process of a machine learning system to a human user who may not be an expert on artificial intelligence or the particular application domain. It is natural to refer to both psychology and philosophy in this context because “explanation” refers to the human mind and its effort to understand the world; the traditional area of philosophical endeavours. Hence, the foundations of current efforts to simulate human explanatory reasoning are discussed as are current limitations and opportunities for rule extraction from support vector machines.

Joachim Diederich

Rule Extraction from Support Vector Machines: An Overview of Issues and Application in Credit Scoring

Summary

Innovative storage technology and the rising popularity of the Internet have generated an ever-growing amount of data. In this vast amount of data much valuable knowledge is available, yet it is hidden. The Support Vector Machine (SVM) is a state-of-the-art classification technique that generally provides accurate models, as it is able to capture non-linearities in the data. However, this strength is also its main weakness, as the generated non-linear models are typically regarded as incomprehensible black-box models. By extracting rules that mimic the black box as closely as possible, we can provide some insight into the logic of the SVM model. This explanation capability is of crucial importance in any domain where the model needs to be validated before being implemented, such as in credit scoring (loan default prediction) and medical diagnosis. If the SVM is regarded as the current state-of-the-art, SVM rule extraction can be the state-of-the-art of the (near) future. This chapter provides an overview of recently proposed SVM rule extraction techniques, complemented with the pedagogical Artificial Neural Network (ANN) rule extraction techniques which are also suitable for SVMs. Issues related to this topic are the different rule outputs and corresponding rule expressiveness; the focus on high-dimensional data, as SVM models typically perform well on such data; and the requirement that the extracted rules are in line with existing domain knowledge. These issues are explained and further illustrated with a credit scoring case, where we extract a Trepan tree and a RIPPER rule set from the generated SVM model. The benefit of decision tables in a rule extraction context is also demonstrated. Finally, some interesting alternatives for SVM rule extraction are listed.
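
The pedagogical idea mentioned in this abstract, treating the trained model as an oracle and fitting rules to its predictions, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `black_box` is a hypothetical stand-in for a trained SVM's prediction function, the two features are invented, and the rule learner is a single-threshold stump rather than Trepan or RIPPER. Fidelity is measured as the share of points on which the rule agrees with the black box.

```python
import random

def black_box(x):
    """Hypothetical stand-in for a trained SVM's predict(); non-linear boundary."""
    income, debt = x
    return 1 if income - 0.5 * debt * debt > 0.3 else 0

def fit_stump(X, y):
    """Find the single-feature threshold rule that best reproduces labels y."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({x[j] for x in X}):
            for sign in (1, 0):  # label assigned when x[j] > t
                pred = [sign if x[j] > t else 1 - sign for x in X]
                agree = sum(p == yi for p, yi in zip(pred, y))
                if best is None or agree > best[0]:
                    best = (agree, j, t, sign)
    return best[1:]  # (feature index, threshold, label above threshold)

def apply_rule(rule, x):
    j, t, sign = rule
    return sign if x[j] > t else 1 - sign

rng = random.Random(0)
X = [(rng.random(), rng.random()) for _ in range(200)]
y_model = [black_box(x) for x in X]  # labels come from the model, not ground truth
rule = fit_stump(X, y_model)
fidelity = sum(apply_rule(rule, x) == black_box(x) for x in X) / len(X)
print(f"IF x[{rule[0]}] > {rule[1]:.3f} THEN class {rule[2]}; fidelity = {fidelity:.2f}")
```

The key design point is that the rule learner never sees the original class labels, only the black box's outputs, so the extracted rule describes the model rather than the data.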

David Martens, Johan Huysmans, Rudy Setiono, Jan Vanthienen, Bart Baesens

Algorithms and Techniques

Rule Extraction for Transfer Learning

Summary

This chapter discusses transfer learning, which is one practical application of rule extraction. In transfer learning, information from one learning experience is applied to speed up learning in a related task. The chapter describes several techniques for transfer learning in SVM-based reinforcement learning, and shows results from a case study.

Lisa Torrey, Jude Shavlik, Trevor Walker, Richard Maclin

Rule Extraction from Linear Support Vector Machines via Mathematical Programming

Summary

We describe an algorithm for converting linear support vector machines (SVMs) and other arbitrary hyperplane-based linear classifiers into a set of nonoverlapping rules that, unlike the original classifier, can be easily interpreted by humans.

Each iteration of the rule extraction algorithm is formulated as a constrained optimization problem that is computationally inexpensive to solve. We discuss various properties of the algorithm and provide proof of convergence for two different optimization criteria. We demonstrate the performance and the speed of the algorithm on linear classifiers learned from real-world datasets, including a medical dataset on detection of lung cancer from medical images.

The ability to convert SVMs and other “black-box” classifiers into a set of human-understandable rules is critical not only for physician acceptance, but also for reducing the regulatory barrier for medical decision-support systems based on such classifiers.

We also present some variations and extensions of the proposed mathematical programming formulations for rule extraction.
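
As a small worked illustration of what it means for a rule to agree with a linear classifier (not the chapter's algorithm), assume a rule is an axis-aligned box of feature ranges and the classifier predicts the positive class when w·x >= gamma. The box lies entirely on the positive side exactly when the smallest value of w·x over the box is at least gamma, and for a linear function that minimum sits at the corner that takes the lower bound wherever the weight is non-negative and the upper bound otherwise. The weights, threshold, and boxes below are hypothetical.

```python
def min_score_over_box(w, lower, upper):
    """Smallest value of w.x over the axis-aligned box [lower, upper]."""
    return sum(wi * (lo if wi >= 0 else hi) for wi, lo, hi in zip(w, lower, upper))

def box_is_pure_positive(w, gamma, lower, upper):
    """True if every point in the box satisfies w.x >= gamma."""
    return min_score_over_box(w, lower, upper) >= gamma

# Hypothetical linear classifier: predict positive when 2*x1 - x2 >= 0.5.
w, gamma = [2.0, -1.0], 0.5

# Candidate rule: IF 0.75 <= x1 <= 1 AND 0 <= x2 <= 1 THEN positive.
tight_rule_ok = box_is_pure_positive(w, gamma, [0.75, 0.0], [1.0, 1.0])

# A wider box on x1 crosses the hyperplane, so it is not a valid rule.
wide_rule_ok = box_is_pure_positive(w, gamma, [0.5, 0.0], [1.0, 1.0])

print(tight_rule_ok, wide_rule_ok)
```

Checking a single corner instead of every point is what keeps this kind of test cheap, whatever the number of features.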

Glenn Fung, Sathyakama Sandilya, R. Bharat Rao

Rule Extraction Based on Support and Prototype Vectors

The support vector machine (SVM) is a modelling technique based on statistical learning theory (Cortes and Vapnik 1995; Cristianini and Shawe-Taylor 2000; Vapnik 1998), which was initially applied successfully to classification problems and later extended in different domains to other kinds of problems such as regression or novelty detection. As a learning tool, it has demonstrated its strength especially in cases where only a small data set is at hand and/or when the input space is of high dimensionality. Nevertheless, a possible limitation of SVMs is that, as in the case of neural networks, they are only able to generate results in the form of black-box models; that is, the solution they provide is difficult for the user to interpret.

Haydemar Núñez, Cecilio Angulo, Andreu Català

SVMT-Rule: Association Rule Mining Over SVM Classification Trees

Since support vector machines (SVM) [7–9] demonstrate good accuracy in classification and regression, a procedure for rule extraction from a trained SVM (SVM-Rule) is important for data mining and knowledge discovery [1–6, 29, 31]. However, the rules obtained from SVM-Rule are in practice less comprehensible than expected, because a large number of incomprehensible numerical parameters (i.e., support vectors) turn up in those rules. Compared to SVM-Rule, the decision tree is a simple but very efficient rule extraction method in terms of comprehensibility [33]. The rules obtained from a decision tree may not be as accurate as SVM rules, but they are easy to comprehend because every rule represents one decision path that is traceable in the decision tree.

Shaoning Pang, Nik Kasabov

Prototype Rules from SVM

Summary

Prototype-based rules (P-rules) are an alternative to crisp and fuzzy rules; moreover, they can be seen as a generalization of different forms of knowledge representation. In P-rules, knowledge is represented as a set of reference vectors that may be derived from the SVM model.

The number of support vectors (SV) should be reduced to a minimal number that still preserves SVM generalization abilities. Several state-of-the-art methods that reduce the number of support vectors are compared with a new approach, taking into consideration possible interpretation of retained support vectors as the basis for P-rules.
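
To make the notion of knowledge represented as reference vectors concrete, here is a minimal sketch of one common form of prototype rule, the nearest-prototype rule. It assumes Euclidean distance as the similarity measure and uses two invented prototypes; how the chapter selects prototypes from the support vectors is not shown.

```python
import math

# Hypothetical prototypes: (reference vector, class label).
prototypes = [((1.0, 1.0), "A"), ((4.0, 4.0), "B")]

def p_rule_classify(x, prototypes):
    """Nearest-prototype rule: IF x is closest to prototype p THEN class(p)."""
    _, label = min(prototypes, key=lambda p: math.dist(x, p[0]))
    return label

label_near_a = p_rule_classify((1.5, 0.5), prototypes)
label_near_b = p_rule_classify((3.0, 4.5), prototypes)
print(label_near_a, label_near_b)
```

With only a handful of prototypes, each prediction can be explained by pointing at the single reference case it most resembles, which is why reducing the number of retained vectors matters for interpretability.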

Marcin Blachnik, Włodzisław Duch

Applications

Prediction of First-Day Returns of Initial Public Offering in the US Stock Market Using Rule Extraction from Support Vector Machines

Summary

Artificial neural networks (ANNs) and support vector machines have successfully improved the quality of predicting share movements relative to their statistically based counterparts. However, it has not been feasible to gain insight into the reasons why a certain prediction is made. Due to this limitation, the use of machine learning techniques in the capital market has met a critical hurdle. This chapter outlines a method based on pedagogical learning for extracting rules from support vector machines. To the best of our knowledge, the experiments reported here are the first attempt to utilize learning-based rule extraction from support vector machines for financial data mining.

The experiments use predictions from support vector machines for extracting rules associated with the first-day returns of “initial public offerings” (IPOs) in the US stock market. A novel feature of the experiments is the simultaneous application of fundamental and technical analysis in the context of predicting the success of IPOs. Cross-industry IPOs covering the period from 1974 to 1984 and software and services IPOs launched between 1996 and 2000 are utilized.

Rolf Mitsdorffer, Joachim Diederich

Accent in Speech Samples: Support Vector Machines for Classification and Rule Extraction

Accent is the pattern of pronunciation which can identify a person’s linguistic, social or cultural background. It is an important source of inter-speaker variability and a particular problem for automated speech recognition. This study aims to investigate the effectiveness of rule extraction from support vector machines for speech accent classification. The presence of a speaker’s accent in the speech signal has significant implications for the accuracy of speech recognition because the effectiveness of an Automatic Speech Recognition (ASR) system is greatly reduced when the particular accent or dialect in the speech samples on which it is trained differs from the accent or dialect of the end-user [4] [14]. The correct identification of a speaker’s accent, and the subsequent use of the appropriately trained system, can be used to improve the efficiency and accuracy of the ASR application. If used in automated telephone helplines, analysing accent and then directing callers to the appropriately-accented response system may improve customer comfort and understanding. The increasing use of speech recognition technology in modern applications by people with a wide variety of linguistic and cultural backgrounds means that addressing accent-related variability in speech is an important area of ongoing research. Rule extraction in this context can aid in the refinement of the design of a successful classifier, by discovering the contribution of the various input features, as well as by facilitating the comparison of the results with other machine learning methods.

Carol Pedersen, Joachim Diederich

Rule Extraction from SVM for Protein Structure Prediction

Summary

In recent years, much research has focused on improving the accuracy of protein structure prediction, and many significant results have been achieved. However, the existing methods lack the ability to explain how a learning result is reached and why a prediction decision is made. The explanation of a decision is important for the acceptance of machine learning technology in bioinformatics applications such as protein structure prediction. Support vector machines (SVMs) have shown better performance than most traditional machine learning approaches in a variety of application areas. However, SVMs are still black-box models. They do not produce comprehensible models that account for the predictions they make. To overcome this limitation, in this chapter, we present two new approaches to rule generation for understanding protein structure prediction. Based on the strong generalization ability of the SVM and the interpretability of the decision tree, one approach combines SVMs with decision trees into a new algorithm called SVM_DT. Another method combines SVMs with an association rule (AR) based scheme called SVM_PCPAR. We also provide a method of rule aggregation that uses conceptual clustering to produce super rules from a large number of rules. The results of the experiments for protein structure prediction show not only that the comprehensibility of SVM_DT and SVM_PCPAR is much better than that of SVMs, but also that the test accuracy of these rules is comparable. We believe that SVM_DT and SVM_PCPAR can be used both for protein structure prediction and for understanding the prediction. The prediction and its interpretation can be used for guiding biological experiments.

Jieyue He, Hae-jin Hu, Bernard Chen, Phang C. Tai, Rob Harrison, Yi Pan

Backmatter

Title: Rule Extraction from Support Vector Machines
Edited by: Joachim Diederich
Publisher: Springer Berlin Heidelberg
Electronic ISBN: 978-3-540-75390-2
Print ISBN: 978-3-540-75389-6
DOI: https://doi.org/10.1007/978-3-540-75390-2
