ABSTRACT
We describe an algorithm for converting linear support vector machines and other hyperplane-based linear classifiers into a set of non-overlapping rules that, unlike the original classifier, can be easily interpreted by humans. Each iteration of the rule extraction algorithm is formulated as a constrained optimization problem that is computationally inexpensive to solve. We discuss various properties of the algorithm and provide proofs of convergence for two different optimization criteria. We demonstrate the performance and the speed of the algorithm on linear classifiers learned from real-world datasets, including a medical dataset on detection of lung cancer from medical images. The ability to convert SVMs and other "black-box" classifiers into a set of human-understandable rules is critical not only for physician acceptance, but also for reducing the regulatory barrier for medical decision-support systems based on such classifiers.
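To make the idea of one rule-extraction step concrete, here is a minimal sketch of how a single axis-aligned rule could be carved out of one side of a hyperplane. It is an illustration under stated assumptions, not the paper's formulation: the volume-maximization criterion, the requirement that all weights be strictly positive, and the names `max_volume_rule` and `rule_to_text` are all assumptions introduced here. Under these assumptions the constrained problem has a closed-form solution, and the iteration over the remaining uncovered region is not shown.

```python
def max_volume_rule(w, gamma, lower):
    """Upper corner of the largest-volume box [lower, upper] inside w.x <= gamma.

    Assumptions (illustrative only): every w[i] > 0 and w.lower < gamma.
    With positive weights, w.x is largest at the upper corner of the box,
    so placing that corner on the hyperplane keeps the whole box on one side.
    """
    if any(wi <= 0 for wi in w):
        raise ValueError("this sketch assumes strictly positive weights")
    slack = gamma - sum(wi * li for wi, li in zip(w, lower))
    if slack <= 0:
        raise ValueError("the lower corner must lie strictly inside the half-space")
    n = len(w)
    # Maximizing prod(upper[i] - lower[i]) subject to w.upper = gamma gives,
    # from the Lagrange conditions, w[i] * (upper[i] - lower[i]) = slack / n.
    return [li + slack / (n * wi) for wi, li in zip(w, lower)]


def rule_to_text(lower, upper, label):
    """Render a box as a human-readable IF-THEN rule."""
    conditions = [
        f"{lo:g} <= x{i + 1} <= {hi:g}"
        for i, (lo, hi) in enumerate(zip(lower, upper))
    ]
    return "IF " + " AND ".join(conditions) + f" THEN class = {label}"


if __name__ == "__main__":
    w, gamma, lower = [1.0, 2.0], 2.0, [0.0, 0.0]
    upper = max_volume_rule(w, gamma, lower)
    print(rule_to_text(lower, upper, -1))
```

For the hypothetical classifier `x1 + 2*x2 <= 2` with the box anchored at the origin, the extracted rule covers `0 <= x1 <= 1` and `0 <= x2 <= 0.5`. A complete procedure would repeat such a step on the regions the rule leaves uncovered, and would also need to respect the bounds of the data.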
Index Terms
- Rule extraction from linear support vector machines