DOI: 10.1145/1081870.1081878
Article

Rule extraction from linear support vector machines

Published: 21 August 2005

ABSTRACT

We describe an algorithm for converting linear support vector machines and other hyperplane-based linear classifiers into a set of non-overlapping rules that, unlike the original classifier, can be easily interpreted by humans. Each iteration of the rule extraction algorithm is formulated as a constrained optimization problem that is computationally inexpensive to solve. We discuss various properties of the algorithm and provide proof of convergence for two different optimization criteria. We demonstrate the performance and the speed of the algorithm on linear classifiers learned from real-world datasets, including a medical dataset on detection of lung cancer from medical images. The ability to convert SVMs and other "black-box" classifiers into a set of human-understandable rules is critical not only for physician acceptance, but also for reducing the regulatory barrier for medical decision-support systems based on such classifiers.
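To make the idea of a rule carved out of a linear classifier concrete, here is a minimal sketch of a single extraction step. The abstract does not state the optimization problem, so everything below is an assumption for illustration: the features are taken to be shifted so that the region of interest starts at the origin, the classifier is written as w·x <= gamma with all weights positive, and the criterion is box volume. Under those assumptions the largest axis-aligned box [0, x] inside the half-space has a closed form, x_i = gamma / (n * w_i). The function names `max_volume_box` and `box_to_rule` are hypothetical, and this is not presented as the authors' procedure.

```python
import numpy as np


def max_volume_box(w, gamma):
    """Corner x of the largest box [0, x] contained in {z : w.z <= gamma}.

    Assumes every weight is positive and gamma > 0 (illustrative setup,
    not taken from the paper). With w > 0 the box lies in the half-space
    exactly when its far corner does, and maximizing prod(x) subject to
    w.x = gamma gives w_i * x_i = gamma / n for every coordinate.
    """
    w = np.asarray(w, dtype=float)
    if np.any(w <= 0) or gamma <= 0:
        raise ValueError("sketch assumes w > 0 and gamma > 0")
    return gamma / (w.size * w)


def box_to_rule(lower, upper, names):
    """Render an axis-aligned box as a conjunction of interval conditions."""
    parts = [
        f"{lo:g} <= {name} <= {hi:g}"
        for lo, hi, name in zip(lower, upper, names)
    ]
    return " AND ".join(parts)


if __name__ == "__main__":
    w = np.array([1.0, 2.0])   # hypothetical classifier weights
    gamma = 4.0                # hypothetical threshold
    corner = max_volume_box(w, gamma)
    print(box_to_rule(np.zeros_like(corner), corner, ["x1", "x2"]))
```

Every point satisfying the printed rule is classified the same way by the linear classifier, which is what makes the rule a faithful but readable stand-in for part of the half-space. Covering the remaining region with further non-overlapping boxes would require repeating such a step on the leftover sub-regions, which this sketch does not attempt.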

Published in

KDD '05: Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining
August 2005
844 pages
ISBN: 159593135X
DOI: 10.1145/1081870

Copyright © 2005 ACM

Publisher

Association for Computing Machinery, New York, NY, United States
