A Dual-Objective Evolutionary Algorithm for Rules Extraction in Data Mining

Computational Optimization and Applications

An Erratum to this article was published on 14 August 2006

Abstract

This paper presents a dual-objective evolutionary algorithm (DOEA) for extracting multiple decision rule lists in data mining, with the aim of satisfying two classification criteria: high accuracy and ease of user comprehension. Unlike existing approaches, the algorithm incorporates the concept of Pareto dominance to evolve a set of non-dominated decision rule lists, each with a different classification accuracy and number of rules over a specified range. The classification results of DOEA are analyzed and compared with those of existing rule-based and non-rule-based classifiers on eight test problems obtained from the UCI Machine Learning Repository. It is shown that DOEA produces comprehensible rules with classification accuracy competitive with many methods in the literature. Box plots and t-tests are further used to examine its invariance to random partitioning of the datasets.
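
To make the notion of Pareto dominance over the two objectives concrete, the following minimal sketch filters a handful of candidate rule lists down to the non-dominated ones, treating accuracy as an objective to maximize and the number of rules as one to minimize. It is only an illustration of the general concept, not the DOEA procedure itself: the `RuleList` type, the function names, and the candidate values are all hypothetical and do not come from the paper.

```python
from typing import List, NamedTuple


class RuleList(NamedTuple):
    """Hypothetical summary of one candidate decision rule list."""
    name: str
    accuracy: float   # fraction of correctly classified examples (maximize)
    num_rules: int    # size of the rule list (minimize)


def dominates(a: RuleList, b: RuleList) -> bool:
    """True if `a` is no worse than `b` on both objectives and strictly better on one."""
    no_worse = a.accuracy >= b.accuracy and a.num_rules <= b.num_rules
    strictly_better = a.accuracy > b.accuracy or a.num_rules < b.num_rules
    return no_worse and strictly_better


def non_dominated(candidates: List[RuleList]) -> List[RuleList]:
    """Keep every candidate that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Illustrative values only; these are not results from the paper.
candidates = [
    RuleList("A", 0.95, 6),
    RuleList("B", 0.93, 3),
    RuleList("C", 0.90, 4),   # dominated by B: less accurate and more rules
    RuleList("D", 0.85, 1),
    RuleList("E", 0.95, 8),   # dominated by A: same accuracy, more rules
]

front = non_dominated(candidates)
print([c.name for c in front])
```

In this toy example the surviving candidates trade accuracy against rule count, so a user could pick a shorter, more readable rule list at some cost in accuracy, or the reverse.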

Author information

Corresponding author

Correspondence to K. C. Tan.

Additional information

An erratum to this article is available at http://dx.doi.org/10.1007/s10589-006-9594-3.

About this article

Cite this article

Tan, K.C., Yu, Q. & Ang, J.H. A Dual-Objective Evolutionary Algorithm for Rules Extraction in Data Mining. Comput Optim Applic 34, 273–294 (2006). https://doi.org/10.1007/s10589-005-3907-9

  • DOI: https://doi.org/10.1007/s10589-005-3907-9
