Published in: Neural Computing and Applications 7/2012

01-10-2012 | Original Article

Using artificial neural networks to enhance CART

Authors: William A. Young II, Gary R. Weckman, Vijaya Hari, Harry S. Whiting II, Andrew P. Snow


Abstract

Accuracy is a critical factor in predictive modeling. A predictive model such as a decision tree must be accurate before conclusions can be drawn about the system being modeled. This research analyzes and improves the performance of classification and regression trees (CART), a decision tree algorithm, by deriving a new methodology and evaluating it on real-world data sets. The paper introduces a new approach to tree induction that aims to improve the efficiency of the CART algorithm by combining its existing functionality with artificial neural networks (ANNs). The tree induction algorithm uses trained ANNs to generate new, synthetic data, which have been shown to improve the overall accuracy of the decision tree model when actual training samples are limited. Traditional decision trees developed with the standard CART methodology are compared with enhanced decision trees that use the ANN's synthetic data generation, referred to as CART+. The research demonstrates the improved accuracies that can be obtained with CART+, which can ultimately improve the knowledge that researchers can extract about the system being modeled.
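The synthetic data step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how synthetic inputs are drawn, so uniform sampling within each feature's observed range is an assumption here, and the names `synthesize` and `toy_ann_predict` are hypothetical. The trained ANN is represented only by a stand-in prediction function.

```python
import random
from typing import Callable, List, Sequence, Tuple

Row = List[float]


def synthesize(
    X: Sequence[Sequence[float]],
    y: Sequence[int],
    predict: Callable[[Row], int],
    n_synthetic: int,
    seed: int = 0,
) -> Tuple[List[Row], List[int]]:
    """Return the original samples plus n_synthetic rows labeled by `predict`."""
    rng = random.Random(seed)
    n_features = len(X[0])
    # Observed range of every feature in the (limited) real training set.
    lows = [min(row[j] for row in X) for j in range(n_features)]
    highs = [max(row[j] for row in X) for j in range(n_features)]

    X_aug = [list(row) for row in X]
    y_aug = list(y)
    for _ in range(n_synthetic):
        # Assumption: each feature is sampled uniformly within its observed range.
        row = [rng.uniform(lows[j], highs[j]) for j in range(n_features)]
        X_aug.append(row)
        # The label comes from the trained model, not from a real observation.
        y_aug.append(predict(row))
    return X_aug, y_aug


# Hypothetical stand-in for the prediction function of a trained ANN.
def toy_ann_predict(row: Row) -> int:
    return int(row[0] + row[1] > 1.0)


X_small = [[0.0, 0.0], [1.0, 1.0], [0.2, 0.9], [0.9, 0.3]]
y_small = [0, 1, 1, 1]
X_aug, y_aug = synthesize(X_small, y_small, toy_ann_predict, n_synthetic=50)
print(len(X_aug), len(y_aug))
```

The augmented set (`X_aug`, `y_aug`) would then be passed to any CART-style tree learner in place of the original small sample; comparing a tree fitted on the original data against one fitted on the augmented data mirrors the CART versus CART+ comparison the abstract describes.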


Metadata
Title
Using artificial neural networks to enhance CART
Authors
William A. Young II
Gary R. Weckman
Vijaya Hari
Harry S. Whiting II
Andrew P. Snow
Publication date
01-10-2012
Publisher
Springer-Verlag
Published in
Neural Computing and Applications / Issue 7/2012
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI
https://doi.org/10.1007/s00521-012-0887-4
