Published in: Neural Computing and Applications 7/2012

1 October 2012 | Original Article

Using artificial neural networks to enhance CART

Written by: William A. Young II, Gary R. Weckman, Vijaya Hari, Harry S. Whiting II, Andrew P. Snow

Abstract

Accuracy is a critical factor in predictive modeling: a predictive model such as a decision tree must be accurate before conclusions can be drawn about the system being modeled. This research analyzes and aims to improve the performance of classification and regression trees (CART), a decision tree algorithm, by deriving a new methodology and evaluating it on real-world data sets. The paper introduces a new approach to tree induction that aims to improve the efficiency of the CART algorithm by combining its existing functionality with artificial neural networks (ANNs). The tree induction algorithm uses trained ANNs to generate new, synthetic data, which have been shown to improve the overall accuracy of the decision tree model when actual training samples are limited. Traditional decision trees developed with the standard CART methodology are compared with enhanced decision trees that use the ANN's synthetic data generation, referred to as CART+. The research demonstrates the improved accuracies that can be obtained with CART+, which can ultimately improve the knowledge that researchers can extract about the system being modeled.
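
The idea in the abstract (train an ANN on the limited real samples, let it label newly generated synthetic samples, then induce the tree on the enlarged set) can be illustrated with a minimal sketch. This is not the authors' implementation. Everything specific here is an assumption made for illustration: the use of scikit-learn (whose `DecisionTreeClassifier` is documented as an optimised CART variant), the toy data from `make_classification`, the uniform sampling of synthetic inputs within the observed feature ranges, the hypothetical helper `augment_with_ann`, and all hyperparameters. The abstract does not specify any of these.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier


def augment_with_ann(ann, X_train, y_train, n_synthetic, rng):
    """Return the training set extended with ANN-labelled synthetic rows.

    Assumption: synthetic inputs are drawn uniformly within the observed
    per-feature range; the paper's actual sampling scheme is not stated
    in the abstract.
    """
    low, high = X_train.min(axis=0), X_train.max(axis=0)
    X_syn = rng.uniform(low, high, size=(n_synthetic, X_train.shape[1]))
    y_syn = ann.predict(X_syn)  # the trained ANN supplies the labels
    return np.vstack([X_train, X_syn]), np.concatenate([y_train, y_syn])


# Illustrative toy data only; a small training split mimics limited samples.
X, y = make_classification(
    n_samples=600, n_features=5, n_informative=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=60, random_state=0, stratify=y
)

# Step 1: train the ANN on the limited real data.
ann = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
ann.fit(X_train, y_train)

# Step 2: generate synthetic samples and label them with the ANN.
rng = np.random.default_rng(0)
X_aug, y_aug = augment_with_ann(ann, X_train, y_train, n_synthetic=1000, rng=rng)

# Step 3: induce a baseline tree and an augmented tree for comparison.
cart = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
cart_plus = DecisionTreeClassifier(random_state=0).fit(X_aug, y_aug)

print("CART  test accuracy:", cart.score(X_test, y_test))
print("CART+ test accuracy:", cart_plus.score(X_test, y_test))
```

Whether the augmented tree actually scores higher depends on how well the ANN generalizes from the small training set, so the printed accuracies from this toy setup should not be read as a reproduction of the paper's results.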

Metadata
Title
Using artificial neural networks to enhance CART
Written by
William A. Young II
Gary R. Weckman
Vijaya Hari
Harry S. Whiting II
Andrew P. Snow
Publication date
1 October 2012
Publisher
Springer-Verlag
Published in
Neural Computing and Applications / Issue 7/2012
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI
https://doi.org/10.1007/s00521-012-0887-4
