
01-06-2011 | Focus

Case study of inaccuracies in the granulation of decision trees

Authors: Salman Badr, Andrzej Bargiela

Published in: Soft Computing | Issue 6/2011


Abstract

Cybernetics studies information processing in the context of interaction with physical systems. Because such information is sometimes vague and exhibits complex interactions, it can only be discerned using approximate representations. Machine learning provides solutions that create approximate models of information, and decision trees are one of its main components. However, decision trees are susceptible to information overload and can become overly complex when a large amount of data is fed into them. Granulation of decision trees remedies this problem by providing the essential structure of the decision tree, although this simplification can decrease its utility. To evaluate the relationships between granulation and decision tree complexity, data uncertainty and prediction accuracy, the deficiencies recorded for nursing homes during annual inspections were taken as a case study. Using rough sets, three forms of granulation were performed: (1) attribute grouping, (2) removing insignificant attributes and (3) removing uncertain records. Attribute grouping significantly reduces tree complexity without any strong effect on data consistency or accuracy. Removing insignificant attributes, on the other hand, decreases data consistency and tree complexity while increasing the prediction error. Finally, decreasing the uncertainty of the dataset increases accuracy and has no impact on tree complexity.
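The third form of granulation, removing uncertain records, can be illustrated with a short rough-set sketch. The snippet below is a minimal illustration rather than the authors' implementation; the attribute names (staffing, hygiene, deficiency) are hypothetical. It groups records into indiscernibility classes on the condition attributes and flags any class whose members disagree on the decision attribute, i.e. records that fall outside the lower approximation and would be removed in this granulation step.

```python
# Minimal sketch (not the paper's code): flag "uncertain" records via
# rough-set indiscernibility. Records that agree on all condition
# attributes but disagree on the decision attribute are inconsistent.
from collections import defaultdict

def inconsistent_records(rows, condition_attrs, decision_attr):
    """Return indices of rows whose indiscernibility class is not
    consistent with respect to the decision attribute."""
    classes = defaultdict(list)  # indiscernibility classes keyed by condition values
    for i, row in enumerate(rows):
        key = tuple(row[a] for a in condition_attrs)
        classes[key].append(i)

    uncertain = []
    for members in classes.values():
        decisions = {rows[i][decision_attr] for i in members}
        if len(decisions) > 1:   # same conditions, conflicting outcomes
            uncertain.extend(members)
    return uncertain

# Hypothetical nursing-home inspection records (attribute names invented).
data = [
    {"staffing": "low",  "hygiene": "poor", "deficiency": "yes"},
    {"staffing": "low",  "hygiene": "poor", "deficiency": "no"},   # conflicts with row 0
    {"staffing": "high", "hygiene": "good", "deficiency": "no"},
]

bad = inconsistent_records(data, ["staffing", "hygiene"], "deficiency")
consistent = [r for i, r in enumerate(data) if i not in set(bad)]
print(bad)         # [0, 1]
print(consistent)  # only the fully consistent record remains
```

Dropping the flagged records raises the consistency (dependency) of the decision table, which matches the abstract's finding that reducing uncertainty improves accuracy without changing tree complexity.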


Metadata
Title
Case study of inaccuracies in the granulation of decision trees
Authors
Salman Badr
Andrzej Bargiela
Publication date
01-06-2011
Publisher
Springer-Verlag
Published in
Soft Computing / Issue 6/2011
Print ISSN: 1432-7643
Electronic ISSN: 1433-7479
DOI
https://doi.org/10.1007/s00500-010-0587-x
