Published in: Soft Computing 6/2011

1 June 2011 | Focus

Case study of inaccuracies in the granulation of decision trees

Written by: Salman Badr, Andrzej Bargiela

Abstract

Cybernetics studies information processes in the context of interaction with physical systems. Because such information is sometimes vague and exhibits complex interactions, it can only be discerned using approximate representations. Machine learning provides solutions that create approximate models of information, and decision trees are one of its main components. However, decision trees are susceptible to information overload and can become overly complex when a large amount of data is fed into them. Granulation of a decision tree remedies this problem by providing the essential structure of the decision tree, although this can decrease its utility. To evaluate the relationship between granulation and decision tree complexity, data uncertainty and prediction accuracy, the deficiencies that nursing homes received during annual inspections were taken as a case study. Using rough sets, three forms of granulation were performed: (1) attribute grouping, (2) removing insignificant attributes and (3) removing uncertain records. Attribute grouping significantly reduces tree complexity without any strong effect on data consistency or accuracy. On the other hand, removing insignificant attributes decreases data consistency and tree complexity while increasing the prediction error. Finally, a decrease in the uncertainty of the dataset results in an increase in accuracy and has no impact on tree complexity.
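The abstract does not say how data consistency was measured or how uncertain records were identified. As an illustration only, the sketch below uses a common rough-set convention: records that agree on all condition attributes but disagree on the decision are treated as uncertain, and consistency is the fraction of records outside such conflicts (the dependency degree). The attribute names, the toy records and all function names are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def indiscernibility_classes(records, condition_attrs):
    """Group record indices that share the same values on the condition attributes."""
    classes = defaultdict(list)
    for i, rec in enumerate(records):
        key = tuple(rec[a] for a in condition_attrs)
        classes[key].append(i)
    return list(classes.values())


def positive_region(records, condition_attrs, decision_attr):
    """Indices of records whose indiscernibility class maps to a single decision value."""
    pos = []
    for cls in indiscernibility_classes(records, condition_attrs):
        decisions = {records[i][decision_attr] for i in cls}
        if len(decisions) == 1:
            pos.extend(cls)
    return sorted(pos)


def consistency(records, condition_attrs, decision_attr):
    """Dependency degree |POS| / |U|; an empty table is treated as fully consistent."""
    if not records:
        return 1.0
    pos = positive_region(records, condition_attrs, decision_attr)
    return len(pos) / len(records)


def remove_uncertain(records, condition_attrs, decision_attr):
    """Keep only records in the positive region (drop conflicting ones)."""
    keep = positive_region(records, condition_attrs, decision_attr)
    return [records[i] for i in keep]


# Hypothetical toy table: two condition attributes and one decision attribute.
records = [
    {"beds": "small", "ownership": "profit", "deficiency": "high"},
    {"beds": "small", "ownership": "profit", "deficiency": "low"},  # conflicts with the row above
    {"beds": "large", "ownership": "profit", "deficiency": "high"},
    {"beds": "large", "ownership": "nonprofit", "deficiency": "low"},
]

full = consistency(records, ["beds", "ownership"], "deficiency")
reduced = consistency(records, ["beds"], "deficiency")
cleaned = remove_uncertain(records, ["beds", "ownership"], "deficiency")
after_cleaning = consistency(cleaned, ["beds", "ownership"], "deficiency")

print(full, reduced, after_cleaning)
```

In this toy table, dropping the `ownership` attribute lowers consistency because more records become indistinguishable, while removing the conflicting records raises it. This mirrors the direction of the effects reported in the abstract, but it says nothing about their size in the nursing home data.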

Metadata
Title
Case study of inaccuracies in the granulation of decision trees
Written by
Salman Badr
Andrzej Bargiela
Publication date
1 June 2011
Publisher
Springer-Verlag
Published in
Soft Computing / Issue 6/2011
Print ISSN: 1432-7643
Electronic ISSN: 1433-7479
DOI
https://doi.org/10.1007/s00500-010-0587-x