Published in: Soft Computing 1/2012

01.01.2012 | Original Paper

Parameter determination and feature selection for C4.5 algorithm using scatter search approach

Written by: Shih-Wei Lin, Shih-Chieh Chen


Abstract

The C4.5 decision tree (DT) is applied in many fields and produces knowledge that humans can readily interpret. Different problems, however, typically require different parameter settings. These settings are usually chosen by rule of thumb or trial and error, which can yield poor settings and unsatisfactory results. In addition, although a dataset may contain numerous features, not all of them benefit classification with the C4.5 algorithm. This paper therefore proposes a novel scatter search-based approach (SS + DT) that acquires optimal parameter settings and selects the subset of beneficial features, leading to better classification results. The performance of the SS + DT approach is evaluated on datasets from the UCI (University of California, Irvine) Machine Learning Repository. Experimental results demonstrate that the parameter settings for the C4.5 algorithm obtained by the SS + DT approach are better than those obtained by other approaches. When feature selection is also considered, classification accuracy rates increase on most datasets. The proposed approach can therefore be used to effectively identify both the best parameter settings for the C4.5 algorithm and the useful features.
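
To make the idea of searching parameters and features jointly more concrete, the following is a minimal sketch of a simplified scatter search over a combined solution, not the authors' SS + DT implementation. Several things here are assumptions for illustration: the two tuned parameters (a minimum-cases value and a pruning confidence value, both standard C4.5 settings) and their ranges, the helper names (`random_solution`, `combine`, `improve`, `scatter_search`), the reference set sizes, and above all `toy_fitness`, which is a synthetic stand-in for the cross-validated accuracy of a C4.5 tree.

```python
import random
from itertools import combinations

def random_solution(n_features, rng):
    """One candidate: two C4.5-style parameters plus a 0/1 feature mask."""
    return {
        "min_cases": rng.randint(1, 20),        # assumed range
        "confidence": rng.uniform(0.01, 0.50),  # assumed range
        "mask": [rng.randint(0, 1) for _ in range(n_features)],
    }

def combine(a, b, rng):
    """Blend the parameters of two solutions and mix their mask bits."""
    w = rng.random()
    return {
        "min_cases": max(1, round(w * a["min_cases"] + (1 - w) * b["min_cases"])),
        "confidence": w * a["confidence"] + (1 - w) * b["confidence"],
        "mask": [x if rng.random() < 0.5 else y
                 for x, y in zip(a["mask"], b["mask"])],
    }

def improve(sol, fitness, rng, tries=5):
    """Local search: flip one random mask bit, keep the flip if it helps."""
    best, best_fit = sol, fitness(sol)
    for _ in range(tries):
        cand = {**best, "mask": list(best["mask"])}
        i = rng.randrange(len(cand["mask"]))
        cand["mask"][i] ^= 1
        f = fitness(cand)
        if f > best_fit:
            best, best_fit = cand, f
    return best

def distance(a, b):
    """Hamming distance between two feature masks."""
    return sum(x != y for x, y in zip(a["mask"], b["mask"]))

def scatter_search(fitness, n_features, pop=20, b1=3, b2=2, iters=10, seed=0):
    rng = random.Random(seed)
    # Diversification generation followed by improvement.
    pool = [improve(random_solution(n_features, rng), fitness, rng)
            for _ in range(pop)]
    pool.sort(key=fitness, reverse=True)
    # Reference set: b1 high-quality solutions plus b2 diverse ones.
    ref, rest = pool[:b1], pool[b1:]
    for _ in range(b2):
        far = max(rest, key=lambda s: min(distance(s, r) for r in ref))
        ref.append(far)
        rest.remove(far)
    for _ in range(iters):
        # Subset generation (all pairs), combination, improvement.
        children = [improve(combine(a, b, rng), fitness, rng)
                    for a, b in combinations(ref, 2)]
        # Quality-based reference set update (duplicates are not filtered).
        ref = sorted(ref + children, key=fitness, reverse=True)[:b1 + b2]
    return ref[0]

def toy_fitness(sol):
    """Synthetic stand-in for cross-validated C4.5 accuracy (assumption)."""
    useful = [1, 1, 0, 0, 1, 0]  # pretend only these features help
    agree = sum(m == u for m, u in zip(sol["mask"], useful))
    return agree - abs(sol["confidence"] - 0.25) - abs(sol["min_cases"] - 2) / 20

best = scatter_search(toy_fitness, n_features=6)
print(best, toy_fitness(best))
```

In practice, `toy_fitness` would be replaced by a function that trains a decision tree with the candidate's parameters on the masked feature columns and returns a held-out or cross-validated accuracy. Encoding parameters and the feature mask in one solution lets the search trade them off against each other instead of tuning them in separate passes.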


Metadata
Title
Parameter determination and feature selection for C4.5 algorithm using scatter search approach
Written by
Shih-Wei Lin
Shih-Chieh Chen
Publication date
01.01.2012
Publisher
Springer-Verlag
Published in
Soft Computing / Issue 1/2012
Print ISSN: 1432-7643
Electronic ISSN: 1433-7479
DOI
https://doi.org/10.1007/s00500-011-0734-z