Published in: Soft Computing 22/2019

11.01.2019 | Methodologies and Application

New feature selection and voting scheme to improve classification accuracy

Author: Cheng-Jung Tsai

Abstract

Classification is a classic data mining technique, and many ensemble learning methods have been introduced to improve its predictive accuracy. A typical ensemble learning method consists of three steps: selection, building, and integration. Of these, the first and third steps significantly affect the predictive accuracy of the classification. In this paper, we propose a new selection and integration scheme that improves the accuracy of the base subtrees while maintaining their diversity. A new voting scheme further improves the predictive accuracy of the ensemble. We also analyze the selection and integration steps of our method theoretically. Experimental results show that our method achieves better accuracy than two state-of-the-art tree-based ensemble learning approaches.
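
The abstract names the three-step pipeline but does not give the concrete selection or voting rules, so the sketch below only illustrates the generic scheme it describes: per-tree feature-subset selection, tree building, and accuracy-weighted majority voting. It uses scikit-learn; the subset size, the held-out weighting rule, and the dataset are assumptions for illustration, not the paper's actual method.

# A minimal sketch of a three-step tree ensemble (assumed details, not
# the paper's scheme): (1) selection -- draw a random feature subset per
# base tree, (2) building -- fit a decision tree on that subset,
# (3) integration -- accuracy-weighted majority voting.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
# Hold out part of the training data to estimate each tree's vote weight.
X_fit, X_val, y_fit, y_val = train_test_split(X_train, y_train, random_state=0)

n_trees, n_features = 25, X.shape[1]
trees, subsets, weights = [], [], []
for _ in range(n_trees):
    # Step 1 (selection): a random half of the features keeps base trees diverse.
    subset = rng.choice(n_features, size=n_features // 2, replace=False)
    # Step 2 (building): fit one tree on the selected features only.
    tree = DecisionTreeClassifier(max_depth=4, random_state=0)
    tree.fit(X_fit[:, subset], y_fit)
    trees.append(tree)
    subsets.append(subset)
    # Weight each tree by its held-out accuracy (an assumed weighting rule).
    weights.append(tree.score(X_val[:, subset], y_val))

# Step 3 (integration): weighted majority vote over the two classes.
votes = np.zeros((len(X_test), 2))
for tree, subset, w in zip(trees, subsets, weights):
    votes[np.arange(len(X_test)), tree.predict(X_test[:, subset])] += w
print("weighted-vote accuracy:", (votes.argmax(axis=1) == y_test).mean())

Weighting by a held-out accuracy estimate, rather than training accuracy, avoids every tree receiving a near-identical weight; any rule that rewards more accurate base trees while preserving diversity serves the same purpose.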


Metadata
Title
New feature selection and voting scheme to improve classification accuracy
Author
Cheng-Jung Tsai
Publication date
11.01.2019
Publisher
Springer Berlin Heidelberg
Published in
Soft Computing / Issue 22/2019
Print ISSN: 1432-7643
Electronic ISSN: 1433-7479
DOI
https://doi.org/10.1007/s00500-019-03757-2
