01.04.2016 | Original Article

A selective neural network ensemble classification for incomplete data

Authors: Yuan-Ting Yan, Yan-Ping Zhang, Yi-Wen Zhang, Xiu-Quan Du

Published in: International Journal of Machine Learning and Cybernetics | Issue 5/2017

Abstract

Neural network ensemble (NNE) is a simple and effective method for classifying incomplete data. However, as the number of missing values increases, the number of incomplete feature combinations (feature subsets) grows rapidly, which makes the NNE method very time-consuming, and its accuracy also needs to be improved. In this paper, we propose a selective neural network ensemble (SNNE) classification method for incomplete data. SNNE first obtains all available feature subsets of the incomplete dataset and then applies mutual information to measure the importance (relevance) degree of each subset. Next, an optimization process removes every feature subset that satisfies the following condition: it contains at least one other feature subset, and the difference between their importance degrees is smaller than a given threshold δ. Finally, the remaining feature subsets are used to train a group of neural networks, and the class of a given sample is decided by weighted majority voting over all available components of the ensemble. Experimental results show that δ = 0.05 is reasonable in our study: it improves the efficiency of the algorithm without loss of accuracy. Experiments also show that SNNE outperforms the compared NNE-based algorithms. In addition, it greatly reduces the running time on datasets with larger numbers of missing values.
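The selection and voting steps described in the abstract can be illustrated with a short, hypothetical Python sketch. The function names, the use of scikit-learn's mutual_info_classif (with per-feature mutual information summed as a stand-in for a subset's relevance), and the choice of training accuracy as each network's voting weight are illustrative assumptions, not the authors' implementation:

```python
# Hypothetical sketch of the SNNE selection step; not the authors' code.
import numpy as np
from sklearn.feature_selection import mutual_info_classif
from sklearn.neural_network import MLPClassifier

def subset_importance(X, y, subset):
    """Estimate the relevance of a feature subset as the sum of the
    mutual information between each of its features and the label."""
    mi = mutual_info_classif(X[:, list(subset)], y, random_state=0)
    return float(mi.sum())

def prune_subsets(subsets, importance, delta=0.05):
    """Remove a subset if it contains another subset whose importance
    differs from its own by less than delta (the condition in the abstract)."""
    kept = []
    for s in subsets:
        redundant = any(
            t < s and abs(importance[s] - importance[t]) < delta
            for t in subsets if t != s
        )
        if not redundant:
            kept.append(s)
    return kept

def train_ensemble(X, y, subsets):
    """Train one small neural network per retained feature subset and
    use its training accuracy as the voting weight (an assumption)."""
    members = []
    for s in subsets:
        cols = list(s)
        net = MLPClassifier(hidden_layer_sizes=(10,), max_iter=500,
                            random_state=0).fit(X[:, cols], y)
        members.append((cols, net, net.score(X[:, cols], y)))
    return members

def predict(members, x):
    """Weighted majority vote over all members whose features are
    observed (not missing) in the query sample x."""
    votes = {}
    for cols, net, w in members:
        if np.isnan(x[cols]).any():   # skip components needing missing features
            continue
        label = net.predict(x[cols].reshape(1, -1))[0]
        votes[label] = votes.get(label, 0.0) + w
    return max(votes, key=votes.get) if votes else None
```

In this sketch a candidate subset is discarded only when it strictly contains another subset with nearly the same estimated relevance, mirroring the pruning rule stated in the abstract; with δ = 0.05 the abstract reports that the ensemble shrinks without loss of accuracy.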

Metadata
Title
A selective neural network ensemble classification for incomplete data
Authors
Yuan-Ting Yan
Yan-Ping Zhang
Yi-Wen Zhang
Xiu-Quan Du
Publication date
01.04.2016
Publisher
Springer Berlin Heidelberg
Published in
International Journal of Machine Learning and Cybernetics / Issue 5/2017
Print ISSN: 1868-8071
Electronic ISSN: 1868-808X
DOI
https://doi.org/10.1007/s13042-016-0524-0
