Published in: Journal of Intelligent Information Systems 2/2019

07-07-2018

Experimental validation for N-ary error correcting output codes for ensemble learning of deep neural networks

Authors: Kaikai Zhao, Tetsu Matsukawa, Einoshin Suzuki

Abstract

N-ary error correcting output codes (ECOC) decompose a multi-class problem into simpler multi-class subproblems by splitting the classes into N subsets (meta-classes), training an ensemble of N-class base classifiers, and combining their predictions. It is one of the most accurate ensemble learning methods for traditional classification tasks. Deep learning has gained increasing attention in recent years due to its successes on tasks such as image classification and speech recognition. However, little is known about N-ary ECOC with deep neural networks (DNNs) as base learners, probably because of the long computation time involved. In this paper, we show experimentally that N-ary ECOC with DNNs as base learners generally outperforms several state-of-the-art ensemble learning methods. Moreover, our work contributes to a more efficient setting of the two crucial hyperparameters of N-ary ECOC: the value of N and the number of base learners to train. We also explore valuable strategies for further improving the accuracy of N-ary ECOC.
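The encode/decode mechanics described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names are hypothetical, the coding matrix is drawn uniformly at random (subject to every meta-class appearing in each column), and the DNN base learners are abstracted away as a vector of meta-class predictions. Decoding assigns the class whose codeword has the smallest Hamming distance to that vector.

```python
import random


def nary_ecoc_matrix(num_classes, n, num_learners, seed=0):
    """Build a random N-ary coding matrix.

    Each column assigns the K original classes to one of n meta-classes
    (every meta-class occurs at least once per column, so each base
    learner faces a genuine n-class problem). Row k is the codeword
    of original class k; there is one column per base learner.
    """
    rng = random.Random(seed)
    columns = []
    for _ in range(num_learners):
        # Guarantee all n meta-classes appear, then fill and shuffle.
        col = list(range(n)) + [rng.randrange(n) for _ in range(num_classes - n)]
        rng.shuffle(col)
        columns.append(col)
    # Transpose so that rows index classes and columns index learners.
    return [list(row) for row in zip(*columns)]


def nary_ecoc_decode(matrix, meta_predictions):
    """Return the class whose codeword is Hamming-closest to the
    base learners' meta-class predictions."""
    def hamming(codeword):
        return sum(p != c for p, c in zip(meta_predictions, codeword))
    return min(range(len(matrix)), key=lambda k: hamming(matrix[k]))


# Example: 10 classes, N = 3 meta-classes, 5 base learners.
m = nary_ecoc_matrix(num_classes=10, n=3, num_learners=5, seed=42)
predicted_class = nary_ecoc_decode(m, m[4])  # noiseless predictions for class 4
```

In practice each column would define the relabelling used to train one DNN, and `meta_predictions` would be the trained networks' outputs on a test instance; Hamming decoding then tolerates a minority of base-learner errors.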

Metadata
Title
Experimental validation for N-ary error correcting output codes for ensemble learning of deep neural networks
Authors
Kaikai Zhao
Tetsu Matsukawa
Einoshin Suzuki
Publication date
07-07-2018
Publisher
Springer US
Published in
Journal of Intelligent Information Systems / Issue 2/2019
Print ISSN: 0925-9902
Electronic ISSN: 1573-7675
DOI
https://doi.org/10.1007/s10844-018-0516-5
