
01-07-2016 | Original Article

Learning a good representation with unsymmetrical auto-encoder

Authors: Yanan Sun, Hua Mao, Quan Guo, Zhang Yi

Published in: Neural Computing and Applications | Issue 5/2016


Abstract

Auto-encoders play a fundamental role in unsupervised feature learning and in learning the initial parameters of deep architectures for supervised tasks. For given input samples, a good representation should be robust in two respects: (1) it is invariant to small variations of the samples, and (2) it can be reconstructed by the decoder with minimal error. Traditional auto-encoders with different regularization terms have symmetrical numbers of encoder and decoder layers, and sometimes symmetrical (tied) parameters as well. We investigate the relation between the numbers of encoder and decoder layers and propose an unsymmetrical structure, the unsymmetrical auto-encoder (UAE), to learn more effective features. We present empirical results of feature learning with the UAE and with state-of-the-art auto-encoders on classification tasks across a range of datasets. We also analyze the gradient vanishing problem mathematically and suggest an appropriate number of layers for UAEs with a logistic activation function. In our experiments, UAEs outperformed the other auto-encoders under the same configuration.
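To make the architecture concrete, the sketch below implements an auto-encoder whose encoder is deeper than its decoder, trained by plain backpropagation on squared reconstruction error. All specifics here are our own illustrative assumptions (the [784, 500, 200, 100, 784] layer sizes, the 3-encoder/1-decoder split, the learning rate, vanilla gradient descent); this is a minimal sketch, not the authors' exact UAE configuration.

```python
# Minimal unsymmetrical auto-encoder (UAE) sketch in NumPy.
# Assumptions (not from the paper): layer sizes [784, 500, 200, 100, 784],
# i.e. three encoding layers and one decoding layer, logistic activations
# throughout, and vanilla gradient descent on squared reconstruction error.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class UnsymmetricalAutoEncoder:
    def __init__(self, sizes, seed=0):
        rng = np.random.default_rng(seed)
        # One weight matrix and bias vector per layer transition.
        self.W = [rng.normal(0.0, 0.1, (m, n))
                  for m, n in zip(sizes[:-1], sizes[1:])]
        self.b = [np.zeros(n) for n in sizes[1:]]

    def forward(self, x):
        # Returns the activations of every layer, input included.
        acts = [x]
        for W, b in zip(self.W, self.b):
            acts.append(sigmoid(acts[-1] @ W + b))
        return acts

    def train_step(self, x, lr=0.1):
        acts = self.forward(x)
        # Error signal at the output layer for squared reconstruction loss.
        delta = (acts[-1] - x) * acts[-1] * (1.0 - acts[-1])
        for i in reversed(range(len(self.W))):
            grad_W = acts[i].T @ delta / len(x)
            grad_b = delta.mean(axis=0)
            if i > 0:
                # Backpropagate through the logistic nonlinearity; since
                # sigma'(z) = a(1 - a) <= 1/4, each extra layer can shrink
                # the gradient -- the vanishing-gradient issue the paper
                # analyzes when suggesting how many layers a UAE should use.
                delta = (delta @ self.W[i].T) * acts[i] * (1.0 - acts[i])
            self.W[i] -= lr * grad_W
            self.b[i] -= lr * grad_b
        return float(np.mean((acts[-1] - x) ** 2))

# Toy usage: random vectors stand in for MNIST-style inputs.
if __name__ == "__main__":
    x = np.random.default_rng(1).random((64, 784))
    uae = UnsymmetricalAutoEncoder([784, 500, 200, 100, 784])
    for _ in range(20):
        loss = uae.train_step(x)
    print(f"reconstruction MSE: {loss:.4f}")
```

The gradient vanishing remark in the abstract rests on the standard bound on the logistic derivative; the inequality below is that textbook argument, not the paper's own analysis:

$$
\sigma'(z) = \sigma(z)\bigl(1 - \sigma(z)\bigr) \le \frac{1}{4},
\qquad
\left\| \frac{\partial E}{\partial W_1} \right\|
\lesssim \prod_{i=2}^{k} \frac{\|W_i\|}{4},
$$

so with moderate weight norms the gradient reaching the first layer decays roughly geometrically in the depth k, which is why bounding the number of logistic layers matters.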


Footnotes
1. The MNIST datasets for these problems are available at http://www.iro.umontreal.ca/~lisa/icml2007.
2. We used two GPU models: NVIDIA GTX 750 Ti and GTX 780.
Metadata
Title
Learning a good representation with unsymmetrical auto-encoder
Authors
Yanan Sun
Hua Mao
Quan Guo
Zhang Yi
Publication date
01-07-2016
Publisher
Springer London
Published in
Neural Computing and Applications / Issue 5/2016
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI
https://doi.org/10.1007/s00521-015-1939-3
