Published in: KI - Künstliche Intelligenz 4/2012

01.11.2012 | Technical Article

Deep Learning

Layer-Wise Learning of Feature Hierarchies

Authors: Hannes Schulz, Sven Behnke



Abstract

Hierarchical neural networks for object recognition have a long history. In recent years, novel methods for incrementally learning a hierarchy of features from unlabeled inputs were proposed as a good starting point for supervised training. These deep learning methods, together with advances in parallel computing, made it possible to successfully attack problems that were previously impractical in terms of depth and input size. In this article, we introduce the reader to the basic concepts of deep learning, discuss selected methods in detail, and present application examples from computer vision and speech recognition.
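The greedy layer-wise scheme the abstract refers to can be sketched in a few lines: each layer is trained as a small autoencoder on the previous layer's activations, and the trained encoders are then stacked to form the feature hierarchy. The following minimal NumPy sketch illustrates the idea only; all function names, the tied-weight choice, and the hyperparameters are illustrative assumptions, not taken from the article.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_autoencoder_layer(X, n_hidden, lr=0.5, epochs=200, seed=0):
    """Train one tied-weight autoencoder layer on X (n_samples, n_in).

    Minimizes the squared reconstruction error by plain gradient
    descent; returns the encoder parameters (W, b_h).
    """
    rng = np.random.default_rng(seed)
    n, n_in = X.shape
    W = rng.normal(0.0, 0.1, (n_in, n_hidden))
    b_h = np.zeros(n_hidden)   # hidden (encoder) bias
    b_v = np.zeros(n_in)       # visible (decoder) bias
    for _ in range(epochs):
        H = sigmoid(X @ W + b_h)          # encode
        R = sigmoid(H @ W.T + b_v)        # decode with tied weights
        dZv = (R - X) * R * (1 - R) / n   # grad at decoder pre-activation
        dZh = (dZv @ W) * H * (1 - H)     # backprop into encoder
        W -= lr * (dZv.T @ H + X.T @ dZh)
        b_v -= lr * dZv.sum(axis=0)
        b_h -= lr * dZh.sum(axis=0)
    return W, b_h

def pretrain_stack(X, layer_sizes):
    """Greedy layer-wise pretraining: train each layer on the
    activations of the layer below, then stack the encoders."""
    params, A = [], X
    for n_hidden in layer_sizes:
        W, b = train_autoencoder_layer(A, n_hidden)
        params.append((W, b))
        A = sigmoid(A @ W + b)  # activations feed the next layer
    return params, A
```

After pretraining, the stacked encoders would typically be used to initialize a deep network that is then fine-tuned with supervised backpropagation.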


KI - Künstliche Intelligenz

The scientific journal "KI – Künstliche Intelligenz" is the official journal of the division for artificial intelligence within the "Gesellschaft für Informatik e.V." (GI), the German Informatics Society, with contributions from throughout the field of artificial intelligence.

Metadata
Title
Deep Learning
Layer-Wise Learning of Feature Hierarchies
Authors
Hannes Schulz
Sven Behnke
Publication date
01.11.2012
Publisher
Springer-Verlag
Published in
KI - Künstliche Intelligenz / Issue 4/2012
Print ISSN: 0933-1875
Electronic ISSN: 1610-1987
DOI
https://doi.org/10.1007/s13218-012-0198-z
