
2015 | Original Paper | Book Chapter

5. Advanced Model Initialization Techniques

Authors: Dong Yu, Li Deng

Published in: Automatic Speech Recognition

Publisher: Springer London

Abstract

In this chapter, we introduce several advanced deep neural network (DNN) model initialization, or pretraining, techniques. These techniques played important roles in the early days of deep learning research and remain useful under many conditions. Our presentation of DNN pretraining focuses on the following topics: the restricted Boltzmann machine (RBM), which is an interesting generative model in its own right; the deep belief network (DBN); the denoising autoencoder; and discriminative pretraining.
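As a concrete illustration of the RBM pretraining named in the abstract, the sketch below implements one contrastive-divergence (CD-1) update for a binary-binary RBM in NumPy. This is a minimal sketch, not the chapter's reference implementation; the function name `cd1_update`, the learning rate, and the use of hidden probabilities (rather than samples) in the gradient are illustrative choices.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(W, b, c, v0, lr=0.1, rng=None):
    """One CD-1 step for a binary RBM (sketch).

    W: (n_visible, n_hidden) weights; b, c: visible/hidden biases;
    v0: (batch, n_visible) batch of binary training vectors.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    # Positive phase: hidden probabilities and samples given the data.
    ph0 = sigmoid(v0 @ W + c)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # Negative phase: one Gibbs step back to visibles, then to hiddens.
    pv1 = sigmoid(h0 @ W.T + b)
    ph1 = sigmoid(pv1 @ W + c)
    # Gradient estimate: data correlations minus reconstruction correlations.
    W += lr * (v0.T @ ph0 - pv1.T @ ph1) / len(v0)
    b += lr * (v0 - pv1).mean(axis=0)
    c += lr * (ph0 - ph1).mean(axis=0)
    return W, b, c
```

To pretrain a DBN layer by layer, one would run updates like this on each layer's inputs, then treat the hidden activations as data for the next layer.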


Metadata
Title
Advanced Model Initialization Techniques
Authors
Dong Yu
Li Deng
Copyright Year
2015
Publisher
Springer London
DOI
https://doi.org/10.1007/978-1-4471-5779-3_5
