2017 | OriginalPaper | Book chapter

Improving the Performance of Deep Learning Based Speech Enhancement System Using Fuzzy Restricted Boltzmann Machine

Written by: Suman Samui, Indrajit Chakrabarti, Soumya K. Ghosh

Published in: Pattern Recognition and Machine Intelligence

Publisher: Springer International Publishing

Abstract

Supervised speech enhancement based on machine learning is a new paradigm for segregating clean speech from background noise. This work presents a supervised speech enhancement system based on a robust deep learning method in which the pre-training phase of the deep belief network (DBN) employs fuzzy restricted Boltzmann machines (FRBMs) instead of regular restricted Boltzmann machines (RBMs). The FRBM model has been observed to outperform the RBM model, particularly when the training data are noisy. Experimental results across various noise scenarios show that the proposed approach outperforms conventional DNN-based speech enhancement methods that use regular RBMs for unsupervised pre-training.
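
For orientation, the conventional baseline mentioned in the abstract (greedy layer-wise pre-training of a DBN with regular RBMs) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes binary units, one-step contrastive divergence (CD-1), and toy binary data in place of real speech features. It does not implement the FRBM, whose details the abstract does not give. The names `RBM`, `cd1_update`, and `pretrain_stack` are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class RBM:
    """Binary-binary restricted Boltzmann machine (regular RBM, not FRBM)."""

    def __init__(self, n_visible, n_hidden, seed=0):
        self.rng = random.Random(seed)
        self.n_visible = n_visible
        self.n_hidden = n_hidden
        # Small random weights; biases start at zero.
        self.W = [[self.rng.gauss(0.0, 0.01) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.a = [0.0] * n_visible  # visible biases
        self.b = [0.0] * n_hidden   # hidden biases

    def hidden_probs(self, v):
        return [sigmoid(self.b[j] + sum(v[i] * self.W[i][j]
                                        for i in range(self.n_visible)))
                for j in range(self.n_hidden)]

    def visible_probs(self, h):
        return [sigmoid(self.a[i] + sum(h[j] * self.W[i][j]
                                        for j in range(self.n_hidden)))
                for i in range(self.n_visible)]

    def sample(self, probs):
        return [1.0 if self.rng.random() < p else 0.0 for p in probs]

    def cd1_update(self, v0, lr=0.1):
        """One CD-1 step on a single example; returns squared reconstruction error."""
        h0_prob = self.hidden_probs(v0)
        h0 = self.sample(h0_prob)
        v1 = self.visible_probs(h0)      # reconstruction (probabilities)
        h1_prob = self.hidden_probs(v1)
        for i in range(self.n_visible):
            for j in range(self.n_hidden):
                # Positive phase minus negative phase.
                self.W[i][j] += lr * (v0[i] * h0_prob[j] - v1[i] * h1_prob[j])
            self.a[i] += lr * (v0[i] - v1[i])
        for j in range(self.n_hidden):
            self.b[j] += lr * (h0_prob[j] - h1_prob[j])
        return sum((x - y) ** 2 for x, y in zip(v0, v1))


def pretrain_stack(data, layer_sizes, epochs=20, lr=0.1):
    """Greedy layer-wise pre-training: each RBM trains on the layer below's output."""
    rbms, inputs = [], data
    for n_hidden in layer_sizes:
        rbm = RBM(len(inputs[0]), n_hidden)
        for _ in range(epochs):
            for v in inputs:
                rbm.cd1_update(v, lr)
        rbms.append(rbm)
        inputs = [rbm.hidden_probs(v) for v in inputs]
    return rbms


# Toy binary data standing in for real (typically real-valued) speech features.
toy = [[1.0, 1.0, 1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]] * 5
stack = pretrain_stack(toy, layer_sizes=[4, 3])
top_features = stack[1].hidden_probs(stack[0].hidden_probs(toy[0]))
```

In a full system, the pre-trained weights would initialize a feed-forward network that is then fine-tuned with supervision. The approach described in the abstract replaces the regular RBM at the pre-training stage with an FRBM.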

Metadata
Title
Improving the Performance of Deep Learning Based Speech Enhancement System Using Fuzzy Restricted Boltzmann Machine
Authors
Suman Samui
Indrajit Chakrabarti
Soumya K. Ghosh
Copyright year
2017
DOI
https://doi.org/10.1007/978-3-319-69900-4_68
