Published in: International Journal of Speech Technology 3/2013

1 September 2013

MCRA noise estimation for KLT-VRE-based speech enhancement

Authors: Adda Saadoune, Abderrahmane Amrouche, Sid Ahmed Selouani

Abstract

A new signal subspace-based approach is proposed for the enhancement of speech corrupted by a high level of noise. Conventional subspace-based methods use the minimum mean square error criterion to optimize the Karhunen-Loève Transform (KLT). In non-stationary noisy environments, selecting the optimal order of the KLT-based speech enhancement model is a critical issue, because the estimate of the relevant subspace dimension depends on environmental conditions that may change unpredictably. A drastic KLT-based dimension reduction may discard relevant speech components; conversely, reconstruction with too high a KLT model order fails to remove the noise. The method presented in this paper uses a Variance of Reconstruction Error (VRE) criterion to select the KLT model order optimally. A distinguishing feature of this subspace method is that it incorporates Minima Controlled Recursive Averaging (MCRA) to estimate the noise Power Spectral Density (PSD) used in the gain function. Three variants of the VRE combined with MCRA methods are implemented and compared, namely VRE-MCRA, VRE-MCRA2 and VRE-IMCRA. Objective measures show that the VRE-based approaches achieve lower signal distortion and higher noise reduction than existing enhancement methods.
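
To make the noise-tracking step named in the abstract more concrete, the following is a minimal sketch of the basic MCRA recursion as it is commonly described (Cohen and Berdugo, 2002). It is not the authors' implementation and does not include the KLT, the VRE order selection or the gain function. The function name `mcra_noise_psd`, the parameter values, the window length and the choice to initialise from the first frame are all assumptions made for illustration; the frequency smoothing used in the original algorithm is omitted for brevity.

```python
def mcra_noise_psd(frames, alpha_d=0.95, alpha_s=0.8, alpha_p=0.2,
                   delta=5.0, window=50):
    """Track the noise PSD per frequency bin with a simplified MCRA recursion.

    frames : list of frames, each a list of periodogram values |Y(k, l)|^2.
    Returns one noise PSD estimate (list of floats) per input frame.
    All default parameter values are illustrative assumptions.
    """
    n_bins = len(frames[0])
    smoothed = list(frames[0])   # time-smoothed power S(k, l)
    s_min = list(frames[0])      # running minimum of S over the search window
    s_tmp = list(frames[0])      # temporary minimum, restarted every `window` frames
    p_speech = [0.0] * n_bins    # smoothed speech-presence probability
    noise = list(frames[0])      # assumption: the first frame is noise only
    history = [list(noise)]

    for l, periodogram in enumerate(frames[1:], start=1):
        for k in range(n_bins):
            y2 = periodogram[k]
            # Recursive averaging of the noisy power in time.
            smoothed[k] = alpha_s * smoothed[k] + (1 - alpha_s) * y2
            # Minimum tracking over a sliding search window.
            if l % window == 0:
                s_min[k] = min(s_tmp[k], smoothed[k])
                s_tmp[k] = smoothed[k]
            else:
                s_min[k] = min(s_min[k], smoothed[k])
                s_tmp[k] = min(s_tmp[k], smoothed[k])
            # Speech is declared present when the power is far above the local minimum.
            indicator = 1.0 if smoothed[k] > delta * s_min[k] else 0.0
            p_speech[k] = alpha_p * p_speech[k] + (1 - alpha_p) * indicator
            # The more likely speech is, the slower the noise estimate is updated.
            alpha_tilde = alpha_d + (1 - alpha_d) * p_speech[k]
            noise[k] = alpha_tilde * noise[k] + (1 - alpha_tilde) * y2
        history.append(list(noise))
    return history
```

In this sketch the estimate is nearly frozen while a bin is judged to contain speech and follows the periodogram with smoothing factor `alpha_d` otherwise. A speech segment much longer than the search window would raise the tracked minimum, which is the usual motivation for refinements such as IMCRA.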

Metadata
Title
MCRA noise estimation for KLT-VRE-based speech enhancement
Authors
Adda Saadoune
Abderrahmane Amrouche
Sid Ahmed Selouani
Publication date
1 September 2013
Publisher
Springer US
Published in
International Journal of Speech Technology / Issue 3/2013
Print ISSN: 1381-2416
Electronic ISSN: 1572-8110
DOI
https://doi.org/10.1007/s10772-012-9186-9
