International Journal of Speech Technology

Published: 1 September 2013

MCRA noise estimation for KLT-VRE-based speech enhancement

Authors: Adda Saadoune, Abderrahmane Amrouche, Sid Ahmed Selouani

Published in: International Journal of Speech Technology | Issue 3/2013

Abstract

A new signal subspace-based approach is proposed for the enhancement of speech corrupted by a high level of noise. Conventional subspace-based methods use the minimum mean square error criterion to optimize the Karhunen-Loève Transform (KLT). In non-stationary noisy environments, selecting the optimal order of the KLT-based speech enhancement model is a critical issue, because the estimate of the relevant subspace dimensions depends on environmental conditions that may change unpredictably. A drastic KLT-based dimension reduction may discard relevant components of speech; conversely, a reconstruction using a higher KLT model order will fail to remove the noise effectively. The method presented in this paper uses a Variance of Reconstruction Error (VRE) criterion to select the KLT model order optimally. A key feature of this subspace method is that it incorporates Minima Controlled Recursive Averaging (MCRA) to estimate the noise Power Spectral Density (PSD) used in the gain function. Three variants combining the VRE with MCRA methods are implemented and compared, namely VRE-MCRA, VRE-MCRA2 and VRE-IMCRA. Objective measures show that the VRE-based approaches achieve lower signal distortion and higher noise reduction than existing enhancement methods.
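The abstract does not give the MCRA update equations, so the following is only a minimal sketch of the general MCRA idea as formulated by Cohen and Berdugo (IEEE Signal Processing Letters, 2002). It is not the authors' implementation. It tracks the minimum of a smoothed noisy power spectrum, derives a per-bin speech-presence probability from the ratio to that minimum, and slows the noise update where speech is likely present. The function name `mcra_noise_psd`, the plain-list input format, and all default parameter values are assumptions chosen for illustration. Frequency smoothing is omitted for brevity.

```python
def mcra_noise_psd(power_frames, alpha_s=0.8, alpha_d=0.95, alpha_p=0.2,
                   delta=5.0, window_len=125):
    """Simplified MCRA-style noise PSD tracker (illustrative sketch only).

    power_frames: list of frames, each a list of |Y(k, l)|^2 values per bin.
    Returns one noise PSD estimate (list of floats) per input frame.
    All default parameter values are assumptions, not taken from the paper.
    """
    n_bins = len(power_frames[0])
    first = list(power_frames[0])
    smoothed = first[:]          # S(k, l): time-smoothed noisy power
    s_min = first[:]             # running minimum of S within the search window
    s_tmp = first[:]             # auxiliary minimum used when the window restarts
    p_speech = [0.0] * n_bins    # smoothed speech-presence probability
    noise = first[:]             # noise PSD estimate, seeded with the first frame
    estimates = [noise[:]]

    for l, frame in enumerate(power_frames[1:], start=1):
        for k in range(n_bins):
            # First-order recursive smoothing of the noisy power over time.
            smoothed[k] = alpha_s * smoothed[k] + (1.0 - alpha_s) * frame[k]

            # Minimum tracking; the search window restarts every window_len frames.
            if l % window_len == 0:
                s_min[k] = min(s_tmp[k], smoothed[k])
                s_tmp[k] = smoothed[k]
            else:
                s_min[k] = min(s_min[k], smoothed[k])
                s_tmp[k] = min(s_tmp[k], smoothed[k])

            # Hard speech-presence decision from the ratio to the local minimum.
            ratio = smoothed[k] / max(s_min[k], 1e-12)
            indicator = 1.0 if ratio > delta else 0.0
            p_speech[k] = alpha_p * p_speech[k] + (1.0 - alpha_p) * indicator

            # Time-varying smoothing factor: approaches 1 when speech is likely,
            # which nearly freezes the noise estimate in that bin.
            alpha_tilde = alpha_d + (1.0 - alpha_d) * p_speech[k]
            noise[k] = alpha_tilde * noise[k] + (1.0 - alpha_tilde) * frame[k]
        estimates.append(noise[:])
    return estimates
```

As a quick check of the intended behaviour, feeding the function 50 frames of constant power followed by a short high-power burst in one bin should leave the noise estimate in that bin close to its pre-burst level, whereas plain recursive averaging with the same `alpha_d` would be pulled strongly toward the burst.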

Title: MCRA noise estimation for KLT-VRE-based speech enhancement
Authors: Adda Saadoune, Abderrahmane Amrouche, Sid Ahmed Selouani
Publication date: 1 September 2013
Publisher: Springer US
Published in: International Journal of Speech Technology, Issue 3/2013
Print ISSN: 1381-2416
Electronic ISSN: 1572-8110
DOI: https://doi.org/10.1007/s10772-012-9186-9
