Published in: International Journal of Speech Technology 4/2015

14.08.2015

Ideal binary masking for reducing convolutive noise

Authors: Nasir Saleem, Ehtasham Mustafa, Aamir Nawaz, Adnan Khan


Abstract

It is important to know the degree to which convolutive noise disrupts the perceptual quality and intelligibility of speech. This paper presents an ideal binary masking criterion for reducing convolutive noise (reverberation) and improving the quality and intelligibility of speech. The noise is suppressed using ideal binary time–frequency (T–F) masking based on the signal-to-reverberation ratio (SRR) of individual T–F channels. All T–F channels with an SRR greater than a preselected threshold are retained, while the others are eliminated. The performance of the algorithm is evaluated using IEEE sentences corrupted by reverberation with reverberation times (RT60) ranging from 0.3 to 2.0 s. The results indicate that the intelligibility and perceptual quality of speech decrease as the reverberation time increases. Additional analyses indicate that ideal binary masking reduces the temporal envelope spreading introduced by reverberation. The algorithm is evaluated with perceptual evaluation of speech quality (PESQ), SNRLOSS, log-likelihood ratio (LLR), and frequency-weighted segmental signal-to-noise ratio.
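The masking rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the per-channel energies of the target speech and of the reverberant component are already available (which is what makes the mask "ideal"), it assumes the SRR is expressed in dB as 10·log10 of their ratio, and the function names, the 0 dB threshold, and the toy energies are all hypothetical choices made for illustration.

```python
import math


def srr_db(target_energy, reverb_energy, eps=1e-12):
    """Signal-to-reverberation ratio of one T-F channel, in dB (assumed definition)."""
    return 10.0 * math.log10((target_energy + eps) / (reverb_energy + eps))


def ideal_binary_mask(target, reverb, threshold_db=0.0):
    """Return a 0/1 mask: 1 where the SRR exceeds threshold_db, else 0.

    target and reverb are equal-shaped lists of rows (frames x channels)
    holding per-channel energies. The threshold is a free parameter here.
    """
    return [
        [1 if srr_db(t, r) > threshold_db else 0 for t, r in zip(t_row, r_row)]
        for t_row, r_row in zip(target, reverb)
    ]


def apply_mask(mixture, mask):
    """Retain mixture T-F channels where the mask is 1 and zero the others."""
    return [
        [x * m for x, m in zip(x_row, m_row)]
        for x_row, m_row in zip(mixture, mask)
    ]


# Toy example: 2 frames x 3 channels of made-up energies.
target = [[4.0, 0.8, 0.1], [2.0, 0.5, 3.0]]
reverb = [[1.0, 1.0, 1.0], [4.0, 0.1, 1.0]]
mixture = [[5.0, 1.8, 1.1], [6.0, 0.6, 4.0]]

mask = ideal_binary_mask(target, reverb, threshold_db=0.0)
masked = apply_mask(mixture, mask)
print(mask)    # [[1, 0, 0], [0, 1, 1]]
print(masked)  # [[5.0, 0.0, 0.0], [0.0, 0.6, 4.0]]
```

In practice the energies would come from a time–frequency decomposition of the signals, and the masked representation would be resynthesized into a waveform; both steps are omitted here.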


Metadata
Title
Ideal binary masking for reducing convolutive noise
Authors
Nasir Saleem
Ehtasham Mustafa
Aamir Nawaz
Adnan Khan
Publication date
14.08.2015
Publisher
Springer US
Published in
International Journal of Speech Technology, Issue 4/2015
Print ISSN: 1381-2416
Electronic ISSN: 1572-8110
DOI
https://doi.org/10.1007/s10772-015-9298-0
