Publication date: 19-11-2016

Single channel noise reduction system in low SNR

Author: Nasir Saleem

Published in: International Journal of Speech Technology | Issue 1/2017

Abstract

We propose a two-stage noise reduction system, based on Wiener filtering and ideal binary masking, for reducing background noise in single-microphone recordings at very low signal-to-noise ratio (SNR). In the first stage, Wiener filtering with an improved a priori SNR estimate is applied to the noisy speech to reduce background noise. In the second stage, the ideal binary mask is estimated at every time–frequency channel from the pre-processed first-stage speech by comparing each channel against a pre-selected threshold T, which reduces the residual noise. Time–frequency channels that satisfy the threshold are preserved, whereas all others are attenuated. The results show substantial improvements in speech intelligibility and quality over traditional noise reduction algorithms and over unprocessed speech.
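The two stages described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the author's implementation, and several details are assumptions made only for illustration: the a priori SNR is estimated with the standard decision-directed rule (the paper's "improved" estimator is not specified in the abstract), the noise power per frequency bin is taken as known, the mask criterion compares the local SNR of the first-stage output with the threshold, and the names `wiener_gains`, `two_stage_gains`, `alpha`, `gain_floor`, `threshold_db` (standing in for T) and `attenuation` are hypothetical.

```python
import math


def wiener_gains(noisy_power, noise_power, alpha=0.98, gain_floor=1e-3):
    """Stage 1 (assumed form): Wiener gains from a decision-directed a priori SNR.

    noisy_power: list of frames, each a list of per-bin noisy power values.
    noise_power: list of per-bin noise power estimates (assumed known, > 0).
    """
    prev_clean = [0.0] * len(noise_power)  # clean-power estimate of the previous frame
    gains = []
    for frame in noisy_power:
        frame_gains = []
        for k, (y, n) in enumerate(zip(frame, noise_power)):
            post_snr = y / n  # a posteriori SNR
            prior_snr = (alpha * prev_clean[k] / n
                         + (1.0 - alpha) * max(post_snr - 1.0, 0.0))
            g = max(prior_snr / (1.0 + prior_snr), gain_floor)  # Wiener gain, floored
            prev_clean[k] = g * g * y  # feeds the next frame's a priori SNR
            frame_gains.append(g)
        gains.append(frame_gains)
    return gains


def two_stage_gains(noisy_power, noise_power, threshold_db=-5.0, attenuation=0.1):
    """Stage 2 (assumed criterion): binary mask estimated from the stage-1 output.

    A time-frequency channel is kept when its stage-1 local SNR exceeds
    threshold_db; otherwise its gain is scaled down by `attenuation`.
    """
    stage1 = wiener_gains(noisy_power, noise_power)
    gains, mask = [], []
    for frame, frame_gains in zip(noisy_power, stage1):
        gain_row, mask_row = [], []
        for y, n, g in zip(frame, noise_power, frame_gains):
            enhanced = max(g * g * y, 1e-12)  # stage-1 power, guarded against log(0)
            keep = 10.0 * math.log10(enhanced / n) > threshold_db
            mask_row.append(keep)
            gain_row.append(g if keep else g * attenuation)
        gains.append(gain_row)
        mask.append(mask_row)
    return gains, mask
```

In use, the returned gains would be multiplied element-wise with the short-time spectrum of the noisy speech before resynthesis. The default threshold and attenuation values here are placeholders, not values taken from the paper.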

Title: Single channel noise reduction system in low SNR
Author: Nasir Saleem
Publication date: 19-11-2016
Publisher: Springer US
Published in: International Journal of Speech Technology / Issue 1/2017
Print ISSN: 1381-2416
Electronic ISSN: 1572-8110
DOI: https://doi.org/10.1007/s10772-016-9391-z
