Published in: International Journal of Speech Technology 1/2019

22 November 2018

Evaluating noise suppression methods for recovering the Lombard speech from vocal output in an external noise field

Authors: Ghazaleh Vaziri, Christian Giguère, Hilmi R. Dajani

Abstract

Speech production is affected by noise due to the Lombard effect. The traditional method of investigation delivers the noise through headphones so that speech can be recorded in quiet, but this can create an occlusion-effect artefact during speech production. It is also not directly applicable when the talker wears hearing protectors, hearing aids, or other devices, because the headphones physically interfere with them. In these situations, the Lombard effect needs to be elicited by an external noise field and the speech recorded in the presence of noise. This is a more challenging measurement situation, but one that preserves the talker's perception of their own voice and of the surrounding noise in interaction with the hearing device worn. Two methods, direct waveform subtraction and adaptive noise cancellation, were evaluated for suppressing the background noise in the recorded speech. The effect of the sound recording configuration on performance was investigated for two microphone types (omnidirectional and directional) at two distances (50 and 25 cm), in different noises and in the presence of real talker movement. Results show that the amount of noise reduction with both suppression methods is greater for fluctuating than for continuous noises. Overall, the best recording configuration for noise reduction was the omnidirectional microphone at 25 cm. Pitch extraction, energy level, and objective speech intelligibility and quality measures show that both suppression methods provide adequate noise reduction for SNRs as low as −10 dB, which is sufficient to recover Lombard speech produced in an external noise field, both with open ears and when wearing hearing protectors.
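
The abstract names the two suppression methods but gives no implementation details. The following is only a minimal sketch of what each technique generally looks like, not the authors' procedure. Everything in it is an assumption made for illustration: the synthetic 200 Hz tone standing in for speech, the three-tap `room` response, the choice of NLMS as the adaptive algorithm, the filter length and step size, and the helper names `direct_subtraction`, `nlms_cancel`, `snr_db` and `demo`.

```python
import numpy as np

def direct_subtraction(noisy, noise_only):
    """Subtract a time-aligned noise-only recording, sample by sample."""
    n = min(len(noisy), len(noise_only))
    return noisy[:n] - noise_only[:n]

def nlms_cancel(noisy, reference, taps=16, mu=0.05, eps=1e-8):
    """Adaptive noise canceller using NLMS (assumed algorithm).

    The filter learns to predict the noise at the microphone from the
    reference noise signal; the prediction error is the speech estimate.
    """
    w = np.zeros(taps)      # adaptive filter weights
    buf = np.zeros(taps)    # most recent reference samples, newest first
    out = np.zeros(len(noisy))
    for i in range(len(noisy)):
        buf = np.roll(buf, 1)
        buf[0] = reference[i]
        e = noisy[i] - w @ buf                  # error = speech estimate
        w += mu * e * buf / (buf @ buf + eps)   # normalized LMS update
        out[i] = e
    return out

def snr_db(clean, estimate):
    """SNR of an estimate relative to the known clean signal, in dB."""
    residual = estimate - clean
    return 10 * np.log10(np.sum(clean**2) / (np.sum(residual**2) + 1e-30))

def demo(seed=0, fs=8000):
    """Synthetic check at roughly -10 dB input SNR (all values hypothetical)."""
    rng = np.random.default_rng(seed)
    t = np.arange(fs) / fs
    speech = 0.4 * np.sin(2 * np.pi * 200 * t)    # stand-in for speech
    reference = rng.standard_normal(fs)           # signal fed to the loudspeaker
    room = np.array([0.8, -0.3, 0.2])             # toy loudspeaker-to-mic path
    noise_at_mic = np.convolve(reference, room)[:fs]
    noisy = speech + noise_at_mic
    sub = direct_subtraction(noisy, noise_at_mic)
    anc = nlms_cancel(noisy, reference)
    half = fs // 2    # score the second half, after the filter has adapted
    return {
        "input": snr_db(speech[half:], noisy[half:]),
        "subtraction": snr_db(speech[half:], sub[half:]),
        "anc": snr_db(speech[half:], anc[half:]),
    }

if __name__ == "__main__":
    for name, value in demo().items():
        print(f"{name}: {value:.1f} dB")
```

In this idealized setting the subtraction is almost perfect because the noise-only waveform is identical to the noise in the mixture. In a real recording the noise playback is never exactly repeatable and the talker moves, which is the kind of degradation the study evaluates across microphone types and distances.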

Metadata
Title
Evaluating noise suppression methods for recovering the Lombard speech from vocal output in an external noise field
Authors
Ghazaleh Vaziri
Christian Giguère
Hilmi R. Dajani
Publication date
22 November 2018
Publisher
Springer US
Published in
International Journal of Speech Technology, Issue 1/2019
Print ISSN: 1381-2416
Electronic ISSN: 1572-8110
DOI
https://doi.org/10.1007/s10772-018-09564-8
