
Evaluating noise suppression methods for recovering the Lombard speech from vocal output in an external noise field

Authors: Ghazaleh Vaziri, Christian Giguère, Hilmi R. Dajani

Published in: International Journal of Speech Technology | Issue 1/2019

Published: 22 November 2018


Abstract

Speech production is affected by noise through the Lombard effect. The traditional method of investigation delivers the noise over headphones so that speech can be recorded in quiet, but this can introduce an occlusion-effect artefact during speech production. It is also not directly applicable when the talker wears hearing protectors, hearing aids, or other devices, because the headphones physically interfere with them. In these situations, the Lombard effect must be elicited by an external noise field and the speech recorded in the presence of noise. This is a more challenging measurement situation, but one that preserves perception of the talker's own voice and of the surrounding noise in interaction with the hearing device worn. Two methods, direct waveform subtraction and adaptive noise cancellation, were evaluated for suppressing the background noise in the recorded speech. The effects of the sound recording configuration on performance were investigated for two microphone types (omnidirectional and directional) at two distances (50 and 25 cm), in different noises, and in the presence of real talker movement. Results show that both suppression methods achieve greater noise reduction for fluctuating than for continuous noises. Overall, the best recording configuration for noise reduction was the omnidirectional microphone at 25 cm. Pitch extraction, energy level, and objective speech intelligibility and quality measures show that both suppression methods provide adequate noise reduction for SNRs as low as −10 dB, which is sufficient to recover Lombard speech produced in an external noise field with open ears and when wearing hearing protectors.
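The two suppression methods named in the abstract can be illustrated with a minimal sketch. The abstract does not describe the authors' implementation, so everything below is an assumption made for illustration: a 150 Hz tone stands in for speech, white noise filtered by a hypothetical 3-tap acoustic path stands in for the external noise field, and the adaptive canceller is a normalized LMS (NLMS) filter with an arbitrarily chosen tap count and step size. The function names are likewise hypothetical.

```python
import numpy as np

def waveform_subtraction(recorded, noise_only):
    """Subtract a time-aligned noise-only recording from the speech-in-noise recording."""
    n = min(len(recorded), len(noise_only))
    return recorded[:n] - noise_only[:n]

def nlms_cancel(primary, reference, taps=16, mu=0.05, eps=1e-8):
    """Normalized LMS noise canceller; the error signal is the speech estimate."""
    w = np.zeros(taps)
    out = np.zeros(len(primary))
    # Zero-pad so the first samples have a full tap vector.
    padded = np.concatenate([np.zeros(taps - 1), reference])
    for n in range(len(primary)):
        x = padded[n:n + taps][::-1]      # most recent reference sample first
        e = primary[n] - w @ x            # remove the current noise estimate
        w += mu * e * x / (x @ x + eps)   # normalized weight update
        out[n] = e
    return out

def snr_db(clean, estimate):
    """SNR of an estimate relative to the known clean signal, in dB."""
    residual = estimate - clean
    return 10 * np.log10(np.sum(clean ** 2) / np.sum(residual ** 2))

# Synthetic stand-ins (assumptions, not data from the paper).
rng = np.random.default_rng(0)
fs = 8000
t = np.arange(fs) / fs
speech = 0.3 * np.sin(2 * np.pi * 150 * t)          # tone standing in for speech
noise_source = rng.standard_normal(fs)              # signal fed to the loudspeaker
path = np.array([0.6, 0.3, 0.1])                    # hypothetical loudspeaker-to-mic path
noise_at_mic = np.convolve(noise_source, path)[:fs]
recorded = speech + noise_at_mic                    # roughly -10 dB SNR

# Direct waveform subtraction: a separate noise-only take, with slight sensor noise.
noise_only_take = noise_at_mic + 0.01 * rng.standard_normal(fs)
subtracted = waveform_subtraction(recorded, noise_only_take)

# Adaptive cancellation: the loudspeaker feed serves as the reference input.
cancelled = nlms_cancel(recorded, noise_source)

tail = slice(2000, None)                            # skip the NLMS convergence period
snr_in = snr_db(speech, recorded)
snr_sub = snr_db(speech, subtracted)
snr_nlms = snr_db(speech[tail], cancelled[tail])
print(f"input {snr_in:.1f} dB, subtraction {snr_sub:.1f} dB, NLMS {snr_nlms:.1f} dB")
```

In this idealized setting the subtraction works well because the noise-only take is perfectly time-aligned and the acoustic path does not change. Talker movement, which the study includes, would violate that assumption, and that is one reason an adaptive filter is a natural alternative to compare against.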



Title: Evaluating noise suppression methods for recovering the Lombard speech from vocal output in an external noise field
Authors: Ghazaleh Vaziri, Christian Giguère, Hilmi R. Dajani
Publication date: 22 November 2018
Publisher: Springer US
Published in: International Journal of Speech Technology / Issue 1/2019
Print ISSN: 1381-2416
Electronic ISSN: 1572-8110
DOI: https://doi.org/10.1007/s10772-018-09564-8
