2017 | Original Paper | Book Chapter

An Algorithm for Detection of Breath Sounds in Spontaneous Speech with Application to Speaker Recognition

Authors: Sri Harsha Dumpala, K. N. R. K. Raju Alluri

Published in: Speech and Computer

Publisher: Springer International Publishing

Abstract

Automatic detection and demarcation of non-speech sounds in speech is critical for developing sophisticated human-machine interaction systems. The main objective of this study is to develop acoustic features that capture the production differences between speech and breath sounds in terms of both excitation-source and vocal-tract-system characteristics. Using these features, a rule-based algorithm is proposed for automatic detection of breath sounds in spontaneous speech. The proposed algorithm outperforms previous methods for this task. Further, the importance of breath detection for speaker recognition is analyzed using an i-vector-based speaker recognition system. Experimental results show that detecting breath sounds prior to i-vector extraction is essential to nullify the effect of breath sounds in test samples, which would otherwise degrade the performance of i-vector-based speaker recognition systems.

Title: An Algorithm for Detection of Breath Sounds in Spontaneous Speech with Application to Speaker Recognition
Authors: Sri Harsha Dumpala, K. N. R. K. Raju Alluri
Publisher: Springer International Publishing
Book: Speech and Computer
Print ISBN: 978-3-319-66428-6
Electronic ISBN: 978-3-319-66429-3
Copyright year: 2017
DOI: https://doi.org/10.1007/978-3-319-66429-3_9
