nach oben

International Journal of Speech Technology

Erschienen in:

01.06.2013

Vowel onset point detection for noisy speech using spectral energy at formant frequencies

verfasst von: Anil Kumar Vuppala, K. Sreenivasa Rao

Erschienen in: International Journal of Speech Technology | Ausgabe 2/2013

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config

KI-gestützte Suche

Aus

Abstract

In this paper, we propose a method for robust detection of the vowel onset points (VOPs) from noisy speech. The proposed VOP detection method exploits the spectral energy at formant frequencies of the speech segments present in glottal closure region. In this work, formants are extracted by using group delay function, and glottal closure instants are extracted by using zero frequency filter based method. Performance of the proposed VOP detection method is compared with the existing method, which uses the combination of evidence from excitation source, spectral peaks energy and modulation spectrum. Speech data from TIMIT database and noise samples from NOISEX database are used for analyzing the performance of the VOP detection methods. Significant improvement in the performance of VOP detection is observed by using proposed method compared to existing method.

Vorheriger Artikel Robust emotional speech classification in the presence of babble noise

Nächster Artikel Expressive speech synthesis: a review

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

über 102.000 Bücher
über 537 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Finance + Banking
Management + Führung
Marketing + Vertrieb
Maschinenbau + Werkstoffe
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 390 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Maschinenbau + Werkstoffe

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 340 Zeitschriften

aus folgenden Fachgebieten:

Bauwesen + Immobilien
Business IT + Informatik
Finance + Banking
Management + Führung
Marketing + Vertrieb
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Gangashetty, S. V., Sekhar, C. C., & Yegnanarayana, B. (2004a). Detection of vowel onset points in continuous speech using autoassociative neural network models. In Proc. int. conf. spoken language processing (pp. 401–410).

Gangashetty, S. V., Sekhar, C. C., & Yegnanarayana, B. (2004b). Extraction of fixed dimension patterns from varying duration segments of consonant-vowel utterances. In Proc. of IEEE ICISIP (pp. 159–164).

Hermes, D. J. (1990). Vowel onset detection. The Journal of the Acoustical Society of America, 87, 866–873. CrossRef

Joseph, M. A., Guruprasad, S., & Yegnanarayana, B. (2006). Extracting formants from short segments of speech using group delay functions. In Proc. of interspeech (pp. 1009–1012).

Murty, K. S. R., & Yegnanarayana, B. (2008). Epoch extraction from speech signals. IEEE Transactions on Audio, Speech, and Language Processing, 16(8), 1602–1613. CrossRef

Prasanna, S. R. M., & Yegnanarayana, B. (2005). Detection of vowel onset point events using excitation source information. In Proc. of interspeech (pp. 1133–1136).

Prasanna, S. R. M., Gangashetty, S. V., & Yegnanarayana, B. (2001). Significance of vowel onset point for speech analysis. In Proc. of int. conf. signal processing and communications (pp. 81–88).

Prasanna, S. R. M., Reddy, B. V. S., & Krishnamoorthy, P. (2009). Vowel onset point detection using source, spectral peaks, and modulation spectrum energies. IEEE Transactions on Audio, Speech, and Language Processing, 17(4), 556–565. CrossRef

Rao, K. S., & Yegnanarayana, B. (2009). Duration modification using glottal closure instants and vowel onset points. Speech Communication, 51, 1263–1269. CrossRef

Vuppala, A. K., Rao, K. S., Chakrabarti, S., Krishnamoorthy, P., & Prasanna, S. R. M. (2011). Recognition of consonant-vowel (cv) units under background noise using combined temporal and spectral preprocessing. International Journal of Speech Technology, 14(1).

Vuppala, A. K., Rao, K. S., & Chakrabarti, S. (2012a). Improved consonant–vowel recognition for low bit-rate coded speech. Wiley International Journal of Adaptive Control and Signal Processing, 26(4), 333–349. CrossRef

Vuppala, A. K., Rao, K. S., & Chakrabarti, S. (2012b). Spotting and recognition of consonant-vowel units from continuous speech using accurate vowel onset points. Circuits, Systems, and Signal Processing, 31(4), 1459–1474. CrossRef

Wang, J.-H., & Chen, S.-H. (1999). A c/v segmentation algorithm for mandarin speech using wavelet transforms. In Proc. IEEE int. conf. acoust., speech, signal processing (pp. 1261–1264).

Wang, J.-F., Wu, C. H., Chang, S. H., & Lee, J. Y. (1991). A hierarchical neural network based C/V segmentation algorithm for mandarin speech recognition. IEEE Transactions on Signal Processing, 39(9), 2141–2146. CrossRef

Titel: Vowel onset point detection for noisy speech using spectral energy at formant frequencies
verfasst von: Anil Kumar Vuppala
K. Sreenivasa Rao
Publikationsdatum: 01.06.2013
Verlag: Springer US
Erschienen in: International Journal of Speech Technology / Ausgabe 2/2013
Print ISSN: 1381-2416
Elektronische ISSN: 1572-8110
DOI: https://doi.org/10.1007/s10772-012-9179-8

Springer Professional

Abstract

Bitte loggen Sie sich ein, um Zugang zu Ihrer Lizenz zu erhalten.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Springer Professional "Technik"

Springer Professional "Wirtschaft"

Weitere Artikel der Ausgabe 2/2013

The optimized wavelet filters for speech compression

Gender-dependent emotion recognition based on HMMs and SPHMMs

Characterization and recognition of emotions from speech using excitation source information

Expressive speech synthesis: a review

Emotion recognition from speech using global and local prosodic features

Robust emotional speech classification in the presence of babble noise