2008 | OriginalPaper | Book chapter

11. Formant Estimation and Tracking

Written by: Douglas OʼShaughnessy, Prof.

Published in: Springer Handbook of Speech Processing

Publisher: Springer Berlin Heidelberg

Abstract

This chapter deals with estimating and tracking the movements of the spectral resonances of the human vocal tract, known as formants. Representing or modeling speech in terms of formants is useful in several areas of speech processing (coding, recognition, synthesis, and enhancement) because formants describe essential aspects of speech efficiently with a very limited set of parameters. However, estimating formants is more difficult than simply searching for peaks in an amplitude spectrum, because the spectral peaks of vocal-tract output depend in complicated ways on a variety of factors: vocal-tract shape, excitation, and periodicity. We describe the formal task of formant tracking in detail, explore its successes and difficulties, and give reasons for the various approaches.
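To make the estimation task concrete, here is a minimal Python sketch of one common textbook approach: fit an all-pole (linear prediction) model to a frame and read formant candidates from the pole angles. It is not presented as the chapter's own method. The function names, sample rate, formant values, bandwidths, and model order are all illustrative assumptions, and the test signal is an idealized impulse response of a known all-pole filter, which is far easier than real speech.

```python
import numpy as np

def resonator_coeffs(formants_hz, bandwidths_hz, fs):
    """Denominator polynomial of a cascade of two-pole resonators (hypothetical helper)."""
    a = np.array([1.0])
    for f, b in zip(formants_hz, bandwidths_hz):
        r = np.exp(-np.pi * b / fs)        # pole radius from bandwidth
        theta = 2.0 * np.pi * f / fs       # pole angle from center frequency
        a = np.convolve(a, [1.0, -2.0 * r * np.cos(theta), r * r])
    return a

def all_pole_filter(a, x):
    """Direct recursion y[n] = x[n] - sum_{k>=1} a[k] * y[n-k]."""
    y = np.zeros(len(x))
    for n in range(len(x)):
        acc = x[n]
        for k in range(1, min(len(a), n + 1)):
            acc -= a[k] * y[n - k]
        y[n] = acc
    return y

def lpc_autocorr(frame, order):
    """Linear prediction coefficients via the autocorrelation normal equations."""
    n = len(frame)
    r = np.correlate(frame, frame, mode="full")[n - 1 : n + order]  # lags 0..order
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    alpha = np.linalg.solve(R, -r[1 : order + 1])
    return np.concatenate(([1.0], alpha))

def formants_from_lpc(a, fs, min_freq_hz=90.0, max_bw_hz=400.0):
    """Convert LPC poles to (frequency, bandwidth) candidates, sorted by frequency."""
    roots = np.roots(a)
    roots = roots[np.imag(roots) > 0]                 # keep one pole per conjugate pair
    freqs = np.angle(roots) * fs / (2.0 * np.pi)
    bws = -np.log(np.abs(roots)) * fs / np.pi
    keep = (freqs > min_freq_hz) & (bws < max_bw_hz)  # assumed plausibility thresholds
    idx = np.argsort(freqs[keep])
    return freqs[keep][idx], bws[keep][idx]

# Idealized test: impulse response of a known three-resonance filter (assumed values).
fs = 10000
true_f = [700.0, 1200.0, 2600.0]
true_bw = [80.0, 90.0, 120.0]
impulse = np.zeros(2048)
impulse[0] = 1.0
frame = all_pole_filter(resonator_coeffs(true_f, true_bw, fs), impulse)
est_f, est_bw = formants_from_lpc(lpc_autocorr(frame, order=6), fs)
print(np.round(est_f, 1), np.round(est_bw, 1))
```

In this idealized setting the model order matches the true filter and the excitation is a single impulse, so the pole frequencies line up with the resonances. With periodic or noisy excitation, a mismatched model order, or closely spaced resonances, the same procedure can return spurious or merged candidates, which is the kind of difficulty the abstract points to.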

Metadata
Title
Formant Estimation and Tracking
Written by
Douglas OʼShaughnessy, Prof.
Copyright year
2008
Publisher
Springer Berlin Heidelberg
DOI
https://doi.org/10.1007/978-3-540-49127-9_11
