2017 | OriginalPaper | Book chapter

Adaptation Approaches for Pronunciation Scoring with Sparse Training Data

Written by: Federico Landini, Luciana Ferrer, Horacio Franco

Published in: Speech and Computer

Publisher: Springer International Publishing

Abstract

In Computer Assisted Language Learning systems, pronunciation scoring consists of providing a score that grades the overall pronunciation quality of the speech uttered by a student. In this work, a log-likelihood ratio obtained with respect to two automatic speech recognition (ASR) models was used as the score. One model represents native pronunciation while the other captures non-native pronunciation. Different approaches to obtaining each model and different amounts of training data were analyzed. The best results were obtained by training an ASR system on a separate large corpus without pronunciation quality annotations and then adapting it to the native and non-native data, sequentially. Nevertheless, when models are trained directly on the native and non-native data, pronunciation scoring performance is similar. This is a surprising result considering that word error rates for these models are significantly worse, indicating that ASR performance is not a good predictor of pronunciation scoring performance for this system.
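The log-likelihood ratio described in the abstract can be illustrated with a minimal sketch. It assumes that each of the two models has already produced one log-likelihood per speech frame for the same utterance, and that the score is averaged over frames. The abstract specifies neither the normalization nor the interface, so the function name `llr_score` and its inputs are hypothetical and do not reproduce the authors' implementation.

```python
from typing import Sequence


def llr_score(native_loglik: Sequence[float],
              nonnative_loglik: Sequence[float]) -> float:
    """Frame-averaged log-likelihood ratio between two models.

    Assumption: inputs are per-frame log-likelihoods of the same
    utterance under a native and a non-native model. Averaging over
    frames is an illustrative choice, not taken from the paper.
    """
    if len(native_loglik) != len(nonnative_loglik):
        raise ValueError("both models must score the same frames")
    if not native_loglik:
        raise ValueError("at least one frame is required")
    # The difference of log-likelihoods is the log of the likelihood ratio.
    total = sum(n - m for n, m in zip(native_loglik, nonnative_loglik))
    return total / len(native_loglik)
```

Under these assumptions, a positive score means the native model explains the utterance better than the non-native model, and a negative score means the opposite.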



Title: Adaptation Approaches for Pronunciation Scoring with Sparse Training Data
Written by: Federico Landini, Luciana Ferrer, Horacio Franco
Publisher: Springer International Publishing
Book: Speech and Computer
Print ISBN: 978-3-319-66428-6
Electronic ISBN: 978-3-319-66429-3
Copyright year: 2017
DOI: https://doi.org/10.1007/978-3-319-66429-3_8
