2015 | OriginalPaper | Book chapter

Automatic Phoneme Border Detection to Improve Speech Recognition

Authors: Suárez-Guerra Sergio, Juárez-Murillo Cristian-Remington, Oropeza-Rodríguez José Luis

Published in: Advances in Artificial Intelligence and Soft Computing

Publisher: Springer International Publishing

Abstract

This chapter presents a comparative study of speech recognition performance between systems trained with manually labeled corpora and systems trained with semiautomatically labeled corpora. An automatic labeling system was designed to generate phoneme label files for all words in the corpus used to train an automatic speech recognition system. Speech recognition experiments were performed on the same corpus, training first with manually generated labels and then with automatically generated labels. Results show that recognition performance is better when the selected dictionary is trained with automatic label files than when it is trained with manual label files. Automatic labeling of speech corpora is not only faster than manual labeling but also free from the subjectivity inherent in manual segmentation performed by specialists. The recognition performance achieved in this work is greater than 96%.

Title: Automatic Phoneme Border Detection to Improve Speech Recognition
Authors: Suárez-Guerra Sergio, Juárez-Murillo Cristian-Remington, Oropeza-Rodríguez José Luis
Publisher: Springer International Publishing
Book: Advances in Artificial Intelligence and Soft Computing
Print ISBN: 978-3-319-27059-3
Electronic ISBN: 978-3-319-27060-9
Copyright year: 2015
DOI: https://doi.org/10.1007/978-3-319-27060-9_11
