Skip to main content

2013 | OriginalPaper | Buchkapitel

Edit Distance Comparison Confidence Measure for Speech Recognition

verfasst von : Dawid Skurzok, Bartosz Ziółko

Erschienen in: Multimedia and Ubiquitous Engineering

Verlag: Springer Netherlands

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

A new possible confidence measure for automatic speech recognition is presented along with results of tests where they were applied. A classical method based on comparing the strongest hypotheses with an average of a few next hypotheses was used as a ground truth. Details of our own method based on comparison of edit distances are depicted with results of tests. It was found useful for spoken dialogue system as a module asking to repeat a phrase or declaring that it was not recognised. The method was designed for Polish language, which is morphologically rich.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Literatur
1.
Zurück zum Zitat Guo G, Huang C, Jiang H, Wang RH (2004) A comparative study on various confidence measures in large vocabulary speech recognition. Proceedings of international symposium on Chinese spoken language, pp 9–12 Guo G, Huang C, Jiang H, Wang RH (2004) A comparative study on various confidence measures in large vocabulary speech recognition. Proceedings of international symposium on Chinese spoken language, pp 9–12
2.
Zurück zum Zitat Razik J, Mella O, Fohr D, Haton J (2011) Frame-synchronous and local confidence measures for automatic speech recognition. Int J Pattern Recognit Artif Intell 25:157–182CrossRefMathSciNet Razik J, Mella O, Fohr D, Haton J (2011) Frame-synchronous and local confidence measures for automatic speech recognition. Int J Pattern Recognit Artif Intell 25:157–182CrossRefMathSciNet
3.
Zurück zum Zitat Wessel F, Schluter R, Macherey K, Ney H (2001) Confidence measures for large vocabulary continuous speech recognition. IEEE Trans Speech Audio Proc 9(3):288–298CrossRef Wessel F, Schluter R, Macherey K, Ney H (2001) Confidence measures for large vocabulary continuous speech recognition. IEEE Trans Speech Audio Proc 9(3):288–298CrossRef
4.
Zurück zum Zitat Molina C, Yoma N, Huenupan F, Garreton C, Wuth J (2010) Maximum entropy-based reinforcement learning using a condense measure in speech recognition for telephone speech. IEEE Trans Audio, Speech Lang Proc 18(5):1041–1052 Molina C, Yoma N, Huenupan F, Garreton C, Wuth J (2010) Maximum entropy-based reinforcement learning using a condense measure in speech recognition for telephone speech. IEEE Trans Audio, Speech Lang Proc 18(5):1041–1052
5.
Zurück zum Zitat Ziółko B, Jadczyk T, Skurzok D, Ziółko M (2012) Confidence measure by substring comparison for automatic speech recognition. ICALIP, Shanghai Ziółko B, Jadczyk T, Skurzok D, Ziółko M (2012) Confidence measure by substring comparison for automatic speech recognition. ICALIP, Shanghai
6.
Zurück zum Zitat Zhou L, Shi Y, Sears A (2010) Third-party error detection support mechanisms for dictation speech recognition. Interact Comput 22:375–388CrossRef Zhou L, Shi Y, Sears A (2010) Third-party error detection support mechanisms for dictation speech recognition. Interact Comput 22:375–388CrossRef
7.
Zurück zum Zitat Vogt R, Sridharan S, Mason M (2010) Making confident speaker verification decisions with minimal speech. IEEE Trans Audio Speech Lang Process 18(6):1182–1192CrossRef Vogt R, Sridharan S, Mason M (2010) Making confident speaker verification decisions with minimal speech. IEEE Trans Audio Speech Lang Process 18(6):1182–1192CrossRef
8.
Zurück zum Zitat Huet S, Gravier G, Sebillot P (2010) Morpho-syntactic post-processing of n-best lists for improved French automatic speech recognition. Comput Speech Lang 24:663–684CrossRef Huet S, Gravier G, Sebillot P (2010) Morpho-syntactic post-processing of n-best lists for improved French automatic speech recognition. Comput Speech Lang 24:663–684CrossRef
9.
Zurück zum Zitat Kim W, Hansen J (2010) Phonetic distance based condense measure. IEEE Signal Process Lett 17(2):121–124CrossRef Kim W, Hansen J (2010) Phonetic distance based condense measure. IEEE Signal Process Lett 17(2):121–124CrossRef
10.
Zurück zum Zitat Seigel M, Woodland P (2011) Combining information sources for confidence estimation with crf models. Proceedings of InterSpeech Seigel M, Woodland P (2011) Combining information sources for confidence estimation with crf models. Proceedings of InterSpeech
11.
Zurück zum Zitat Ziółko M, Gałka J, Ziółko B, Jadczyk T, Skurzok D, Mąsior M (2011) Automatic speech recognition system dedicated for Polish. Proceedings of Interspeech, Florence Ziółko M, Gałka J, Ziółko B, Jadczyk T, Skurzok D, Mąsior M (2011) Automatic speech recognition system dedicated for Polish. Proceedings of Interspeech, Florence
12.
Zurück zum Zitat Nouza J, Zdansky J, David P, Cerva P, Kolorenc J, Nejedlova D (2005) Fully automated system for Czech spoken broadcast transcription with very large (300 k+) lexicon. Proceedings of InterSpeech, pp 1681–1684 Nouza J, Zdansky J, David P, Cerva P, Kolorenc J, Nejedlova D (2005) Fully automated system for Czech spoken broadcast transcription with very large (300 k+) lexicon. Proceedings of InterSpeech, pp 1681–1684
13.
Zurück zum Zitat Hirsimaki T, Pylkkonen J, Kurimo M (2009) Importance of high-order n-gram models in morph-based speech recognition. IEEE Trans Audio Speech Lang Process 17(4):724–732CrossRef Hirsimaki T, Pylkkonen J, Kurimo M (2009) Importance of high-order n-gram models in morph-based speech recognition. IEEE Trans Audio Speech Lang Process 17(4):724–732CrossRef
14.
Zurück zum Zitat Levenshtein VI (1966) Binary codes capable of correcting deletions, insertions, and reversals. Soviet Phys Doklady 10:707–710MathSciNet Levenshtein VI (1966) Binary codes capable of correcting deletions, insertions, and reversals. Soviet Phys Doklady 10:707–710MathSciNet
15.
Zurück zum Zitat Grocholewski S (1998) First database for spoken Polish. Proceedings of international conference on language resources and evaluation, Grenada, pp 1059–1062 Grocholewski S (1998) First database for spoken Polish. Proceedings of international conference on language resources and evaluation, Grenada, pp 1059–1062
Metadaten
Titel
Edit Distance Comparison Confidence Measure for Speech Recognition
verfasst von
Dawid Skurzok
Bartosz Ziółko
Copyright-Jahr
2013
Verlag
Springer Netherlands
DOI
https://doi.org/10.1007/978-94-007-6738-6_19

Neuer Inhalt