This chapter covers state-of-the-art paradigms for all the components of deployed spoken dialog systems. Focusing on the speech recognition and understanding components as well as on dialog management, it discusses the specific requirements of deployed systems: robustness against distorted and unexpected user input, real-time capability, and the need for standardized interfaces.
See Sect. 3.2 for the definition of this metric.
Word error rate is a common performance metric in speech recognition. It is based on the Levenshtein (or edit) distance: the minimum total number of word substitutions, deletions, and insertions needed to align the recognized word string, word by word, with a corresponding reference transcription, divided by the number of tokens in that reference.
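The definition above can be illustrated with a minimal sketch. It assumes whitespace tokenization and exact, case-sensitive word matching; the function name word_error_rate and the example strings are hypothetical and not taken from the chapter.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: tokens are whitespace-separated and compared exactly.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one token")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # substitution, or match if cost is 0
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One deleted word out of five reference tokens.
    print(word_error_rate("please say motorola or pace",
                          "please say motorola pace"))
```

Because insertions count as errors while the denominator is the reference length, the value can exceed 1 when the recognizer outputs many more words than were spoken.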
See Chap. 4 on how to determine optimal thresholds.
The author has witnessed several cases where a speech recognizer falsely accepted some noise or the like, and it turned out that the accepted entity was coincidentally correct. For example:
S: Depending on the kind of cable box you have, please say either Motorola, Pace, or say other brand.
C: <cough>
S: This was Pace, right?
C: That’s correct.
This approach occasionally misleads callers into assuming that they are talking to a live person.
- Paradigms for Deployed Spoken Dialog Systems
- Springer New York
- Chapter 2