2008 | OriginalPaper | Buchkapitel
Speech Coding and Packet Loss Effects on Speech and Speaker Recognition
verfasst von : Laurent Besacier
Erschienen in: Automatic Speech Recognition on Mobile Devices and over Communication Networks
Verlag: Springer London
Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.
Wählen Sie Textabschnitte aus um mit Künstlicher Intelligenz passenden Patente zu finden. powered by
Markieren Sie Textabschnitte, um KI-gestützt weitere passende Inhalte zu finden. powered by
This chapter is related to the speech coding and packet loss problems that occur in network speech recognition where speech is transmitted (and most of the time coded) from a client terminal to a recognition server. The first part describes some commonly used speech coding standards and presents a packet loss model useful to evaluate different channel degradation conditions in a controlled fashion. The second part evaluates the influence of different speech and audio codecs on the performance of a continuous speech recognition engine. It is shown that MPEG transcoding degrades the speech recognition performance for low bit rates whereas performance remains acceptable for specialized speech coders like G723. The same system is also evaluated for different simulated and real packet loss conditions; in that case, the significant degradation of the automatic speech recognition (ASR) performance is analyzed. The third part presents an overview of joint compression and packet loss effects on speech biometrics. Conversely to the ASR task, it is experimentally demonstrated that the adverse effects of packet loss alone are negligible, while the encoding of speech, particularly at a low bit rate, coupled with packet loss, can reduce the speaker recognition accuracy considerably. The fourth part discusses these experimental observations and refers to robustness approaches.