2015 | Book

Audiovisual Quality Assessment and Prediction for Videotelephony

About this book

The work presented in this book focuses on modeling audiovisual quality as perceived by the users of IP-based solutions for video communication like videotelephony. It also extends the current framework for the parametric prediction of audiovisual call quality. The book addresses several aspects related to the quality perception of entire video calls, namely, the quality estimation of the single audio and video modalities in an interactive context, the audiovisual quality integration of these modalities and the temporal pooling of short sample-based quality scores to account for the perceptual quality impact of time-varying degradations.

Table of Contents

Frontmatter
Chapter 1. Audiovisual Quality for Interactive Communication
Abstract
This chapter introduces the key notions related to the concept of quality and its evaluation for interactive audiovisual services like videotelephony. The focus is on the technological aspects associated with real-time interactive audiovisual services. A thorough description of the entire transmission chain gives a closer look at the main technical factors impacting quality. Means of quality measurement, such as standardized methodologies for subjective assessment, and of quality estimation, such as instrumental methods, are reviewed. Finally, temporal as well as multi-modal aspects of audiovisual perception are presented in order to provide a broad overview of the topic.
Benjamin Belmudez
Chapter 2. Interactive Model Framework
Abstract
Evaluating user perception of audiovisual interactive services like videotelephony in a reliable fashion calls for a well-controlled testing environment and experimental test bed. This chapter introduces the main aspects of the experimental method employed for studying the perception of audiovisual quality for videotelephony. In this work, a dedicated test bed was deployed that is composed of a controlled laboratory environment, a network infrastructure, a videotelephony client and a control unit. Audiovisual material specific to videotelephony (“head-and-shoulders”) was produced following specific conversational scenarios adapted to the evaluation of the interactive quality. This experimental setup was designed to facilitate the investigation of user experience in an interactive experimental context.
Benjamin Belmudez
Chapter 3. Extension of Auditory and Visual Quality Estimation Functions for Videotelephony
Abstract
ITU-T Recommendation G.1070 defines an opinion model for the prediction of the audiovisual conversational quality for videotelephony applications. The model outputs three quality-related values: the audio, the video and the audiovisual quality as perceived by the user. In addition, it takes into account the one-way delay and integrates it to provide a multimedia score that reflects the quality of the overall experience. In this chapter, improvements to the video and audio quality estimation functions are proposed. On the video side, the coefficients of the video quality functions are estimated for several video codecs based on the procedure defined in Annex A of ITU-T Rec. G.1070. Then, the quality impact of two application-related parameters influencing the values of the video coefficients, namely the encoding resolution and the display size, is integrated in the model, hence reducing the coefficients' dependency on input parameters. On the audio side, the quality function is extended to wideband applications by adapting the WB-E-model to comply with the assumptions of the G.1070 model. Finally, the performance of both functions is evaluated on four separate datasets including different video and audio codecs and experimental contexts, i.e. passive listening and viewing, and interactive.
Benjamin Belmudez
Chapter 4. Audiovisual Integration for Call Quality
Abstract
Multi-modal aspects of audiovisual quality assessment for interactive communication services are presented in this chapter. The focus is on how perceived auditory and visual qualities integrate to form an overall audiovisual quality perception for different experimental contexts. In particular, three main aspects are investigated: first, the influence of the experimental context, i.e. the mode of assessment (whether the subjects are placed in a one-way listening and viewing setting, or in a two-way interactive situation of assessment), on the audio, video and audiovisual qualities; second, the effects of cross-modal interactions on the assessment of the audio and video qualities, which are measured for those experimental contexts and compared with the results found in the literature; third, the impact of the conversational scenario on the assessment of the auditory and visual qualities. Finally, audiovisual integration functions are proposed for each mode of assessment and conversational scenario that can be used in a general context.
Benjamin Belmudez
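The audiovisual integration functions mentioned in the Chapter 4 abstract are not given on this page. As a rough illustration only, the sketch below uses a form that is common in the audiovisual quality literature: a linear combination of the audio score, the video score and their product. The function name `predict_av_mos` and all four coefficients are hypothetical placeholders chosen so that the output stays on a 1 to 5 scale; they are not the values fitted in the book.

```python
def predict_av_mos(mos_audio, mos_video, coeffs=(1.0, 0.1, 0.1, 0.12)):
    """Combine audio and video quality scores into one audiovisual score.

    Generic form: a + b*A + c*V + d*A*V, where the product term models
    the interaction between the two modalities.
    The default coefficients are HYPOTHETICAL placeholders, not fitted values.
    """
    a, b, c, d = coeffs
    score = a + b * mos_audio + c * mos_video + d * mos_audio * mos_video
    # Keep the prediction on the 1..5 opinion scale.
    return max(1.0, min(5.0, score))


if __name__ == "__main__":
    # Good audio with poor video, and the reverse.
    print(predict_av_mos(4.5, 2.0))
    print(predict_av_mos(2.0, 4.5))
```

With symmetric placeholder coefficients the two calls return the same value; a fitted model would typically weight the modalities differently depending on the mode of assessment and the conversational scenario.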
Chapter 5. Temporal Integration for Audiovisual Call Quality
Abstract
Evaluating the perceived audiovisual quality of entire video calls requires taking into account the effects of time-varying degradations. Models of time integration have been developed which assess the fluctuating quality of transmitted speech and predict the call-final judgement by weighting momentary quality ratings. This chapter both evaluates the applicability of such models to the audiovisual case and improves them. To that end, a subjective test methodology is developed which allows the evaluation of how users perceive the time-varying quality of 90 s long sequences organized in a simulated conversational structure. These sequences emulate the course of an entire video call and are composed of independent short samples (9 s). Audiovisual impairments are temporally distributed over the long sequences by applying different levels of impairment to the short samples, hence producing different predefined temporal quality profiles. A separate assessment of the perceived quality of the short samples on the one hand and of the long sequences on the other hand allows a comparison of the perceptual effects between speech and audiovisual stimuli. It also allows the evaluation and optimization of existing call quality models that predict the quality at the end of a (simulated) speech conversation. These models proved to enhance the prediction accuracy in comparison to the plain average, and an optimization of the model parameters further improves the correlation of the estimates with the subjective data. The optimized models showed a higher correlation and a lower prediction error on independent test data.
Benjamin Belmudez
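The Chapter 5 abstract contrasts the plain average of short-sample scores with weighted temporal pooling, but this page does not specify the call quality models themselves. The sketch below is therefore only a generic illustration of the idea, assuming a simple linear recency weighting. The function names, the `recency_gain` parameter and the example scores are hypothetical. The profiles use ten samples, matching a 90 s sequence built from 9 s samples.

```python
def plain_average(scores):
    """Baseline: every short sample contributes equally."""
    return sum(scores) / len(scores)


def recency_weighted_average(scores, recency_gain=0.5):
    """Weighted pooling where later samples count more.

    Weights grow linearly from 1.0 (first sample) to 1.0 + recency_gain
    (last sample). This is an ASSUMED weighting for illustration, not one
    of the models evaluated in the book.
    """
    n = len(scores)
    if n == 0:
        raise ValueError("scores must not be empty")
    if n == 1:
        return scores[0]
    weights = [1.0 + recency_gain * i / (n - 1) for i in range(n)]
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


if __name__ == "__main__":
    # Two hypothetical temporal quality profiles with the same mean.
    early_degradation = [2.0] * 3 + [4.0] * 7
    late_degradation = [4.0] * 7 + [2.0] * 3
    for name, profile in [("early", early_degradation), ("late", late_degradation)]:
        print(name, plain_average(profile), recency_weighted_average(profile))
```

The plain average cannot distinguish the two profiles, while the recency-weighted estimate is lower when the degradation occurs near the end of the sequence.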
Chapter 6. Conclusion
Abstract
This book addressed key research questions related to audiovisual quality perception for interactive videotelephony applications. Evaluating the experienced quality of videotelephony, and more generally of interactive video communication systems, requires designing ecologically valid contexts of assessment that reflect realistic use cases. In the case of video calls, conversational tests constitute an adequate assessment situation as they reflect the natural human process of exchanging information between two conversing partners in a mediated way. In this book, the characteristics of audiovisual conversational quality perception for videotelephony services are evaluated in the framework of the ITU-T parametric model G.1070.
Benjamin Belmudez
Backmatter
Metadata
Title
Audiovisual Quality Assessment and Prediction for Videotelephony
Author
Benjamin Belmudez
Copyright year
2015
Electronic ISBN
978-3-319-14166-4
Print ISBN
978-3-319-14165-7
DOI
https://doi.org/10.1007/978-3-319-14166-4
