Abstract
Many distributed multimedia applications rely on video analysis algorithms for automated video and image processing. Little is known, however, about the minimum video quality required for these algorithms to perform accurately. To understand these requirements, we focus on a set of commonly used face analysis algorithms. Using standard datasets and live videos, we conducted experiments demonstrating that the algorithms show almost no decrease in accuracy until the input video is reduced to a certain critical quality, which corresponds to a significantly lower bitrate than the quality commonly acceptable for human vision. Since computer vision perceives video differently from human vision, existing video quality metrics, designed for human perception, cannot be used to reason about the effects of video quality reduction on the accuracy of video analysis algorithms. We therefore investigate two alternative video quality metrics, blockiness and mutual information, and show how they can be used to estimate the critical video qualities for face analysis algorithms.
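To make the mutual information metric concrete, the following is a minimal sketch of the standard Shannon estimate I(X;Y) = sum over (x, y) of p(x,y) log2(p(x,y) / (p(x) p(y))), computed from a joint histogram of an original frame and its degraded copy. The abstract does not specify how the metric is computed, so the function name `mutual_information`, the flat-list frame representation, and the choice of 16 histogram bins are assumptions made for illustration only.

```python
import math
from collections import Counter


def mutual_information(original, degraded, levels=256, bins=16):
    """Estimate I(X;Y) in bits between two equal-length pixel sequences.

    `original` and `degraded` are flat sequences of integer intensities in
    [0, levels). The bin count is an illustrative assumption, not a value
    taken from the paper.
    """
    if len(original) != len(degraded) or not original:
        raise ValueError("inputs must be non-empty and of equal length")

    width = levels // bins  # number of intensity values per histogram bin
    xs = [p // width for p in original]
    ys = [p // width for p in degraded]
    n = len(xs)

    joint = Counter(zip(xs, ys))  # joint histogram of binned intensities
    px = Counter(xs)              # marginal histogram of the original
    py = Counter(ys)              # marginal histogram of the degraded copy

    mi = 0.0
    for (x, y), count in joint.items():
        # p(x,y) * log2(p(x,y) / (p(x) * p(y))), written with raw counts
        mi += (count / n) * math.log2(count * n / (px[x] * py[y]))
    return mi


# A synthetic "frame": one pixel at every intensity from 0 to 255.
frame = list(range(256))
coarse = [(p // 64) * 64 for p in frame]  # heavy quantization to 4 levels
flat = [0] * len(frame)                   # all detail destroyed

mi_identical = mutual_information(frame, frame)
mi_coarse = mutual_information(frame, coarse)
mi_flat = mutual_information(frame, flat)
```

In this toy example the estimate falls as more detail is discarded: the identical copy shares the most information with the original, the coarsely quantized copy shares less, and the flat copy shares none. Relating such a value to a critical quality for a particular face analysis algorithm would require the experiments the abstract describes.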
Index Terms
- Video quality for face detection, recognition, and tracking