ABSTRACT
The media industry is currently being pulled in the often-opposing directions of increased realism (high resolution, stereoscopic, large screen) and personalization (selection and control of content, availability on many devices). We investigate the feasibility of an end-to-end format-agnostic approach to support both these trends. In this paper, different aspects of a format-agnostic capture, production, delivery and rendering system are discussed. At the capture stage, the concept of layered scene representation is introduced, including panoramic video and 3D audio capture. At the analysis stage, a virtual director component is discussed that automatically applies cinematographic principles, using feature tracking and saliency detection. At the delivery stage, resolution-independent audiovisual transport mechanisms for both managed and unmanaged networks are addressed. At the rendering stage, a rendering process is introduced that adapts the audiovisual content to the properties of the connected display and loudspeakers. Finally, different parts of the complete system are revisited to demonstrate the requirements and the potential of this advanced concept.
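As an illustration of the rendering-stage idea of matching content to the connected display, the following minimal sketch selects a crop window from a panoramic frame that has the display's aspect ratio. It is not the system's actual renderer: the names `Crop` and `fit_crop`, the `zoom` parameter, and the clamping policy are assumptions made for this example only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Crop:
    """Hypothetical crop window in panorama pixel coordinates."""
    x: int
    y: int
    width: int
    height: int


def fit_crop(pano_w, pano_h, center_x, center_y, display_w, display_h, zoom=1.0):
    """Return a crop with the display's aspect ratio around a point of interest.

    Assumption: zoom >= 1.0, where 1.0 means the largest window of that
    aspect ratio that still fits inside the panorama.
    """
    if min(pano_w, pano_h, display_w, display_h) <= 0 or zoom < 1.0:
        raise ValueError("sizes must be positive and zoom must be >= 1.0")

    aspect = display_w / display_h
    # Largest window height whose width (height * aspect) fits the panorama.
    max_height = min(pano_h, pano_w / aspect)
    height = max(1, round(max_height / zoom))
    width = min(pano_w, max(1, round(height * aspect)))

    # Centre on the point of interest, then clamp to the panorama borders.
    x = min(max(round(center_x - width / 2), 0), pano_w - width)
    y = min(max(round(center_y - height / 2), 0), pano_h - height)
    return Crop(x, y, width, height)


if __name__ == "__main__":
    # A 7000x2000 panorama shown on a 16:9 display, centred on the frame.
    print(fit_crop(7000, 2000, 3500, 1000, 1920, 1080))
```

The same panorama can be passed through `fit_crop` with different display sizes, so one captured scene yields a suitable view for each device; a virtual director could supply `center_x`, `center_y` and `zoom` instead of fixed values.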
Index Terms
- Towards a format-agnostic approach for production, delivery and rendering of immersive media