
2013 | Book

3D-TV System with Depth-Image-Based Rendering

Architectures, Techniques and Challenges

Edited by: Ce Zhu, Yin Zhao, Lu Yu, Masayuki Tanimoto

Publisher: Springer New York


About this book

Riding on the success of 3D cinema blockbusters and advances in stereoscopic display technology, 3D video applications have gathered momentum in recent years. 3D-TV System with Depth-Image-Based Rendering: Architectures, Techniques and Challenges surveys depth-image-based 3D-TV systems, which are expected to enter practical use in the near future. Depth-image-based rendering (DIBR) significantly enhances the 3D visual experience compared to the stereoscopic systems currently in use. DIBR techniques make it possible to generate additional viewpoints using 3D warping, to adjust the perceived depth of stereoscopic videos, and to serve auto-stereoscopic displays that do not require glasses for viewing the 3D image.

The material includes a technical review and literature survey of components and complete systems, solutions for technical issues, and implementation of prototypes. The book is organized into four sections: System Overview, Content Generation, Data Compression and Transmission, and 3D Visualization and Quality Assessment. This book will benefit researchers, developers, engineers, and innovators, as well as advanced undergraduate and graduate students working in relevant areas.

Table of contents

Frontmatter

System Overview

Chapter 1. An Overview of 3D-TV System Using Depth-Image-Based Rendering
Abstract
The depth-based 3D system is considered a strong candidate for second-generation 3D-TV, following the first-generation stereoscopic 3D-TV. The data formats involve one or several pairs of coupled texture images and depth maps, often known as image-plus-depth (2D + Z), multi-view video plus depth (MVD), and layered depth video (LDV). With the depth information, novel views at arbitrary viewpoints can be synthesized with a depth-image-based rendering (DIBR) technique. In this way, the depth-based 3D-TV system can provide stereoscopic pairs with an adjustable baseline or multiple views for autostereoscopic displays. This chapter overviews the key technologies involved in this depth-based 3D-TV system, including content generation, data compression and transmission, 3D visualization, and quality evaluation. We also present some challenges that hamper the commercialization of depth-based 3D video broadcasting. Finally, international research cooperation and standardization efforts are briefly discussed.
Yin Zhao, Ce Zhu, Lu Yu, Masayuki Tanimoto
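
To make the data formats named in this abstract (2D + Z, MVD, LDV) concrete, the following Python sketch models them as simple data structures. The field names and array shapes are illustrative assumptions, not a normative format definition from the book or any standard.

```python
# Minimal sketch of the depth-based 3D-TV data formats (2D + Z, MVD, LDV).
# Field names and shapes are illustrative assumptions, not a standard layout.
from dataclasses import dataclass
from typing import List
import numpy as np

@dataclass
class ImagePlusDepth:          # "2D + Z": one texture image with its depth map
    texture: np.ndarray        # H x W x 3, color samples
    depth: np.ndarray          # H x W, per-pixel depth (e.g., 8-bit inverse depth)

@dataclass
class MultiViewVideoPlusDepth: # MVD: several coupled texture/depth views per frame
    views: List[ImagePlusDepth]

@dataclass
class LayeredDepthVideo:       # LDV: a main layer plus occlusion (background) layers
    main_layer: ImagePlusDepth
    occlusion_layers: List[ImagePlusDepth]

# Example: a two-view MVD frame at 1080p, filled with placeholder data.
h, w = 1080, 1920
view = ImagePlusDepth(texture=np.zeros((h, w, 3), np.uint8),
                      depth=np.zeros((h, w), np.uint8))
mvd_frame = MultiViewVideoPlusDepth(views=[view, view])
```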

Content Generation

Chapter 2. Generic Content Creation for 3D Displays
Abstract
Future 3D productions in the fields of digital signage, commercials, and 3D television will face the problem of having to address a wide range of different 3D displays, ranging from glasses-based standard stereo displays to auto-stereoscopic multi-view displays or even light-field displays. The challenge will be to serve all these display types with sufficient quality and appealing content. Against this background, this chapter discusses flexible solutions for 3D capture, generic 3D representation formats using depth maps, robust methods for reliable depth estimation, required preprocessing of captured multi-view footage, postprocessing of estimated depth maps, and, finally, depth-image-based rendering (DIBR) for creating missing virtual views at the display side.
Frederik Zilly, Marcus Müller, Peter Kauff
Chapter 3. Stereo Matching and Viewpoint Synthesis FPGA Implementation
Abstract
With the advent of 3D-TV, the increasing interest in free viewpoint TV within MPEG, and the inevitable evolution toward higher-quality and higher-resolution TV (from SDTV to HDTV and even UDTV) with a comfortable viewing experience, there is a need to develop low-cost solutions addressing the 3D-TV market. Moreover, it is believed that in the not too distant future, 2D-UDTV display technology will support a reasonable-quality autostereoscopic 3D-TV display mode (no need for 3D glasses) in which up to a dozen intermediate views are rendered between the extreme left and right stereo video input views. These intermediate views can be synthesized using viewpoint synthesis techniques from the left and/or right image and its associated depth map. With left-and-right image pairs increasingly being broadcast as the straightforward 3D-TV format, extracting high-quality depth maps from these stereo input images becomes mandatory for synthesizing the other intermediate views. This chapter describes such “Stereo-In to Multiple-Viewpoint-Out” functionality on a general FPGA-based system, demonstrating real-time high-quality depth extraction and viewpoint synthesis as a prototype toward a future chipset for 3D-HDTV.
Chao-Kang Liao, Hsiu-Chi Yeh, Ke Zhang, Geert Vanmeerbeeck, Tian-Sheuan Chang, Gauthier Lafruit
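
For readers unfamiliar with the depth-extraction step described in this abstract, the sketch below estimates a disparity map from a rectified left/right pair using OpenCV's semi-global block matching. The file names and parameter values are assumptions chosen for illustration; the chapter's actual FPGA pipeline is far more elaborate.

```python
# Minimal stereo-matching sketch: estimate a disparity map from a rectified
# left/right pair. File names and parameter values are illustrative assumptions.
import cv2
import numpy as np

left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)     # hypothetical input files
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

# Semi-global block matching; block size and disparity range must be tuned
# to the actual content and camera baseline.
matcher = cv2.StereoSGBM_create(minDisparity=0,
                                numDisparities=128,      # must be a multiple of 16
                                blockSize=5)
disparity = matcher.compute(left, right).astype(np.float32) / 16.0  # fixed-point -> pixels

# A disparity/depth map like this is the input to the viewpoint synthesizer
# described in the chapter.
disp_vis = cv2.normalize(disparity, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
cv2.imwrite("disparity.png", disp_vis)
```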
Chapter 4. DIBR-Based Conversion from Monoscopic to Stereoscopic and Multi-View Video
Abstract
This chapter aims to provide a tutorial on 2D-to-3D video conversion methods that exploit depth-image-based rendering (DIBR) techniques. It is intended not only for university students who are new to this area of research, but also for researchers and engineers who want to enhance their knowledge of video conversion techniques. The basic principles and the various methods for converting 2D video to stereoscopic 3D, including depth extraction strategies and DIBR-based view synthesis approaches, are reviewed. Conversion artifacts and the evaluation of conversion quality are discussed, and the advantages and disadvantages of the different methods are elaborated. Furthermore, practical implementations of the conversion from monoscopic to stereoscopic and multi-view video are outlined.
Liang Zhang, Carlos Vázquez, Grégory Huchet, Wa James Tam
Chapter 5. Virtual View Synthesis and Artifact Reduction Techniques
Abstract
With texture and depth data, virtual views are synthesized to produce a disparity-adjustable stereo pair for stereoscopic displays, or to generate the multiple views required by autostereoscopic displays. View synthesis typically consists of three steps: 3D warping, view merging, and hole filling. However, simple synthesis algorithms may yield visual artifacts, e.g., texture flickering, boundary artifacts, and smearing effects, and many efforts have been made to suppress these synthesis artifacts. Some employ spatial/temporal filters to smooth depth maps, which mitigates depth errors and enhances temporal consistency; some use a cross-check technique to detect and prevent possible synthesis distortions; some focus on removing boundary artifacts; and others attempt to create natural texture patches for the disoccluded regions. In addition to rendering quality, real-time implementation is necessary for view synthesis. So far, the basic three-step rendering process has been realized in real time through GPU programming and dedicated hardware design.
Yin Zhao, Ce Zhu, Lu Yu
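
To make the three-step pipeline mentioned in this abstract concrete, here is a heavily simplified Python sketch of horizontal forward warping, merging of two warped views, and naive hole filling. It assumes rectified views, a purely horizontal baseline, and a linear depth-to-disparity mapping (the `max_disp` value is arbitrary); it is not the rendering algorithm evaluated in the chapter.

```python
# Simplified DIBR sketch: forward warp, merge two warped views, fill holes.
# Assumes rectified cameras with a horizontal baseline and a linear mapping
# from the 8-bit depth value to disparity (max_disp is an illustrative choice).
import numpy as np

def forward_warp(texture, depth, max_disp=32, to_right=True):
    h, w = depth.shape
    warped = np.zeros_like(texture)
    filled = np.zeros((h, w), bool)
    z = np.zeros((h, w), np.uint16)              # depth buffer for occlusion handling
    disp = (depth.astype(np.float32) / 255.0) * max_disp
    for y in range(h):
        for x in range(w):
            xt = int(round(x + disp[y, x])) if to_right else int(round(x - disp[y, x]))
            if 0 <= xt < w and depth[y, x] >= z[y, xt]:   # nearer pixels win
                warped[y, xt] = texture[y, x]
                z[y, xt] = depth[y, x]
                filled[y, xt] = True
    return warped, filled

def merge_and_fill(warp_l, mask_l, warp_r, mask_r):
    out = np.where(mask_l[..., None], warp_l, warp_r)     # view merging
    valid = mask_l | mask_r                                # remaining disocclusions
    for y in range(out.shape[0]):                          # naive hole filling:
        for x in range(1, out.shape[1]):                   # propagate nearest valid
            if not valid[y, x]:                            # pixel from the left
                out[y, x] = out[y, x - 1]
    return out
```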
Chapter 6. Hole Filling for View Synthesis
Abstract
The depth-image-based rendering (DIBR) technique is recognized as a promising tool for supporting the advanced 3D video services required in multi-view video (MVV) systems. However, an inherent problem with DIBR is dealing with the newly exposed areas that appear in synthesized views. These occur when parts of the scene are not visible from every viewpoint, leaving blank spots called disocclusions. These disocclusions may grow larger as the distance between cameras increases. This chapter addresses the disocclusion problem in two ways: (1) preprocessing of the depth data, and (2) image inpainting of the synthesized view. To deal with small disocclusions, a hole-filling strategy is designed by preprocessing the depth video before DIBR, while for larger disocclusions an inpainting approach is proposed to retrieve the missing pixels by leveraging the given depth information.
Ismael Daribo, Hideo Saito, Ryo Furukawa, Shinsaku Hiura, Naoki Asada
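
As a rough illustration of the inpainting-based handling of larger disocclusions discussed here, the sketch below marks the empty pixels of a synthesized view and fills them with OpenCV's generic inpainting. The file names, hole criterion, and radius are assumptions; the chapter's depth-aided method is more sophisticated than this generic fill.

```python
# Minimal disocclusion-filling sketch: mark empty (hole) pixels in a synthesized
# view and fill them with OpenCV's generic inpainting. File names, the hole
# criterion, and the inpainting radius are illustrative assumptions only.
import cv2
import numpy as np

synth = cv2.imread("synthesized_view.png")                    # hypothetical DIBR output
hole_mask = (synth.sum(axis=2) == 0).astype(np.uint8) * 255   # zero pixels = disocclusions

# Telea's inpainting; a depth-aided method (as in this chapter) would instead
# constrain the fill to background texture using the depth map.
filled = cv2.inpaint(synth, hole_mask, inpaintRadius=5, flags=cv2.INPAINT_TELEA)
cv2.imwrite("synthesized_view_filled.png", filled)
```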
Chapter 7. LDV Generation from Multi-View Hybrid Image and Depth Video
Abstract
The technology around 3D-TV is evolving rapidly. Different stereo displays are already available, and auto-stereoscopic displays promise 3D without glasses in the near future. All of the commercially available content today is purely image-based. Depth-based content, on the other hand, provides better flexibility and scalability regarding future 3D-TV requirements and in the long term is considered a better alternative for 3D-TV production. However, depth estimation is a difficult process, which threatens to become the main bottleneck in the whole production chain. Different sophisticated depth-based formats, such as LDV (layered depth video) or MVD (multi-view video plus depth), are already available, but no reliable production techniques for these formats exist today. Usually, camera systems consisting of multiple color cameras are used for capturing. These systems, however, rely on stereo matching for depth estimation, which often fails in the presence of repetitive patterns or textureless regions. Newer, hybrid systems offer a better alternative here: they incorporate active sensors in the depth estimation process and make it possible to overcome the difficulties of standard multi-camera systems. In this chapter, a complete production chain for the 2-layer LDV format, based on a hybrid camera system of 5 color cameras and 2 time-of-flight cameras, is presented. It includes real-time preview capabilities for quality control during shooting and post-production algorithms to generate high-quality LDV content consisting of foreground and occlusion layers.
Anatol Frick, Reinhard Koch

Data Compression and Transmission

Chapter 8. 3D Video Compression
Abstract
In this chapter, compression methods for 3D video (3DV) are presented. This includes data formats, video and depth compression, evaluation methods, and analysis tools. First, the fundamental principles of video coding for classical 2D video content are reviewed, including signal prediction, quantization, transformation, and entropy coding. These methods are extended toward multi-view video coding (MVC), where inter-view prediction is added to the 2D video coding methods to gain higher coding efficiency. Next, 3DV coding principles are introduced, which differ from the previous coding methods: in 3DV, a generic input format is used for coding and a dense set of output views is generated for different types of autostereoscopic displays. This influences format selection, encoder optimization, and evaluation methods, and requires new modules, such as decoder-side view generation, as discussed in this chapter. Finally, different 3DV formats are compared and their applicability to 3DV systems is discussed.
Karsten Müller, Philipp Merkle, Gerhard Tech
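
To ground the fundamental 2D coding steps listed in this abstract (prediction, transform, quantization), here is a toy Python sketch that codes one 8 x 8 residual block with a 2D DCT and uniform quantization. The quantization step size is an arbitrary illustrative value, and no entropy coding or standard-compliant syntax is implied.

```python
# Toy sketch of the prediction + transform + quantization stages of a
# block-based codec: an 8x8 residual block is transformed with a 2D DCT,
# uniformly quantized, then reconstructed. The step size q is illustrative.
import numpy as np
from scipy.fft import dctn, idctn

rng = np.random.default_rng(0)
prediction = rng.integers(0, 256, (8, 8)).astype(np.float32)   # hypothetical predictor
original = prediction + rng.normal(0, 10, (8, 8)).astype(np.float32)

residual = original - prediction                 # prediction step
coeffs = dctn(residual, norm="ortho")            # transform step (2D DCT-II)
q = 8.0
levels = np.round(coeffs / q)                    # uniform quantization (the lossy step)
# (entropy coding of `levels` would follow here in a real codec)

recon = prediction + idctn(levels * q, norm="ortho")   # decoder-side reconstruction
mse = float(np.mean((original - recon) ** 2))
print(f"MSE after transform coding: {mse:.2f}")
```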
Chapter 9. Depth Map Compression for Depth-Image-Based Rendering
Abstract
In this chapter, we discuss unique characteristics of depth maps, review recent depth map coding techniques, and describe how texture and depth map compression can be jointly optimized.
Gene Cheung, Antonio Ortega, Woo-Shik Kim, Vladan Velisavljevic, Akira Kubota
Chapter 10. Effects of Wavelet-Based Depth Video Compression
Abstract
Multi-view video (MVV) representation based on depth data, such as multi-view video plus depth (MVD), is emerging as a new type of 3D video communication service. This raises the problem of coding and transmitting the depth video in addition to the classical texture video. Depth video is considered key side information for novel view synthesis within MVV systems, such as three-dimensional television (3D-TV) or free viewpoint television (FTV). Nonetheless, the influence of depth compression on the synthesized view is still a contentious issue. In this chapter, we discuss and investigate the impact of wavelet-based compression of the depth video on the quality of the view synthesis. Following this analysis, different frameworks are presented to reduce the disturbing effects of depth compression on the synthesized view.
Ismael Daribo, Hideo Saito, Ryo Furukawa, Shinsaku Hiura, Naoki Asada
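
A minimal way to reproduce the kind of experiment discussed in this abstract is to degrade a depth map by wavelet coefficient thresholding and then re-run view synthesis on the degraded depth. The sketch below (using PyWavelets) covers only the depth-degradation half; the wavelet, decomposition level, and threshold are illustrative assumptions rather than the chapter's settings.

```python
# Minimal sketch of lossy wavelet compression of a depth map via coefficient
# thresholding (PyWavelets). Wavelet choice, decomposition level, and threshold
# are illustrative assumptions; a real coder would also quantize and entropy-code.
import numpy as np
import pywt

def wavelet_compress_depth(depth, wavelet="bior4.4", level=3, threshold=8.0):
    coeffs = pywt.wavedec2(depth.astype(np.float32), wavelet, level=level)
    kept = [coeffs[0]] + [
        tuple(pywt.threshold(band, threshold, mode="hard") for band in detail)
        for detail in coeffs[1:]
    ]
    recon = pywt.waverec2(kept, wavelet)
    return np.clip(recon[:depth.shape[0], :depth.shape[1]], 0, 255).astype(np.uint8)

# The reconstructed (degraded) depth map would then be fed to the view synthesis
# stage to measure the impact of depth compression on the rendered view.
```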
Chapter 11. Transmission of 3D Video over Broadcasting
Abstract
This chapter provides a general perspective on the feasibility of digital broadcasting networks for delivering three-dimensional TV (3D-TV) services. It discusses factors (e.g., data format) that need to be accounted for in the deployment stages of 3D-TV services over broadcast networks, with special emphasis on systems based on depth-image-based rendering (DIBR) techniques.
Pablo Angueira, David de la Vega, Javier Morgade, Manuel María Vélez

3D Visualization and Quality Assessment

Chapter 12. The Psychophysics of Binocular Vision
Abstract
This chapter reviews psychophysical research on human stereoscopic processes and their relationship to a 3D-TV system with DIBR. Topics include basic physiology, binocular correspondence and the horopter, stereoacuity and fusion limits, non-corresponding inputs and rivalry, dynamic cues to depth and their interactions with disparity, and development and adaptability of the binocular system.
Philip M. Grove
Chapter 13. Stereoscopic and Autostereoscopic Displays
Abstract
This chapter covers the state of the art in stereoscopic and autostereoscopic displays. The coverage is not exhaustive but is intended to provide a reasonably comprehensive snapshot of the current state of the art within the relatively limited space available. To give the necessary background, a brief introduction to stereoscopic perception and a short history of stereoscopic displays are given. Holography is not covered in detail here, as it is really a separate area of study and is not likely to be the basis of a commercially viable display in the near future.
Phil Surman
Chapter 14. Subjective and Objective Visual Quality Assessment in the Context of Stereoscopic 3D-TV
Abstract
Subjective and objective visual quality assessment in the context of stereoscopic three-dimensional TV (3D-TV) is still in a nascent stage and needs to consider the effect of the added depth dimension. As a matter of fact, quality assessment of 3D-TV cannot be considered a trivial extension of the two-dimensional (2D) case. Furthermore, 3D-TV may also introduce negative effects not experienced in 2D, e.g., discomfort or nausea. Based on efforts initiated within the COST Action IC1003 QUALINET, this chapter discusses current challenges in relation to subjective and objective visual quality assessment for stereo-based 3D-TV. Two case studies are presented to illustrate the current state of the art and some of the remaining challenges.
Marcus Barkowsky, Kjell Brunnström, Touradj Ebrahimi, Lina Karam, Pierre Lebreton, Patrick Le Callet, Andrew Perkis, Alexander Raake, Mahesh Subedar, Kun Wang, Liyuan Xing, Junyong You
Chapter 15. Visual Quality Assessment of Synthesized Views in the Context of 3D-TV
Abstract
Depth-image-based rendering (DIBR) is fundamental to 3D-TV applications because the generation of new viewpoints is a recurrent task. Like any tool, DIBR methods are subject to evaluation, through the assessment of the visual quality of the generated views. This assessment task is peculiar because DIBR can be used for different 3D-TV applications: either in a 2D context (Free Viewpoint Television, FTV) or in a 3D context (3D displays reproducing stereoscopic vision). Depending on the context, the factors affecting the visual experience may differ. This chapter concerns the use of DIBR in the 2D context. It addresses two particular use cases: visualization of still images and visualization of video sequences, in FTV in the 2D context. Through these two cases, the main issues of DIBR in terms of visual quality assessment are presented. Two experiments are proposed as case studies addressing the problems raised in this chapter: the first concerns the assessment of still images and the second concerns video sequence assessment. The two experiments question the reliability of the usual subjective and objective tools when assessing the visual quality of synthesized views in a 2D context.
Emilie Bosc, Patrick Le Callet, Luce Morin, Muriel Pressigout
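
The objective tools questioned in this chapter include classical full-reference metrics. The sketch below computes two of them (PSNR and SSIM, via scikit-image) between a synthesized view and a reference capture, assuming hypothetical file names; as the chapter argues, such 2D metrics may correlate poorly with the perceived quality of DIBR-synthesized views.

```python
# Sketch of classical full-reference metrics (PSNR, SSIM) applied to a
# DIBR-synthesized view against a real reference view. File names are
# hypothetical; the chapter questions how well such 2D metrics reflect
# the perceived quality of synthesized views.
import cv2
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

reference = cv2.imread("reference_view.png")
synthesized = cv2.imread("synthesized_view.png")

psnr = peak_signal_noise_ratio(reference, synthesized, data_range=255)
ssim = structural_similarity(reference, synthesized, channel_axis=2, data_range=255)
print(f"PSNR = {psnr:.2f} dB, SSIM = {ssim:.4f}")
```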
Backmatter
Metadata
Title
3D-TV System with Depth-Image-Based Rendering
Edited by
Ce Zhu
Yin Zhao
Lu Yu
Masayuki Tanimoto
Copyright year
2013
Publisher
Springer New York
Electronic ISBN
978-1-4419-9964-1
Print ISBN
978-1-4419-9963-4
DOI
https://doi.org/10.1007/978-1-4419-9964-1