
2016 | OriginalPaper | Chapter

Depth Transfer: Depth Extraction from Videos Using Nonparametric Sampling

Authors: Kevin Karsch, Ce Liu, Sing Bing Kang

Published in: Dense Image Correspondences for Computer Vision

Publisher: Springer International Publishing


Abstract

In this chapter, a technique that automatically generates plausible depth maps from videos using nonparametric depth sampling is discussed. We demonstrate this method in cases where existing methods fail (nontranslating cameras and dynamic scenes). The technique is applicable to single images as well as videos. For videos, local motion cues are used to improve the inferred depth maps, while optical flow is used to ensure temporal depth consistency. For training and evaluation, a Microsoft Kinect-based system is developed to collect a large dataset of stereoscopic videos with known depths, and the depth estimation technique outperforms the state of the art on benchmark databases. The method can also be used to automatically convert monoscopic videos into stereo for 3D visualization, as demonstrated by a variety of visually pleasing results for indoor and outdoor scenes, including results from the feature film Charade.
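
To make the core idea concrete, the following is a minimal, illustrative sketch of nonparametric depth transfer in Python. It is not the authors' implementation: it retrieves the k most similar RGB-D exemplars using a crude downsampled-grayscale descriptor and fuses their depths with a per-pixel median, whereas the chapter warps candidates with SIFT flow and solves a global optimization with spatial-smoothness, motion, and temporal-coherence terms. The function names, the descriptor, and the fixed-resolution database are assumptions made for illustration.

```python
import numpy as np

def global_feature(img, size=16):
    """Tiny global descriptor: grayscale image downsampled to size x size."""
    h, w = img.shape[:2]
    ys = np.linspace(0, h - 1, size).astype(int)
    xs = np.linspace(0, w - 1, size).astype(int)
    gray = img.mean(axis=2) if img.ndim == 3 else img
    return gray[np.ix_(ys, xs)].ravel()

def transfer_depth(query_img, database, k=7):
    """database: list of (rgb_image, depth_map) pairs with known depth.
    Assumes all images and depth maps share one resolution."""
    q = global_feature(query_img)
    dists = [np.linalg.norm(global_feature(rgb) - q) for rgb, _ in database]
    nearest = np.argsort(dists)[:k]
    # Candidate depths; the chapter warps each candidate to the query via
    # SIFT flow before fusion, which this sketch skips.
    candidates = np.stack([database[i][1] for i in nearest])
    return np.median(candidates, axis=0)  # simple per-pixel fusion
```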


Footnotes
1
Our dataset and code are publicly available at http://kevinkarsch.com/depthtransfer.
 
3
For further details and discussion of IRLS, see the appendix of Liu’s thesis [18].
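
For readers unfamiliar with IRLS, here is a minimal, generic sketch (not the authors' solver) of iteratively reweighted least squares for a robust objective of the form sum_i sqrt((a_i^T x - b_i)^2 + eps), a smooth approximation of the L1 penalty; the function name and parameters are illustrative assumptions.

```python
import numpy as np

def irls(A, b, eps=1e-6, iters=50):
    """Minimize sum_i sqrt((A x - b)_i^2 + eps) by IRLS (illustrative sketch)."""
    x = np.linalg.lstsq(A, b, rcond=None)[0]   # ordinary least-squares start
    for _ in range(iters):
        r = A @ x - b
        w = 1.0 / np.sqrt(r * r + eps)         # reweight: large residuals are down-weighted
        W_A = A * w[:, None]                   # rows of A scaled by the weights
        x = np.linalg.solve(A.T @ W_A, A.T @ (w * b))
    return x
```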
 
4
In all other types of videos (e.g., those with parallax or fast-moving objects/poses), we do not employ this algorithm; equivalently, we set the motion segmentation weight to zero (η = 0).
 
5
The presentation of stereoscopic (left+right) video to convey the sense of depth.
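
For context on how an inferred depth map yields such a stereoscopic pair, the sketch below shows a minimal depth-image-based rendering (DIBR) step: each pixel is shifted horizontally by a disparity inversely proportional to its depth, and holes are naively filled from the left. This is not the chapter's rendering pipeline (which handles occlusion ordering and disocclusion filling more carefully); the function name and the `max_disparity` parameter are assumptions.

```python
import numpy as np

def render_right_view(img, depth, max_disparity=20):
    """img: HxWx3 array; depth: HxW array of relative depths (illustrative only)."""
    h, w = depth.shape
    disparity = 1.0 / np.maximum(depth, 1e-3)            # nearer pixels shift more
    disparity = disparity / disparity.max() * max_disparity
    right = np.zeros_like(img)
    filled = np.zeros((h, w), dtype=bool)
    for y in range(h):
        for x in range(w):                               # note: ignores occlusion order
            xr = x - int(round(disparity[y, x]))
            if 0 <= xr < w:
                right[y, xr] = img[y, x]
                filled[y, xr] = True
        for x in range(1, w):                            # naive hole filling from the left
            if not filled[y, x]:
                right[y, x] = right[y, x - 1]
    return right
```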
 
Literature
1.
Batra, D., Saxena, A.: Learning the right model: efficient max-margin learning in Laplacian CRFs. In: CVPR (2012)
2.
Colombari, A., Fusiello, A., Murino, V.: Continuous parallax adjustment for 3D-TV. In: IEEE Eur. Conf. Vis. Media Prod., pp. 194–200 (2005)
3.
Delage, E., Lee, H., Ng, A.: A dynamic Bayesian network model for autonomous 3D reconstruction from a single indoor image. In: CVPR (2006)
4.
Guttmann, M., Wolf, L., Cohen-Or, D.: Semi-automatic stereo extraction from video footage. In: ICCV (2009)
5.
Han, F., Zhu, S.C.: Bayesian reconstruction of 3D shapes and scenes from a single image. In: IEEE HLK (2003)
6.
Hassner, T., Basri, R.: Example based 3D reconstruction from single 2D images. In: CVPR Workshop on Beyond Patches, pp. 15–22 (2006)
8.
Heikkila, M., Pietikainen, M.: A texture-based method for modeling the background and detecting moving objects. IEEE Trans. Pattern Anal. Mach. Intell. 28(4), 657–662 (2006)
9.
Hoiem, D., Efros, A., Hebert, M.: Automatic photo pop-up. In: ACM SIGGRAPH (2005)
10.
Hoiem, D., Stein, A., Efros, A., Hebert, M.: Recovering occlusion boundaries from a single image. In: ICCV (2007)
11.
Horry, Y., Anjyo, K., Arai, K.: Tour into the picture: using a spidery mesh interface to make animation from a single image. In: SIGGRAPH (1997)
12.
Klein Gunnewiek, R., Berretty, R.P., Barenbrug, B., Magalhães, J.: Coherent spatial and temporal occlusion generation. In: Proc. SPIE 7237, Stereoscopic Displays and Applications XX, vol. 723713 (2009)
13.
Konrad, J., Brown, G., Wang, M., Ishwar, P., Wu, C., Mukherjee, D.: Automatic 2D-to-3D image conversion using 3D examples from the Internet. In: SPIE 8288, Stereoscopic Displays and Applications, vol. 82880F (2012). doi:10.1117/12.766566
14.
Konrad, J., Wang, M., Ishwar, P.: 2D-to-3D image conversion by learning depth from examples. In: 3DCINE (2012)
15.
Koppal, S., Zitnick, C., Cohen, M., Kang, S., Ressler, B., Colburn, A.: A viewer-centric editor for 3D movies. IEEE Comput. Graph. Appl. 31, 20–35 (2011)
16.
Li, C., Kowdle, A., Saxena, A., Chen, T.: Towards holistic scene understanding: feedback enabled cascaded classification models. In: NIPS (2010)
17.
Liao, M., Gao, J., Yang, R., Gong, M.: Video stereolization: combining motion analysis with user interaction. IEEE Trans. Vis. Comput. Graph. 18(7), 1079–1088 (2012)
18.
Liu, C.: Beyond pixels: exploring new representations and applications for motion analysis. Ph.D. thesis, MIT (2009)
19.
Liu, C., Yuen, J., Torralba, A.: Nonparametric scene parsing: label transfer via dense scene alignment. In: CVPR (2009)
20.
Liu, B., Gould, S., Koller, D.: Single image depth estimation from predicted semantic labels. In: CVPR (2010)
21.
Liu, C., Yuen, J., Torralba, A.: Nonparametric scene parsing via label transfer. IEEE Trans. Pattern Anal. Mach. Intell. 33(12), 2368–2382 (2011)
22.
Liu, C., Yuen, J., Torralba, A.: SIFT flow: dense correspondence across scenes and its applications. IEEE Trans. Pattern Anal. Mach. Intell. 33(5), 978–994 (2011)
23.
Luo, K., Li, D., Feng, Y., Zhang, M.: Depth-aided inpainting for disocclusion restoration of multi-view images using depth-image-based rendering. J. Zhejiang Univ. Sci. A 10(12), 1738–1749 (2009)
24.
Maire, M., Arbelaez, P., Fowlkes, C., Malik, J.: Using contours to detect and localize junctions in natural images. In: CVPR (2008)
25.
Silberman, N., Hoiem, D., Kohli, P., Fergus, R.: Indoor segmentation and support inference from RGBD images. In: ECCV (2012)
26.
Oh, B., Chen, M., Dorsey, J., Durand, F.: Image-based modeling and photo editing. In: SIGGRAPH (2001)
27.
Oliva, A., Torralba, A.: Modeling the shape of the scene: a holistic representation of the spatial envelope. Int. J. Comput. Vis. 42, 145–175 (2001)
28.
Rubinstein, M., Liu, C., Freeman, W.: Annotation propagation: automatic annotation of large image databases via dense image correspondence. In: ECCV (2012)
29.
Rubinstein, M., Joulin, A., Kopf, J., Liu, C.: Unsupervised joint object discovery and segmentation in Internet images. In: CVPR (2013)
31.
Saxena, A., Sun, M., Ng, A.: Make3D: learning 3D scene structure from a single still image. IEEE Trans. Pattern Anal. Mach. Intell. 31(5), 824–840 (2009)
32.
Sheikh, Y., Javed, O., Kanade, T.: Background subtraction for freely moving cameras. In: ICCV (2009)
33.
Tappen, M., Liu, C.: A Bayesian approach to alignment-based image hallucination. In: ECCV (2012)
34.
Van Pernis, A., DeJohn, M.: Dimensionalization: converting 2D films to 3D. In: SPIE 6803, Stereoscopic Displays and Applications XIX, vol. 68030T (2008). doi:10.1117/12.766566
35.
Wang, O., Lang, M., Frei, M., Hornung, A., Smolic, A., Gross, M.: StereoBrush: interactive 2D to 3D conversion using discontinuous warps. In: SBIM (2011)
36.
Ward, B., Kang, S.B., Bennett, E.P.: Depth director: a system for adding depth to movies. IEEE Comput. Graph. Appl. 31(1), 36–48 (2011)
37.
Wu, C., Frahm, J.M., Pollefeys, M.: Repetition-based dense single-view reconstruction. In: CVPR (2011)
38.
Zhang, L., Dugas-Phocion, G., Samson, J.S., Seitz, S.: Single view modeling of free-form scenes. J. Vis. Comput. Animat. 13(4), 225–235 (2002)
39.
Zhang, G., Dong, Z., Jia, J., Wan, L., Wong, T.T., Bao, H.: Refilming with depth-inferred videos. IEEE Trans. Vis. Comput. Graph. 15(5), 828–840 (2009)
40.
Zhang, G., Jia, J., Wong, T.T., Bao, H.: Consistent depth maps recovery from a video sequence. IEEE Trans. Pattern Anal. Mach. Intell. 31, 974–988 (2009)
41.
Zhang, G., Jia, J., Hua, W., Bao, H.: Robust bilayer segmentation and motion/depth estimation with a handheld camera. IEEE Trans. Pattern Anal. Mach. Intell. 33(3), 603–617 (2011)
42.
Zhang, L., Vazquez, C., Knorr, S.: 3D-TV content creation: automatic 2D-to-3D video conversion. IEEE Trans. Broadcast. 57(2), 372–383 (2011)
Metadata
Title
Depth Transfer: Depth Extraction from Videos Using Nonparametric Sampling
Authors
Kevin Karsch
Ce Liu
Sing Bing Kang
Copyright Year
2016
DOI
https://doi.org/10.1007/978-3-319-23048-1_9