
2016 | Original Paper | Book Chapter

Depth Transfer: Depth Extraction from Videos Using Nonparametric Sampling

Authors: Kevin Karsch, Ce Liu, Sing Bing Kang

Published in: Dense Image Correspondences for Computer Vision

Publisher: Springer International Publishing


Abstract

In this chapter, we discuss a technique that automatically generates plausible depth maps from videos using nonparametric depth sampling. We demonstrate the method in cases where existing approaches fail (non-translating cameras and dynamic scenes). The technique applies to single images as well as videos. For videos, local motion cues are used to improve the inferred depth maps, while optical flow ensures temporal depth consistency. For training and evaluation, we developed a Microsoft Kinect-based system to collect a large dataset of stereoscopic videos with known depths, and the depth estimation technique outperforms the state of the art on benchmark databases. The method can also automatically convert monoscopic video into stereo for 3D visualization, which we demonstrate with a variety of visually pleasing results for indoor and outdoor scenes, including footage from the feature film Charade.
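At a high level, the nonparametric pipeline described in the abstract retrieves RGB-D exemplars similar to the query, warps their depth maps to the query, and fuses the warped candidates. The sketch below is an illustrative simplification only, not the chapter's implementation: the `feature` descriptor (a downsampled thumbnail) stands in for GIST retrieval features, the SIFT-flow warping step is omitted entirely (exemplars are assumed pre-aligned), and fusion is a plain per-pixel median rather than the chapter's optimization.

```python
import numpy as np

def feature(img, size=8):
    """Crude global descriptor: a downsampled grayscale thumbnail.
    Stands in for the GIST features used for exemplar retrieval."""
    h, w = img.shape[:2]
    ys = np.linspace(0, h - 1, size).astype(int)
    xs = np.linspace(0, w - 1, size).astype(int)
    return img[np.ix_(ys, xs)].ravel().astype(float)

def transfer_depth(query, images, depths, k=3):
    """Retrieve the k exemplars most similar to `query` and fuse their
    depth maps with a per-pixel median. The chapter's dense warping
    (SIFT flow) and optimization are deliberately left out."""
    q = feature(query)
    dists = [np.linalg.norm(feature(im) - q) for im in images]
    nearest = np.argsort(dists)[:k]
    candidates = np.stack([depths[i] for i in nearest])
    return np.median(candidates, axis=0)

# Tiny synthetic example: flat "scenes" whose depth tracks brightness.
rng = np.random.default_rng(0)
imgs = [np.full((32, 32), v) + rng.normal(0, 1, (32, 32))
        for v in (10.0, 50.0, 90.0)]
dpts = [np.full((32, 32), v) for v in (1.0, 5.0, 9.0)]
est = transfer_depth(np.full((32, 32), 52.0), imgs, dpts, k=1)
# Nearest exemplar is the mid-brightness scene, so est ≈ 5 everywhere.
```

With `k > 1`, the median fusion suppresses outlier candidates, which is the intuition behind aggregating several warped exemplar depths instead of trusting any single match.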


Footnotes
1
Our dataset and code are publicly available at http://kevinkarsch.com/depthtransfer.
 
3
For further details and discussion of IRLS, see the appendix of Liu’s thesis [18].
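IRLS minimizes a robust objective by solving a sequence of weighted least-squares problems, refreshing the weights from the current residuals each round. The following is a generic sketch for an ℓp data term, not the chapter's specific energy or the exact formulation in Liu's thesis; the outlier-line example at the bottom is purely illustrative.

```python
import numpy as np

def irls(A, b, p=1.0, iters=50, eps=1e-6):
    """Iteratively Reweighted Least Squares: minimize sum_i |A x - b|_i^p
    via repeated weighted least squares with weights w_i = |r_i|^(p-2),
    clamping tiny residuals at eps to keep the weights finite."""
    x = np.linalg.lstsq(A, b, rcond=None)[0]        # plain LS start
    for _ in range(iters):
        r = A @ x - b
        w = np.maximum(np.abs(r), eps) ** (p - 2)   # robust reweighting
        W = A * w[:, None]                          # diag(w) @ A
        x = np.linalg.solve(A.T @ W, A.T @ (w * b)) # normal equations
    return x

# An L1 fit shrugs off a gross outlier that ordinary LS would chase.
A = np.vander(np.linspace(0, 1, 20), 2)             # columns [x, 1]
b = 2.0 * A[:, 0] + 1.0                             # exact line y = 2x + 1
b[3] += 100.0                                       # inject one outlier
slope, intercept = irls(A, b, p=1.0)                # recovers ~ (2, 1)
```

Setting `p = 2` reduces the loop to ordinary least squares, which is a quick sanity check when adapting the routine.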
 
4
In all other types of videos (e.g., those with parallax or fast-moving objects/poses), we do not employ this algorithm; equivalently, we set the motion segmentation weight to zero (η = 0).
 
5
The presentation of stereoscopic (left+right) video to convey the sense of depth.
 
References
1.
Batra, D., Saxena, A.: Learning the right model: efficient max-margin learning in Laplacian CRFs. In: CVPR (2012)
2.
Colombari, A., Fusiello, A., Murino, V.: Continuous parallax adjustment for 3D-TV. In: IEEE Eur. Conf. Vis. Media Prod., pp. 194–200 (2005)
3.
Delage, E., Lee, H., Ng, A.: A dynamic Bayesian network model for autonomous 3D reconstruction from a single indoor image. In: CVPR (2006)
4.
Guttmann, M., Wolf, L., Cohen-Or, D.: Semi-automatic stereo extraction from video footage. In: ICCV (2009)
5.
Han, F., Zhu, S.C.: Bayesian reconstruction of 3D shapes and scenes from a single image. In: IEEE HLK (2003)
6.
Hassner, T., Basri, R.: Example based 3D reconstruction from single 2D images. In: CVPR Workshop on Beyond Patches, pp. 15–22 (2006)
8.
Heikkila, M., Pietikainen, M.: A texture-based method for modeling the background and detecting moving objects. IEEE Trans. Pattern Anal. Mach. Intell. 28(4), 657–662 (2006)
9.
Hoiem, D., Efros, A., Hebert, M.: Automatic photo pop-up. In: ACM SIGGRAPH (2005)
10.
Hoiem, D., Stein, A., Efros, A., Hebert, M.: Recovering occlusion boundaries from a single image. In: ICCV (2007)
11.
Horry, Y., Anjyo, K., Arai, K.: Tour into the picture: using a spidery mesh interface to make animation from a single image. In: SIGGRAPH (1997)
12.
Klein Gunnewiek, R., Berretty, R.P., Barenbrug, B., Magalhães, J.: Coherent spatial and temporal occlusion generation. In: Proc. SPIE 7237, Stereoscopic Displays and Applications XX, vol. 723713 (2009)
13.
Konrad, J., Brown, G., Wang, M., Ishwar, P., Wu, C., Mukherjee, D.: Automatic 2D-to-3D image conversion using 3D examples from the internet. In: SPIE 8288, Stereoscopic Displays and Applications, vol. 82880F (2012). doi:10.1117/12.766566
14.
Konrad, J., Wang, M., Ishwar, P.: 2D-to-3D image conversion by learning depth from examples. In: 3DCINE (2012)
15.
Koppal, S., Zitnick, C., Cohen, M., Kang, S., Ressler, B., Colburn, A.: A viewer-centric editor for 3D movies. IEEE Comput. Graph. Appl. 31, 20–35 (2011)
16.
Li, C., Kowdle, A., Saxena, A., Chen, T.: Towards holistic scene understanding: feedback enabled cascaded classification models. In: NIPS (2010)
17.
Liao, M., Gao, J., Yang, R., Gong, M.: Video stereolization: combining motion analysis with user interaction. IEEE Trans. Vis. Comput. Graph. 18(7), 1079–1088 (2012)
18.
Liu, C.: Beyond pixels: exploring new representations and applications for motion analysis. Ph.D. thesis, MIT (2009)
19.
Liu, C., Yuen, J., Torralba, A.: Nonparametric scene parsing: label transfer via dense scene alignment. In: CVPR (2009)
20.
Liu, B., Gould, S., Koller, D.: Single image depth estimation from predicted semantic labels. In: CVPR (2010)
21.
Liu, C., Yuen, J., Torralba, A.: Nonparametric scene parsing via label transfer. IEEE Trans. Pattern Anal. Mach. Intell. 33(12), 2368–2382 (2011)
22.
Liu, C., Yuen, J., Torralba, A.: SIFT flow: dense correspondence across scenes and its applications. IEEE Trans. Pattern Anal. Mach. Intell. 33(5), 978–994 (2011)
23.
Luo, K., Li, D., Feng, Y., Zhang, M.: Depth-aided inpainting for disocclusion restoration of multi-view images using depth-image-based rendering. J. Zhejiang Univ. Sci. A 10(12), 1738–1749 (2009)
24.
Maire, M., Arbelaez, P., Fowlkes, C., Malik, J.: Using contours to detect and localize junctions in natural images. In: CVPR (2008)
25.
Silberman, N., Hoiem, D., Kohli, P., Fergus, R.: Indoor segmentation and support inference from RGBD images. In: ECCV (2012)
26.
Oh, B., Chen, M., Dorsey, J., Durand, F.: Image-based modeling and photo editing. In: SIGGRAPH (2001)
27.
Oliva, A., Torralba, A.: Modeling the shape of the scene: a holistic representation of the spatial envelope. Int. J. Comput. Vis. 42, 145–175 (2001)
28.
Rubinstein, M., Liu, C., Freeman, W.: Annotation propagation: automatic annotation of large image databases via dense image correspondence. In: ECCV (2012)
29.
Rubinstein, M., Joulin, A., Kopf, J., Liu, C.: Unsupervised joint object discovery and segmentation in internet images. In: CVPR (2013)
31.
Saxena, A., Sun, M., Ng, A.: Make3D: learning 3D scene structure from a single still image. IEEE Trans. Pattern Anal. Mach. Intell. 31(5), 824–840 (2009)
32.
Sheikh, Y., Javed, O., Kanade, T.: Background subtraction for freely moving cameras. In: ICCV (2009)
33.
Tappen, M., Liu, C.: A Bayesian approach to alignment-based image hallucination. In: ECCV (2012)
34.
Van Pernis, A., DeJohn, M.: Dimensionalization: converting 2D films to 3D. In: SPIE 6803, Stereoscopic Displays and Applications XIX, vol. 68030T (2008). doi:10.1117/12.766566
35.
Wang, O., Lang, M., Frei, M., Hornung, A., Smolic, A., Gross, M.: StereoBrush: interactive 2D to 3D conversion using discontinuous warps. In: SBIM (2011)
36.
Ward, B., Kang, S.B., Bennett, E.P.: Depth director: a system for adding depth to movies. IEEE Comput. Graph. Appl. 31(1), 36–48 (2011)
37.
Wu, C., Frahm, J.M., Pollefeys, M.: Repetition-based dense single-view reconstruction. In: CVPR (2011)
38.
Zhang, L., Dugas-Phocion, G., Samson, J.S., Seitz, S.: Single view modeling of free-form scenes. J. Vis. Comput. Animat. 13(4), 225–235 (2002)
39.
Zhang, G., Dong, Z., Jia, J., Wan, L., Wong, T.T., Bao, H.: Refilming with depth-inferred videos. IEEE Trans. Vis. Comput. Graph. 15(5), 828–840 (2009)
40.
Zhang, G., Jia, J., Wong, T.T., Bao, H.: Consistent depth maps recovery from a video sequence. IEEE Trans. Pattern Anal. Mach. Intell. 31, 974–988 (2009)
41.
Zhang, G., Jia, J., Hua, W., Bao, H.: Robust bilayer segmentation and motion/depth estimation with a handheld camera. IEEE Trans. Pattern Anal. Mach. Intell. 33(3), 603–617 (2011)
42.
Zhang, L., Vazquez, C., Knorr, S.: 3D-TV content creation: automatic 2D-to-3D video conversion. IEEE Trans. Broadcast. 57(2), 372–383 (2011)
Metadata
Title
Depth Transfer: Depth Extraction from Videos Using Nonparametric Sampling
Authors
Kevin Karsch
Ce Liu
Sing Bing Kang
Copyright Year
2016
DOI
https://doi.org/10.1007/978-3-319-23048-1_9
