2016 | Original Paper | Book Chapter

Unified Depth Prediction and Intrinsic Image Decomposition from a Single Image via Joint Convolutional Neural Fields

Authors: Seungryong Kim, Kihong Park, Kwanghoon Sohn, Stephen Lin

Published in: Computer Vision – ECCV 2016

Publisher: Springer International Publishing

Abstract

We present a method for jointly predicting a depth map and intrinsic images from single-image input. The two tasks are formulated in a synergistic manner through a joint conditional random field (CRF) that is solved using a novel convolutional neural network (CNN) architecture, called the joint convolutional neural field (JCNF) model. Tailored to our joint estimation problem, JCNF differs from previous CNNs in its sharing of convolutional activations and layers between networks for each task, its inference in the gradient domain where there exists greater correlation between depth and intrinsic images, and the incorporation of a gradient scale network that learns the confidence of estimated gradients in order to effectively balance them in the solution. This approach is shown to surpass state-of-the-art methods both on single-image depth estimation and on intrinsic image decomposition.
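The idea of balancing estimated gradients by a learned confidence can be illustrated with a toy example. The sketch below is not the JCNF model: it is a 1-D weighted least-squares problem, chosen here only for illustration, that fuses a coarse depth estimate with gradient estimates. The function name `fuse_depth_1d`, the weight `lam`, and the hand-supplied `conf` array (standing in for the output of a confidence-predicting network) are all assumptions, not taken from the paper.

```python
import numpy as np

def fuse_depth_1d(coarse, grad, conf, lam=10.0):
    """Toy 1-D fusion of a coarse depth signal with gradient estimates.

    Minimizes  sum_i (d[i] - coarse[i])^2
             + lam * sum_i conf[i] * (d[i+1] - d[i] - grad[i])^2
    by solving one stacked linear least-squares system.
    (Hypothetical helper for illustration; not the authors' solver.)
    """
    coarse = np.asarray(coarse, dtype=float)
    grad = np.asarray(grad, dtype=float)
    conf = np.asarray(conf, dtype=float)
    n = coarse.size

    # Forward-difference matrix D with (D @ d)[i] = d[i+1] - d[i].
    D = np.zeros((n - 1, n))
    idx = np.arange(n - 1)
    D[idx, idx] = -1.0
    D[idx, idx + 1] = 1.0

    # Square-root weights turn the weighted objective into plain least squares.
    w = np.sqrt(lam * conf)
    A = np.vstack([np.eye(n), w[:, None] * D])
    b = np.concatenate([coarse, w * grad])

    d, *_ = np.linalg.lstsq(A, b, rcond=None)
    return d
```

With `conf` set to zero everywhere, the gradient term vanishes and the result equals the coarse estimate; with high confidence, the solution follows the supplied gradients more closely.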

Footnotes
1
\(\triangledown \) is a differential operator defined in the \(\mathbf {x}\)- and \(\mathbf {y}\)-direction such that \(\triangledown = [\triangledown _{\mathbf {x}},\triangledown _{\mathbf {y}}]\).
 
2
It is defined as the receptive field through the CNNs for pixel p [33].
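The two-component differential operator in footnote 1 can be sketched numerically as follows. This is a minimal illustration that assumes forward differences with a zero at the last column and row; the footnote does not specify a discretization, and the function name `gradient_xy` is hypothetical.

```python
import numpy as np

def gradient_xy(img):
    """Return the stacked [x-gradient, y-gradient] of a 2-D array.

    Uses forward differences (an assumption); the last column of the
    x-gradient and the last row of the y-gradient are left at zero.
    """
    img = np.asarray(img, dtype=float)
    gx = np.zeros_like(img)
    gy = np.zeros_like(img)
    gx[:, :-1] = img[:, 1:] - img[:, :-1]   # difference along x (columns)
    gy[:-1, :] = img[1:, :] - img[:-1, :]   # difference along y (rows)
    return np.stack([gx, gy])
```

For an input of shape `(H, W)`, the result has shape `(2, H, W)`, with index 0 holding the x-direction component and index 1 the y-direction component.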
 
References
1.
Chen, Q., Koltun, V.: A simple model for intrinsic image decomposition with depth cues. In: ICCV (2013)
2.
Laffont, P.Y., Bousseau, A., Paris, S., Durand, F., Drettakis, G.: Coherent intrinsic images from photo collections. ACM Trans. Graph. 31(6), 1–11 (2012)
3.
Lee, K.J., Zhao, Q., Tong, X., Gong, M., Izadi, S., Lee, S.U., Tan, P., Lin, S.: Estimation of intrinsic image sequences from image+depth video. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012, Part VI. LNCS, vol. 7577, pp. 327–340. Springer, Heidelberg (2012)
4.
Jeon, J., Cho, S., Tong, X., Lee, S.: Intrinsic image decomposition using structure-texture separation and surface normals. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014, Part VII. LNCS, vol. 8695, pp. 218–233. Springer, Heidelberg (2014)
5.
Barron, J.T., Malik, J.: Intrinsic scene properties from a single RGB-D image. In: CVPR (2013)
6.
Eigen, D., Puhrsch, C., Fergus, R.: Depth map prediction from a single image using a multi-scale deep network. In: NIPS (2014)
7.
Liu, F., Shen, C., Lin, G.: Deep convolutional neural fields for depth estimation from a single image. In: CVPR (2015)
8.
Kong, N., Black, M.J.: Intrinsic depth: Improving depth transfer with intrinsic images. In: ICCV (2015)
9.
Shelhamer, E., Barron, J., Darrell, T.: Scene intrinsics and depth from a single image. In: ICCV Workshop (2015)
10.
Zhou, T., Krahenbuhl, P., Efros, A.A.: Learning data-driven reflectance priors for intrinsic image decomposition. In: ICCV (2015)
11.
Narihira, T., Maire, M., Yu, S.X.: Direct intrinsics: learning albedo-shading decomposition by convolutional regression. In: ICCV (2015)
12.
Saxena, A., Sun, M., Ng, A.Y.: Make3D: learning 3D scene structure from a single still image. IEEE Trans. PAMI 31(5), 824–840 (2009)
13.
Wang, Y., Wang, R., Dai, Q.: A parametric model for describing the correlation between single color images and depth maps. IEEE SPL 21(7), 800–803 (2014)
14.
Li, X., Qin, H., Wang, Y., Zhang, Y., Dai, Q.: DEPT: depth estimation by parameter transfer for single still images. In: Cremers, D., Reid, I., Saito, H., Yang, M.-H. (eds.) ACCV 2014. LNCS, vol. 9004, pp. 45–58. Springer, Heidelberg (2015). doi:10.1007/978-3-319-16808-1_4
15.
Konrad, J., Wang, M., Ishwar, P., Wu, C., Mukherjee, D.: Learning-based, automatic 2D-to-3D image and video conversion. IEEE Trans. IP 22(9), 3485–3496 (2013)
16.
Karsch, K., Liu, C., Kang, S.B.: Depth transfer: depth extraction from video using non-parametric sampling. IEEE Trans. PAMI 32(11), 2144–2158 (2014)
17.
Choi, S., Min, D., Ham, B., Kim, Y., Oh, C., Sohn, K.: Depth analogy: data-driven approach for single image depth estimation using gradient samples. IEEE Trans. IP 24(12), 5953–5966 (2015)
18.
Wang, P., Shen, X., Lin, Z., Cohen, S., Price, B., Yuille, A.: Towards unified depth and semantic prediction from a single image. In: CVPR (2015)
19.
Barrow, H.G., Tenenbaum, J.M.: Recovering intrinsic scene characteristics from images. In: CVS (1978)
20.
Land, E.H., McCann, J.J.: Lightness and retinex theory. JOSA 61(1), 1–11 (1971)
21.
Shen, J., Tan, P., Lin, S.: Intrinsic image decomposition with non-local texture cues. In: CVPR (2008)
22.
Zhao, Q., Tan, P., Dai, Q., Shen, L., Wu, E., Lin, S.: A closed-form solution to retinex with non-local texture constraints. IEEE Trans. PAMI 34(7), 1437–1444 (2012)
23.
Li, Y., Brown, M.S.: Single image layer separation using relative smoothness. In: CVPR (2004)
24.
Bell, S., Bala, K., Snavely, N.: Intrinsic images in the wild. ACM Trans. Graph. 33(4), 159 (2014)
25.
Bonneel, N., Sunkavalli, K., Tompkin, J., Sun, D., Paris, S., Pfister, H.: Interactive intrinsic video editing. ACM Trans. Graph. (SIGGRAPH ASIA) 33(6), 197 (2014)
26.
Weiss, Y.: Deriving intrinsic images from image sequences. In: ICCV (2001)
27.
Laffont, P.Y., Bousseau, A., Drettakis, G.: Rich intrinsic image decomposition of outdoor scenes from multiple views. IEEE TVCG 19(2), 1–11 (2013)
28.
Kong, N., Gehler, P.V., Black, M.J.: Intrinsic video. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014, Part II. LNCS, vol. 8690, pp. 360–375. Springer, Heidelberg (2014)
29.
Bousseau, A., Paris, S., Durand, F.: User-assisted intrinsic images. ACM TOG 28(5), 1–11 (2009)
30.
Shen, J., Yang, X., Jia, Y.: Intrinsic image using optimization. In: CVPR (2011)
31.
Barron, J., Malik, J.: Shape, albedo, and illumination from a single image of an unknown object. In: CVPR (2012)
32.
Butler, D.J., Wulff, J., Stanley, G.B., Black, M.J.: A naturalistic open source movie for optical flow evaluation. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012, Part VI. LNCS, vol. 7577, pp. 611–625. Springer, Heidelberg (2012)
33.
He, K., Zhang, X., Ren, S., Sun, J.: Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Trans. PAMI 37(9), 1904–1916 (2015)
34.
Perez, P., Gangnet, M., Blake, A.: Poisson image editing. ACM TOG 22(3), 313–318 (2003)
35.
Xu, L., Ren, J., Yan, Q., Liao, R., Jia, J.: Deep edge-aware filters. In: ICML (2015)
36.
Shen, X., Yan, Q., Xu, L., Ma, L., Jia, J.: Multispectral joint image restoration via optimizing a scale map. IEEE Trans. PAMI 31(9), 1582–1599 (2015)
37.
Eigen, D., Fergus, R.: Predicting depth, surface normals and semantic labels with a common multi-scale convolutional architecture. In: ICCV (2015)
38.
Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: NIPS (2012)
39.
Dong, C., Loy, C.C., He, K., Tang, X.: Image super-resolution using deep convolutional networks. IEEE Trans. PAMI 37(3), 597–610 (2015)
44.
Grosse, R., Johnson, M.K., Adelson, E.H., Freeman, W.T.: Ground truth and baseline evaluations for intrinsic image algorithms. In: ICCV (2009)
45.
Liu, M., Salzmann, M., He, X.: Discrete-continuous depth estimation from a single image. In: CVPR (2014)
Metadata
Title
Unified Depth Prediction and Intrinsic Image Decomposition from a Single Image via Joint Convolutional Neural Fields
Authors
Seungryong Kim
Kihong Park
Kwanghoon Sohn
Stephen Lin
Copyright year
2016
DOI
https://doi.org/10.1007/978-3-319-46484-8_9