2016 | Original Paper | Book Chapter

ATGV-Net: Accurate Depth Super-Resolution

Authors: Gernot Riegler, Matthias Rüther, Horst Bischof

Published in: Computer Vision – ECCV 2016

Publisher: Springer International Publishing

Abstract

In this work we present a novel approach for single depth map super-resolution. Modern consumer depth sensors, especially Time-of-Flight sensors, produce dense depth measurements, but are affected by noise and have a low lateral resolution. We propose a method that combines the benefits of recent advances in machine learning based single image super-resolution, i.e. deep convolutional networks, with a variational method to recover accurate high-resolution depth maps. In particular, we integrate a variational method that models the piecewise affine structures apparent in depth data via an anisotropic total generalized variation regularization term on top of a deep network. We call our method ATGV-Net and train it end-to-end by unrolling the optimization procedure of the variational method. To train deep networks, a large corpus of training data with accurate ground-truth is required. We demonstrate that it is feasible to train our method solely on synthetic data that we generate in large quantities for this task. Our evaluations show that we achieve state-of-the-art results on three different benchmarks, as well as on a challenging Time-of-Flight dataset, all without utilizing an additional intensity image as guidance.
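The idea of unrolling the optimization procedure of a variational method can be illustrated with a minimal sketch. The code below is not the authors' ATGV-Net: as a simplifying assumption it swaps the anisotropic total generalized variation term for plain first-order total variation, takes a fixed array as input where the paper would use the output of a deep network, and runs a fixed number of primal-dual iterations in the style of Chambolle and Pock [5]. The function names (`grad`, `div`, `unrolled_tv_refine`) and the parameter values (`lam`, `n_iters`) are hypothetical and chosen only for illustration.

```python
import numpy as np


def grad(u):
    """Forward differences with a zero gradient at the last column/row."""
    gx = np.zeros_like(u)
    gy = np.zeros_like(u)
    gx[:, :-1] = u[:, 1:] - u[:, :-1]
    gy[:-1, :] = u[1:, :] - u[:-1, :]
    return gx, gy


def div(px, py):
    """Discrete divergence, defined as the negative adjoint of grad."""
    dx = np.zeros_like(px)
    dy = np.zeros_like(py)
    dx[:, 0] = px[:, 0]
    dx[:, 1:-1] = px[:, 1:-1] - px[:, :-2]
    dx[:, -1] = -px[:, -2]
    dy[0, :] = py[0, :]
    dy[1:-1, :] = py[1:-1, :] - py[:-2, :]
    dy[-1, :] = -py[-2, :]
    return dx + dy


def unrolled_tv_refine(f, lam=8.0, n_iters=10):
    """Run a fixed number of primal-dual steps for
    min_u TV(u) + lam/2 * ||u - f||^2 (first-order TV, an assumption;
    the paper uses anisotropic TGV instead)."""
    f = np.asarray(f, dtype=np.float64)
    # Step sizes satisfy tau * sigma * L^2 <= 1 with L^2 = 8 for this grad.
    tau = sigma = 1.0 / np.sqrt(8.0)
    u = f.copy()
    u_bar = f.copy()
    px = np.zeros_like(f)
    py = np.zeros_like(f)
    for _ in range(n_iters):  # each pass corresponds to one unrolled "layer"
        # Dual ascent step followed by projection onto the unit ball.
        gx, gy = grad(u_bar)
        px = px + sigma * gx
        py = py + sigma * gy
        norm = np.maximum(1.0, np.sqrt(px ** 2 + py ** 2))
        px = px / norm
        py = py / norm
        # Primal descent step with the closed-form proximal map of the data term.
        u_old = u
        u = (u + tau * div(px, py) + tau * lam * f) / (1.0 + tau * lam)
        # Over-relaxation of the primal variable.
        u_bar = 2.0 * u - u_old
    return u


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    step = np.zeros((32, 32))
    step[:, 16:] = 1.0  # piecewise constant toy "depth map"
    noisy = step + 0.1 * rng.standard_normal(step.shape)
    refined = unrolled_tv_refine(noisy)
    print(refined.shape)
```

Because the number of iterations is fixed, every operation in the loop is an ordinary differentiable array operation (apart from the kink in the projection), which is what makes end-to-end training possible once the same steps are written in a framework with automatic differentiation. This NumPy version only shows the forward pass.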

Footnotes
1. Note that we present our results over the full disparity range [0, 255], as opposed to e.g. [13], where the disparities are scaled to a narrower range.

References
1. Aodha, O.M., Campbell, N.D., Nair, A., Brostow, G.J.: Patch based synthesis for single depth image super-resolution. In: European Conference on Computer Vision (ECCV) (2012)
2. Appel, A.: Some techniques for shading machine renderings of solids. In: Proceedings of the April 30–May 2 1968, Spring Joint Computer Conference (1968)
4. Butler, D.J., Wulff, J., Stanley, G.B., Black, M.J.: A naturalistic open source movie for optical flow evaluation. In: European Conference on Computer Vision (ECCV) (2012)
5. Chambolle, A., Pock, T.: A first-order primal-dual algorithm for convex problems with applications to imaging. J. Math. Imaging Vis. 40(1), 120–145 (2011)
6. Chambolle, A., Pock, T.: On the ergodic convergence rates of a first-order primal-dual algorithm. Math. Program. 159, 253–287 (2016)
7. Chan, D., Buisman, H., Theobalt, C., Thrun, S.: A noise-aware filter for real-time depth upsampling. In: European Conference on Computer Vision Workshops (ECCVW) (2008)
8. Chen, L.C., Schwing, A.G., Yuille, A.L., Urtasun, R.: Learning deep structured models. In: Proceedings of the International Conference on Machine Learning (ICML) (2015)
9. Diebel, J., Thrun, S.: An application of Markov random fields to range sensing. In: Proceedings of Conference on Neural Information Processing Systems (NIPS) (2005)
10. Domke, J.: Generic methods for optimization-based modeling. In: Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS) (2012)
11. Dong, C., Loy, C.C., He, K., Tang, X.: Learning a deep convolutional network for image super-resolution. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014, Part IV. LNCS, vol. 8692, pp. 184–199. Springer, Heidelberg (2014)
12. Ferstl, D., Reinbacher, C., Ranftl, R., Rüther, M., Bischof, H.: Image guided depth upsampling using anisotropic total generalized variation. In: IEEE International Conference on Computer Vision (ICCV) (2013)
13. Ferstl, D., Rüther, M., Bischof, H.: Variational depth superresolution using example-based edge representations. In: IEEE International Conference on Computer Vision (ICCV) (2015)
14. Girshick, R., Shotton, J., Kohli, P., Criminisi, A., Fitzgibbon, A.W.: Efficient regression of general-activity human poses from depth images. In: IEEE International Conference on Computer Vision (ICCV) (2011)
15. Glasner, D., Bagon, S., Irani, M.: Super-resolution from a single image. In: IEEE International Conference on Computer Vision (ICCV) (2009)
16. Gupta, S., Girshick, R., Arbeláez, P., Malik, J.: Learning rich features from RGB-D images for object detection and segmentation. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014, Part VII. LNCS, vol. 8695, pp. 345–360. Springer, Heidelberg (2014)
17. Handa, A., Patraucean, V., Badrinarayanan, V., Stent, S., Cipolla, R.: Understanding real world indoor scenes with synthetic data. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016)
18. He, K., Sun, J., Tang, X.: Guided image filtering. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part I. LNCS, vol. 6311, pp. 1–14. Springer, Heidelberg (2010)
19. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016)
20. Hornáček, M., Rhemann, C., Gelautz, M., Rother, C.: Depth super resolution by rigid body self-similarity in 3D. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2013)
21. Izadi, S., Kim, D., Hilliges, O., Molyneaux, D., Newcombe, R., Kohli, P., Shotton, J., Hodges, S., Freeman, D., Davison, A., Fitzgibbon, A.: KinectFusion: real-time 3D reconstruction and interaction using a moving depth camera. In: ACM Symposium on User Interface Software and Technology (2011)
22. Kim, J., Lee, J.K., Lee, K.M.: Accurate image super-resolution using very deep convolutional networks. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016)
23. Kopf, J., Cohen, M.F., Lischinski, D., Uyttendaele, M.: Joint bilateral upsampling. ACM Trans. Graph. (TOG) 26(3), 96 (2007)
24. Krähenbühl, P., Koltun, V.: Efficient inference in fully connected CRFs with Gaussian edge potentials. In: Proceedings of Conference on Neural Information Processing Systems (NIPS) (2012)
25. Kwon, H., Tai, Y.W., Lin, S.: Data-driven depth map refinement via multi-scale sparse representations. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2015)
26. Martull, S., Peris, M., Fukui, K.: Realistic CG stereo image dataset with ground truth disparity maps. In: International Conference on Pattern Recognition Workshops (ICPRW) (2012)
27. Nagel, H.H., Enkelmann, W.: An investigation of smoothness constraints for the estimation of displacement vector fields from image sequences. IEEE Trans. Pattern Anal. Mach. Intell. (TPAMI) 8(5), 565–593 (1986)
28. Ochs, P., Ranftl, R., Brox, T., Pock, T.: Bilevel optimization with nonsmooth lower level problems. In: Aujol, J.-F., Nikolova, M., Papadakis, N. (eds.) SSVM 2015. LNCS, vol. 9087, pp. 654–665. Springer, Heidelberg (2015)
29. Park, J., Kim, H., Tai, Y.W., Brown, M.S., Kweon, I.S.: High quality depth map upsampling for 3D-TOF cameras. In: IEEE International Conference on Computer Vision (ICCV) (2011)
30. Ranftl, R., Gehrig, S., Pock, T., Bischof, H.: Pushing the limits of stereo using variational stereo estimation. In: IEEE Intelligent Vehicles Symposium (2012)
31. Ranftl, R., Pock, T.: A deep variational model for image segmentation. In: Jiang, X., Hornegger, J., Koch, R. (eds.) GCPR 2014. LNCS, vol. 8753, pp. 107–118. Springer, Heidelberg (2014)
32. Riegler, G., Ranftl, R., Rüther, M., Bischof, H.: Joint training of an convolutional neural net and a global regression model. In: Proceedings of the British Machine Vision Conference (BMVC) (2015)
33. Schulter, S., Leistner, C., Bischof, H.: Fast and accurate image upscaling with super-resolution forests. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2015)
35. Shotton, J., Fitzgibbon, A., Cook, M., Sharp, T., Finocchio, M., Moore, R., Kipman, A., Blake, A.: Real-time human pose recognition in parts from single depth images. In: Cipolla, R., Battiato, S., Farinella, G.M. (eds.) Machine Learning for Computer Vision. SCI, vol. 411, pp. 125–141. Springer, Heidelberg (2013)
36. Timofte, R., Smet, V.D., Gool, L.V.: Anchored neighborhood regression for fast example-based super-resolution. In: IEEE International Conference on Computer Vision (ICCV) (2013)
37. Timofte, R., De Smet, V., Van Gool, L.: A+: adjusted anchored neighborhood regression for fast super-resolution. In: Cremers, D., Reid, I., Saito, H., Yang, M.-H. (eds.) ACCV 2014. LNCS, vol. 9006, pp. 111–126. Springer, Heidelberg (2015)
38. Tompson, J., Jain, A., LeCun, Y., Bregler, C.: Joint training of a convolutional network and a graphical model for human pose estimation. In: Proceedings of Conference on Neural Information Processing Systems (NIPS) (2014)
39. Werlberger, M., Trobin, W., Pock, T., Wedel, A., Cremers, D., Bischof, H.: Anisotropic Huber-L1 optical flow. In: Proceedings of the British Machine Vision Conference (BMVC) (2009)
40. Yang, J., Wright, J., Huang, T.S., Ma, Y.: Image super-resolution via sparse representation. IEEE Trans. Image Process. 19(11), 2861–2873 (2010)
41. Yang, Q., Yang, R., Davis, J., Nistér, D.: Spatial-depth super resolution for range images. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2007)
42. Zeyde, R., Elad, M., Protter, M.: On single image scale-up using sparse-representations. In: Boissonnat, J.-D., Chenin, P., Cohen, A., Gout, C., Lyche, T., Mazure, M.-L., Schumaker, L. (eds.) Curves and Surfaces 2011. LNCS, vol. 6920, pp. 711–730. Springer, Heidelberg (2012)
43. Zheng, S., Jayasumana, S., Romera-Paredes, B., Vineet, V., Su, Z., Du, D., Huang, C., Torr, P.: Conditional random fields as recurrent neural networks. In: IEEE International Conference on Computer Vision (ICCV) (2015)

Metadata
Title: ATGV-Net: Accurate Depth Super-Resolution
Authors: Gernot Riegler, Matthias Rüther, Horst Bischof
Copyright year: 2016
DOI: https://doi.org/10.1007/978-3-319-46487-9_17