
2018 | Original Paper | Book Chapter

Monocular Depth Estimation Using Whole Strip Masking and Reliability-Based Refinement

Authors: Minhyeok Heo, Jaehan Lee, Kyung-Rae Kim, Han-Ul Kim, Chang-Su Kim

Published in: Computer Vision – ECCV 2018

Publisher: Springer International Publishing


Abstract

We propose a monocular depth estimation algorithm based on whole strip masking (WSM) and reliability-based refinement. First, we develop a convolutional neural network (CNN) tailored for depth estimation. Specifically, we design a novel filter, called WSM, to exploit the tendency of a scene to have similar depths in the horizontal or vertical direction. The proposed CNN combines WSM upsampling blocks with a ResNet encoder. Second, we measure the reliability of an estimated depth by appending additional layers to the main CNN. Using the reliability information, we perform conditional random field (CRF) optimization to refine the estimated depth map. Experimental results demonstrate that the proposed algorithm achieves state-of-the-art depth estimation performance.
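The core idea behind a WSM filter — a kernel that spans an entire vertical (or horizontal) strip of the feature map, so that every position in a column aggregates information from the whole column — can be illustrated with a small conceptual sketch. This is not the authors' trained CNN; it is a NumPy toy under our own assumptions (uniform strip weights standing in for learned kernel values, and a hypothetical `wsm_vertical` helper name):

```python
import numpy as np

def wsm_vertical(feature, weights=None):
    """Conceptual whole-strip response along the vertical axis.

    A vertical WSM filter has kernel size H x 1, covering the entire
    height of the feature map, so every output in a column shares one
    whole-strip response. We model this as a weighted sum over the
    height axis, broadcast back to the full spatial size. `weights`
    is a per-row weight vector (uniform by default) standing in for
    learned kernel values.
    """
    c, h, w = feature.shape
    if weights is None:
        weights = np.full(h, 1.0 / h)          # uniform strip weights
    # weighted sum over the height axis -> one response per column: (c, w)
    strip = np.tensordot(weights, feature, axes=([0], [1]))
    # broadcast the strip response back to every row of the column
    return np.broadcast_to(strip[:, None, :], (c, h, w)).copy()

# Example: with uniform weights, every row of the output equals the
# column-wise mean of the input feature map.
feat = np.arange(24, dtype=float).reshape(1, 4, 6)
out = wsm_vertical(feat)
```

A horizontal WSM filter is the mirror image (kernel size 1 x W, reducing over the width axis); in the paper these strip-shaped filters are learned and combined with conventional convolutions inside the upsampling blocks, which this sketch does not attempt to reproduce.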


Metadata
Title
Monocular Depth Estimation Using Whole Strip Masking and Reliability-Based Refinement
Authors
Minhyeok Heo
Jaehan Lee
Kyung-Rae Kim
Han-Ul Kim
Chang-Su Kim
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-030-01225-0_3