Published in: Neural Processing Letters 6/2021

11-08-2021

Self-supervised Monocular Trained Depth Estimation Using Triplet Attention and Funnel Activation

Authors: Xuezhi Xiang, Xiangdong Kong, Yujian Qiu, Kaixu Zhang, Ning Lv


Abstract

Dense depth estimation from a single image is a fundamental problem in computer vision and has exciting applications in many robotic tasks. Training fully supervised methods requires large, accurate ground-truth data sets, which are often complex and expensive to acquire. Self-supervised learning has emerged as a promising alternative for monocular depth estimation because it does not require ground-truth depth data. In this paper, we propose a novel self-supervised joint learning framework for depth estimation using consecutive frames from monocular and stereo videos. Our architecture leverages two new ideas for improvement: (1) triplet attention and (2) funnel activation (FReLU). Triplet attention, added to the depth and pose networks, captures the importance of features across the dimensions of a tensor without any information bottleneck, making the optimisation of the learning framework more reliable. FReLU is used at the non-linear activation layer to capture local context in images adaptively, rather than using more complex convolutions at the convolution layer. FReLU extracts the spatial structure of objects through the pixel-wise modelling capacity provided by its spatial condition, which enriches the detail recovered in complex images. The experimental results show that the proposed method is comparable with state-of-the-art self-supervised monocular depth estimation methods.
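
As a rough illustration of the funnel activation described in the abstract, the sketch below applies the commonly published FReLU form, y = max(x, T(x)), to a single channel in plain Python. It is not the authors' implementation: the function names `funnel_condition` and `frelu`, the square zero-padded window, and the omission of the batch normalisation that usually follows the depthwise convolution are all assumptions made for illustration.

```python
def funnel_condition(x, kernel, bias=0.0):
    """Spatial condition T(x): a per-channel (depthwise) convolution.

    x      : 2D list (height x width) holding one feature-map channel
    kernel : square 2D list of odd size, e.g. 3x3 (hypothetical weights)
    Zero padding keeps the output the same size as the input.
    Assumption: the batch normalisation usually applied after this
    convolution is left out to keep the sketch dependency-free.
    """
    h, w = len(x), len(x[0])
    k = len(kernel) // 2
    out = [[bias] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = bias
            for di in range(-k, k + 1):
                for dj in range(-k, k + 1):
                    ii, jj = i + di, j + dj
                    # Positions outside the map contribute zero (padding).
                    if 0 <= ii < h and 0 <= jj < w:
                        acc += kernel[di + k][dj + k] * x[ii][jj]
            out[i][j] = acc
    return out


def frelu(x, kernel, bias=0.0):
    """Funnel activation: element-wise max of x and its spatial condition."""
    t = funnel_condition(x, kernel, bias)
    return [[max(a, b) for a, b in zip(row_x, row_t)]
            for row_x, row_t in zip(x, t)]
```

With an all-ones 3x3 kernel on the 2x2 map `[[-1, 2], [3, -3]]`, every pixel sees a neighbourhood sum of 1, so the two negative entries are replaced by 1 rather than being zeroed as plain ReLU would do. With an all-zero kernel and zero bias the function reduces to ReLU, which shows that the learned spatial condition is the only difference between the two activations in this sketch.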


Metadata
Title
Self-supervised Monocular Trained Depth Estimation Using Triplet Attention and Funnel Activation
Authors
Xuezhi Xiang
Xiangdong Kong
Yujian Qiu
Kaixu Zhang
Ning Lv
Publication date
11-08-2021
Publisher
Springer US
Published in
Neural Processing Letters / Issue 6/2021
Print ISSN: 1370-4621
Electronic ISSN: 1573-773X
DOI
https://doi.org/10.1007/s11063-021-10608-5
