International Journal of Machine Learning and Cybernetics

10-10-2021 | Original Article

Towards unified on-road object detection and depth estimation from a single image

Authors: Guofei Lian, Yan Wang, Huabiao Qin, Guancheng Chen

Published in: International Journal of Machine Learning and Cybernetics | Issue 5/2022

Abstract

On-road object detection based on convolutional neural networks (CNNs) is an important problem in autonomous driving. However, traditional 2D object detection only classifies and localizes objects in image space and cannot recover depth information. Moreover, cascading an object detection network with a monocular depth estimation network to achieve 2.5D object detection is inefficient. To address this problem, we propose a unified multi-task learning mechanism for object detection and depth estimation. First, we propose a novel loss function, the projective consistency loss, which uses the perspective projection principle to model the relationship between target size and depth value, so that the object detection task and the depth estimation task constrain each other. Second, we propose a global multi-scale feature extraction scheme that combines the Global Context (GC) block and the Atrous Spatial Pyramid Pooling (ASPP) block, which promotes effective feature learning and collaborative learning between object detection and depth estimation. Comprehensive experiments on the KITTI and Cityscapes datasets show that our approach achieves high mAP and low distance estimation error, outperforming other state-of-the-art methods.
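
The abstract does not give the formula for the projective consistency loss, so the following is only an illustrative sketch of the general idea, not the authors' loss. It assumes a simple pinhole camera, in which an object of real height H at depth Z appears with pixel height h = f * H / Z for focal length f in pixels. The names `Detection`, `depth_from_box`, `consistency_penalty`, and the per-class height prior `class_height_m` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """One predicted object (hypothetical structure, for illustration only)."""
    box_height_px: float   # predicted bounding-box height in pixels
    depth_m: float         # predicted depth of the object in metres
    class_height_m: float  # assumed real-world height prior for the object's class


def depth_from_box(focal_px: float, class_height_m: float, box_height_px: float) -> float:
    """Depth implied by the box size under the pinhole model: Z = f * H / h."""
    if box_height_px <= 0:
        raise ValueError("box height must be positive")
    return focal_px * class_height_m / box_height_px


def consistency_penalty(det: Detection, focal_px: float) -> float:
    """Relative gap between the predicted depth and the depth implied by the box size.

    The penalty is zero when the two predictions agree with the projection
    geometry and grows as they drift apart, so each task constrains the other.
    """
    implied = depth_from_box(focal_px, det.class_height_m, det.box_height_px)
    return abs(det.depth_m - implied) / implied


if __name__ == "__main__":
    # Assumed values: 720 px focal length, 1.5 m tall car, 54 px tall box.
    det = Detection(box_height_px=54.0, depth_m=22.0, class_height_m=1.5)
    print(depth_from_box(720.0, det.class_height_m, det.box_height_px))
    print(consistency_penalty(det, focal_px=720.0))
```

In a training setup, a term of this kind would be added to the detection and depth losses with some weighting. The weighting and the exact form used in the paper are not stated in the abstract.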

Title: Towards unified on-road object detection and depth estimation from a single image
Authors: Guofei Lian, Yan Wang, Huabiao Qin, Guancheng Chen
Publication date: 10-10-2021
Publisher: Springer Berlin Heidelberg
Published in: International Journal of Machine Learning and Cybernetics / Issue 5/2022
Print ISSN: 1868-8071
Electronic ISSN: 1868-808X
DOI: https://doi.org/10.1007/s13042-021-01444-z
