Published in: Cognitive Computation 4/2023

05-01-2022

Multi-branch Bounding Box Regression for Object Detection

Authors: Hui-Shen Yuan, Si-Bao Chen, Bin Luo, Hao Huang, Qiang Li



Abstract

Localization and classification are two important components of visual object detection. In recent years, object detectors have increasingly focused on designing localization branches, and bounding box regression is vital for two-stage detectors. We therefore propose a multi-branch bounding box regression method, Multi-Branch R-CNN, for robust object localization. Multi-Branch R-CNN is composed of a fully connected head and a fully convolutional head. The fully convolutional head focuses on exploiting spatial semantics and is complementary to the fully connected head, which prefers local features. The features extracted by the two localization branches are fused and then flow to the next stage for classification and regression. The two branches cooperate to predict more precise localization, which significantly improves the performance of the detector. Extensive experiments were conducted on the public PASCAL VOC and MS COCO benchmarks. On the COCO dataset, Multi-Branch R-CNN with a ResNet-101 backbone achieved state-of-the-art single-model results with an mAP of 43.2. Comparative experiments demonstrate the effectiveness of the proposed method.
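
To make the two-branch idea in the abstract concrete, the following is a minimal pure-Python sketch, not the authors' implementation. It assumes a single-channel 7x7 RoI feature map, one dense layer for the fully connected branch, one 3x3 convolution layer with global average pooling for the fully convolutional branch, fusion by concatenation, and a single linear layer standing in for the next stage's box regression. The abstract does not specify the fusion operator or the layer sizes, so all function names, shapes, and the random weights here are hypothetical.

```python
import random

def linear(x, weight, bias):
    """Dense layer y = W x + b, with W given as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]

def conv3x3(fmap, kernel):
    """Single-channel 3x3 convolution with zero padding (output keeps the input size)."""
    h, w = len(fmap), len(fmap[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    y, x = i + di, j + dj
                    if 0 <= y < h and 0 <= x < w:
                        acc += kernel[di + 1][dj + 1] * fmap[y][x]
            out[i][j] = acc
    return out

def fc_branch(roi, weight, bias):
    """Fully connected branch: flatten the RoI, then one dense layer with ReLU."""
    flat = [v for row in roi for v in row]
    return [max(0.0, v) for v in linear(flat, weight, bias)]

def conv_branch(roi, kernels):
    """Fully convolutional branch: 3x3 conv per kernel, ReLU, global average pooling."""
    feats = []
    for kernel in kernels:
        activated = [max(0.0, v) for row in conv3x3(roi, kernel) for v in row]
        feats.append(sum(activated) / len(activated))
    return feats

def multi_branch_head(roi, params):
    """Fuse both branches (concatenation is an assumption) and regress 4 box deltas."""
    fused = fc_branch(roi, params["fc_w"], params["fc_b"]) + conv_branch(roi, params["kernels"])
    return linear(fused, params["reg_w"], params["reg_b"])  # (dx, dy, dw, dh)

def init_params(roi_size=7, fc_dim=8, n_kernels=4, seed=0):
    """Random, untrained weights; sizes are illustrative only."""
    rng = random.Random(seed)

    def rand(n):
        return [rng.uniform(-0.1, 0.1) for _ in range(n)]

    return {
        "fc_w": [rand(roi_size * roi_size) for _ in range(fc_dim)],
        "fc_b": [0.0] * fc_dim,
        "kernels": [[rand(3) for _ in range(3)] for _ in range(n_kernels)],
        "reg_w": [rand(fc_dim + n_kernels) for _ in range(4)],
        "reg_b": [0.0] * 4,
    }

# Toy RoI feature map with a simple gradient pattern.
roi = [[(i + j) / 12.0 for j in range(7)] for i in range(7)]
params = init_params()
deltas = multi_branch_head(roi, params)
print(len(deltas))  # prints 4: one value per box delta
```

In this sketch the dense branch sees every RoI position at once through its flattened input, while the convolutional branch applies the same small kernel at each position and so preserves spatial layout until pooling. Concatenating the two feature vectors is one simple way to let a later stage use both views.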


Metadata
Title
Multi-branch Bounding Box Regression for Object Detection
Authors
Hui-Shen Yuan
Si-Bao Chen
Bin Luo
Hao Huang
Qiang Li
Publication date
05-01-2022
Publisher
Springer US
Published in
Cognitive Computation / Issue 4/2023
Print ISSN: 1866-9956
Electronic ISSN: 1866-9964
DOI
https://doi.org/10.1007/s12559-021-09983-x
