2020 | OriginalPaper | Chapter

Guided Refine-Head for Object Detection

Authors: Lingyun Zeng, You Song, Wenhai Wang

Published in: MultiMedia Modeling

Publisher: Springer International Publishing

Abstract

In recent years, multi-stage detectors have raised the accuracy of object detection to a new level. However, because they chain multiple stages, these methods typically fall short in inference speed. To alleviate this problem, we propose a novel object detector, Guided Refine-Head, which consists of a newly proposed detection network called Refine-Head and a knowledge-distillation-like loss function. Refine-Head is a two-stage detector and therefore has faster inference than multi-stage detectors. Nonetheless, like a multi-stage detector, Refine-Head can predict bounding boxes for incremental IoU thresholds. In addition, we use the knowledge-distillation-like loss function to guide the training of Refine-Head. As a result, the proposed Guided Refine-Head combines fast inference with competitive accuracy. Extensive ablation studies and comparative experiments on MS-COCO 2017 validate the superiority of the proposed Guided Refine-Head. Notably, Guided Refine-Head achieves an AP of 38.0% at 10.4 FPS, surpassing Faster R-CNN by 1.8% at a similar speed.
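
To make the phrase "incremental IoU thresholds" concrete, the following minimal sketch shows how the set of proposals counted as positive shrinks as the IoU threshold rises. It is an illustration only, not the Refine-Head implementation: the abstract does not state which thresholds are used, so the values 0.5, 0.6 and 0.7 are an assumption (they are the ones used by Cascade R-CNN), and the function names `iou` and `positives_per_threshold` as well as the example boxes are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), axis-aligned


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def positives_per_threshold(
    proposals: Sequence[Box],
    gt: Box,
    thresholds: Sequence[float] = (0.5, 0.6, 0.7),  # assumed values, not from the abstract
) -> Dict[float, List[bool]]:
    """For each threshold, flag the proposals whose IoU with gt reaches it."""
    overlaps = [iou(p, gt) for p in proposals]
    return {t: [o >= t for o in overlaps] for t in thresholds}


# Hypothetical ground-truth box and proposals of decreasing quality.
gt_box: Box = (0.0, 0.0, 10.0, 10.0)
proposals: List[Box] = [
    (0.0, 0.0, 10.0, 10.0),  # IoU 1.00
    (0.0, 0.0, 10.0, 8.0),   # IoU 0.80
    (0.0, 0.0, 10.0, 6.5),   # IoU 0.65
    (0.0, 0.0, 10.0, 5.5),   # IoU 0.55
    (0.0, 0.0, 10.0, 4.0),   # IoU 0.40
]

labels = positives_per_threshold(proposals, gt_box)
for t, flags in labels.items():
    print(f"IoU >= {t}: {sum(flags)} positive proposal(s)")
```

In this toy setup, a higher threshold keeps fewer but better-localized proposals as positives, which is the trade-off that detectors trained for several IoU thresholds aim to exploit.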

Metadata
Title
Guided Refine-Head for Object Detection
Authors
Lingyun Zeng
You Song
Wenhai Wang
Copyright Year
2020
DOI
https://doi.org/10.1007/978-3-030-37731-1_17