2020 | OriginalPaper | Chapter

Pixel-Semantic Revising of Position: One-Stage Object Detector with Shared Encoder-Decoder

Authors: Qian Li, Nan Guo, Xiaochun Ye, Dongrui Fan, Zhimin Tang

Published in: Neural Information Processing

Publisher: Springer International Publishing

Abstract

Many object detection methods have been proposed recently, but they cannot detect objects adaptively from semantic features. Drawing on channel and spatial attention mechanisms, we analyze how different methods detect objects adaptively. Some state-of-the-art detectors combine different feature pyramids with many mechanisms, but at a higher cost. This work addresses that cost with an anchor-free detector built on a shared encoder-decoder with an attention mechanism, which extracts shared features. We take features from different levels of the backbone (e.g., ResNet-50) as the base features and feed them into a simple module, followed by a detection head that detects objects. Meanwhile, we use the semantic features to revise geometric locations, so the detector performs a pixel-semantic revising of position. More importantly, this work analyzes the impact of different pooling strategies (e.g., mean, maximum, or minimum) on multi-scale objects and finds that minimum pooling better improves detection performance on small objects. Compared with the state-of-the-art MNC based on ResNet-101 on the standard MSCOCO 2014 baseline, our method improves detection AP by 3.8%.
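
The three pooling strategies named in the abstract differ only in how each window of a feature map is reduced to a single value. The following is a minimal sketch in plain Python, not the authors' implementation: the helper `pool2d`, the 2x2 window size, and the toy `feature_map` values are all assumptions made for illustration, and the abstract does not say where in the network the pooling is applied.

```python
from typing import Callable, List, Sequence

Grid = List[List[float]]


def pool2d(feature: Sequence[Sequence[float]], size: int,
           reduce: Callable[[List[float]], float]) -> Grid:
    """Apply non-overlapping size x size pooling using the given reducer."""
    rows, cols = len(feature), len(feature[0])
    pooled: Grid = []
    for r in range(0, rows - size + 1, size):
        out_row = []
        for c in range(0, cols - size + 1, size):
            # Collect every value inside the current window.
            window = [feature[r + i][c + j]
                      for i in range(size) for j in range(size)]
            out_row.append(reduce(window))
        pooled.append(out_row)
    return pooled


def mean(values: List[float]) -> float:
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


# Hypothetical 4x4 single-channel feature map (illustrative values only).
feature_map = [
    [1.0, 3.0, 0.0, 2.0],
    [5.0, 7.0, 4.0, 6.0],
    [2.0, 2.0, 9.0, 1.0],
    [0.0, 4.0, 3.0, 3.0],
]

mean_pooled = pool2d(feature_map, 2, mean)  # average response per window
max_pooled = pool2d(feature_map, 2, max)    # strongest response per window
min_pooled = pool2d(feature_map, 2, min)    # weakest response per window
```

On this toy input, maximum pooling keeps the largest activation in each window, minimum pooling keeps the smallest, and mean pooling averages them. Whether any of these helps on small objects is an empirical question the chapter itself addresses; the sketch only shows what the reducers compute.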


Title: Pixel-Semantic Revising of Position: One-Stage Object Detector with Shared Encoder-Decoder
Authors: Qian Li, Nan Guo, Xiaochun Ye, Dongrui Fan, Zhimin Tang
Publisher: Springer International Publishing
Book: Neural Information Processing
Print ISBN: 978-3-030-63819-1
Electronic ISBN: 978-3-030-63820-7
Copyright Year: 2020
DOI: https://doi.org/10.1007/978-3-030-63820-7_59
