Published in: Pattern Analysis and Applications 4/2023

28-09-2023 | Theoretical Advances

SAFPN: a full semantic feature pyramid network for object detection

Authors: Gaihua Wang, Qi Li, Nengyuan Wang, Hong Liu

Abstract

To improve the performance of object detection algorithms, this paper proposes the segmentation attention feature pyramid network (SAFPN), which addresses the loss of semantic information. Unlike prior work, SAFPN discards the original \(1\times 1\) convolutions and instead reduces the feature dimension through a segmentation and accumulation architecture, thereby fully preserving the semantic information of the high-dimensional features. To capture fine-grained semantic details, it integrates channel attention and spatial attention mechanisms that strengthen the network’s focus on important information. Extensive experiments show that SAFPN achieves favorable results on multiple public datasets and performs the object detection task more effectively.
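The abstract does not spell out the segmentation and accumulation step, so the following is only a minimal sketch of one plausible reading: the channels of a feature map are split into equal segments that are summed element-wise, which reduces the channel count without a \(1\times 1\) convolution. The function names, the contiguous split, and the parameter-free sigmoid gates standing in for channel and spatial attention are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def segment_and_accumulate(x: np.ndarray, out_channels: int) -> np.ndarray:
    """Reduce a (C, H, W) feature map to (out_channels, H, W) without a 1x1 conv.

    Assumption: channels are split into contiguous segments of width
    out_channels, and the segments are summed element-wise.
    """
    c, h, w = x.shape
    if c % out_channels != 0:
        raise ValueError("channel count must be a multiple of out_channels")
    segments = x.reshape(c // out_channels, out_channels, h, w)
    return segments.sum(axis=0)


def sigmoid(z: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-z))


def channel_attention(x: np.ndarray) -> np.ndarray:
    # Hypothetical parameter-free gate: one weight per channel,
    # derived from that channel's global average.
    weights = sigmoid(x.mean(axis=(1, 2), keepdims=True))  # shape (C, 1, 1)
    return x * weights


def spatial_attention(x: np.ndarray) -> np.ndarray:
    # Hypothetical parameter-free gate: one weight per spatial location,
    # derived from the mean over channels.
    weights = sigmoid(x.mean(axis=0, keepdims=True))  # shape (1, H, W)
    return x * weights


# Toy input: 8 channels on a 2x2 grid, reduced to 4 channels.
features = np.arange(8 * 2 * 2, dtype=np.float64).reshape(8, 2, 2)
reduced = segment_and_accumulate(features, out_channels=4)
refined = spatial_attention(channel_attention(reduced))
```

In this sketch every input channel contributes to the output with unit weight, which is the sense in which no channel is discarded during reduction; a learned attention module would replace the fixed sigmoid gates.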

Metadata
Title
SAFPN: a full semantic feature pyramid network for object detection
Authors
Gaihua Wang
Qi Li
Nengyuan Wang
Hong Liu
Publication date
28-09-2023
Publisher
Springer London
Published in
Pattern Analysis and Applications / Issue 4/2023
Print ISSN: 1433-7541
Electronic ISSN: 1433-755X
DOI
https://doi.org/10.1007/s10044-023-01200-9
