Skip to main content
Erschienen in: Pattern Analysis and Applications 4/2023

28.09.2023 | Theoretical Advances

SAFPN: a full semantic feature pyramid network for object detection

verfasst von: Gaihua Wang, Qi Li, Nengyuan Wang, Hong Liu

Erschienen in: Pattern Analysis and Applications | Ausgabe 4/2023

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

To enhance the performance of object detection algorithm, this paper proposes segmentation attention feature pyramid network (SAFPN) to address the issue of semantic information loss. Compared to prior works, SAFPN discards the original \(1\times 1\) convolutions and achieves feature dimension reduction through a segmentation and accumulation architecture, thereby preserving the semantic information of high-dimensional features completely. To capture fine-grained semantic details, it integrates channel attention and spatial attention mechanisms to enhance the network’s focus on important information. Extensive experimental validation demonstrates that SAFPN achieves favorable results on multiple public datasets, and can better complete the target detection task.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literatur
1.
Zurück zum Zitat Zhang L, Wang H, Wang X, Liu Q, Wang H, Wang H (2021) Vehicle object detection method based on candidate region aggregation. Pattern Anal Appl 24:1635–1647CrossRef Zhang L, Wang H, Wang X, Liu Q, Wang H, Wang H (2021) Vehicle object detection method based on candidate region aggregation. Pattern Anal Appl 24:1635–1647CrossRef
2.
Zurück zum Zitat Sugiura M, Miyauchi CM, Kotozaki Y, Akimoto Y, Nozawa T, Yomogida Y, Hanawa S, Yamamoto Y, Sakuma A, Nakagawa S et al (2015) Neural mechanism for mirrored self-face recognition. Cereb Cortex 25(9):2806–2814CrossRef Sugiura M, Miyauchi CM, Kotozaki Y, Akimoto Y, Nozawa T, Yomogida Y, Hanawa S, Yamamoto Y, Sakuma A, Nakagawa S et al (2015) Neural mechanism for mirrored self-face recognition. Cereb Cortex 25(9):2806–2814CrossRef
3.
Zurück zum Zitat Yan K, Wang X, Lu L, Summers RM (2018) Deeplesion: automated mining of large-scale lesion annotations and universal lesion detection with deep learning. J Med Imaging 5(3):036501–036501CrossRef Yan K, Wang X, Lu L, Summers RM (2018) Deeplesion: automated mining of large-scale lesion annotations and universal lesion detection with deep learning. J Med Imaging 5(3):036501–036501CrossRef
4.
Zurück zum Zitat Girshick R, Donahue J, Darrell T, Malik J (2014) Rich feature hierarchies for accurate object detection and semantic segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 580–587 Girshick R, Donahue J, Darrell T, Malik J (2014) Rich feature hierarchies for accurate object detection and semantic segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 580–587
5.
Zurück zum Zitat Girshick R (2015) Fast r-CNN. In: Proceedings of the IEEE international conference on computer vision, pp 1440–1448 Girshick R (2015) Fast r-CNN. In: Proceedings of the IEEE international conference on computer vision, pp 1440–1448
6.
Zurück zum Zitat Ren S, He K, Girshick R, Sun J (2015) Faster r-CNN: Towards real-time object detection with region proposal networks. In: Advances in neural information processing systems Ren S, He K, Girshick R, Sun J (2015) Faster r-CNN: Towards real-time object detection with region proposal networks. In: Advances in neural information processing systems
7.
Zurück zum Zitat Zhang H, Chang H, Ma B, Wang N, Chen X (2020) Dynamic r-CNN: towards high quality object detection via dynamic training. In: Computer vision—ECCV 2020: 16th European conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XV 16. Springer, Berlin, pp 260–275 Zhang H, Chang H, Ma B, Wang N, Chen X (2020) Dynamic r-CNN: towards high quality object detection via dynamic training. In: Computer vision—ECCV 2020: 16th European conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XV 16. Springer, Berlin, pp 260–275
8.
Zurück zum Zitat Sun P, Zhang R, Jiang Y, Kong T, Xu C, Zhan W, Tomizuka M, Li L, Yuan Z, Wang C et al (2021) Sparse r-cnn: End-to-end object detection with learnable proposals. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 14454–14463 Sun P, Zhang R, Jiang Y, Kong T, Xu C, Zhan W, Tomizuka M, Li L, Yuan Z, Wang C et al (2021) Sparse r-cnn: End-to-end object detection with learnable proposals. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 14454–14463
9.
Zurück zum Zitat Lin T-Y, Goyal P, Girshick R, He K, Dollár P (2017) Focal loss for dense object detection. In: Proceedings of the IEEE international conference on computer vision, pp 2980–2988 Lin T-Y, Goyal P, Girshick R, He K, Dollár P (2017) Focal loss for dense object detection. In: Proceedings of the IEEE international conference on computer vision, pp 2980–2988
10.
Zurück zum Zitat Tian Z, Shen C, Chen H, He T (2019) FCOS: fully convolutional one-stage object detection. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 9627–9636 Tian Z, Shen C, Chen H, He T (2019) FCOS: fully convolutional one-stage object detection. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 9627–9636
11.
Zurück zum Zitat Tian Z, Shen C, Chen H, He T (2019) FCOS: fully convolutional one-stage object detection. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 9627–9636 Tian Z, Shen C, Chen H, He T (2019) FCOS: fully convolutional one-stage object detection. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 9627–9636
12.
Zurück zum Zitat Zhang S, Chi C, Yao Y, Lei Z, Li SZ (2020) Bridging the gap between anchor-based and anchor-free detection via adaptive training sample selection. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 9759–9768 Zhang S, Chi C, Yao Y, Lei Z, Li SZ (2020) Bridging the gap between anchor-based and anchor-free detection via adaptive training sample selection. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 9759–9768
13.
Zurück zum Zitat Zhang H, Wang Y, Dayoub F, Sunderhauf N (2021) Varifocalnet: an IOU-aware dense object detector. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 8514–8523 Zhang H, Wang Y, Dayoub F, Sunderhauf N (2021) Varifocalnet: an IOU-aware dense object detector. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 8514–8523
14.
Zurück zum Zitat Chen Q, Wang Y, Yang T, Zhang X, Cheng J, Sun J (2021) You only look one-level feature. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 13039–13048 Chen Q, Wang Y, Yang T, Zhang X, Cheng J, Sun J (2021) You only look one-level feature. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 13039–13048
15.
Zurück zum Zitat Feng C, Zhong Y, Gao Y, Scott MR, Huang W (2021) TOOD: task-aligned one-stage object detection. In: 2021 IEEE/CVF international conference on computer vision (ICCV). IEEE Computer Society, pp 3490–3499 Feng C, Zhong Y, Gao Y, Scott MR, Huang W (2021) TOOD: task-aligned one-stage object detection. In: 2021 IEEE/CVF international conference on computer vision (ICCV). IEEE Computer Society, pp 3490–3499
16.
Zurück zum Zitat Li S, He C, Li R, Zhang L (2022) A dual weighting label assignment scheme for object detection. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 9387–9396 Li S, He C, Li R, Zhang L (2022) A dual weighting label assignment scheme for object detection. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 9387–9396
17.
Zurück zum Zitat Lin T-Y, Dollár P, Girshick R, He K, Hariharan B, Belongie S (2017) Feature pyramid networks for object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2117–2125 Lin T-Y, Dollár P, Girshick R, He K, Hariharan B, Belongie S (2017) Feature pyramid networks for object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2117–2125
18.
Zurück zum Zitat Liu S, Qi L, Qin H, Shi J, Jia J (2018) Path aggregation network for instance segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 8759–8768 Liu S, Qi L, Qin H, Shi J, Jia J (2018) Path aggregation network for instance segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 8759–8768
19.
20.
Zurück zum Zitat Tan M, Pang R, Le QV (2020) Efficientdet: Scalable and efficient object detection. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 10781–10790 Tan M, Pang R, Le QV (2020) Efficientdet: Scalable and efficient object detection. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 10781–10790
21.
Zurück zum Zitat Qiao S, Chen L-C, Yuille A (2021) Detectors: detecting objects with recursive feature pyramid and switchable atrous convolution. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 10213–10224 Qiao S, Chen L-C, Yuille A (2021) Detectors: detecting objects with recursive feature pyramid and switchable atrous convolution. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 10213–10224
23.
Zurück zum Zitat Hu J, Shen L, Sun G (2018) Squeeze-and-excitation networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 7132–7141 Hu J, Shen L, Sun G (2018) Squeeze-and-excitation networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 7132–7141
24.
Zurück zum Zitat Woo S, Park J, Lee J-Y, Kweon IS (2018) CBAM: convolutional block attention module. In: Proceedings of the European conference on computer vision (ECCV), pp 3–19 Woo S, Park J, Lee J-Y, Kweon IS (2018) CBAM: convolutional block attention module. In: Proceedings of the European conference on computer vision (ECCV), pp 3–19
25.
Zurück zum Zitat Rahman MM, Fiaz M, Jung SK (2020) Efficient visual tracking with stacked channel-spatial attention learning. IEEE Access 8:100857–100869CrossRef Rahman MM, Fiaz M, Jung SK (2020) Efficient visual tracking with stacked channel-spatial attention learning. IEEE Access 8:100857–100869CrossRef
26.
Zurück zum Zitat Wang Q, Wu B, Zhu P, Li P, Zuo W, Hu Q (2020) ECA-NET: efficient channel attention for deep convolutional neural networks. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 11534–11542 Wang Q, Wu B, Zhu P, Li P, Zuo W, Hu Q (2020) ECA-NET: efficient channel attention for deep convolutional neural networks. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 11534–11542
27.
Zurück zum Zitat Hou Q, Zhou D, Feng J (2021) Coordinate attention for efficient mobile network design. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 13713–13722 Hou Q, Zhou D, Feng J (2021) Coordinate attention for efficient mobile network design. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 13713–13722
28.
Zurück zum Zitat Yang L, Zhang R-Y, Li L, Xie X (2021) SIMAM: a simple, parameter-free attention module for convolutional neural networks. In: International conference on machine learning, pp 11863–11874. PMLR Yang L, Zhang R-Y, Li L, Xie X (2021) SIMAM: a simple, parameter-free attention module for convolutional neural networks. In: International conference on machine learning, pp 11863–11874. PMLR
29.
Zurück zum Zitat Zhang Q-L, Yang Y-B (2021) Sa-net: shuffle attention for deep convolutional neural networks. In: ICASSP 2021-2021 IEEE international conference on acoustics, speech and signal processing (ICASSP). IEEE, pp 2235–2239 Zhang Q-L, Yang Y-B (2021) Sa-net: shuffle attention for deep convolutional neural networks. In: ICASSP 2021-2021 IEEE international conference on acoustics, speech and signal processing (ICASSP). IEEE, pp 2235–2239
30.
Zurück zum Zitat Mou L, Zhao Y, Chen L, Cheng J, Gu Z, Hao H, Qi H, Zheng Y, Frangi A, Liu J (2019) CS-NET: channel and spatial attention network for curvilinear structure segmentation. In: Medical Image Computing and Computer Assisted Intervention—MICCAI 2019: 22nd international conference, Shenzhen, China, October 13–17, 2019, Proceedings, Part I 22. Springer, pp 721–730 Mou L, Zhao Y, Chen L, Cheng J, Gu Z, Hao H, Qi H, Zheng Y, Frangi A, Liu J (2019) CS-NET: channel and spatial attention network for curvilinear structure segmentation. In: Medical Image Computing and Computer Assisted Intervention—MICCAI 2019: 22nd international conference, Shenzhen, China, October 13–17, 2019, Proceedings, Part I 22. Springer, pp 721–730
31.
Zurück zum Zitat Hsyu M-C, Liu C-W, Chen C-H, Chen C-W, Tsai W-C (2021) CSANET: high speed channel spatial attention network for mobile ISP. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 2486–2493 Hsyu M-C, Liu C-W, Chen C-H, Chen C-W, Tsai W-C (2021) CSANET: high speed channel spatial attention network for mobile ISP. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 2486–2493
32.
33.
Zurück zum Zitat Liu Z, Gong P, Wang J (2019) Attention-based feature pyramid network for object detection. In: Proceedings of the 2019 8th international conference on computing and pattern recognition, pp 117–121 Liu Z, Gong P, Wang J (2019) Attention-based feature pyramid network for object detection. In: Proceedings of the 2019 8th international conference on computing and pattern recognition, pp 117–121
34.
Zurück zum Zitat Guo C, Fan B, Zhang Q, Xiang S, Pan C (2020) AUGFPN: improving multi-scale feature learning for object detection. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 12595–12604 Guo C, Fan B, Zhang Q, Xiang S, Pan C (2020) AUGFPN: improving multi-scale feature learning for object detection. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 12595–12604
35.
Zurück zum Zitat Min K, Lee G-H, Lee S-W (2022) Attentional feature pyramid network for small object detection. Neural Netw 155:439–450CrossRef Min K, Lee G-H, Lee S-W (2022) Attentional feature pyramid network for small object detection. Neural Netw 155:439–450CrossRef
36.
Zurück zum Zitat Yang X, Wang W, Wu J, Ding C, Ma S, Hou Z (2022) MLA-NET: feature pyramid network with multi-level local attention for object detection. Mathematics 10(24):4789CrossRef Yang X, Wang W, Wu J, Ding C, Ma S, Hou Z (2022) MLA-NET: feature pyramid network with multi-level local attention for object detection. Mathematics 10(24):4789CrossRef
37.
Zurück zum Zitat Lin T-Y, Maire M, Belongie S, Hays J, Perona P, Ramanan D, Dollár P, Zitnick CL (2014) Microsoft coco: common objects in context. In: Computer Vision–ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6–12, 2014, Proceedings, Part V 13. Springer, Berlin, pp 740–755 Lin T-Y, Maire M, Belongie S, Hays J, Perona P, Ramanan D, Dollár P, Zitnick CL (2014) Microsoft coco: common objects in context. In: Computer Vision–ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6–12, 2014, Proceedings, Part V 13. Springer, Berlin, pp 740–755
38.
Zurück zum Zitat Everingham M, Van Gool L, Williams CK, Winn J, Zisserman A (2010) The pascal visual object classes (VOC) challenge. Int J Comput Vision 88:303–338CrossRef Everingham M, Van Gool L, Williams CK, Winn J, Zisserman A (2010) The pascal visual object classes (VOC) challenge. Int J Comput Vision 88:303–338CrossRef
39.
Zurück zum Zitat Everingham M, Eslami SA, Van Gool L, Williams CK, Winn J, Zisserman A (2015) The pascal visual object classes challenge: a retrospective. Int J Comput Vis 111:98–136CrossRef Everingham M, Eslami SA, Van Gool L, Williams CK, Winn J, Zisserman A (2015) The pascal visual object classes challenge: a retrospective. Int J Comput Vis 111:98–136CrossRef
40.
Zurück zum Zitat Zhang H, Li D, Ji Y, Zhou H, Wu W (2019) Deep learning-based beverage recognition for unmanned vending machines: an empirical study. In: 2019 IEEE 17th international conference on industrial informatics (INDIN). IEEE, vol 1, pp 1464–1467 Zhang H, Li D, Ji Y, Zhou H, Wu W (2019) Deep learning-based beverage recognition for unmanned vending machines: an empirical study. In: 2019 IEEE 17th international conference on industrial informatics (INDIN). IEEE, vol 1, pp 1464–1467
41.
Zurück zum Zitat Zhang H, Li D, Ji Y, Zhou H, Wu W, Liu K (2019) Toward new retail: a benchmark dataset for smart unmanned vending machines. IEEE Trans Ind Inf 16(12):7722–7731CrossRef Zhang H, Li D, Ji Y, Zhou H, Wu W, Liu K (2019) Toward new retail: a benchmark dataset for smart unmanned vending machines. IEEE Trans Ind Inf 16(12):7722–7731CrossRef
42.
Zurück zum Zitat Chen K, Wang J, Pang J, Cao Y, Xiong Y, Li X, Sun S, Feng W, Liu Z, Xu J, et al (2019) Mmdetection: open MMLAB detection toolbox and benchmark. arXiv preprint arXiv:1906.07155 Chen K, Wang J, Pang J, Cao Y, Xiong Y, Li X, Sun S, Feng W, Liu Z, Xu J, et al (2019) Mmdetection: open MMLAB detection toolbox and benchmark. arXiv preprint arXiv:​1906.​07155
43.
Zurück zum Zitat He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 770–778 He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 770–778
Metadaten
Titel
SAFPN: a full semantic feature pyramid network for object detection
verfasst von
Gaihua Wang
Qi Li
Nengyuan Wang
Hong Liu
Publikationsdatum
28.09.2023
Verlag
Springer London
Erschienen in
Pattern Analysis and Applications / Ausgabe 4/2023
Print ISSN: 1433-7541
Elektronische ISSN: 1433-755X
DOI
https://doi.org/10.1007/s10044-023-01200-9

Weitere Artikel der Ausgabe 4/2023

Pattern Analysis and Applications 4/2023 Zur Ausgabe

Premium Partner