Pattern Analysis and Applications

28-09-2023 | Theoretical Advances

SAFPN: a full semantic feature pyramid network for object detection

Authors: Gaihua Wang, Qi Li, Nengyuan Wang, Hong Liu

Published in: Pattern Analysis and Applications | Issue 4/2023

Abstract

To improve the performance of object detection algorithms, this paper proposes the segmentation attention feature pyramid network (SAFPN), which addresses the loss of semantic information. Unlike prior work, SAFPN discards the original \(1\times 1\) convolutions and instead reduces the feature dimension through a segmentation and accumulation architecture, thereby fully preserving the semantic information of the high-dimensional features. To capture fine-grained semantic details, it integrates channel attention and spatial attention mechanisms that strengthen the network's focus on important information. Extensive experiments show that SAFPN achieves favorable results on multiple public datasets and improves performance on the object detection task.
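The abstract gives no implementation details, so the following is only a minimal NumPy sketch of one way the described idea could look. It assumes that "segmentation and accumulation" means splitting the channels into contiguous equal-width segments and summing them element-wise, and it uses simple parameter-free sigmoid gates as stand-ins for the channel and spatial attention. The function names, the grouping scheme, the gate design, and the order of the two attention steps are all illustrative assumptions, not the authors' method.

```python
import numpy as np


def segment_and_accumulate(x: np.ndarray, out_channels: int) -> np.ndarray:
    """Reduce a (C, H, W) feature map to (out_channels, H, W) without a 1x1 conv.

    Assumption: the channels are cut into C // out_channels contiguous
    segments of width out_channels, and the segments are summed element-wise.
    """
    c, h, w = x.shape
    if c % out_channels != 0:
        raise ValueError("channel count must be a multiple of out_channels")
    # Axis 0 indexes the segment, axis 1 the channel inside each segment.
    segments = x.reshape(c // out_channels, out_channels, h, w)
    return segments.sum(axis=0)


def _sigmoid(z: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-z))


def channel_attention(x: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in: gate each channel by the sigmoid of its mean."""
    weights = _sigmoid(x.mean(axis=(1, 2)))  # shape (C,)
    return x * weights[:, None, None]


def spatial_attention(x: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in: gate each location by the sigmoid of its channel mean."""
    gate = _sigmoid(x.mean(axis=0))  # shape (H, W)
    return x * gate[None, :, :]


def reduce_with_attention(x: np.ndarray, out_channels: int) -> np.ndarray:
    """Segment-and-accumulate, then channel and spatial gating (order assumed)."""
    reduced = segment_and_accumulate(x, out_channels)
    return spatial_attention(channel_attention(reduced))


if __name__ == "__main__":
    features = np.arange(8 * 2 * 2, dtype=float).reshape(8, 2, 2)
    print(reduce_with_attention(features, out_channels=4).shape)
```

Unlike a learned \(1\times 1\) projection, the summation in this sketch has no weights, so every input channel contributes to the reduced map with equal weight; a real attention module would normally also contain learned layers.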

Title: SAFPN: a full semantic feature pyramid network for object detection
Authors: Gaihua Wang, Qi Li, Nengyuan Wang, Hong Liu
Publication date: 28-09-2023
Publisher: Springer London
Published in: Pattern Analysis and Applications / Issue 4/2023
Print ISSN: 1433-7541
Electronic ISSN: 1433-755X
DOI: https://doi.org/10.1007/s10044-023-01200-9
