Cognitive Computation

Published: 26 August 2022

Learning Discriminated Features Based on Feature Pyramid Networks and Attention for Multi-scale Object Detection

Authors: Yunhua Lu, Minghui Su, Yong Wang, Zhi Liu, Tao Peng

Published in: Cognitive Computation | Issue 2/2023

Abstract

As the scenes studied in object detection become increasingly complex, the extracted feature information needs further improvement. Many multi-scale feature pyramid network methods have been proposed to improve detection accuracy. However, most of them follow a simple chain aggregation structure and therefore do not account for the distinctions between objects at different scales. Modern cognitive research suggests that human cognition is not a simple image-based matching process but involves an inherent process of information decomposition and reconstruction. Inspired by this theory, a new feature pyramid network model based on discriminative learning, denoted SuFPN, is proposed to address multi-scale object detection. SuFPN fully considers the correlation between the underlying location information and the deep feature information. First, object features are extracted through a top-down path and lateral connections. Then, deformable convolution is used to extract discriminative spatial information about objects. Finally, an attention mechanism is introduced to generate a discriminative feature map with enhanced spatial and channel interdependence, which provides excellent location information for the feature pyramid while retaining semantic information. The proposed SuFPN is validated on the PASCAL VOC and COCO datasets. The Average Precision (AP) reaches 80.0 on PASCAL VOC, 1.7 points higher than feature pyramid networks (FPN), and 39.2 on COCO, 1.8 points higher than FPN. These results indicate that SuFPN outperforms other advanced methods in multi-scale detection precision.
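The top-down path, lateral connections, and attention step named in the abstract can be illustrated with a small, dependency-free Python sketch. It is not the authors' SuFPN. All function names are hypothetical, the lateral 1x1 convolutions are assumed to have been applied already, the deformable convolution stage is left out entirely, and the two attention functions are parameter-free stand-ins for what would be learned modules in a real network.

```python
import math

# Illustrative sketch only; NOT the authors' SuFPN implementation.
# A feature map is a list of channels; a channel is a 2D list of floats.


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of every channel."""
    out = []
    for channel in fmap:
        rows = []
        for row in channel:
            wide = [v for v in row for _ in range(2)]  # repeat each column
            rows.append(wide)
            rows.append(list(wide))                    # repeat each row
        out.append(rows)
    return out


def add_maps(a, b):
    """Element-wise sum of two feature maps with identical shape."""
    return [[[x + y for x, y in zip(ra, rb)] for ra, rb in zip(ca, cb)]
            for ca, cb in zip(a, b)]


def top_down(laterals):
    """Fuse lateral maps ordered fine -> coarse, FPN style.

    Assumption: each level has half the resolution of the previous one
    and the same number of channels.
    """
    fused = [laterals[-1]]  # the coarsest level is passed through unchanged
    for lateral in reversed(laterals[:-1]):
        fused.insert(0, add_maps(lateral, upsample2x(fused[0])))
    return fused


def channel_attention(fmap):
    """Stand-in for channel attention: gate each channel by the sigmoid
    of its global average (real modules learn this mapping)."""
    out = []
    for channel in fmap:
        count = sum(len(row) for row in channel)
        mean = sum(sum(row) for row in channel) / count
        gate = sigmoid(mean)
        out.append([[v * gate for v in row] for row in channel])
    return out


def spatial_attention(fmap):
    """Stand-in for spatial attention: gate each location by the sigmoid
    of its mean across channels (real modules learn this mapping)."""
    n_ch = len(fmap)
    height, width = len(fmap[0]), len(fmap[0][0])
    gates = [[sigmoid(sum(fmap[c][i][j] for c in range(n_ch)) / n_ch)
              for j in range(width)] for i in range(height)]
    return [[[fmap[c][i][j] * gates[i][j] for j in range(width)]
             for i in range(height)] for c in range(n_ch)]


# Usage: two pyramid levels with one channel each (toy values).
fine = [[[1.0, 1.0], [1.0, 1.0]]]   # 1 channel, 2x2
coarse = [[[2.0]]]                  # 1 channel, 1x1
pyramid = top_down([fine, coarse])
refined = [spatial_attention(channel_attention(level)) for level in pyramid]
```

In this toy run the coarse map is upsampled and added to the fine one, so semantic information from the deeper level reaches the higher-resolution level before each level is reweighted. The order of channel and spatial gating here is an arbitrary choice for illustration.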

Title: Learning Discriminated Features Based on Feature Pyramid Networks and Attention for Multi-scale Object Detection
Authors: Yunhua Lu, Minghui Su, Yong Wang, Zhi Liu, Tao Peng
Publication date: 26 August 2022
Publisher: Springer US
Published in: Cognitive Computation, Issue 2/2023
Print ISSN: 1866-9956
Electronic ISSN: 1866-9964
DOI: https://doi.org/10.1007/s12559-022-10052-0
