13 August 2022

LGADet: Light-weight Anchor-free Multispectral Pedestrian Detection with Mixed Local and Global Attention

Authors: Xin Zuo, Zhi Wang, Yue Liu, Jifeng Shen, Haoran Wang

Published in: Neural Processing Letters

Abstract

Balancing accuracy and efficiency is important for multispectral pedestrian detection in practical applications. To address this trade-off, a light-weight anchor-free multispectral pedestrian detection method with a mixed Local and Global Attention (LGA) mechanism is proposed, narrowing the gap between academic research and practical application. The anchor-free detection pipeline with a light-weight backbone yields a significant speedup, while the mixed attention mechanism refines features to improve accuracy. Specifically, an anchor-free pedestrian detection framework with a MobileNetV2 backbone is first used to reduce computational complexity and speed up model inference. Second, the method uses the DMAF module to enhance complementary information between RGB and thermal image features. Finally, local and global attention mechanisms improve the quality of feature fusion and thus the detection accuracy. Experiments on the KAIST, FLIR and CVC-14 datasets show a significant improvement in miss rate (MR) compared with other state-of-the-art methods. When deployed on the Nvidia Jetson TX2, the method achieves a good compromise between accuracy and speed.
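
The abstract reports results in terms of MR without defining it. On pedestrian benchmarks such as KAIST, the usual convention is the log-average miss rate: the miss rate is sampled at nine false-positives-per-image (FPPI) reference points evenly spaced in log space over [10^-2, 10^0] and averaged geometrically. The sketch below assumes that convention; the function name and the list-based curve representation are illustrative choices, not taken from the paper.

```python
import math


def log_average_miss_rate(fppi, miss_rate, num_points=9):
    """Log-average miss rate over FPPI reference points in [1e-2, 1e0].

    `fppi` and `miss_rate` are parallel lists describing a detector's curve,
    sorted by increasing FPPI (so the miss rate is non-increasing).
    Both names are hypothetical; the abstract does not specify an API.
    """
    # Reference points evenly spaced in log space: 10^-2, 10^-1.75, ..., 10^0.
    refs = [10 ** (-2 + 2 * i / (num_points - 1)) for i in range(num_points)]

    log_sum = 0.0
    for ref in refs:
        # Use the miss rate at the largest FPPI not exceeding the reference.
        # If the curve starts above the reference, count a full miss (1.0).
        candidates = [mr for f, mr in zip(fppi, miss_rate) if f <= ref]
        mr_at_ref = candidates[-1] if candidates else 1.0
        # Clamp to avoid log(0) for a detector with zero miss rate.
        log_sum += math.log(max(mr_at_ref, 1e-10))

    # Geometric mean of the sampled miss rates.
    return math.exp(log_sum / num_points)
```

A lower value is better. A curve that never reaches any reference point scores 1.0, and a curve with a constant miss rate across the whole FPPI range scores that constant.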
Metadata
Title
LGADet: Light-weight Anchor-free Multispectral Pedestrian Detection with Mixed Local and Global Attention
Authors
Xin Zuo
Zhi Wang
Yue Liu
Jifeng Shen
Haoran Wang
Publication date
13 August 2022
Publisher
Springer US
Published in
Neural Processing Letters
Print ISSN: 1370-4621
Electronic ISSN: 1573-773X
DOI
https://doi.org/10.1007/s11063-022-10991-7