
20-04-2024

SPD-YOLOv8: an small-size object detection model of UAV imagery in complex scene

Authors: Rui Zhong, Ende Peng, Ziqiang Li, Qing Ai, Tao Han, Yong Tang

Published in: The Journal of Supercomputing | Issue 12/2024


Abstract

Traditional camera sensors rely on human observation. However, in complex scenes, people often experience fatigue when observing objects of various sizes. Moreover, human cognitive abilities have inherent limitations, leading to potential judgment errors. To overcome these challenges, object recognition technology, a pivotal means of categorizing objects captured by camera sensors, is introduced. This paper presents a specialized small-size object detection algorithm designed for unique scenarios. The algorithm offers distinct advantages, including enhanced accuracy in detecting small-size objects and improved detection performance for objects of various sizes. The main innovations of this paper are threefold. First, a small-size object detection layer is introduced, and both the feature extraction network and the feature fusion network are reconfigured to capture small-size objects more effectively. Second, SPD-Conv modules replace strided convolutions and pooling layers to improve the detection accuracy of small objects. Finally, the MPDIoU loss function is employed to enhance the precision of bounding box fitting. In our experiments, we used authoritative official datasets. The experimental results on the VisDrone dataset demonstrate a 10.9% increase in mAP0.5 and a 9.3% increase in mAP0.5:0.95 compared with the original YOLOv8s. The model not only meets the accuracy requirements but also accounts for the lightweight demands of deployment on embedded devices.
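
The two components named in the abstract, SPD-Conv and the MPDIoU loss, can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. It assumes the commonly cited formulations: space-to-depth rearranges each scale x scale neighbourhood of a feature map into separate channels (the non-strided convolution that would follow in SPD-Conv is omitted), and MPDIoU subtracts from the IoU the squared distances between the top-left corners and between the bottom-right corners of the two boxes, each normalised by the squared image diagonal. The function names, the channel ordering, and the (x1, y1, x2, y2) box format are illustrative choices, not taken from the paper.

```python
def space_to_depth(fmap, scale=2):
    """Rearrange one H x W channel into scale*scale channels of size
    (H/scale) x (W/scale). Every input value is kept, whereas a stride-2
    convolution or pooling step would discard or merge values.
    The channel order used here is an arbitrary, illustrative choice.
    """
    h, w = len(fmap), len(fmap[0])
    if h % scale or w % scale:
        raise ValueError("height and width must be divisible by scale")
    return [
        [row[dx::scale] for row in fmap[dy::scale]]
        for dy in range(scale)
        for dx in range(scale)
    ]


def mpdiou_loss(pred, target, img_w, img_h):
    """Assumed MPDIoU loss: 1 - (IoU - d1/norm - d2/norm).

    Boxes are (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
    d1 and d2 are squared distances between the top-left and the
    bottom-right corners; norm is the squared image diagonal.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Intersection area (zero when the boxes do not overlap).
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h

    union = (px2 - px1) * (py2 - py1) + (tx2 - tx1) * (ty2 - ty1) - inter
    iou = inter / union if union > 0 else 0.0

    d1 = (px1 - tx1) ** 2 + (py1 - ty1) ** 2  # top-left corners
    d2 = (px2 - tx2) ** 2 + (py2 - ty2) ** 2  # bottom-right corners
    norm = img_w ** 2 + img_h ** 2

    return 1.0 - (iou - d1 / norm - d2 / norm)


if __name__ == "__main__":
    grid = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
    for channel in space_to_depth(grid):
        print(channel)
    print(mpdiou_loss((0, 0, 2, 2), (1, 1, 3, 3), img_w=10, img_h=10))
```

The rearrangement halves the spatial resolution without losing any pixel, which is the property that matters when an object occupies only a few pixels. The corner-distance terms keep the loss informative even when the predicted and target boxes do not overlap and the IoU is zero.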


Metadata
Title
SPD-YOLOv8: an small-size object detection model of UAV imagery in complex scene
Authors
Rui Zhong
Ende Peng
Ziqiang Li
Qing Ai
Tao Han
Yong Tang
Publication date
20-04-2024
Publisher
Springer US
Published in
The Journal of Supercomputing / Issue 12/2024
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI
https://doi.org/10.1007/s11227-024-06121-w
