20.04.2024

SPD-YOLOv8: an small-size object detection model of UAV imagery in complex scene

Authors: Rui Zhong, Ende Peng, Ziqiang Li, Qing Ai, Tao Han, Yong Tang

Published in: The Journal of Supercomputing | Issue 12/2024

Abstract

Traditional camera sensors rely on human observation. In complex scenes, however, observers tire when tracking objects of various sizes, and the inherent limits of human cognition can lead to errors of judgment. Object recognition technology, a pivotal means of categorizing objects captured by camera sensors, was introduced to overcome these challenges. This paper presents a small-size object detection algorithm designed for UAV imagery in complex scenes. The algorithm improves accuracy on small-size objects and detection performance on objects of various sizes. The paper makes three main contributions. First, a small-size object detection layer is added, and both the feature extraction network and the feature fusion network are reconfigured to capture small-size objects more effectively. Second, SPD-Conv modules replace strided convolutions and pooling layers to improve the detection accuracy of small objects. Third, the MPDIoU loss function is employed to improve the precision of bounding box fitting. The experiments use official datasets. On the VisDrone dataset, the model achieves a 10.9% increase in mAP0.5 and a 9.3% increase in mAP0.5:0.95 compared with the original YOLOv8s. The model not only meets the accuracy requirements but also accounts for the lightweight demands of deployment on embedded devices.
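
The abstract names two building blocks, SPD-Conv and the MPDIoU loss, without spelling them out. The sketch below is not the authors' implementation. It is a minimal pure-Python illustration based on the commonly published definitions of both techniques: the space-to-depth rearrangement that precedes the non-strided convolution in SPD-Conv, and an MPDIoU loss computed as 1 minus (IoU minus the squared distances between matching corners, each normalized by the squared image diagonal). The function names, the `(x1, y1, x2, y2)` box layout, and the channel ordering are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

# Assumed box layout: (x1, y1, x2, y2) = top-left and bottom-right corners.
Box = Tuple[float, float, float, float]


def space_to_depth(grid: Sequence[Sequence[float]], scale: int = 2) -> List[List[List[float]]]:
    """Split an H x W single-channel map into scale*scale maps of size (H/scale) x (W/scale).

    Every input value lands in exactly one output channel, so nothing is
    discarded, unlike a strided convolution or pooling layer. In SPD-Conv a
    stride-1 convolution would then be applied to the stacked channels.
    """
    height, width = len(grid), len(grid[0])
    if height % scale or width % scale:
        raise ValueError("height and width must be divisible by scale")
    channels = []
    for dy in range(scale):          # row offset inside each scale x scale cell
        for dx in range(scale):      # column offset inside each cell
            channels.append([
                [grid[y][x] for x in range(dx, width, scale)]
                for y in range(dy, height, scale)
            ])
    return channels


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def mpdiou_loss(pred: Box, target: Box, img_w: float, img_h: float) -> float:
    """MPDIoU loss: 1 - (IoU - d_tl^2 / (w^2 + h^2) - d_br^2 / (w^2 + h^2))."""
    norm = img_w ** 2 + img_h ** 2   # squared diagonal of the input image
    d_tl = (pred[0] - target[0]) ** 2 + (pred[1] - target[1]) ** 2  # top-left corners
    d_br = (pred[2] - target[2]) ** 2 + (pred[3] - target[3]) ** 2  # bottom-right corners
    return 1.0 - (iou(pred, target) - d_tl / norm - d_br / norm)
```

With `scale=2`, a 4 x 4 map becomes four 2 x 2 maps, which shows why the rearrangement preserves the fine detail that small objects depend on. The corner-distance terms in `mpdiou_loss` keep the loss informative even when the predicted and ground-truth boxes do not overlap and the IoU is zero.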

Metadata
Title
SPD-YOLOv8: an small-size object detection model of UAV imagery in complex scene
Authors
Rui Zhong
Ende Peng
Ziqiang Li
Qing Ai
Tao Han
Yong Tang
Publication date
20.04.2024
Publisher
Springer US
Published in
The Journal of Supercomputing / Issue 12/2024
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI
https://doi.org/10.1007/s11227-024-06121-w