Published in: International Journal of Machine Learning and Cybernetics 3/2024

23.09.2023 | Original Article

Feature augmentation and scale penalty for tiny floating detection

Authors: Ke Li, Yining Wang, Wang Li, Siyuan Shen, Shukai Duan, Lidan Wang


Abstract

Rapidly increasing concerns about the impact of tiny floating objects on water health have prompted the need for more effective detection methods. The main challenge in detecting these objects is their small size: they account for only 0.5% of the image, which significantly hampers detection. Moreover, existing object detectors use the intersection over union (IoU) as the bounding box regression loss to improve localization accuracy. However, this approach penalizes larger objects more heavily than smaller ones, leading to imbalanced regression losses. To address these issues, we propose three enhancements to the YOLOv4 model. First, we introduce a feature augmentation module (FAM) to capture multi-scale contextual features of tiny objects together with low-level features, which helps overcome the limited representation of tiny objects in the deeper layers of the network. Second, we integrate a convolutional block attention module (CBAM) into the path aggregation network to prevent conflicting information from flooding the fusion of features at different levels, ensuring an accurate representation of tiny object features. Finally, we propose a scale penalty function to address the imbalanced regression loss. Experimental results demonstrate that our improved model achieves impressive detection performance on the Flow-RI dataset, specifically for detecting small-scale objects. These findings highlight the efficacy of our proposed methodology in enhancing the detection of tiny floating objects and contribute to the overall goal of improving water health.
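
To make the quantities in the abstract concrete, the following minimal sketch computes the plain IoU between two axis-aligned boxes, the basic regression loss 1 - IoU, and the fraction of the image that a box covers. It is an illustration only, not the authors' code: the abstract does not give the form of the scale penalty function, so it is not reproduced here. The box format `(x_min, y_min, x_max, y_max)`, the function names, and the 608 × 608 image size used in the usage note are assumptions made for this example.

```python
from typing import Tuple

# Assumed box format for this sketch: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def box_area(box: Box) -> float:
    """Area of an axis-aligned box; inverted or empty boxes count as 0."""
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Corners of the overlap rectangle (may be inverted if boxes are disjoint).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = box_area((ix1, iy1, ix2, iy2))
    union = box_area(a) + box_area(b) - inter
    return inter / union if union > 0 else 0.0


def iou_loss(pred: Box, target: Box) -> float:
    """Basic IoU regression loss, 1 - IoU (no scale penalty applied)."""
    return 1.0 - iou(pred, target)


def area_fraction(box: Box, image_w: int, image_h: int) -> float:
    """Share of the image area covered by the box."""
    return box_area(box) / float(image_w * image_h)
```

As a usage example under the assumed image size, `area_fraction((0, 0, 43, 43), 608, 608)` is roughly 0.005, so a 43 × 43 pixel object in a 608 × 608 image sits at the 0.5% scale the abstract describes. A scale-aware loss would add a size-dependent term on top of `iou_loss`; its exact form is defined in the paper, not here.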

Metadata
Title
Feature augmentation and scale penalty for tiny floating detection
Authors
Ke Li
Yining Wang
Wang Li
Siyuan Shen
Shukai Duan
Lidan Wang
Publication date
23.09.2023
Publisher
Springer Berlin Heidelberg
Published in
International Journal of Machine Learning and Cybernetics / Issue 3/2024
Print ISSN: 1868-8071
Electronic ISSN: 1868-808X
DOI
https://doi.org/10.1007/s13042-023-01943-1
