Neural Computing and Applications

15-02-2024 | Review

A review of small object detection based on deep learning

Authors: Wei Wei, Yu Cheng, Jiafeng He, Xiyue Zhu

Published in: Neural Computing and Applications | Issue 12/2024

Abstract

Small object detection is widely used in fields such as automatic driving, UAV-based object detection, and aerial image detection. However, small objects carry limited information, which makes them difficult for detectors to find. In recent years, advances in deep learning have significantly improved small object detection performance. This paper provides a comprehensive review to support further development of small object detection. We summarize the challenges of small object detection and analyze the solutions proposed in existing work, including integrating features from different layers, enriching the available information, balancing the number of positive and negative samples for small objects, and increasing the number of small object instances. We discuss related methods developed in three application areas: automatic driving, UAV search and rescue, and aerial image detection. In addition, we thoroughly analyze the performance of typical small object detection methods on popular datasets. Finally, based on this review, we point out possible research directions for future studies.

Title: A review of small object detection based on deep learning
Authors: Wei Wei, Yu Cheng, Jiafeng He, Xiyue Zhu
Publication date: 15-02-2024
Publisher: Springer London
Published in: Neural Computing and Applications / Issue 12/2024
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI: https://doi.org/10.1007/s00521-024-09422-6
