
11.05.2022 | Regular Paper

InceptionDepth-wiseYOLOv2: improved implementation of YOLO framework for pedestrian detection

Authors: Sweta Panigrahi, U. S. N. Raju

Published in: International Journal of Multimedia Information Retrieval | Issue 3/2022


Abstract

Pedestrian detection is one of the most challenging research areas in computer vision, as it involves both classifying the image and localizing the pedestrian. Its applications, especially in automated surveillance and robotics, are in high demand. Compared to traditional hand-crafted methods, convolutional neural networks (CNNs) achieve superior detection results. Single-stage detection networks, particularly the You Only Look Once (YOLO) network, attain satisfactory object detection performance without compromising computation speed and are among the state-of-the-art CNN-based methods. The YOLO framework can also be leveraged for pedestrian detection. In this work, we propose an improved YOLOv2, called InceptionDepth-wiseYOLOv2. The proposed model uses a modified DarkNet53 engineered for robust feature formation: three inception depth-wise convolution modules are integrated at varying levels in DarkNet53, producing a more comprehensive feature representation of an object in the image. The proposed method is compared with state-of-the-art detectors, i.e., Faster R-CNN, YOLOv2 with various base networks, YOLOv3, and the Single Shot MultiBox Detector. The Detection Error Trade-off curve, Precision–Recall curve, Log Average Miss Rate, and Average Precision are used to compare the methods. An analysis of the number of pedestrians detected with respect to their height is also carried out. The experimental study uses three benchmark pedestrian datasets: INRIA Pedestrian, PASCAL VOC 2012, and Caltech Pedestrian.
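For readers unfamiliar with the building block named in the abstract, the sketch below illustrates one plausible form of an inception depth-wise convolution module: parallel depth-wise separable branches with different kernel sizes whose outputs are concatenated, Inception-style, before being passed back into the backbone. This is a minimal PyTorch sketch under assumed settings (branch layout, channel counts, activation); it is not the paper's exact configuration of the three modules inserted into DarkNet53.

```python
# Minimal sketch of an "inception depth-wise convolution" module.
# Branch kernel sizes, channel widths and activations are illustrative
# assumptions, not the configuration reported in the paper.
import torch
import torch.nn as nn


class DepthwiseSeparableConv(nn.Module):
    """Depth-wise conv followed by a 1x1 point-wise conv, with BN and activation."""

    def __init__(self, in_ch, out_ch, kernel_size):
        super().__init__()
        padding = kernel_size // 2
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size,
                                   padding=padding, groups=in_ch, bias=False)
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.LeakyReLU(0.1)

    def forward(self, x):
        return self.act(self.bn(self.pointwise(self.depthwise(x))))


class InceptionDepthwiseModule(nn.Module):
    """Parallel branches (1x1 conv, 3x3 and 5x5 depth-wise separable convs)
    whose outputs are concatenated along the channel dimension."""

    def __init__(self, in_ch, branch_ch):
        super().__init__()
        self.branch1 = nn.Conv2d(in_ch, branch_ch, 1, bias=False)
        self.branch3 = DepthwiseSeparableConv(in_ch, branch_ch, 3)
        self.branch5 = DepthwiseSeparableConv(in_ch, branch_ch, 5)

    def forward(self, x):
        return torch.cat([self.branch1(x), self.branch3(x), self.branch5(x)], dim=1)


# Example: apply the module to a feature map from an intermediate backbone stage.
feats = torch.randn(1, 256, 52, 52)
module = InceptionDepthwiseModule(in_ch=256, branch_ch=128)
print(module(feats).shape)  # torch.Size([1, 384, 52, 52])
```

The depth-wise separable branches keep the parameter count low while the mixed kernel sizes give each location a multi-scale receptive field, which is the usual motivation for combining Inception-style branching with depth-wise convolutions.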


Metadata
Title
InceptionDepth-wiseYOLOv2: improved implementation of YOLO framework for pedestrian detection
Authors
Sweta Panigrahi
U. S. N. Raju
Publication date
11.05.2022
Publisher
Springer London
Published in
International Journal of Multimedia Information Retrieval / Issue 3/2022
Print ISSN: 2192-6611
Electronic ISSN: 2192-662X
DOI
https://doi.org/10.1007/s13735-022-00239-4
