Neural Computing and Applications

19.04.2018 | Original Article

An enhanced SSD with feature fusion and visual reasoning for object detection

Authors: Jiaxu Leng, Ying Liu

Published in: Neural Computing and Applications | Issue 10/2019

Abstract

Single Shot Multibox Detector (SSD) is one of the top-performing object detection algorithms in terms of both accuracy and speed. SSD achieves impressive performance on various datasets by using different output layers for object detection. However, each layer in the feature pyramid is used independently, and SSD considers only the fine-grained details of objects while ignoring the context surrounding them. In this paper, we propose an enhanced SSD, called ESSD, that improves on the conventional SSD by fusing feature maps of different output layers instead of growing layers close to the input data. Our method uses two-way transfer of feature information and feature fusion to enhance the network. To further assist object detection, we propose a visual reasoning method that fully exploits the relationships between objects instead of using only the features of the objects themselves. Visual reasoning proves very effective for detecting objects that are too small or have small features. To evaluate ESSD, we trained the model on the VOC2007 and VOC2012 training sets and evaluated it on the Pascal VOC2007 test set. For \(300 \times 300\) input, ESSD achieved 79.2% mean average precision (mAP) at 52.0 frames per second (FPS); for \(512 \times 512\) input, it achieved 82.4% mAP at 18.6 FPS. These results demonstrate that the proposed method achieves state-of-the-art mAP, outperforming the conventional SSD and other advanced detectors.
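
The abstract does not say how the two-way transfer and fusion are carried out, so the following is only a rough sketch of the general idea of exchanging information between two pyramid levels. Everything in it is an assumption made for illustration: the function names, the use of nearest-neighbour upsampling and max pooling for resizing, the element-wise sum as the fusion operator, and the requirement that both maps share a channel count. The 38 x 38 and 19 x 19 sizes in the usage lines are chosen to resemble the first two SSD300 feature maps.

```python
import numpy as np

def upsample_nearest(fmap, factor):
    """Nearest-neighbour upsampling of a (C, H, W) map by an integer factor."""
    return fmap.repeat(factor, axis=1).repeat(factor, axis=2)

def downsample_max(fmap, factor):
    """Max-pool a (C, H, W) map over non-overlapping factor x factor windows."""
    c, h, w = fmap.shape
    blocks = fmap.reshape(c, h // factor, factor, w // factor, factor)
    return blocks.max(axis=(2, 4))

def fuse_two_way(fine, coarse):
    """Hypothetical two-way fusion: each level receives the other level,
    resized to its own resolution, and adds it element-wise (assumed operator)."""
    if fine.shape[0] != coarse.shape[0]:
        raise ValueError("this sketch assumes equal channel counts")
    if fine.shape[1] % coarse.shape[1] or fine.shape[2] % coarse.shape[2]:
        raise ValueError("fine size must be an integer multiple of coarse size")
    factor = fine.shape[1] // coarse.shape[1]
    fine_fused = fine + upsample_nearest(coarse, factor)    # context flows down
    coarse_fused = coarse + downsample_max(fine, factor)    # detail flows up
    return fine_fused, coarse_fused

# Usage with toy inputs: a 38 x 38 "fine" map and a 19 x 19 "coarse" map.
fine = np.ones((4, 38, 38))
coarse = np.full((4, 19, 19), 2.0)
fine_fused, coarse_fused = fuse_two_way(fine, coarse)
print(fine_fused.shape, coarse_fused.shape)
```

A real detector would normally use learned layers (for example convolutions or deconvolutions) to resize and match channels before fusing; the fixed operators here only keep the sketch dependency-free.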

Title: An enhanced SSD with feature fusion and visual reasoning for object detection
Authors: Jiaxu Leng, Ying Liu
Publication date: 19.04.2018
Publisher: Springer London
Published in: Neural Computing and Applications / Issue 10/2019
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI: https://doi.org/10.1007/s00521-018-3486-1
