International Journal of Machine Learning and Cybernetics

26-08-2019 | Original Article

Enhance the recognition ability to occlusions and small objects with Robust Faster R-CNN

Authors: Tao Zhou, Zhixin Li, Canlong Zhang

Published in: International Journal of Machine Learning and Cybernetics | Issue 11/2019

Abstract

Recognizing objects at vastly different scales and objects that are partially occluded is a fundamental challenge in computer vision. This paper addresses both problems with Robust Faster R-CNN, an approach for detecting objects in multi-label images. Robust Faster R-CNN uses a cascaded network structure built on the Faster R-CNN architecture to extract features from objects of different sizes. It is designed to be more robust than Faster R-CNN in two respects. First, it replaces the RoIPooling operation with RoIAlign, which eliminates the harsh quantization performed by RoIPooling, and extends it to a multi-scale RoIAlign operation that uses multiple pool sizes so that the network's detection ability adapts to objects of different sizes. Second, it is combined with an adversarial network that generates training samples whose occlusions significantly affect the model's classification ability; training on these samples improves robustness to occlusions. Experimental results on the PASCAL VOC 2012 and 2007 datasets show that the proposed approach outperforms several state-of-the-art approaches.
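
As a rough illustration of the multi-scale RoIAlign idea described in the abstract, the following pure-Python sketch pools one region of interest at several pool sizes and concatenates the results. It is not the authors' implementation. The function names, the single-channel feature map, the `(y_min, x_min, y_max, x_max)` RoI layout, the choice of one bilinear sample per bin centre, and the pool sizes `(2, 4)` are all assumptions made for this example. The point it shows is that RoI coordinates are never rounded, which is what avoids the quantization of RoIPooling.

```python
import math


def bilinear(feat, y, x):
    """Bilinearly interpolate a 2D feature map (list of rows) at real-valued (y, x)."""
    h, w = len(feat), len(feat[0])
    # Clamp the sample point so it stays inside the feature map.
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = feat[y0][x0] * (1 - dx) + feat[y0][x1] * dx
    bottom = feat[y1][x0] * (1 - dx) + feat[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


def roi_align(feat, roi, pool_size):
    """Pool an RoI into pool_size x pool_size bins without rounding coordinates.

    Assumption: one bilinear sample at each bin centre (a simplification;
    several samples per bin could be averaged instead).
    """
    y_min, x_min, y_max, x_max = roi
    bin_h = (y_max - y_min) / pool_size
    bin_w = (x_max - x_min) / pool_size
    return [
        [
            bilinear(feat, y_min + (i + 0.5) * bin_h, x_min + (j + 0.5) * bin_w)
            for j in range(pool_size)
        ]
        for i in range(pool_size)
    ]


def multi_scale_roi_align(feat, roi, pool_sizes=(2, 4)):
    """Concatenate the flattened RoIAlign outputs for several pool sizes."""
    out = []
    for p in pool_sizes:
        for row in roi_align(feat, roi, p):
            out.extend(row)
    return out


# Usage: a 6x6 ramp feature map and an RoI with fractional corners.
feature_map = [[x + 10 * y for x in range(6)] for y in range(6)]
features = multi_scale_roi_align(feature_map, (0.5, 0.5, 4.5, 4.5))
print(len(features))  # 2*2 + 4*4 = 20 pooled values
```

In this sketch a small pool size summarizes the region coarsely and a larger one keeps finer spatial detail, so the concatenated vector carries both; how the paper actually fuses the different pool sizes is not stated in the abstract.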


Title: Enhance the recognition ability to occlusions and small objects with Robust Faster R-CNN
Authors: Tao Zhou, Zhixin Li, Canlong Zhang
Publication date: 26-08-2019
Publisher: Springer Berlin Heidelberg
Published in: International Journal of Machine Learning and Cybernetics / Issue 11/2019
Print ISSN: 1868-8071
Electronic ISSN: 1868-808X
DOI: https://doi.org/10.1007/s13042-019-01006-4
