Published in: Pattern Analysis and Applications 3/2020

21-10-2019 | Theoretical advances

An over-regression suppression method to discriminate occluded objects of same category

Authors: Bin Zhao, Chunping Wang, Qiang Fu


Abstract

Occlusion is a key challenge in object detection. Objects are hard to discriminate accurately when they cluster together and occlude each other, especially when they belong to the same category, because multiple objects are then easily regressed into the same bounding box. To address this problem, an over-regression suppression (ORS) method is proposed that takes full advantage of the supervised information. First, the annotations are used to compute the overlaps between different ground truth boxes. Then, the regression loss function is redesigned by adding a penalty term, associated with these overlaps, to prevent over-regression. Finally, the method is validated by modifying Faster R-CNN: a k-means++ clustering algorithm automatically generates anchors of various sizes by learning the shape regularities of objects from the dataset, and Soft-NMS, a nearly cost-free method, replaces the traditional NMS. Extensive evaluations on the challenging PASCAL VOC and MS COCO benchmarks demonstrate the superiority of ORS in handling intra-class occlusion. The improvement grows when the dataset contains more large objects and hard samples, as demonstrated by the results on the MS COCO dataset.
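
Two of the building blocks named in the abstract can be illustrated with a minimal sketch: the pairwise overlaps between ground truth boxes, and the Gaussian score decay of Soft-NMS as described by Bodla et al. This is not the authors' implementation. The function names, the (x1, y1, x2, y2) box layout, and the default values of `sigma` and `score_thresh` are assumptions chosen for illustration, and the ORS penalty term itself is not reproduced because the abstract does not give its form.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def gt_overlaps(gt_boxes):
    """Pairwise IoU matrix between ground truth boxes (diagonal set to 0)."""
    n = len(gt_boxes)
    return [
        [0.0 if i == j else iou(gt_boxes[i], gt_boxes[j]) for j in range(n)]
        for i in range(n)
    ]


def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS: decay the scores of overlapping boxes instead of
    discarding them. Returns (index, final_score) pairs in selection order.
    sigma and score_thresh are assumed defaults, not values from the paper.
    """
    scores = list(scores)  # work on a copy so the caller's list is unchanged
    remaining = list(range(len(boxes)))
    keep = []
    while remaining:
        best = max(remaining, key=lambda i: scores[i])
        remaining.remove(best)
        if scores[best] < score_thresh:
            break  # every remaining box scores even lower
        keep.append((best, scores[best]))
        for i in remaining:
            overlap = iou(boxes[best], boxes[i])
            scores[i] *= math.exp(-(overlap * overlap) / sigma)
    return keep
```

For two same-class boxes with an IoU of 1/3, hard NMS with a threshold of 0.3 would discard the lower-scoring box, whereas this sketch only lowers its score, which is why Soft-NMS suits crowded scenes with intra-class occlusion.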


Metadata
Title
An over-regression suppression method to discriminate occluded objects of same category
Authors
Bin Zhao
Chunping Wang
Qiang Fu
Publication date
21-10-2019
Publisher
Springer London
Published in
Pattern Analysis and Applications / Issue 3/2020
Print ISSN: 1433-7541
Electronic ISSN: 1433-755X
DOI
https://doi.org/10.1007/s10044-019-00853-9
