Published in: The Journal of Supercomputing, Issue 12/2020

24 February 2020

Pyramid context learning for object detection

Authors: Pengxin Ding, Jianping Zhang, Huan Zhou, Xiang Zou, Minghui Wang

Abstract

Contextual information in complex scenarios is critical for accurate object detection. Existing state-of-the-art detectors have greatly improved detection performance by exploiting the context around objects. However, these detectors consider the local and global contexts separately, which limits the improvement in detection accuracy. In this paper, we propose a pyramid context learning (PCL) module for object detection, which makes full use of the feature context at different levels. Specifically, two operators, named aggregation and distribution, are designed to assemble and synthesize contextual information at different levels. In addition, a channel context learning operator is used to capture the channel context. PCL is a universal module, so it can be easily integrated into most detection frameworks. To evaluate PCL, we apply it to several popular detectors, e.g., SSD, Faster R-CNN and RetinaNet, and conduct extensive experiments on the PASCAL VOC and MS COCO datasets. Experimental results show that PCL produces competitive performance gains and significantly improves on the baselines.
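
The abstract names three operators (aggregation, distribution, channel context learning) but does not define them. The following is only an illustrative sketch of one plausible reading, not the authors' method: aggregation is assumed to resize all pyramid levels to a shared resolution and average them, distribution is assumed to resize that shared context back and add it to each level, and the channel operator is assumed to be a gate computed from each channel's global average. The function names `resize_nearest`, `aggregate`, `distribute`, and `channel_gate` are hypothetical, and no learned weights are modelled.

```python
import math


def resize_nearest(fmap, out_h, out_w):
    """Nearest-neighbour resize of a 2D feature map given as a list of rows."""
    in_h, in_w = len(fmap), len(fmap[0])
    return [
        [fmap[(i * in_h) // out_h][(j * in_w) // out_w] for j in range(out_w)]
        for i in range(out_h)
    ]


def aggregate(levels, out_h, out_w):
    """Assumed aggregation: bring every level to one size and average them."""
    resized = [resize_nearest(f, out_h, out_w) for f in levels]
    n = len(resized)
    return [
        [sum(r[i][j] for r in resized) / n for j in range(out_w)]
        for i in range(out_h)
    ]


def distribute(levels, context):
    """Assumed distribution: resize the shared context to each level and add it."""
    out = []
    for f in levels:
        h, w = len(f), len(f[0])
        ctx = resize_nearest(context, h, w)
        out.append([[f[i][j] + ctx[i][j] for j in range(w)] for i in range(h)])
    return out


def channel_gate(channels):
    """Assumed channel context: scale each channel by sigmoid(global average)."""
    gated = []
    for ch in channels:
        mean = sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        gate = 1.0 / (1.0 + math.exp(-mean))
        gated.append([[v * gate for v in row] for row in ch])
    return gated


if __name__ == "__main__":
    # Two single-channel pyramid levels: a fine 4x4 map and a coarse 2x2 map.
    pyramid = [[[1.0] * 4 for _ in range(4)], [[3.0] * 2 for _ in range(2)]]
    shared = aggregate(pyramid, 2, 2)
    enriched = distribute(pyramid, shared)
    print(shared)
    print(enriched[1])
```

In a real detector these steps would operate on multi-channel tensors with learned convolutions; the sketch keeps to single-channel lists so the data flow between pyramid levels stays visible.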

Metadata
Title: Pyramid context learning for object detection
Authors: Pengxin Ding, Jianping Zhang, Huan Zhou, Xiang Zou, Minghui Wang
Publication date: 24 February 2020
Publisher: Springer US
Published in: The Journal of Supercomputing, Issue 12/2020
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI: https://doi.org/10.1007/s11227-020-03168-3
