Skip to main content
Top
Published in: International Journal of Computer Vision 11-12/2019

17-07-2018

End-to-End Learning of Latent Deformable Part-Based Representations for Object Detection

Authors: Taylor Mordan, Nicolas Thome, Gilles Henaff, Matthieu Cord

Published in: International Journal of Computer Vision | Issue 11-12/2019

Log in

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

Object detection methods usually represent objects through rectangular bounding boxes from which they extract features, regardless of their actual shapes. In this paper, we apply deformations to regions in order to learn representations better fitted to objects. We introduce DP-FCN, a deep model implementing this idea by learning to align parts to discriminative elements of objects in a latent way, i.e. without part annotation. This approach has two main assets: it builds invariance to local transformations, thus improving recognition, and brings geometric information to describe objects more finely, leading to a more accurate localization. We further develop both features in a new model named DP-FCN2.0 by explicitly learning interactions between parts. Alignment is done with an in-network joint optimization of all parts based on a CRF with custom potentials, and deformations are influencing localization through a bilinear product. We validate our models on PASCAL VOC and MS COCO datasets and show significant gains. DP-FCN2.0 achieves state-of-the-art results of 83.3 and 81.2% on VOC 2007 and 2012 with VOC data only.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Literature
go back to reference Azizpour, H., & Laptev, I.(2012). Object detection using strongly-supervised deformable part models. In Proceedings of the IEEE European conference on computer vision (ECCV) (pp. 836–849). Azizpour, H., & Laptev, I.(2012). Object detection using strongly-supervised deformable part models. In Proceedings of the IEEE European conference on computer vision (ECCV) (pp. 836–849).
go back to reference Bell, S., Zitnick, L., Bala, K., & Girshick, R.(2016). Inside-outside net: Detecting objects in context with skip pooling and recurrent neural networks. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR). Bell, S., Zitnick, L., Bala, K., & Girshick, R.(2016). Inside-outside net: Detecting objects in context with skip pooling and recurrent neural networks. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR).
go back to reference Ben-Younes, H., Cadène, R., Thome, N., & Cord M. (2017). MUTAN: Multimodal tucker fusion for visual question answering. In Proceedings of the IEEE international conference on computer vision (ICCV). Ben-Younes, H., Cadène, R., Thome, N., & Cord M. (2017). MUTAN: Multimodal tucker fusion for visual question answering. In Proceedings of the IEEE international conference on computer vision (ICCV).
go back to reference Chandra, S., Usunier, N., Kokkinos, I. (2017). Dense and low-rank gaussian CRFs using deep embeddings. In Proceedings of the IEEE international conference on computer vision (ICCV). Chandra, S., Usunier, N., Kokkinos, I. (2017). Dense and low-rank gaussian CRFs using deep embeddings. In Proceedings of the IEEE international conference on computer vision (ICCV).
go back to reference Chen, L. C., Papandreou, G., Kokkinos, I., Murphy, K., & Yuille, A. (2015). Semantic image segmentation with deep convolutional nets and fully connected CRFs. In Proceedings of the international conference on learning representations (ICLR). Chen, L. C., Papandreou, G., Kokkinos, I., Murphy, K., & Yuille, A. (2015). Semantic image segmentation with deep convolutional nets and fully connected CRFs. In Proceedings of the international conference on learning representations (ICLR).
go back to reference Dai, J., He, K., Li, Y., Ren, S., & Sun, J. (2016a). Instance-sensitive fully convolutional networks. In Proceedings of the IEEE European conference on computer vision (ECCV) (pp. 534–549). Dai, J., He, K., Li, Y., Ren, S., & Sun, J. (2016a). Instance-sensitive fully convolutional networks. In Proceedings of the IEEE European conference on computer vision (ECCV) (pp. 534–549).
go back to reference Dai, J., Li, Y., He, K., & Sun, J. (2016b). R-FCN: Object detection via region-based fully convolutional networks. In Advances in neural information processing systems (NIPS). Dai, J., Li, Y., He, K., & Sun, J. (2016b). R-FCN: Object detection via region-based fully convolutional networks. In Advances in neural information processing systems (NIPS).
go back to reference Dai, J., Qi, H., Xiong, Y., Li, Y., Zhang, G., Hu, H., & Wei, Y. (2017). Deformable convolutional networks. In Proceedings of the IEEE international conference on computer vision (ICCV). Dai, J., Qi, H., Xiong, Y., Li, Y., Zhang, G., Hu, H., & Wei, Y. (2017). Deformable convolutional networks. In Proceedings of the IEEE international conference on computer vision (ICCV).
go back to reference Durand, T., Mordan, T., Thome, N., & Cord, M. (2017). WILDCAT: Weakly supervised learning of deep convnets for image classification, pointwise localization and segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR). Durand, T., Mordan, T., Thome, N., & Cord, M. (2017). WILDCAT: Weakly supervised learning of deep convnets for image classification, pointwise localization and segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR).
go back to reference Everingham, M., Eslami, A., Van Gool, L., Williams, C., Winn, J., & Zisserman, A. (2015). The PASCAL visual object classes challenge: A retrospective. International Journal of Computer Vision (IJCV), 111(1), 98–136.CrossRef Everingham, M., Eslami, A., Van Gool, L., Williams, C., Winn, J., & Zisserman, A. (2015). The PASCAL visual object classes challenge: A retrospective. International Journal of Computer Vision (IJCV), 111(1), 98–136.CrossRef
go back to reference Felzenszwalb, P., Girshick, R., McAllester, D., & Ramanan, D. (2010). Object detection with discriminatively trained part-based models. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 32(9), 1627–1645.CrossRef Felzenszwalb, P., Girshick, R., McAllester, D., & Ramanan, D. (2010). Object detection with discriminatively trained part-based models. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 32(9), 1627–1645.CrossRef
go back to reference Fidler, S., Mottaghi, R., Yuille, A., & Urtasun, R. (2013). Bottom-up segmentation for top-down detection. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR) (pp. 3294–3301). Fidler, S., Mottaghi, R., Yuille, A., & Urtasun, R. (2013). Bottom-up segmentation for top-down detection. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR) (pp. 3294–3301).
go back to reference Gidaris, S., & Komodakis, N.(2015). Object detection via a multi-region and semantic segmentation-aware CNN model. In Proceedings of the IEEE international conference on computer vision (ICCV) (pp. 1134–1142). Gidaris, S., & Komodakis, N.(2015). Object detection via a multi-region and semantic segmentation-aware CNN model. In Proceedings of the IEEE international conference on computer vision (ICCV) (pp. 1134–1142).
go back to reference Gidaris, S., & Komodakis, N. (2016a). Attend refine repeat: Active box proposal generation via in-out localization. In Proceedings of the British machine vision conference (BMVC). Gidaris, S., & Komodakis, N. (2016a). Attend refine repeat: Active box proposal generation via in-out localization. In Proceedings of the British machine vision conference (BMVC).
go back to reference Gidaris, S., & Komodakis, N.(2016b). LocNet: Improving localization accuracy for object detection. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR). Gidaris, S., & Komodakis, N.(2016b). LocNet: Improving localization accuracy for object detection. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR).
go back to reference Girshick, R. (2015). Fast R-CNN. In Proceedings of the IEEE international conference on computer vision (ICCV) (pp. 1440–1448). Girshick, R. (2015). Fast R-CNN. In Proceedings of the IEEE international conference on computer vision (ICCV) (pp. 1440–1448).
go back to reference Girshick, R., Donahue, J., Darrell, T., & Malik, J. (2014). Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR) (pp. 580–587). Girshick, R., Donahue, J., Darrell, T., & Malik, J. (2014). Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR) (pp. 580–587).
go back to reference Girshick, R., Iandola, F., Darrell, T., & Malik, J. (2015). Deformable part models are convolutional neural networks. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR) (pp. 437–446). Girshick, R., Iandola, F., Darrell, T., & Malik, J. (2015). Deformable part models are convolutional neural networks. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR) (pp. 437–446).
go back to reference He, K., Zhang, X., Ren, S., & Sun, J. (2015). Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 37(9), 1904–1916.CrossRef He, K., Zhang, X., Ren, S., & Sun, J. (2015). Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 37(9), 1904–1916.CrossRef
go back to reference He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR). He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR).
go back to reference Kong, T., Yao, A., Chen, Y., & Sun, F. (2016). HyperNet: Towards accurate region proposal generation and joint object detection. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR). Kong, T., Yao, A., Chen, Y., & Sun, F. (2016). HyperNet: Towards accurate region proposal generation and joint object detection. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR).
go back to reference Krähenbühl, P., & Koltun, V. (2011) Efficient inference in fully connected CRFs with Gaussian ddge potentials. In Advances in neural information processing systems (NIPS) (pp. 109–117). Krähenbühl, P., & Koltun, V. (2011) Efficient inference in fully connected CRFs with Gaussian ddge potentials. In Advances in neural information processing systems (NIPS) (pp. 109–117).
go back to reference Krizhevsky, A., Sutskever, I., & Hinton, G. (2012). ImageNet classification with deep convolutional neural networks. In Advances in neural information processing systems (NIPS) (pp. 1097–1105). Krizhevsky, A., Sutskever, I., & Hinton, G. (2012). ImageNet classification with deep convolutional neural networks. In Advances in neural information processing systems (NIPS) (pp. 1097–1105).
go back to reference Lafferty, J., McCallum, A., & Pereira, F. (2001). Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proceedings of the international conference on machine learning (ICML). Lafferty, J., McCallum, A., & Pereira, F. (2001). Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proceedings of the international conference on machine learning (ICML).
go back to reference LeCun, Y., Boser, B., Denker, J., Henderson, D., Howard, R., Hubbard, W., et al. (1989). Backpropagation applied to handwritten zip code recognition. Neural Computation, 1(4), 541–551.CrossRef LeCun, Y., Boser, B., Denker, J., Henderson, D., Howard, R., Hubbard, W., et al. (1989). Backpropagation applied to handwritten zip code recognition. Neural Computation, 1(4), 541–551.CrossRef
go back to reference Li, Y., Qi, H., Dai, J., Ji, X., & Wei, Y. (2017). Fully convolutional instance-aware semantic segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR). Li, Y., Qi, H., Dai, J., Ji, X., & Wei, Y. (2017). Fully convolutional instance-aware semantic segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR).
go back to reference Lin, D., Shen, X., Lu, C., & Jia, J. (2015). Deep LAC: Deep localization, alignment and classification for fine-grained recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR) (pp. 1666–1674). Lin, D., Shen, X., Lu, C., & Jia, J. (2015). Deep LAC: Deep localization, alignment and classification for fine-grained recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR) (pp. 1666–1674).
go back to reference Lin, T. Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., & Zitnick, L. (2014). Microsoft COCO: Common objects in context. In Proceedings of the IEEE European conference on computer vision (ECCV) (pp. 740–755). Lin, T. Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., & Zitnick, L. (2014). Microsoft COCO: Common objects in context. In Proceedings of the IEEE European conference on computer vision (ECCV) (pp. 740–755).
go back to reference Lin, T. Y., Dollár, P., Girshick, R., He, K., Hariharan, B., & Belongie, S. (2017a). Feature pyramid networks for object detection. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR). Lin, T. Y., Dollár, P., Girshick, R., He, K., Hariharan, B., & Belongie, S. (2017a). Feature pyramid networks for object detection. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR).
go back to reference Lin, T. Y., Goyal, P., Girshick, R., He, K., & Dollár, P. (2017b). Focal loss for dense object detection. In Proceedings of the IEEE international conference on computer vision (ICCV). Lin, T. Y., Goyal, P., Girshick, R., He, K., & Dollár, P. (2017b). Focal loss for dense object detection. In Proceedings of the IEEE international conference on computer vision (ICCV).
go back to reference Liu, W., Anguelov, D., Erhan, D., Szegedy, C., & Reed, S.(2016). SSD: Single shot multibox detector. In Proceedings of the IEEE European conference on computer vision (ECCV). Liu, W., Anguelov, D., Erhan, D., Szegedy, C., & Reed, S.(2016). SSD: Single shot multibox detector. In Proceedings of the IEEE European conference on computer vision (ECCV).
go back to reference Long, J., Shelhamer, E., & Darrell, T. (2015). Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR) (pp. 3431–3440). Long, J., Shelhamer, E., & Darrell, T. (2015). Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR) (pp. 3431–3440).
go back to reference Mordan, T., Thome, N., Cord, M., & Henaff, G. (2017). Deformable part-based fully convolutional network for object detection. In Proceedings of the British machine vision conference (BMVC). Mordan, T., Thome, N., Cord, M., & Henaff, G. (2017). Deformable part-based fully convolutional network for object detection. In Proceedings of the British machine vision conference (BMVC).
go back to reference Ott, P., & Everingham, M. (2011). Shared parts for deformable part-based models. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR) (pp. 1513–1520). Ott, P., & Everingham, M. (2011). Shared parts for deformable part-based models. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR) (pp. 1513–1520).
go back to reference Pinheiro, P., Lin, T. Y., Collobert, R., & Dollár, P. (2016) Learning to refine object segments. In Proceedings of the IEEE European conference on computer vision (ECCV) (pp. 75–91). Pinheiro, P., Lin, T. Y., Collobert, R., & Dollár, P. (2016) Learning to refine object segments. In Proceedings of the IEEE European conference on computer vision (ECCV) (pp. 75–91).
go back to reference Redmon, J., & Farhadi, A. (2017). YOLO9000: Better, faster, stronger. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR). Redmon, J., & Farhadi, A. (2017). YOLO9000: Better, faster, stronger. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR).
go back to reference Redmon, J., Divvala, S., Girshick, R., & Farhadi, A. (2016). You only look once: Unified, real-time object detection. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR). Redmon, J., Divvala, S., Girshick, R., & Farhadi, A. (2016). You only look once: Unified, real-time object detection. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR).
go back to reference Ren, S., He, K., Girshick, R., & Sun, J. (2015). Faster R-CNN: Towards real-time object detection with region proposal networks. In Advances in neural information processing systems (NIPS) (pp. 91–99). Ren, S., He, K., Girshick, R., & Sun, J. (2015). Faster R-CNN: Towards real-time object detection with region proposal networks. In Advances in neural information processing systems (NIPS) (pp. 91–99).
go back to reference Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., et al. (2015). ImageNet large scale visual recognition challenge. International Journal of Computer Vision (IJCV), 115(3), 211–252.MathSciNetCrossRef Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., et al. (2015). ImageNet large scale visual recognition challenge. International Journal of Computer Vision (IJCV), 115(3), 211–252.MathSciNetCrossRef
go back to reference Savalle, P. A., Tsogkas, S., Papandreou, G., & Kokkinos, I. (2014). Deformable part models with CNN features. In Proceedings of the IEEE European conference on computer vision (ECCV), parts and attributes workshop. Savalle, P. A., Tsogkas, S., Papandreou, G., & Kokkinos, I. (2014). Deformable part models with CNN features. In Proceedings of the IEEE European conference on computer vision (ECCV), parts and attributes workshop.
go back to reference Shrivastava, A., Gupta, A., & Girshick, R. (2016). Training region-based object detectors with online hard example mining. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR). Shrivastava, A., Gupta, A., & Girshick, R. (2016). Training region-based object detectors with online hard example mining. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR).
go back to reference Sicre, R., Avrithis, Y., Kijak, E., & Jurie, F. (2017). Unsupervised part learning for visual recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR). Sicre, R., Avrithis, Y., Kijak, E., & Jurie, F. (2017). Unsupervised part learning for visual recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR).
go back to reference Simon, M., & Rodner, E. (2015). Neural activation constellations: Unsupervised part model discovery with convolutional networks. In Proceedings of the IEEE international conference on computer vision (ICCV) (pp. 1143–1151). Simon, M., & Rodner, E. (2015). Neural activation constellations: Unsupervised part model discovery with convolutional networks. In Proceedings of the IEEE international conference on computer vision (ICCV) (pp. 1143–1151).
go back to reference Simonyan, K., & Zisserman, A. (2015) Very deep convolutional networks for large-scale image recognition. In Proceedings of the international conference on learning representations (ICLR). Simonyan, K., & Zisserman, A. (2015) Very deep convolutional networks for large-scale image recognition. In Proceedings of the international conference on learning representations (ICLR).
go back to reference Wan, L., Eigen, D., & Fergus, R. (2015). End-to-end integration of a convolution network, deformable parts model and non-maximum suppression. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR) (pp. 851–859). Wan, L., Eigen, D., & Fergus, R. (2015). End-to-end integration of a convolution network, deformable parts model and non-maximum suppression. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR) (pp. 851–859).
go back to reference Wang, P., Shen, X., Lin, Z., Cohen, S., Price, B., & Yuille, A. L. (2015). Joint object and part segmentation using deep learned potentials. In Proceedings of the IEEE international conference on computer vision (ICCV) (pp. 1573–1581). Wang, P., Shen, X., Lin, Z., Cohen, S., Price, B., & Yuille, A. L. (2015). Joint object and part segmentation using deep learned potentials. In Proceedings of the IEEE international conference on computer vision (ICCV) (pp. 1573–1581).
go back to reference Xie, S., Girshick, R., Dollár, P., Tu, Z., & He, K. (2017). Aggregated residual transformations for deep neural networks. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR). Xie, S., Girshick, R., Dollár, P., Tu, Z., & He, K. (2017). Aggregated residual transformations for deep neural networks. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR).
go back to reference Yu, F., & Koltun, V. (2016). Multi-scale context aggregation by dilated convolutions. In Proceedings of the international conference on learning representations (ICLR). Yu, F., & Koltun, V. (2016). Multi-scale context aggregation by dilated convolutions. In Proceedings of the international conference on learning representations (ICLR).
go back to reference Zagoruyko, S., & Komodakis, N. (2016). Wide residual networks. In Proceedings of the British machine vision conference (BMVC). Zagoruyko, S., & Komodakis, N. (2016). Wide residual networks. In Proceedings of the British machine vision conference (BMVC).
go back to reference Zagoruyko, S., Lerer, A., Lin, T. Y., Pinheiro, P., Gross, S., Chintala, S., & Dollar, P. (2016). A multipath network for object detection. In Proceedings of the British machine vision conference (BMVC). Zagoruyko, S., Lerer, A., Lin, T. Y., Pinheiro, P., Gross, S., Chintala, S., & Dollar, P. (2016). A multipath network for object detection. In Proceedings of the British machine vision conference (BMVC).
go back to reference Zhang, H., Xu, T., Elhoseiny, M., Huang, X., Zhang, S., Elgammal, A., & Metaxas, D. (2016). SPDA-CNN: Unifying semantic part detection and abstraction for fine-grained recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR) (pp. 1143–1152). Zhang, H., Xu, T., Elhoseiny, M., Huang, X., Zhang, S., Elgammal, A., & Metaxas, D. (2016). SPDA-CNN: Unifying semantic part detection and abstraction for fine-grained recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR) (pp. 1143–1152).
go back to reference Zhang, N., Donahue, J., Girshick, R., & Darrell, T. (2014). Part-based R-CNNs for fine-grained category detection. In Proceedings of the IEEE European conference on computer vision (ECCV) (pp. 834–849). Zhang, N., Donahue, J., Girshick, R., & Darrell, T. (2014). Part-based R-CNNs for fine-grained category detection. In Proceedings of the IEEE European conference on computer vision (ECCV) (pp. 834–849).
go back to reference Zheng, S., Jayasumana, S., Romera-Paredes, B., Vineet, V., Su, Z., Du, D., Huang, C., & Torr, P. (2015). Conditional random fields as recurrent neural networks. In Proceedings of the IEEE international conference on computer vision (ICCV) (pp. 1529–1537). Zheng, S., Jayasumana, S., Romera-Paredes, B., Vineet, V., Su, Z., Du, D., Huang, C., & Torr, P. (2015). Conditional random fields as recurrent neural networks. In Proceedings of the IEEE international conference on computer vision (ICCV) (pp. 1529–1537).
go back to reference Zhu, L., Chen, Y., Yuille, A., & Freeman, W. (2010). Latent hierarchical structural learning for object detection. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR) (pp. 1062–1069). Zhu, L., Chen, Y., Yuille, A., & Freeman, W. (2010). Latent hierarchical structural learning for object detection. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR) (pp. 1062–1069).
Metadata
Title
End-to-End Learning of Latent Deformable Part-Based Representations for Object Detection
Authors
Taylor Mordan
Nicolas Thome
Gilles Henaff
Matthieu Cord
Publication date
17-07-2018
Publisher
Springer US
Published in
International Journal of Computer Vision / Issue 11-12/2019
Print ISSN: 0920-5691
Electronic ISSN: 1573-1405
DOI
https://doi.org/10.1007/s11263-018-1109-z

Other articles of this Issue 11-12/2019

International Journal of Computer Vision 11-12/2019 Go to the issue

Premium Partner