nach oben

International Journal of Computer Vision

Erschienen in:

Open Access 17.10.2017

Do Semantic Parts Emerge in Convolutional Neural Networks?

verfasst von: Abel Gonzalez-Garcia, Davide Modolo, Vittorio Ferrari

Erschienen in: International Journal of Computer Vision | Ausgabe 5/2018

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config

KI-gestützte Suche

Aus

Abstract

Semantic object parts can be useful for several visual recognition tasks. Lately, these tasks have been addressed using Convolutional Neural Networks (CNN), achieving outstanding results. In this work we study whether CNNs learn semantic parts in their internal representation. We investigate the responses of convolutional filters and try to associate their stimuli with semantic parts. We perform two extensive quantitative analyses. First, we use ground-truth part bounding-boxes from the PASCAL-Part dataset to determine how many of those semantic parts emerge in the CNN. We explore this emergence for different layers, network depths, and supervision levels. Second, we collect human judgements in order to study what fraction of all filters systematically fire on any semantic part, even if not annotated in PASCAL-Part. Moreover, we explore several connections between discriminative power and semantics. We find out which are the most discriminative filters for object recognition, and analyze whether they respond to semantic parts or to other image patches. We also investigate the other direction: we determine which semantic parts are the most discriminative and whether they correspond to those parts emerging in the network. This enables to gain an even deeper understanding of the role of semantic parts in the network.

Vorheriger Artikel Dense Reconstruction of Transparent Objects by Altering Incident Light Paths Through Refraction

Nächster Artikel Classification of Multi-class Daily Human Motion using Discriminative Body Parts and Sentence Descriptions

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

über 102.000 Bücher
über 537 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Finance + Banking
Management + Führung
Marketing + Vertrieb
Maschinenbau + Werkstoffe
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 340 Zeitschriften

aus folgenden Fachgebieten:

Bauwesen + Immobilien
Business IT + Informatik
Finance + Banking
Management + Führung
Marketing + Vertrieb
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 390 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Maschinenbau + Werkstoffe

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Agrawal, P., Girshick, R., & Malik, J. (2014). Analyzing the performance of multilayer neural networks for object recognition. In ECCV.

Caesar, H., Uijlings, J., & Ferrari, V. (2015). Joint calibration for semantic segmentation. In BMVC.

Chen, X., Mottaghi, R., Liu, X., Fidler, S., Urtasun, R., & Yuille, A. (2014). Detect what you can: Detecting and representing objects using holistic models and body parts. In CVPR.

Dalal, N., & Triggs, B. (2005). Histogram of oriented gradients for human detection. In CVPR.

Eigen, D., Rolfe, J., Fergus, R., & LeCun, Y. (2013). Understanding deep architectures using a recursive convolutional network. In ICLR workshop.

Everingham, M., Van Gool, L., Williams, C. K. I., Winn, J., & Zisserman, A. (2010). The pascal visual object classes (VOC) challenge. International Journal of Computer Vision, 88(2), 303–338.

Felzenszwalb, P., Girshick, R., McAllester, D., & Ramanan, D. (2010). Object detection with discriminatively trained part based models. IEEE transactions on pattern analysis and machine intelligence, 32(9), 1627–1645.CrossRef

Girshick, R. (2015). Fast R-CNN. In ICCV.

Girshick, R., Donahue, J., Darrell, T., & Malik, J. (2014). Rich feature hierarchies for accurate object detection and semantic segmentation. In CVPR.

Gkioxari, G., Girshick, R., & Malik, J. (2015). Actions and attributes from wholes and parts. In ICCV.

Hariharan, B., Arbeláez, P., Girshick, R., & Malik, J. (2015). Hypercolumns for object segmentation and fine-grained localization. In CVPR.

He, K., Zhang, X., Ren, S., & Sun, J. (2014). Spatial pyramid pooling in deep convolutional networks for visual recognition. In ECCV.

Jia, Y. (2013). Caffe: An open source convolutional architecture for fast feature embedding. http://caffe.berkeleyvision.org/.

Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). Imagenet classification with deep convolutional neural networks. In NIPS.

Lenc, K., & Vedaldi, A. (2015). Understanding image representations by measuring their equivariance and equivalence. In CVPR.

Lin, D., Shen, X., Lu, C., & Jia, J. (2015). Deep lac: Deep localization, alignment and classification for fine-grained recognition. In CVPR.

Liu, J., Li, Y., & Belhumeur, P.N. (2014). Part-pair representation for part localization. In ECCV.

Long, J., Shelhamer, E., & Darrell, T. (2015) Fully convolutional networks for semantic segmentation. In CVPR.

Lowe, D. (2004). Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2), 91–110.CrossRef

Mahendran, A., & Vedaldi, A. (2015). Understanding deep image representations by inverting them. In CVPR.

Mitchell, M. (1998). An introduction to genetic algorithms. Cambridge: MIT Press.MATH

Oquab, M., Bottou, L., Laptev, I., & Sivic, J. (2015). Is object localization for free?-weakly-supervised learning with convolutional neural networks. In CVPR.

Parkhi, O., Vedaldi, A., Jawahar, C. V., & Zisserman, A. (2011). The truth about cats and dogs. In ICCV.

Parkhi, O.M., Vedaldi, A., Zisserman, A., & Jawahar, C. (2012) .Cats and dogs. In CVPR.

Pearson, K. (1895). Note on regression and inheritance in the case of two parents. Proceedings of the Royal Society of London, 58, 240–242.CrossRef

Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., et al. (2015). ImageNet large scale visual recognition challenge. International Journal of Computer Vision, 115(3), 211–252.

Simon, M., & Rodner, E. (2015). Neural activation constellations: Unsupervised part model discovery with convolutional networks. In ICCV.

Simon, M., Rodner, E., & Denzler, J. (2014). Part detector discovery in deep convolutional neural networks. In ACCV.

Simonyan, K., & Zisserman, A. (2015) Very deep convolutional networks for large-scale image recognition. In ICLR.

Simonyan, K., Vedaldi, A., & Zisserman, A. (2014). Deep inside convolutional networks: Visualising image classification models and saliency maps. In ICLR workshop.

Sun, M., & Savarese, S. (2011). Articulated part-based model for joint object detection and pose estimation. In ICCV.

Szegedy, C., Zaremba, W., Sutskever, I., Bruna, J., Erhan, D., Goodfellow, I., & Fergus, R. (2014). Intriguing properties of neural networks. In ICLR.

Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., & Rabinovich, A. (2015). Going deeper with convolutions. In CVPR.

Ukita, N. (2012). Articulated pose estimation with parts connectivity using discriminative local oriented contours. In CVPR.

Vedaldi, A., Mahendran, S., Tsogkas, S., Maji, S., Girshick, R., Kannala, J., et al. (2014). Understanding objects in detail with fine-grained attributes. In CVPR.

Xiao, T., Xu, Y., Yang, K., Zhang, J., Peng, Y., & Zhang, Z. (2015). The application of two-level attention models in deep convolutional neural network for fine-grained image classification. In CVPR.

Yosinski, J., Clune, J., Bengio, Y., & Lipson, H. (2014). How transferable are features in deep neural networks? In NIPS.

Zeiler, M. D., & Fergus, R. (2014). Visualizing and understanding convolutional networks. In ECCV.

Zhang, N., Farrell, R., Iandola, F., & Darrell, T. (2013). Deformable part descriptors for fine-grained recognition and attribute prediction. In ICCV.

Zhang, N., Donahue, J., Girshick, R., & Darrell, T. (2014). Part-based R-CNNS for fine-grained category detection. In ECCV.

Zhou, B., Lapedriza, A., Xiao, J., Torralba, A., & Oliva, A. (2014). Learning deep features for scene recognition using places database. In NIPS.

Zhou, B., Khosla, A., Lapedriza, A., Oliva, A., & Torralba, A. (2015). Object detectors emerge in deep scene cnns. In ICLR.

Titel: Do Semantic Parts Emerge in Convolutional Neural Networks?
verfasst von: Abel Gonzalez-Garcia
Davide Modolo
Vittorio Ferrari
Publikationsdatum: 17.10.2017
Verlag: Springer US
Erschienen in: International Journal of Computer Vision / Ausgabe 5/2018
Print ISSN: 0920-5691
Elektronische ISSN: 1573-1405
DOI: https://doi.org/10.1007/s11263-017-1048-0

Springer Professional

Abstract

Bitte loggen Sie sich ein, um Zugang zu Ihrer Lizenz zu erhalten.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Springer Professional "Wirtschaft"

Springer Professional "Technik"

Weitere Artikel der Ausgabe 5/2018

No-Reference Image Quality Assessment for Image Auto-Denoising

Appreciation to IJCV Reviewers of 2017

Visual Tracking via Subspace Learning: A Discriminative Approach

From Facial Expression Recognition to Interpersonal Relation Prediction

Classification of Multi-class Daily Human Motion using Discriminative Body Parts and Sentence Descriptions

Dense Reconstruction of Transparent Objects by Altering Incident Light Paths Through Refraction

Premium Partner