2018 | Original Paper | Book Chapter

DeepKSPD: Learning Kernel-Matrix-Based SPD Representation For Fine-Grained Image Recognition

Authors: Melih Engin, Lei Wang, Luping Zhou, Xinwang Liu

Published in: Computer Vision – ECCV 2018

Publisher: Springer International Publishing

Abstract

As a second-order pooled representation, the covariance matrix has attracted much attention in visual recognition, and some pioneering works have recently integrated it into deep learning. A recent study shows that a kernel matrix works considerably better than a covariance matrix for this kind of representation, because it models the higher-order, nonlinear relationships among pooled visual descriptors. Nevertheless, in that study neither the descriptors nor the kernel matrix is deeply learned. Worse, they are considered separately, which hinders the pursuit of an optimal representation. To improve this situation, this work designs a deep network that jointly learns local descriptors and a kernel-matrix-based pooled representation in an end-to-end manner. The derivatives of the mapping from a local descriptor set to this representation are derived to carry out backpropagation. More importantly, we introduce the Daleckiĭ-Kreĭn formula from operator theory to give a concise and unified result on differentiating general functions defined on symmetric positive-definite (SPD) matrices. In backpropagation, this result shows better numerical stability than the existing method when handling the Riemannian geometry of SPD matrices. Experiments on fine-grained image benchmark datasets not only show the superiority of the kernel-matrix-based SPD representation with deep local descriptors, but also verify the advantage of the proposed deep network in pursuing better SPD representations. An ablation study is also provided to explain why and where these improvements arise.
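
The abstract names the Daleckiĭ-Kreĭn formula without stating it. As a minimal sketch, assuming the standard form for a symmetric matrix A = U diag(λ) Uᵀ, the directional derivative of f(A) along a symmetric H is U (G ∘ (Uᵀ H U)) Uᵀ, where G_ij = (f(λ_i) − f(λ_j)) / (λ_i − λ_j) for distinct eigenvalues and G_ij = f′(λ_i) otherwise. The function names `spd_function` and `daleckii_krein_derivative` are hypothetical, and the matrix logarithm is chosen here only as an illustrative f; this is not the authors' implementation.

```python
import numpy as np


def spd_function(A, f):
    """Apply a scalar function f to a symmetric matrix A via its eigenvalues."""
    lam, U = np.linalg.eigh(A)
    return (U * f(lam)) @ U.T


def daleckii_krein_derivative(A, H, f, f_prime, tol=1e-10):
    """Directional derivative of f(A) along symmetric H (hypothetical helper)."""
    lam, U = np.linalg.eigh(A)
    diff = lam[:, None] - lam[None, :]
    f_diff = f(lam)[:, None] - f(lam)[None, :]
    # First divided differences; fall back to f'(lam_i) where eigenvalues coincide.
    close = np.abs(diff) < tol
    safe_diff = np.where(close, 1.0, diff)
    G = np.where(close, f_prime(lam)[:, None], f_diff / safe_diff)
    return U @ (G * (U.T @ H @ U)) @ U.T


# Illustrative check against central finite differences (assumed test data).
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 5))
A = X @ X.T + 5.0 * np.eye(5)          # well-conditioned SPD matrix
H = rng.standard_normal((5, 5))
H = (H + H.T) / 2.0                    # symmetric perturbation direction

eps = 1e-6
finite_diff = (spd_function(A + eps * H, np.log)
               - spd_function(A - eps * H, np.log)) / (2.0 * eps)
analytic = daleckii_krein_derivative(A, H, np.log, lambda x: 1.0 / x)

print("max abs difference:", np.max(np.abs(finite_diff - analytic)))
```

The explicit fallback to f′(λ_i) for coincident eigenvalues keeps the divided-difference matrix G finite when eigenvalues coincide, which is one place where numerical care is needed in such derivatives.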

Title: DeepKSPD: Learning Kernel-Matrix-Based SPD Representation For Fine-Grained Image Recognition
Authors: Melih Engin, Lei Wang, Luping Zhou, Xinwang Liu
Publisher: Springer International Publishing
Book: Computer Vision – ECCV 2018
Print ISBN: 978-3-030-01215-1
Electronic ISBN: 978-3-030-01216-8
Copyright year: 2018
DOI: https://doi.org/10.1007/978-3-030-01216-8_38
