2018 | Original Paper | Book Chapter

DeepKSPD: Learning Kernel-Matrix-Based SPD Representation For Fine-Grained Image Recognition

Authors: Melih Engin, Lei Wang, Luping Zhou, Xinwang Liu

Published in: Computer Vision – ECCV 2018

Publisher: Springer International Publishing

Abstract

As a second-order pooled representation, the covariance matrix has attracted much attention in visual recognition, and some pioneering works have recently integrated it into deep learning. A recent study shows that a kernel matrix works considerably better than a covariance matrix for this kind of representation, because it models the higher-order, nonlinear relationships among pooled visual descriptors. In that study, however, neither the descriptors nor the kernel matrix is deeply learned. Worse, the two are treated separately, which hinders the pursuit of an optimal representation. To improve this situation, this work designs a deep network that jointly learns local descriptors and the kernel-matrix-based pooled representation in an end-to-end manner. The derivatives of the mapping from a local descriptor set to this representation are derived to carry out backpropagation. More importantly, we introduce the Daleckiĭ-Kreĭn formula from operator theory to give a concise and unified result on differentiating general functions defined on symmetric positive-definite (SPD) matrices, which shows better numerical stability in backpropagation than the existing method when handling the Riemannian geometry of SPD matrices. Experiments on fine-grained image benchmark datasets not only show the superiority of the kernel-matrix-based SPD representation with deep local descriptors, but also verify the advantage of the proposed deep network in pursuing better SPD representations. An ablation study is also provided to explain why and from where these improvements are attained.
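To make the role of the Daleckiĭ-Kreĭn formula more concrete, the following is a minimal NumPy sketch of it and not the authors' implementation. It assumes a symmetric matrix A with eigendecomposition A = U diag(λ) Uᵀ and a scalar function f applied to the eigenvalues. The derivative of f(A) in a symmetric direction H is then U (G ∘ (Uᵀ H U)) Uᵀ, where G holds the divided differences (f(λᵢ) − f(λⱼ)) / (λᵢ − λⱼ), with f′(λᵢ) used where eigenvalues coincide. The function name `spectral_function_with_derivative`, the helper `logm_spd`, the tolerance `tol`, and the random test data are all hypothetical choices made for illustration.

```python
import numpy as np

def spectral_function_with_derivative(A, H, f, f_prime, tol=1e-10):
    """Return f(A) and the directional derivative Df(A)[H] for symmetric A, H.

    Illustrative sketch of the Daleckii-Krein formula; names and the
    tolerance are assumptions, not taken from the paper.
    """
    lam, U = np.linalg.eigh(A)                 # A = U diag(lam) U^T
    f_lam = f(lam)
    lam_diff = lam[:, None] - lam[None, :]     # lam_i - lam_j
    f_diff = f_lam[:, None] - f_lam[None, :]   # f(lam_i) - f(lam_j)
    # Where two eigenvalues (nearly) coincide, the divided difference is
    # replaced by its limit f'(lam_i); this also fills the diagonal.
    close = np.abs(lam_diff) < tol
    limit = np.broadcast_to(f_prime(lam)[:, None], lam_diff.shape)
    G = np.where(close, limit, f_diff / np.where(close, 1.0, lam_diff))
    F = (U * f_lam) @ U.T                      # f(A) = U diag(f(lam)) U^T
    dF = U @ (G * (U.T @ H @ U)) @ U.T         # Df(A)[H]
    return F, dF

def logm_spd(M):
    """Matrix logarithm of an SPD matrix via eigendecomposition."""
    lam, U = np.linalg.eigh(M)
    return (U * np.log(lam)) @ U.T

# Hypothetical test data: a random SPD matrix and a symmetric direction.
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 20))
A = X @ X.T / 20 + 1e-3 * np.eye(5)
H = rng.standard_normal((5, 5))
H = (H + H.T) / 2

# Derivative of the matrix logarithm, as used with log-Euclidean geometry.
log_A, dlog = spectral_function_with_derivative(A, H, np.log, lambda x: 1.0 / x)

# Central finite differences as an independent check.
eps = 1e-6
fd = (logm_spd(A + eps * H) - logm_spd(A - eps * H)) / (2 * eps)
print("max abs deviation from finite differences:", np.abs(dlog - fd).max())
```

Because the map H ↦ U (G ∘ (Uᵀ H U)) Uᵀ is self-adjoint for symmetric inputs, the same expression can serve in backpropagation by passing the upstream gradient with respect to f(A) in place of H. The same routine covers other spectral functions, such as a matrix power, by swapping `f` and `f_prime`.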


Metadata
Title
DeepKSPD: Learning Kernel-Matrix-Based SPD Representation For Fine-Grained Image Recognition
Authors
Melih Engin
Lei Wang
Luping Zhou
Xinwang Liu
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-030-01216-8_38