
2021 | OriginalPaper | Chapter

Second-Order Convolutional Neural Network Based on Cholesky Compression Strategy


Abstract

In the past few years, Convolutional Neural Networks (CNNs) have been successfully applied to many computer vision tasks. Most of these networks extract only first-order information from input images. Second-order statistical information refers to the second-order correlation obtained by computing, across channels, the covariance matrix, the Fisher information matrix, or the vector outer product of a group of local features. It has been shown that using second-order information on facial expression datasets can better capture the distortion of facial area features, but it also introduces more parameters and therefore a considerably higher computational cost. In this article we propose a new CNN structure with layers that (i) incorporate first-order information into the covariance matrix; (ii) use eigenvalue vectors to measure the importance of feature channels; (iii) reduce the bilinear dimensionality of the parameter matrix; and (iv) perform Cholesky decomposition on the positive definite matrix to compress the second-order information matrix. By combining first-order and second-order information with the Cholesky compression strategy, the proposed method halves the number of parameters relative to the SPDNet model while achieving better results in facial expression classification tasks than both the corresponding first-order model and the reference second-order model.
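
The covariance and Cholesky steps listed in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the augmented block matrix used here to fold the channel mean into the covariance, the `eps` regularizer, and the function names `augmented_spd` and `cholesky_compress` are all assumptions chosen for illustration.

```python
import numpy as np


def augmented_spd(features, eps=1e-5):
    """Build a symmetric positive definite matrix from a (C, H, W) feature map.

    Assumption: first-order information (the channel mean) is folded into the
    covariance through the block matrix [[S + mu mu^T, mu], [mu^T, 1]].
    """
    c = features.shape[0]
    x = features.reshape(c, -1)              # C x N, one column per spatial position
    mu = x.mean(axis=1, keepdims=True)       # C x 1 channel means (first order)
    centered = x - mu
    cov = centered @ centered.T / x.shape[1]  # C x C covariance (second order)
    m = np.block([[cov + mu @ mu.T, mu],
                  [mu.T, np.ones((1, 1))]])
    # A small ridge keeps the matrix strictly positive definite (assumed value).
    return m + eps * np.eye(c + 1)


def cholesky_compress(m):
    """Return the lower-triangular Cholesky factor of m as a flat vector."""
    lower = np.linalg.cholesky(m)            # m == lower @ lower.T
    rows, cols = np.tril_indices(m.shape[0])
    return lower[rows, cols]                 # d * (d + 1) / 2 entries


# Hypothetical input: 8 channels on a 6 x 6 spatial grid.
rng = np.random.default_rng(0)
feature_map = rng.normal(size=(8, 6, 6))
spd = augmented_spd(feature_map)
descriptor = cholesky_compress(spd)
```

For a d x d matrix the flattened factor holds d(d + 1)/2 values instead of d^2, which is one way a triangular factor can shrink a second-order representation; whether this corresponds to the parameter reduction reported in the chapter is not stated on this page.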


Metadata
Title
Second-Order Convolutional Neural Network Based on Cholesky Compression Strategy
Authors
Yan Li
Jing Zhang
Qiang Hua
Copyright Year
2021
DOI
https://doi.org/10.1007/978-3-030-69244-5_30
