
2021 | OriginalPaper | Book chapter

Second-Order Convolutional Neural Network Based on Cholesky Compression Strategy

Authors: Yan Li, Jing Zhang, Qiang Hua

Published in: Parallel and Distributed Computing, Applications and Technologies

Publisher: Springer International Publishing


Abstract

In the past few years, convolutional neural networks (CNNs) have been successfully applied to many computer vision tasks. Most of these networks can extract only first-order information from input images. Second-order statistical information refers to the second-order correlation obtained by computing the covariance matrix, the Fisher information matrix, or the vector outer product over groups of local features, channel by channel. It has been shown that using second-order information on facial expression datasets better captures the distortion of facial-area features, but it also introduces more parameters, which can substantially increase the computational cost. In this article we propose a new CNN structure with layers that (i) incorporate first-order information into the covariance matrix; (ii) use eigenvalue vectors to measure the importance of feature channels; (iii) reduce the bilinear dimensionality of the parameter matrix; and (iv) perform Cholesky decomposition on the positive definite matrix to compress the second-order information matrix. By combining first-order and second-order information with the Cholesky compression strategy, the proposed method halves the number of parameters relative to the SPDNet model while achieving better results on facial expression classification tasks than both the corresponding first-order model and the reference second-order model.
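The abstract does not spell out the layer definitions, so the following is only a minimal NumPy sketch of the general idea behind steps (i) and (iv), not the authors' implementation. It assumes one common way of folding the channel mean into the covariance (a Gaussian-style block embedding) and a small regularizer `eps` to keep the matrix positive definite; the function name `compress_second_order` and both choices are hypothetical.

```python
import numpy as np


def compress_second_order(features, eps=1e-5):
    """Sketch: covariance pooling plus Cholesky compression.

    features: array of shape (C, H, W), one feature map per channel.
    Returns the lower-triangular entries of the Cholesky factor and the factor.
    """
    c = features.shape[0]
    x = features.reshape(c, -1)             # C x N, one column per spatial location
    n = x.shape[1]

    mu = x.mean(axis=1, keepdims=True)      # first-order statistic, shape (C, 1)
    xc = x - mu
    cov = xc @ xc.T / n                     # second-order statistic, shape (C, C)
    cov += eps * np.eye(c)                  # assumed regularizer: makes cov positive definite

    # Assumed embedding of first-order information into the covariance:
    # [[cov + mu mu^T, mu], [mu^T, 1]] is positive definite whenever cov is.
    top = np.hstack([cov + mu @ mu.T, mu])
    bottom = np.hstack([mu.T, np.ones((1, 1))])
    spd = np.vstack([top, bottom])          # shape (C + 1, C + 1)

    # Cholesky factor: spd = L @ L.T with L lower triangular.
    L = np.linalg.cholesky(spd)

    # Keep only the lower triangle: d * (d + 1) / 2 values instead of d * d.
    rows, cols = np.tril_indices(c + 1)
    return L[rows, cols], L


rng = np.random.default_rng(0)
feats = rng.standard_normal((8, 6, 6))      # toy feature maps: 8 channels, 6 x 6
vec, L = compress_second_order(feats)
print(vec.shape, L.shape)
```

For a d x d matrix the triangular factor has d(d + 1)/2 free entries, roughly half of d². Whether this is exactly how the paper arrives at its halved parameter count is an assumption here.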


Metadata
Title
Second-Order Convolutional Neural Network Based on Cholesky Compression Strategy
Authors
Yan Li
Jing Zhang
Qiang Hua
Copyright year
2021
DOI
https://doi.org/10.1007/978-3-030-69244-5_30