nach oben

Erschienen in:

2021 | OriginalPaper | Buchkapitel

Second-Order Convolutional Neural Network Based on Cholesky Compression Strategy

verfasst von : Yan Li, Jing Zhang, Qiang Hua

Erschienen in: Parallel and Distributed Computing, Applications and Technologies

Verlag: Springer International Publishing

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config

KI-gestützte Suche

Aus

Abstract

In the past few years, Convolution Neural Network (CNN) has been successfully applied to many computer vision tasks. Most of these networks can only extract first-order information from input images. The second-order statistical information refers to the second-order correlation obtained by calculating the covariance matrix, the fisher information matrix, or the vector outer product operation on the local feature group according to the channels. It has been shown that using second-order information on facial expression datasets can better capture the distortion of facial area features, while at the same time generate more parameters which may cause much more computational cost. In this article we propose a new CNN structure including layers which can (i) incorporate first-order information into the covariance matrix; (ii) use eigenvalue vectors to measure the importance of feature channels; (iii) reduce the bilinear dimensionality of the parameter matrix; and (iv) perform Cholesky decomposition on the positive definite matrix to complete the compression of the second-order information matrix. Due to the incorporation of both first-order and second-order information and the Cholesky compression strategy, our proposed method reduces the number of parameters by half of the SPDNet model, and simultaneously achieves better results in facial expression classification tasks than the corresponding first-order model and the reference second-order model.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

über 102.000 Bücher
über 537 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Finance + Banking
Management + Führung
Marketing + Vertrieb
Maschinenbau + Werkstoffe
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 390 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Maschinenbau + Werkstoffe

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 340 Zeitschriften

aus folgenden Fachgebieten:

Bauwesen + Immobilien
Business IT + Informatik
Finance + Banking
Management + Führung
Marketing + Vertrieb
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Vorheriges Kapitel Data Caching Based Transfer Optimization in Large Scale Networks

Nächstes Kapitel Submodular Maximization with Bounded Marginal Values

Krizhevsky, A., Sutskever, I., Hinton, G.: ImageNet classification with deep convolutional neural networks. In: NIPS, pp. 1106–1114 (2012)

He, K., Zhang, X., Ren, S, Sun, J.: Deep residual learning for image recognition. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)

Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. In ICLR, pp. 340–352 (2015)

Lin, T., RoyChowdhury, A., Maji, S.: Bilinear CNN models for fine-grained visual recognition. In: ICCV, pp. 1449–1457 (2015)

Ionescu, C.,Vantzos, O., Sminchisescu, C.: Matrix backpropagation for deep networks with structured layers. In: IEEE International Conference on Computer Vision, pp. 990–1002 (2015)

Li, P., Xie, J., Wang, Q., Zuo, W.: Is second-order information helpful for large-scale visual recognition? In: ICCV, pp. 1205–1213 (2017)

Tuzel, O., Porikli, F., Meer, P.: Region Covariance: A Fast Descriptor for Detection and Classification. In: Leonardis, A., Bischof, H., Pinz, A. (eds.) ECCV 2006. LNCS, vol. 3952, pp. 589–600. Springer, Heidelberg (2006). https://doi.org/10.1007/11744047_45CrossRef

Gao, Y., Beijbom, O., Zhang, N., Darrell, T.: Compact bilinear pooling. In: International Conference on Computer Vision and Pattern Recognition, pp. 317–326 (2016)

Kong, S., Fowlkes, C.: Low-rank bilinear pooling for fine-grained classification. In: Conference on Computer Vision and Pattern Recognition, pp. 880–890 (2017)

10.

Krizhevsky, A., Sutskever, I., Hinton, G.: ImageNet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 25(2) (2012)

11.

He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)

12.

Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. In: ICLR, pp. 553–572 (2015)

13.

Duchi, J., Hazan, E., Singer, Y.: Adaptive Subgradient Methods for Online Learning and Stochastic Optimization. J. Mach. Learn. Res. 12(7), 257–269 (2011)MathSciNetMATH

14.

Kingma, D., Ba, J.: Adam: A method for stochastic optimization. Computer Science, pp. 1135–1142 (2015)

15.

Srivastava, N., Hinton, G., Krizhevsky, A., Salakhutdinov, R.: Dropout - a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15(1), 1929–1958 (2014)MathSciNetMATH

16.

Zeiler, M.: ADADELTA: An Adaptive Learning Rate Method. arXiv.org (2012)

17.

Tuzel, O., Porikli, F., Meer, P.: Pedestrian detection via classification on riemannian manifolds. In: IEEE TPAMI, pp. 1980–1991 (2008)

18.

Pennec. X., Fillard, P., Ayache, N.: A riemannian framework for tensor computing. In: IJCV, pp. 990–1112 (2006)

19.

Ha, M., San-Biagio, M., Murino, V.: Log-Hilbert-Schmidt metric between positive definite operators on Hilbert spaces. In NIPS, pp. 1124–1134 (2014)

20.

Sra, S.: A new metric on the manifold of kernel matrices with application to matrix geometric means. In: NIPS, pp. 2010–2023 (2012)

21.

Xu, X., Mu, N., Zhang, X.: Covariance descriptor based convolution neural network for saliency computation in low contrast images. In: International Joint Conference on Neural Networks, pp. 1220–1229 (2016)

22.

Yu, K., Salzmann, M.: Second-order convolutional neural networks. In: Computer Vision and Pattern Recognition, pp. 1305–1316 (2017)

23.

Yu, K., Salzmann, M.: Statistically-Motivated Second-Order Pooling. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11211, pp. 621–637. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01234-2_37CrossRef

24.

Huang, Z., Van Gool, L.: A Riemannian network for SPD matrix learning. In: Internaltional Conference on Computer Vision and Pattern Recognition, pp. 2036–2042 ( 2017)

25.

Acharya, D., Huang, Z., Paudel, D.: Covariance pooling for facial expression recognition. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, pp. 480–487 (2018)

26.

Dai, T., Cai, J., Zhang, Y., Xia, S., Zhang, L.: Second-order attention network for single image super-resolution. In: International Conference on Computer Vision and Pattern Recogintion, pp. 1123–1135 (2019)

27.

Dhall, A., et al.: Collecting large, richly annotated facial expression databases from movies. IEEE Multimedia 19(3), 34–41 (2012)CrossRef

28.

Dhall, A., et al.: Emotion recognition in the wild challenge 2014: Baseline, data and protocol. In: ACM ICMI (2014)

29.

Li, S., Deng, W, Du, J.: Reliable crowdsourcing and deep locality-preserving learning for expression recognition in the wild. In: The IEEE Conference on Computer Vision and Pattern Recognition, pp. 89–96 (2017)

30.

Zhu, X., Ramanan, D.: Face detection, pose estimation and landmark estimation in the wild. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431–3445 (2012)

31.

Benitez-Quiroz, C.F., Srinivasan, R., Martinez, A.M.: Emotionet: an accurate, real-time algorithm for the automatic annotation of a million facial expressions in the wild. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 5562–5570 (2016)

32.

Goodfellow, I.J.: Challenges in representation learning. Neural Netw. 64(C), 59–63 (2015)

Titel: Second-Order Convolutional Neural Network Based on Cholesky Compression Strategy
verfasst von: Yan Li
Jing Zhang
Qiang Hua
Verlag: Springer International Publishing
Buch: Parallel and Distributed Computing, Applications and Technologies
Print ISBN: 978-3-030-69243-8

Electronic ISBN: 978-3-030-69244-5

Copyright-Jahr: 2021
DOI: https://doi.org/10.1007/978-3-030-69244-5_30

Springer Professional

Abstract

Bitte loggen Sie sich ein, um Zugang zu Ihrer Lizenz zu erhalten.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Springer Professional "Technik"

Springer Professional "Wirtschaft"