2021 | OriginalPaper | Chapter

Adaptive Tensor-Train Decomposition for Neural Network Compression

Authors: Yanwei Zheng, Yang Zhou, Zengrui Zhao, Dongxiao Yu

Published in: Parallel and Distributed Computing, Applications and Technologies

Publisher: Springer International Publishing

Abstract

Directly deploying complex deep neural networks on mobile devices with limited computing power and battery life is difficult and costly. This paper addresses the problem by improving model compactness and computational efficiency. Building on MobileNet, a mainstream lightweight neural network, we propose an Adaptive Tensor-Train Decomposition (ATTD) algorithm that removes the cumbersome search for optimal decomposition ranks. Because tensor-train decomposition yields little forward-pass acceleration on GPUs, our strategy uses lower decomposition dimensions and moderate decomposition ranks, selected via dynamic programming, which effectively reduces both the number of parameters and the amount of computation. We also construct a real-time target detection network for mobile devices. Extensive experimental results show that the proposed method greatly reduces the number of parameters and the amount of computation, improving the model's inference speed on mobile devices.
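The central operation behind this kind of compression is representing a large weight matrix in tensor-train (TT) format with capped ranks. The following Python sketch is illustrative only, not the authors' ATTD implementation: the function name tt_decompose, the mode factorizations in_modes/out_modes, and the fixed max_rank cap are assumptions standing in for the paper's adaptive, dynamic-programming-based rank selection. It shows how a fully connected layer's weight matrix can be reshaped into a higher-order tensor and factorized core by core with truncated SVDs.

    import numpy as np

    def tt_decompose(weight, in_modes, out_modes, max_rank):
        """Factor a (prod(in_modes) x prod(out_modes)) weight matrix into TT cores.

        in_modes / out_modes: factorizations of the layer's input / output sizes,
        e.g. a 1024 x 1024 layer with in_modes = (4, 8, 8, 4), out_modes = (4, 8, 8, 4).
        max_rank: upper bound on every TT rank (a simple stand-in for the paper's
        adaptive rank selection).
        """
        d = len(in_modes)
        # Reshape the matrix into a 2d-way tensor, interleave input and output
        # modes, and merge each (in_mode, out_mode) pair into a single TT mode.
        tensor = weight.reshape(*in_modes, *out_modes)
        perm = [x for i in range(d) for x in (i, d + i)]
        tensor = tensor.transpose(perm).reshape(
            [m * n for m, n in zip(in_modes, out_modes)])

        cores, rank, rest = [], 1, tensor
        for k in range(d - 1):
            # Unfold, truncate the SVD at the rank cap, and peel off one TT core.
            rest = rest.reshape(rank * in_modes[k] * out_modes[k], -1)
            u, s, vt = np.linalg.svd(rest, full_matrices=False)
            new_rank = min(max_rank, len(s))
            cores.append(
                u[:, :new_rank].reshape(rank, in_modes[k] * out_modes[k], new_rank))
            rest = s[:new_rank, None] * vt[:new_rank]  # carry the remainder forward
            rank = new_rank
        cores.append(rest.reshape(rank, in_modes[-1] * out_modes[-1], 1))
        return cores

Storing the cores instead of the dense matrix replaces prod(m_k * n_k) parameters with roughly sum_k r_{k-1} * m_k * n_k * r_k, which is where the parameter and computation savings described in the abstract come from; in the paper, the decomposition ranks would be chosen adaptively rather than through a single max_rank cap.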

Metadata
Title
Adaptive Tensor-Train Decomposition for Neural Network Compression
Authors
Yanwei Zheng
Yang Zhou
Zengrui Zhao
Dongxiao Yu
Copyright Year
2021
DOI
https://doi.org/10.1007/978-3-030-69244-5_6
