2019 | Original Paper | Book Chapter

Towards Efficient Forward Propagation on Resource-Constrained Systems

Authors: Günther Schindler, Matthias Zöhrer, Franz Pernkopf, Holger Fröning

Published in: Machine Learning and Knowledge Discovery in Databases

Publisher: Springer International Publishing

Abstract

In this work we present key elements of DeepChip, a framework that bridges recent trends in machine learning with practical forward propagation on resource-constrained devices. The main objective of this work is to reduce compute and memory requirements by removing redundancy from neural networks. DeepChip features a flexible quantizer that reduces the bit width of activations to 8-bit fixed-point and of weights to an asymmetric ternary representation. In combination with novel algorithms and data compression, we leverage reduced precision and sparsity for efficient forward propagation on a wide range of processor architectures. We validate our approach on a set of different convolutional neural networks and datasets: ConvNet on SVHN, ResNet-44 on CIFAR-10, and AlexNet on ImageNet. Compared to single-precision floating point, memory requirements can be compressed by factors of 43, 22 and 10, respectively, and computations accelerated by factors of 5.2, 2.8 and 2.0 on a mobile processor without a loss in classification accuracy. DeepChip also allows trading accuracy for efficiency: tolerating about a 2% loss in classification accuracy, for instance, further reduces memory requirements by factors of 88, 29 and 13 and speeds up computations by factors of 6.0, 4.3 and 5.0. Code related to this paper is available at: https://github.com/UniHD-CEG/ECML2018.
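
The abstract names two quantization steps (8-bit fixed-point activations, asymmetric ternary weights) plus compressed weight storage. Below is a minimal NumPy sketch of what such a pipeline can look like; the threshold t, the mean-based per-sign scales w_p/w_n, and the 2-bit packing scheme are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def quantize_weights_ternary(w, t=0.05):
    """Threshold-based asymmetric ternary quantization (sketch).

    Maps float weights to codes {-1, 0, +1} with separate scales w_p
    and w_n for the positive and negative entries. The threshold rule
    and mean-based scales are assumptions for illustration only.
    """
    delta = t * np.max(np.abs(w))             # magnitude threshold
    pos, neg = w > delta, w < -delta          # sign masks
    w_p = float(w[pos].mean()) if pos.any() else 0.0   # scale for +1 codes
    w_n = float(-w[neg].mean()) if neg.any() else 0.0  # scale for -1 codes
    q = np.zeros(w.shape, dtype=np.int8)
    q[pos], q[neg] = 1, -1
    return q, w_p, w_n

def quantize_activations_q8(a, frac_bits=4):
    """8-bit fixed-point activation quantization (sketch): round to
    steps of 1/2**frac_bits and saturate to the int8 range."""
    scaled = np.round(a * (1 << frac_bits))
    return np.clip(scaled, -128, 127).astype(np.int8)

def pack_ternary(q):
    """Pack ternary codes into 2 bits each (4 codes per byte), a
    simple stand-in for the paper's compressed weight storage."""
    u = (q.ravel() + 1).astype(np.uint8)      # map {-1,0,1} -> {0,1,2}
    u = np.pad(u, (0, (-len(u)) % 4))         # pad to a multiple of 4
    b = u.reshape(-1, 4)
    return b[:, 0] | (b[:, 1] << 2) | (b[:, 2] << 4) | (b[:, 3] << 6)
```

With weights restricted to {-w_n, 0, +w_p}, a dot product reduces to summing the activations selected by the +1 codes, subtracting those selected by the -1 codes, and applying the two scales. The 2-bit packing alone cuts weight storage 16x versus float32; exploiting sparsity on top of that points toward the larger compression factors the abstract reports.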

Metadata
Title
Towards Efficient Forward Propagation on Resource-Constrained Systems
Authors
Günther Schindler
Matthias Zöhrer
Franz Pernkopf
Holger Fröning
Copyright Year
2019
DOI
https://doi.org/10.1007/978-3-030-10925-7_26
