2018 | Original Paper | Book Chapter

Mathematical Optimizations for Deep Learning

Authors: Sam Green, Craig M. Vineyard, Çetin Kaya Koç

Published in: Cyber-Physical Systems Security

Publisher: Springer International Publishing

Abstract

Deep neural networks are often computationally expensive, during both the training and inference stages. Training is always expensive because back-propagation requires high-precision floating-point multiplication and addition. However, various mathematical optimizations may be employed to reduce the computational cost of inference. Optimized inference is important for reducing power consumption and latency and for increasing throughput. This chapter introduces the central approaches for optimizing deep neural network inference: pruning “unnecessary” weights, quantizing weights and inputs, sharing weights between layer units, compressing weights before transferring them from main memory, distilling large high-performance models into smaller models, and decomposing convolutional filters to reduce multiply-and-accumulate operations. Using a unified notation, we provide a mathematical and algorithmic description of each of these inference optimization methods.
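
As a concrete taste of the first of these approaches, the following is a minimal sketch of magnitude-based weight pruning. The example is illustrative, not taken from the chapter; the function name prune_by_magnitude and the NumPy setup are our assumptions.

import numpy as np

def prune_by_magnitude(weights, sparsity=0.5):
    # Zero out the fraction `sparsity` of weights with the smallest
    # magnitudes; the surviving weights are left untouched.
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    threshold = np.partition(flat, k - 1)[k - 1]  # k-th smallest magnitude
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0  # ties may remove slightly more than k
    return pruned

# Example: prune half of the weights of a random 4x4 layer.
rng = np.random.default_rng(0)
w = rng.standard_normal((4, 4))
print(np.count_nonzero(prune_by_magnitude(w)), "of", w.size, "weights remain")  # 8 of 16

In practice, pruning is typically followed by fine-tuning to recover accuracy, and the resulting sparse weights can be stored in a compressed format [3].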


Footnotes
1
Also called input feature maps (ifmaps) and output feature maps (ofmaps) in the literature.
 
2
Note that we consider \(\mathcal {W}\) and \(\mathcal {I}\) to be flattened.
 
3
Multiplication by α is still necessary when using the weight binarization technique in XNOR-Net.
 
4
Not to be confused with XNOR-Net [2]. Here we are referring to the exclusive-NOR operation.
 
5
Hamming weight is defined as the number of 1s in a vector.
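
As an illustration of how footnotes 4 and 5 combine in binarized networks [9]: if weight and input vectors over {-1, +1} are packed into integers with bit 1 encoding +1, their dot product reduces to an XNOR followed by a population count. A minimal sketch, assuming this packing convention (the function binary_dot is illustrative):

def binary_dot(w_bits, x_bits, n):
    # Dot product of two {-1, +1} vectors packed as n-bit integers,
    # where bit 1 encodes +1 and bit 0 encodes -1.
    # XNOR sets a 1 wherever the two vectors agree; the Hamming weight
    # (popcount) of that mask counts agreements, so the dot product is
    # agreements - disagreements = 2 * popcount - n.
    xnor = ~(w_bits ^ x_bits) & ((1 << n) - 1)
    return 2 * bin(xnor).count("1") - n

# w = (+1, -1, +1, +1) -> 0b1011; x = (+1, +1, -1, +1) -> 0b1101
print(binary_dot(0b1011, 0b1101, 4))  # prints 0, matching the float dot product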
 
6
The softmax layer is at the output and has no trainable weights. It can therefore be replaced in the larger network by a softmax with a separate temperature, with no need for retraining.
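
For reference, the temperature softmax used in distillation [11] is

\[ q_i = \frac{\exp(z_i / T)}{\sum_j \exp(z_j / T)}, \]

where \(z_i\) are the logits and \(T\) is the temperature; \(T = 1\) recovers the standard softmax, and larger \(T\) produces the softer target distribution that the smaller model is trained to match.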
 
7
First-order estimates of power costs can be calculated using Table 1.
 
8
Our calculations assume there is no pooling layer after convolution, which is now commonly the case.
 
References
1.
W.J. Dally, High-performance hardware for machine learning, in Conference on Neural Information Processing Systems (2015)
2.
M. Rastegari, V. Ordonez, J. Redmon, A. Farhadi, XNOR-Net: ImageNet classification using binary convolutional neural networks, in European Conference on Computer Vision (2016)
3.
S. Han, H. Mao, W.J. Dally, Deep compression: compressing deep neural networks with pruning, trained quantization and Huffman coding, in International Conference on Learning Representations (2016)
4.
S. Marcel, Y. Rodriguez, Torchvision the machine-vision package of Torch, in International Conference on Multimedia (ACM, New York, 2010), pp. 1485–1488
5.
A. Parashar, M. Rhu, A. Mukkara, A. Puglielli, R. Venkatesan, B. Khailany, J. Emer, S.W. Keckler, W.J. Dally, SCNN: an accelerator for compressed-sparse convolutional neural networks, in International Symposium on Computer Architecture (ACM, New York, 2017), pp. 27–40
6.
N.P. Jouppi, C. Young, N. Patil, D. Patterson, G. Agrawal, R. Bajwa, S. Bates, S. Bhatia, N. Boden, A. Borchers, R. Boyle, P.-l. Cantin, C. Chao, C. Clark, J. Coriell, M. Daley, M. Dau, J. Dean, B. Gelb, T.V. Ghaemmaghami, R. Gottipati, W. Gulland, R. Hagmann, C.R. Ho, D. Hogberg, J. Hu, R. Hundt, D. Hurt, J. Ibarz, A. Jaffey, A. Jaworski, A. Kaplan, H. Khaitan, D. Killebrew, A. Koch, N. Kumar, S. Lacy, J. Laudon, J. Law, D. Le, C. Leary, Z. Liu, K. Lucke, A. Lundin, G. MacKean, A. Maggiore, M. Mahony, K. Miller, R. Nagarajan, R. Narayanaswami, R. Ni, K. Nix, T. Norrie, M. Omernick, N. Penukonda, A. Phelps, J. Ross, M. Ross, A. Salek, E. Samadiani, C. Severn, G. Sizikov, M. Snelham, J. Souter, D. Steinberg, A. Swing, M. Tan, G. Thorson, B. Tian, H. Toma, E. Tuttle, V. Vasudevan, R. Walter, W. Wang, E. Wilcox, D.H. Yoon, In-datacenter performance analysis of a tensor processing unit, in International Symposium on Computer Architecture (ACM, New York, 2017), pp. 1–12
7.
M. Courbariaux, Y. Bengio, J.-P. David, BinaryConnect: training deep neural networks with binary weights during propagations, in Conference on Neural Information Processing Systems (2015)
9.
I. Hubara, M. Courbariaux, D. Soudry, R. El-Yaniv, Y. Bengio, Binarized neural networks, in Conference on Neural Information Processing Systems (2016), pp. 4107–4115
10.
C. Buciluǒ, R. Caruana, A. Niculescu-Mizil, Model compression, in International Conference on Knowledge Discovery and Data Mining (ACM, New York, 2006), pp. 535–541
11.
G. Hinton, O. Vinyals, J. Dean, Distilling the knowledge in a neural network (2015). Preprint. arXiv:1503.02531
12.
A. Krizhevsky, I. Sutskever, G.E. Hinton, ImageNet classification with deep convolutional neural networks, in Advances in Neural Information Processing Systems (2012), pp. 1097–1105
13.
R. Rigamonti, A. Sironi, V. Lepetit, P. Fua, Learning separable filters, in Conference on Computer Vision and Pattern Recognition (IEEE, New York, 2013), pp. 2754–2761
14.
M. Lin, Q. Chen, S. Yan, Network in network, in International Conference on Learning Representations (2014)
15.
C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, A. Rabinovich, Going deeper with convolutions, in Conference on Computer Vision and Pattern Recognition (IEEE, New York, 2015)
16.
D. Strukov, ECE594BB neuromorphic engineering, University of California, Santa Barbara, March 2018
Metadata
Title
Mathematical Optimizations for Deep Learning
Authors
Sam Green
Craig M. Vineyard
Çetin Kaya Koç
Copyright Year
2018
Publisher
Springer International Publishing
DOI
https://doi.org/10.1007/978-3-319-98935-8_4
