Published in: The Journal of Supercomputing 7/2018

16.04.2018

Balancing the learning ability and memory demand of a perceptron-based dynamically trainable neural network

Authors: Edward Richter, Spencer Valancius, Josiah McClanahan, John Mixter, Ali Akoglu


Abstract

Artificial neural networks (ANNs) have become a popular means of solving complex problems in prediction-based applications such as image and natural language processing. Two prominent challenges in the neural network domain are the practicality of hardware implementation and training the network dynamically. In this study, we address these challenges with a development methodology that balances the hardware footprint and the quality of the ANN. We use the well-known perceptron-based branch prediction problem as a case study for demonstrating this methodology. This problem is well suited for analyzing dynamic hardware implementations of ANNs because it is both implemented in hardware and trained dynamically. Using our hierarchical configuration search space exploration, we show that we can decrease the memory footprint of a standard perceptron-based branch predictor by 2.3× with only a 0.6% decrease in prediction accuracy.
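For context, the case study builds on the standard perceptron predictor formulation: each branch address indexes a table of signed weight vectors, the prediction is the sign of the dot product between the weights and the global branch history, and the weights are updated when the prediction is wrong or the output magnitude falls below a training threshold. The C sketch below illustrates that scheme only; the table size, history length, weight width, and threshold shown are illustrative assumptions, not the configurations explored in the paper.

```c
/* Minimal sketch of a perceptron branch predictor (standard formulation).
 * All sizes below are illustrative, not the paper's evaluated configurations. */
#include <stdint.h>
#include <stdlib.h>

#define HIST_LEN        16          /* global history length (assumed) */
#define NUM_PERCEPTRONS 256         /* perceptron table entries (assumed) */
#define THETA ((int)(1.93 * HIST_LEN + 14))   /* common training threshold */

static int8_t weights[NUM_PERCEPTRONS][HIST_LEN + 1]; /* +1 for the bias weight */
static int8_t history[HIST_LEN];                      /* +1 = taken, -1 = not taken */

static int8_t sat_add(int8_t w, int d)  /* keep weights inside the int8 range */
{
    int v = w + d;
    if (v > 127)  v = 127;
    if (v < -128) v = -128;
    return (int8_t)v;
}

/* Predict: sign of the dot product of the weight vector and the history. */
static int predict(uint32_t pc, int *y_out)
{
    int idx = pc % NUM_PERCEPTRONS;
    int y = weights[idx][0];                      /* bias term */
    for (int i = 0; i < HIST_LEN; i++)
        y += weights[idx][i + 1] * history[i];
    *y_out = y;
    return y >= 0;                                /* 1 = predict taken */
}

/* Train on a misprediction or when |y| is below the threshold, then
 * shift the actual outcome into the global history register. */
static void train(uint32_t pc, int y, int taken)
{
    int idx = pc % NUM_PERCEPTRONS;
    int t = taken ? 1 : -1;
    if ((y >= 0) != taken || abs(y) <= THETA) {
        weights[idx][0] = sat_add(weights[idx][0], t);
        for (int i = 0; i < HIST_LEN; i++)
            weights[idx][i + 1] = sat_add(weights[idx][i + 1], t * history[i]);
    }
    for (int i = HIST_LEN - 1; i > 0; i--)
        history[i] = history[i - 1];
    history[0] = (int8_t)t;
}
```

The memory footprint of such a predictor grows with the product of the table size, the history length plus one, and the weight bit width, which is the kind of trade-off the paper's hierarchical configuration search balances against prediction accuracy.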


Metadata
Title
Balancing the learning ability and memory demand of a perceptron-based dynamically trainable neural network
Authors
Edward Richter
Spencer Valancius
Josiah McClanahan
John Mixter
Ali Akoglu
Publication date
16.04.2018
Publisher
Springer US
Published in
The Journal of Supercomputing / Issue 7/2018
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI
https://doi.org/10.1007/s11227-018-2374-x
