2017 | Original Paper | Book Chapter

10. An Approximation Workflow for Exploiting Data-Level Parallelism in FPGA Acceleration

Authors: Abbas Rahimi, Luca Benini, Rajesh K. Gupta

Published in: From Variability Tolerance to Approximate Computing in Parallel Integrated Architectures and Accelerators

Publisher: Springer International Publishing

Abstract

Modern applications, including graphics, multimedia, web search, and data analytics, not only benefit from acceleration but also tolerate a significant degree of imprecise computation. This amenability to approximation provides an opportunity to trade result quality for higher performance and better resource utilization. Exploiting this opportunity is particularly important for FPGA accelerators, which are inherently subject to many resource constraints. To better utilize FPGA resources, we devise Grater, an automated design workflow for FPGA accelerators that leverages imprecise computation to increase data-level parallelism and achieve higher computational throughput. The core of our workflow is a source-to-source compiler that takes an input kernel and applies a novel optimization technique that selectively reduces the precision of the kernel's data and operations. Reducing precision decreases the area required to synthesize a kernel on the FPGA, which allows a larger number of operations and parallel kernels to be integrated in the FPGA's fixed area. The larger number of integrated kernels provides more hardware contexts to better exploit data-level parallelism in the target applications. To explore the design space of approximate kernels effectively, we use a genetic algorithm to find a subset of safe-to-approximate operations and data elements and then tune their precision levels until the desired output quality is achieved. Grater is a fully software technique and does not require any changes to the underlying FPGA hardware. We evaluate Grater on a diverse set of data-intensive OpenCL benchmarks from the AMD SDK. Synthesis results on a modern Altera FPGA show that our approximation workflow yields 1.4×–3.0× higher throughput with less than 1% quality loss.
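The search described in the abstract can be illustrated with a toy genetic algorithm. This is a minimal sketch under stated assumptions, not Grater's implementation: the variable names, the candidate types, the per-type area costs, the additive error model, and the GA parameters are all hypothetical and chosen only for illustration. A real flow would measure quality by running the approximate kernel and measure area by synthesis.

```python
import random

# Hypothetical precision levels with made-up relative area costs and
# made-up per-variable contributions to output quality loss.
TYPES = ["double", "float", "half"]
AREA = {"double": 4.0, "float": 2.0, "half": 1.0}
ERROR = {"double": 0.0, "float": 0.001, "half": 0.004}

VARIABLES = ["a", "b", "c", "acc"]  # hypothetical kernel variables
PINNED = {"acc"}                    # annotated as non-approximable
QUALITY_LOSS_LIMIT = 0.01           # 1% target, as in the abstract


def random_genome(rng):
    # One type per variable; pinned variables always keep full precision.
    return [("double" if v in PINNED else rng.choice(TYPES)) for v in VARIABLES]


def quality_loss(genome):
    # Toy model: per-variable errors simply add up.
    return sum(ERROR[t] for t in genome)


def area(genome):
    return sum(AREA[t] for t in genome)


def fitness(genome):
    # Lower is better; genomes that miss the quality target are penalized.
    penalty = 1000.0 if quality_loss(genome) > QUALITY_LOSS_LIMIT else 0.0
    return area(genome) + penalty


def mutate(genome, rng):
    child = list(genome)
    i = rng.randrange(len(child))
    if VARIABLES[i] not in PINNED:
        child[i] = rng.choice(TYPES)
    return child


def crossover(p1, p2, rng):
    cut = rng.randrange(1, len(p1))
    return p1[:cut] + p2[cut:]


def search(generations=50, pop_size=20, seed=0):
    rng = random.Random(seed)
    pop = [random_genome(rng) for _ in range(pop_size)]
    # Include the exact baseline so at least one genome meets the target.
    pop[0] = ["double"] * len(VARIABLES)
    for _ in range(generations):
        pop.sort(key=fitness)
        elite = pop[: pop_size // 2]  # keep the better half unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            p1, p2 = rng.sample(elite, 2)
            children.append(mutate(crossover(p1, p2, rng), rng))
        pop = elite + children
    return min(pop, key=fitness)


best = search()
print(dict(zip(VARIABLES, best)), "area:", area(best), "loss:", quality_loss(best))
```

Because the better half of each generation is carried over unchanged, the returned assignment never violates the quality target in this toy model, and the pinned variable always keeps its original type.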

Replication is handled in Altera OpenCL by setting num_compute_units as a kernel attribute.
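As an illustration of that attribute, the following Python sketch prepends it to an OpenCL kernel source string, in the spirit of a source-to-source transformation. The helper name `replicate_kernel` and the sample `vec_add` kernel are hypothetical; the spelling `__attribute__((num_compute_units(N)))` is the one used by the Altera SDK for OpenCL. How Grater itself chooses the replication factor is not shown here.

```python
import re

# Hypothetical sample kernel, used only to demonstrate the transformation.
KERNEL_SRC = """\
__kernel void vec_add(__global const float *a,
                      __global const float *b,
                      __global float *out)
{
    size_t i = get_global_id(0);
    out[i] = a[i] + b[i];
}
"""


def replicate_kernel(source: str, units: int) -> str:
    """Prefix every __kernel definition with a num_compute_units attribute."""
    if units < 1:
        raise ValueError("units must be >= 1")
    attribute = f"__attribute__((num_compute_units({units})))\n"
    # Insert the attribute on its own line before each line starting a kernel.
    return re.sub(r"(?m)^(?=__kernel\b)", attribute, source)


print(replicate_kernel(KERNEL_SRC, 4))
```

The rewritten source remains standard OpenCL text that can be handed to the vendor compiler unchanged.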

We limit the optimization search space to the variable types available in OpenCL, rather than searching within a type itself [10], because a source-to-source transformer must work at the same level of abstraction as the input programming language. By generating standard OpenCL approximate kernels, Grater lets the Altera OpenCL synthesis tool chain benefit from this source-to-source translation.
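To make the type-level granularity concrete, the sketch below rounds a double-precision value to the precision of the `float` and `half` types and reports the relative error of each step. It is an illustration only: the helper `round_to` and the sample value are hypothetical, the set of types is an assumption (the types Grater actually considers are not listed here), and Python's `struct` module emulates IEEE 754 single and half precision instead of running an OpenCL kernel.

```python
import struct

# struct format codes for IEEE 754 binary64, binary32, and binary16.
FORMATS = {"double": "d", "float": "f", "half": "e"}


def round_to(value: float, type_name: str) -> float:
    """Round a Python float to the nearest value representable in type_name."""
    code = FORMATS[type_name]
    return struct.unpack(code, struct.pack(code, value))[0]


def relative_error(exact: float, approx: float) -> float:
    return abs(approx - exact) / abs(exact)


exact = 0.1 * 3.7  # hypothetical intermediate value inside a kernel
for name in ("double", "float", "half"):
    approx = round_to(exact, name)
    print(f"{name:6s} {approx!r:24} rel. error = {relative_error(exact, approx):.2e}")
```

Each step down in type is a coarse jump in precision, which is why a type-level search has far fewer candidate points than a search over individual bit widths.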

Grater also lets the programmer annotate critical variables as non-approximable, so that the transcompiler does not change their precision.

Note that the accelerated profiling process on the GPU takes on the order of milliseconds to determine whether a kernel meets the quality-of-result target, whereas synthesizing the approximate OpenCL kernels on the Stratix V FPGA takes more than an hour on average.

1. A. Yazdanbakhsh, J. Park, H. Sharma, P. Lotfi-Kamran, H. Esmaeilzadeh, Neural acceleration for GPU throughput processors, in Proceedings of the 48th International Symposium on Microarchitecture, MICRO-48 (ACM, New York, NY, USA, 2015), pp. 482–493

2. A. Yazdanbakhsh, D. Mahajan, B. Thwaites, J. Park, A. Nagendrakumar, S. Sethuraman, K. Ramkrishnan, N. Ravindran, R. Jariwala, A. Rahimi, H. Esmaeilzadeh, K. Bazargan, Axilog: language support for approximate hardware design, in 2015 Design, Automation & Test in Europe Conference & Exhibition (DATE) (2015), pp. 812–817

3. T. Moreau, M. Wyse, J. Nelson, A. Sampson, H. Esmaeilzadeh, L. Ceze, M. Oskin, SNNAP: approximate computing on programmable SoCs via neural acceleration, in 2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA) (2015), pp. 603–614

4. A.B. Kahng, S. Kang, Accuracy-configurable adder for approximate arithmetic designs, in 2012 49th ACM/EDAC/IEEE Design Automation Conference (DAC) (2012), pp. 820–825

5. P. Kulkarni, P. Gupta, M. Ercegovac, Trading accuracy for power with an underdesigned multiplier architecture, in 2011 24th International Conference on VLSI Design (VLSI Design) (2011), pp. 346–351

6. Altera SDK for OpenCL. http://www.altera.com/products/software/opencl/opencl-index.html

7. SDAccel. http://www.xilinx.com/products/design-tools/sdx/sdaccel.html (2015)

8. AMD APP SDK v2.9. http://developer.amd.com/tools-and-sdks/opencl-zone/amd-accelerated-parallel-processing-app-sdk/

9. D. Chen, D. Singh, Invited paper: using OpenCL to evaluate the efficiency of CPUs, GPUs and FPGAs for information filtering, in 2012 22nd International Conference on Field Programmable Logic and Applications (FPL) (2012), pp. 5–12

10. E. Schkufza, R. Sharma, A. Aiken, Stochastic optimization of floating-point programs with tunable precision, in Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI’14 (ACM, New York, NY, USA, 2014), pp. 53–64

11. A.E. Eiben, J.E. Smith, Introduction to Evolutionary Computing, 2nd edn., Natural Computing Series (Springer, Heidelberg, 2007)

12. GRATER transcompiler. https://bitbucket.org/act-lab/grater/src

13. S. Misailovic, M. Carbin, S. Achour, Z. Qi, M.C. Rinard, Chisel: reliability- and accuracy-aware optimization of approximate computational kernels, in Proceedings of the 2014 ACM International Conference on Object Oriented Programming Systems Languages & Applications, OOPSLA’14 (ACM, New York, NY, USA, 2014), pp. 309–328

14. P. Roy, R. Ray, C. Wang, W.F. Wong, ASAC: automatic sensitivity analysis for approximate computing, in Proceedings of the 2014 SIGPLAN/SIGBED Conference on Languages, Compilers and Tools for Embedded Systems, LCTES’14 (ACM, New York, NY, USA, 2014), pp. 95–104

Title: An Approximation Workflow for Exploiting Data-Level Parallelism in FPGA Acceleration
Authors: Abbas Rahimi, Luca Benini, Rajesh K. Gupta
Publisher: Springer International Publishing
Book: From Variability Tolerance to Approximate Computing in Parallel Integrated Architectures and Accelerators
Print ISBN: 978-3-319-53767-2
Electronic ISBN: 978-3-319-53768-9
Copyright year: 2017
DOI: https://doi.org/10.1007/978-3-319-53768-9_10
