CPU/FPGA hybrid systems have emerged as a viable means of achieving high performance in embedded applications and computing. High-Level Synthesis (HLS) tools enable software designers and programmers to use the underlying hardware of a hybrid system without requiring deep insight into hardware design. By default, HLS tools execute the program sequentially. They do, however, provide mechanisms for parallelizing the code: the programmer can apply constructs such as loop unrolling, loop flattening, and pipelining in the form of pragmas. Even with these constructs in place, programmers also need to understand the memory access patterns used in the program in order to exploit the capabilities of the CPU/FPGA hybrid system efficiently. Memory access patterns in array references play a major role in determining the latency and area required for a specific computation. Four typical memory access patterns on arrays were exercised in Vivado HLS with growing input sizes, using C code as the input. The study observed that changing the memory access pattern leads to different area and timing requirements, and that changing the coding style may improve the results produced by HLS tools.
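To make the distinction between access patterns concrete, the following is a minimal sketch, not code from the study. It assumes a hypothetical array length `N` and two hypothetical kernels, `scale_elementwise` and `sum_neighbourhood`, that differ only in how they index the input array. The abstract does not name the four patterns that were evaluated, so these two are illustrative choices only. The `#pragma HLS PIPELINE` and `#pragma HLS UNROLL` directives are the Vivado HLS forms of the constructs mentioned above; an ordinary C compiler ignores them, so the code also runs as plain software.

```c
#define N 8 /* hypothetical array length, chosen for illustration */

/* Element-wise access: each output reads exactly one input element.
 * One read and one write per iteration, so the loop is a natural
 * candidate for pipelining. */
void scale_elementwise(const int in[N], int out[N], int k)
{
    int i;
    for (i = 0; i < N; i++) {
#pragma HLS PIPELINE II=1
        out[i] = k * in[i];
    }
}

/* Neighbourhood access: each interior output reads three adjacent
 * input elements. The extra reads per iteration compete for the
 * memory ports of the array, which can change the latency and area
 * the tool reports for the same loop structure. */
void sum_neighbourhood(const int in[N], int out[N])
{
    int i;
    /* Boundary elements are copied unchanged (an assumption made
     * here to keep the sketch short). */
    out[0] = in[0];
    out[N - 1] = in[N - 1];
    for (i = 1; i < N - 1; i++) {
#pragma HLS UNROLL factor=2
        out[i] = in[i - 1] + in[i] + in[i + 1];
    }
}
```

Synthesizing both functions with the same pragmas and comparing the reported latency and resource usage is one way to observe how the access pattern alone affects the outcome. The actual figures depend on the tool version, target device, and array size, so none are given here.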
- A Study of Memory Access Patterns as an Indicator of Performance in High-Level Synthesis
T. S. B. Sudarshan
- Springer Singapore