ABSTRACT
Reconfigurable architectures promise significant performance and flexibility advantages over conventional architectures. Automatic mapping techniques that exploit the features of the hardware are needed to leverage the power of these architectures. In this paper, we develop techniques for parallelizing nested loop computations from digital signal processing (DSP) applications onto high-performance pipelined configurations. We propose a novel data context switching technique that exploits the embedded distributed memory available in reconfigurable architectures to parallelize such loops. We demonstrate the technique on two diverse state-of-the-art reconfigurable architectures: Virtex and the Chameleon Systems Reconfigurable Communications Processor. Our techniques show significant performance improvements on both architectures and also outperform state-of-the-art DSP and microprocessor architectures.
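The abstract does not spell out the mechanics of data context switching, so the following is only a minimal illustrative sketch, not the authors' implementation. It assumes a nested loop whose outer iterations are independent and whose inner loop carries a dependency through one state value (a first-order recurrence chosen here purely as an example). The names `sequential`, `context_switched`, `context_mem`, and `cycles`, as well as the idealized cycle model, are hypothetical. The idea shown is that keeping one state word per outer iteration in a small local memory lets a pipeline work on a different data context every cycle instead of stalling on the dependency.

```python
def sequential(x, a):
    """Reference order: finish each outer iteration before starting the next."""
    out = []
    for row in x:                  # outer loop: iterations assumed independent
        state = 0.0
        ys = []
        for sample in row:         # inner loop: dependency carried through `state`
            state = a * state + sample
            ys.append(state)
        out.append(ys)
    return out


def context_switched(x, a):
    """Interleaved order: switch to a different outer iteration on every step.

    `context_mem` stands in for embedded distributed memory and holds one
    state word per data context (assumption: all rows have equal length).
    """
    n_ctx = len(x)
    context_mem = [0.0] * n_ctx
    out = [[] for _ in range(n_ctx)]
    for j in range(len(x[0])):     # inner index now advances in the outer position
        for ctx in range(n_ctx):   # one data context per step
            state = a * context_mem[ctx] + x[ctx][j]
            context_mem[ctx] = state
            out[ctx].append(state)
    return out


def cycles(n_outer, n_inner, depth, interleaved):
    """Idealized cycle count for a `depth`-stage pipeline (assumed model)."""
    total_ops = n_outer * n_inner
    if not interleaved:
        # Each operation waits for the previous result to leave the pipeline.
        return total_ops * depth
    if n_outer < depth:
        raise ValueError("this simple model needs at least `depth` contexts")
    # One operation issued per cycle, plus the time to drain the pipeline.
    return total_ops + depth - 1
```

Both functions perform the same arithmetic per context in the same order, so they return identical results; only the schedule differs. Under the assumed cycle model, 4 contexts of 8 samples on a 4-stage pipeline take 128 cycles without interleaving and 35 with it.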
Index Terms
- Parallelizing DSP nested loops on reconfigurable architectures using data context switching