Abstract
The rapid growth of device densities on silicon has made it feasible to deploy reconfigurable hardware as a highly parallel computing platform. However, one obstacle to the wider acceptance of this technology is its programmability. Applications must be written in hardware description languages or an assembly equivalent, whereas most application programmers are used to the algorithmic programming paradigm. SA-C has been proposed as an expression-oriented language designed to implicitly express data-parallel operations. The MorphoSys project proposes an SoC architecture consisting of reconfigurable hardware that supports a data-parallel, SIMD computational model. This paper describes a compiler framework that analyzes SA-C programs, performs optimizations, and automatically maps the application onto the MorphoSys architecture. The mapping process is static and involves operation scheduling, processor allocation and binding, and register allocation in the context of the MorphoSys architecture. The compiler also handles data streaming and caching in order to minimize data transfer overhead. We have compiled some important image-processing kernels, and the generated schedules reflect an average speedup in execution time of up to 6× compared to execution on 800 MHz Pentium III machines.
Index Terms
- Automatic compilation to a coarse-grained reconfigurable system-on-chip