Abstract
The trend in microprocessors for advanced embedded systems is toward massively parallel reconfigurable architectures: heterogeneous ensembles of hundreds of processing elements communicating over a reconfigurable interconnection network. However, programming such platforms requires mastering low-level microarchitectural details, which is cumbersome and limits their adoption in many applications. There is therefore a need for an approach that produces high-performance, scalable implementations that harness the computational resources of emerging reconfigurable platforms.
This article addresses the challenge of making these diverse reconfigurable platforms accessible by proposing the use of a high-level language, occam-pi, and developing a complete design flow for building, compiling, and generating machine code for heterogeneous coarse-grained hardware. We have evaluated the approach by implementing complex industrial case studies and three common signal processing algorithms. The results suggest that the occam-pi-based approach, because of its well-defined semantics for expressing concurrency and reconfigurability, simplifies the development of applications that employ runtime-reconfigurable devices. The associated compiler framework ensures portability as well as performance benefits across heterogeneous platforms.
Index Terms
- A Retargetable Compilation Framework for Heterogeneous Reconfigurable Computing