skip to main content
research-article

Graph Minor Approach for Application Mapping on CGRAs

Published:03 September 2014Publication History
Skip Abstract Section

Abstract

Coarse-Grained Reconfigurable Arrays (CGRAs) exhibit high performance, improved flexibility, low cost, and power efficiency for various application domains. Compute-intensive loop kernels, which are perfect candidates to be executed on CGRAs, are usually mapped through modified modulo scheduling algorithms. These algorithms should be capable of performing both placement and routing. We formalize the CGRA mapping problem as a graph minor containment problem. We essentially test whether the dataflow graph representing the loop kernel is a minor of the modulo routing resource graph representing the CGRA resources and their interconnects. We design an exact graph minor testing approach that exploits the unique properties of both the dataflow graph and the routing resource graph to significantly prune the search space. We introduce additional heuristic strategies that drastically improve the compilation time while still generating optimal or near-optimal mapping solutions. Experimental evaluation confirms the efficiency of our approach.

References

  1. Shail Aditya, Vinod Kathail, and B. Ramakrishna Rau. 1998. Elcor's machine description system: Version 3.0. http://www.hpl.hp.com/techreports/98/HPL-98-128.pdf.Google ScholarGoogle Scholar
  2. Isolde Adler, Frederic Dorn, Fedor V. Fomin, Ignasi Sau, and Dimitrios M. Thilikos. 2012. Fast minor testing in planar graphs. Algorithmica 64, 1, 69--84. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. Mythri Alle, Keshavan Varadarajan, Reddy C. Ramesh, Joseph Nimmy, Alexander Fell, Adarsha Rao, S. K. Nandy, and Ranjani Narayan. 2008. Synthesis of application accelerators on runtime reconfigurable hardware. In Proceedings of the International Conference on Application-Specific Systems, Architectures and Processors (ASAP'08). IEEE, 13--18. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. Nikhil Bansal, Sumit Gupta, Nikil Dutt, and Alexandru Nicolau. 2003. Analysis of the performance of coarse-grain reconfigurable architectures with different processing element configurations. In Proceedings of the Workshop on Application Specific Processors, Held in Conjunction with the International Symposium on Microarchitecture (MICRO'03).Google ScholarGoogle Scholar
  5. Janina A. Brenner, Sandor P. Fekete, and Jan C. Van Der Veen. 2009. A minimization version of a directed subgraph homeomorphism problem. Math. Methods Oper. Res. 69, 2, 281--296.Google ScholarGoogle ScholarCross RefCross Ref
  6. Lakshmi N. Chakrapani, John Gyllenhaal, Wen-Mei W. Hwu, Scott A. Mahlke, Krishna V. Palem, and Rodric M. Rabbah. 2005. Trimaran: An infrastructure for research in instruction-level parallelism. In Languages and Compilers for High Performance Computing, Springer, 32--41. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. Liang Chen and Tulika Mitra. 2012. Graph minor approach for application mapping on CGRAs. In Proceedings of the International Conference on Field-Programmable Technology (ICFPT'12). IEEE, 285--292.Google ScholarGoogle ScholarCross RefCross Ref
  8. Nathan Clark, Amir Hormati, Scott Mahlke, and Sami Yehia. 2006. Scalable subgraph mapping for acyclic computation accelerators. In Proceedings of the International Conference on Compilers, Architecture and Synthesis for Embedded Systems (CASES'06). ACM Press, New York, 147--157. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Luigi P. Cordella, Pasquale Foggia, Carlo Sansone, and Mario Vento. 2004. A (sub) graph isomorphism algorithm for matching large graphs. IEEE Trans. Pattern Anal. Mach. Intell. 26, 10, 1367--1372. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Bjorn De Sutter, Paul Coene, Tom Vander Aa, and Bingfeng Mei. 2008. Placement-and-routing-based register allocation for coarse-grained reconfigurable arrays. In Proceedings of the ACM SIGPLAN-SIGBED Conference on Languages, Compilers and Tools for Embedded System (LCTES'08). ACM Press, New York, 151--160. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Giuseppe Di Battista, Maurizio Patrignani, and Francesco Vargiu. 1998. A split&push approach to 3D orthogonal drawing. In Graph Drawing, Springer, 87--101. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Christine Eisenbeis, Sylvain Lelait, and Bruno Marmol. 1995. The meeting graph: A new model for loop cyclic register allocation. In Proceedings of the International Federation for Information Processing Working Group. 264--267. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Steven Fortune, John Hopcroft, and James Wyllie. 1980. The directed subgraph homeomorphism problem. Theor. Comput. Sci. 10, 2, 111--121.Google ScholarGoogle ScholarCross RefCross Ref
  14. Stephen Friedman, Allan Carroll, Brian Van Essen, Benjamin Ylvisaker, Carl Ebeling, and Scott Hauck. 2009. SPR: An architecture-adaptive cgra mapping tool. In Proceedings of the 17th ACM/SIGDA International Symposium on Field Programmable Gate Arrays (FPGA'09). ACM Press, New York, 191--200. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. Rani Gnanaolivu, Theodore S. Norvell, and Ramachandran Venkatesan. 2010. Mapping loops onto coarse-grained reconfigurable architectures using particle swarm optimization. In Proceedings of the International Conference on Soft Computing and Pattern Recognition (SoCPaR'10). IEEE, 145--151.Google ScholarGoogle ScholarCross RefCross Ref
  16. Rani Gnanaolivu, Theodore S. Norvell, and Ramachandran Venkatesan. 2011. Analysis of inner-loop mapping onto coarse-grained reconfigurable architectures using hybrid particle swarm optimization. Int. J. Organiz. Collect. Intell. 2, 2, 17--35. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. Matthew R. Guthaus, Jeffrey S. Ringenberg, Dan Ernst, Todd M. Austin, Trevor Mudge, and Richard B. Brown. 2001. MiBench: A free, commercially representative embedded benchmark suite. In Proceedings of the IEEE International Workshop on Workload Characterization. 3--14. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Mahdi Hamzeh, Aviral Shrivastava, and Sarma Vrudhula. 2012. EPIMap: Using epimorphism to map applications on CGRAs. In Proceedings of the 49th Annual Design Automation Conference (DAC'12). ACM Press, New York, 1284--1291. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Mahdi Hamzeh, Aviral Shrivastava, and Sarma Vrudhula. 2013. REGIMap: Register-aware application mapping on coarse-grained reconfigurable architectures (CGRAs). In Proceedings of the 50th Annual Design Automation Conference (DAC'13). ACM Press, New York, 18:1--18:10. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Akira Hatanaka and Nader Bagherzadeh. 2007. A modulo scheduling algorithm for a coarse-grain reconfigurable array template. In Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS'07). IEEE, 1--8.Google ScholarGoogle ScholarCross RefCross Ref
  21. Giuseppe F. Italiano. 1986. Amortized efficiency of a path retrieval data structure. Theor. Comput. Sci. 48, 2--3, 273--281. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Yongjoo Kim, Jongeun Lee, Aviral Shrivastava, Jonghee W. Yoon, Doosan Cho, and Yunheung Paek. 2011. High through-put data mapping for coarse-grained reconfigurable architectures. IEEE Trans. Comput.-Aided Des. Integr. Circ. Syst. 30, 11, 1599--1609. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. Ian Kuon and Jonathan Rose. 2007. Measuring the gap between FPGAs and ASICs. IEEE Trans. Comput.-Aided Des. Integr. Circ. Syst. 26, 2, 203--215. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. Zion Kwok and Steven J. E. Wilton. 2005. Register file architecture optimization in a coarse-grained reconfigurable architecture. In Proceedings of the 13th International Symposium on Field-Programmable Custom Computing Machines (FCCM'05). IEEE, 35--44. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. Dongwook Lee, Manhwee Jo, Kyuseung Han, and Kiyoung Choi. 2009. FloRA: Coarse-grained reconfigurable architecture with floating-point operation capability. In Proceedings of the International Conference on Field-Programmable Technology (ICFPT'09). IEEE, 376--379.Google ScholarGoogle ScholarCross RefCross Ref
  26. Jong-Eun Lee, Kiyoung Choi, and Nikil D. Dutt. 2003. Compilation approach for coarse-grained reconfigurable architectures. IEEE Des. Test Comput. 20, 1, 26--33. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. Giorgio Levi. 1973. A note on the derivation of maximal common subgraphs of two directed or undirected graphs. Calcolo 9, 4, 341--352.Google ScholarGoogle ScholarCross RefCross Ref
  28. Alan Marshall, Tony Stansfield, Igor Kostarnov, Jean Vuillemin, and Brad Hutchings. 1999. A reconfigurable arithmetic array for multimedia applications. In Proceedings of the 7th ACM/SIGDA International Symposium on Field Programmable Gate Arrays (FPGA'99). ACM Press, New York, 135--143. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. Bingfeng Mei, Serge Vernalde, Diederik Verkest, Hugo De Man, and Rudy Lauwereins. 2003a. ADRES: An architecture with tightly coupled vliw processor and coarse-grained reconfigurable matrix. In Proceedings of the 13th International Conference on Field Programmable Logic and Application (FPL'03). Springer, 61--70.Google ScholarGoogle ScholarCross RefCross Ref
  30. Bingfeng Mei, Serge Vernalde, Diederik Verkest, Hugo De Man, and Rudy Lauwereins. 2003b. Exploiting loop-level parallelism on coarse-grained reconfigurable architectures using modulo scheduling. In Proceedings of the Conference on Design, Automation and Test in Europe (DATE'03). IEEE, 296--301. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. Nils J. Nilsson. 1982. Principles of Artificial Intelligence. Springer. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. Hyunchul Park, Kevin Fan, Manjunath Kudlur, and Scott Mahlke. 2006. Modulo graph embedding: Mapping applications onto coarse-grained reconfigurable architectures. In Proceedings of the International Conference on Compilers, Architecture and Synthesis for Embedded Systems (CASES'06). ACM Press, New York, 136--146. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. Hyunchul Park, Kevin Fan, Scott A. Mahlke, Taewook Oh, Heeseok Kim, and Hong-Seok Kim. 2008. Edge-centric modulo scheduling for coarse-grained reconfigurable architectures. In Proceedings of the 17th International Conference on Parallel Architectures and Compilation Techniques (PACT'08). ACM Press, New York, 166--176. Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. B. Ramakrishna Rau. 1994. Iterative modulo scheduling: An algorithm for software pipelining loops. In Proceedings of the 27th International Symposium on Microarchitecture (MICRO'94). ACM Press, New York, 63--74. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. Neil Robertson and Paul D. Seymour. 1999. Graph minors. J. Combin. Theory B77, 1, 162--210. Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. Neil Robertson and Paul D. Seymour. 2004. Graph minors. XX. Wagner's conjecture. J. Combin. Theory B92, 2, 325--357. Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. Hartej Singh, Ming-Hau Lee, Guangming Lu, Fadi J. Kurdahi, Nader Bagherzadeh, and Eliseu M. Chaves Filho. 2000. MorphoSys: An integrated reconfigurable system for data-parallel and computation-intensive applications. IEEE Trans. Comput. 49, 5, 465--481. Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. Mohammed A. A. Tuhin and Theodore S. Norvell. 2008. Compiling parallel applications to coarse-grained reconfigurable architectures. In Proceedings of the Canadian Conference on Electrical and Computer Engineering (CCECE'08). IEEE, 1723--1728.Google ScholarGoogle Scholar
  39. Julian R. Ullmann. 1976. An algorithm for subgraph isomorphism. J. ACM 23, 1, 31--42. Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. Jonghee W. Yoon, Aviral Shrivastava, Sanghyun Park, Minwook Ahn, and Yunheung Paek. 2009. A graph drawing based spatial mapping algorithm for coarse-grained reconfigurable architectures. IEEE Trans. VLSI Syst. 17, 11, 1565--1578. Google ScholarGoogle ScholarDigital LibraryDigital Library
  41. Javier Zalamea, Josep Llosa, Eduard Ayguade, and Mateo Valero. 2003. MIRS: Modulo scheduling with integrated register spilling. In Proceedings of the Workshop on Languages and Compilers for Parallel Computing. Springer, 239--253. Google ScholarGoogle ScholarDigital LibraryDigital Library

Recommendations

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Sign in

Full Access

  • Published in

    cover image ACM Transactions on Reconfigurable Technology and Systems
    ACM Transactions on Reconfigurable Technology and Systems  Volume 7, Issue 3
    Special Issue on 11th International Conference on Field-Programmable Technology (FPT'12) and Special Issue on the 7th International Workshop on Reconfigurable Communication-Centric Systems-on-Chip (ReCoSoC'12)
    August 2014
    199 pages
    ISSN:1936-7406
    EISSN:1936-7414
    DOI:10.1145/2664590
    Issue’s Table of Contents

    Copyright © 2014 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    • Published: 3 September 2014
    • Accepted: 1 December 2013
    • Revised: 1 November 2013
    • Received: 1 May 2013
    Published in trets Volume 7, Issue 3

    Permissions

    Request permissions about this article.

    Request Permissions

    Check for updates

    Qualifiers

    • research-article
    • Research
    • Refereed

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader