skip to main content
10.1145/1176760.1176778acmconferencesArticle/Chapter ViewAbstractPublication PagesesweekConference Proceedingsconference-collections
Article

Modulo graph embedding: mapping applications onto coarse-grained reconfigurable architectures

Published:22 October 2006Publication History

ABSTRACT

Coarse-grained reconfigurable architectures (CGRAs) present an appealing hardware platform by providing the potential for high computation throughput, scalability, low cost and energy efficiency. CGRAs consist of an array of function units and register files generally organized as a two dimensional grid. The most difficult challenge with deploying CGRAs is compiler scheduling technology that can map software implementations of compute intensive loops onto the array. Traditional schedulers are not suitable because they do not take into account the explicit routing of operand values that is necessary. In essence, the problem of binding operations to time slots and resources is extended to also include explicit routing of operands from producers to consumers. To tackle this problem, this paper introduces a software pipelining technique for mapping loop bodies onto CGRAs, referred to as modulo graph embedding. We leverage graph embedding from graph theory, which is used to draw graphs onto a target space. The loop body is essentially drawn onto the CGRA mesh, subject to modulo resource usage constraints. Modulo graph embedding is effective because it can take into account the communication structure of the loop body during mapping. On average, a compute utilization of 56-68% is achieved for a set of loop kernels across three 4x4 CGRA designs.

References

  1. T. Callahan, J. Hauser, and J. Wawrzynek. The Garp architecture and C compiler. IEEE Computer, 33(4):62--69, Apr. 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. R. Davidson and D. Harel. Drawing graphs nicely using simulated annealing. ACM Transactions on Graphics, 15(4):301--331, 1996. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. P. Eades. A heuristic for graph drawing. Congressus Numerantium, 42:149--160, 1984.Google ScholarGoogle Scholar
  4. C. Ebeling et al. Mapping applications to the RaPiD configurable architecture. In Proc. of the 5th IEEE Symposium on Field-Programmable Custom Computing Machines, pages 106--115, Apr. 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. J. Ellis. Bulldog: A Compiler for VLIW Architectures. MIT Press, Cambridge, MA, 1985. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. S. Goldstein et al. PipeRench: A coprocessor for streaming multimedia acceleration. In Proc. of the 26th Annual International Symposium on Computer Architecture, pages 28--39, June 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. T. Kamada and S. Kawai. An algorithm for drawing general undirected graphs. Information Processing Letters, 31:7--15, 1989. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. G. Krishnamurthy, E. Granston, and E. Stotzer. Affinity-based cluster assignment for unrolled loops. In Proc. of the 2002 International Conference on Supercomputing, pages 107--116, June 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. J. Lee, K. Choi, and N. Dutt. Compilation approach for coarse-grained reconfigurable architectures. IEEE Design & Test of Computers, 20(1):26--33, Jan. 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. W. Lee et al. Space-time scheduling of instruction-level parallelism on a RAW machine. In Eighth International Conference on Architectural Support for Programming Languages and Operating Systems, pages 46--57, Oct. 1998. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. W. Lee, D. Puppin, S. Swenson, and S. Amarasinghe. Convergent scheduling. In Proc. of the 35th Annual International Symposium on Microarchitecture, pages 111--122, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. W. Li and H. Kurata. A grid layout algorithm for automatic drawing of biochemical networks. Bioinformatics, 21(9):2036--2042, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. G. Lu, H. Singh, M. -H. Lee, N. Bagherzadeh, F. J. Kurdahi, and E. M. C. Filho. The MorphoSys parallel reconfigurable system. In Proc. of the 5th International Euro-Par Conference, pages 727--734, 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. B. Mei, S. Vernalde, D. Verkest, H. D. Man, and R. Lauwereins. Exploiting loop-level parallelism on coarse-grained reconfigurable architectures using modulo scheduling. In Proc. of the 2003 Design, Automation and Test in Europe, pages 296--301, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. E. Nystrom and A. E. Eichenberger. Effective cluster assignment for modulo scheduling. In Proc. of the 31st Annual International Symposium on Microarchitecture, pages 103--114, Dec. 1998. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. B. R. Rau. Iterative modulo scheduling: An algorithm for software pipelining loops. In Proc. of the 27th Annual International Symposium on Microarchitecture, pages 63--74, Nov. 1994. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. J. Sanchez and A. Gonzalez. Modulo scheduling for a fully-distributed clustered VLIW architecture. In Proc. of the 33rd Annual International Symposium on Microarchitecture, pages 124--133, Dec. 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. M. B. Taylor et al. The Raw microprocessor: A computational fabric for software circuits and general purpose programs. IEEE Micro, 22(2):25--35, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. G. Venkataramani, W. Najjar, F. Kurdahi, N. Bagherzadeh, and W. Bohm. A compiler framework for mapping applications to a coarse-grained reconfigurable computer architecture. pages 116--125, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Modulo graph embedding: mapping applications onto coarse-grained reconfigurable architectures

        Recommendations

        Comments

        Login options

        Check if you have access through your login credentials or your institution to get full access on this article.

        Sign in
        • Published in

          cover image ACM Conferences
          CASES '06: Proceedings of the 2006 international conference on Compilers, architecture and synthesis for embedded systems
          October 2006
          448 pages
          ISBN:1595935436
          DOI:10.1145/1176760

          Copyright © 2006 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 22 October 2006

          Permissions

          Request permissions about this article.

          Request Permissions

          Check for updates

          Qualifiers

          • Article

          Acceptance Rates

          Overall Acceptance Rate52of230submissions,23%

          Upcoming Conference

          ESWEEK '24
          Twentieth Embedded Systems Week
          September 29 - October 4, 2024
          Raleigh , NC , USA

        PDF Format

        View or Download as a PDF file.

        PDF

        eReader

        View online with eReader.

        eReader