ABSTRACT
DSP architectures often feature multiple register files with sparse connections to a large set of ALUs. For such DSPs, traditional register allocation algorithms suffer from a lot of problems, including a lack of retargetability and phase-ordering problems. This paper studies alternative register allocation techniques based on placement and routing. Different register file models are studied and evaluated on a state-of-the art coarse-grained reconfigurable array DSP, together with a new post-pass register allocator for rotating register files.
- AHN, M., AND PAEK, Y. Fast code generation for embedded processors with aliased heterogeneous registers. Trans. on HiPEAC 2, 2 (2007), 40--59.Google Scholar
- ALETA , A., CODINA, J. M., GONZÁLEZ, A., AND KAELI, D. Removing communications in clustered microarchitectures through instruction replication. ACM Trans. Archit. Code Optim. 1, 2 (2004), 127--151. Google ScholarDigital Library
- BETZ, V., ROSE, J., AND MARGUARDT, A. Architecture and CAD for Deep-Submicron FPGAs. Kluwer Academic Publishers, 1999. Google ScholarDigital Library
- BOUWENS, F., BEREKOVIC, M., GAYDADJIEV, G., AND DE SUTTER, B. Architecture enhancements for the ADRES coarse-grained reconfigurable array. In Proc. of HiPEAC Conf. (2008). Google ScholarDigital Library
- BRIGGS, P., COOPER, K. D., AND TORCZON, L. Improvements to graph coloring register allocation. ACM Trans. Program. Lang. Syst. 16, 3 (1994), 428--455. Google ScholarDigital Library
- CALLAHAN, D., AND KOBLENZ, B. Register allocation via hierarchical graph coloring. SIGPLAN Not. 26, 6 (1991), 192--203. Google ScholarDigital Library
- CARPANETO, G., DELL'AMICO, M., AND TOTH, P. Exact solution of large-scale, asymmetric traveling salesman problems. ACM Trans.Math. Softw. 21, 4 (1995), 394--409. Google ScholarDigital Library
- CERVERO, T. Analysis, implementation and architectural exploration of the H.264/AVC decoder onto a reconfigurable architecture. Master's thesis, Universidad de Los Palmas de Gran Canaria, 2007.Google Scholar
- CHAINTIN, G., AUSLANDER, M., CHANDRA, A. K., COCKE, J., HOPKINS, M., AND MARKSTEIN, P. Register allocation via coloring. Computer Languages 6, 1 (1981), 47--57.Google Scholar
- CHU, M., FAN, K., AND MAHLKE, S. Region-based hierarchical operation partitioning for multicluster processors. In PLDI '03: Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation (2003), pp. 300--311. Google ScholarDigital Library
- EISENBEIS, C., LELAIT, S., AND MARMOL, B. Circular-arc graph coloring and unrolling. In Proceedings of the 5th Twente Workshop on Graphs and Combinatorial Optimization (Twente, Netherlands, May 1997), U. Faigle and C. Hoede, Eds., pp. 71--74.Google Scholar
- GEORGE, L., AND APPEL, A. W. Iterated register coalescing. ACM Trans. Program. Lang. Syst. 18, 3 (1996), 300--324. Google ScholarDigital Library
- HARAIKAWA, T., SOENO, M., YAMASHITA, Y., AND NAKATA, I. Register allocation frameworks for slide-window architecture. Transactions of Information Processing Society of Japan 39, 9 (1998), 2684--2694. (in Japanese).Google Scholar
- HENDREN, L. J., GAO, G. R., ALTMAN, E. R., AND MUKERJI, C. A register allocation framework based on hierarchical cyclic interval graphs. In Compiler Construction (1992), pp. 176--191. Google ScholarDigital Library
- ITOGA, H., HARAIKAWA, T., YAMASHITA, Y., AND TANAKA, J. Register allocation for software pipelining with predication using spiral graph. In Proceedings of the International Symposium on Future Software Technology (ISFST2001) (2001), pp. 58--65.Google Scholar
- KIM, S., AND MOON, S.-M. Rotating register allocation for enhanced pipeline scheduling. In PACT '07: Proceedings of the 16th International Conference on Parallel Architecture and Compilation Techniques (PACT 2007) (2007), pp. 60--72. Google ScholarDigital Library
- KOES, D. R., AND GOLDSTEIN, S. A global progressive register allocator. In Proc. PLDI (2006), pp. 204--215. Google ScholarDigital Library
- LAM, M. S. Software pipelining: an effecive scheduling technique for VLIW machines. In Proc. PLDI (1988), pp. 318--327. Google ScholarDigital Library
- LAPINSKII,V. S., JACOME,M. F., AND VECIANA, G. A. D. Cluster assignment for high-performance embedded CLIW processors. ACM Trans. Des. Autom. Electron. Syst. 7, 3 (2002), 430--454. Google ScholarDigital Library
- MAHLKE, S., LIN, D., W.Y., C., HANK, R., AND BRINGMANN, R. Effective compiler support for predicated execution using the hyperblock. In MICRO 25: Proceedings of the 25th annual international symposium on Microarchitecture (1992), pp. 45.54. Google ScholarDigital Library
- MARX, D. Eulerian disjoint paths problem in grid graphs is NP-complete. Discrete Appl. Math. 143, 1-3 (2004), 336--341. Google ScholarDigital Library
- MEI, B., VERNALDE, S., VERKEST, D., MAN, H. D., AND LAUWEREINS, R. ADRES: An architecture with tightly coupled VLIW processor and coarse-grained reconfigurable matrix. In Proc. of Field-Programmable Logic and Applications (2003), pp. 61.70.Google ScholarCross Ref
- MEI, B., VERNALDE, S., VERKEST, D., MAN, H. D., AND LAUWEREINS, R. Exploiting loop-level parallelism for coarse-grained reconfigurable architecture using modulo scheduling. IEE Proceedings: Computer and Digital Techniques 150, 5 (2003).Google ScholarCross Ref
- PARK, H., FAN, K., KUDLUR,M., AND MAHLKE, S. Modulo graph embedding: Mapping applications onto coarse-grained reconfigurable architectures. In Proc. CASES (2006). Google ScholarDigital Library
- PARK, I., POWELL, M. D., AND VIJAYKUMAR, T. N. Reducing register ports for higher speed and lower energy. In MICRO 35: In Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture (2002), pp. 171--182. Google ScholarDigital Library
- PARK, J., AND MOON, S.-M. Optimistic register coalescing. ACM Trans. Program. Lang. Syst. 26, 4 (2004), 735--765. Google ScholarDigital Library
- POLETTO, M., AND SARKAR, V. Linear scan register allocation. ACM Trans. Program. Lang. Syst. 21, 5 (1999), 895--913. Google ScholarDigital Library
- RAU, B. R. Iterative modulo scheduling. Tech. rep., Hewlett-Packard Lab: HPL-94-115, 1995.Google Scholar
- RAU, B. R., AND GLASER, C. D. Scheduling techniques and an easily schedulable horizontal architecture for high performance scientific computing. In Proc. 20th Annual Workshop on Microprogramming and Microarchitecture (1981), pp. 183--198. Google ScholarDigital Library
- RAU, B. R., LEE, M., TIRUMALAI, P. P., AND SCHLANSKER, M. S. Register allocation for software pipelined loops. In Proc. PLDI (1992), pp. 283--299. Google ScholarDigital Library
- TERECHKO, A. S., AND CORPORAAL, H. Inter-cluster communication in vliw architectures. ACM Trans. Archit. Code Optim. 4, 2 (2007), 11. Google ScholarDigital Library
- TOUATI, S.-A.-A., AND EISENBEIS, C. Cyclic register pressure and allocation for modulo scheduled loops. Tech. Rep. 4442, INRIA, April 2002.Google Scholar
Index Terms
- Placement-and-routing-based register allocation for coarse-grained reconfigurable arrays
Recommendations
Placement-and-routing-based register allocation for coarse-grained reconfigurable arrays
LCTES '08DSP architectures often feature multiple register files with sparse connections to a large set of ALUs. For such DSPs, traditional register allocation algorithms suffer from a lot of problems, including a lack of retargetability and phase-ordering ...
Differential register allocation
PLDI '05: Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementationMicro-architecture designers are very cautious about expanding the number of architected registers (also the register field), because increasing the register field adds to the code size, raises I-cache and memory pressure, complicates processor ...
Differential register allocation
Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementationMicro-architecture designers are very cautious about expanding the number of architected registers (also the register field), because increasing the register field adds to the code size, raises I-cache and memory pressure, complicates processor ...
Comments