ABSTRACT
We discuss the issues about the interdependency between code scheduling and register allocation. We present two methods as solutions: (1) an integrated code scheduling technique; and (2) a DAG-driven register allocator. The integrated code scheduling method combines two scheduling techniques—one to reduce pipeline delays and the other to minimize register usage—into a single phase. By keeping track of the number of available registers, the scheduler can choose the appropriate scheduling technique to schedule a better code sequence. The DAG-driven register allocator uses a dependency graph to assist in assigning registers; it introduces much less extra dependency than does an ordinary register allocator. For large basic blocks, both approaches were shown to generate more efficient code sequences than conventional techniques in the simulations.
- Aho77.Aho, A. Y., S. C. Johnson, and J. D. Ullrrlan, "Code Generation for expressions with common subexpressions," JACM, 24:1, 146-160, 1977 Google ScholarDigital Library
- Aho86.Aho, A. V., J. D. Ullman, and R. Sethi, "Compilers Principles, Techniques, and Tools," Addison-Wesley, Reading, MA, 1986. Google ScholarDigital Library
- Ausl82.Auslander, M, and M. Hopkins, "An Overview of the PL.8 compiler," Proceedings of the SIGPLAN '82 Symposium on Compiler Construction, June, 1982. Google ScholarDigital Library
- Cray82.Cray Research inc., Cray-1 Computer System S Series Mainframe Reference Manual (HR-0029), 1982.Google Scholar
- Davi86.Davidson, J. W., "A Retargetable Instruction Reorganizer," Proceedings of the SIGPLAN '86 Symposium on Compiler Construction. June, 1986 Google ScholarDigital Library
- Dong79.Dongarra, J. J. and A. R. Jinds, "Unrolling I.x~ps in Fortran,'" Software Practice and Experience 9, 3, pp. 219-226, Mar., 1979Google ScholarCross Ref
- Elli85.Ellis, J. R., "Bulldog; A Compiler for VLIW Architectures," PhJD. Thesis, YaleU/DCS/RR-364, Yale University, Feb,, 1985 Google ScholarDigital Library
- Fish81.Fisher, J., "Trace Scheduling: A Technique for Global Microcode Compaction," IEEE Transactions on Computers, Vol. C-30, No. 7, July 1981.Google Scholar
- Gibb86.Gibbons P. B., and S. S. Muchnick, "Efficient Instruction Scheduling for a Pipelined Architecture," Proceedings of the SIGPLaCN '86 Symposium on Compiler Construction, JUrl., 1986 Google ScholarDigital Library
- Henn83.Hermessy, J, L., and Thomas Gross, "Postpass Code Optimization of Pipeline Constraints," ACM Transactions on Programming Languages and Systems 5, 3, pp. 422-448, July 1983 Google ScholarDigital Library
- Henn84.Hennessy, J. L., "VLSI Processor Architecture," IEEE Transactions on Computers, Vol. c-33 No. 12, Dec., 1984.Google ScholarDigital Library
- Hsu87.Hsu, Wei-Chung, "Register Allocation and Code Scheduling for Load/Store Architectures" UW Computer Science Technique Report #722, Oct., 1987Google Scholar
- Kogg81.Kogge, P. M., "The Architecture of Pipelined Computers,~' McGraw-Hill, New York, 1981Google Scholar
- McMa72.McMahon, F. H, "FORTRAN CPU Performance Analysis", Lawrence Livermore Laboratories, 1972Google Scholar
- Milu86.Milutinovic, Veliko, "GaAs Microprocessor Technology" Computer, Vol. 19, No. 10, Oct. 1986 Google ScholarDigital Library
- Padu86.Padua, D. and M, J. Wolfe,. "Advanced Compiler 0ptimizations for Supercomputers," Communication of the ACM, Dec. 1986 Google ScholarDigital Library
- Seth70.Sethi, R. and J. D. Ullman, "The Generation of Optimal Code for Arithmetic Expressions," JACM 17, 6, 1970, pp. 715-728 Google ScholarDigital Library
- Thor70.Thornton, J. E., "Design o1: a Computer, The Control Data 6600," Scott, Foresman and Co., Glenview, IL, 1970. Google ScholarDigital Library
- Tjad70.Tjaden, G. S. and M. J. Plynn, "Detection and Parallel Execution of Independent lru~tmctions," IEEE Transactions on Computers 19(10):889-895, Oct., 1970Google ScholarDigital Library
- Toma67.Tomasulo, R. M., "An Efficient Algorithm for Exploiting Multiple Arithmetic Units," IBM Journal of Research and Development V 11, pp. 25-33, Jan., 1967.Google ScholarDigital Library
- Weis87.Weiss, S. and J. E. Smith, "A Study of Scalar Compilation Techniques for Pipelined Supercomputers" Second International Conference on Architectural Support for Programming Languages and Operating Systems," Oct., 1987. Google ScholarCross Ref
- Youn85.Young, H., "Evaluation of a Decoupled Computer Architecture and the Design of A Vector Extension," Computer Sciences Technical Report #603, July, 1985Google Scholar
Index Terms
- Code scheduling and register allocation in large basic blocks
Recommendations
Code scheduling and register allocation in large basic blocks
ACM International Conference on Supercomputing 25th Anniversary VolumeWe discuss the issues about the interdependency between code scheduling and register allocation. We present two methods as solutions: (1) an integrated code scheduling technique; and (2) a DAG-driven register allocator. The integrated code scheduling ...
Author retrospective for code scheduling and register allocation in large basic blocks
ACM International Conference on Supercomputing 25th Anniversary VolumeIn 1987 we were working at the University of Wisconsin-Madison with Jim Smith, J. T. Hsieh, Koujuch Liou and Andrew Pleszkun on PIPE [4], an unorthodox 'decoupled access-execute processor.' The driving innovation of PIPE was the separation of ...
Fast, frequency-based, integrated register allocation and instruction scheduling
Instruction scheduling and register allocation are two of the most important optimization phases in modern compilers as they have a significant impact on the quality of the generated code. Unfortunately, the objectives of these two optimizations are in ...
Comments