ABSTRACT
Recently, software pipelining methods based on an ILP (Integer Linear Programming) framework have been successfully applied to derive rate-optimal schedules for architectures involving clean pipelines - pipelines without structural hazards. The problem for architectures beyond such clean pipelines remains open. One challenge is how, under a unified ILP framework, to simultaneously represent resource constraints for unclean pipelines, and the assignment or mapping of operations from a loop to those pipelines. In this paper we provide a framework which does exactly this, and in addition constructs rate-optimal software pipelined schedules. The proposed formulation and a solution method have been implemented and tested on a set of 1006 loops taken from various scientific and integer benchmark suites. The formulation found a rate-optimal schedule for 75% of the loops, and required a median time of only 2 seconds per loop on a Sparc 10/30.
- 1.A. Aiken and A. Nicolau. cprimal loop parallelization. tn Proc. of the SIGPLAN '88 Conf. on Programming Language Design and Implementation, pages 308-317, Atlanta, Georgia, Jun. 22-24, 1988. Google ScholarDigital Library
- 2.E.R Altman. Two Approaches for Optimal Software Pipelining with Resource Constraints (In Preparation). PhD thesis, McGill U., Montreal, Que., 1995. Google ScholarDigital Library
- 3.R. P. Colwei1, R. P. Nix, J. J. O'Donnell, D. B. Papworth, and P. K. Rodman. A VLIW architecture for a trace scheduling compiler. IEEE Trans. on Computers, pages 967-979, Aug. 1988. Google ScholarDigital Library
- 4.J. C. Dehnert and R. A. Towle. Compiling for Cydra 5.tovrnal of Supercomputing, 7:181-227, May 1993. Google ScholarDigital Library
- 5.A. E. Eichenberger, E. S. Davidson, and S. G. Abraham. Minimum register requirements for a modulo schedule. In Proc. of the 27th Ann. Intl. Syrup. on M~croarchitecture, pages 75-84, San Jose, Calif., Nov. 30-Dec.2, 1994. Google Scholar
- 6.P. Feautrier. Fine-grMn Scheduling under Resource Constraints. In Seventh Annual Workshop on Languages and Compilers for Parallel Computing, Ithaca, USA, August 1994. Google ScholarDigital Library
- 7.F. Gasperoni and U. Schwiegelshohn. Efficient algorithms for cyclic scheduling. Rcs. Rep. RC 17068, IBM T. J. Watson Res. Center, Yorktown Heights, 1991. Google ScholarDigital Library
- 8.R. Govindarajan, E. R. Altman, and G. R. Gao. Coscheduling hardware and software pipelines. ACAPS Technical Memo 92, School of Computer Science, McGill University, Montreal, Que., 1995.Google Scholar
- 9.R. Govindarajan, E. R. Altman, and G. R. Gao. Minimizing register requirements under resourceconstrained rate-optimal software pipelining. In Proc. of the 27th Ann. Intl. Syrup. on Microarchitecture, pages 85-94, San Jose, C~tif., Nov. 30-Dec.2, 1994. Google Scholar
- 10.L. J. Hendren, G. R. Gao, E. R. Altman, and C. Mukerji. A register allocation framework based on hierarchical cyclic intervM graphs. In U. Kastens and P. Pfahter, editors, Proc. of the Intl. Conf. on Compiler Construction, number 641 in Lec. Notes in Comp. Sci., pages 176-191. Springer-Verlag, Oct. 1992. Google ScholarDigital Library
- 11.P.Y.T. Hsu. Highly concurrent scalar processing. Technical report, University of Illinois at Urbana- Ch~tmpagne, Urb~na, IL, 1986. Ph.D. Thesis, Google ScholarDigital Library
- 12.T. C. Hu. Integer Programming and Network Flows, page 270. Addison-Wesley Pub. Co., 1969.Google Scholar
- 13.R. A. Huff. Lifetime-sensitive modulo scheduling. In Proc. of the SIGPLAN '93 Conf. on Programming Language Design and Implementation, pages 258-267, Albuquerque, N. Mex., Jun. 23-25, 1993. Google ScholarDigital Library
- 14.IBM/Motorola. PowerPC 604 RISC Microprocessor Technical Summary, 1994.Google Scholar
- 15.P. M. Kogge. The Architecture of Pzpehned Computers. McGraw-Hill Book Company, New York, N. Y., 1981. Google ScholarDigital Library
- 16.M. Lam. Software pipelining: An effective scheduling technique for VLIW machines. In Proc. of the SIC- PLAN '88 Conf. on Programming Language Design and Implementation, p~ges 318-328, Atlanta, Georgia, Jun. 22-24, 1988. Google ScholarDigital Library
- 17.S-M. Moon and K. Ebcioglu. An efficient resourceconstrained global scheduling technique for superscalar and VLIW processors. In Proc. of the 25th Ann. Intl. Syrup. on Microarchitecture, pages 55-71, Portland, Ore., Dec. 1-4, 1992. Google ScholarDigital Library
- 18.Q. Ning and G. R. Gao. A novel framework of register allocation for software pipelining. In Conf. Rec. of the Twentieth Ann. A CM SIGPLAN-SICA CT Syrup. on Principles of Programming Languages, pages 29-42, Charleston, South C~rolina, Jan. I0-13, 1993. Google ScholarDigital Library
- 19.B. R. Rau and J. A. Fisher. Instruction-level parallel processing: History, overview and perspective. J. of Supercomputing, 7:9-50, May 1993. Google ScholarDigital Library
- 20.B. R. Rau and C. D. Glaeser. Some scheduling techniques and an easily schedulable horizontal architecture for high performance scientific computing. In Proc. of the 1Jth Ann. Microprogramming Work., pages 183-198, Chatham, Mass., Oct. 12-15, 1981. Google ScholarDigital Library
- 21.B. R. Rau, M. Lee, P. P. Tirumalai, and M. S. Schlansker. Register allocation for software pipelined loops. In Proc. of the SIGPLAN '92 Conf. on Programming Language Deszgn and Implementation, pages 283-299, San Francisco, Calif., Jun. 17-19, 1992. Google ScholarDigital Library
- 22.B. R. Rau. Iterative modulo scheduling: An algorithm for software pipelining loops, tn Proc. of the 27th Ann. Intl. Syrup. on Mzcroarchitccture, pages 63- 74, San Jose, Calif., Nov. 30-Dec.2, 1994. Google ScholarDigital Library
- 23.R. Reiter. Scheduling parallel computations. J. of the ACM, 15(4):590-599, Oct. 1968. Google ScholarDigital Library
- 24.V. Van Dongen, G. R. Gao, and Q. Ning. A polynomial time method for optimal software pipetining. In Proc. of the Conf. on Vector and Parallel Processzng, CONPAR-92, number 634 in Lec. Notes in Comp. Sci., pages 613-624, Lyon, France, Sep. 1-4, 1992. Google ScholarDigital Library
- 25.S. R. Vegdahl. A Dynamic Programming Technique for Compacting Loops. In Proc. of the 25th Ann. Intl. Symp. on Microarchitecture, pages 180-188, Portland, Ore., Dec. 1-4, 1992. Google ScholarDigital Library
- 26.J. Wang and E. Eisenbeis. A new approach to software pipelining of complicated loops with branches. Res. rep., INRIA, Rocquencourt, France, Jan. 1993,Google Scholar
- 27.N. J. Warter, J. w. Bockhaus, G. E. Haab, and K. Subramanian. Enhanced modulo scheduling for loops with conditionM branches. In Proc. of the 25th Ann. Intl. Symp. on Microarchitecture, pages 170-179, Portland, Ore., Dec. 1-4, 1992. Google ScholarDigital Library
Index Terms
- Scheduling and mapping: software pipelining in the presence of structural hazards
Recommendations
Scheduling and mapping: software pipelining in the presence of structural hazards
Recently, software pipelining methods based on an ILP (Integer Linear Programming) framework have been successfully applied to derive rate-optimal schedules for architectures involving clean pipelines - pipelines without structural hazards. The problem ...
Mapping Imperfect Loops to Coarse-Grained Reconfigurable Architectures
Nested loops represent a significant portion of application runtime in multimedia and DSP applications, an important domain of applications for coarse-grained reconfigurable architectures (CGRAs). While conventional approaches to mapping nested loops ...
Flattening-based mapping of imperfect loop nests for CGRAs
CODES '14: Proceedings of the 2014 International Conference on Hardware/Software Codesign and System SynthesisFor loop accelerators such as coarse-grained reconfigurable architectures (CGRAs) and GP-GPUs, nested loops represent an important source of parallelism. Existing solutions to mapping nested loops on CGRAs, however, are either designed for perfectly ...
Comments