Abstract
Large-grain synchronous dataflow graphs or multi-rate graphs have the distinct feature that the nodes of the dataflow graph fire at different rates. Such multi-rate large-grain dataflow graphs have been widely regarded as a powerful programming model for DSP applications. In this paper we propose a method to minimize buffer storage requirement in constructing rate-optimal compile-time (MBRO) schedules for multi-rate dataflow graphs. We demonstrate that the constraints to minimize buffer storage while executing at the optimal computation rate (i.e. the maximum possible computation rate without storage constraints) can be formulated as a unified linear programming problem in our framework. A novel feature of our method is that in constructing the rate-optimal schedule, it directly minimizes the memory requirement by choosing the schedule time of nodes appropriately. Lastly, a new circular-arc interval graph coloring algorithm has been proposed to further reduce the memory requirement by allowing buffer sharing among the arcs of the multi-rate dataflow graph.
We have constructed an experimental testbed which implements our MBRO scheduling algorithm as well as (i) the widely used periodic admissible parallel schedules (also known as block schedules) proposed by Lee and Messerschmitt (IEEE Transactions on Computers, vol. 36, no. 1, 1987, pp. 24–35), (ii) the optimal scheduling buffer allocation (OSBA) algorithm of Ning and Gao (Conference Record of the Twentieth Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, Charleston, SC, Jan. 10–13, 1993, pp. 29–42), and (iii) the multi-rate software pipelining (MRSP) algorithm (Govindarajan and Gao, in Proceedings of the 1993 International Conference on Application Specific Array Processors, Venice, Italy, Oct. 25–27, 1993, pp. 77–88). Schedules generated for a number of random dataflow graphs and for a set of DSP application programs using the different scheduling methods are compared. The experimental results have demonstrated a significant improvement (10–20%) in buffer requirements for the MBRO schedules compared to the schedules generated by the other three methods, without sacrificing the computation rate. The MBRO method also gives a 20% average improvement in computation rate compared to Lee's Block scheduling method.
Similar content being viewed by others
References
E. Ashford Lee and D.G. Messerschmitt, “Static Scheduling of Synchronous Data Flow Programs for Digital Signal Processing, IEEE Transactions on Computers, vol. 36,no. 1, 1987, pp. 24-35.
H. Printz, “Automatic Mapping of Large Signal Processing Systems to a Parallel Machine,” Ph.D. Thesis. Published as Memorandum CMU-CS-91-101, Department of Computer Science, Carnegie-Mellon University, Pittsburgh, Pennsylvania, 1991.
S. Ritz, M. Pankert, and H. Meyr, “High Level Software Synthesis for Signal Processing Systems,” in Proceedings of the 1992 International Conference on Application Specific Array Processors, Berkeley, California, Aug. 4–7, 1992.
G.R. Gao, R. Govindarajan, and P. Panangaden, “Construction Rules for Well-Behaved Stream Programs,” ACAPS Technical Memo 26, School of Computer Science, McGill University, Montréal, Québec, April 1992.
J.B. Dennis, “First Version of a Data-Flow Procedure Language,” in Proceedings of the Colloque sur la Programmation, Lecture Notes in Computer Science, vol. 19, Berlin: Springer-Verlag, 1975, pp. 362-376.
B.R. Rau and C.D. Glaeser, “Some Scheduling Techniques and an Easily Schedulable Horizontal Architecture for High Performance Scientific Computing,” in Proceedings of the 14th Annual Microprogramming Workshop, Chatham, Massachusetts, Oct. 12–15, 1981, pp. 183-198.
M. Lam, “Software Pipelining: An Effective Scheduling Technique for VLIW Machines,” in Proceedings of the SIGPLAN '88 Conference on Programming Language Design and Implementation, Atlanta, Georgia, June 22–24, 1988, pp. 318-328.
V.H. Allan, R.B. Jones, R.M. Lee, and S.J. Allan, “Software Pipelining,” ACM Computing Surveys, vol. 27,no. 3, 1995, pp. 367-432.
R. Govindarajan and G.R. Gao, “A Novel Framework for Multi-Rate Scheduling in DSP Applications,” in Proceedings of the 1993 International Conference on Application Specific Array Processors, Venice, Italy, Oct. 25–27, 1993, pp. 77-88.
R. Govindarajan and G.R. Gao, “Rate-Optimal Schedule for Multi-Rate DSP Computations,” Journal of VLSI Signal Processing, vol. 9,no. 3, 1995, pp. 211-232.
L.-G. Jeng and L.-G. Chen, “Rate-Optimal DSP Synthesis by Pipeline and Minimum Unfolding,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 2,no. 1, 1994, pp. 81-88.
D.B. Powell, E.A. Lee, and W.C. Newman, “Direct Synthesis of Optimized DSP Assembly Code from Signal Flow Block Diagrams,” in Proceedings of ICASSP-92, the 1992 International Conference on Acoustics, Speech, and Signal Processing, San Francisco, California, March 23–26, 1992, pp. 553-556.
Q. Ning and G.R. Gao, “A Novel Framework of Register Allocation for Software Pipelining,” in Conference Record of the Twentieth Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, Charleston, SC, Jan. 10–13, 1993, pp. 29-42.
G.C. Sih, “Multiprocessor Scheduling to Account for Interprocessor Communication,” Ph.D. Thesis, Memorandum no. UCB/ERL M91/29, Electronics Research Laboratory, University of California at Berkeley, 1991.
E.A. Lee, “Consistency in Dataflow Graphs,” IEEE Transactions on Parallel and Distributed Systems, vol. 2, 1990, pp. 223-235.
R. Reiter, “Scheduling Parallel Computations,” J. of the ACM, vol. 15,no. 4, 1968, pp. 590-599.
E.L. Lawler, Combinatorial Optimization: Networks and Matroids, Ft. Worth, Texas: Saunders College Publishing, 1976.
R.M. Karp, “A Characterization of the Minimum Cycle Mean in a Digraph,” Discrete Mathematics, vol. 23, 1978, pp. 309-311.
R. Govindarajan, E.R. Altman, and G.R. Gao, “A Framework for Resource-Constrained Rate-Optimal Software Pipelining,” IEEE Transactions on Parallel and Distributed Systems, vol. 7,no. 11, 1996, pp. 1133-1149.
E.R. Altman, R. Govindarajan, and G.R. Gao, “A Unified Framework for Instruction Scheduling and Mapping for Function Units with Structural Hazards,” Journal of Parallel and Distributed Computing, vol. 49,no. 2, 1998, pp. 259-294.
D. Fimmel and J. Muller, “Optimal Software Pipelining under Resource Constraints,” Technical Report SFB 358-A1-1/99, Dresden University of Technology, Germany, 1999. Available at http://www.iee.et.tu-dresden.de/~desa/A1/Pipeline.ps.zip.
P. Desai, “An Implementation of a Multi-Rate Scheduling Testbed,” ACAPS Technical Note, School of Computer Science, McGill University, Montreal, Québec, 1993.
G. Liao, E.R. Altman, V.K. Agarwal, and G.R. Gao, “A Comparative Study of DSP Multiprocessor List Scheduling Heuristics,” in Proceedings of the International Hawaii System Science Conference, Hawaii, 1994.
G.R. Gao, Y.-B. Wong, and Q. Ning, “A Petri-Net Model for Fine-Grain Loop Scheduling,” ACAPS Technical Memo 18, School of Computer Science, McGill University, Montréal, Québec, Jan. 1991.
G.J. Chaitin, M.A. Auslander, A.K. Chandra, J. Cocke, M.E. Hopkins, and P.W. Markstein, “Register Allocation via Coloring,” Computer Languages, vol. 6, 1981, pp. 47-57.
L.J. Hendren, G.R. Gao, E.R. Altman, and C. Mukerji, “A Register Allocation Framework Based on Hierarchical Cyclic Interval Graphs,” in Proceedings of the 4th International Conference on Compiler Construction, CC '92, Paderborn, Germany, Oct. 5–7, 1992, U. Kastens and P. Pfahler, (Ed.), Lecture Notes in Computer Science, vol. 641, Berlin: Springer-Verlag pp. 176-191.
M.C. Golumbic, Algorithmic Graph Theory and Perfect Graphs, New York: Academic Press, 1980.
T. Kashiwabara, S. Masuda, K. Nakajsma, and T. Fujisawa, “Generation of Maximum Independent Sets of a Bipartite Graph and Maximum Cliques of a Circular Are Graph. Interval, Circular-Arc and Chordal Craphs,” Journal of Algorithms, vol. 13, 1992, pp. 161-1745.
J.Y.T. Leung, “Fast Algorithms for Generating All Maximum Independent Sets of Interval, Circular-Arc and Chordal Graphs,” Journal of Algorithms, vol. 5, 1984, pp. 22-35.
S.S. Bhattacharyya and E.A. Lee, “Scheduling Synchronous Dataflow Graphs for Efficient Looping,” Journal of VLSI Signal Processing, vol. 6,no. 3, 1993.
S. Ha and E.A. Lee, “Compile-Time Scheduling and Assignment of Data-Flow Program Graphs with Data-Dependent Iteration,” IEEE Transactions on Computers, Vol. 40,no. 11, 1991, pp. 1225-1238.
M. Čubrić and P. Panangaden, “Minimal Memory Schedules for Dataflow Networks,” In Proceedings of the 4th International Conference on Concurrency Theory, Hildesheim, Germany, Aug. 23–26, 1993, Eike Best (Ed.), Lecture Notes in Computer Science, vol. 715, Berlin: Springer-Verlag, pp. 368-383.
K.K. Parhi and D.G. Messerschmitt, “Static Rate-Optimal Scheduling of Iterative Data-Flow Programs via Optimum Unfolding,” IEEE Transactions on Computers, vol. 40,no. 2, 1991, pp. 178-195.
L.E. Lucke and K.K. Parhi, “Generalized ILP Scheduling and Allocation for High-Level DSP Synthesis,” in Proceedings of the 1993 IEEE Custom Integrated Circuits Conference, 1993.
L.-F. Chao and E.H.-M. Sha, “Rate-Optimal Static Scheduling for DSP Dataflow Programs,” in Proceedings of 1993 Great Lakes Symposium on VLSI, March 1993, pp. 80-84.
C.E. Leiserson, F.M. Rose, and J.B. Saxe, “Optimizing Synchronous Circuitry by Retiming,” in Proceedings of the Third Caltech Conference on VLSI, Pasadena, California, March 1983, pp. 87-116.
P.D. Hoang, “Compiling Real-Time Digital Signal Processing Applications onto Multiprocessor Systems,” Ph.D. Thesis, Memorandum no. UCB/ERL M92/68, Electronics Research Laboratory, University of California at Berkeley, 1992.
S. Ritz, M. Pankert, V. Živojnović, and H. Meyr, “Optimum Vectorization of Scalable Synchronous Dataflow Graphs,” in Proceedings of the 1993 International Conference on Application Specific Array Processors, Venice, Italy, Oct. 25–27, 1993, pp. 285-296.
C.-T. Hwang, J.-H. Lee, and Y.-C. Hsu, “A Formal Approach to the Scheduling Problem in High-Level Synthesis,” IEEE Transactions on Computer-Aided Design, vol. 10,no. 4, 1991, pp. 464-475.
V. van Dongen, G.R. Gao, and Q. Ning, “A Polynomial Time Method for Optimal Software Pipelining,” in Proceedings of the Conference on Vector and Parallel Processing, CONPAR-92, Lyon, France, Berlin: Springer-Verlag, Sept. 1–4, 1992, Lecture Notes in Computer Science, vol. 634, pp. 613-624.
J. Ramanujam, “Optimal Software Pipelining of Nested Loops,” in Proceedings of the 8th International Parallel Processing Symposium, Cancún, Mexico, April 26–29, 1994, pp. 335-342.
F. Depuydt, W. Geurts, G. Goossens, and H. De Man, “Optimal Scheduling and Software Pipelining of a Repetitive Signal Flow Graphs with Delay Line Optimization,” in Proc. of the EDDA European Design and Test Conference, Paris, France, Feb. 1994.
A.E. Eichenberger and E.S. Davidson, “Efficient Formulation for Optimal Modulo Schedulers,” in Proceedings of the ACM SIGPLAN '97 Conference on Programming Language Design and Implementation, Las Vegas, Nevada, June 15–18, 1997, pp. 194-205.
M. Potkonjak and J. Rabey, “Retiming for Scheduling,” in Proceedings of the VLSI Signal Processing Workshop, 1990.
P.-Y. Calland, A. Darte, and Y. Robert, “Circuit Retiming Applied to Decomposed Software Pipelining,” IEEE Transactions on Parallel and Distributed Systems, vol. 9,no. 1, 1998, pp. 24-35.
N. Passos, E.H.-M. Sha, and S. Bass, “Scheduled-Based Multi-Dimensional Retiming on Data Flow Graphs,” in Proceedings of the 8th International Parallel Processing Symposium, Cancún, Mexico, April 26–29, 1994, pp. 195-199.
T.C. Denk, M. Majumdar, and K.K. Parhi, “Two-Dimensional Retiming with Low Memory Requirements,” in Proceedings of ICASSP-96, the 1996 International Conference on Acoustics, Speech, and Signal Processing, Atlanta, Georgia, May 1996, pp. 3330-3333.
T.C. Denk and K.K. Parhi, “Lower Bounds on Memory Requirements for Statically Scheduled DSP Programs,” Journal of VLSI Signal Processing, vol. 12, 1996, pp. 247-264.
P.K. Murthy and S.S. Bhattacharyya, “A Buffer Merging Technique for Reducing Memory Requirements of Synchronous Dataflow Specifications,” in Proceedings of the International Symposium on Systems Synthesis, San Jose, CA, Nov. 1999.
P.K. Murthy, S.S. Bhattacharyya, and E.A. Lee, “Joint Minimization of Code and Data for Synchronous Data Flow Programs,” Journal on Formal Methods in System Design, vol. 11,no. 1, 1997, pp. 41-70.
E. Teich, J. Zitzler, and S.S. Bhattacharyya, “Buffer Memory Optimization in DSP Applications—An Evolutionary Approach,” in Proceedings of the International Conference on Parallel Problem Solving from Nature, Amsterdam, The Netherlands, Sept. 1999, pp. 885-894.
S. Ritz, M. Willems, and H. Meyr, “Scheduling for Optimum Data Memory Compaction in Block Diagram Oriented Software Synthesis,” in Proceedings of ICASSP-95, the 1995 International Conference on Acoustics, Speech, and Signal Processing, 1995.
Author information
Authors and Affiliations
Rights and permissions
About this article
Cite this article
Govindarajan, R., Gao, G.R. & Desai, P. Minimizing Buffer Requirements under Rate-Optimal Schedule in Regular Dataflow Networks. The Journal of VLSI Signal Processing-Systems for Signal, Image, and Video Technology 31, 207–229 (2002). https://doi.org/10.1023/A:1015452903532
Published:
Issue Date:
DOI: https://doi.org/10.1023/A:1015452903532