Skip to main content
Log in

Minimizing Buffer Requirements under Rate-Optimal Schedule in Regular Dataflow Networks

  • Published:
Journal of VLSI signal processing systems for signal, image and video technology Aims and scope Submit manuscript

Abstract

Large-grain synchronous dataflow graphs or multi-rate graphs have the distinct feature that the nodes of the dataflow graph fire at different rates. Such multi-rate large-grain dataflow graphs have been widely regarded as a powerful programming model for DSP applications. In this paper we propose a method to minimize buffer storage requirement in constructing rate-optimal compile-time (MBRO) schedules for multi-rate dataflow graphs. We demonstrate that the constraints to minimize buffer storage while executing at the optimal computation rate (i.e. the maximum possible computation rate without storage constraints) can be formulated as a unified linear programming problem in our framework. A novel feature of our method is that in constructing the rate-optimal schedule, it directly minimizes the memory requirement by choosing the schedule time of nodes appropriately. Lastly, a new circular-arc interval graph coloring algorithm has been proposed to further reduce the memory requirement by allowing buffer sharing among the arcs of the multi-rate dataflow graph.

We have constructed an experimental testbed which implements our MBRO scheduling algorithm as well as (i) the widely used periodic admissible parallel schedules (also known as block schedules) proposed by Lee and Messerschmitt (IEEE Transactions on Computers, vol. 36, no. 1, 1987, pp. 24–35), (ii) the optimal scheduling buffer allocation (OSBA) algorithm of Ning and Gao (Conference Record of the Twentieth Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, Charleston, SC, Jan. 10–13, 1993, pp. 29–42), and (iii) the multi-rate software pipelining (MRSP) algorithm (Govindarajan and Gao, in Proceedings of the 1993 International Conference on Application Specific Array Processors, Venice, Italy, Oct. 25–27, 1993, pp. 77–88). Schedules generated for a number of random dataflow graphs and for a set of DSP application programs using the different scheduling methods are compared. The experimental results have demonstrated a significant improvement (10–20%) in buffer requirements for the MBRO schedules compared to the schedules generated by the other three methods, without sacrificing the computation rate. The MBRO method also gives a 20% average improvement in computation rate compared to Lee's Block scheduling method.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Similar content being viewed by others

References

  1. E. Ashford Lee and D.G. Messerschmitt, “Static Scheduling of Synchronous Data Flow Programs for Digital Signal Processing, IEEE Transactions on Computers, vol. 36,no. 1, 1987, pp. 24-35.

    Article  MATH  Google Scholar 

  2. H. Printz, “Automatic Mapping of Large Signal Processing Systems to a Parallel Machine,” Ph.D. Thesis. Published as Memorandum CMU-CS-91-101, Department of Computer Science, Carnegie-Mellon University, Pittsburgh, Pennsylvania, 1991.

    Google Scholar 

  3. S. Ritz, M. Pankert, and H. Meyr, “High Level Software Synthesis for Signal Processing Systems,” in Proceedings of the 1992 International Conference on Application Specific Array Processors, Berkeley, California, Aug. 4–7, 1992.

  4. G.R. Gao, R. Govindarajan, and P. Panangaden, “Construction Rules for Well-Behaved Stream Programs,” ACAPS Technical Memo 26, School of Computer Science, McGill University, Montréal, Québec, April 1992.

    Google Scholar 

  5. J.B. Dennis, “First Version of a Data-Flow Procedure Language,” in Proceedings of the Colloque sur la Programmation, Lecture Notes in Computer Science, vol. 19, Berlin: Springer-Verlag, 1975, pp. 362-376.

    Google Scholar 

  6. B.R. Rau and C.D. Glaeser, “Some Scheduling Techniques and an Easily Schedulable Horizontal Architecture for High Performance Scientific Computing,” in Proceedings of the 14th Annual Microprogramming Workshop, Chatham, Massachusetts, Oct. 12–15, 1981, pp. 183-198.

  7. M. Lam, “Software Pipelining: An Effective Scheduling Technique for VLIW Machines,” in Proceedings of the SIGPLAN '88 Conference on Programming Language Design and Implementation, Atlanta, Georgia, June 22–24, 1988, pp. 318-328.

  8. V.H. Allan, R.B. Jones, R.M. Lee, and S.J. Allan, “Software Pipelining,” ACM Computing Surveys, vol. 27,no. 3, 1995, pp. 367-432.

    Article  Google Scholar 

  9. R. Govindarajan and G.R. Gao, “A Novel Framework for Multi-Rate Scheduling in DSP Applications,” in Proceedings of the 1993 International Conference on Application Specific Array Processors, Venice, Italy, Oct. 25–27, 1993, pp. 77-88.

  10. R. Govindarajan and G.R. Gao, “Rate-Optimal Schedule for Multi-Rate DSP Computations,” Journal of VLSI Signal Processing, vol. 9,no. 3, 1995, pp. 211-232.

    Article  Google Scholar 

  11. L.-G. Jeng and L.-G. Chen, “Rate-Optimal DSP Synthesis by Pipeline and Minimum Unfolding,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 2,no. 1, 1994, pp. 81-88.

    Article  Google Scholar 

  12. D.B. Powell, E.A. Lee, and W.C. Newman, “Direct Synthesis of Optimized DSP Assembly Code from Signal Flow Block Diagrams,” in Proceedings of ICASSP-92, the 1992 International Conference on Acoustics, Speech, and Signal Processing, San Francisco, California, March 23–26, 1992, pp. 553-556.

  13. Q. Ning and G.R. Gao, “A Novel Framework of Register Allocation for Software Pipelining,” in Conference Record of the Twentieth Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, Charleston, SC, Jan. 10–13, 1993, pp. 29-42.

  14. G.C. Sih, “Multiprocessor Scheduling to Account for Interprocessor Communication,” Ph.D. Thesis, Memorandum no. UCB/ERL M91/29, Electronics Research Laboratory, University of California at Berkeley, 1991.

  15. E.A. Lee, “Consistency in Dataflow Graphs,” IEEE Transactions on Parallel and Distributed Systems, vol. 2, 1990, pp. 223-235.

    Article  Google Scholar 

  16. R. Reiter, “Scheduling Parallel Computations,” J. of the ACM, vol. 15,no. 4, 1968, pp. 590-599.

    Article  MathSciNet  MATH  Google Scholar 

  17. E.L. Lawler, Combinatorial Optimization: Networks and Matroids, Ft. Worth, Texas: Saunders College Publishing, 1976.

    MATH  Google Scholar 

  18. R.M. Karp, “A Characterization of the Minimum Cycle Mean in a Digraph,” Discrete Mathematics, vol. 23, 1978, pp. 309-311.

    Article  MathSciNet  MATH  Google Scholar 

  19. R. Govindarajan, E.R. Altman, and G.R. Gao, “A Framework for Resource-Constrained Rate-Optimal Software Pipelining,” IEEE Transactions on Parallel and Distributed Systems, vol. 7,no. 11, 1996, pp. 1133-1149.

    Article  Google Scholar 

  20. E.R. Altman, R. Govindarajan, and G.R. Gao, “A Unified Framework for Instruction Scheduling and Mapping for Function Units with Structural Hazards,” Journal of Parallel and Distributed Computing, vol. 49,no. 2, 1998, pp. 259-294.

    Article  MATH  Google Scholar 

  21. D. Fimmel and J. Muller, “Optimal Software Pipelining under Resource Constraints,” Technical Report SFB 358-A1-1/99, Dresden University of Technology, Germany, 1999. Available at http://www.iee.et.tu-dresden.de/~desa/A1/Pipeline.ps.zip.

    Google Scholar 

  22. P. Desai, “An Implementation of a Multi-Rate Scheduling Testbed,” ACAPS Technical Note, School of Computer Science, McGill University, Montreal, Québec, 1993.

    Google Scholar 

  23. G. Liao, E.R. Altman, V.K. Agarwal, and G.R. Gao, “A Comparative Study of DSP Multiprocessor List Scheduling Heuristics,” in Proceedings of the International Hawaii System Science Conference, Hawaii, 1994.

  24. G.R. Gao, Y.-B. Wong, and Q. Ning, “A Petri-Net Model for Fine-Grain Loop Scheduling,” ACAPS Technical Memo 18, School of Computer Science, McGill University, Montréal, Québec, Jan. 1991.

    Google Scholar 

  25. G.J. Chaitin, M.A. Auslander, A.K. Chandra, J. Cocke, M.E. Hopkins, and P.W. Markstein, “Register Allocation via Coloring,” Computer Languages, vol. 6, 1981, pp. 47-57.

    Article  Google Scholar 

  26. L.J. Hendren, G.R. Gao, E.R. Altman, and C. Mukerji, “A Register Allocation Framework Based on Hierarchical Cyclic Interval Graphs,” in Proceedings of the 4th International Conference on Compiler Construction, CC '92, Paderborn, Germany, Oct. 5–7, 1992, U. Kastens and P. Pfahler, (Ed.), Lecture Notes in Computer Science, vol. 641, Berlin: Springer-Verlag pp. 176-191.

    Chapter  Google Scholar 

  27. M.C. Golumbic, Algorithmic Graph Theory and Perfect Graphs, New York: Academic Press, 1980.

    MATH  Google Scholar 

  28. T. Kashiwabara, S. Masuda, K. Nakajsma, and T. Fujisawa, “Generation of Maximum Independent Sets of a Bipartite Graph and Maximum Cliques of a Circular Are Graph. Interval, Circular-Arc and Chordal Craphs,” Journal of Algorithms, vol. 13, 1992, pp. 161-1745.

    Article  MathSciNet  MATH  Google Scholar 

  29. J.Y.T. Leung, “Fast Algorithms for Generating All Maximum Independent Sets of Interval, Circular-Arc and Chordal Graphs,” Journal of Algorithms, vol. 5, 1984, pp. 22-35.

    Article  MathSciNet  MATH  Google Scholar 

  30. S.S. Bhattacharyya and E.A. Lee, “Scheduling Synchronous Dataflow Graphs for Efficient Looping,” Journal of VLSI Signal Processing, vol. 6,no. 3, 1993.

  31. S. Ha and E.A. Lee, “Compile-Time Scheduling and Assignment of Data-Flow Program Graphs with Data-Dependent Iteration,” IEEE Transactions on Computers, Vol. 40,no. 11, 1991, pp. 1225-1238.

    Article  Google Scholar 

  32. M. Čubrić and P. Panangaden, “Minimal Memory Schedules for Dataflow Networks,” In Proceedings of the 4th International Conference on Concurrency Theory, Hildesheim, Germany, Aug. 23–26, 1993, Eike Best (Ed.), Lecture Notes in Computer Science, vol. 715, Berlin: Springer-Verlag, pp. 368-383.

    Google Scholar 

  33. K.K. Parhi and D.G. Messerschmitt, “Static Rate-Optimal Scheduling of Iterative Data-Flow Programs via Optimum Unfolding,” IEEE Transactions on Computers, vol. 40,no. 2, 1991, pp. 178-195.

    Article  Google Scholar 

  34. L.E. Lucke and K.K. Parhi, “Generalized ILP Scheduling and Allocation for High-Level DSP Synthesis,” in Proceedings of the 1993 IEEE Custom Integrated Circuits Conference, 1993.

  35. L.-F. Chao and E.H.-M. Sha, “Rate-Optimal Static Scheduling for DSP Dataflow Programs,” in Proceedings of 1993 Great Lakes Symposium on VLSI, March 1993, pp. 80-84.

  36. C.E. Leiserson, F.M. Rose, and J.B. Saxe, “Optimizing Synchronous Circuitry by Retiming,” in Proceedings of the Third Caltech Conference on VLSI, Pasadena, California, March 1983, pp. 87-116.

  37. P.D. Hoang, “Compiling Real-Time Digital Signal Processing Applications onto Multiprocessor Systems,” Ph.D. Thesis, Memorandum no. UCB/ERL M92/68, Electronics Research Laboratory, University of California at Berkeley, 1992.

  38. S. Ritz, M. Pankert, V. Živojnović, and H. Meyr, “Optimum Vectorization of Scalable Synchronous Dataflow Graphs,” in Proceedings of the 1993 International Conference on Application Specific Array Processors, Venice, Italy, Oct. 25–27, 1993, pp. 285-296.

  39. C.-T. Hwang, J.-H. Lee, and Y.-C. Hsu, “A Formal Approach to the Scheduling Problem in High-Level Synthesis,” IEEE Transactions on Computer-Aided Design, vol. 10,no. 4, 1991, pp. 464-475.

    Article  Google Scholar 

  40. V. van Dongen, G.R. Gao, and Q. Ning, “A Polynomial Time Method for Optimal Software Pipelining,” in Proceedings of the Conference on Vector and Parallel Processing, CONPAR-92, Lyon, France, Berlin: Springer-Verlag, Sept. 1–4, 1992, Lecture Notes in Computer Science, vol. 634, pp. 613-624.

    Chapter  Google Scholar 

  41. J. Ramanujam, “Optimal Software Pipelining of Nested Loops,” in Proceedings of the 8th International Parallel Processing Symposium, Cancún, Mexico, April 26–29, 1994, pp. 335-342.

  42. F. Depuydt, W. Geurts, G. Goossens, and H. De Man, “Optimal Scheduling and Software Pipelining of a Repetitive Signal Flow Graphs with Delay Line Optimization,” in Proc. of the EDDA European Design and Test Conference, Paris, France, Feb. 1994.

  43. A.E. Eichenberger and E.S. Davidson, “Efficient Formulation for Optimal Modulo Schedulers,” in Proceedings of the ACM SIGPLAN '97 Conference on Programming Language Design and Implementation, Las Vegas, Nevada, June 15–18, 1997, pp. 194-205.

  44. M. Potkonjak and J. Rabey, “Retiming for Scheduling,” in Proceedings of the VLSI Signal Processing Workshop, 1990.

  45. P.-Y. Calland, A. Darte, and Y. Robert, “Circuit Retiming Applied to Decomposed Software Pipelining,” IEEE Transactions on Parallel and Distributed Systems, vol. 9,no. 1, 1998, pp. 24-35.

    Article  Google Scholar 

  46. N. Passos, E.H.-M. Sha, and S. Bass, “Scheduled-Based Multi-Dimensional Retiming on Data Flow Graphs,” in Proceedings of the 8th International Parallel Processing Symposium, Cancún, Mexico, April 26–29, 1994, pp. 195-199.

  47. T.C. Denk, M. Majumdar, and K.K. Parhi, “Two-Dimensional Retiming with Low Memory Requirements,” in Proceedings of ICASSP-96, the 1996 International Conference on Acoustics, Speech, and Signal Processing, Atlanta, Georgia, May 1996, pp. 3330-3333.

  48. T.C. Denk and K.K. Parhi, “Lower Bounds on Memory Requirements for Statically Scheduled DSP Programs,” Journal of VLSI Signal Processing, vol. 12, 1996, pp. 247-264.

    Article  MATH  Google Scholar 

  49. P.K. Murthy and S.S. Bhattacharyya, “A Buffer Merging Technique for Reducing Memory Requirements of Synchronous Dataflow Specifications,” in Proceedings of the International Symposium on Systems Synthesis, San Jose, CA, Nov. 1999.

  50. P.K. Murthy, S.S. Bhattacharyya, and E.A. Lee, “Joint Minimization of Code and Data for Synchronous Data Flow Programs,” Journal on Formal Methods in System Design, vol. 11,no. 1, 1997, pp. 41-70.

    Article  Google Scholar 

  51. E. Teich, J. Zitzler, and S.S. Bhattacharyya, “Buffer Memory Optimization in DSP Applications—An Evolutionary Approach,” in Proceedings of the International Conference on Parallel Problem Solving from Nature, Amsterdam, The Netherlands, Sept. 1999, pp. 885-894.

  52. S. Ritz, M. Willems, and H. Meyr, “Scheduling for Optimum Data Memory Compaction in Block Diagram Oriented Software Synthesis,” in Proceedings of ICASSP-95, the 1995 International Conference on Acoustics, Speech, and Signal Processing, 1995.

Download references

Author information

Authors and Affiliations

Authors

Rights and permissions

Reprints and permissions

About this article

Cite this article

Govindarajan, R., Gao, G.R. & Desai, P. Minimizing Buffer Requirements under Rate-Optimal Schedule in Regular Dataflow Networks. The Journal of VLSI Signal Processing-Systems for Signal, Image, and Video Technology 31, 207–229 (2002). https://doi.org/10.1023/A:1015452903532

Download citation

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1023/A:1015452903532

Navigation