Memory-constrained Block Processing for DSP Software Optimization

Journal of Signal Processing Systems

Abstract

Digital signal processing (DSP) applications involve processing long streams of input data, and it is important to take this form of processing into account when implementing embedded software for DSP systems. Task-level vectorization, or block processing, is a useful dataflow graph transformation that can significantly improve execution performance by allowing subsequences of data items to be processed through individual task invocations. Its benefits include reduced context-switch overhead, increased memory locality, improved utilization of processor pipelines, and use of more efficient DSP-oriented addressing modes. On the other hand, block processing generally increases memory requirements, since it effectively increases the sizes of the input and output values associated with processing tasks. In this paper, we investigate the memory-performance trade-off associated with block processing. We develop novel block processing algorithms that carefully take memory constraints into account to achieve efficient block processing configurations within given memory space limitations. Our experimental results indicate that these methods derive optimal memory-constrained block processing solutions most of the time. We demonstrate the advantages of our block processing techniques on practical kernel functions and applications in the DSP domain.
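
As a rough illustration of the trade-off described above, the following Python sketch contrasts per-sample and per-block task invocation, and picks a block size under a memory limit. It is not the paper's algorithm: the names `run_blocked`, `scale`, and `pick_block_size`, and the linear memory model (buffer tokens proportional to block size), are assumptions made only for this example.

```python
from typing import Callable, List, Tuple


def run_blocked(
    task: Callable[[List[int]], List[int]],
    samples: List[int],
    block_size: int,
) -> Tuple[List[int], int]:
    """Invoke `task` once per block of samples; return (outputs, invocation count)."""
    outputs: List[int] = []
    invocations = 0
    for start in range(0, len(samples), block_size):
        block = samples[start:start + block_size]  # last block may be shorter
        outputs.extend(task(block))
        invocations += 1
    return outputs, invocations


def scale(block: List[int]) -> List[int]:
    """Toy processing task (hypothetical): double every sample in the block."""
    return [2 * x for x in block]


def pick_block_size(memory_limit: int, tokens_per_item: int, max_block: int) -> int:
    """Largest block size whose buffers fit the limit, under an assumed linear model.

    Assumption: buffering a block of size b costs b * tokens_per_item tokens.
    """
    best = min(max_block, memory_limit // tokens_per_item)
    if best < 1:
        raise ValueError("memory limit too small for even one data item")
    return best


samples = list(range(8))
per_sample_out, per_sample_calls = run_blocked(scale, samples, 1)
blocked_out, blocked_calls = run_blocked(scale, samples, 4)
chosen = pick_block_size(memory_limit=10, tokens_per_item=2, max_block=8)
```

With eight samples, a block size of 4 produces the same outputs as per-sample processing with two task invocations instead of eight, while the buffers must hold four items at a time rather than one. Under the assumed model, a limit of 10 tokens at 2 tokens per item caps the block size at 5.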


Author information

Corresponding author

Correspondence to Ming-Yung Ko.

About this article

Cite this article

Ko, MY., Shen, CC. & Bhattacharyya, S.S. Memory-constrained Block Processing for DSP Software Optimization. J Sign Process Syst Sign Image 50, 163–177 (2008). https://doi.org/10.1007/s11265-007-0114-1

  • DOI: https://doi.org/10.1007/s11265-007-0114-1
