Abstract
Digital signal processing (DSP) applications involve processing long streams of input data. It is important to take this form of processing into account when implementing embedded software for DSP systems. Task-level vectorization, or block processing, is a useful dataflow graph transformation that can significantly improve execution performance by allowing subsequences of data items to be processed through individual task invocations. In this way, several benefits can be obtained, including reduced context-switch overhead, increased memory locality, improved utilization of processor pipelines, and use of more efficient DSP-oriented addressing modes. On the other hand, block processing generally results in increased memory requirements, since it effectively increases the sizes of the input and output values associated with processing tasks. In this paper, we investigate the memory-performance trade-off associated with block processing. We develop novel block processing algorithms that carefully take memory constraints into account to achieve efficient block processing configurations within given memory space limitations. Our experimental results indicate that these methods derive optimal memory-constrained block processing solutions most of the time. We demonstrate the advantages of our block processing techniques on practical kernel functions and applications in the DSP domain.
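The trade-off described in the abstract can be illustrated with a toy cost model. The sketch below is not the algorithm developed in the paper; it assumes a simple chain of tasks that all share one block size, a fixed per-invocation overhead, and an output buffer that grows linearly with the block size. The names `Task`, `chain_cost`, and `best_block_size`, and all cycle counts, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Task:
    name: str
    tokens_out: int  # data items written to the output buffer per input item
    overhead: int    # assumed fixed cost per invocation (e.g. context switch), in cycles
    work: int        # assumed cost per data item, in cycles


def chain_cost(tasks: List[Task], n_items: int, block: int) -> Tuple[int, int]:
    """Return (cycles, buffer_items) for processing n_items with a given block size."""
    invocations = -(-n_items // block)  # ceiling division
    # The per-item work is unchanged; only the invocation overhead is amortized.
    cycles = sum(invocations * t.overhead + n_items * t.work for t in tasks)
    # Each task's output buffer must hold one full block of results.
    buffer_items = sum(t.tokens_out * block for t in tasks)
    return cycles, buffer_items


def best_block_size(
    tasks: List[Task], n_items: int, mem_limit: int
) -> Optional[Tuple[int, int, int]]:
    """Exhaustively pick the block size with the fewest cycles within mem_limit.

    Returns (block, cycles, buffer_items), or None if even block size 1 does not fit.
    """
    best = None
    for block in range(1, n_items + 1):
        cycles, buffer_items = chain_cost(tasks, n_items, block)
        if buffer_items > mem_limit:
            break  # memory grows monotonically with block size in this model
        if best is None or cycles < best[1]:
            best = (block, cycles, buffer_items)
    return best
```

For example, with two hypothetical tasks `Task("fir", 1, 50, 4)` and `Task("gain", 1, 50, 1)`, 1024 input items, and a budget of 64 buffer items, this model selects a block size of 32: the overhead falls from 102400 cycles (block size 1) to 3200 cycles, while the buffers grow from 2 to 64 items. A realistic formulation would allow different block sizes per task and would account for buffer sharing, which this sketch ignores.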
Cite this article
Ko, MY., Shen, CC. & Bhattacharyya, S.S. Memory-constrained Block Processing for DSP Software Optimization. J Sign Process Syst Sign Image 50, 163–177 (2008). https://doi.org/10.1007/s11265-007-0114-1
DOI: https://doi.org/10.1007/s11265-007-0114-1