article

Array recovery and high-level transformations for DSP applications

Authors:
Björn Franke

University of Edinburgh, UK

University of Edinburgh, UK
View Profile

,
Michael O'boyle

University of Edinburgh, UK

University of Edinburgh, UK
View Profile

Authors Info & Claims

ACM Transactions on Embedded Computing Systems Volume 2 Issue 2pp 132–162https://doi.org/10.1145/643470.643472

Published:01 May 2003Publication History

ACM Transactions on Embedded Computing Systems

Abstract

Efficient implementation of DSP applications is critical for many embedded systems. Optimizing compilers for application programs, written in C, largely focus on code generation and scheduling, which, with their growing maturity, are providing diminishing returns. As DSP applications typically make extensive use of pointer arithmetic, the alternative use of high-level, source-to-source, transformations has been largely ignored. This article develops an array recovery technique that automatically converts pointers to arrays, enabling the empirical evaluation of high-level transformations. High-level techniques were applied to the DSPstone benchmarks on three platforms: TriMedia TM-1000, Texas Instruments TMS320C6201, and the Analog Devices SHARC ADSP-21160. On average, the best transformation gave a factor of 2.43 improvement across the platforms. In certain cases, a speedup of 5.48 was found for the SHARC, 7.38 for the TM-1, and 2.3 for the C6201. These preliminary results justify pointer to array conversion and further investigation into the use of high-level techniques for embedded compilers.

References

Allen, R. and Johnson, S. 1988. Compiling C for vectorization, parallelization, and inline expansion, In Proceedings of the SIGPLAN '88 Conference of Programming Languages Design and Implementation (Atlanta, GA, June 1988).]] Google ScholarDigital Library
Analog Devices Inc. 1999. ADSP-21160 SHARC DSP Hardware Reference. http://www.analog.com.]]Google Scholar
Andreyev, A. L., Balyakhov, D. I., and Russakov, V. P. 1996. The technique of high-level optimization of DSP algorithms implementation, International Conference on Signal Processing Applications & Technology (ICSPAT), 1996.]]Google Scholar
de Araujo, G. C. S. 1997. Code generation algorithms for digital signal processor. Dissertation, Princeton University, Dep. Electrical Engineering.]] Google ScholarDigital Library
Bacon, D. F., Graham, S. L., and Sharp, O. J. 1994. Compiler transformations for high-performance computing. ACM Comput. Surv. 26, 4, (1994), 345--420.]] Google ScholarDigital Library
Barreteau, M. et al. 1998. OCEANS optimising compilers for embedded applications, In Proceedings Euro-Par 98, LNCS 1470 (1998), 1123--1130.]] Google ScholarDigital Library
Bhattacharyya, S. S., Leupers, R., and Marwedel, P. 2000. Software synthesis and code generation for signal processing systems. IEEE Trans. Circuits Syst. II: Analog and Digital Signal Processing, 47, 9.]]Google Scholar
Bodin, F., Chamski, Z., Eisenbeis, C., Rohou, E., and Seznec, A. 1998. GCDS: A compiler strategy for trading code size against performance in embedded applications. INRIA, Tech. Rep. RR-3346.]]Google Scholar
Callahan, D., Cooper K. D., Kennedy, K., and Torczon, L. 1986. Interprocedural constant propagation. In Proceedings of the SIGPLAN Symposium on Compiler Construction (1986), 152--161.]] Google ScholarDigital Library
Clauss, P. 1996. Counting solutions to linear and nonlinear constraints through ehrhart polynomials: applications to analyze and transform scientific programs. In Proceedings of the 1996 International Conference on Supercomputing (Philadelphia, PA, May 1996), ACM Press, 278--285.]] Google ScholarDigital Library
Duesterwald, E., Gupta, R., and Soffa, M. 1993. A practical data flow framework for array reference analysis and its use in optimizations. In Proceedings of the SIGPLAN Conference on Programming Languages Design and Implementation, (Albuquerque, NM, 1993), 67--77.]] Google ScholarDigital Library
Franke, B. and O'Boyle, M. 2001. Compiler transformation of pointers to explicit array accesses in DSP applications. In Proceedings of ETAPS CC 2001 (Genova, Italy, 2001).]] Google ScholarDigital Library
Frederiksen, A., Christiansen, R., Bier, J., and Koch, P. 2000. An evaluation of compiler-processor interaction for DSP applications. In Proceedings of the 34th IEEE Asilomar Conference on Signals, Systems, and Computers (Pacific Grove, CA, Oct. 2000).]]Google Scholar
Grove,D. and Torczon, L. 1993. Interprocedural constant propagation: A study of jump function implementations. In Proceedings of the ACM SIGPLAN '93 Conference on Programming Language Design and Implementation (1993), 90--99.]] Google ScholarDigital Library
Gupta, S., Miranda, M., Catthoor, F., and Gupta, R. 2000. Analysis of high-level address code transformations for programmable processors. In Proceedings of Design and Test in Europe Conference (DATE, 2000).]] Google ScholarDigital Library
Kandemir, M., Vijaykrishnan, N., Irwin, M. J., and Kim, H. S. 2000. Experimental evaluation of energy behavior of iteration space tiling. In Proceedings of the International Workshop on Languages and Compilers for Parallel Computing (Yorktown Heights, NY, Aug. 2000).]] Google ScholarDigital Library
Kandemir, M., Ramanujam, J., and Choudhary, A. 1999. Improving cache locality by a combination of loop and data transformations. IEEE Trans. Comput. 48, 2 (Feb. 1999), 159--167.]] Google ScholarDigital Library
Kisuki, T., Knijnenburg, P. M. W., and O'Boyle, M. F. P. 2000. Combined selection of tile sizes and unroll factors using iterative compilation. In Proceedings of PACT 2000, Parallel Architectures and Compiler Technology (Oct. 2000), IEEE Press, 237--248.]] Google ScholarDigital Library
Liem, C., Paulin, P., and Jerraya, A. 1996. Address calculation for retargetable compilation and exploration of instruction set architecture. In Proceedings of the 33rd ACM Design Automation Conference (DAC, Las Vegas, NV, 1996).]] Google ScholarDigital Library
Leupers, R. 1998. Novel code optimzation techniques for DSPs. In Proceedings of the 2nd European DSP Education and Research Conference (Paris, 1998).]]Google Scholar
Maydan, D. E., Hennessy, J. L., and Lam, M. S. 1995. Effectiveness of data dependence analysis, Int. J. Parallel Program. 23, 1, (1995), 63--81.]] Google ScholarDigital Library
Morgan, R. 1998. Building an Optimizing Compiler. Butterworth-Heinemann, Boston, MA.]] Google ScholarDigital Library
Numerix-DSP Digital Signal Processing Web Site, http://www.numerix-dsp.com/c_coding.pdf, 2000.]]Google Scholar
O'Boyle M. F. P. and Knijnenburg, P. M. W. 1998. Integrating loop and data transformations for global optimisation. In Proceedings of PACT '98, Parallel Architectures and Compiler Technology (Oct. 1998), IEEE Press, 12--19.]] Google ScholarDigital Library
Philips Semiconductors 1999. TriMedia TM-1300 http://www.semiconductors.philips.com, 1999.]]Google Scholar
Pugh, W. 1994. Counting solutions to presburger formulas: How and why. In Proceedings of the SIGPLAN Conference on Programming Languages Design and Implementation, ACM Press, 1994, 121--134.]] Google ScholarDigital Library
Sagiv, M., Reps, T., and Horwitz, S. 1995. Precise interprocedural dataflow analysis with applications to constant propagation. In Proceedings of the TAPSOFT'95 Conference (Arhus, Denmark), LNCS, Springer-Verlag, 1995, 49--61.]] Google ScholarDigital Library
Sair, S., Kaeli, D. K., and Meleis, W. 1998. A study of loop unrolling for VLIW-based DSP processors. In Proceedings of the 1998 IEEE Workshop on Signal Processing Systems (SiPS '98, Oct. 1998), 519--527.]]Google Scholar
Song, Y. and Lin, Y. 2000. Unroll-and-jam for imperfectly-nested loops in DSP applications. In Proceedings of the ACM International Conference on Compilers, Architectures, Synthesis for Embedded Systems (Nov. 2000).]] Google ScholarDigital Library
Su, B., Wang, J., and Esguerra, A. 1999. Source-level loop optimization for DSP code generation. In Proceedings of the 1999 IEEE International Conference on Acoustic, Speech and Signal Processing (ICASSP '99, Phoenix, AZ, 1999), 2155--2158.]] Google ScholarDigital Library
van Swaaij, M. F. X. B., Franssen, F. H. M., Catthoor, F. V. M., and De Man, H. J. 1992. Automatimg high level control flow transformations for DSP memory management. In Proceedings of the IEEE Workshop on VLSI Signal Processing, (Napa Valley, CA, Oct. 1992). Also in VLSI Signal Processing V, K. Yao, R. Jain, W. Przytula (Eds.), IEEE Press, New York, 397--406.]]Google ScholarCross Ref
Timmer, A., Strik, M., van Meerbergen, J., and Jess, J. 1995. Conflict modelling and instruction scheduling in code generation for in-house DSP cores. In Proceedings of the 1995 Design Automation Conference (1995), 593--598.]] Google ScholarDigital Library
Texas Instruments 2001. TMS320C6201B Digital Signal Processor. http://www.ti.com.]]Google Scholar
van Engelen, R. and Gallivan, K. 2001. An efficient algorithm for pointer-to-array access conversion for compiling and optimizing DSP applications. In Proceedings of the International Workshop on Innovative Architecture (IWIA 2001, Maui, 2001).]]Google Scholar
Wang, J. and Su, B. 1998. Software pipelining of nested loops for real-time DSP applications, In Proceedings of the 1998 IEEE International Conference on Acoustic, Speech and Signal Processing (ICASSP '98, Seattle, WA, 1998).]]Google Scholar
Zivojnovic, V., Velarde, J. M., Schlager, C., and Meyr, H. 1994. DSPstone: A DSP-oriented benchmarking methodology, In Proceedings of Signal Processing Applications & Technology Conference (Dallas, TX, 1994).]]Google Scholar

Index Terms

Array recovery and high-level transformations for DSP applications
1. Software and its engineering
  1. Software notations and tools
    1. Compilers

Recommendations

Register coalescing techniques for heterogeneous register architecture with copy sifting

Optimistic coalescing has been proven as an elegant and effective technique that provides better chances of safely coloring more registers in register allocation than other coalescing techniques. Its algorithm originally assumes homogeneous registers, ...
Read More
Optimistic coalescing for heterogeneous register architectures
LCTES '07: Proceedings of the 2007 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems

In this paper, Optimistic coalescing has been proven as an elegant and effective technique that provides better chances of safely coloring more registers in register allocation than other coalescing techniques. Its algorithm originally assumes ...
Read More
A retargetable register allocation framework for embedded processors
LCTES '04

This paper describes the FlexCC2 register allocation framework. FlexCC2 is an optimizing retargetable C compiler for embedded processors, and in particular for DSP processors. Embedded processors often contain features such as irregular and constrained ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

Published in

ACM Transactions on Embedded Computing Systems Volume 2, Issue 2
May 2003
120 pages
ISSN:1539-9087
EISSN:1558-3465
DOI:10.1145/643470
Issue’s Table of Contents

Copyright © 2003 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States

Journal Family
ACM Journals for the Design of Smart and Connected Systems
Publication History
- Published: 1 May 2003
Published in tecs Volume 2, Issue 2

Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
Pointer conversion
dataflow graphs
embedded processors
high-level transformations
Qualifiers
- article
Conference
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 36
  Total Citations
  View Citations
- 1,225
  Total Downloads
- Downloads (Last 12 months)7
- Downloads (Last 6 weeks)0
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Array recovery and high-level transformations for DSP applications

ACM Transactions on Embedded Computing Systems

Abstract

References

Cited By

Index Terms

Recommendations

Register coalescing techniques for heterogeneous register architecture with copy sifting

Optimistic coalescing for heterogeneous register architectures

A retargetable register allocation framework for embedded processors

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Journal Family

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Funding Sources

Other Metrics

Article Metrics

Other Metrics

Cited By

PDF Format

eReader

Digital Edition

Caption

Array recovery and high-level transformations for DSP applications

ACM Transactions on Embedded Computing Systems

Abstract

References

Cited By

Index Terms

Recommendations

Register coalescing techniques for heterogeneous register architecture with copy sifting

Optimistic coalescing for heterogeneous register architectures

A retargetable register allocation framework for embedded processors

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Journal Family

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Funding Sources

Article Metrics

Other Metrics

PDF Format

eReader

Digital Edition

Share this Publication link

Share on Social Media