skip to main content
article

Array recovery and high-level transformations for DSP applications

Published:01 May 2003Publication History
Skip Abstract Section

Abstract

Efficient implementation of DSP applications is critical for many embedded systems. Optimizing compilers for application programs, written in C, largely focus on code generation and scheduling, which, with their growing maturity, are providing diminishing returns. As DSP applications typically make extensive use of pointer arithmetic, the alternative use of high-level, source-to-source, transformations has been largely ignored. This article develops an array recovery technique that automatically converts pointers to arrays, enabling the empirical evaluation of high-level transformations. High-level techniques were applied to the DSPstone benchmarks on three platforms: TriMedia TM-1000, Texas Instruments TMS320C6201, and the Analog Devices SHARC ADSP-21160. On average, the best transformation gave a factor of 2.43 improvement across the platforms. In certain cases, a speedup of 5.48 was found for the SHARC, 7.38 for the TM-1, and 2.3 for the C6201. These preliminary results justify pointer to array conversion and further investigation into the use of high-level techniques for embedded compilers.

References

  1. Allen, R. and Johnson, S. 1988. Compiling C for vectorization, parallelization, and inline expansion, In Proceedings of the SIGPLAN '88 Conference of Programming Languages Design and Implementation (Atlanta, GA, June 1988).]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. Analog Devices Inc. 1999. ADSP-21160 SHARC DSP Hardware Reference. http://www.analog.com.]]Google ScholarGoogle Scholar
  3. Andreyev, A. L., Balyakhov, D. I., and Russakov, V. P. 1996. The technique of high-level optimization of DSP algorithms implementation, International Conference on Signal Processing Applications & Technology (ICSPAT), 1996.]]Google ScholarGoogle Scholar
  4. de Araujo, G. C. S. 1997. Code generation algorithms for digital signal processor. Dissertation, Princeton University, Dep. Electrical Engineering.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. Bacon, D. F., Graham, S. L., and Sharp, O. J. 1994. Compiler transformations for high-performance computing. ACM Comput. Surv. 26, 4, (1994), 345--420.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. Barreteau, M. et al. 1998. OCEANS optimising compilers for embedded applications, In Proceedings Euro-Par 98, LNCS 1470 (1998), 1123--1130.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. Bhattacharyya, S. S., Leupers, R., and Marwedel, P. 2000. Software synthesis and code generation for signal processing systems. IEEE Trans. Circuits Syst. II: Analog and Digital Signal Processing, 47, 9.]]Google ScholarGoogle Scholar
  8. Bodin, F., Chamski, Z., Eisenbeis, C., Rohou, E., and Seznec, A. 1998. GCDS: A compiler strategy for trading code size against performance in embedded applications. INRIA, Tech. Rep. RR-3346.]]Google ScholarGoogle Scholar
  9. Callahan, D., Cooper K. D., Kennedy, K., and Torczon, L. 1986. Interprocedural constant propagation. In Proceedings of the SIGPLAN Symposium on Compiler Construction (1986), 152--161.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Clauss, P. 1996. Counting solutions to linear and nonlinear constraints through ehrhart polynomials: applications to analyze and transform scientific programs. In Proceedings of the 1996 International Conference on Supercomputing (Philadelphia, PA, May 1996), ACM Press, 278--285.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Duesterwald, E., Gupta, R., and Soffa, M. 1993. A practical data flow framework for array reference analysis and its use in optimizations. In Proceedings of the SIGPLAN Conference on Programming Languages Design and Implementation, (Albuquerque, NM, 1993), 67--77.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Franke, B. and O'Boyle, M. 2001. Compiler transformation of pointers to explicit array accesses in DSP applications. In Proceedings of ETAPS CC 2001 (Genova, Italy, 2001).]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Frederiksen, A., Christiansen, R., Bier, J., and Koch, P. 2000. An evaluation of compiler-processor interaction for DSP applications. In Proceedings of the 34th IEEE Asilomar Conference on Signals, Systems, and Computers (Pacific Grove, CA, Oct. 2000).]]Google ScholarGoogle Scholar
  14. Grove,D. and Torczon, L. 1993. Interprocedural constant propagation: A study of jump function implementations. In Proceedings of the ACM SIGPLAN '93 Conference on Programming Language Design and Implementation (1993), 90--99.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. Gupta, S., Miranda, M., Catthoor, F., and Gupta, R. 2000. Analysis of high-level address code transformations for programmable processors. In Proceedings of Design and Test in Europe Conference (DATE, 2000).]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. Kandemir, M., Vijaykrishnan, N., Irwin, M. J., and Kim, H. S. 2000. Experimental evaluation of energy behavior of iteration space tiling. In Proceedings of the International Workshop on Languages and Compilers for Parallel Computing (Yorktown Heights, NY, Aug. 2000).]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. Kandemir, M., Ramanujam, J., and Choudhary, A. 1999. Improving cache locality by a combination of loop and data transformations. IEEE Trans. Comput. 48, 2 (Feb. 1999), 159--167.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Kisuki, T., Knijnenburg, P. M. W., and O'Boyle, M. F. P. 2000. Combined selection of tile sizes and unroll factors using iterative compilation. In Proceedings of PACT 2000, Parallel Architectures and Compiler Technology (Oct. 2000), IEEE Press, 237--248.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Liem, C., Paulin, P., and Jerraya, A. 1996. Address calculation for retargetable compilation and exploration of instruction set architecture. In Proceedings of the 33rd ACM Design Automation Conference (DAC, Las Vegas, NV, 1996).]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Leupers, R. 1998. Novel code optimzation techniques for DSPs. In Proceedings of the 2nd European DSP Education and Research Conference (Paris, 1998).]]Google ScholarGoogle Scholar
  21. Maydan, D. E., Hennessy, J. L., and Lam, M. S. 1995. Effectiveness of data dependence analysis, Int. J. Parallel Program. 23, 1, (1995), 63--81.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Morgan, R. 1998. Building an Optimizing Compiler. Butterworth-Heinemann, Boston, MA.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. Numerix-DSP Digital Signal Processing Web Site, http://www.numerix-dsp.com/c_coding.pdf, 2000.]]Google ScholarGoogle Scholar
  24. O'Boyle M. F. P. and Knijnenburg, P. M. W. 1998. Integrating loop and data transformations for global optimisation. In Proceedings of PACT '98, Parallel Architectures and Compiler Technology (Oct. 1998), IEEE Press, 12--19.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. Philips Semiconductors 1999. TriMedia TM-1300 http://www.semiconductors.philips.com, 1999.]]Google ScholarGoogle Scholar
  26. Pugh, W. 1994. Counting solutions to presburger formulas: How and why. In Proceedings of the SIGPLAN Conference on Programming Languages Design and Implementation, ACM Press, 1994, 121--134.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. Sagiv, M., Reps, T., and Horwitz, S. 1995. Precise interprocedural dataflow analysis with applications to constant propagation. In Proceedings of the TAPSOFT'95 Conference (Arhus, Denmark), LNCS, Springer-Verlag, 1995, 49--61.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. Sair, S., Kaeli, D. K., and Meleis, W. 1998. A study of loop unrolling for VLIW-based DSP processors. In Proceedings of the 1998 IEEE Workshop on Signal Processing Systems (SiPS '98, Oct. 1998), 519--527.]]Google ScholarGoogle Scholar
  29. Song, Y. and Lin, Y. 2000. Unroll-and-jam for imperfectly-nested loops in DSP applications. In Proceedings of the ACM International Conference on Compilers, Architectures, Synthesis for Embedded Systems (Nov. 2000).]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. Su, B., Wang, J., and Esguerra, A. 1999. Source-level loop optimization for DSP code generation. In Proceedings of the 1999 IEEE International Conference on Acoustic, Speech and Signal Processing (ICASSP '99, Phoenix, AZ, 1999), 2155--2158.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. van Swaaij, M. F. X. B., Franssen, F. H. M., Catthoor, F. V. M., and De Man, H. J. 1992. Automatimg high level control flow transformations for DSP memory management. In Proceedings of the IEEE Workshop on VLSI Signal Processing, (Napa Valley, CA, Oct. 1992). Also in VLSI Signal Processing V, K. Yao, R. Jain, W. Przytula (Eds.), IEEE Press, New York, 397--406.]]Google ScholarGoogle ScholarCross RefCross Ref
  32. Timmer, A., Strik, M., van Meerbergen, J., and Jess, J. 1995. Conflict modelling and instruction scheduling in code generation for in-house DSP cores. In Proceedings of the 1995 Design Automation Conference (1995), 593--598.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. Texas Instruments 2001. TMS320C6201B Digital Signal Processor. http://www.ti.com.]]Google ScholarGoogle Scholar
  34. van Engelen, R. and Gallivan, K. 2001. An efficient algorithm for pointer-to-array access conversion for compiling and optimizing DSP applications. In Proceedings of the International Workshop on Innovative Architecture (IWIA 2001, Maui, 2001).]]Google ScholarGoogle Scholar
  35. Wang, J. and Su, B. 1998. Software pipelining of nested loops for real-time DSP applications, In Proceedings of the 1998 IEEE International Conference on Acoustic, Speech and Signal Processing (ICASSP '98, Seattle, WA, 1998).]]Google ScholarGoogle Scholar
  36. Zivojnovic, V., Velarde, J. M., Schlager, C., and Meyr, H. 1994. DSPstone: A DSP-oriented benchmarking methodology, In Proceedings of Signal Processing Applications & Technology Conference (Dallas, TX, 1994).]]Google ScholarGoogle Scholar

Index Terms

  1. Array recovery and high-level transformations for DSP applications

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in

    Full Access

    • Published in

      cover image ACM Transactions on Embedded Computing Systems
      ACM Transactions on Embedded Computing Systems  Volume 2, Issue 2
      May 2003
      120 pages
      ISSN:1539-9087
      EISSN:1558-3465
      DOI:10.1145/643470
      Issue’s Table of Contents

      Copyright © 2003 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 1 May 2003
      Published in tecs Volume 2, Issue 2

      Permissions

      Request permissions about this article.

      Request Permissions

      Check for updates

      Qualifiers

      • article

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader