skip to main content
10.1145/1065910.1065927acmconferencesArticle/Chapter ViewAbstractPublication PagescpsweekConference Proceedingsconference-collections
Article

Cache aware optimization of stream programs

Published:15 June 2005Publication History

ABSTRACT

Effective use of the memory hierarchy is critical for achieving high performance on embedded systems. We focus on the class of streaming applications, which is increasingly prevalent in the embedded domain. We exploit the widespread parallelism and regular communication patterns in stream programs to formulate a set of cache aware optimizations that automatically improve instruction and data locality. Our work is in the context of the Synchronous Dataflow model, in which a program is described as a graph of independent actors that communicate over channels. The communication rates between actors are known at compile time, allowing the compiler to statically model the caching behavior.We present three cache aware optimizations: 1) execution scaling, which judiciously repeats actor executions to improve instruction locality, 2) cache aware fusion, which combines adjacent actors while respecting instruction cache constraints, and 3) scalar replacement, which converts certain data buffers into a sequence of scalar variables that can be register allocated. The optimizations are founded upon a simple and intuitive model that quantifies the temporal locality for a sequence of actor executions. Our implementation of cache aware optimizations in the StreamIt compiler yields a 249% average speedup (over unoptimized code) for our streaming benchmark suite on a StrongARM 1110 processor. The optimizations also yield a 154% speedup on a Pentium 3 and a 152% speedup on an Itanium 2.

References

  1. A. Benveniste, P. Caspi, P. L. Guernic, and N. Halbwachs. Data-Flow Synchronous Languages. In REX School/Symposium, pages 1--45, 1993. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. G. Berry and G. Gonthier. The Esterel Synchronous Programming Language: Design, Semantics, Implementation. Science of Computer Prog., 19(2), 1992. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. C. S. Bhattacharyya, P. K. Murthy, and E. A. Lee. APGAN and RPMC: Complementary Heuristics for Translating DSP Block Diagrams into Efficient Software Implementations. Journal of Design Automation for Embedded Systems., pages 33--60, January 1997.Google ScholarGoogle Scholar
  4. S. Bhattacharyya, P. Murthy, and E. Lee. Synthesis of embedded software from synchronous dataflow specifications. Journal of VLSI Signal Processing Systems, 21(2), June 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. S. S. Bhattacharyya, P. K. Murthy, and E. A. Lee. Software Synthesis from Dataflow Graphs. Kluwer Academic Publishers, 1996. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. I. Buck, T. Foley, D. Horn, J. Sugerman, K. Fatahalian, M. Houston, and P. Hanrahan. Brook for GPUs: Stream Computing on Graphics Hardware. In SIGGRAPH, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. R. Govindarajan, G. Gao, and P. Desai. Minimizing memory requirements in rate-optimal schedules. In Proceedings of the 1994 Int. conference on Application Specific Array Processors, pages 75--86, August 1994.Google ScholarGoogle ScholarCross RefCross Ref
  8. N. Halbwachs, P. Caspi, P. Raymond, and D. Pilaud. The synchronous data-flow programming language LUSTRE. Proc. of the IEEE, 79(9), 1991.Google ScholarGoogle ScholarCross RefCross Ref
  9. J. Gaudiot and W. Bohm and T. DeBoni and J. Feo and P. Mille. The Sisal Model of Functional Programming and its Implementation. In Proc. of the Second Aizu International Symposium on Parallel Algorithms/Architectures Synthesis, 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. M. Karczmarek, W. Thies, and S. Amarasinghe. Phased scheduling of stream programs. In LCTES, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. S. Kohli. Cache aware scheduling of synchronous dataflow programs. Master's Report Technical Memorandum UCB/URL M04/03, UC Berkeley, 2004.Google ScholarGoogle Scholar
  12. E. A. Lee. Overview of the Ptolemy Project. Technical report, Tech Memo UCB/ERL M03/25, UC Berkeley, 2003.Google ScholarGoogle Scholar
  13. E. A. Lee and D. G. Messerschmitt. Static scheduling of synchronous data flow programs for digital signal processing. IEEE Transactions on Computers, January 1987. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. R. L. Mattson, J. Gecsei, D. R. Slutz, and I. L. Traiger. Evaluation techniques for storage hierarchies. IBM Systems Journal, 1970.Google ScholarGoogle Scholar
  15. P. K. Murthy and S. S. Bhattacharyya. A Buffer Merging Technique for Reducing Memory Requirements of Synchronous Dataflow Specifications. In International Symposium on System Synthesis, 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. P. K. Murthy and S. S. Bhattacharyya. Buffer Merging --- A Powerful Technique for Reducing Memory Requirements of Synchronous Dataflow Specifications. Technical report, Inst. for Adv. Computer Studies, UMD College Park, 2000.Google ScholarGoogle Scholar
  17. T. A. Proebsting and S. A. Watterson. Filter Fusion. In POPL, 1996. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. J. Sermulins. In preparation. Master's thesis, MIT CSAIL, 2005.Google ScholarGoogle Scholar
  19. R. Stephens. A Survey of Stream Processing. Acta Informatica, 34(7), 1997.Google ScholarGoogle Scholar
  20. D. Tennenhouse and V. Bose. The SpectrumWare Approach to Wireless Signal Processing. Wireless Networks, 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. W. Thies, M. Karczmarek, and S. Amarasinghe. StreamIt: A Language for Streaming Applications. In Proc. of the Int. Conf. on Compiler Construction (CC), 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Cache aware optimization of stream programs

            Recommendations

            Comments

            Login options

            Check if you have access through your login credentials or your institution to get full access on this article.

            Sign in
            • Published in

              cover image ACM Conferences
              LCTES '05: Proceedings of the 2005 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems
              June 2005
              248 pages
              ISBN:1595930183
              DOI:10.1145/1065910
              • General Chair:
              • Yunheung Paek,
              • Program Chair:
              • Rajiv Gupta
              • cover image ACM SIGPLAN Notices
                ACM SIGPLAN Notices  Volume 40, Issue 7
                Proceedings of the 2005 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems
                July 2005
                238 pages
                ISSN:0362-1340
                EISSN:1558-1160
                DOI:10.1145/1070891
                Issue’s Table of Contents

              Copyright © 2005 ACM

              Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

              Publisher

              Association for Computing Machinery

              New York, NY, United States

              Publication History

              • Published: 15 June 2005

              Permissions

              Request permissions about this article.

              Request Permissions

              Check for updates

              Qualifiers

              • Article

              Acceptance Rates

              Overall Acceptance Rate116of438submissions,26%

            PDF Format

            View or Download as a PDF file.

            PDF

            eReader

            View online with eReader.

            eReader