skip to main content
10.1145/781027.781062acmconferencesArticle/Chapter ViewAbstractPublication PagesmetricsConference Proceedingsconference-collections
Article

Data cache locking for higher program predictability

Authors Info & Claims
Published:10 June 2003Publication History

ABSTRACT

Caches have become increasingly important with the widening gap between main memory and processor speeds. However, they are a source of unpredictability due to their characteristics, resulting in programs behaving in a different way than expected.Cache locking mechanisms adapt caches to the needs of real-time systems. Locking the cache is a solution that trades performance for predictability: at a cost of generally lower performance, the time of accessing the memory becomes predictable.This paper combines compile-time cache analysis with data cache locking to estimate the worst-case memory performance (WCMP) in a safe, tight and fast way. In order to get predictable cache behavior, we first lock the cache for those parts of the code where the static analysis fails. To minimize the performance degradation, our method loads the cache, if necessary, with data likely to be accessed.Experimental results show that this scheme is fully predictable, without compromising the performance of the transformed program. When compared to an algorithm that assumes compulsory misses when the state of the cache is unknown, our approach eliminates all overestimation for the set of benchmarks, giving an exact WCMP of the transformed program without any significant decrease in performance.

References

  1. M. Alt, C. Ferdinand, F. Martin, and R. Wilhelm. Cache behaviour prediction by abstract interpretation. In Proceedings of Static Analysis Symposium (SAS'96), Lecture Notes in Computer Science (LNCS) 1145, pages 52--66. Springer-Verlag, September 1996.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. R. Arnold, F. Mueller, D. Whalley, and M. Harmon. Bounding worst-case instruction cache performance. In Proceedings of 15th Real-Time Systems Symposium (RTSS'94), pages 172--181, 1994.]]Google ScholarGoogle ScholarCross RefCross Ref
  3. M. Campoy, A. P. Ivars, and J. V. Busquets-Mataix. Static use of locking caches in multitask preemptive real-time systems. In Proceedings of IEEE/IEE Real-Time Embedded Systems Workshop (Satellite of the IEEE Real-Time Systems Symposium), 2001.]]Google ScholarGoogle Scholar
  4. S. Carr and K. Kennedy. Compiler blockability of numerical algorithms. In Proceedings of Supercomputing (SC'92), pages 114--124, Nov. 1992.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. S. Chatterjee, V. V. Jain, A. R. Lebeck, S. Mundhra, and M. Thottethodi. Nonlinear array layout for hierarchical memory systems. In Proceedings of ACM International Conference on Supercomputing (ICS'99), pages 444--453, Rhodes, Greece, Jun. 1999.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. S. Chatterjee, E. Parker, P. J. Hanlon, and A. R. Lebeck. Exact analysis of the cache behavior of nested loops. In ACM SIGPLAN '01 Conference on Programming Language Design and Implementation (PLDI'01), pages 286--297, 2001.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. S. Coleman and K. S. McKinley. Tile size selection using cache organization and data layout. In Proceedings of ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI'95), pages 279--290, Jun. 1995.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. J. Engblom and A. Ermedhal. Modeling complex flows for worst-case execution time analysis. In Proceedings of the IEEE Real-Time Systems Symposium (RTSS 2000), Dec. 2000.]]Google ScholarGoogle ScholarCross RefCross Ref
  9. C. Ferdinand and R. Wilhelm. Efficient and precise cache behavior prediction for real-time systems. Real-Time Systems, 17:131--181, 1999.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. B. B. Fraguela, R. Doallo, and E. L. Zapata. Automatic analytical modeling for the estimation of cache misses. In Proceedings of International Conference on Parallel Architectures and Compilation Techniques (PACT'99), 1999.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. D. Gannon, W. Jalby, and K. Gallivan. Strategies for cache and local memory management by global program transformations. Journal of Parallel and Distributed Computing, 5:587--616, 1988.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. S. Ghosh, M. Martonosi, and S. Malik. Cache miss equations: a compiler framework for analyzing and tuning memory behavior. ACM Transactions on Programming Languages and Systems, 21(4):703--746, 1999.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. C. Healey, D. Whalley, and M. Harmon. Integrating the timing analysis of pipelining and instruction caching. In Proceedings of 16th Real-Time Systems Symposium (RTSS'95), pages 288--297, 1995.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. M. Hill. DineroIII: a uniprocessor cache simulator (http://www.cs.wisc.edu/ larus/warts.html).]]Google ScholarGoogle Scholar
  15. IBM Microelectronics Division. The PowerPC 440 core, 1999.]]Google ScholarGoogle Scholar
  16. M. Kandemir, A. Choudhary, P. Banerjee, and J. Ramanujam. A linear algebra framework for automatic determination of optimal data layouts. IEEE Transactions on Parallel and Distributed Systems, 10(2):115--135, Feb. 1999.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. S. Kim, S. Min, and R. Ha. Efficient worst case timing analysis of data caching. In Proceedings of IEEE Real-Time Technology and Applications Symposium (RTAS'96), 1996.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. D. Kirk. SMART (strategic memory allocation for real-time) cache design. In Proceedings of 10th Real-Time Systems Symposium (RTSS'89), Dec. 1989.]]Google ScholarGoogle ScholarCross RefCross Ref
  19. M. Lam, E. E. Rothberg, and M. E. Wolf. The cache performance of blocked algorithms. In Proceedings of IV International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS'91), Apr. 1991.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Y. Li, S. Malik, and A. Wolfe. Efficient microarchitecture modeling and path analysis for real-time software. In Proceedings of 16th Real-Time Systems Symposium (RTSS'95), pages 298--307, 1995.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. Y. Li, S. Malik, and A. Wolfe. Cache modeling and path analysis for real-time software. In Proceedings of 17th Real-Time Systems Symposium (RTSS'96), 1996.]]Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. S. Lim, Y. Bae, G. Jang, B. Rhee, S. Min, C. Park, H. Shin, K. Park, and C. Kim. An accurate worst case timing analysis technique for RISC processors. In Proceedings of 15th Real-Time Systems Symposium (RTSS'94), pages 97--108, 1994.]]Google ScholarGoogle Scholar
  23. T. Lundqvist and P. Stenström. A method to improve the estimated worst-case performance of data caching. In Proceedings of the 6th International Conference on Real-Time Computing Systems and Applications (RTCSA'99), pages 255--262, Dec. 1999.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. T. Lundqvist and P. Stenström. Timing anomalies in dynamically scheduled microprocessors. In Proceedings of 20th Real-Time Systems Symposium (RTSS'99), Dec. 1999.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. Motorola Inc. PowerPC 604e RISC Microprocessor Technical Summary, 1996.]]Google ScholarGoogle Scholar
  26. T. Mowry, M. Lam, and A. Gupta. Design and evaluation of a compiler algorithm for prefetching. In Proceedings of V Int. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS'92), pages 62--73, Oct. 1992.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. G. Rivera and C.-W. Tseng. Data transformations for eliminating conflict misses. In Proceedings of ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI'98), pages 38--49, 1998.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. F. Sánchez, A. González, and M. Valero. Static locality analysis for cache management. In Proceedings of International Conference on Parallel Architectures and Compilation Techniques (PACT'97), November 1997.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. Sun Microelectronics. microSPARC-IIep User's Manual, 1997.]]Google ScholarGoogle Scholar
  30. O. Temam, E. Granston, and W. Jalby. To copy or not to copy: A compile-time technique for accessing when data copying should be used to eliminate cache conflicts. In Proceedings of Supercomputing (SC'93), pages 410--419, 1993.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. X. Vera, J. Llosa, A. González, and N. Bermudo. A fast and accurate approach to analyze cache memory behavior. In Proceedings of European Conference on Parallel Computing (Europar'00), 2000.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. X. Vera and J. Xue. Let's study whole program cache behaviour analytically. In Proceedings of International Symposium on High-Performance Computer Architecture (HPCA 8), Cambridge, Feb. 2002.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. R. T. White, F. Müller, C. Healy, D. Whalley, and M. Harmon. Timing analysis for data caches and set-associative caches. In Proceedings of Third IEEE Real-Time Technology and Applications Symposium (RTAS'97), pages 192--202, 1997.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. M. Wolf and M. Lam. A data locality optimizing algorithm. In Proceedings of ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI`91), pages 30--44, Jun. 1991.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. J. Xue and X. Vera. Efficient and accurate analytical modeling of whole-program data cache behavior. To Appear in IEEE Transactions of Computers]] Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Data cache locking for higher program predictability

      Recommendations

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in
      • Published in

        cover image ACM Conferences
        SIGMETRICS '03: Proceedings of the 2003 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
        June 2003
        338 pages
        ISBN:1581136641
        DOI:10.1145/781027
        • cover image ACM SIGMETRICS Performance Evaluation Review
          ACM SIGMETRICS Performance Evaluation Review  Volume 31, Issue 1
          June 2003
          325 pages
          ISSN:0163-5999
          DOI:10.1145/885651
          Issue’s Table of Contents

        Copyright © 2003 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 10 June 2003

        Permissions

        Request permissions about this article.

        Request Permissions

        Check for updates

        Qualifiers

        • Article

        Acceptance Rates

        SIGMETRICS '03 Paper Acceptance Rate26of222submissions,12%Overall Acceptance Rate459of2,691submissions,17%

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader