skip to main content
10.1145/1837274.1837362acmconferencesArticle/Chapter ViewAbstractPublication PagesdacConference Proceedingsconference-collections
research-article

Instruction cache locking using temporal reuse profile

Published:13 June 2010Publication History

ABSTRACT

The performance of most embedded systems is critically dependent on the average memory access latency. Improving the cache hit rate can have significant positive impact on the performance of an application. Modern embedded processors often feature cache locking mechanisms that allow memory blocks to be locked in the cache under software control. Cache locking was primarily designed to offer timing predictability for hard real-time applications. Hence, the compiler optimization techniques focus on employing cache locking to improve worst-case execution time. However, cache locking can be quite effective in improving the average-case execution time of general embedded applications as well. In this paper, we explore static instruction cache locking to improve average-case program performance. We introduce temporal reuse profile to accurately and efficiently model the cost and benefit of locking memory blocks in the cache. We propose an optimal algorithm and a heuristic approach that use the temporal reuse profile to determine the most beneficial memory blocks to be locked in the cache. Experimental results show that locking heuristic achieves close to optimal results and can improve the cache miss rate by up to 24% across a suite of real-world benchmarks. Moreover, our heuristic provides significant improvement compared to the state-of-the-art locking algorithm both in terms of performance and efficiency.

References

  1. 3rd Generation Intel Xscale Microarchitecture Developers's Manual. Intel, May 2007. http://www.intel.com/design/intelxscale.Google ScholarGoogle Scholar
  2. ADSP-BF533 Processor Hardware Reference. Analog Devices, April 2009. http://www.analog.com/static/imported-files/processor_manuals/bf533_hwr_Rev3.4.pdf.Google ScholarGoogle Scholar
  3. ARM Cortex A-8 Technical Reference Manual. ARM, Revised March 2004. http://www.arm.com/products/CPUs/families/ARMCortexFamily.html.Google ScholarGoogle Scholar
  4. ARM1156T2-S Technical Reference Manual. ARM, Revised July 2007. http://www.arm.com/products/CPUs/families/ARM11Family.html.Google ScholarGoogle Scholar
  5. K. Anand and R. Barua. Instruction cache locking inside a binary rewriter. In CASES, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. T. Austin et al. Simplescalar: An infrastructure for computer system modeling. Computer, 35(2):59--67, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. K. Beyls and E. H. D'Hollander. Reuse distance as a metric for cache behavior. In Proceedings of the IASTED International Conference on Parallel and Distributed Computing and Systems, 2001.Google ScholarGoogle Scholar
  8. B. Bryan and J. K. Hollingsworth. An API for runtime code patching. Int. J. High Perform. Comput. Appl., 14(4), 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. C. Ding and Y. T. Zhong. Predicting whole-program locality through reuse distance analysis. SIGPLAN Not., 38(5), 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. H. Falk et al. Compile-time decided instruction cache locking using worst-case execution paths. In CODES+ISSS, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. N. Gloy and M. D. Smith. Procedure placement using temporal-ordering information. ACM Trans. Program. Lang. Syst., 21(5):977--1027, 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. I. Puaut and D. Decotigny. Low-complexity algorithms for static cache locking in multitasking hard real-time systems. In RTSS, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. V. Suhendra and T. Mitra. Exploring locking & partitioning for predictable shared caches on multi-cores. In DAC, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. X. Vera et al. Data cache locking for higher program predictability. In SIGMETRICS, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. H. Yang et al. Improving power efficiency with compiler-assisted cache replacement. J. Embedded Comput., 1(4):487--499, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Instruction cache locking using temporal reuse profile

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in
    • Published in

      cover image ACM Conferences
      DAC '10: Proceedings of the 47th Design Automation Conference
      June 2010
      1036 pages
      ISBN:9781450300025
      DOI:10.1145/1837274

      Copyright © 2010 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 13 June 2010

      Permissions

      Request permissions about this article.

      Request Permissions

      Check for updates

      Qualifiers

      • research-article

      Acceptance Rates

      Overall Acceptance Rate1,770of5,499submissions,32%

      Upcoming Conference

      DAC '24
      61st ACM/IEEE Design Automation Conference
      June 23 - 27, 2024
      San Francisco , CA , USA

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader