DOI: 10.1145/379240.379268
Article

Cache decay: exploiting generational behavior to reduce cache leakage power

Published: 01 May 2001

ABSTRACT

Power dissipation is increasingly important in CPUs ranging from those intended for mobile use, all the way up to high-performance processors for high-end servers. While the bulk of the power dissipated is dynamic switching power, leakage power is also beginning to be a concern. Chipmakers expect that in future chip generations, leakage's proportion of total chip power will increase significantly.

This paper examines methods for reducing leakage power within the cache memories of the CPU. Because caches comprise much of a CPU chip's area and transistor count, they are reasonable targets for attacking leakage. We discuss policies and implementations for reducing cache leakage by invalidating and “turning off” cache lines when they hold data not likely to be reused. In particular, our approach is targeted at the generational nature of cache line usage. That is, cache lines typically have a flurry of frequent use when first brought into the cache, and then have a period of “dead time” before they are evicted. By devising effective, low-power ways of deducing dead time, our results show that in many cases we can reduce L1 cache leakage energy by 4x in SPEC2000 applications without impacting performance. Because our decay-based techniques have notions of competitive on-line algorithms at their roots, their energy usage can be theoretically bounded to within a factor of two of the optimal oracle-based policy. We also examine adaptive decay-based policies that make energy-minimizing policy choices on a per-application basis by choosing appropriate decay intervals individually for each cache line. Our proposed adaptive policies effectively reduce L1 cache leakage energy by 5x for the SPEC2000 applications with only negligible degradation in performance.
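The factor-of-two bound mentioned above can be illustrated with a small sketch. This is not the paper's model or simulator; it is a simplified, ski-rental style accounting under assumptions that are not in the abstract: each idle gap of a cache line is costed independently, leakage is a constant `leak_per_cycle` while the line is powered, and a fixed `off_penalty` (standing in for the energy of the extra miss a switch-off may induce) is charged whenever a line is turned off. All names and numbers are hypothetical.

```python
def decay_cost(gap, decay_interval, leak_per_cycle, off_penalty):
    """Energy spent on one idle gap under a fixed decay interval."""
    if gap <= decay_interval:
        # The line is touched again before it decays: it leaks for the whole gap.
        return gap * leak_per_cycle
    # The line leaks until the decay interval expires, then is switched off.
    return decay_interval * leak_per_cycle + off_penalty


def oracle_cost(gap, leak_per_cycle, off_penalty):
    """Energy of a policy that knows the gap length in advance (assumed oracle)."""
    # Either keep the line on for the whole gap or switch it off immediately.
    return min(gap * leak_per_cycle, off_penalty)


def total_costs(gaps, leak_per_cycle=1.0, off_penalty=1000.0):
    """Return (decay_energy, oracle_energy) summed over a list of idle gaps."""
    # Assumed choice: decay when accumulated leakage equals the switch-off penalty.
    decay_interval = off_penalty / leak_per_cycle
    decay = sum(
        decay_cost(g, decay_interval, leak_per_cycle, off_penalty) for g in gaps
    )
    oracle = sum(oracle_cost(g, leak_per_cycle, off_penalty) for g in gaps)
    return decay, oracle


if __name__ == "__main__":
    # Hypothetical idle gaps, in cycles.
    decay, oracle = total_costs([10, 500, 1000, 1001, 50000])
    print(f"decay={decay}, oracle={oracle}, ratio={decay / oracle:.2f}")
```

Under these assumptions, short gaps cost the same as the oracle, and long gaps cost at most twice the oracle's penalty, which is where the factor of two comes from. A different decay interval or a cost model that treats dead time separately would change the numbers.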


Published in

ISCA '01: Proceedings of the 28th annual international symposium on Computer architecture
June 2001
289 pages
ISBN: 0769511627
DOI: 10.1145/379240

ACM SIGARCH Computer Architecture News, Volume 29, Issue 2
Special Issue: Proceedings of the 28th annual international symposium on Computer architecture (ISCA '01)
May 2001
262 pages
ISSN: 0163-5964
DOI: 10.1145/384285

Copyright © 2001 Authors

Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

ISCA '01 paper acceptance rate: 24 of 163 submissions, 15%
Overall acceptance rate: 543 of 3,203 submissions, 17%
