Speculative software management of datapath-width for energy optimization

Published: 11 June 2004
Abstract

This paper evaluates managing the processor's datapath-width at the compiler level by exploiting dynamic narrow-width operands. We capitalize on the frequent occurrence of these operands in multimedia programs to build static narrow-width regions that can be exposed directly to the compiler. We propose augmenting the ISA with instructions that expose the datapath and register widths to the compiler. Simple exception management allows this exposure to remain speculative. In this way, the software can speculatively run a program on a narrower datapath-width in order to save energy. For this purpose, we introduce a novel register file organization, the byte-slice register file, which allows the width of the register file to be reconfigured dynamically, providing both static and dynamic energy savings. We show that combining the byte-slice register file with clock-gating the datapath on a per-region basis saves up to 17% of the datapath dynamic energy and reduces the register file static energy by 22%.
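
To make the idea of speculative narrow-width execution concrete, here is a minimal software sketch. It is not the mechanism proposed in the paper: the names `significant_bytes`, `narrow_add`, `run_region`, and `WidthOverflow` are hypothetical, the 32-bit two's-complement word is an assumption, and re-running the whole region at full width is only one possible recovery policy. The sketch assumes all operands and results fit in the machine word.

```python
class WidthOverflow(Exception):
    """Hypothetical exception: a result did not fit the speculated width."""


def significant_bytes(value: int, word_bytes: int = 4) -> int:
    """Smallest number of bytes that holds `value` in two's complement."""
    for n in range(1, word_bytes + 1):
        lo = -(1 << (8 * n - 1))
        hi = (1 << (8 * n - 1)) - 1
        if lo <= value <= hi:
            return n
    raise ValueError("value does not fit in the machine word")


def narrow_add(a: int, b: int, width_bytes: int) -> int:
    """Add at a speculated width; raise if the result needs more bytes."""
    result = a + b
    if significant_bytes(result) > width_bytes:
        raise WidthOverflow(result)
    return result


def run_region(pairs, width_bytes: int, full_width: int = 4):
    """Run a region at `width_bytes`; on a miss, re-run it at full width.

    Returns the results and the width the region finally executed at.
    """
    try:
        return [narrow_add(a, b, width_bytes) for a, b in pairs], width_bytes
    except WidthOverflow:
        # Assumed recovery policy: discard the narrow attempt and redo the region.
        return [narrow_add(a, b, full_width) for a, b in pairs], full_width
```

A region whose values all fit in one byte completes at the narrow width, while a region that produces a wider result triggers the exception path and falls back to the full word. In hardware, the narrow case is where unused byte slices could be gated off.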

    • Published in

      ACM SIGPLAN Notices, Volume 39, Issue 7 (LCTES '04)
      July 2004, 265 pages
      ISSN: 0362-1340
      EISSN: 1558-1160
      DOI: 10.1145/998300

      LCTES '04: Proceedings of the 2004 ACM SIGPLAN/SIGBED Conference on Languages, Compilers, and Tools for Embedded Systems
      June 2004, 276 pages
      ISBN: 1581138067
      DOI: 10.1145/997163

      Copyright © 2004 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States
