ABSTRACT
Dictionary code compression is a technique in which long instructions in memory are replaced with shorter code words that serve as indices into a table used to look up the original instructions. We present a new view of dictionary code compression for moderately high-performance processors for embedded applications. Previous work on dictionary code compression has shown decent performance and energy savings, which we verify with our own measurements that are more thorough than those previously published. We also augment previous work with a more thorough analysis of the effects of cache and line size changes. In addition, we introduce the concept of aggregated profiling, which allows two or more programs to share the same dictionary contents. Finally, we introduce dynamic dictionaries, where the dictionary contents are considered part of the context of a process, and show that the performance overhead of reloading the dictionary contents on a context switch is negligible, while at the same time considerable energy can be saved with more specialized dictionary contents.
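As an illustration of the basic idea, the following minimal Python sketch builds one shared dictionary from the aggregated profiles of two programs and replaces dictionary hits with short code words. It is not the microarchitecture evaluated in the paper: the function names, the `("cw", index)` / `("raw", word)` encoding, the example instruction words, and the use of static occurrence counts as a stand-in for execution profiles are all assumptions made for this example.

```python
from collections import Counter

def build_dictionary(profiles, size):
    """Sum the per-program instruction counts (aggregated profiling) and
    keep the `size` most frequent instruction words as the dictionary."""
    total = Counter()
    for profile in profiles:
        total.update(profile)
    return [instr for instr, _ in total.most_common(size)]

def compress(program, dictionary):
    """Replace each instruction found in the dictionary with a code word
    (its table index); leave all other instructions uncompressed."""
    index = {instr: i for i, instr in enumerate(dictionary)}
    return [("cw", index[instr]) if instr in index else ("raw", instr)
            for instr in program]

def decompress(stream, dictionary):
    """Look code words up in the table to recover the original instructions."""
    return [dictionary[value] if kind == "cw" else value
            for kind, value in stream]

# Hypothetical 32-bit instruction words for two small programs.
prog_a = [0x8C820000, 0x24420001, 0xAC820000, 0x24420001]
prog_b = [0x24420001, 0x03E00008, 0x03E00008]

# Assumption: static occurrence counts stand in for execution profiles.
shared = build_dictionary([Counter(prog_a), Counter(prog_b)], size=2)

packed_a = compress(prog_a, shared)
packed_b = compress(prog_b, shared)
```

Under the same assumptions, a dynamic dictionary would correspond to calling `build_dictionary` with a single program's profile and swapping the resulting table in whenever that process is scheduled, rather than sharing one table across programs.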