ABSTRACT
Code positioning is a well-known compiler optimization aiming at the improvement of the instruction cache behavior. A contiguous mapping of code fragments in memory avoids overlapping of cache sets and thus decreases the number of cache conflict misses.
We present a novel cache-aware code positioning optimization driven by worst-case execution time (WCET) information. For this purpose, we introduce a formal cache model based on a conflict graph which is able to capture a broad class of cache architectures. This cache model is combined with a formal WCET timing model, resulting in a cache conflict graph weighted with WCET data. This conflict graph is then exploited by heuristics for code positioning of both basic blocks and entire functions.
Code positioning is able to decrease the accumulated cache misses for a total of 18 real-life benchmarks by 15.5% on average for an automotive processor featuring a 2-way set-associative cache. These cache miss reductions translate to average WCET reductions by 6.1%. For direct-mapped caches, even larger savings of 18.8% (cache misses) and 9.0% (WCET) were achieved.
- AbsInt Angewandte Informatik GmbH. aiT: Worst-Case Execution Time Analyzers. http://www.absint.com/ait, July 2011.Google Scholar
- H. Falk and J. C. Kleinsorge. Optimal Static WCET-aware Scratchpad Allocation of Program Code. In Proceedings of DAC, July 2009. Google ScholarDigital Library
- H. Falk and P. Lokuciejewski. A compiler framework for the reduction of worst-case execution times. Real-Time Systems, 46(2), Oct. 2010. Google ScholarDigital Library
- H. Falk, S. Plazar, and H. Theiling. Compile-Time Decided Instruction Cache Locking Using Worst-Case Execution Paths. In Proceedings of CODES+ISSS, Oct. 2007. Google ScholarDigital Library
- C. Ferdinand, F. Martin, and R. Wilhelm. Applying Compiler Techniques to Cache Behavior Prediction. In Proceedings of LCT-RTS, June 1997.Google Scholar
- C. Ferdinand and R. Wilhelm. Efficient and Precise Cache Behavior Prediction for Real-Time Systems. Real-Time Systems, 17(2), Nov. 1999. Google ScholarDigital Library
- G. Gebhard and S. Altmeyer. Optimal Task Placement to Improve Cache Performance. In Proceedings of EMSOFT, Sept. 2007. Google ScholarDigital Library
- C. Guillon, F. Rastello, T. Bidault, et al. Procedure Placement using Temporal-Ordering Information: dealing with Code Size Expansion. In Proceedings of CASES, Sept. 2004. Google ScholarDigital Library
- Informatik Centrum Dortmund e. V. ICD-C Compiler framework. http://www.icd.de/es/icd-c, July 2011.Google Scholar
- C. Lee, M. Potkonjak, and W. H. Mangione-Smith. MediaBench: A Tool for Evaluating and Synthesizing Multimedia and Communications Systems. In Proceedings of MICRO, Washington DC, Dec. 1997. Google ScholarDigital Library
- E. A. Lee. Absolutely Positively On Time: What Would It Take? Embedded Systems Column, IEEE Computer, July 2005. Google ScholarDigital Library
- Y. Liang and T. Mitra. Improved Procedure Placement for Set Associative Caches. In Proceedings of CASES, Oct. 2010. Google ScholarDigital Library
- T. Liu, M. Li, and C. J. Xue. Minimizing WCET for Real-Time Embedded Systems via Static Instruction Cache Locking. In Proceedings of RTAS, Apr. 2009. Google ScholarDigital Library
- P. Lokuciejewski, H. Falk, and P. Marwedel. WCET-driven Cache-based Procedure Positioning Optimizations. In Proceedings of ECRTS, July 2008. Google ScholarDigital Library
- Mälardalen WCET Research Group. WCET Benchmarks. http://www.mrtc.mdh.se/projects/wcet/benchmarks.html, July 2011.Google Scholar
- S. Plazar, P. Lokuciejewski, and P. Marwedel. WCET-aware Software Based Cache Partitioning for Multi-Task Real-Time Systems. In Proceedings of WCET, June 2009.Google Scholar
- S. Plazar, P. Lokuciejewski, and P. Marwedel. WCET-driven Cache-aware Memory Content Selection. In Proceedings of ISORC, May 2010. Google ScholarDigital Library
- I. Puaut and C. Pais. Scratchpad memories vs locked caches in hard real-time systems: a quantitative comparison. In Proceedings of DATE, Apr. 2007. Google ScholarDigital Library
- J. Reineke, D. Grund, C. Berg, et al. Timing Predictability of Cache Replacement Policies. Real-Time Systems, 37(2), Nov. 2007. Google ScholarDigital Library
- S. Steinke, L. Wehmeyer, B.-S. Lee, et al. Assigning Program and Data Objects to Scratchpad for Energy Reduction. In Proceedings of DATE, Mar. 2002. Google ScholarDigital Library
- V. Suhendra, T. Mitra, A. Roychoudhury, et al. WCET Centric Data Allocation to Scratchpad Memory. In Proceedings of RTSS, Dec. 2005. Google ScholarDigital Library
- Synopsys, Inc. http://www.synopsys.com, July 2011.Google Scholar
- UTDSP Benchmark Suite. http://www.eecg.toronto.edu/~corinna/DSP/infrastructure/UTDSP.html, July 2011.Google Scholar
- X. Vera, B. Lisper, and J. Xue. Data Cache Locking for Higher Program Predictability. In Proceedings of SIGMETRICS, June 2003. Google ScholarDigital Library
- M. Verma, L. Wehmeyer, and P. Marwedel. Cache-Aware Scratchpad Allocation Algorithm. In Proceedings of DATE, Feb. 2004. Google ScholarDigital Library
- V. Živojnović, J. M. Velarde, C. Schläger, et al. DSPstone: A DSP-Oriented Benchmarking Methodology. In Proceedings of ICSPAT, Dallas, USA, Oct. 1994.Google Scholar
- WCET-aware Compilation. http://ls12-www.cs.tu-dortmund.de/research/activities/wcc, July 2011.Google Scholar
- W. Zhao, D. Whalley, C. Healy, et al. Improving WCET by Applying a WC Code-Positioning Optimization. ACM Transactions on Architecture and Code Optimization, 2(4), Dec. 2005. Google ScholarDigital Library
Index Terms
- WCET-driven cache-aware code positioning
Recommendations
A new cache replacement algorithm for last-level caches by exploiting tag-distance correlation of cache lines
Cache memory plays a crucial role in determining the performance of processors, especially for embedded processors where area and power are tightly constrained. It is necessary to have effective management mechanisms, such as cache replacement policies, ...
WCET-Driven Cache-Aware Memory Content Selection
ISORC '10: Proceedings of the 2010 13th IEEE International Symposium on Object/Component/Service-Oriented Real-Time Distributed ComputingCaches are widely used to bridge the increasingly growing gap between processor and memory performance. They store copies of frequently used parts of the slow main memory for faster access. Static analysis techniques allow the estimation of the worst ...
LDAC: Locality-Aware Data Access Control for Large-Scale Multicore Cache Hierarchies
The trend of increasing the number of cores to achieve higher performance has challenged efficient management of on-chip data. Moreover, many emerging applications process massive amounts of data with varying degrees of locality. Therefore, exploiting ...
Comments