ABSTRACT
Chip multiprocessors (CMPs), which execute multiple workloads simultaneously on multiple cores, have recently become key to achieving high-performance processing. Various shared-resource management mechanisms have been proposed to improve CMP performance. In particular, cache partitioning is highly effective at avoiding resource conflicts in a shared cache. Because most cache partitioning methods must predict how each workload's cache access characteristics change when the partition moves, an accurate prediction model is essential to cache partitioning.
In this paper, we first analyze the cache access locality of various applications using stack distance profiling. We find that stack distance distributions tend to obey Zipf's law. To achieve effective cache partitioning, we then propose a model based on Zipf's law that predicts changes in the stack distance distributions. Using the model, we also show the validity of a measure, proposed in our previous work, that quantifies how much cache capacity a workload demands.
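As a rough illustration of the two ideas named above, the following minimal sketch profiles LRU stack distances for an address trace and fits a Zipf-like power law to the resulting histogram. It is not the model proposed in the paper, whose details the abstract does not give. The function names (`stack_distances`, `distance_histogram`, `fit_zipf`), the 1-based definition of stack distance, and the assumed form count(d) ≈ c / d^alpha with a log-log least-squares fit are all assumptions made for illustration.

```python
import math
from collections import Counter


def stack_distances(trace):
    """Return the LRU stack distance of each access (None for a first touch).

    Assumption: stack distance is the 1-based depth of the address in an
    LRU stack, so an immediate re-reference has distance 1.
    """
    stack = []  # most recently used address sits at index 0
    distances = []
    for addr in trace:
        if addr in stack:
            depth = stack.index(addr) + 1
            stack.remove(addr)
            distances.append(depth)
        else:
            distances.append(None)  # cold miss: no finite stack distance
        stack.insert(0, addr)  # the accessed address becomes most recent
    return distances


def distance_histogram(distances):
    """Count how many accesses fall at each finite stack distance."""
    return Counter(d for d in distances if d is not None)


def fit_zipf(hist):
    """Fit log(count) = log(c) - alpha * log(d) by ordinary least squares.

    Returns (alpha, c) for the hypothetical model count(d) ~ c / d**alpha.
    """
    points = [(math.log(d), math.log(n)) for d, n in hist.items() if n > 0]
    if len(points) < 2:
        raise ValueError("need at least two distinct distances to fit")
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    alpha = -slope
    c = math.exp(mean_y + alpha * mean_x)  # intercept of the fitted line
    return alpha, c
```

Under these assumptions, `fit_zipf(distance_histogram(stack_distances(trace)))` gives a single exponent summarizing how quickly hit counts fall off with stack depth. The list-based LRU stack costs time linear in the stack size per access, which is acceptable for a sketch but too slow for long traces.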
Index Terms
- Modeling of cache access behavior based on Zipf's law