ABSTRACT
As we enter the era of CMP platforms with multiple threads/cores on the die, the diversity of the workloads running simultaneously on them is expected to increase. The rapid deployment of virtualization as a means to consolidate workloads onto a single platform is a prime example of this trend. In such scenarios, the quality of service (QoS) that each individual workload gets from the platform can vary widely depending on the behavior of the simultaneously running workloads. While the number of cores assigned to each workload can be controlled, today's platforms offer no hardware or software support for controlling the allocation of platform resources such as cache space and memory bandwidth to individual workloads. In this paper, we propose a QoS-enabled memory architecture for CMP platforms that addresses this problem. The QoS-enabled memory architecture allocates more cache resources (i.e., space) and memory resources (i.e., bandwidth) to high-priority applications based on guidance from the operating environment. The architecture also allows dynamic resource reassignment at run time to further optimize the performance of the high-priority application with minimal degradation to low-priority applications. To achieve these goals, we describe the hardware/software support required in the platform as well as in the operating environment (O/S and virtual machine monitor). Our evaluation framework consists of detailed platform simulation models and a QoS-enabled version of Linux. Based on evaluation experiments, we show the effectiveness of a QoS-enabled architecture and summarize key findings and trade-offs.
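The allocation idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's mechanism: the `Workload` class, the proportional-to-priority rule, the per-workload floor, and the miss-rate threshold are all assumptions made for illustration, and the sketch covers cache space only, not memory bandwidth.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    priority: int   # hypothetical: positive, larger means more important
    ways: int = 0   # cache ways currently assigned to this workload


def partition_ways(workloads, total_ways, min_ways=1):
    """Split cache ways in proportion to priority, with a floor per workload."""
    spare = total_ways - min_ways * len(workloads)
    if spare < 0:
        raise ValueError("not enough ways for the per-workload floor")
    total_priority = sum(w.priority for w in workloads)
    for w in workloads:
        w.ways = min_ways + spare * w.priority // total_priority
    # Integer division can leave ways unassigned; give them to the
    # highest-priority workload.
    leftover = total_ways - sum(w.ways for w in workloads)
    max(workloads, key=lambda w: w.priority).ways += leftover
    return workloads


def reassign_way(high, low, high_miss_rate, threshold=0.10, min_ways=1):
    """Move one way from the low- to the high-priority workload.

    The move happens only if the high-priority miss rate exceeds the
    (assumed) threshold and the low-priority workload stays above its floor.
    """
    if high_miss_rate > threshold and low.ways > min_ways:
        low.ways -= 1
        high.ways += 1
        return True
    return False
```

With a 16-way cache and priorities 3 and 1, `partition_ways` assigns 12 ways to the high-priority workload and 4 to the low-priority one; a later call to `reassign_way` with a miss rate above the threshold shifts one more way, mimicking run-time reassignment under these assumptions.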
Index Terms
- QoS policies and architecture for cache/memory in CMP platforms
Recommendations
QoS policies and architecture for cache/memory in CMP platforms
SIGMETRICS '07 Conference Proceedings