ABSTRACT
Asymmetric chip multiprocessors are imminent in the multi-core era, primarily due to their potential for power-performance efficiency. For software to fully realize this potential, the scheduling of threads to cores must be automated so that it adapts to changing program behavior. However, strict system abstraction layers limit the controllability and observability of low-level hardware details, so state-of-the-art systems rely on manual or static mapping of threads to cores in an asymmetric multi-core. In this paper, we propose a self-adaptive scheduler that exploits program behavior at runtime by matching the computational demands of threads to the capabilities of cores. We present a novel empirical model that predicts the appropriate core (optimizing for throughput, power, or performance per watt) for the changing program phases within threads. Thread migration is initiated when an optimal mapping of threads to cores is predicted. Results show that our predictive schedulers for the three target optimizations are within 10% of the ideal scheduler.
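As an illustration only, the core-selection step described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the paper's empirical model: the `Prediction` record, the core names, the per-phase IPC and power figures, and the functions `score`, `select_core`, and `schedule` are all hypothetical. It shows how one of the three objectives (throughput, power, or performance per watt) could pick a core for the current phase, and how migration would be requested only when the selected core differs from the current one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prediction:
    """Predicted behaviour of one program phase on one core type (hypothetical)."""
    ipc: float    # predicted instructions per cycle
    watts: float  # predicted average power in watts


def score(pred: Prediction, objective: str) -> float:
    """Return a value where higher is better for the chosen objective."""
    if objective == "throughput":
        return pred.ipc
    if objective == "power":
        return -pred.watts  # lower power is better, so negate it
    if objective == "perf_per_watt":
        return pred.ipc / pred.watts
    raise ValueError(f"unknown objective: {objective}")


def select_core(predictions: dict, objective: str) -> str:
    """Pick the core type whose predicted behaviour scores highest."""
    return max(predictions, key=lambda core: score(predictions[core], objective))


def schedule(current_core: str, predictions: dict, objective: str):
    """Return (chosen core, whether a migration is needed) for the current phase."""
    best = select_core(predictions, objective)
    return best, best != current_core


# Assumed, made-up predictions for a single program phase.
phase = {
    "big": Prediction(ipc=2.0, watts=10.0),
    "small": Prediction(ipc=1.0, watts=2.0),
}
print(schedule("big", phase, "perf_per_watt"))
```

In a real system the predictions would come from runtime measurements and the migration cost would also have to be weighed; neither is modeled in this sketch.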
Index Terms
- A self-adaptive scheduler for asymmetric multi-cores