ABSTRACT
This poster introduces SLAW, a Scalable Locality-aware Adaptive Work-stealing scheduler. The SLAW features an adaptive task scheduling algorithm combined with a locality-aware scheduling framework.
Past work has demonstrated the pros and cons of using fixed scheduling policies, such as work-first and help-first, in different cases without a clear winner. Prior work also assumes the availability and successful execution of a serial version of the parallel program. This assumption can limit the expressiveness of dynamic task parallel languages.
The SLAW scheduler supports both work-first and help-first policies simultaneously. It does so by using an adaptive approach that selects a scheduling policy on a per-task basis at runtime. The SLAW scheduler also establishes bounds on the stack usage and the heap space needed to store tasks. The experimental results for the benchmarks studied show that SLAW's adaptive scheduler achieves 0.98x - 9.2$x speedup over the help-first scheduler and 0.97x - 4.5x speedup over the work-first scheduler for 64-thread executions, thereby establishing the robustness of using an adaptive approach instead of a fixed policy. In contrast, the help-first policy is 9.2x slower than work-first in the worst case for a fixed help-first policy, and the work-first policy is 3.7x slower than help-first in the worst case for a fixed work-first policy. Further, for large irregular recursive parallel computations, the adaptive scheduler runs with bounded stack usage and achieves performance (and supports data sizes) that cannot be delivered by the use of any single fixed policy.
The SLAW scheduler is designed for programming models where locality hints are provided to the runtime by the programmer or compiler, and achieves locality-awareness by grouping workers into places. Locality awareness can lead to improved performance by increasing temporal data reuse within a worker and among workers in the same place. Our experimental results show that locality-aware scheduling can achieve up to 2.6x speedup over locality-oblivious scheduling, for the benchmarks studied.
- P. Charles et al. X10: An object-oriented approach to non-uniform cluster computing. OOPSLA 2005 Onward! Track. Google ScholarDigital Library
- G. Cong et al. Solving Large, Irregular Graph Problems Using Adaptive Work-Stealing. ICPP 2008. Google ScholarDigital Library
- M. Frigo, C. E. Leiserson, and K. H. Randall. The implementation of the Cilk-5 multithreaded language. PLDI 1998. Google ScholarDigital Library
- Y. Guo, R. Barik, R. Raman, and V. Sarkar. Work-first and help-first scheduling policies for async-finish task parallelism. IPDPS 2009. Google ScholarDigital Library
- Habanero Java (HJ) Project. http://habanero.rice.edu/hj.Google Scholar
- Y. Yan, J. Zhao, Y. Guo, and V. Sarkar. Hierarchical Place Trees: A Portable Abstraction for Task Parallelism and Data Movement. LCPC 2009. Google ScholarDigital Library
Index Terms
- SLAW: a scalable locality-aware adaptive work-stealing scheduler for multi-core systems
Recommendations
SLAW: a scalable locality-aware adaptive work-stealing scheduler for multi-core systems
PPoPP '10This poster introduces SLAW, a Scalable Locality-aware Adaptive Work-stealing scheduler. The SLAW features an adaptive task scheduling algorithm combined with a locality-aware scheduling framework.
Past work has demonstrated the pros and cons of using ...
A work-stealing scheduler for X10's task parallelism with suspension
PPoPP '12: Proceedings of the 17th ACM SIGPLAN symposium on Principles and Practice of Parallel ProgrammingThe X10 programming language is intended to ease the programming of scalable concurrent and distributed applications. X10 augments a familiar imperative object-oriented programming model with constructs to support light-weight asynchronous tasks as well ...
Work-stealing without the baggage
OOPSLA '12: Proceedings of the ACM international conference on Object oriented programming systems languages and applicationsWork-stealing is a promising approach for effectively exploiting software parallelism on parallel hardware. A programmer who uses work-stealing explicitly identifies potential parallelism and the runtime then schedules work, keeping otherwise idle ...
Comments