research-article

Synthesizing concurrent schedulers for irregular algorithms

Authors:
Donald Nguyen

The University of Texas at Austin, Austin, TX, USA

The University of Texas at Austin, Austin, TX, USA
View Profile

,
Keshav Pingali

The University of Texas at Austin, Austin, TX, USA

The University of Texas at Austin, Austin, TX, USA
View Profile

ASPLOS XVI: Proceedings of the sixteenth international conference on Architectural support for programming languages and operating systemsMarch 2011Pages 333–344https://doi.org/10.1145/1950365.1950404

Published:05 March 2011Publication History

ASPLOS XVI: Proceedings of the sixteenth international conference on Architectural support for programming languages and operating systems

Pages 333–344

ABSTRACT

Scheduling is the assignment of tasks or activities to processors for execution, and it is an important concern in parallel programming. Most prior work on scheduling has focused either on static scheduling of applications in which the dependence graph is known at compile-time or on dynamic scheduling of independent loop iterations such as in OpenMP.

In irregular algorithms, dependences between activities are complex functions of runtime values so these algorithms are not amenable to compile-time analysis nor do they consist of independent activities. Moreover, the amount of work can vary dramatically with the scheduling policy. To handle these complexities, implementations of irregular algorithms employ carefully handcrafted, algorithm-specific schedulers but these schedulers are themselves parallel programs, complicating the parallel programming problem further.

In this paper, we present a flexible and efficient approach for specifying and synthesizing scheduling policies for irregular algorithms. We develop a simple compositional specification language and show how it can concisely encode scheduling policies in the literature. Then, we show how to synthesize efficient parallel schedulers from these specifications. We evaluate our approach for five irregular algorithms on three multicore architectures and show that (1) the performance of some algorithms can improve by orders of magnitude with the right scheduling policy, and (2) for the same policy, the overheads of our synthesized schedulers are comparable to those of fixed-function schedulers.

References

N. Amenta, S. Choi, and G. Rote. Incremental constructions con brio. In Proc. Symp. on Computational Geometry (SCG), pages 211--219, 2003. Google ScholarDigital Library
L. O. Andersen. Program Analysis and Specialization for the C Programming Language. PhD thesis, DIKU, University of Copenhagen, May 1994. (DIKU report 94/19).Google Scholar
D. Bader and V. Sachdeva. A cache-aware parallel implementation of the push-relabel network flow algorithm and experimental evaluation of the gap relabeling heuristic. In Proc. 18th ISCA International Conference on Parallel and Distributed Computing Systems (PDCS), September 2005.Google Scholar
R. D. Blumofe, C. F. Joerg, B. C. Kuszmaul, C. E. Leiserson, K. H. Randall, and Y. Zhou. Cilk: an efficient multithreaded runtime system. SIGPLAN Not., 30(8):207--216, 1995. Google ScholarDigital Library
A. Bowyer. Computing Dirichlet tessellations. The Computer Journal, 24(2):162--166, 1981.Google ScholarCross Ref
B. V. Cherkassy and A. V. Goldberg. On implementing push-relabel method for the maximum flow problem. In Proc. Intl. Conf. on Integer Programming and Combinatorial Optimization (IPCO), pages 157--171, London, UK, 1995. Google ScholarDigital Library
L. P. Chew. Guaranteed-quality mesh generation for curved surfaces. In Proc. Symp. on Computational Geometry (SCG), 1993. Google ScholarDigital Library
K. L. Clarkson and P. W. Shor. Applications of random sampling in computational geometry, ii. Discrete Comput. Geom., 4(5):287--421, 1989.Google ScholarDigital Library
T. Cormen, C. Leiserson, R. Rivest, and C. Stein, editors. Introduction to Algorithms. MIT Press, 2001. Google ScholarDigital Library
L. Dagum and R. Menon. OpenMP: An industry-standard API for shared-memory programming. IEEE Comput. Sci. Eng., 5(1):46--55, 1998. Google ScholarDigital Library
A. Duran, J. Corbalán, and E. Ayguadé. Evaluation of OpenMP task scheduling strategies. In Proc. Conf. on OpenMP in a new era of parallelism (IWOMP), pages 100--110, 2008. Google ScholarDigital Library
A. V. Goldberg and R. E. Tarjan. A new approach to the maximum-flow problem. J. ACM, 35(4):921--940, 1988. Google ScholarDigital Library
R. H. Halstead, Jr. MULTILISP: a language for concurrent symbolic computation. ACM Trans. Program. Lang. Syst., 7:501--538, October 1985. Google ScholarDigital Library
B. Hardekopf and C. Lin. The ant and the grasshopper: fast and accurate pointer analysis for millions of lines of code. In Proc. Conf. on Programming Language Design and Implementation (PLDI), 2007. Google ScholarDigital Library
M. Herlihy and J. E. B. Moss. Transactional memory: architectural support for lock-free data structures. In Proc. of International Symposium on Computer Architecture (ISCA), 1993. Google ScholarDigital Library
M. P. Herlihy and J. M. Wing. Linearizability: a correctness condition for concurrent objects. ACM Trans. Program. Lang. Syst., 12(3):463--492, 1990. Google ScholarDigital Library
Intel Corporation. Intel threading building blocks 3.0. http://threadingbuildingblocks.org.Google Scholar
M. Kulkarni, P. Carribault, K. Pingali, G. Ramanarayanan, B. Walter, K. Bala, and L. P. Chew. Scheduling strategies for optimistic parallel execution of irregular programs. In Proc. Symp. on Parallelism in algorithms and architectures (SPAA), pages 217--228, New York, NY, USA, 2008. Google ScholarDigital Library
M. Kulkarni, K. Pingali, B. Walter, G. Ramanarayanan, K. Bala, and L. P. Chew. Optimistic parallelism requires abstractions. SIGPLAN Not. (Proceedings of PLDI), 42(6):211--222, 2007. Google ScholarDigital Library
M. Kulkarni, D. Prountzos, D. Nguyen, and K. Pingali. Defining and implementing commutativity conditions for parallel execution. regular rech report TR-ECE-09-11, School of Electrial and Computer Engineering, Purdue University, August 2009.Google Scholar
D. Lea. A Java fork/join framework. In Proc. Conf. on Java Grande, pages 36--43, 2000. Google ScholarDigital Library
D. Leijen, W. Schulte, and S. Burckhardt. The design of a task parallel library. In Proc. Conf. on object-oriented programming, systems, languages, and applications (OOPSLA), pages 227--242, 2009. Google ScholarDigital Library
K. Madduri, D. Bader, J. Berry, and J. Crobak. Parallel shortest path algorithms for solving large-scale instances. Technical Report GT-CSE-06-19, Georgia Institute of Technology, September 2006.Google Scholar
M. Méndez-Lojo, A. Mathew, and K. Pingali. Parallel inclusion-based points-to analysis. In Proc. Intl. Conference on Systems, Programming, Languages and Applications: Software for Humanity (SPLASH), 2010. Google ScholarDigital Library
U. Meyer and P. Sanders. Delta-stepping: A parallel single source shortest path algorithm. In Proc. European Symp. on Algorithms (ESA), pages 393--404, 1998. Google ScholarDigital Library
M. M. Michael, M. T. Vechev, and V. A. Saraswat. Idempotent work stealing. In Proc. Symp. on Principles and practice of parallel programming (PPoPP), pages 45--54, 2009. Google ScholarDigital Library
G. L. Miller. A time efficient Delaunay refinement algorithm. In Proc. Symposium on Discrete Algorithms (SODA), pages 400--409, New Orleans, 2004. Google ScholarDigital Library
F. Nielson, H. R. Nielson, and C. Hankin. Principles of Program Analysis. Springer-Verlag New York, Inc., Secaucus, NJ, USA, 1999. Google ScholarDigital Library
D. J. Pearce, P. H. J. Kelly, and C. L. Hankin. Online cycle detection and difference propagation for pointer analysis. In Intl. IEEE Workshop on Source Code Analysis and Manipulation (SCAM), pages 3--12, 2003.Google ScholarCross Ref
K. Pingali, M. Kulkarni, D. Nguyen, M. Burtscher, M. Mendez-Lojo, D. Prountzos, X. Sui, and Z. Zhong. Amorphous data-parallelism in irregular algorithms. regular tech report TR-09-05, The University of Texas at Austin, 2009.Google Scholar
J. Ruppert. A Delaunay refinement algorithm for quality 2-dimensional mesh generation. J. Algorithms, 18(3):548--585, 1995. Google ScholarDigital Library
J. R. Shewchuk. Triangle: Engineering a 2d quality mesh generator and Delaunay triangulator. In Applied Computational Geometry: Towards Geometric Engineering, volume 1148 of Lecture Notes in Computer Science, pages 203--222. Springer-Verlag, 1996. Google ScholarDigital Library
A. Solar-Lezama, C. G. Jones, and R. Bodik. Sketching concurrent data structures. In Proc. Conf. on Programming Language Design and Implementation (PLDI), pages 136--148, New York, NY, USA, 2008. Google ScholarDigital Library
M. T. Vechev, E. Yahav, D. F. Bacon, and N. Rinetzky. CGCExplorer: a semi-automated search procedure for provably correct concurrent collectors. In Proc. Conf. on Programming Language Design and Implementation (PLDI), pages 456--467, New York, NY, USA, 2007. Google ScholarDigital Library
D. F. Watson. Computing the n-dimensional tessellation with application to Voronoi polytopes. The Computer Journal, 24(2):167--172, 1981.Google ScholarCross Ref
J. Wu, R. Das, J. Saltz, H. Berryman, and S. Hiranandani. Distributed memory compiler design for sparse problems. IEEE Transactions on Computers, 44, 1995. Google ScholarDigital Library

Index Terms

Synthesizing concurrent schedulers for irregular algorithms
1. Software and its engineering
  1. Software notations and tools
    1. General programming languages
      1. Language features
        Concurrent programming structures

Recommendations

Synthesizing concurrent schedulers for irregular algorithms
ASPLOS '11

Scheduling is the assignment of tasks or activities to processors for execution, and it is an important concern in parallel programming. Most prior work on scheduling has focused either on static scheduling of applications in which the dependence graph ...
Read More
Synthesizing concurrent schedulers for irregular algorithms
ASPLOS '11

Scheduling is the assignment of tasks or activities to processors for execution, and it is an important concern in parallel programming. Most prior work on scheduling has focused either on static scheduling of applications in which the dependence graph ...
Read More
Scheduling of deteriorating jobs with release dates to minimize the maximum lateness

In this paper, we consider the problem of scheduling n deteriorating jobs with release dates on a single (batching) machine. Each job's processing time is a simple linear function of its starting time. The objective is to minimize the maximum lateness. ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Published in
ASPLOS XVI: Proceedings of the sixteenth international conference on Architectural support for programming languages and operating systems
March 2011
432 pages
ISBN:9781450302661
DOI:10.1145/1950365
General Chair:
Rajiv Gupta
University of California, Riverside
,
Program Chair:
Todd C. Mowry
Carnegie Mellon University
ACM SIGPLAN Notices Volume 46, Issue 3
ASPLOS '11
March 2011
407 pages
ISSN:0362-1340
EISSN:1558-1160
DOI:10.1145/1961296
Issue’s Table of Contents
ACM SIGARCH Computer Architecture News Volume 39, Issue 1
ASPLOS '11
March 2011
407 pages
ISSN:0163-5964
DOI:10.1145/1961295
Issue’s Table of Contents
Copyright © 2011 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 5 March 2011
Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
scheduling
synthesis
Qualifiers
- research-article
Conference

Acceptance Rates
Overall Acceptance Rate535of2,713submissions,20%
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 33
  Total Citations
  View Citations
- 560
  Total Downloads
- Downloads (Last 12 months)7
- Downloads (Last 6 weeks)3
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Synthesizing concurrent schedulers for irregular algorithms

ASPLOS XVI: Proceedings of the sixteenth international conference on Architectural support for programming languages and operating systems

ABSTRACT

References

Cited By

Index Terms

Recommendations

Synthesizing concurrent schedulers for irregular algorithms

Synthesizing concurrent schedulers for irregular algorithms

Scheduling of deteriorating jobs with release dates to minimize the maximum lateness