Abstract
Writing software for multicore processors would be greatly simplified if sequential programs could be parallelized automatically. Although auto-parallelization has been studied for many decades, it has succeeded only in a few application areas, such as dense matrix computations. In particular, auto-parallelization of irregular programs, which are organized around large, pointer-based data structures like graphs, has seemed intractable.
The Galois project is taking a fresh look at auto-parallelization. Rather than attempting to parallelize all programs no matter how obscurely they are written, we are designing programming abstractions that permit programmers to highlight opportunities for exploiting parallelism in sequential programs, and building a runtime system that uses these hints to execute the program in parallel. In this paper, we describe the design and implementation of a system based on these ideas. Experimental results for two real-world irregular applications, a Delaunay mesh refinement application and a graphics application that performs agglomerative clustering, demonstrate that this approach is promising.
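To make the abstraction concrete, the sketch below shows the kind of unordered worklist loop the abstract alludes to, in which processing one element may create new work (as when refining a bad triangle produces new, possibly-bad triangles). This is a minimal sequential illustration with hypothetical names, not the actual Galois API; the Galois runtime's contribution is to run many such iterations speculatively in parallel, rolling back iterations whose neighborhoods conflict.

```python
from collections import deque

def refine_all(worklist):
    """Process an unordered worklist until empty; each step may add work.

    Stand-in for Delaunay refinement: an item greater than 1 is "bad"
    and is split into two smaller subproblems. In a Galois-style system
    the body of this loop would be a speculatively parallel iteration.
    """
    work = deque(worklist)
    processed = 0
    while work:
        item = work.popleft()
        processed += 1
        if item > 1:  # "bad" element: refining it creates new work
            work.append(item // 2)
            work.append(item - item // 2)
    return processed

# Refining a single element of size 4 touches 7 elements in total
# (4 -> 2, 2 -> four 1s), mirroring how refinement cascades.
print(refine_all([4]))  # → 7
```

The key property the abstractions expose is that the loop iterates over an unordered set: any schedule of the work items is acceptable, so iterations whose data accesses do not overlap can safely run concurrently.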