ABSTRACT
We present compiler analyses and optimizations for explicitly parallel programs that communicate through a shared address space. Any type of code motion on explicitly parallel programs requires a new kind of analysis to ensure that operations reordered on one processor cannot be observed by another. The analysis, based on work by Shasha and Snir, checks for cycles among interfering accesses. We improve the accuracy of their analysis by using additional information from post-wait synchronization, barriers, and locks.
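The cycle check can be sketched as a small graph computation. In this toy model (the tuple representation and function names are illustrative, not the paper's actual intermediate form), an access is a `(processor, variable, kind)` tuple; conflict edges connect accesses to the same variable on different processors where at least one is a write, and a program-order edge must become a delay (i.e., must not be reordered) if it lies on a cycle built from program and conflict edges:

```python
from itertools import product

def conflict_edges(accesses):
    """Bidirectional edges between accesses to the same variable on
    different processors, where at least one access is a write."""
    edges = []
    for a, b in product(accesses, repeat=2):
        if (a is not b and a[0] != b[0] and a[1] == b[1]
                and 'w' in (a[2], b[2])):
            edges.append((a, b))
    return edges

def reaches(src, dst, graph):
    # Plain DFS reachability over the combined edge set.
    stack, seen = [src], {src}
    while stack:
        n = stack.pop()
        if n == dst:
            return True
        for m in graph.get(n, []):
            if m not in seen:
                seen.add(m)
                stack.append(m)
    return False

def delay_edges(program_edges, accesses):
    """Program-order edges lying on a cycle of program-order and
    conflict edges; these pairs must not be reordered."""
    graph = {}
    for u, v in program_edges + conflict_edges(accesses):
        graph.setdefault(u, []).append(v)
    # Edge (u, v) is on a cycle iff v can reach back to u.
    return [(u, v) for (u, v) in program_edges if reaches(v, u, graph)]
```

On the classic example where P0 writes x then reads y while P1 writes y then reads x, both program-order edges lie on such a cycle, so neither pair may be reordered. The paper's refinement using post-wait, barrier, and lock information would correspond to pruning conflict edges between accesses that synchronization already orders; that step is omitted from this sketch.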
We demonstrate the use of this analysis by optimizing remote access on distributed memory machines. The optimizations include message pipelining, to allow multiple outstanding remote memory operations, conversion of two-way to one-way communication, and elimination of communication through data re-use. The performance improvements are as high as 20-35% for programs running on a CM-5 multiprocessor using the Split-C language as a global address layer.
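Message pipelining can be illustrated with a toy latency model (the `Network` class, cycle costs, and the `get`/`sync` names below are hypothetical stand-ins in the spirit of Split-C's split-phase operations, not its actual API). A blocking read pays a full round-trip per access; split-phase gets issue all requests back-to-back, so a batch of reads overlaps its round-trips:

```python
ROUND_TRIP = 100  # simulated cycles per remote round-trip

class Network:
    def __init__(self):
        self.time = 0
        self.pending = []  # (completion time, memory, address) per in-flight get

    def blocking_read(self, mem, addr):
        self.time += ROUND_TRIP        # stall until the reply arrives
        return mem[addr]

    def get(self, mem, addr):
        # Split-phase: initiate the read and return immediately.
        self.pending.append((self.time + ROUND_TRIP, mem, addr))

    def sync(self):
        # Wait until every outstanding get has completed.
        done = max((t for t, _, _ in self.pending), default=self.time)
        self.time = max(self.time, done)
        vals = [mem[addr] for _, mem, addr in self.pending]
        self.pending.clear()
        return vals

mem = {i: i * i for i in range(8)}

net = Network()
serial = [net.blocking_read(mem, i) for i in range(8)]
t_serial = net.time                    # 8 full round-trips: 800 cycles

net = Network()
for i in range(8):
    net.get(mem, i)                    # all 8 requests in flight at once
pipelined = net.sync()
t_pipelined = net.time                 # overlapped: 100 cycles

assert serial == pipelined
```

The two-way to one-way conversion is the analogous transformation on the write side: a blocking write waits for an acknowledgement, while a one-way store only increments a counter of outstanding stores that a later sync drains.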
- 1. S. V. Adve and M. D. Hill. Weak Ordering: A New Definition. In 17th International Symposium on Computer Architecture, April 1990.
- 2. R. Arpaci, D. Culler, A. Krishnamurthy, S. Steinberg, and K. Yelick. Empirical Evaluation of the CRAY-T3D: A Compiler Perspective. In International Symposium on Computer Architecture, June 1995.
- 3. H. Berryman, J. Saltz, and J. Scroggs. Execution Time Support for Adaptive Scientific Algorithms on Distributed Memory Multiprocessors. Concurrency: Practice and Experience, June 1991.
- 4. D. Callahan and J. Subhlok. Static Analysis of Low-level Synchronization. In ACM SIGPLAN and SIGOPS Workshop on Parallel and Distributed Debugging, May 1988.
- 5. W. W. Carlson and J. M. Draper. Distributed Data Access in AC. In ACM Symposium on Principles and Practice of Parallel Programming, July 1995.
- 6. D. E. Culler, A. Dusseau, S. C. Goldstein, A. Krishnamurthy, S. Lumetta, T. von Eicken, and K. Yelick. Parallel Programming in Split-C. In Supercomputing '93, Portland, Oregon, November 1993.
- 7. K. Gharachorloo, D. Lenoski, J. Laudon, A. Gupta, and J. Hennessy. Memory Consistency and Event Ordering in Scalable Shared-Memory Multiprocessors. In 17th International Symposium on Computer Architecture, 1990.
- 8. D. Grunwald and H. Srinivasan. Data Flow Equations for Explicitly Parallel Programs. In ACM Symposium on Principles and Practice of Parallel Programming, June 1993.
- 9. S. Hiranandani, K. Kennedy, and C.-W. Tseng. Compiler Optimizations for Fortran D on MIMD Distributed-Memory Machines. In Proceedings of the 1991 International Conference on Supercomputing, 1991.
- 10. T. E. Jeremiassen and S. J. Eggers. Reducing False Sharing on Shared Memory Multiprocessors through Compile Time Data Transformations. In ACM Symposium on Principles and Practice of Parallel Programming, July 1995.
- 11. A. Krishnamurthy and K. Yelick. Optimizing Parallel SPMD Programs. In Languages and Compilers for Parallel Computing, 1994.
- 12. L. Lamport. How to Make a Multiprocessor Computer that Correctly Executes Multiprocess Programs. IEEE Transactions on Computers, C-28(9), September 1979.
- 13. D. Lenoski, J. Laudon, K. Gharachorloo, A. Gupta, and J. Hennessy. The Directory-Based Cache Coherence Protocol for the DASH Multiprocessor. In 17th International Symposium on Computer Architecture, May 1990.
- 14. N. K. Madsen. Divergence Preserving Discrete Surface Integral Methods for Maxwell's Curl Equations Using Non-Orthogonal Unstructured Grids. Technical Report 92.04, RIACS, February 1992.
- 15. S. Midkiff and D. Padua. Issues in the Optimization of Parallel Programs. In International Conference on Parallel Processing, Vol. II, 1990.
- 16. S. P. Midkiff, D. Padua, and R. G. Cytron. Compiling Programs with User Parallelism. In Languages and Compilers for Parallel Computing, 1990.
- 17. A. Rogers and K. Pingali. Compiling for Distributed Memory Architectures. IEEE Transactions on Parallel and Distributed Systems, March 1994.
- 18. D. Shasha and M. Snir. Efficient and Correct Execution of Parallel Programs that Share Memory. ACM Transactions on Programming Languages and Systems, 10(2), April 1988.
- 19. J. P. Singh, W. D. Weber, and A. Gupta. SPLASH: Stanford Parallel Applications for Shared Memory. Computer Architecture News, March 1992.
- 20. The SPARC Architecture Manual: Version 8. SPARC International, Inc., 1992.
- 21. T. von Eicken, D. E. Culler, S. C. Goldstein, and K. E. Schauser. Active Messages: A Mechanism for Integrated Communication and Computation. In International Symposium on Computer Architecture, May 1992.
Index Terms
- Optimizing parallel programs with explicit synchronization