
Efficient and correct execution of parallel programs that share memory

Published: 1 April 1988

Abstract

In this paper we consider an optimization problem that arises in the execution of parallel programs on shared-memory multiple-instruction-stream, multiple-data-stream (MIMD) computers. A program on such machines consists of many sequential program segments, each executed by a single processor. These segments interact as they access shared variables. Access to memory is asynchronous, and memory accesses are not necessarily executed in the order they were issued. An execution is correct if it is sequentially consistent: It should seem as if all the instructions were executed sequentially, in an order obtained by interleaving the instruction streams of the processors. Sequential consistency can be enforced by delaying each access to shared memory until the previous access of the same processor has terminated. For performance reasons, however, we want to allow several accesses by the same processor to proceed concurrently. Our analysis finds a minimal set of delays that enforces sequential consistency. The analysis extends to interprocessor synchronization constraints and to code where blocks of operations have to execute atomically. We use a conflict graph similar to that used to schedule transactions in distributed databases. Our graph incorporates the order on operations given by the program text, enabling us to do without locks even when database conflict graphs would suggest that locks are necessary. Our work has implications for the design of multiprocessors; it offers new compiler optimization techniques for parallel languages that support shared variables.
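To make the delay problem concrete, here is a minimal sketch (our illustration, not code from the paper; the thread and variable names are invented) of the classic store-buffering pattern in C++11 atomics. With unordered (relaxed) accesses, both loads can return 0, an outcome that no sequential interleaving of the two instruction streams permits; delaying each load until the same processor's preceding store has completed restores sequential consistency.

```cpp
// Store-buffering sketch: illustrative only, not code from the paper.
#include <atomic>
#include <thread>
#include <cstdio>

std::atomic<int> x{0}, y{0};
int r1, r2;

void p1() {
    x.store(1, std::memory_order_relaxed);   // access A1
    // Without a delay here, A2 may be performed before A1 completes.
    r1 = y.load(std::memory_order_relaxed);  // access A2
}

void p2() {
    y.store(1, std::memory_order_relaxed);   // access B1
    r2 = x.load(std::memory_order_relaxed);  // access B2
}

int main() {
    std::thread t1(p1), t2(p2);
    t1.join(); t2.join();
    // Under sequential consistency, r1 == 0 && r2 == 0 is impossible:
    // no interleaving of the two streams yields it. With relaxed
    // (unordered) memory accesses it can occur.
    std::printf("r1=%d r2=%d\n", r1, r2);
}
```

Replacing memory_order_relaxed with memory_order_seq_cst on all four accesses inserts store-to-load delays that forbid the anomalous outcome; the paper's contribution is an analysis that identifies which such delays are actually necessary.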




        Reviews

        Edward W. Davis

The title of this paper will attract the attention of many people, but the content will be meaningful to only a select few. The very formal treatment of the subject includes 24 theorems, lemmas, and corollaries and many definitions. The authors provide a good introduction that motivates the work and states the goals. They give an example of the problem they are addressing, which concerns correct access to a shared memory when two program segments in a parallel program update a given variable. The basic goal is “to determine the minimal set of delays that enforce sequential consistency.” This seemingly simple situation leads to much complexity when stated formally. The categories and subject descriptors given by the authors are all related to architectures and processors, but the text is also directed toward compilation. The practicality of their results is brought into question in the conclusion when the authors state that in “an actual implementation of this method . . . one has to find all the minimal cycles in a graph. This requires time exponential in the number of nodes in a general graph.” They comment that the constrained structure of program graphs makes the problem easier. Should you read the paper? Yes, if you are interested in theoretical questions related to the title. The technical content should be good, based on the quality of the journal in which it appeared and the refereeing acknowledged in the paper.
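The cycle condition the reviewer quotes can be illustrated with a deliberately simplified sketch (ours, not the authors' algorithm): nodes are memory accesses, directed edges record each processor's program order, and conflict edges join accesses by different processors to the same variable. A program edge needs an enforcing delay when it lies on a mixed cycle; the crude over-approximation below merely checks, for each program edge (u, v), whether v can reach u through the graph, rather than enumerating minimal cycles as the paper requires.

```cpp
// Simplified delay-placement sketch (illustrative, not the paper's
// exact minimal-cycle analysis): a program edge (u, v) is treated as
// needing a delay if it lies on a cycle, i.e., if v can reach u.
#include <vector>
#include <queue>
#include <utility>
#include <cstdio>

using Graph = std::vector<std::vector<int>>;  // adjacency lists

// BFS reachability: can we get from src to dst in g?
bool reaches(const Graph& g, int src, int dst) {
    std::vector<bool> seen(g.size(), false);
    std::queue<int> q;
    q.push(src); seen[src] = true;
    while (!q.empty()) {
        int n = q.front(); q.pop();
        if (n == dst) return true;
        for (int m : g[n])
            if (!seen[m]) { seen[m] = true; q.push(m); }
    }
    return false;
}

int main() {
    // The store-buffering example above, as a conflict graph.
    // Nodes: 0 = x.store (P1), 1 = y.load (P1),
    //        2 = y.store (P2), 3 = x.load (P2).
    Graph g(4);
    std::vector<std::pair<int, int>> programEdges = {{0, 1}, {2, 3}};
    // Conflict edges are undirected (either execution order is
    // possible), so add both directions.
    auto conflict = [&](int a, int b) { g[a].push_back(b); g[b].push_back(a); };
    conflict(0, 3);  // both access x
    conflict(1, 2);  // both access y
    for (auto [u, v] : programEdges) g[u].push_back(v);

    for (auto [u, v] : programEdges)
        if (reaches(g, v, u))
            std::printf("delay needed between nodes %d -> %d\n", u, v);
    // Both program edges lie on the mixed cycle 0 -> 1 -> 2 -> 3 -> 0,
    // so both delays are reported, matching the fences sketched above.
}
```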


        Published in

          ACM Transactions on Programming Languages and Systems, Volume 10, Issue 2 (April 1988), 154 pages
          ISSN: 0164-0925
          EISSN: 1558-4593
          DOI: 10.1145/42190
          Copyright © 1988 ACM
          Publisher: Association for Computing Machinery, New York, NY, United States
