ABSTRACT
When using a shared memory multiprocessor, the programmer faces the selection of the portable programming model that will deliver the best performance. Even restricted to the standard programming environments (MPI and OpenMP), the choice spans a broad range of programming approaches. To help programmers in this selection, we compare MPI with three OpenMP programming styles (loop level, loop level with large parallel sections, SPMD) using a subset of the NAS benchmarks (CG, MG, FT, LU), two dataset sizes (A and B) and two shared memory multiprocessors (IBM SP3 NightHawk II, SGI Origin 3800). We also present a path from MPI to SPMD OpenMP that guides programmers starting from an existing MPI code. We present the first SPMD OpenMP version of the NAS benchmarks and compare it with other OpenMP versions from independent sources (PBN, SDSC and RWCP). Experimental results demonstrate that OpenMP provides competitive performance compared to MPI over a large set of experimental conditions; however, this performance comes at the price of a substantial programming effort in dataset adaptation and inter-thread communication. MPI still provides the best performance under some conditions. We present breakdowns of the execution times and measurements of hardware performance counters to explain the performance differences.
Index Terms
- Performance comparison of MPI and three OpenMP programming styles on shared memory multiprocessors