2010 | OriginalPaper | Chapter
A Scalable MPI_Comm_split Algorithm for Exascale Computing
Authors : Paul Sack, William Gropp
Published in: Recent Advances in the Message Passing Interface
Publisher: Springer Berlin Heidelberg
Existing algorithms for creating communicators in MPI programs will not scale well to future exascale supercomputers containing millions of cores. In this work, we present a novel communicator-creation algorithm that scales well into millions of processes by using three techniques: replacing the sort at the end of MPI_Comm_split with a merge performed as the color-and-key table is built, sorting the color-and-key table in parallel, and storing the output communicator data in a distributed table rather than a replicated one. These changes reduce the time cost of MPI_Comm_split in the worst case we consider from 22 seconds to 0.37 seconds. Existing algorithms build a table with as many entries as there are processes, consuming vast amounts of memory. Our algorithm uses a small, fixed amount of memory per communicator after MPI_Comm_split has finished, and during its execution uses only a fraction of the memory required by the conventional algorithm for temporary storage.
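The first technique in the abstract, merging as the color-and-key table is built instead of sorting it once at the end, can be sketched in a few lines. The code below is an illustration, not the authors' implementation: it simulates per-process (color, key, rank) entries being combined pairwise, as a recursive-doubling exchange would combine them, with a sorted merge at every step so the final table never needs a global sort.

```python
def merge_tables(a, b):
    """Merge two tables already sorted by (color, key, rank)."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    out.extend(a[i:])
    out.extend(b[j:])
    return out

def build_sorted_table(entries):
    """Combine single-entry per-process tables pairwise, merging at each
    round, mimicking a recursive-doubling allgather of (color, key, rank)."""
    tables = [[e] for e in entries]  # one single-entry table per process
    while len(tables) > 1:
        merged = [merge_tables(tables[k], tables[k + 1])
                  for k in range(0, len(tables) - 1, 2)]
        if len(tables) % 2:  # odd count: carry the last table forward
            merged.append(tables[-1])
        tables = merged
    return tables[0]

# Hypothetical example: 4 processes split into two colors; within each
# color, the key determines rank order in the new communicator.
entries = [(1, 0, 0), (0, 1, 1), (0, 0, 2), (1, 1, 3)]  # (color, key, rank)
table = build_sorted_table(entries)
# table is grouped by color and ordered by key within each group,
# with no final sort required.
```

Each process contributes one entry, so every merge operates on tables that are sorted by construction; the total work is the same O(p log p) as a final sort, but the merging overlaps with the communication rounds that build the table.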