
2019 | OriginalPaper | Chapter

Finepoints: Partitioned Multithreaded MPI Communication

Authors : Ryan E. Grant, Matthew G. F. Dosanjh, Michael J. Levenhagen, Ron Brightwell, Anthony Skjellum

Published in: High Performance Computing

Publisher: Springer International Publishing


Abstract

The MPI multithreading model has historically been difficult to optimize; the interface it presents to threads was designed as a process-level interface. This has led to implementations that treat function calls as critical regions and protect them with locks to avoid race conditions. We hypothesize that an interface designed specifically for threads can deliver better performance than current approaches and can even outperform single-threaded MPI.
In this paper, we describe a design for partitioned communication in MPI that we call finepoints. First, we assess the existing models for MPI two-sided communication, and then introduce finepoints as a hybrid that combines the best features of each. In addition, partitioned communication built with finepoints leverages new network hardware features that cannot be exploited with current MPI point-to-point semantics, making this approach useful both now and in the future.
To demonstrate the validity of our hypothesis, we implement a finepoints library and show improvements over a state-of-the-art multithreaded-optimized Open MPI implementation on a Cray XC40 with an Aries network. Our experiments demonstrate up to a 12× reduction in wait time for the completion of send operations. The new model is shown working on a nuclear-reactor-physics neutron-transport proxy application, providing up to a 26.1% improvement in communication time and up to a 4.8% improvement in runtime over the best-performing MPI communication mode, single-threaded MPI.
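The core idea of partitioned communication is that a single message buffer is split into partitions, each filled and marked ready by a different thread, and the send completes only once every partition has been published; this avoids making each thread's contribution a separate lock-protected MPI call. The sketch below models that completion semantics in plain Python threads. It is a conceptual illustration only, not an MPI binding: `PartitionedSend`, `pready`, and `worker` are hypothetical names chosen to mirror the flavor of the partitioned-communication operations that were later standardized in MPI 4.0 (e.g. `MPI_Psend_init`, `MPI_Pready`).

```python
import threading

class PartitionedSend:
    """Toy model of partitioned-send completion: the message buffer is
    split into N partitions, each marked ready by a worker thread; the
    whole send completes only when every partition is ready."""
    def __init__(self, nparts):
        self.ready = [False] * nparts
        self.lock = threading.Lock()
        self.done = threading.Event()

    def pready(self, i):
        # A thread publishes its partition (cf. MPI_Pready in MPI 4.0).
        with self.lock:
            self.ready[i] = True
            if all(self.ready):
                self.done.set()

    def wait(self):
        # Block until all partitions are ready (cf. waiting on the request).
        self.done.wait()
        return True

buf = [0] * 8
send = PartitionedSend(nparts=4)

def worker(p):
    # Each thread fills only its own slice of the shared buffer,
    # so no lock is needed for the data itself.
    for j in range(p * 2, p * 2 + 2):
        buf[j] = j * j
    send.pready(p)

threads = [threading.Thread(target=worker, args=(p,)) for p in range(4)]
for t in threads:
    t.start()
send.wait()   # completes once all 4 partitions have been marked ready
for t in threads:
    t.join()
print(buf)    # [0, 1, 4, 9, 16, 25, 36, 49]
```

The key design point the model captures is that threads coordinate only on a small per-partition readiness flag rather than serializing full MPI calls through a lock, which is what enables the wait-time reductions reported above.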


Metadata
Title
Finepoints: Partitioned Multithreaded MPI Communication
Authors
Ryan E. Grant
Matthew G. F. Dosanjh
Michael J. Levenhagen
Ron Brightwell
Anthony Skjellum
Copyright Year
2019
DOI
https://doi.org/10.1007/978-3-030-20656-7_17