2015 | OriginalPaper | Book chapter

Performance Analysis of Irregular Collective Communication with the Crystal Router Algorithm

Written by: Michael Schliephake, Erwin Laure

Published in: Solving Software Challenges for Exascale

Publisher: Springer International Publishing

Abstract

In order to achieve exascale performance, it is important to detect potential bottlenecks and identify strategies to overcome them. For this, both applications and system software must be analysed and potentially improved. The EU FP7 project Collaborative Research into Exascale Systemware, Tools & Applications (CRESTA) chose to co-design advanced simulation applications, system software, and development tools. In this paper, we present the results of a co-design activity focused on the simulation code NEK5000 that aims at improving the performance of collective communication operations. We have analysed the algorithms that form the core of NEK5000’s communication module in order to assess its viability on recent computer architectures before starting to improve its performance. Our results show that the crystal router algorithm performs well in sparse, irregular collective operations for medium and large processor counts, but improvements will be needed for the even larger system sizes of the future. We sketch the needed improvements, which will also make the communication algorithms beneficial for other applications that need to implement latency-dominated communication schemes with short messages. The latency-optimised communication operations will also be used in a runtime system providing dynamic load balancing, which is under development within CRESTA.
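
The abstract does not describe the crystal router itself. As a rough illustration only, the following is a minimal sketch of the hypercube-style routing idea commonly associated with the crystal router, simulated inside a single Python process. It assumes a power-of-two number of ranks, and the function name `crystal_route` and the message layout are hypothetical; it is not NEK5000's implementation and uses no MPI. In each stage, a rank forwards to a single partner every message whose destination differs from it in the current bit, so each rank talks to only log2(P) partners in total.

```python
def crystal_route(outboxes):
    """Simulate hypercube-style routing over P = 2**d ranks (assumed layout).

    outboxes[r] is a list of (dest_rank, payload) pairs initially held by rank r.
    Returns (inboxes, stages); inboxes[r] holds the pairs delivered to rank r.
    """
    p = len(outboxes)
    if p == 0 or p & (p - 1):
        raise ValueError("this sketch assumes a power-of-two number of ranks")
    stages = p.bit_length() - 1            # d = log2(P) exchange stages
    held = [list(msgs) for msgs in outboxes]
    for bit in range(stages):
        mask = 1 << bit
        nxt = [[] for _ in range(p)]
        for rank in range(p):
            partner = rank ^ mask          # neighbour along this hypercube dimension
            for dest, payload in held[rank]:
                if (dest ^ rank) & mask:
                    # Destination lies in the partner's half: forward the message.
                    nxt[partner].append((dest, payload))
                else:
                    # Destination lies in this rank's half: keep the message.
                    nxt[rank].append((dest, payload))
        held = nxt
    return held, stages


# Sparse, irregular traffic: only two of eight ranks send anything.
outboxes = [[] for _ in range(8)]
outboxes[0] = [(5, "a"), (3, "b")]
outboxes[6] = [(5, "c")]
inboxes, stages = crystal_route(outboxes)
```

In this sketch a message is bundled with whatever else is travelling to the same partner in a given stage, which is the reason such schemes are attractive for latency-dominated exchanges of short messages: the number of communication steps grows with log2(P) rather than with the number of distinct destinations.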

Metadata
Title
Performance Analysis of Irregular Collective Communication with the Crystal Router Algorithm
Written by
Michael Schliephake
Erwin Laure
Copyright year
2015
DOI
https://doi.org/10.1007/978-3-319-15976-8_10