2018 | Original Paper | Book Chapter

Supercomputer in a Laptop: Distributed Application and Runtime Development via Architecture Simulation

Authors: Samuel Knight, Joseph P. Kenny, Jeremiah J. Wilke

Published in: High Performance Computing

Publisher: Springer International Publishing


Abstract

Architecture simulation can aid in predicting and understanding application performance, particularly for proposed hardware or large system designs that do not exist. In network design studies for high-performance computing, most simulators focus on the dominant message passing (MPI) model. Currently, many simulators build and maintain their own simulator-specific implementations of MPI. This approach has several drawbacks. Rather than reusing an existing MPI library, simulator developers must implement all semantics, collectives, and protocols. Additionally, alternative runtimes like GASNet cannot be simulated without again building a simulator-specific version. It would be far more sustainable and flexible to maintain lower-level layers like uGNI or IB-verbs and reuse the production runtime code. Directly building and running production communication runtimes inside a simulator poses technical challenges, however. We discuss these challenges and show how they are overcome via the macroscale components for the Structural Simulation Toolkit (SST), leveraging a basic source-to-source tool to automatically adapt production code for simulation. SST is able to encapsulate and virtualize thousands of MPI ranks in a single simulator process, providing a “supercomputer in a laptop” environment. We demonstrate the approach for the production GASNet runtime over uGNI running inside SST. We then discuss the capabilities enabled, including investigating performance with tunable delays, deterministic debugging of race conditions, and distributed debugging with serial debuggers.
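
The idea of hosting many virtual ranks in one simulator process, with tunable delays and a reproducible event order, can be illustrated with a toy discrete-event loop. This is only a sketch under stated assumptions and is not the SST API: the names `simulate` and `ring`, the `("send", ...)`, `("recv",)`, and `("compute", ...)` operations, and the single `latency` parameter are all hypothetical, and Python generators stand in for the virtualized ranks.

```python
import heapq
from collections import deque


def simulate(num_ranks, rank_fn, latency=1.0):
    """Run num_ranks virtual ranks in one process; return each rank's finish time."""
    ranks = [rank_fn(r, num_ranks) for r in range(num_ranks)]
    mailbox = [deque() for _ in range(num_ranks)]  # delivered but unread messages
    waiting = [False] * num_ranks                  # is the rank blocked in recv?
    finish = [None] * num_ranks
    events = []                                    # (time, seq, kind, rank, payload)
    seq = 0

    def schedule(time, kind, rank, payload=None):
        nonlocal seq
        heapq.heappush(events, (time, seq, kind, rank, payload))
        seq += 1  # unique tie-breaker keeps the event order deterministic

    def resume(rank, now, value=None):
        """Advance one rank until it blocks, starts computing, or finishes."""
        while True:
            try:
                op = ranks[rank].send(value)
            except StopIteration:
                finish[rank] = now
                return
            value = None
            if op[0] == "send":
                _, dest, payload = op
                schedule(now + latency, "deliver", dest, payload)  # tunable delay
            elif op[0] == "compute":
                schedule(now + op[1], "wake", rank)
                return
            elif op[0] == "recv":
                if mailbox[rank]:
                    value = mailbox[rank].popleft()
                else:
                    waiting[rank] = True
                    return

    for r in range(num_ranks):
        schedule(0.0, "wake", r)
    while events:
        now, _, kind, rank, payload = heapq.heappop(events)
        if kind == "wake":
            resume(rank, now)
        elif waiting[rank]:          # a "deliver" event for a blocked receiver
            waiting[rank] = False
            resume(rank, now, payload)
        else:                        # a "deliver" event for a busy rank
            mailbox[rank].append(payload)
    return finish


def ring(rank, size):
    """Hypothetical workload: pass a token once around a ring, starting at rank 0."""
    if rank == 0:
        yield ("send", 1 % size, 0)
        yield ("recv",)
    else:
        token = yield ("recv",)
        yield ("compute", 0.5)
        yield ("send", (rank + 1) % size, token + 1)


if __name__ == "__main__":
    print(simulate(4, ring, latency=1.0))
```

Because all ranks share one process and one ordered event queue, repeated runs produce the same interleaving, and changing `latency` shifts the simulated finish times without touching the workload code. A single ordinary debugger session can also step through every virtual rank.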

Metadata
Title
Supercomputer in a Laptop: Distributed Application and Runtime Development via Architecture Simulation
Authors
Samuel Knight
Joseph P. Kenny
Jeremiah J. Wilke
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-030-02465-9_23