
2018 | OriginalPaper | Chapter

Compiler-Assisted Source-to-Source Skeletonization of Application Models for System Simulation

Authors: Jeremiah J. Wilke, Joseph P. Kenny, Samuel Knight, Sebastien Rumley

Published in: High Performance Computing

Publisher: Springer International Publishing


Abstract

Performance modeling of networks through simulation requires application endpoint models that inject traffic into the simulation models. Today, endpoint models for system-scale studies consist mainly of post-mortem trace replay, but these off-line simulations can lack flexibility and scalability. On-line simulations instead execute so-called skeleton applications: reduced versions of an application that generate traffic that is the same as, or similar to, that of the full application. Skeleton apps offer advantages in flexibility and scalability, but they often must be custom written for the simulator itself. Auto-skeletonization of existing application source code via compiler tools would provide endpoint models with minimal development effort, yet such source-to-source transformations have been only narrowly explored. We introduce a pragma language and a corresponding Clang-driven source-to-source compiler that performs auto-skeletonization based on provided pragma annotations. We describe the compiler toolchain, validate the generated skeletons, and show scalability of the generated simulation models beyond 100K endpoints for example MPI applications. Overall, we assert that our proposed auto-skeletonization approach and the flexible skeletons it produces can be an important tool in realizing balanced exascale interconnect designs.
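
To make the notion of a skeleton application concrete, the following is a minimal hand-written sketch, not the authors' pragma language or compiler output. It assumes a toy ring-style exchange; the names `Send`, `full_app`, and `skeleton_app` are hypothetical. The skeleton drops the data allocation and the computation but keeps the control flow, so it emits the same sequence of message destinations and sizes as the full version.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Send:
    """One injected message: destination rank and payload size in bytes."""
    dest: int
    nbytes: int


def full_app(rank: int, nranks: int, n: int, steps: int) -> list[Send]:
    """Full application: allocates real data and computes on it each step."""
    trace = []
    data = [float(i) for i in range(n)]          # real allocation
    for _ in range(steps):
        data = [0.5 * x + 1.0 for x in data]     # real computation
        # 8 bytes per element (double precision) sent to the next rank
        trace.append(Send((rank + 1) % nranks, 8 * len(data)))
    return trace


def skeleton_app(rank: int, nranks: int, n: int, steps: int) -> list[Send]:
    """Skeleton: no allocation, no computation, identical traffic."""
    trace = []
    for _ in range(steps):
        # Computation elided (a simulator would charge modeled time here);
        # only the message size is needed, not the payload itself.
        trace.append(Send((rank + 1) % nranks, 8 * n))
    return trace


if __name__ == "__main__":
    # Both versions generate the same traffic for the same parameters.
    print(full_app(0, 4, 1000, 3) == skeleton_app(0, 4, 1000, 3))
```

In this sketch the skeleton's cost no longer depends on the problem size `n`, which is the property that lets on-line simulation scale to many endpoints; an automated tool would have to derive the skeleton from the full source rather than rely on a hand-written copy.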

Metadata

Copyright Year: 2018

DOI: https://doi.org/10.1007/978-3-319-92040-5_7