Published in: Cluster Computing 1/2019

17-09-2018

A simulation provenance data management system for efficient job execution on an online computational science engineering platform

Authors: Jin Ma, Sik Lee, Kum Won Cho, Young-Kyoon Suh

Abstract

In the past few years, an online simulation service platform named EDISON has been applauded by computational science and engineering communities in several countries. Though armed with multiple computing clusters and high-end storage resources, the platform has struggled to handle a huge number of CPU- and I/O-bound simulations, most of which are duplicates. Such intensive simulations are normally admitted with no duplicate elimination and can thus degrade the performance of the platform. To address this performance concern, we propose a novel system, termed SuperMan, that seamlessly records and retrieves the provenance of previously executed simulations and so prevents users from launching duplicate and/or similar simulations on the limited computing resources. The system collects simulation provenance in a variant of a de facto standard format, thereby offering interoperability. Based on the stored provenance, the system can provide useful simulation run statistics for users who need assistance. SuperMan also applies a hash-based duplicate elimination technique, which makes simulation execution on the platform more efficient. Finally, we show that the proposed system could remove slightly over half of the duplicate simulations across a variety of simulation software, while achieving overall elapsed-time savings of about 30% and queuing-time savings of about 25%.
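The hash-based duplicate elimination mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which fields SuperMan hashes or how it stores results, so everything below is an assumption made for illustration: the names `provenance_key` and `ProvenanceStore`, the choice of SHA-256, and the record fields (application name, version, parameters, input-file digests) are hypothetical and are not taken from the paper.

```python
import hashlib
import json


def provenance_key(app_name, app_version, params, input_files=None):
    """Return a SHA-256 digest identifying one simulation request.

    Hypothetical record layout: the fields hashed here are an assumption,
    not the schema used by SuperMan.
    """
    record = {
        "app": app_name,
        "version": app_version,
        "params": params,
        # Hash file contents so large inputs are compared by digest.
        "inputs": {
            name: hashlib.sha256(data).hexdigest()
            for name, data in (input_files or {}).items()
        },
    }
    # Sorted keys and fixed separators give a canonical serialization,
    # so the same request always produces the same digest.
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class ProvenanceStore:
    """In-memory stand-in for a provenance database (illustrative only)."""

    def __init__(self):
        self._results = {}

    def submit(self, app_name, app_version, params, run, input_files=None):
        """Run the simulation unless an identical request was seen before.

        Returns (result, reused), where reused is True on a duplicate.
        """
        key = provenance_key(app_name, app_version, params, input_files)
        if key in self._results:
            return self._results[key], True
        result = run(params)
        self._results[key] = result
        return result, False


# Usage: the second request differs only in parameter order,
# so it is recognized as a duplicate and the solver is not called again.
calls = []


def fake_solver(params):
    calls.append(params)
    return params["x"] ** 2


store = ProvenanceStore()
first = store.submit("demo_app", "1.0", {"x": 3, "y": 1}, fake_solver)
second = store.submit("demo_app", "1.0", {"y": 1, "x": 3}, fake_solver)
```

Canonicalizing the request before hashing is the key design choice in this sketch: without sorted keys, two logically identical parameter sets could serialize differently and escape detection. Detecting merely similar simulations, which the abstract also mentions, would need a different technique and is not covered here.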

Footnotes
1
Or, science apps. In this article, both of the terms are interchangeably used.
 
2
NCN was established in 2002 and is funded by the National Science Foundation (NSF) to support the National Nanotechnology Initiative (NNI).
 
Metadata
Title
A simulation provenance data management system for efficient job execution on an online computational science engineering platform
Authors
Jin Ma
Sik Lee
Kum Won Cho
Young-Kyoon Suh
Publication date
17-09-2018
Publisher
Springer US
Published in
Cluster Computing / Issue 1/2019
Print ISSN: 1386-7857
Electronic ISSN: 1573-7543
DOI
https://doi.org/10.1007/s10586-018-2827-2
