Published in: Cluster Computing 2/2021

12 August 2020

AutoWM: a novel domain-specific tool for universal multi-/many-core accelerations of the WRF cloud microphysics

Authors: Peng Zhang, Chao Yang, Yulong Ao


Abstract

In large-scale atmospheric simulations, microphysics parameterization often accounts for a large portion of the simulation time and usually comprises dozens of parameterization schemes. Optimizing the performance of these schemes one by one on different hardware platforms is tedious and error-prone, even for skilled programmers. In this work, we propose AutoWM, a novel domain-specific tool for universal performance acceleration of the microphysics in the widely used Weather Research and Forecasting (WRF) model on multi-/many-core systems. The main idea of AutoWM is to reconstruct the various schemes as compositions of common building blocks and to optimize these building blocks, rather than the individual schemes, on each target platform so that the optimized blocks can be reused. To achieve this goal, a lightweight domain-specific language, WML, is provided to describe different microphysics schemes so that the workflow information can be parsed and extracted easily. Experiments on the popular WRF single- and double-moment microphysics schemes show that AutoWM can automatically generate well-optimized microphysics kernels on three multi- and many-core platforms, namely Intel Ivy Bridge, Intel Xeon Phi and the Chinese homegrown SW26010, with the average floating-point efficiency reaching 47%, 20% and 10% of the theoretical peak performance, respectively.
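The central idea of the abstract, describing each scheme as a composition of shared building blocks so that every block is written (and tuned) only once, can be illustrated with a small sketch. This is not WML syntax and not AutoWM's actual interface, neither of which appears in the abstract. All names here (`BLOCKS`, `SCHEMES`, `run_scheme`, the block names, and the toy update rules and constants) are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List

# Hypothetical interface: a building block maps a state (mixing ratios) to a new state.
State = Dict[str, float]
Block = Callable[[State], State]

# Registry of reusable blocks. In the approach described in the abstract, the
# blocks are what gets optimized per platform; here they are plain Python functions.
BLOCKS: Dict[str, Block] = {}


def block(name: str) -> Callable[[Block], Block]:
    """Register a function as a named building block."""
    def register(fn: Block) -> Block:
        BLOCKS[name] = fn
        return fn
    return register


@block("condensation")
def condensation(s: State) -> State:
    # Toy rule (assumption): vapor above a fixed threshold becomes cloud water.
    excess = max(s["qv"] - 0.01, 0.0)
    return {**s, "qv": s["qv"] - excess, "qc": s["qc"] + excess}


@block("ice_deposition")
def ice_deposition(s: State) -> State:
    # Toy rule (assumption): 5% of the vapor deposits as ice.
    moved = 0.05 * s["qv"]
    return {**s, "qv": s["qv"] - moved, "qi": s["qi"] + moved}


@block("autoconversion")
def autoconversion(s: State) -> State:
    # Toy rule (assumption): 10% of the cloud water converts to rain.
    moved = 0.1 * s["qc"]
    return {**s, "qc": s["qc"] - moved, "qr": s["qr"] + moved}


# Each scheme is only an ordered list of block names, so two schemes that
# share a block automatically share its implementation.
SCHEMES: Dict[str, List[str]] = {
    "toy_warm_rain": ["condensation", "autoconversion"],
    "toy_mixed_phase": ["condensation", "ice_deposition", "autoconversion"],
}


def run_scheme(name: str, state: State) -> State:
    """Apply the blocks of the named scheme in order."""
    for block_name in SCHEMES[name]:
        state = BLOCKS[block_name](state)
    return state


if __name__ == "__main__":
    initial = {"qv": 0.012, "qc": 0.0, "qr": 0.0, "qi": 0.0}
    print(run_scheme("toy_warm_rain", initial))
    print(run_scheme("toy_mixed_phase", initial))
```

In this sketch both toy schemes reuse `condensation` and `autoconversion`, so improving either block benefits every scheme that lists it. That is the kind of reuse the abstract attributes to optimizing building blocks instead of whole schemes.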


Metadata
Title
AutoWM: a novel domain-specific tool for universal multi-/many-core accelerations of the WRF cloud microphysics
Authors
Peng Zhang
Chao Yang
Yulong Ao
Publication date
12 August 2020
Publisher
Springer US
Published in
Cluster Computing, Issue 2/2021
Print ISSN: 1386-7857
Electronic ISSN: 1573-7543
DOI
https://doi.org/10.1007/s10586-020-03170-7
