2020 | OriginalPaper | Book chapter

Accelerating Lattice Boltzmann Method by Fully Exposing Vectorizable Loops

Authors: Bin Qu, Song Liu, Hailong Huang, Jiajun Yuan, Qian Wang, Weiguo Wu

Published in: Algorithms and Architectures for Parallel Processing

Publisher: Springer International Publishing

Abstract

The Lattice Boltzmann Method (LBM) plays an important role in computational fluid dynamics (CFD) applications, so accelerating LBM computation lowers simulation costs for many industries. However, loop-carried dependencies in LBM kernels prevent loops from being vectorized, and general-purpose compilers therefore miss many vectorization opportunities. This paper proposes a SIMD-aware loop transformation algorithm that fully exposes the vectorizable loops in LBM kernels. The algorithm first identifies the most promising vectorizable loops according to a defined dependence table. It then applies appropriate loop transformations and array copying to legalize loop-carried dependencies, so that the compiler can vectorize the identified loops automatically. Experiments on an Intel Xeon Gold 6140 server show that the algorithm significantly raises the ratio of vectorized loops to all loops in LBM kernels. It also outperforms the Intel C++ compiler and a polyhedral optimizer, accelerating LBM computation by 147% and 120%, respectively, in average lattice update speed.
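
To make the idea of a loop-carried dependence blocking vectorization concrete, the following is a minimal sketch in C. It is not the algorithm from the chapter and not an LBM kernel: it uses loop interchange, one classic loop transformation, on a hypothetical two-dimensional recurrence. The array names, the sizes NT and NX, the factor 0.5 and the helper functions are all assumptions made for illustration.

```c
#define NT 32   /* hypothetical number of dependent steps (rows) */
#define NX 256  /* hypothetical number of independent sites (columns) */

/* Original order: the inner t loop carries a flow dependence, because
   row t reads the value that row t-1 has just written. The inner loop
   also walks memory with stride NX. */
void sweep_original(double a[NT][NX], double b[NT][NX]) {
    for (int x = 0; x < NX; ++x)
        for (int t = 1; t < NT; ++t)
            a[t][x] = 0.5 * a[t - 1][x] + b[t][x];
}

/* Interchanged order: the dependence (distance 1 in t) is now carried
   by the outer loop. The inner x loop has independent, unit-stride
   iterations, which a compiler may auto-vectorize. The interchange is
   legal because the dependence vector (0,1) in (x,t) order becomes
   (1,0) in (t,x) order, which is still lexicographically positive. */
void sweep_interchanged(double a[NT][NX], double b[NT][NX]) {
    for (int t = 1; t < NT; ++t)
        for (int x = 0; x < NX; ++x)
            a[t][x] = 0.5 * a[t - 1][x] + b[t][x];
}

/* Deterministic test data: row 0 of a is the initial state, the
   remaining rows of a are overwritten by the sweep. */
static void fill(double a[NT][NX], double b[NT][NX]) {
    for (int t = 0; t < NT; ++t)
        for (int x = 0; x < NX; ++x) {
            a[t][x] = (t == 0) ? 1.0 + 0.01 * x : 0.0;
            b[t][x] = 0.001 * (t + x);
        }
}

/* Runs both variants on identical input and returns the largest
   absolute difference between their results. */
double max_difference(void) {
    static double a1[NT][NX], a2[NT][NX], b[NT][NX];
    double worst = 0.0;
    fill(a1, b);
    fill(a2, b);
    sweep_original(a1, b);
    sweep_interchanged(a2, b);
    for (int t = 0; t < NT; ++t)
        for (int x = 0; x < NX; ++x) {
            double d = a1[t][x] - a2[t][x];
            if (d < 0.0) d = -d;
            if (d > worst) worst = d;
        }
    return worst;
}
```

Calling max_difference() from a small driver is one way to check that the two loop orders compute the same values. Whether the inner loop is actually vectorized depends on the compiler and its flags; with GCC, for example, compiling with -O3 -fopt-info-vec reports which loops were vectorized.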

Metadata
Title
Accelerating Lattice Boltzmann Method by Fully Exposing Vectorizable Loops
Authors
Bin Qu
Song Liu
Hailong Huang
Jiajun Yuan
Qian Wang
Weiguo Wu
Copyright Year
2020
DOI
https://doi.org/10.1007/978-3-030-38991-8_8