2012 | OriginalPaper | Chapter
A Multi-GPU Implementation of a D2Q37 Lattice Boltzmann Code
Authors: Luca Biferale, Filippo Mantovani, Marcello Pivanti, Fabio Pozzati, Mauro Sbragaglia, Andrea Scagliarini, Sebastiano Fabio Schifano, Federico Toschi, Raffaele Tripiccione
Published in: Parallel Processing and Applied Mathematics
Publisher: Springer Berlin Heidelberg
We describe a parallel implementation of a compressible Lattice Boltzmann code on a multi-GPU cluster based on Nvidia Fermi processors. We analyze how to optimize the algorithm for GP-GPU architectures, describe the implementation choices that we have adopted, and compare our performance results with an implementation optimized for latest-generation multi-core CPUs. Our program runs at ≈ 30% of the double-precision peak performance of one GPU and shows almost linear scaling when run on the multi-GPU cluster.