Skip to main content

2018 | OriginalPaper | Buchkapitel

A Space and Bandwidth Efficient Multicore Algorithm for the Particle-in-Cell Method

verfasst von : Yann Barsamian, Arthur Charguéraud, Alain Ketterlin

Erschienen in: Parallel Processing and Applied Mathematics

Verlag: Springer International Publishing

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

The Particle-in-Cell (PIC) method allows solving partial differential equation through simulations, with important applications in plasma physics. To simulate thousands of billions of particles on clusters of multicore machines, prior work has proposed hybrid algorithms that combine domain decomposition and particle decomposition with carefully optimized algorithms for handling particles processed on each multicore socket. Regarding the multicore processing, existing algorithms either suffer from suboptimal execution time, due to sorting operations or use of atomic instructions, or suffer from suboptimal space usage. In this paper, we propose a novel parallel algorithm for two-dimensional PIC simulations on multicore hardware that features asymptotically-optimal memory consumption, and does not perform unnecessary accesses to the main memory. In practice, our algorithm reaches 65% of the maximum bandwidth, and shows excellent scalability on the classical Landau damping and two-stream instability test cases.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literatur
2.
Zurück zum Zitat Barnes, J., Hut, P.: A hierarchical \(O(N \log N)\) force-calculation algorithm. Nature 324(3), 446–449 (1986)CrossRef Barnes, J., Hut, P.: A hierarchical \(O(N \log N)\) force-calculation algorithm. Nature 324(3), 446–449 (1986)CrossRef
3.
Zurück zum Zitat Barsamian, Y., Hirstoaga, S.A., Violard, É.: Efficient data structures for a hybrid parallel and vectorized particle-in-cell code. In: 2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 1168–1177 (2017) Barsamian, Y., Hirstoaga, S.A., Violard, É.: Efficient data structures for a hybrid parallel and vectorized particle-in-cell code. In: 2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 1168–1177 (2017)
4.
Zurück zum Zitat Bender, M.A., Demaine, E.D., Farach-Colton, M.: Cache-oblivious b-trees. In: Proceedings of the 41st Annual Symposium on Foundations of Computer Science (FOCS), pp. 399–409 (2000) Bender, M.A., Demaine, E.D., Farach-Colton, M.: Cache-oblivious b-trees. In: Proceedings of the 41st Annual Symposium on Foundations of Computer Science (FOCS), pp. 399–409 (2000)
5.
Zurück zum Zitat Birdsall, C.K., Fuss, D.: Clouds-in-clouds, clouds-in-cells physics for many-body plasma simulation. J. Comput. Phys. 3, 494–511 (1969)CrossRefMATH Birdsall, C.K., Fuss, D.: Clouds-in-clouds, clouds-in-cells physics for many-body plasma simulation. J. Comput. Phys. 3, 494–511 (1969)CrossRefMATH
6.
Zurück zum Zitat Birdsall, C.K., Langdon, A.B.: Plasma Physics via Computer Simulation. McGraw-Hill, New York (1985) Birdsall, C.K., Langdon, A.B.: Plasma Physics via Computer Simulation. McGraw-Hill, New York (1985)
7.
Zurück zum Zitat Bowers, K.J.: Accelerating a particle-in-cell simulation using a hybrid counting sort. J. Comput. Phys. 173(2), 393–411 (2001)CrossRefMATH Bowers, K.J.: Accelerating a particle-in-cell simulation using a hybrid counting sort. J. Comput. Phys. 173(2), 393–411 (2001)CrossRefMATH
8.
Zurück zum Zitat Bowers, K.J., Albright, B.J., Yin, L., Bergen, B., Kwan, T.J.T.: Ultrahigh performance three-dimensional electromagnetic relativistic kinetic plasma simulation. Phys. Plasmas 15(5), 055703 (2008)CrossRef Bowers, K.J., Albright, B.J., Yin, L., Bergen, B., Kwan, T.J.T.: Ultrahigh performance three-dimensional electromagnetic relativistic kinetic plasma simulation. Phys. Plasmas 15(5), 055703 (2008)CrossRef
9.
Zurück zum Zitat Bussmann, M., Burau, H., Cowan, T.E., Debus, A., Huebl, A., Juckeland, G., Kluge, T., Nagel, W.E., Pausch, R., Schmitt, F., Schramm, U., Schuchart, J., Widera, R.: Radiative signatures of the relativistic Kelvin-Helmholtz instability. In: Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis (SC), pp. 5:1–5:12 (2013) Bussmann, M., Burau, H., Cowan, T.E., Debus, A., Huebl, A., Juckeland, G., Kluge, T., Nagel, W.E., Pausch, R., Schmitt, F., Schramm, U., Schuchart, J., Widera, R.: Radiative signatures of the relativistic Kelvin-Helmholtz instability. In: Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis (SC), pp. 5:1–5:12 (2013)
10.
Zurück zum Zitat Decyk, V.K., Singh, T.V.: Particle-in-cell algorithms for emerging computer architectures. Comput. Phys. Commun. 185(3), 708–719 (2014)MathSciNetCrossRef Decyk, V.K., Singh, T.V.: Particle-in-cell algorithms for emerging computer architectures. Comput. Phys. Commun. 185(3), 708–719 (2014)MathSciNetCrossRef
11.
Zurück zum Zitat Durand, M.: PaVo. An adaptative parallel sorting algorithm. Ph.D. thesis, Université de Grenoble (2013) Durand, M.: PaVo. An adaptative parallel sorting algorithm. Ph.D. thesis, Université de Grenoble (2013)
12.
Zurück zum Zitat Durand, M., Raffin, B., Faure, F.: A packed memory array to keep moving particles sorted. In: Workshop on Virtual Reality Interaction and Physical Simulation (VRIPHYS) (2012) Durand, M., Raffin, B., Faure, F.: A packed memory array to keep moving particles sorted. In: Workshop on Virtual Reality Interaction and Physical Simulation (VRIPHYS) (2012)
14.
Zurück zum Zitat Germaschewski, K., Fox, W., Abbott, S., Ahmadi, N., Maynard, K., Wang, L., Ruhl, H., Bhattacharjee, A.: The plasma simulation code: a modern particle-in-cell code with patch-based load-balancing. J. Comput. Phys. 318, 305–326 (2016)MathSciNetCrossRefMATH Germaschewski, K., Fox, W., Abbott, S., Ahmadi, N., Maynard, K., Wang, L., Ruhl, H., Bhattacharjee, A.: The plasma simulation code: a modern particle-in-cell code with patch-based load-balancing. J. Comput. Phys. 318, 305–326 (2016)MathSciNetCrossRefMATH
15.
Zurück zum Zitat Hanson, D.R.: Fast allocation and deallocation of memory based on object lifetimes. Softw. Pract. Exper. 20(1), 5–12 (1990)CrossRef Hanson, D.R.: Fast allocation and deallocation of memory based on object lifetimes. Softw. Pract. Exper. 20(1), 5–12 (1990)CrossRef
16.
Zurück zum Zitat Hockney, R.W., Eastwood, J.W.: Computer Simulation Using Particles. Institute of Physics, Philadelphia (1988)CrossRefMATH Hockney, R.W., Eastwood, J.W.: Computer Simulation Using Particles. Institute of Physics, Philadelphia (1988)CrossRefMATH
17.
Zurück zum Zitat Jocksch, A., Hariri, F., Tran, T.-M., Brunner, S., Gheller, C., Villard, L.: A bucket sort algorithm for the particle-in-cell method on manycore architectures. In: Wyrzykowski, R., Deelman, E., Dongarra, J., Karczewski, K., Kitowski, J., Wiatr, K. (eds.) PPAM 2015. LNCS, vol. 9573, pp. 43–52. Springer, Cham (2016)CrossRef Jocksch, A., Hariri, F., Tran, T.-M., Brunner, S., Gheller, C., Villard, L.: A bucket sort algorithm for the particle-in-cell method on manycore architectures. In: Wyrzykowski, R., Deelman, E., Dongarra, J., Karczewski, K., Kitowski, J., Wiatr, K. (eds.) PPAM 2015. LNCS, vol. 9573, pp. 43–52. Springer, Cham (2016)CrossRef
18.
Zurück zum Zitat Kim, C.C., Parker, S.E.: Massively parallel three-dimensional toroidal gyrokinetic flux-tube turbulence simulation. J. Comput. Phys. 161(2), 589–604 (2000)CrossRefMATH Kim, C.C., Parker, S.E.: Massively parallel three-dimensional toroidal gyrokinetic flux-tube turbulence simulation. J. Comput. Phys. 161(2), 589–604 (2000)CrossRefMATH
19.
Zurück zum Zitat McCalpin, J.D.: Memory bandwidth and machine balance in current high performance computers. In: IEEE Computer Society Technical Committee on Computer Architecture Newsletter (TCCA), pp. 19–25 (1995) McCalpin, J.D.: Memory bandwidth and machine balance in current high performance computers. In: IEEE Computer Society Technical Committee on Computer Architecture Newsletter (TCCA), pp. 19–25 (1995)
20.
Zurück zum Zitat Nakashima, H., Summura, Y., Kikura, K., Miyake, Y.: Large scale manycore-aware PIC simulation with efficient particle binning. In: 2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pp. 202–212 (2017) Nakashima, H., Summura, Y., Kikura, K., Miyake, Y.: Large scale manycore-aware PIC simulation with efficient particle binning. In: 2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pp. 202–212 (2017)
21.
Zurück zum Zitat Nicol, D.M.: Rectilinear partitioning of irregular data parallel computations. J. Parallel Distr. Com. 23(2), 119–134 (1994)CrossRef Nicol, D.M.: Rectilinear partitioning of irregular data parallel computations. J. Parallel Distr. Com. 23(2), 119–134 (1994)CrossRef
22.
Zurück zum Zitat Surmin, I., Bashinov, A., Bastrakov, S., Efimenko, E., Gonoskov, A., Meyerov, I.: Dynamic load balancing based on rectilinear partitioning in particle-in-cell plasma simulation. In: Parallel Computing Technologies: 13th International Conference (PaCT), pp. 107–119 (2015) Surmin, I., Bashinov, A., Bastrakov, S., Efimenko, E., Gonoskov, A., Meyerov, I.: Dynamic load balancing based on rectilinear partitioning in particle-in-cell plasma simulation. In: Parallel Computing Technologies: 13th International Conference (PaCT), pp. 107–119 (2015)
23.
Zurück zum Zitat Sáez, X., Soba, A., Sánchez, E., Kleiber, R., Castejón, F., Cela, J.M.: Improvements of the particle-in-cell code EUTERPE for petascaling machines. Comput. Phys. Commun. 182(9), 2047–2051 (2011)CrossRef Sáez, X., Soba, A., Sánchez, E., Kleiber, R., Castejón, F., Cela, J.M.: Improvements of the particle-in-cell code EUTERPE for petascaling machines. Comput. Phys. Commun. 182(9), 2047–2051 (2011)CrossRef
24.
Zurück zum Zitat Tskhakaya, D., Schneider, R.: Optimization of PIC codes by improved memory management. J. Comput. Phys. 225(1), 829–839 (2007)CrossRefMATH Tskhakaya, D., Schneider, R.: Optimization of PIC codes by improved memory management. J. Comput. Phys. 225(1), 829–839 (2007)CrossRefMATH
25.
Zurück zum Zitat Vincenti, H., Lobet, M., Lehe, R., Sasanka, R., Vay, J.L.: An efficient and portable SIMD algorithm for charge/current deposition in particle-in-cell codes. Comput. Phys. Commun. 210, 145–154 (2016)CrossRefMATH Vincenti, H., Lobet, M., Lehe, R., Sasanka, R., Vay, J.L.: An efficient and portable SIMD algorithm for charge/current deposition in particle-in-cell codes. Comput. Phys. Commun. 210, 145–154 (2016)CrossRefMATH
26.
Zurück zum Zitat Winkel, M., Speck, R., Hübner, H., Arnold, L., Krause, R., Gibbon, P.: A massively parallel, multi-disciplinary Barnes-Hut tree code for extreme-scale N-body simulations. Comput. Phys. Commun. 183(4), 880–889 (2012)MathSciNetCrossRef Winkel, M., Speck, R., Hübner, H., Arnold, L., Krause, R., Gibbon, P.: A massively parallel, multi-disciplinary Barnes-Hut tree code for extreme-scale N-body simulations. Comput. Phys. Commun. 183(4), 880–889 (2012)MathSciNetCrossRef
Metadaten
Titel
A Space and Bandwidth Efficient Multicore Algorithm for the Particle-in-Cell Method
verfasst von
Yann Barsamian
Arthur Charguéraud
Alain Ketterlin
Copyright-Jahr
2018
DOI
https://doi.org/10.1007/978-3-319-78024-5_13