Published in: The Journal of Supercomputing, Issue 5/2018

15.12.2017

Efficient sparse matrix-delayed vector multiplication for discretized neural field model

Author: Jan Fousek


Abstract

Computational models of the human brain provide an important tool for studying the principles behind brain function and disease. To achieve whole-brain simulation, models are formulated at the level of neuronal populations as systems of delayed differential equations. In this paper, we show that the integration of large systems of sparsely connected neural masses resembles the well-studied sparse matrix-vector multiplication; however, because of the delayed contributions, it differs in the data access pattern to the vectors. To improve data locality, we propose a combination of node reordering and tiled schedules derived from the connectivity matrix of the particular system, which allows multiple integration steps to be performed within a tile. We present two schedules: one that processes the tiles serially and one that allows the tiles to be processed in parallel. We evaluate the presented schedules and show speedups of up to \(2\times\) on a single-socket CPU and up to \(1.25\times\) on a Xeon Phi accelerator.
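
As an illustration of how the delayed product differs from an ordinary sparse matrix-vector multiplication, the following minimal sketch computes one coupling step in pure Python. It is not the implementation from the paper: the CSR-style arrays, the integer per-connection delays measured in time steps, the ring-buffer history, and the name delayed_spmv are all assumptions made for this example. The point it shows is that each nonzero reads the source vector at its own past time step, so the accesses scatter over several history rows instead of a single vector.

```python
def delayed_spmv(indptr, indices, weights, delays, history, t):
    """Compute y[i] = sum over stored nonzeros k in row i of
    weights[k] * x_j(t - delays[k]), with j = indices[k].

    Assumed layout (hypothetical, for illustration only):
    - indptr, indices, weights: CSR arrays of the connectivity matrix
    - delays[k]: delay of nonzero k in whole time steps
    - history[s % H][j]: state of node j at step s (ring buffer of H rows)
    """
    horizon = len(history)
    if max(delays, default=0) >= horizon:
        raise ValueError("history buffer is shorter than the largest delay")
    n_rows = len(indptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        for k in range(indptr[i], indptr[i + 1]):
            j = indices[k]
            # Each nonzero reads a different row of the history buffer.
            acc += weights[k] * history[(t - delays[k]) % horizon][j]
        y[i] = acc
    return y


# Three nodes; row 0 has two delayed inputs, row 1 one undelayed input,
# row 2 has no inputs.
indptr = [0, 2, 3, 3]
indices = [1, 2, 0]
weights = [0.5, 2.0, 1.0]
delays = [1, 2, 0]
# State of node j at step s is 10 * s + j, for steps 0, 1, 2.
history = [[0, 1, 2], [10, 11, 12], [20, 21, 22]]
y = delayed_spmv(indptr, indices, weights, delays, history, t=2)
print(y)  # [9.5, 20.0, 0.0]
```

With all delays set to zero the loop reduces to a plain CSR sparse matrix-vector product, which is why reordering and tiling techniques from that setting are a natural starting point.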


Footnotes
1
For anatomical and functional atlases, the number of areas ranges typically from 80 to 200, e.g., [6, 34].
 
References
1.
Bojak I, Oostendorp TF, Reid AT, Kötter R (2011) Towards a model-based integration of co-registered electroencephalography/functional magnetic resonance imaging data with realistic neural population meshes. Philos Trans R Soc A Math Phys Eng Sci 369(1952):3785–3801
2.
3.
Byun JH, Lin R, Yelick KA, Demmel J (2012) Autotuning sparse matrix-vector multiplication for multicore. Technical report UCB/EECS-2012-215, EECS Department, University of California, Berkeley
5.
Coombes S, beim Graben P, Potthast R, Wright J (2014) Neural fields. Springer, Berlin
6.
Craddock RC, James GA, Holtzheimer PE, Hu XP, Mayberg HS (2012) A whole brain fMRI atlas generated via spatially constrained spectral clustering. Hum Brain Mapp 33(8):1914–1928
7.
Cuthill E, McKee J (1969) Reducing the bandwidth of sparse symmetric matrices. In: Proceedings of the 1969 24th National Conference. ACM, pp 157–172
8.
Datta K, Kamil S, Williams S, Oliker L, Shalf J, Yelick K (2009) Optimization and performance modeling of stencil computations on modern microprocessors. SIAM Rev 51(1):129–159
9.
Demmel J, Hoemmen M, Mohiyuddin M, Yelick K (2008) Avoiding communication in sparse matrix computations. In: IEEE International Symposium on Parallel and Distributed Processing, 2008. IPDPS 2008. IEEE, pp 1–12
10.
Douglas CC, Hu J, Kowarschik M, Rüde U, Weiß C (2000) Cache optimization for structured and unstructured grid multigrid. Electron Trans Numer Anal 10:21–40
11.
Geuzaine C, Remacle JF (2009) Gmsh: a 3-D finite element mesh generator with built-in pre- and post-processing facilities. Int J Numer Methods Eng 79(11):1309–1331
12.
Green KR, van Veen L (2014) Open-source tools for dynamical analysis of Liley’s mean-field cortex model. J Comput Sci 5(3):507–516
13.
Grosser T, Cohen A, Holewinski J, Sadayappan P, Verdoolaege S (2014) Hybrid hexagonal/classical tiling for GPUs. In: Proceedings of Annual IEEE/ACM International Symposium on Code Generation and Optimization. ACM, p 66
14.
Jirsa VK (2009) Neural field dynamics with local and global connectivity and time delay. Philos Trans R Soc A Math Phys Eng Sci 367(1891):1131–1143
15.
Korch M, Rauber T (2010) Parallel low-storage Runge–Kutta solvers for ODE systems with limited access distance. Int J High Perform Comput Appl 25(2):236–255
16.
L’Ecuyer P, Munger D, Oreshkin B, Simard R (2017) Random numbers for parallel computers: requirements and methods, with emphasis on GPUs. Math Comput Simul 135:3–17
17.
Leon PS, Knock SA, Woodman MM, Domide L, Mersmann J, McIntosh AR, Jirsa V (2013) The Virtual Brain: a simulator of primate brain network dynamics. Front Neuroinform 7:36–47
18.
Liu X, Chow E, Vaidyanathan K, Smelyanskiy M (2012) Improving the performance of dynamical simulations via multiple right-hand sides. In: 2012 IEEE 26th International on Parallel & Distributed Processing Symposium (IPDPS). IEEE, pp 36–47
19.
Malas T, Hager G, Ltaief H, Keyes D (2015) Multi-dimensional intra-tile parallelization for memory-starved stencil computations. arXiv preprint arXiv:1510.04995
21.
Morlan J, Kamil S, Fox A (2012) Auto-tuning the matrix powers kernel with SEJITS. In: Daydé M, Marques O, Nakajima K (eds) High performance computing for computational science-VECPAR 2012. Springer, pp 391–403
22.
Orozco D, Garcia E, Gao G (2010) Locality optimization of stencil applications using data dependency graphs. In: International Workshop on Languages and Compilers for Parallel Computing. Springer, pp 77–91
23.
Proix T, Spiegler A, Schirner M, Rothmeier S, Ritter P, Jirsa VK (2016) How do parcellation size and short-range connectivity affect dynamics in large-scale brain network models? NeuroImage 142:135–149
24.
Rafique A, Constantinides GA, Kapre N (2015) Communication optimization of iterative sparse matrix-vector multiply on GPUs and FPGAs. IEEE Trans Parallel Distrib Syst 26(1):24–34
25.
Sanz-Leon P, Knock SA, Spiegler A, Jirsa VK (2015) Mathematical framework for large-scale brain network modeling in The Virtual Brain. Neuroimage 111:385–430
26.
Spiegler A, Jirsa V (2013) Systematic approximations of neural fields through networks of neural masses in The Virtual Brain. NeuroImage 83:704–725
27.
Strout M, Carter L, Ferrante J (2001) Rescheduling for locality in sparse matrix computations. In: Computational Science - ICCS 2001. pp 137–146
28.
Strout MM, Carter L, Ferrante J, Kreaseck B (2004) Sparse tiling for stationary iterative methods. Int J High Perform Comput Appl 18(1):95–113
29.
Strout MM, LaMielle A, Carter L, Ferrante J, Kreaseck B, Olschanowsky C (2016) An approach for code generation in the sparse polyhedral framework. Parallel Comput 53:32–57
30.
Thapliyal H, Arabnia HR (2006) A reversible programmable logic array (RPLA) using Fredkin and Feynman gates for industrial electronics and applications. In: Proceedings of the 2006 International Conference on Computer Design & Conference on Computing in Nanotechnology, CDES 2006, Las Vegas, 26–29 June 2006. pp 70–76
31.
Thapliyal H, Arabnia HR, Bajpai R, Sharma KK (2007) Combined integer and variable precision (CIVP) floating point multiplication architecture for FPGAs. In: Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, PDPTA 2007, Las Vegas, 25–28 June 2007, Vol 1. pp 449–452
33.
Treibig J, Hager G, Wellein G (2010) LIKWID: A lightweight performance-oriented tool suite for x86 multicore environments. In: Proceedings of PSTI2010, the First International Workshop on Parallel Software Tools and Tool Infrastructures, San Diego
34.
Tzourio-Mazoyer N, Landeau B, Papathanassiou D, Crivello F, Etard O, Delcroix N, Mazoyer B, Joliot M (2002) Automated anatomical labeling of activations in SPM using a macroscopic anatomical parcellation of the MNI MRI single-subject brain. Neuroimage 15(1):273–289
35.
Venkat A, Shantharam M, Hall M, Strout MM (2014) Non-affine extensions to polyhedral code generation. In: Proceedings of Annual IEEE/ACM International Symposium on Code Generation and Optimization. ACM, p 185
36.
Williams S, Oliker L, Vuduc R, Shalf J, Yelick K, Demmel J (2009) Optimization of sparse matrix-vector multiplication on emerging multicore platforms. Parallel Comput 35(3):178–194
37.
Wulf WA, McKee SA (1995) Hitting the memory wall: implications of the obvious. ACM SIGARCH Comput Archit News 23(1):20–24
38.
Yzelman AJN, Roose D (2014) High-level strategies for parallel shared-memory sparse matrix-vector multiplication. IEEE Trans Parallel Distrib Syst 25(1):116–125
Metadata
Title
Efficient sparse matrix-delayed vector multiplication for discretized neural field model
Author
Jan Fousek
Publication date
15.12.2017
Publisher
Springer US
Published in
The Journal of Supercomputing / Issue 5/2018
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI
https://doi.org/10.1007/s11227-017-2194-4
