2015 | OriginalPaper | Chapter

Automatic Parameter Tuning of Three-Dimensional Tiled FDTD Kernel

Authors: Takeshi Minami, Motoharu Hibino, Tasuku Hiraishi, Takeshi Iwashita, Hiroshi Nakashima

Published in: High Performance Computing for Computational Science -- VECPAR 2014

Publisher: Springer International Publishing

Abstract

This paper introduces an automatic tuning method for the tiling parameters required in an implementation of the three-dimensional FDTD method based on time-space tiling. In this tuning process, an appropriate range for the tile size is first determined by trial experiments using cubic tiles. The tile shape is then optimized by using the Monte Carlo method. The tiled FDTD kernel was multi-threaded and its performance with the tuned parameters was evaluated on multi-core processors. When compared with a naively implemented kernel, the performance of the tuned FDTD kernel was improved by more than a factor of two.
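
The two-stage procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `spread` rule that turns the best cubic edge into a search range, the sample count, and the synthetic cost function are all assumptions made for illustration. In a real setting, `measure` would run and time the tiled FDTD kernel for the given tile dimensions.

```python
import random


def tune_tile_shape(measure, cubic_sizes, n_samples=200, spread=2, seed=0):
    """Two-stage search: cubic trial runs, then random sampling of tile shapes.

    measure(tx, ty, tz) returns a cost (e.g. seconds per run); lower is better.
    All parameter names here are hypothetical.
    """
    rng = random.Random(seed)

    # Stage 1: trial runs with cubic tiles pick a promising edge length.
    best_edge = min(cubic_sizes, key=lambda s: measure(s, s, s))
    best_shape = (best_edge, best_edge, best_edge)
    best_cost = measure(*best_shape)

    # Assumed range rule: each edge may vary within a factor `spread`
    # of the best cubic edge. The paper's actual rule is not stated here.
    lo = max(1, best_edge // spread)
    hi = best_edge * spread

    # Stage 2: Monte Carlo sampling of non-cubic shapes inside that range.
    for _ in range(n_samples):
        shape = (rng.randint(lo, hi), rng.randint(lo, hi), rng.randint(lo, hi))
        cost = measure(*shape)
        if cost < best_cost:
            best_shape, best_cost = shape, cost
    return best_shape, best_cost


def synthetic_cost(tx, ty, tz):
    """Hypothetical stand-in for timing the tiled kernel (not a real model)."""
    volume = tx * ty * tz
    # Penalise tiles far from a made-up ideal volume, and short x extents.
    return abs(volume - 32 * 32 * 32) / 32768 + 8.0 / tx


if __name__ == "__main__":
    shape, cost = tune_tile_shape(synthetic_cost, [8, 16, 32, 64])
    print("best tile shape:", shape, "cost:", cost)
```

Because the cubic result seeds the random search, the returned shape is never worse than the best cubic tile under the same cost function. The fixed seed only makes the sketch reproducible; with a real timing function, measurement noise would also need to be handled, for example by repeating each run.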

Metadata
Title
Automatic Parameter Tuning of Three-Dimensional Tiled FDTD Kernel
Authors
Takeshi Minami
Motoharu Hibino
Tasuku Hiraishi
Takeshi Iwashita
Hiroshi Nakashima
Copyright Year
2015
DOI
https://doi.org/10.1007/978-3-319-17353-5_24
