Published in: The Journal of Supercomputing 2/2017

10-06-2016

Optimal dynamic data layouts for 2D FFT on 3D memory integrated FPGA

Authors: Ren Chen, Shreyas G. Singapura, Viktor K. Prasanna

Abstract

FPGAs have been widely used for accelerating various applications. For many data-intensive applications, memory bandwidth limits performance. 3D memories with through-silicon-via connections provide potential solutions to the latency and bandwidth limitations. In this paper, we revisit the classic 2D FFT problem to evaluate the performance of 3D memory integrated FPGA. To fully utilize the fine-grained parallelism in 3D memory, data layouts that take into account the structure and organization of the memory are required. We propose dynamic data layouts for optimizing the performance of the 3D architecture. In 2D FFT, data are accessed in row-major order in the first phase and in column-major order in the second phase. The column-major order results in high memory latency and low bandwidth because of the high row activation overhead of the memory. Using the proposed dynamic data layouts, we improve memory access performance in the second phase without degrading the performance of the first phase. With parallelism employed in the third dimension of the memory, data parallelism can be increased to further improve the performance. We adopt a model-based approach for 3D memory, and we perform experiments on the FPGA to validate our analysis and evaluate the performance. Compared with the baseline architecture, our approach achieves up to \(40\times\) peak memory bandwidth utilization for columnwise FFT, resulting in approximately \(97\,\%\) improvement in throughput for the complete 2D FFT application.
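The row activation problem described in the abstract can be illustrated with a small counting model. The sketch below is not the layout proposed in the paper; it assumes a single memory bank with an open-row policy, and the matrix size `N`, the DRAM row capacity `DRAM_ROW`, the tile edge `TILE`, and all function names are hypothetical values chosen for illustration. It counts how many times a new DRAM row must be activated when an N x N matrix is traversed row by row and column by column, first under a plain row-major layout and then under a simple tiled layout.

```python
# Illustrative model only: one bank, open-row policy, sizes are assumptions.
N = 64          # matrix is N x N elements (assumed)
DRAM_ROW = 64   # elements that fit in one DRAM row (assumed)
TILE = 8        # tile edge; TILE * TILE == DRAM_ROW so one tile fills one DRAM row


def row_major(r, c):
    """Linear address of element (r, c) in a row-major layout."""
    return r * N + c


def tiled(r, c):
    """Linear address in a tiled layout: TILE x TILE tiles stored contiguously."""
    tiles_per_row = N // TILE
    tile_index = (r // TILE) * tiles_per_row + (c // TILE)
    offset = (r % TILE) * TILE + (c % TILE)
    return tile_index * TILE * TILE + offset


def traversal(order):
    """Yield (r, c) pairs for the row-wise or column-wise FFT phase."""
    for outer in range(N):
        for inner in range(N):
            yield (outer, inner) if order == "row" else (inner, outer)


def count_activations(order, layout):
    """Count DRAM row changes seen by a stream of single-element accesses."""
    activations = 0
    open_row = None
    for r, c in traversal(order):
        dram_row = layout(r, c) // DRAM_ROW
        if dram_row != open_row:
            activations += 1
            open_row = dram_row
    return activations


for name, layout in (("row-major", row_major), ("tiled", tiled)):
    for order in ("row", "column"):
        print(name, order, count_activations(order, layout))
```

Under these assumptions the row-major layout needs 64 activations in the row phase but 4096 in the column phase, since every column access lands in a different DRAM row. The static tiled layout cuts the column phase to 512 activations but raises the row phase to 512 as well, which suggests why a layout that changes between phases is attractive.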

Metadata
Title: Optimal dynamic data layouts for 2D FFT on 3D memory integrated FPGA
Authors: Ren Chen, Shreyas G. Singapura, Viktor K. Prasanna
Publication date: 10-06-2016
Publisher: Springer US
Published in: The Journal of Supercomputing, Issue 2/2017
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI: https://doi.org/10.1007/s11227-016-1772-1