2013 | OriginalPaper | Chapter

4. GRAPE and GRAPE-DR

Author: Junichiro Makino

Published in: GPU Solutions to Multi-scale Problems in Science and Engineering

Publisher: Springer Berlin Heidelberg

Abstract

We describe the architecture and performance of GRAPE-DR (Greatly Reduced Array of Processor Elements with Data Reduction). It operates as an accelerator attached to general-purpose PCs or x86-based servers. The processor chip of a GRAPE-DR board has 512 cores operating at a clock frequency of 400 MHz. The peak speed of a processor chip is 410 Gflops (single precision) or 205 Gflops (double precision). A GRAPE-DR board consists of four GRAPE-DR chips, each with its own local memory of 256 MB. Thus, a GRAPE-DR board has a theoretical peak speed of 1.64 Tflops in single precision (SP) and 0.82 Tflops in double precision (DP). Its power consumption is around 200 W. The application area of GRAPE-DR covers particle-based simulations such as astrophysical many-body simulations and molecular-dynamics simulations, quantum chemistry calculations, various applications that require dense matrix operations, and many other compute-intensive applications.

The architecture of GRAPE-DR is in many ways similar to that of modern GPUs, since their evolutionary tracks are rather similar. GPUs have evolved from specialized hardwired logic for specific operations to more general-purpose computing engines, in order to meet the need to perform complex shading and other operations. The predecessor of GRAPE-DR is GRAPE (GRAvity PipE), a specialized pipeline processor for gravitational \(N\)-body simulations. We changed the architecture to extend the range of applications.

There are two main differences between GRAPE-DR and GPGPU. One is the transistor and power efficiency. With 90 nm technology and 400M transistors, we have integrated 512 processor cores and achieved a speed of 400 Gflops at a 400 MHz clock and 50 W. A Fermi processor from NVIDIA integrates 448 processors with 3B transistors and achieves a speed of 1.03 Tflops at 1.15 GHz and 247 W. Thus, Fermi achieves about 2.5 times the speed of GRAPE-DR, with a 2.9 times higher clock, about 8 times as many transistors, and 5 times the power consumption.
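The peak-speed and efficiency figures quoted here can be checked with a few lines of arithmetic. The following is a minimal sketch, not taken from the chapter: the flops-per-cycle values (2 in single precision, 1 in double precision per core) are an assumption chosen because they reproduce the quoted 410 and 205 Gflops, and the derived "Gflops per watt" and "Gflops per million transistors" metrics are illustrative names rather than quantities stated in the abstract.

```python
# Figures quoted in the abstract.
CORES_PER_CHIP = 512
CLOCK_MHZ = 400
CHIPS_PER_BOARD = 4

# Assumption (not stated in the abstract): flops per core per clock cycle.
SP_FLOPS_PER_CYCLE = 2
DP_FLOPS_PER_CYCLE = 1


def chip_peak_gflops(flops_per_cycle: int) -> float:
    """Peak Gflops of one chip: cores x clock (MHz) x flops per cycle / 1000."""
    return CORES_PER_CHIP * CLOCK_MHZ * flops_per_cycle / 1000


sp_chip = chip_peak_gflops(SP_FLOPS_PER_CYCLE)      # quoted as 410 Gflops
dp_chip = chip_peak_gflops(DP_FLOPS_PER_CYCLE)      # quoted as 205 Gflops
sp_board_tflops = sp_chip * CHIPS_PER_BOARD / 1000  # quoted as 1.64 Tflops
dp_board_tflops = dp_chip * CHIPS_PER_BOARD / 1000  # quoted as 0.82 Tflops

# Chip-level figures used in the abstract's comparison with Fermi.
grape_dr = {"gflops": 400, "clock_ghz": 0.4, "mtransistors": 400, "watts": 50}
fermi = {"gflops": 1030, "clock_ghz": 1.15, "mtransistors": 3000, "watts": 247}

# Fermi-to-GRAPE-DR ratio for each quantity.
ratios = {key: fermi[key] / grape_dr[key] for key in grape_dr}


def gflops_per_watt(chip: dict) -> float:
    """Illustrative power-efficiency metric (hypothetical helper)."""
    return chip["gflops"] / chip["watts"]


def gflops_per_mtransistor(chip: dict) -> float:
    """Illustrative transistor-efficiency metric (hypothetical helper)."""
    return chip["gflops"] / chip["mtransistors"]


print(f"chip peak: {sp_chip:.1f} SP / {dp_chip:.1f} DP Gflops")
print(f"board peak: {sp_board_tflops:.2f} SP / {dp_board_tflops:.2f} DP Tflops")
for key, value in ratios.items():
    print(f"Fermi / GRAPE-DR {key}: {value:.2f}")
print(f"Gflops/W: GRAPE-DR {gflops_per_watt(grape_dr):.2f}, "
      f"Fermi {gflops_per_watt(fermi):.2f}")
print(f"Gflops/Mtransistor: GRAPE-DR {gflops_per_mtransistor(grape_dr):.2f}, "
      f"Fermi {gflops_per_mtransistor(fermi):.2f}")
```

The computed ratios (roughly 2.6 for speed, 2.9 for clock, 7.5 for transistors, and 4.9 for power) agree with the rounded values in the text. Under these figures, GRAPE-DR delivers about twice the Gflops per watt and about three times the Gflops per transistor of Fermi.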
The other is the external memory bandwidth. GPUs typically have a memory bandwidth of around 100 GB/s, while our GRAPE-DR card, with 4 chips, has only 16 GB/s. Thus, the range of applications is somewhat limited, but for suitable applications, the performance and performance per watt of GRAPE-DR are quite good. The single-card performance on the HPL benchmark is 480 Gflops for a matrix size of 48 k, and the performance with 81 cards is 37 Tflops.
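To put the bandwidth and HPL numbers in context, the sketch below computes the fraction of peak reached by HPL and the ratio of memory bandwidth to peak speed. It rests on assumptions that the abstract does not spell out: HPL is a double-precision benchmark, so the 0.82 Tflops DP board peak is used as the baseline, and the "typical GPU" bytes-per-flop figure pairs the quoted 100 GB/s with the Fermi peak of 1.03 Tflops purely for illustration.

```python
# Figures quoted in the abstract.
DP_PEAK_PER_CARD_GFLOPS = 820      # 0.82 Tflops DP per board
SP_PEAK_PER_CARD_GFLOPS = 1640     # 1.64 Tflops SP per board
HPL_SINGLE_CARD_GFLOPS = 480
HPL_MULTI_CARD_GFLOPS = 37_000     # 37 Tflops
NUM_CARDS = 81
GRAPE_DR_CARD_BW_GBS = 16
TYPICAL_GPU_BW_GBS = 100
FERMI_PEAK_GFLOPS = 1030           # used as a stand-in for "a typical GPU"

# Fraction of DP peak reached by HPL (assumes DP peak is the right baseline).
single_card_efficiency = HPL_SINGLE_CARD_GFLOPS / DP_PEAK_PER_CARD_GFLOPS
multi_card_efficiency = HPL_MULTI_CARD_GFLOPS / (NUM_CARDS * DP_PEAK_PER_CARD_GFLOPS)

# How much of the ideal 81 x single-card rate the 81-card run retains.
scaling_efficiency = HPL_MULTI_CARD_GFLOPS / (NUM_CARDS * HPL_SINGLE_CARD_GFLOPS)

# External memory bandwidth available per peak SP flop (bytes per flop).
grape_dr_bytes_per_flop = GRAPE_DR_CARD_BW_GBS / SP_PEAK_PER_CARD_GFLOPS
gpu_bytes_per_flop = TYPICAL_GPU_BW_GBS / FERMI_PEAK_GFLOPS

print(f"HPL efficiency, 1 card:   {single_card_efficiency:.1%}")
print(f"HPL efficiency, 81 cards: {multi_card_efficiency:.1%}")
print(f"scaling efficiency:       {scaling_efficiency:.1%}")
print(f"bytes/flop GRAPE-DR card: {grape_dr_bytes_per_flop:.4f}")
print(f"bytes/flop typical GPU:   {gpu_bytes_per_flop:.4f}")
```

Under these assumptions a single card reaches a bit under 60% of its DP peak on HPL, and the 81-card run retains about 95% of the ideal linear scaling from one card. The GRAPE-DR card has roughly one tenth of the bytes per flop of the GPU, which is consistent with the statement that it suits compute-bound workloads such as dense matrix operations.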

Metadata
Title
GRAPE and GRAPE-DR
Author
Junichiro Makino
Copyright Year
2013
Publisher
Springer Berlin Heidelberg
DOI
https://doi.org/10.1007/978-3-642-16405-7_4