2013 | OriginalPaper | Book chapter

4. GRAPE and GRAPE-DR

Written by: Junichiro Makino

Published in: GPU Solutions to Multi-scale Problems in Science and Engineering

Publisher: Springer Berlin Heidelberg

Abstract

We describe the architecture and performance of GRAPE-DR (Greatly Reduced Array of Processor Elements with Data Reduction). It operates as an accelerator attached to general-purpose PCs or x86-based servers. The processor chip of a GRAPE-DR board has 512 cores operating at a clock frequency of 400 MHz. The peak speed of a processor chip is 410 Gflops (single precision) or 205 Gflops (double precision). A GRAPE-DR board consists of four GRAPE-DR chips, each with its own local memory of 256 MB. Thus, a GRAPE-DR board has a theoretical peak speed of 1.64 Tflops in single precision and 0.82 Tflops in double precision. Its power consumption is around 200 W. The application area of GRAPE-DR covers particle-based simulations such as astrophysical many-body simulations and molecular-dynamics simulations, quantum chemistry calculations, various applications that require dense matrix operations, and many other compute-intensive applications. The architecture of GRAPE-DR is in many ways similar to that of modern GPUs, since their evolutionary tracks are rather similar. GPUs have evolved from specialized hardwired logic for specific operations into more general-purpose computing engines, in order to meet the need to perform complex shading and other operations. The predecessor of GRAPE-DR is GRAPE (GRAvity PipE), a specialized pipeline processor for gravitational \(N\)-body simulations. We changed the architecture to extend the range of applications. There are two main differences between GRAPE-DR and GPGPUs. The first is transistor and power efficiency. With 90 nm technology and 400M transistors, we integrated 512 processor cores and achieved a speed of 400 Gflops at a 400 MHz clock and 50 W. A Fermi processor from NVIDIA integrates 448 processors with 3B transistors and achieves a speed of 1.03 Tflops at 1.15 GHz and 247 W. Thus, Fermi achieves about 2.5 times the speed of GRAPE-DR, with a 2.9 times higher clock, 8 times more transistors, and 5 times the power consumption.
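The peak-speed and efficiency figures above can be re-derived with a few lines of arithmetic. The following is a minimal sketch, not taken from the chapter: the flops-per-cycle values are an assumption chosen so that the products reproduce the quoted 410 and 205 Gflops, and all function and variable names are hypothetical.

```python
# Figures quoted in the abstract for one GRAPE-DR chip and one board.
CORES_PER_CHIP = 512
CLOCK_GHZ = 0.4
CHIPS_PER_BOARD = 4

# ASSUMPTION (not stated in the abstract): flops per core per clock cycle,
# chosen so that the products match the quoted 410 and 205 Gflops.
SP_FLOPS_PER_CYCLE = 2
DP_FLOPS_PER_CYCLE = 1


def chip_peak_gflops(flops_per_cycle):
    """Peak speed of one chip in Gflops: cores x clock (GHz) x flops/cycle."""
    return CORES_PER_CHIP * CLOCK_GHZ * flops_per_cycle


def board_peak_tflops(flops_per_cycle):
    """Peak speed of a four-chip board in Tflops."""
    return CHIPS_PER_BOARD * chip_peak_gflops(flops_per_cycle) / 1000.0


# Per-chip figures used in the abstract's comparison with Fermi.
GRAPE_DR = {"gflops": 400.0, "clock_ghz": 0.4, "transistors": 400e6, "watts": 50.0}
FERMI = {"gflops": 1030.0, "clock_ghz": 1.15, "transistors": 3e9, "watts": 247.0}


def fermi_to_grape_dr_ratios():
    """Ratio Fermi / GRAPE-DR for each quoted quantity."""
    return {key: FERMI[key] / GRAPE_DR[key] for key in GRAPE_DR}


def gflops_per_watt(chip):
    return chip["gflops"] / chip["watts"]


def gflops_per_million_transistors(chip):
    return chip["gflops"] / (chip["transistors"] / 1e6)


if __name__ == "__main__":
    print("chip peak SP/DP [Gflops]:",
          chip_peak_gflops(SP_FLOPS_PER_CYCLE),
          chip_peak_gflops(DP_FLOPS_PER_CYCLE))
    print("board peak SP/DP [Tflops]:",
          board_peak_tflops(SP_FLOPS_PER_CYCLE),
          board_peak_tflops(DP_FLOPS_PER_CYCLE))
    print("Fermi / GRAPE-DR ratios:", fermi_to_grape_dr_ratios())
    print("Gflops per watt:", gflops_per_watt(GRAPE_DR), gflops_per_watt(FERMI))
```

With the quoted numbers, the unrounded ratios are 2.575 (speed), 2.875 (clock), 7.5 (transistors), and about 4.94 (power), which the abstract rounds to 2.5, 2.9, 8, and 5. Expressed as efficiencies, GRAPE-DR delivers 8 Gflops/W against roughly 4.2 Gflops/W for Fermi, and 1 Gflops per million transistors against roughly 0.34.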
The second difference between GRAPE-DR and GPGPUs is the external memory bandwidth. GPUs typically have a memory bandwidth of around 100 GB/s, while our GRAPE-DR card, with 4 chips, has only 16 GB/s. Thus, the range of applications is somewhat limited, but for suitable applications, the performance and performance per watt of GRAPE-DR are quite good. The single-card performance on the HPL benchmark is 480 Gflops for a matrix size of 48 k, and 81 cards reach 37 Tflops.
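To put the bandwidth and HPL figures in context, the sketch below relates them to the board's double-precision peak of 0.82 Tflops, which is the relevant peak because HPL runs in double precision. It is an illustration under stated assumptions rather than the chapter's own analysis: the names are hypothetical, and the performance-per-watt line counts only the roughly 200 W of the board, ignoring the host.

```python
# Figures quoted in the abstract for one GRAPE-DR board (card).
BOARD_PEAK_DP_GFLOPS = 820.0
BOARD_BANDWIDTH_GB_S = 16.0
BOARD_POWER_W = 200.0  # "around 200 W"; host power is NOT included (assumption)
TYPICAL_GPU_BANDWIDTH_GB_S = 100.0

# HPL results quoted in the abstract.
HPL_SINGLE_CARD_GFLOPS = 480.0
HPL_81_CARDS_GFLOPS = 37_000.0


def bytes_per_flop(bandwidth_gb_s, peak_gflops):
    """External memory traffic available per floating-point operation."""
    return bandwidth_gb_s / peak_gflops


def hpl_efficiency(measured_gflops, n_cards):
    """Measured HPL speed as a fraction of the aggregate DP peak."""
    return measured_gflops / (n_cards * BOARD_PEAK_DP_GFLOPS)


def per_card_gflops(measured_gflops, n_cards):
    return measured_gflops / n_cards


if __name__ == "__main__":
    print("bandwidth ratio GPU / GRAPE-DR card:",
          TYPICAL_GPU_BANDWIDTH_GB_S / BOARD_BANDWIDTH_GB_S)
    print("GRAPE-DR bytes per DP flop:",
          bytes_per_flop(BOARD_BANDWIDTH_GB_S, BOARD_PEAK_DP_GFLOPS))
    print("HPL efficiency, 1 card:",
          hpl_efficiency(HPL_SINGLE_CARD_GFLOPS, 1))
    print("HPL efficiency, 81 cards:",
          hpl_efficiency(HPL_81_CARDS_GFLOPS, 81))
    print("HPL Gflops per card in the 81-card run:",
          per_card_gflops(HPL_81_CARDS_GFLOPS, 81))
    print("HPL Gflops per watt (board only):",
          HPL_SINGLE_CARD_GFLOPS / BOARD_POWER_W)
```

On these numbers, a single card sustains about 59% of its double-precision peak on HPL, and the 81-card run about 56%, or roughly 457 Gflops per card. The card has under 0.02 bytes of external memory bandwidth per double-precision flop, which is why the suitable applications are those with high arithmetic intensity, such as dense matrix operations.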

Metadata
Title
GRAPE and GRAPE-DR
Written by
Junichiro Makino
Copyright year
2013
Publisher
Springer Berlin Heidelberg
DOI
https://doi.org/10.1007/978-3-642-16405-7_4