Published in: Journal of Computational Electronics 2/2018

26 February 2018

Fast, energy-efficient electronic structure simulations for multi-million atomic systems with GPU devices

Authors: Hoon Ryu, Oh-Kyoung Kwon


Abstract

We report that the speed and energy efficiency of large-scale electronic structure simulations, which target realistically sized systems consisting of multi-million atoms, can be greatly improved with offload computing using graphics processing unit (GPU) devices. For simulations of quantum dot devices that have \(\sim \) 1.5 million atoms and are described by an \(sp^3d^5s^*\) tight-binding (TB) model, a remarkable performance enhancement is obtained by asynchronously sharing the computing load between host CPUs and Tesla K40 devices. Compared to the case when only host CPUs are used, sparse matrix-vector multiplications, the core operation needed to solve Schrödinger equations, become remarkably faster, leading to a \(\sim \) 1.5\(\times \) speed-up of end-to-end simulations with GPU devices. Asynchronous streams accelerate data transfer, reducing the associated overhead to below \(\sim \) 15% of the total wall-time. The speed and energy efficiency of TB simulations also turn out to be better with Tesla GPU devices than with Intel Xeon Phi Knights Corner (KNC) coprocessors: for the target simulation, Tesla K40 GPU devices save \(\sim \) 10% of the wall-time and \(\sim \) 40% of the total energy consumed with KNC 7120 coprocessors. With technical details of offload computing that can also be applied to accelerate other numerical problems involving large-scale sparse matrix operations, this work delivers practical information on the efficiency of GPU computing, which has not been well covered for empirical modeling of large-scale electronic structures.
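To make the core operation named in the abstract concrete, the following is a minimal sketch of a sparse matrix-vector multiplication. It assumes the compressed sparse row (CSR) storage format, which the abstract does not specify, and it runs serially on the CPU in plain Python; it does not reproduce the authors' GPU offload scheme or asynchronous streams. The function name `csr_spmv` and the toy 3x3 matrix are hypothetical and chosen for illustration only.

```python
from typing import List, Sequence


def csr_spmv(row_ptr: Sequence[int], col_idx: Sequence[int],
             values: Sequence[float], x: Sequence[float]) -> List[float]:
    """Return y = A @ x for a matrix A stored in CSR form (assumed format)."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        acc = 0.0
        # Non-zeros of this row occupy values[row_ptr[row]:row_ptr[row + 1]].
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


# Toy symmetric tridiagonal matrix (illustrative, not from the paper):
# [[ 2, -1,  0],
#  [-1,  2, -1],
#  [ 0, -1,  2]]
ROW_PTR = [0, 2, 5, 7]
COL_IDX = [0, 1, 0, 1, 2, 1, 2]
VALUES = [2.0, -1.0, -1.0, 2.0, -1.0, -1.0, 2.0]

y = csr_spmv(ROW_PTR, COL_IDX, VALUES, [1.0, 1.0, 1.0])
print(y)
```

Each output row depends only on its own slice of non-zeros, so rows can be computed independently. That independence is what makes this operation a natural candidate for many-core devices such as GPUs, although the performance actually achieved depends on the storage format and data-transfer strategy used.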

Metadata
Title
Fast, energy-efficient electronic structure simulations for multi-million atomic systems with GPU devices
Authors
Hoon Ryu
Oh-Kyoung Kwon
Publication date
26 February 2018
Publisher
Springer US
Published in
Journal of Computational Electronics / Issue 2/2018
Print ISSN: 1569-8025
Electronic ISSN: 1572-8137
DOI
https://doi.org/10.1007/s10825-018-1138-4
