2018 | OriginalPaper | Book chapter

RBPCCM: Relax Blocking Parallel Collective Communication Mechanism Base on Hardware with Scalability

Authors: Xiu-jiang Ren, Zhou Zhou, Qing Peng, Xiang-hui Xie

Published in: Computer Engineering and Technology

Publisher: Springer Singapore

Abstract

As parallel computing has developed, the scale of high-performance computing systems has increased dramatically, and collective communication has become a bottleneck. Collective communication with hardware support achieves relatively high performance. However, its scalability remains a crucial problem, because the number of nodes involved is not fixed. This paper proposes the Relax Blocking Parallel Collective Communication Mechanism (RBPCCM) to improve the performance of collective communication in parallel computation. The mechanism, in which hardware and software cooperate, implements scalable collective communication by distributing collective resource allocation numbers. Furthermore, RBPCCM can be implemented at various endpoint scales and is not constrained by the interconnect network topology. A functional simulation model based on the Sunway TaihuLight system is built to verify the correctness and scalability of the proposed method. A prototype of RBPCCM is implemented on the network interface, and an FPGA platform is constructed for performance testing. The tests show that RBPCCM improves delay performance by a factor of 2.4 to 37 compared with software-based point-to-point communication.

Metadata
Title
RBPCCM: Relax Blocking Parallel Collective Communication Mechanism Base on Hardware with Scalability
Authors
Xiu-jiang Ren
Zhou Zhou
Qing Peng
Xiang-hui Xie
Copyright year
2018
Publisher
Springer Singapore
DOI
https://doi.org/10.1007/978-981-10-7844-6_7