Published in: The Journal of Supercomputing 3/2015

1 March 2015

On-the-fly adaptive routing for dragonfly interconnection networks

Written by: Marina García, Enrique Vallejo, Ramón Beivide, Cristóbal Camarero, Mateo Valero, Germán Rodríguez, Cyriel Minkenberg

Abstract

Adaptive deadlock-free routing mechanisms are required to handle variable traffic patterns in dragonfly networks. However, the distance-based deadlock avoidance mechanisms typically employed in dragonflies increase router cost and complexity as a function of the maximum allowed path length. This paper presents on-the-fly adaptive routing (OFAR), a routing/flow-control scheme that decouples the routing and the deadlock avoidance mechanisms. OFAR allows in-transit adaptive routing with local and global misrouting, without imposing dependencies between virtual channels, and relies on a deadlock-free escape subnetwork to avoid deadlock. This model lowers latency, increases throughput, and adapts faster to transient traffic than previously proposed mechanisms. The low capacity of the escape subnetwork makes it prone to congestion, so a simple congestion management mechanism based on injection restriction is considered to avoid such issues. Finally, reliability is addressed by introducing mechanisms to find multiple edge-disjoint Hamiltonian rings embedded in the dragonfly, which allows multiple escape subnetworks to be used.
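The decoupling described in the abstract can be illustrated with a minimal sketch of a per-hop output decision plus an injection check. This is not the authors' implementation: the `Port` class, the function names `select_output` and `may_inject`, the occupancy-based selection, and both threshold values are hypothetical and chosen only for illustration. The sketch assumes that a packet prefers its minimal output, misroutes to a lightly loaded nonminimal output when the minimal one is unavailable, and uses the escape subnetwork only as a last resort.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Port:
    """Hypothetical output port model: occupancy and capacity in flits."""
    name: str
    occupancy: int
    capacity: int

    def has_room(self, packet_size: int) -> bool:
        return self.capacity - self.occupancy >= packet_size


def select_output(minimal: Port, nonminimal: Sequence[Port], escape: Port,
                  packet_size: int,
                  misroute_threshold: float = 0.5) -> Optional[Port]:
    """Pick an output for one packet at one router, or None to wait."""
    # 1. Prefer the minimal path whenever it can accept the packet.
    if minimal.has_room(packet_size):
        return minimal
    # 2. Otherwise misroute, but only to a lightly loaded nonminimal port
    #    (the threshold is an assumption, not a value from the paper).
    candidates = [p for p in nonminimal
                  if p.has_room(packet_size)
                  and p.occupancy / p.capacity <= misroute_threshold]
    if candidates:
        return min(candidates, key=lambda p: p.occupancy / p.capacity)
    # 3. Fall back to the escape subnetwork, used only to avoid deadlock.
    if escape.has_room(packet_size):
        return escape
    return None


def may_inject(transit_ports: Sequence[Port],
               injection_threshold: float = 0.75) -> bool:
    """Injection restriction: hold new packets while the router is congested."""
    total_occupancy = sum(p.occupancy for p in transit_ports)
    total_capacity = sum(p.capacity for p in transit_ports)
    return total_occupancy / total_capacity < injection_threshold
```

Because the decision depends only on the state of the current router, the same function applies at injection and at every in-transit hop, which is the property that lets path length vary without adding virtual channels per hop. How the escape subnetwork itself is kept deadlock-free is outside the scope of this sketch.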

Footnotes
1
We do not consider the opposite case (switching to minimal routing after a first nonminimal local hop, which corresponds to global misrouting) because we model the MM+L global link selection policy [9], which does not make a first local hop for global misrouting at injection. However, since OFAR decouples the router resources from the path length, it would also support that case.
 
2
A minimum occupancy might also be required in the minimal queue to allow for misrouting; we have not considered such a threshold in this work.
 
3
We do not show results of ADVG+1 as it could be argued that the additional ring link between the source and destination groups favors the OFAR models. However, since the escape network utilization is only used to avoid potential deadlock situations and not to carry traffic to the destination, the results are similar.
 
References
1.
Arimilli B, Arimilli R, Chung V, Clark S, Denzel W, Drerup B, Hoefler T, Joyner J, Lewis J, Li J et al (2010) The PERCS high-performance interconnect. In: 2010 18th IEEE symposium on high performance interconnects. IEEE, pp 75–82
2.
Bhatele A, Gropp WD, Jain N, Kale LV (2011) Avoiding hot-spots on two-level direct networks. In: 2011 international conference for high performance computing, networking, storage and analysis (SC), pp 1–11
4.
Carrion C, Beivide R, Gregorio J, Vallejo F (1997) A flow control mechanism to avoid message deadlock in k-ary n-cube networks. In: Proceedings of fourth international conference on high-performance computing, 1997, pp 322–329. doi:10.1109/HIPC.1997.634510
6.
Duato J (1995) A necessary and sufficient condition for deadlock-free adaptive routing in wormhole networks. IEEE Trans Parallel Distrib Syst 6(10):1055–1067
7.
Faanes G, Bataineh A, Roweth D, Court T, Froese E, Alverson B, Johnson T, Kopnick J, Higgins M, Reinhard J (2012) Cray cascade: a scalable HPC system based on a dragonfly network. In: International conference on high performance computing, networking, storage and analysis, SC ’12. IEEE Computer Society Press, Los Alamitos, pp 103:1–103:9
9.
García M, Vallejo E, Beivide R, Odriozola M, Camarero C, Valero M, Labarta J, Rodríguez G (2013) Global misrouting policies in two-level hierarchical networks. In: Interconnection network architecture: on-chip, multi-chip, pp 13–16
10.
García M, Vallejo E, Beivide R, Odriozola M, Camarero C, Valero M, Rodríguez G, Labarta J, Minkenberg C (2012) On-the-fly adaptive routing in high-radix hierarchical networks. In: International conference on parallel processing (ICPP)
11.
García M, Vallejo E, Beivide R, Valero M, Rodríguez G (2013) OFAR-CM: efficient dragonfly networks with simple congestion management. In: 2013 IEEE 21st annual symposium on high-performance interconnects (HOTI), pp 55–62. doi:10.1109/HOTI.2013.16
12.
Garcia PJ (2011) Congestion management in HPC interconnection networks. HPC Advisory Council European Workshop
14.
Gupta P, McKeown N (1999) Designing and implementing a fast crossbar scheduler. Micro IEEE 19(1):20–28
15.
IEEE 802 LAN/MAN Standards Committee (2004) IEEE 802.1d-2004 MAC bridges
16.
IEEE 802 LAN/MAN Standards Committee (2010) IEEE standard for local and metropolitan area networks–virtual bridged local area networks–amendment: 10: Congestion notification, 802.1Qau
17.
Jacobson V (1988) Congestion avoidance and control. ACM SIGCOMM Comput Commun Rev 18:314–329
18.
Jiang N, Kim J, Dally WJ (2009) Indirect adaptive routing on large scale interconnection networks. In: ISCA ’09: 36th international symposium on computer architecture
19.
Kerbyson DJ, Barker KJ (2011) Analyzing the performance bottlenecks of the POWER7-IH network. In: CLUSTER. IEEE, pp 244–252
20.
Kermani P, Kleinrock L (1976) Virtual cut-through: a new computer communication switching technique. Comput Netw 3(4):267–286
21.
Kim J, Dally W, Scott S, Abts D (2008) Technology-driven, highly-scalable dragonfly topology. In: Proceedings of the 35th annual international symposium on computer architecture. IEEE Computer Society, pp 77–88
23.
Pinkston T (2004) Deadlock characterization and resolution in interconnection networks. In: Deadlock resolution in computer-integrated systems, CRC Press, pp 445–492
24.
Prisacari B, Rodriguez G, Garcia M, Vallejo E, Beivide R, Minkenberg C (2014) Performance implications of remote-only load balancing under adversarial traffic in dragonflies. In: 8th international workshop on interconnection network architecture: on-chip, multi-chip, INA-OCMC ’14. doi:10.1145/2556857.2556860
25.
Silla F, Duato J (2000) High-performance routing in networks of workstations with irregular topology. IEEE Trans Parallel Distrib Syst 11(7):699–719. doi:10.1109/71.877816
Metadata
Title
On-the-fly adaptive routing for dragonfly interconnection networks
Written by
Marina García
Enrique Vallejo
Ramón Beivide
Cristóbal Camarero
Mateo Valero
Germán Rodríguez
Cyriel Minkenberg
Publication date
1 March 2015
Publisher
Springer US
Published in
The Journal of Supercomputing / Issue 3/2015
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI
https://doi.org/10.1007/s11227-014-1357-9
