Published in: The Journal of Supercomputing 11/2021

14.04.2021

UPR: deadlock-free dynamic network reconfiguration by exploiting channel dependency graph compatibility

Written by: Juan-José Crespo, José L. Sánchez, Francisco J. Alfaro-Cortés, José Flich, José Duato

Abstract

Deadlock-free dynamic network reconfiguration is usually studied from the perspective of routing algorithm restrictions and resource reservation. The dynamic behavior that arises during the transition from one routing function to another is often managed by restricting resource usage in a static, predefined manner, which often limits the supported routing algorithms and/or inactive link patterns, or requires additional resources such as virtual channels. Exploiting compatibility between routing functions by exploring their associated channel dependency graphs (CDGs) leads to a better reconfiguration process given its dynamic nature. In this paper, we propose a new dynamic reconfiguration process called Upstream Progressive Reconfiguration (UPR). Our algorithm progressively adds and removes dependencies on a per-channel basis, relying on the information provided by the CDG while the reconfiguration process takes place. This gives us the opportunity to foresee compatible scenarios where both routing functions coexist, reducing the amount of resource drainage and packet injection halting needed.
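
As a rough illustration of the compatibility idea described in the abstract, the sketch below models each routing function's CDG as a set of (channel, channel) dependency pairs and checks whether the union of two CDGs is acyclic. Treating an acyclic union as "compatible" is a simplifying assumption made here for illustration: an acyclic CDG is a well-known sufficient condition for deadlock freedom, but this is not claimed to be the exact criterion UPR uses. The channel names and the functions `has_cycle` and `compatible` are hypothetical.

```python
from collections import defaultdict


def has_cycle(edges):
    """Return True if the directed graph given as (src, dst) pairs has a cycle."""
    graph = defaultdict(list)
    for src, dst in edges:
        graph[src].append(dst)

    WHITE, GREY, BLACK = 0, 1, 2      # unvisited, on the DFS stack, finished
    color = defaultdict(int)          # every node starts WHITE (0)

    def visit(node):
        color[node] = GREY
        for nxt in graph[node]:
            if color[nxt] == GREY:    # back edge: a dependency cycle exists
                return True
            if color[nxt] == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    # Iterate over a snapshot, since visit() may add keys to the defaultdict.
    return any(color[n] == WHITE and visit(n) for n in list(graph))


def compatible(cdg_old, cdg_new):
    """Assumed notion of compatibility: the union of both CDGs is acyclic."""
    return not has_cycle(set(cdg_old) | set(cdg_new))


# Hypothetical dependencies between four channels c0..c3.
cdg_old = {("c0", "c1"), ("c1", "c2"), ("c2", "c3")}
cdg_new_ok = {("c0", "c1"), ("c1", "c3")}   # adds no cycle to cdg_old
cdg_new_bad = {("c3", "c0")}                # closes the loop c0 -> ... -> c3 -> c0

print(compatible(cdg_old, cdg_new_ok))
print(compatible(cdg_old, cdg_new_bad))
```

In this toy setting, a pair of routing functions whose combined dependencies stay acyclic could coexist without draining channels, whereas a pair that closes a cycle would need some dependencies removed before others are added.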

Footnotes
1
In this context, \(\mathcal {P}(C_{ND})\) denotes the power set of \(C_{ND}\).
 
2
In Sect. 4 we will refer to \(R_{old}\)/\(R_{new}\) as \(R_S\)/\(R_F\) to avoid confusion with old/new routing functions arising at intermediate steps during the process.
 
3
They would become unroutable packets otherwise.
 
4
Provided by the NetworkX [22] Python library.
 
5
This is referred to as the Latency Aware implementation in [19]. However, the new routing information may also be distributed during the reconfiguration process (a.k.a. the Packet Drop Aware implementation) before channels can check Condition 2.
 
References
1.
Alonso M, Coll S, Martínez JM, Santonja V, López P, Duato J (2006) Dynamic power saving in fat-tree interconnection networks using on/off links. In: Proceedings 20th IEEE International Parallel & Distributed Processing Symposium. IEEE, pp 8–pp
2.
Avresky DR, Natchev NH, Shurbanov V (2001) Dynamic reconfiguration in high-speed computer clusters. In: Proceedings of the 2001 IEEE International Conference on Cluster Computing (CLUSTER’01). IEEE, p 380
3.
Balboni M, Trivino F, Flich J, Bertozzi D (2013) Optimizing the overhead for network-on-chip routing reconfiguration in parallel multi-core platforms. In: 2013 International Symposium on System on Chip (SoC), IEEE, pp 1–6
4.
Bergman K, Borkar S, Campbell D, Carlson W, Dally W, Denneau M, Franzon P, Harrod W, Hill K, Hiller J, et al (2008) Exascale computing study: technology challenges in achieving exascale systems. Defense Advanced Research Projects Agency Information Processing Techniques Office (DARPA IPTO), Tech Rep 15
5.
Casado R, Bermúdez A, Duato J, Quiles FJ, Sanchez JL (2001) A protocol for deadlock-free dynamic reconfiguration in high-speed local area networks. IEEE Trans Parallel Distrib Syst 12(2):115–132
6.
Chiu GM (2000) The odd-even turn model for adaptive routing. IEEE Trans Parallel Distrib Syst 11(7):729–738
7.
Conner S, Akioka S, Irwin MJ, Raghavan P (2007) Link shutdown opportunities during collective communications in 3-d torus nets. In: 2007 IEEE International Parallel and Distributed Processing Symposium. IEEE, pp 1–8
9.
Dickov B, Carpenter PM, Pericas M, Ayguadé E (2015) Self-tuned software-managed energy reduction in infiniband links. In: 2015 IEEE 21st International Conference on Parallel and Distributed Systems (ICPADS), pp 649–657
10.
Duato J (1995) A necessary and sufficient condition for deadlock-free adaptive routing in wormhole networks. IEEE Trans Parallel Distrib Syst 6(10):1055–1067
11.
Duato J, Lysne O, Pang R, Pinkston TM (2005) A theory for deadlock-free dynamic network reconfiguration. Part i. IEEE Trans Parallel Distrib Syst 16(5):412–427
12.
Glass CJ, Ni LM (1992) The turn model for adaptive routing. ACM SIGARCH Comput Archit News 20(2):278–287
13.
Groves T, Grant R (2015) Power aware, dynamic provisioning of hpc networks. Sandia National Labs report 21
14.
Jin C, de Supinski BR, Abramson D, Poxon H, DeRose L, Dinh MN, Endrei M, Jessup ER (2017) A survey on software methods to improve the energy efficiency of parallel computing. Int J High Perform Comput Appl 31(6):517–549
15.
Lee SE, Bagherzadeh N (2009) A variable frequency link for a power-aware network-on-chip (noc). Integration 42(4):479–485
16.
Li F, Chen G, Kandemir M, Kolcu I (2007) Profile-driven energy reduction in network-on-chips. In: Proceedings of the 28th ACM SIGPLAN Conference on Programming Language Design and Implementation, pp 394–404
17.
Lysne O, Duato J (2000) Fast dynamic reconfiguration in irregular networks. In: Proceedings 2000 International Conference on Parallel Processing. IEEE, pp 449–458
18.
Lysne O, Pinkston TM, Duato J (2005) A methodology for developing deadlock-free dynamic network reconfiguration processes. Part ii. IEEE Trans Parallel Distrib Syst 16(5):428–443
19.
Lysne O, Montanana JM, Flich J, Duato J, Pinkston TM, Skeie T (2008) An efficient and deadlock-free network reconfiguration protocol. IEEE Trans Comput 57(6):762–779
20.
Mejia A, Flich J, Duato J, Reinemo SA, Skeie T (2006) Segment-based routing: an efficient fault-tolerant routing algorithm for meshes and tori. In: Proceedings 20th IEEE International Parallel & Distributed Processing Symposium, pp 10-pp
21.
Miwa S, Nakamura H (2015) Profile-based power shifting in interconnection networks with on/off links. In: SC’15: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis. IEEE, pp 1–11
23.
Parikh R, Bertacco V (2015) Resource conscious diagnosis and reconfiguration for noc permanent faults. IEEE Trans Comput 65(7):2241–2256
24.
Pinkston TM, Pang R, Duato J (2003) Deadlock-free dynamic reconfiguration schemes for increased network dependability. IEEE Trans Parallel Distrib Syst 14(8):780–794
25.
Rodeheffer TL, Schroeder MD (1991) Automatic reconfiguration in autonet. In: Proceedings of the Thirteenth ACM Symposium on Operating Systems Principles, pp 183–197
26.
Schroeder MD, Birrell AD, Burrows M, Murray H, Needham RM, Rodeheffer TL, Satterthwaite EH, Thacker CP (1991) Autonet: a high-speed, self-configuring local area network using point-to-point links. IEEE J Sel Areas Commun 9(8):1318–1335
27.
Schwiebert L (2001) Deadlock-free oblivious wormhole routing with cyclic dependencies. IEEE Trans Comput 50(9):865–876
28.
Seiculescu C, Murali S, Benini L, De Micheli G (2009) Noc topology synthesis for supporting shutdown of voltage islands in socs. In: Proceedings of the 46th Annual Design Automation Conference, pp 822–825
29.
Shanley T (2003) InfiniBand network architecture. Addison-Wesley Professional, Reading
31.
Teodosiu D, Baxter J, Govil K, Chapin J, Rosenblum M, Horowitz M (1997) Hardware fault containment in scalable shared-memory multiprocessors. In: Proceedings of the 24th Annual International Symposium on Computer Architecture, pp 73–84
32.
Vignéras P, Quintin JN (2016) The bxi routing architecture for exascale supercomputer. J Supercomput 72(12):4418–4437
33.
Zhou J, Chung YC (2012) Tree-turn routing: an efficient deadlock-free routing algorithm for irregular networks. J Supercomput 59(2):882–900
Metadata
Title
UPR: deadlock-free dynamic network reconfiguration by exploiting channel dependency graph compatibility
Written by
Juan-José Crespo
José L. Sánchez
Francisco J. Alfaro-Cortés
José Flich
José Duato
Publication date
14.04.2021
Publisher
Springer US
Published in
The Journal of Supercomputing / Issue 11/2021
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI
https://doi.org/10.1007/s11227-021-03791-8
