Skip to main content
Erschienen in: The Journal of Supercomputing 12/2016

01.12.2016

An efficient fault-tolerant routing algorithm in NoCs to tolerate permanent faults

verfasst von: Reza Akbar, Ali Asghar Etedalpour, Farshad Safaei

Erschienen in: The Journal of Supercomputing | Ausgabe 12/2016

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

With the possibility of integrating multiple cores into a single chip, research on the networks-on-chip (NoCs) as a kind of interconnection network has assumed great significance. In such networks, the effort is to provide broadband and extendable infrastructure for multi-core architectures. Communication between processors in an NoC is established using routing algorithms. Meanwhile, NoCs, like any other system, are prone to failure. With the increase in the number of network components on a chip, the probability of failure increases, too. Therefore, considering a fault-tolerant mechanism in NoCs seems to be a necessity. The main challenge of this work is combining performance and fault tolerance while reducing power, complexity and cost. In this paper, a fault-tolerant routing algorithm for tolerating static and dynamic faults in 2D Mesh NoCs and node failure model is presented. It should be taken into consideration that despite many other routing algorithms, the proposed method uses only one Virtual Channel. Results show that this method has lower latency and power consumption than SAVA and segment-based (SB) routing algorithms. It showed 2.91 and 12.74 % less power consumption than SAVA and SB, respectively, under SPLASH-2 traffic in an 8 \(\times \) 8 Mesh network with 8 faulty nodes. Its average latency, under Uniform, Transpose, and SPLASH-2 traffics in a 4 \(\times \) 4 Mesh with 4 faulty nodes and an 8 \(\times \) 8 Mesh network with 8 faulty nodes, was reduced by 4.39 and 14.08 % compared to SAVA and SB, respectively.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Literatur
1.
Zurück zum Zitat Olukotun K, Nafieh BA, Hammond L, Wilson K, Chang K (1996) The case for a single-chip multiprocessor. In: Proceedings of the 7th international conference on architectural support for programming languages and operating systems, pp 2–11 Olukotun K, Nafieh BA, Hammond L, Wilson K, Chang K (1996) The case for a single-chip multiprocessor. In: Proceedings of the 7th international conference on architectural support for programming languages and operating systems, pp 2–11
2.
Zurück zum Zitat Barroso LA et al (2000) Piranha: a scalable architecture based on single-chip multiprocessing. In: International symposium on computer architecture, pp 282–293 Barroso LA et al (2000) Piranha: a scalable architecture based on single-chip multiprocessing. In: International symposium on computer architecture, pp 282–293
3.
Zurück zum Zitat Arabnia HR, Smith JW (1993) A reconfigurable interconnection network for imaging operations and its implementation using a multi-stage switching box. In: Proceedings of the 7th annual international high performance computing conference. The 1993 high performance computing: new horizons supercomputing symposium, Calgary, Alberta, pp 349–357 Arabnia HR, Smith JW (1993) A reconfigurable interconnection network for imaging operations and its implementation using a multi-stage switching box. In: Proceedings of the 7th annual international high performance computing conference. The 1993 high performance computing: new horizons supercomputing symposium, Calgary, Alberta, pp 349–357
4.
Zurück zum Zitat Bhandarkar SM, Arabnia HR (1995) The Hough transform on a reconfigurable multi-ring network. J Parallel Distrib Comput 24(1):107–114CrossRef Bhandarkar SM, Arabnia HR (1995) The Hough transform on a reconfigurable multi-ring network. J Parallel Distrib Comput 24(1):107–114CrossRef
5.
Zurück zum Zitat Bhandarkar SM, Arabnia HR (1995) The REFINE multiprocessor: theoretical properties and algorithms. Parallel Comput J Elsevier 21(11):1783–1806CrossRef Bhandarkar SM, Arabnia HR (1995) The REFINE multiprocessor: theoretical properties and algorithms. Parallel Comput J Elsevier 21(11):1783–1806CrossRef
6.
Zurück zum Zitat Arabnia HR, Oliver MA (1987) Arbitrary rotation of raster images with SIMD machine architectures. Int J Eurograph Assoc Comput Graph Forum 6(1):3–12CrossRef Arabnia HR, Oliver MA (1987) Arbitrary rotation of raster images with SIMD machine architectures. Int J Eurograph Assoc Comput Graph Forum 6(1):3–12CrossRef
7.
Zurück zum Zitat Arabnia HR (1990) A Parallel Algorithm for the Arbitrary Rotation of Digitized Images using Process-and-Data-Decomposition Approach. J Parallel Distrib Comput 10(2):188–193CrossRef Arabnia HR (1990) A Parallel Algorithm for the Arbitrary Rotation of Digitized Images using Process-and-Data-Decomposition Approach. J Parallel Distrib Comput 10(2):188–193CrossRef
8.
Zurück zum Zitat Jerger NE, Peh LS (2009) On-chip networks. In: Mark H (ed) Synthesis Lectures on Computer Architecture. Morgan & Claypool Publishers, Madison Jerger NE, Peh LS (2009) On-chip networks. In: Mark H (ed) Synthesis Lectures on Computer Architecture. Morgan & Claypool Publishers, Madison
9.
Zurück zum Zitat Wani MA, Arabnia HR (2003) Parallel edge-region-based segmentation algorithm targeted at reconfigurable multi-ring network. J Supercomput 25(1):43–63CrossRefMATH Wani MA, Arabnia HR (2003) Parallel edge-region-based segmentation algorithm targeted at reconfigurable multi-ring network. J Supercomput 25(1):43–63CrossRefMATH
10.
Zurück zum Zitat Arabnia HR, Bhandarkar SM (1996) Parallel stereocorrelation on a reconfigurable multi-ring network. J Supercomput (Springer Publishers) 10(3):243–270CrossRefMATH Arabnia HR, Bhandarkar SM (1996) Parallel stereocorrelation on a reconfigurable multi-ring network. J Supercomput (Springer Publishers) 10(3):243–270CrossRefMATH
11.
Zurück zum Zitat Dally WJ, Dennison LR, Harris D, Kan K, Xanthopoulos T (1994) The reliable router: a reliable and high-performance communications substrate for parallel computers. In: 1st International workshop on parallel computer routing and communication, pp 241–255 Dally WJ, Dennison LR, Harris D, Kan K, Xanthopoulos T (1994) The reliable router: a reliable and high-performance communications substrate for parallel computers. In: 1st International workshop on parallel computer routing and communication, pp 241–255
12.
Zurück zum Zitat Duato J, Mejia A, Palesi M, Flich J, Kumar S (2009) Region-based routing: a mechanism to support efficient routing algorithms in NoCs. IEEE Trans Very Large Scale Integr (VLSI) Syst 17(3):356–369CrossRef Duato J, Mejia A, Palesi M, Flich J, Kumar S (2009) Region-based routing: a mechanism to support efficient routing algorithms in NoCs. IEEE Trans Very Large Scale Integr (VLSI) Syst 17(3):356–369CrossRef
13.
Zurück zum Zitat Ni LM, McKinley PK (1993) A survey of wormhole routing techniques in direct networks. Computer 26(2):62–76CrossRef Ni LM, McKinley PK (1993) A survey of wormhole routing techniques in direct networks. Computer 26(2):62–76CrossRef
15.
Zurück zum Zitat Arabnia HR, Oliver MA (1989) A transputer network for fast operations on digitised images. Int J Eurograph Assoc Comput Graph Forum 8(1):3–12CrossRef Arabnia HR, Oliver MA (1989) A transputer network for fast operations on digitised images. Int J Eurograph Assoc Comput Graph Forum 8(1):3–12CrossRef
16.
Zurück zum Zitat Bhandarkar SM, Arabnia HR, Smith JW (1995) A reconfigurable architecture for image processing and computer vision. Int J Pattern Recognit Artif Intell (IJPRAI) 9(2):201–229 special issue on VLSI Algorithms and Architectures for Computer Vision, Image Processing, Pattern Recognition and AICrossRef Bhandarkar SM, Arabnia HR, Smith JW (1995) A reconfigurable architecture for image processing and computer vision. Int J Pattern Recognit Artif Intell (IJPRAI) 9(2):201–229 special issue on VLSI Algorithms and Architectures for Computer Vision, Image Processing, Pattern Recognition and AICrossRef
17.
Zurück zum Zitat Safaei F, Khonsari A, Gilak M (2010) A new performance measure for characterizing fault rings in interconnection networks. Inf Sci 180(5):664–678MathSciNetCrossRefMATH Safaei F, Khonsari A, Gilak M (2010) A new performance measure for characterizing fault rings in interconnection networks. Inf Sci 180(5):664–678MathSciNetCrossRefMATH
18.
Zurück zum Zitat Ozturk O, Kandemir M, Irwin MJ, Narayanan SHK (2010) Compiler directed network-on-chip reliability enhancement for chip multiprocessors. In: Proceedings of the ACM SIGPLAN/SIGBED 2010 conference on Languages, compilers, and tools for embedded systems (LCTES ’10). ACM, New York, NY, pp 85–94 Ozturk O, Kandemir M, Irwin MJ, Narayanan SHK (2010) Compiler directed network-on-chip reliability enhancement for chip multiprocessors. In: Proceedings of the ACM SIGPLAN/SIGBED 2010 conference on Languages, compilers, and tools for embedded systems (LCTES ’10). ACM, New York, NY, pp 85–94
19.
Zurück zum Zitat Arabnia HR (1995) A distributed stereocorrelation algorithm. In: Proceedings of computer communications and networks (ICCCN’95), IEEE, pp 479–482 Arabnia HR (1995) A distributed stereocorrelation algorithm. In: Proceedings of computer communications and networks (ICCCN’95), IEEE, pp 479–482
20.
Zurück zum Zitat Arabnia HR, Oliver MA (1987) A transputer network for the arbitrary rotation of digitised images. Comput J 30(5):425–433CrossRef Arabnia HR, Oliver MA (1987) A transputer network for the arbitrary rotation of digitised images. Comput J 30(5):425–433CrossRef
21.
Zurück zum Zitat Boppana RV, Chalasani S (1995) Fault-tolerant wormhole routing algorithms for mesh networks. IEEE Trans Comput 44(7):848–864 Boppana RV, Chalasani S (1995) Fault-tolerant wormhole routing algorithms for mesh networks. IEEE Trans Comput 44(7):848–864
22.
Zurück zum Zitat Sui PH, Wang SD (1997) An improved algorithm for fault-tolerant wormhole routing in meshes. IEEE Trans Comput 46(9):1040–1042MathSciNetCrossRef Sui PH, Wang SD (1997) An improved algorithm for fault-tolerant wormhole routing in meshes. IEEE Trans Comput 46(9):1040–1042MathSciNetCrossRef
23.
Zurück zum Zitat Kim SP, Han T (1997) Fault-tolerant wormhole routing in mesh with overlapped solid fault regions. Parallel Comput 23:1937–1962MathSciNetCrossRefMATH Kim SP, Han T (1997) Fault-tolerant wormhole routing in mesh with overlapped solid fault regions. Parallel Comput 23:1937–1962MathSciNetCrossRefMATH
24.
Zurück zum Zitat Park S, Youn JH, Bose B (2000) Fault-tolerant wormhole routing algorithms in meshes in the presence of concave faults. In: International parallel and distributed processing symposium, pp 633–638 Park S, Youn JH, Bose B (2000) Fault-tolerant wormhole routing algorithms in meshes in the presence of concave faults. In: International parallel and distributed processing symposium, pp 633–638
25.
Zurück zum Zitat Glass CG, Ni L (1996) Fault-tolerant wormhole routing in meshes without virtual channels. IEEE Trans Parallel Distrib Syst 7(6):620–636CrossRef Glass CG, Ni L (1996) Fault-tolerant wormhole routing in meshes without virtual channels. IEEE Trans Parallel Distrib Syst 7(6):620–636CrossRef
26.
Zurück zum Zitat Glass CG, Ni L (1994) The turn model for adaptive routing. J ACM 41(5):874–902CrossRef Glass CG, Ni L (1994) The turn model for adaptive routing. J ACM 41(5):874–902CrossRef
27.
Zurück zum Zitat Cunningham CM, Avresky DR (1995) Fault-tolerant adaptive routing for two dimensional meshes. In: Symposium on high-performance computer architecture, pp 122–131 Cunningham CM, Avresky DR (1995) Fault-tolerant adaptive routing for two dimensional meshes. In: Symposium on high-performance computer architecture, pp 122–131
28.
Zurück zum Zitat Nordbotten NA, Skeie T (2007) A routing methodology for dynamic fault tolerance in meshes and tori. In: International conference on high performance, pp 514–527 Nordbotten NA, Skeie T (2007) A routing methodology for dynamic fault tolerance in meshes and tori. In: International conference on high performance, pp 514–527
29.
Zurück zum Zitat Gomez ME, Duato J, Flich J, Lopez P, Robles A (2006) A routing methodology for achieving fault tolerance in direct networks. IEEE Trans Comput 55(4):400–415CrossRef Gomez ME, Duato J, Flich J, Lopez P, Robles A (2006) A routing methodology for achieving fault tolerance in direct networks. IEEE Trans Comput 55(4):400–415CrossRef
30.
Zurück zum Zitat Mejia A, Flich J, Duato J, Reinemo S (2006) Segment-based routing: an efficient fault-tolerant routing algorithm for meshes and tori. In: Parallel and distributed processing symposium, Rhodes Island Mejia A, Flich J, Duato J, Reinemo S (2006) Segment-based routing: an efficient fault-tolerant routing algorithm for meshes and tori. In: Parallel and distributed processing symposium, Rhodes Island
31.
Zurück zum Zitat Safaei F, ValadBeigi M (2012) An efficient routing methodology to tolerate static and dynamic faults in 2-D Mesh networks-on-chip. Microprocess Microsyst 36(7):531–542CrossRef Safaei F, ValadBeigi M (2012) An efficient routing methodology to tolerate static and dynamic faults in 2-D Mesh networks-on-chip. Microprocess Microsyst 36(7):531–542CrossRef
32.
Zurück zum Zitat Safaei F, Mortazavi A (2010) A novel routing algorithm for achieving static fault-tolerance in 2-D meshes. In: 10th International conference on computer and information technology (CIT), pp 2621–2627 Safaei F, Mortazavi A (2010) A novel routing algorithm for achieving static fault-tolerance in 2-D meshes. In: 10th International conference on computer and information technology (CIT), pp 2621–2627
33.
Zurück zum Zitat Lysne EAO (2004) Simple deadlock-free dynamic network reconfiguration. In: Lecture notes in computer science 3296 Lysne EAO (2004) Simple deadlock-free dynamic network reconfiguration. In: Lecture notes in computer science 3296
34.
Zurück zum Zitat Lysne O et al (2008) An efficient and deadlock-free network reconfiguration protocol. IEEE Trans Comput 57(6):762–779MathSciNetCrossRef Lysne O et al (2008) An efficient and deadlock-free network reconfiguration protocol. IEEE Trans Comput 57(6):762–779MathSciNetCrossRef
35.
Zurück zum Zitat Pinkston TM et al (2003) Deadlock-free dynamic reconfiguration schemes for increased network dependability. IEEE Trans Parallel Distrib Syst 14(8):780–794CrossRef Pinkston TM et al (2003) Deadlock-free dynamic reconfiguration schemes for increased network dependability. IEEE Trans Parallel Distrib Syst 14(8):780–794CrossRef
36.
Zurück zum Zitat Mustafa NU, Ozturk O, Niar S (2016) Adaptive routing framework for network on chip architectures. In: Proceedings of the 2016 workshop on rapid simulation and performance evaluation: methods and tools (RAPIDO ’16), ACM, New York, NY, Article 5, p 5 Mustafa NU, Ozturk O, Niar S (2016) Adaptive routing framework for network on chip architectures. In: Proceedings of the 2016 workshop on rapid simulation and performance evaluation: methods and tools (RAPIDO ’16), ACM, New York, NY, Article 5, p 5
37.
Zurück zum Zitat ValadBeigi M, Safaei F (2012) PDR: a protocol for dynamic network reconfiguration based on deadlock recovery scheme. Simul Model Pract Theory 24:59–70CrossRef ValadBeigi M, Safaei F (2012) PDR: a protocol for dynamic network reconfiguration based on deadlock recovery scheme. Simul Model Pract Theory 24:59–70CrossRef
38.
Zurück zum Zitat Lopez P, Duato J (1993) Deadlock-free adaptive routing algorithms for the 3-D torus: limitations and solutions. In: Bode A, Reeve M, Wolf G (eds) Parallel architectures and languages. Lecture Notes in Computer Science, vol 694. Springer, Berlin, pp 684–687 Lopez P, Duato J (1993) Deadlock-free adaptive routing algorithms for the 3-D torus: limitations and solutions. In: Bode A, Reeve M, Wolf G (eds) Parallel architectures and languages. Lecture Notes in Computer Science, vol 694. Springer, Berlin, pp 684–687
39.
Zurück zum Zitat Nayebi A et al (2007) XMulator: an object oriented XML-based simulator. In: Asia international conference on modeling and simulation, pp 128–132 Nayebi A et al (2007) XMulator: an object oriented XML-based simulator. In: Asia international conference on modeling and simulation, pp 128–132
40.
Zurück zum Zitat Kahng A, Li B, Peh L, Samadi K (2009) ORION 2.0: a fast and accurate NoC power and area model for early-stage design space exploration. In: Proceedings of design, automation test Europe (DATE), pp 423–428 Kahng A, Li B, Peh L, Samadi K (2009) ORION 2.0: a fast and accurate NoC power and area model for early-stage design space exploration. In: Proceedings of design, automation test Europe (DATE), pp 423–428
41.
Zurück zum Zitat Dally WJ, Towles B (2004) Principles and practices of interconnection networks. Morgan Kaufmann Publishers, Burlington Dally WJ, Towles B (2004) Principles and practices of interconnection networks. Morgan Kaufmann Publishers, Burlington
Metadaten
Titel
An efficient fault-tolerant routing algorithm in NoCs to tolerate permanent faults
verfasst von
Reza Akbar
Ali Asghar Etedalpour
Farshad Safaei
Publikationsdatum
01.12.2016
Verlag
Springer US
Erschienen in
The Journal of Supercomputing / Ausgabe 12/2016
Print ISSN: 0920-8542
Elektronische ISSN: 1573-0484
DOI
https://doi.org/10.1007/s11227-016-1749-0

Weitere Artikel der Ausgabe 12/2016

The Journal of Supercomputing 12/2016 Zur Ausgabe