Published in: The Journal of Supercomputing 4/2023

26 September 2022

C-DMR: a cache-based fault-tolerant protection method for register file

Authors: Zongnan Liang, Jiawei Nian, Hongjin Liu, Xuru Wang, Mengfei Yang

Abstract

Processors operating in space are susceptible to interference from high-energy particles, which can cause abnormal operation. Such processors require fault-tolerant designs to handle these disturbances. In this paper, we propose a cache-based fault-tolerant protection method for the register file and implement it in a RISC-V processor on an FPGA platform. The method uses spatial redundancy and information redundancy to reduce the propagation of single-bit errors, and it can resolve potential fault accumulation in the register file. Compared with other register file protection methods, the proposed implementation has advantages in resource consumption because it reuses the data cache structure already present in the processor: it increases look-up table usage by only 76% and adds 6.09% extra delay. Finally, we evaluate the impact on system performance of removing some cache lines from the data cache and conclude that the design improves the processor's fault tolerance with only a small impact on the original data cache performance.
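
The abstract does not spell out the mechanism, so the following is only a minimal behavioural sketch of how spatial redundancy (a duplicate copy) and information redundancy (a check bit) can combine to correct single-bit errors and avoid fault accumulation. It assumes one even-parity bit per copy and models the redundant storage as a plain list standing in for reserved data cache lines. The class and method names (`DmrRegisterFile`, `Entry`, `read`, `write`) are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


def parity(value: int) -> int:
    """Even-parity bit: 1 if the number of set bits in value is odd."""
    return bin(value).count("1") & 1


@dataclass
class Entry:
    value: int
    parity_bit: int


class DmrRegisterFile:
    """Hypothetical model: a primary copy and a shadow copy, each with parity.

    The shadow list stands in for redundant storage (assumed here to live
    in reserved data cache lines); this is an illustration, not the
    paper's actual hardware design.
    """

    def __init__(self, num_regs: int = 32, width: int = 32) -> None:
        self.mask = (1 << width) - 1
        self.primary = [Entry(0, 0) for _ in range(num_regs)]
        self.shadow = [Entry(0, 0) for _ in range(num_regs)]

    def write(self, idx: int, value: int) -> None:
        # Every write updates both copies and their parity bits.
        value &= self.mask
        p = parity(value)
        self.primary[idx] = Entry(value, p)
        self.shadow[idx] = Entry(value, p)

    def read(self, idx: int) -> int:
        a, b = self.primary[idx], self.shadow[idx]
        a_ok = parity(a.value) == a.parity_bit
        b_ok = parity(b.value) == b.parity_bit
        if a_ok and b_ok and a.value == b.value:
            return a.value
        if a_ok and not b_ok:
            good = a
        elif b_ok and not a_ok:
            good = b
        else:
            # Both copies fail parity, or both pass but disagree.
            raise RuntimeError(f"uncorrectable error in register {idx}")
        # Write the good copy back to both places so that a single upset
        # does not linger and combine with a later one.
        self.primary[idx] = Entry(good.value, good.parity_bit)
        self.shadow[idx] = Entry(good.value, good.parity_bit)
        return good.value
```

In this sketch, parity identifies which of the two copies was hit, and the write-back on read is what keeps isolated upsets from accumulating. An even number of bit flips in one copy passes its parity check and is therefore reported as uncorrectable rather than silently repaired.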

Metadata
Title
C-DMR: a cache-based fault-tolerant protection method for register file
Authors
Zongnan Liang
Jiawei Nian
Hongjin Liu
Xuru Wang
Mengfei Yang
Publication date
26 September 2022
Publisher
Springer US
Published in
The Journal of Supercomputing, Issue 4/2023
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI
https://doi.org/10.1007/s11227-022-04836-2
