Published in: Journal of Electronic Testing 1/2020

19 December 2019

Multi-Threaded Mitigation of Radiation-Induced Soft Errors in Bare-Metal Embedded Systems

Authors: Alejandro Serrano-Cases, Felipe Restrepo-Calle, Sergio Cuenca-Asensi, Antonio Martínez-Álvarez

Abstract

This article presents a software technique, based on a multi-threaded strategy, for protecting against radiation-induced faults. It uses data triplication together with duplication or triplication of the instruction flow to improve system reliability and thus ensure correct system operation. To this end, it defines a relaxed lockstep model that synchronizes the execution of the redundant threads, and the variables under protection, across different processing units. The technique was evaluated through simulated fault-injection campaigns on a multi-core ARM system. Duplication With Comparison and Re-Execution (DWC-R) and Triple Modular Redundancy (TMR) are considered techniques that impose an evident overhead in memory and instructions. Even so, the results show that spreading the replicas across different instruction flows not only produces results similar to those of the classic techniques but also improves computation and recovery time in the presence of soft errors. In addition, the paper highlights the importance of protecting memory-allocated data, since instruction-flow triplication alone is not enough to improve overall system reliability.
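
The two building blocks named in the abstract, data triplication with majority voting (TMR) and Duplication With Comparison and Re-Execution (DWC-R), can be illustrated with a minimal C sketch. This is not the authors' implementation: the names `tmr_u32`, `tmr_write`, `tmr_vote`, `tmr_read`, `dwc_r`, and `example_task` are hypothetical, and the two redundant executions run one after the other here, whereas the article distributes them as threads across separate cores under a relaxed lockstep model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for one protected variable: three copies of the
   same value (data triplication). 'volatile' keeps the compiler from
   folding the copies into a single register. */
typedef struct {
    volatile uint32_t copy[3];
} tmr_u32;

/* Store the same value in all three replicas. */
void tmr_write(tmr_u32 *v, uint32_t value) {
    v->copy[0] = value;
    v->copy[1] = value;
    v->copy[2] = value;
}

/* Bitwise majority vote: each output bit takes the value that at least
   two of the three inputs agree on. */
uint32_t tmr_vote(uint32_t a, uint32_t b, uint32_t c) {
    return (a & b) | (a & c) | (b & c);
}

/* Read a protected variable: vote, then write the voted value back so a
   corrupted replica is repaired before a second upset can accumulate. */
uint32_t tmr_read(tmr_u32 *v) {
    uint32_t voted = tmr_vote(v->copy[0], v->copy[1], v->copy[2]);
    tmr_write(v, voted);
    return voted;
}

/* Signature assumed for a protected computation (illustrative only). */
typedef uint32_t (*task_fn)(uint32_t);

/* DWC-R: execute the task twice and compare the results. On a mismatch,
   re-execute both replicas, up to max_retries additional attempts.
   Returns true and stores the agreed result in *out on success. */
bool dwc_r(task_fn fn, uint32_t input, unsigned max_retries, uint32_t *out) {
    assert(fn != NULL && out != NULL);
    for (unsigned attempt = 0; attempt <= max_retries; ++attempt) {
        uint32_t r1 = fn(input);   /* first replica  */
        uint32_t r2 = fn(input);   /* second replica */
        if (r1 == r2) {
            *out = r1;
            return true;
        }
    }
    return false;                  /* replicas never agreed */
}

/* Placeholder workload used only to exercise dwc_r. */
uint32_t example_task(uint32_t x) {
    return x * x + 1u;
}
```

In this sketch a single flipped bit in one replica of a `tmr_u32` is masked and repaired by `tmr_read`, while `dwc_r` can only detect a disagreement and recover by recomputing. That difference is one reason the abstract treats data protection and instruction-flow replication as separate concerns.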

Metadata
Title
Multi-Threaded Mitigation of Radiation-Induced Soft Errors in Bare-Metal Embedded Systems
Authors
Alejandro Serrano-Cases
Felipe Restrepo-Calle
Sergio Cuenca-Asensi
Antonio Martínez-Álvarez
Publication date
19 December 2019
Publisher
Springer US
Published in
Journal of Electronic Testing / Issue 1/2020
Print ISSN: 0923-8174
Electronic ISSN: 1573-0727
DOI
https://doi.org/10.1007/s10836-019-05846-4
