Published in: Journal of Electronic Testing 1/2020

19-12-2019

Multi-Threaded Mitigation of Radiation-Induced Soft Errors in Bare-Metal Embedded Systems

Authors: Alejandro Serrano-Cases, Felipe Restrepo-Calle, Sergio Cuenca-Asensi, Antonio Martínez-Álvarez

Abstract

This article presents a software technique, based on a multi-threaded strategy, for protecting against radiation-induced faults. Data triplication is combined with duplication or triplication of the instruction flow to improve system reliability and thus ensure correct system operation. To achieve this, a relaxed lockstep model is defined that synchronizes the execution of both the redundant threads and the protected variables across different processing units. The evaluation was performed by means of simulated fault injection campaigns on a multi-core ARM system. Although the underlying techniques, Duplication With Comparison and Re-Execution (DWC-R) and Triple Modular Redundancy (TMR), imply an evident overhead in memory and instructions, the results show that spreading the replicas across different instruction flows not only produces results similar to those of the classic techniques, but also improves computation and recovery time in the presence of soft errors. In addition, this paper highlights the importance of protecting memory-allocated data, since instruction flow triplication alone is not enough to improve overall system reliability.
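
The two redundancy schemes named in the abstract can be illustrated with a minimal C sketch. It is not the authors' implementation: it runs the replicas sequentially on one core instead of in synchronized threads on separate cores, and every name in it (`tmr_u32`, `tmr_write`, `tmr_read`, `dwc_r`, `square_task`) is hypothetical. It assumes a fault corrupts at most one of the three data copies between two reads.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise majority vote: each output bit takes the value held by
 * at least two of the three replicas. */
static uint32_t tmr_vote(uint32_t a, uint32_t b, uint32_t c)
{
    return (a & b) | (a & c) | (b & c);
}

/* Hypothetical triplicated variable (data triplication). */
typedef struct {
    volatile uint32_t copy[3];
} tmr_u32;

static void tmr_write(tmr_u32 *v, uint32_t value)
{
    for (int i = 0; i < 3; ++i) {
        v->copy[i] = value;
    }
}

/* Read through the voter, then write the voted value back to all
 * copies so that a single corrupted replica is repaired. */
static uint32_t tmr_read(tmr_u32 *v)
{
    uint32_t voted = tmr_vote(v->copy[0], v->copy[1], v->copy[2]);
    tmr_write(v, voted);
    return voted;
}

/* Hypothetical workload used only for illustration. */
static uint32_t square_task(uint32_t x)
{
    return x * x;
}

typedef uint32_t (*task_fn)(uint32_t);

/* Duplication With Comparison and Re-Execution (DWC-R): run the task
 * twice, accept the result if both runs agree, otherwise re-execute.
 * Returns 0 on success, -1 if no agreement within max_retries. */
static int dwc_r(task_fn task, uint32_t input, uint32_t *result,
                 int max_retries)
{
    assert(task != NULL && result != NULL);
    for (int attempt = 0; attempt <= max_retries; ++attempt) {
        uint32_t first = task(input);
        uint32_t second = task(input);
        if (first == second) {
            *result = first;
            return 0;
        }
    }
    return -1;
}
```

In this sketch TMR masks a single corrupted copy without re-executing anything, whereas DWC-R only detects a mismatch and pays for recovery with another pair of runs. That is one plausible reading of why the abstract reports recovery time separately from computation time.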

Metadata
Title
Multi-Threaded Mitigation of Radiation-Induced Soft Errors in Bare-Metal Embedded Systems
Authors
Alejandro Serrano-Cases
Felipe Restrepo-Calle
Sergio Cuenca-Asensi
Antonio Martínez-Álvarez
Publication date
19-12-2019
Publisher
Springer US
Published in
Journal of Electronic Testing / Issue 1/2020
Print ISSN: 0923-8174
Electronic ISSN: 1573-0727
DOI
https://doi.org/10.1007/s10836-019-05846-4