Published in: Journal of Electronic Testing 2/2019

16.03.2019

Memory-Aware Design Space Exploration for Reliability Evaluation in Computing Systems

Authors: Maha Kooli, Giorgio Di Natale, Alberto Bosio

Abstract

In this paper, we present an analytical methodology to measure the vulnerability of the memory components of a microprocessor-based computing system. It is based on the lifetime and residence of data and instructions. The proposed approach considers only the software layer of the system, which makes it usable at an early design stage, when the hardware architecture is not yet fully defined. To account for the hardware memory hierarchy (i.e., RAM, caches, register files) at the software level, we have developed a memory subsystem emulator that can be easily configured to support different features. The methodology enables a fast, simple and inexpensive cache-aware Design Space Exploration (DSE) that accurately evaluates the vulnerability of the RAM and the caches. A first set of experiments, run on the MiBench benchmarks, shows that this DSE accurately evaluates the effects of faults in both the RAM and the caches. In addition, we validate the proposed approach on a real industrial test case, a Flight Management System for avionic applications. The results show that the proposed methodology gives precise results compared to a classical fault injection tool, and that it scales well with the complexity of the application.
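
To make the idea of lifetime-based vulnerability concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical access trace of (cycle, address, operation) tuples and uses a common simplification: a memory cell is counted as vulnerable only during intervals that end in a read, because a fault that is overwritten before being read cannot propagate. The function name, the trace format and the normalization by total cycles times number of cells are all assumptions made for illustration.

```python
def vulnerability_factor(trace, total_cycles, num_cells):
    """Estimate the fraction of cell-cycles in which a bit flip could be read.

    trace: iterable of (cycle, address, op) tuples sorted by cycle,
           with op equal to "R" (read) or "W" (write). Hypothetical format.
    total_cycles: length of the observed execution in cycles.
    num_cells: number of memory cells considered.
    """
    last_access = {}        # address -> cycle of the most recent access
    vulnerable_cycles = 0

    for cycle, address, op in trace:
        if op == "R" and address in last_access:
            # The value had to stay intact from the previous access
            # until this read, so that whole interval is vulnerable.
            vulnerable_cycles += cycle - last_access[address]
        # An interval that ends in a write is not counted: the old value
        # is overwritten, so a fault in it is masked.
        last_access[address] = cycle

    return vulnerable_cycles / (total_cycles * num_cells)


# Illustrative trace over two cells and ten cycles (made-up values).
example_trace = [
    (0, 0x10, "W"),
    (2, 0x20, "W"),
    (4, 0x10, "R"),
    (6, 0x10, "W"),
    (8, 0x20, "W"),
    (10, 0x10, "R"),
]

if __name__ == "__main__":
    print(vulnerability_factor(example_trace, total_cycles=10, num_cells=2))
```

In this sketch, reads of cells that were never accessed earlier in the trace are ignored; a fuller model would also have to decide how to treat data that is live at the start or end of the observation window, and would compute residence separately for each level of the memory hierarchy.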

Metadata
Title
Memory-Aware Design Space Exploration for Reliability Evaluation in Computing Systems
Authors
Maha Kooli
Giorgio Di Natale
Alberto Bosio
Publication date
16.03.2019
Publisher
Springer US
Published in
Journal of Electronic Testing / Issue 2/2019
Print ISSN: 0923-8174
Electronic ISSN: 1573-0727
DOI
https://doi.org/10.1007/s10836-019-05785-0
