Skip to main content
Top
Published in: Real-Time Systems 2/2015

01-03-2015

Static probabilistic worst case execution time estimation for architectures with faulty instruction caches

Authors: Damien Hardy, Isabelle Puaut

Published in: Real-Time Systems | Issue 2/2015

Log in

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

Semiconductor technology evolution suggests that permanent failure rates will increase dramatically with scaling, in particular for SRAM cells. While well known approaches such as error correcting codes exist to recover from failures and provide fault-free chips, they will not be affordable anymore in the future due to their growing cost. Consequently, other approaches like fine-grained disabling and reconfiguration of hardware elements (e.g. individual functional units or cache blocks) will become economically necessary. This fine-grained disabling will degrade performance compared to a fault-free execution. To the best of our knowledge, all static worst-case execution time (WCET) estimation methods assume fault-free processors. Their result is not safe anymore when fine-grained disabling of hardware components is used. In this paper we provide the first method that statically calculates a probabilistic WCET bound in the presence of permanent faults in instruction caches. The proposed method derives a probabilistic WCET bound for a program, cache configuration, and probability of cell failure. As our method relies on static analysis to bound the longest path, its probabilistic nature only stems from the probability that faults actually occur. Our method is computationally tractable because it does not require an exhaustive enumeration of all the possible combinations of faulty cache blocks. Experimental results show that it provides WCET estimates very close to, but never below, the method that derives probabilistic WCETs by enumerating all possible locations of faulty cache blocks. The proposed method not only allows to quantify the impact of permanent faults on WCET estimates, but, most importantly, can be used in architectural exploration frameworks to select the most appropriate fault management mechanisms and design parameters for current and future chip designs.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Footnotes
1
Especially, an in-order pipeline with a constant latency per instruction is assumed like in Slijepcevic et al. (2013).
 
2
Due to the definition of CHMC FM, variable \(x_b\) is split into two distinct variables, one for the first execution of the block (\(x^{first}_b\)) and another for the next executions (\(x^{next}_b\) ).
 
5
In the conference version of the paper (Hardy and Puaut 2013) it was 270 s. The significant improvement comes from our complexity enhancement method consisting of injecting fault-insensitive frequency values into the different ILP systems (Sect. 4.2.1).
 
Literature
go back to reference Borkar S, Karnik T, Narendra S, Tschanz J, Keshavarzi A, De V (2003) Parameter variations and impact on circuits and microarchitecture. In DAC40, pp 338–342 Borkar S, Karnik T, Narendra S, Tschanz J, Keshavarzi A, De V (2003) Parameter variations and impact on circuits and microarchitecture. In DAC40, pp 338–342
go back to reference Bowman K, Tschanz J, Wilkerson C, Lu SL, Karnik T, De V, Borkar S. (2009) Circuit techniques for dynamic variation tolerance. In: DAC46. ACM, New York, pp 4–7. doi:10.1145/1629911.1629915 Bowman K, Tschanz J, Wilkerson C, Lu SL, Karnik T, De V, Borkar S. (2009) Circuit techniques for dynamic variation tolerance. In: DAC46. ACM, New York, pp 4–7. doi:10.​1145/​1629911.​1629915
go back to reference Cazorla FJ, Quiñones E, Vardanega T, Cucu L, Triquet B, Bernat G, Berger E, Abella J, Wartel F, Houston M, Santinellei L, Kosmidis L, Lo C, Maxim D (2013) Proartis: probabilistically analysable real-time systems. ACM Trans Embed Comput Syst 12:79CrossRef Cazorla FJ, Quiñones E, Vardanega T, Cucu L, Triquet B, Bernat G, Berger E, Abella J, Wartel F, Houston M, Santinellei L, Kosmidis L, Lo C, Maxim D (2013) Proartis: probabilistically analysable real-time systems. ACM Trans Embed Comput Syst 12:79CrossRef
go back to reference Cheng L, Gupta P, Spanos CJ, Qian K, He L (2011) Physically justifiable die-level modeling of spatial variation in view of systematic across wafer variability. IEEE Trans CAD Integr Circuits Syst 30(3):388–401CrossRef Cheng L, Gupta P, Spanos CJ, Qian K, He L (2011) Physically justifiable die-level modeling of spatial variation in view of systematic across wafer variability. IEEE Trans CAD Integr Circuits Syst 30(3):388–401CrossRef
go back to reference Chevochot P, Puaut I (1999) Scheduling fault-tolerant distributed hard real-time tasks independently of the replication strategies. In: 6th international conference on real-time computing and applications symposium, pp 356–363 Chevochot P, Puaut I (1999) Scheduling fault-tolerant distributed hard real-time tasks independently of the replication strategies. In: 6th international conference on real-time computing and applications symposium, pp 356–363
go back to reference Colin A, Puaut I (2001) A modular and retargetable framework for tree-based WCET analysis. In: Euromicro conference on real-time systems (ECRTS), Delft, pp. 37–44 Colin A, Puaut I (2001) A modular and retargetable framework for tree-based WCET analysis. In: Euromicro conference on real-time systems (ECRTS), Delft, pp. 37–44
go back to reference Engblom J (2002) Processor pipelines and static worst-case execution time analysis. PhD thesis, Uppsala University. Engblom J (2002) Processor pipelines and static worst-case execution time analysis. PhD thesis, Uppsala University.
go back to reference Ferdinand C, Wilhelm R (1998) On predicting data cache behavior for real-time systems. In LCTES ’98: proceedings of the ACM SIGPLAN workshop on languages, compilers, and tools for embedded systems, pp 16–30 Ferdinand C, Wilhelm R (1998) On predicting data cache behavior for real-time systems. In LCTES ’98: proceedings of the ACM SIGPLAN workshop on languages, compilers, and tools for embedded systems, pp 16–30
go back to reference Ghosh S, Melhem R, Mossé D (1997) Fault-tolerance through scheduling of aperiodic tasks in hard real-time multiprocessor systems. IEEE Trans Parallel Distrib Syst 8(3):272–284CrossRef Ghosh S, Melhem R, Mossé D (1997) Fault-tolerance through scheduling of aperiodic tasks in hard real-time multiprocessor systems. IEEE Trans Parallel Distrib Syst 8(3):272–284CrossRef
go back to reference Hardy D, Puaut I (2008) WCET analysis of multi-level non-inclusive set-associative instruction caches. In: Proceedings of the 29th real-time systems symposium, pp 456–466 Hardy D, Puaut I (2008) WCET analysis of multi-level non-inclusive set-associative instruction caches. In: Proceedings of the 29th real-time systems symposium, pp 456–466
go back to reference Hardy D, Puaut I (2013) Static probabilistic worst case execution time estimation for architectures with faulty instruction caches. In: 21st international conference on real-time networks and systems, RTNS 2013, Sophia Antipolis, October 17–18, pp 35–44 Hardy D, Puaut I (2013) Static probabilistic worst case execution time estimation for architectures with faulty instruction caches. In: 21st international conference on real-time networks and systems, RTNS 2013, Sophia Antipolis, October 17–18, pp 35–44
go back to reference Hardy D, Sideris I, Ladas N, Sazeides Y (2012) The performance vulnerability of architectural and non-architectural arrays to permanent faults. In: Proceedings of the 45th annual IEEE/ACM international symposium on microarchitecture, MICRO’12, pp 48–59 Hardy D, Sideris I, Ladas N, Sazeides Y (2012) The performance vulnerability of architectural and non-architectural arrays to permanent faults. In: Proceedings of the 45th annual IEEE/ACM international symposium on microarchitecture, MICRO’12, pp 48–59
go back to reference Höfig K (2012) Failure-dependent timing analysis: a new methodology for probabilistic worst-case execution time analysis. In: Schmitt J (ed) Measurement, modelling, and evaluation of computing systems and dependability and fault tolerance. Lecture notes in computer science, vol 7201. Springer, Berlin, pp 61–75 Höfig K (2012) Failure-dependent timing analysis: a new methodology for probabilistic worst-case execution time analysis. In: Schmitt J (ed) Measurement, modelling, and evaluation of computing systems and dependability and fault tolerance. Lecture notes in computer science, vol 7201. Springer, Berlin, pp 61–75
go back to reference Kosmidis L, Abella J, Quiñones E, Cazorla FJ (2013) A cache design for probabilistically analysable real-time systems. In: Proceedings of the conference on design, automation and test in Europe, DATE ’13, pp 513–518 Kosmidis L, Abella J, Quiñones E, Cazorla FJ (2013) A cache design for probabilistically analysable real-time systems. In: Proceedings of the conference on design, automation and test in Europe, DATE ’13, pp 513–518
go back to reference Li X, Roychoudhury A, Mitra T (2006) Modeling out-of-order processors for WCET estimation. Real Time Syst J 34(3):195–227CrossRefMATH Li X, Roychoudhury A, Mitra T (2006) Modeling out-of-order processors for WCET estimation. Real Time Syst J 34(3):195–227CrossRefMATH
go back to reference Li YTS, Malik S (1995) Performance analysis of embedded software using implicit path enumeration. In: Gerber R, Marlowe T (eds) LCTES ’95: proceedings of the ACM SIGPLAN 1995 workshop on languages, compilers, & tools for real-time systems, vol 30, New York, pp 88–98 Li YTS, Malik S (1995) Performance analysis of embedded software using implicit path enumeration. In: Gerber R, Marlowe T (eds) LCTES ’95: proceedings of the ACM SIGPLAN 1995 workshop on languages, compilers, & tools for real-time systems, vol 30, New York, pp 88–98
go back to reference Maxim D, Houston M, Santinelli L, Bernat G, Davis RI, Cucu-Grosjean L (2012) Re-sampling for statistical timing analysis of real-time systems. In: Proceedings of the 20th international conference on real-time and network systems, RTNS ’12. ACM, New York, pp. 111–120. doi:10.1145/2392987.2393001 Maxim D, Houston M, Santinelli L, Bernat G, Davis RI, Cucu-Grosjean L (2012) Re-sampling for statistical timing analysis of real-time systems. In: Proceedings of the 20th international conference on real-time and network systems, RTNS ’12. ACM, New York, pp. 111–120. doi:10.​1145/​2392987.​2393001
go back to reference McNairy C, Mayfield, J (2005) Montecito error protection and mitigation. In: HPCRI ’05: 1st workshop on high performance computing reliability issues McNairy C, Mayfield, J (2005) Montecito error protection and mitigation. In: HPCRI ’05: 1st workshop on high performance computing reliability issues
go back to reference Mueller F (2000) Timing analysis for instruction caches. Real Time Syst 18(2–3):217–247CrossRef Mueller F (2000) Timing analysis for instruction caches. Real Time Syst 18(2–3):217–247CrossRef
go back to reference Nassif SR, Mehta N, Cao Y (2010) A resilience roadmap. In: DATE, pp 1011–1016 Nassif SR, Mehta N, Cao Y (2010) A resilience roadmap. In: DATE, pp 1011–1016
go back to reference Punnekkat S, Burns A, Davis R (2001) Analysis of checkpointing for real-time systems. Real Time Syst 20(1):83–102CrossRefMATH Punnekkat S, Burns A, Davis R (2001) Analysis of checkpointing for real-time systems. Real Time Syst 20(1):83–102CrossRefMATH
go back to reference Puschner P, Schedl AV (1997) Computing maximum task execution times: a graph based approach. Proc IEEE Real Time Syst Symp 13:67–91CrossRef Puschner P, Schedl AV (1997) Computing maximum task execution times: a graph based approach. Proc IEEE Real Time Syst Symp 13:67–91CrossRef
go back to reference Reineke J, Grund D, Berg C, Wilhelm R (2007) Timing predictability of cache replacement policies. Real Time Syst 37(2):99–122CrossRefMATH Reineke J, Grund D, Berg C, Wilhelm R (2007) Timing predictability of cache replacement policies. Real Time Syst 37(2):99–122CrossRefMATH
go back to reference Slijepcevic M, Kosmidis L, Abella J, Quinones E, Cazorla F (2013) DTM: degraded test mode for fault-aware probabilistic timing analysis. In: 2013 25th Euromicro conference on real-time systems (ECRTS), pp 237–248 Slijepcevic M, Kosmidis L, Abella J, Quinones E, Cazorla F (2013) DTM: degraded test mode for fault-aware probabilistic timing analysis. In: 2013 25th Euromicro conference on real-time systems (ECRTS), pp 237–248
go back to reference Theiling H, Ferdinand C, Wilhelm R (2000) Fast and precise WCET prediction by separated cache and path analyses. Real Time Syst 18(2–3):157–179CrossRef Theiling H, Ferdinand C, Wilhelm R (2000) Fast and precise WCET prediction by separated cache and path analyses. Real Time Syst 18(2–3):157–179CrossRef
go back to reference Wilhelm R, Engblom J, Ermedahl A, Holsti N, Thesing S, Whalley D, Bernat G, Ferdinand C, Heckmann R, Mitra T, Mueller F, Puaut I, Puschner P, Staschulat J, Stenström P (2008) The worst-case execution-time problem-overview of methods and survey of tools. ACM Trans Embed Comput Syst 7(3):36:1–36:53. doi:10.1145/1347375.1347389 CrossRef Wilhelm R, Engblom J, Ermedahl A, Holsti N, Thesing S, Whalley D, Bernat G, Ferdinand C, Heckmann R, Mitra T, Mueller F, Puaut I, Puschner P, Staschulat J, Stenström P (2008) The worst-case execution-time problem-overview of methods and survey of tools. ACM Trans Embed Comput Syst 7(3):36:1–36:53. doi:10.​1145/​1347375.​1347389 CrossRef
go back to reference Zhou ST, Katariya S, Ghasemi H, Draper S, Kim NS (2010) Minimizing total area of low-voltage SRAM arrays through joint optimization of cell size, redundancy, and ECC. In: 2010 IEEE international conference on computer design (ICCD), pp 112–117. doi:10.1109/ICCD.2010.5647605 Zhou ST, Katariya S, Ghasemi H, Draper S, Kim NS (2010) Minimizing total area of low-voltage SRAM arrays through joint optimization of cell size, redundancy, and ECC. In: 2010 IEEE international conference on computer design (ICCD), pp 112–117. doi:10.​1109/​ICCD.​2010.​5647605
Metadata
Title
Static probabilistic worst case execution time estimation for architectures with faulty instruction caches
Authors
Damien Hardy
Isabelle Puaut
Publication date
01-03-2015
Publisher
Springer US
Published in
Real-Time Systems / Issue 2/2015
Print ISSN: 0922-6443
Electronic ISSN: 1573-1383
DOI
https://doi.org/10.1007/s11241-014-9212-x

Premium Partner