DOI: 10.1145/2039370.2039408
research-article

Reliable software for unreliable hardware: embedded code generation aiming at reliability

Published: 09 October 2011

ABSTRACT

A compilation technique for reliability-aware software transformations is presented. An instruction-level reliability estimation technique quantifies the effects of hardware-level faults at the instruction level while considering spatial and temporal vulnerabilities. It bridges the gap between hardware, where faults occur according to our fault model, and software, the abstraction level at which we aim to increase reliability. For a given tolerable performance overhead, an optimization algorithm compiles application software with respect to a tradeoff between performance and reliability. Compared to performance-optimized compilation, our method incurs 60%-80% fewer application failures, averaged over various fault injection scenarios and fault rates.
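The tradeoff described in the abstract can be illustrated with a minimal sketch. It is not the paper's optimization algorithm: it assumes each function has a few precompiled variants with a known cycle count and a vulnerability score, and exhaustively picks the least vulnerable combination that stays within the tolerable overhead. The names `VARIANTS` and `select_variants`, the function names, and all numbers are hypothetical.

```python
from itertools import product

# Hypothetical per-function variants: (name, cycles, vulnerability score).
# All numbers are invented for illustration; lower vulnerability is better.
# Assumption: the first entry of each list is the performance-optimized baseline.
VARIANTS = {
    "decode": [("baseline", 100, 0.9), ("hardened", 125, 0.4)],
    "filter": [("baseline", 200, 0.8), ("hardened", 220, 0.5)],
}


def select_variants(variants, max_overhead):
    """Pick one variant per function, minimizing total vulnerability while
    keeping total cycles within (1 + max_overhead) times the baseline."""
    names = list(variants)
    baseline_cycles = sum(variants[n][0][1] for n in names)
    budget = baseline_cycles * (1.0 + max_overhead)
    best = None
    # Exhaustive search over all combinations; fine for a toy example only.
    for combo in product(*(variants[n] for n in names)):
        cycles = sum(v[1] for v in combo)
        if cycles > budget:
            continue  # violates the tolerable performance overhead
        vuln = sum(v[2] for v in combo)
        if best is None or vuln < best[0]:
            best = (vuln, cycles, {n: v[0] for n, v in zip(names, combo)})
    return best


if __name__ == "__main__":
    for overhead in (0.0, 0.10, 0.20):
        print(overhead, select_variants(VARIANTS, overhead))
```

With a larger overhead budget the search can afford more hardened variants, which is the performance-versus-reliability knob the abstract refers to. A real compiler would need a scalable search rather than enumeration.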


        • Published in

          CODES+ISSS '11: Proceedings of the seventh IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
          October 2011
          402 pages
          ISBN:9781450307154
          DOI:10.1145/2039370

          Copyright © 2011 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States


          Acceptance Rates

          Overall Acceptance Rate: 280 of 864 submissions, 32%
