
Protective redundancy overhead reduction using instruction vulnerability factor

Authors:
Demid Borodin, Delft University of Technology, Delft, Netherlands
Ben H.H. Juurlink, Technische Universität Berlin, Berlin, Germany

CF '10: Proceedings of the 7th ACM international conference on Computing frontiers, May 2010, Pages 319–326
https://doi.org/10.1145/1787275.1787342

Published: 17 May 2010

ABSTRACT

Due to modern technology trends, fault tolerance (FT) is attracting ever-increasing research attention. To reduce the overhead introduced by FT features, several techniques have been proposed. One of these techniques is Instruction-Level Fault Tolerance Configurability (ILCOFT). ILCOFT enables application developers to protect different instructions to varying degrees, devoting more resources to the most critical instructions and saving resources by weakening the protection of the others. It is, however, not trivial to assign a proper protection level to every instruction. This work introduces the notion of the Instruction Vulnerability Factor (IVF), which evaluates how faults in each instruction affect the final application output. The IVF is computed off-line and is then used by ILCOFT-enabled systems to assign the appropriate protection level to every instruction. IVF relieves the programmer of the need to assign the necessary protection level to every instruction by hand. Experimental results demonstrate that IVF-based ILCOFT reduces the instruction duplication performance penalty by up to 77%, while the maximum output damage due to undetected faults does not exceed 0.6% of the total application output.
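The selection step described in the abstract can be illustrated with a minimal sketch. It assumes a simple rule that the source does not spell out: an instruction is duplicated only when its IVF exceeds a chosen threshold. The `InstructionProfile` type, the addresses, the IVF values, the execution counts, and the threshold are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstructionProfile:
    address: int     # hypothetical static instruction address
    ivf: float       # fraction of the output a fault here corrupts (0.0 to 1.0)
    exec_count: int  # hypothetical dynamic execution count from profiling


def assign_protection(profiles, threshold):
    """Return {address: True if the instruction is duplicated, else False}.

    Assumption: a plain threshold on the off-line IVF decides protection.
    """
    return {p.address: p.ivf > threshold for p in profiles}


def duplicated_fraction(profiles, plan):
    """Fraction of dynamic instructions that still execute twice."""
    total = sum(p.exec_count for p in profiles)
    dup = sum(p.exec_count for p in profiles if plan[p.address])
    return dup / total if total else 0.0


# Made-up profile: two rarely executed but critical instructions,
# two frequently executed instructions with negligible IVF.
profiles = [
    InstructionProfile(0x400100, ivf=0.90, exec_count=10),
    InstructionProfile(0x400104, ivf=0.002, exec_count=1000),
    InstructionProfile(0x400108, ivf=0.35, exec_count=40),
    InstructionProfile(0x40010C, ivf=0.0, exec_count=950),
]

plan = assign_protection(profiles, threshold=0.01)
fraction = duplicated_fraction(profiles, plan)
print(plan)
print(f"dynamic instructions duplicated: {fraction:.1%}")
```

In this made-up profile only the two high-IVF instructions are duplicated, so most dynamic instructions run once. How the threshold should be chosen for a real application is outside the scope of the sketch.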

References

P. Shivakumar, M. Kistler, S. Keckler, D. Burger, and L. Alvisi, "Modeling the Effect of Technology Trends on the Soft Error Rate of Combinational Logic," in DSN-02: Proc. 2002 Int. Conf. on Dependable Systems and Networks, Washington, DC, USA, 2002, pp. 389–398.
T. Rao and E. Fujiwara, Error-Control Coding for Computer Systems. Upper Saddle River, NJ, USA: Prentice-Hall, Inc., 1989.
D. Borodin, B. Juurlink, and S. Vassiliadis, "Instruction-Level Fault Tolerance Configurability," in IC-SAMOS VII: Proc. Int. Conf. on Embedded Computer Systems: Architectures, Modeling, and Simulation, July 2007, pp. 110–117.
D. Borodin, B. Juurlink, S. Hamdioui, and S. Vassiliadis, "Instruction-Level Fault Tolerance Configurability," Journal of Signal Processing Systems, vol. 57, no. 1, pp. 89–105, October 2009.
A. Sundaram, A. Aakel, D. Lockhart, D. Thaker, and D. Franklin, "Efficient Fault Tolerance in Multi-Media Applications through Selective Instruction Replication," in WREFT-08: Proc. of the 2008 workshop on Radiation effects and fault tolerance in nanometer technologies. New York, NY, USA: ACM, 2008, pp. 339–346.
S. Mukherjee, C. Weaver, J. Emer, S. Reinhardt, and T. Austin, "A Systematic Methodology to Compute the Architectural Vulnerability Factors for a High-Performance Microprocessor," in MICRO-36: Proc. of the 36th Annual IEEE/ACM Int. Symp. on Microarchitecture. Washington, DC, USA: IEEE Computer Society, 2003, p. 29.
T. Austin, E. Larson, and D. Ernst, "SimpleScalar: An Infrastructure for Computer System Modeling," Computer, vol. 35, no. 2, pp. 59–67, 2002.
M. Franklin, "A Study of Time Redundant Fault Tolerance Techniques for Superscalar Processors," Proc. IEEE Int. Workshop on Defect and Fault Tolerance in VLSI Systems, pp. 207–215, Nov 1995.
J. von Neumann, "Probabilistic Logics and the Synthesis of Reliable Organisms from Unreliable Components," in Automata Studies, ser. Annals of Mathematics Studies. Princeton, NJ: Princeton University Press, 1956, vol. 34, pp. 43–98.
B. Johnson, Design and Analysis of Fault-Tolerant Digital Systems. Addison-Wesley, Jan 1989.
Fibonacci numbers at Wikipedia, http://en.wikipedia.org/wiki/Fibonacci_number.
C. Lee, M. Potkonjak, and W. H. Mangione-Smith, "MediaBench: A Tool for Evaluating and Synthesizing Multimedia and Communications Systems," in MICRO-30: Proc. of the 30th Annual ACM/IEEE Int. Symp. on Microarchitecture. Washington, DC, USA: IEEE Computer Society, 1997, pp. 330–335.
N. Oh, P. P. Shirvani, and E. J. McCluskey, "Error Detection by Duplicated Instructions in Super-Scalar Processors," IEEE Transactions on Reliability, vol. 51, no. 1, pp. 63–75, Mar 2002.


Reviews

Reviewer: Amos O Olagunju

Due to increasingly pervasive vulnerabilities and attacks, emerging computer technologies require new and effective fault tolerance mechanisms and algorithms. How should fault tolerance mechanisms and algorithms be designed and implemented to minimize the energy consumption, hardware overhead, and instruction processing performance loss attributable to fault tolerance features? How should programmers assign an adequate level of security to crucial instructions in mission-critical applications? Advocating for programmers to become more proficient at assigning an adequate protection level to every computer instruction, Borodin and Juurlink present a novel metric for offline evaluation of the effect of faults in individual instructions on the overall result of computer applications. The proposed plan requires a simulation environment, or hardware fault injection into every instruction, in order to generate the offline vulnerability profile of each instruction. In the scheme, faults are injected into each executed instruction that generates a result, such as arithmetic operations and memory loads, and the application's output is compared against the correct result to gauge the effect of the faulty instruction. The instruction vulnerability factor (IVF) of an instruction is the percentage of output items or bytes that a fault in it corrupts, depending on the type of application. The average IVF values are estimated, stored, and used to enforce adequate protection, only for the time-consuming components of an application. The authors outline a technique for using the IVF value to either execute each instruction once with no error detection, or duplicate its execution and compare the results. Borodin and Juurlink perform simulations with different kernels and applications, such as image addition, matrix multiplication, sum of absolute differences, computation of Fibonacci numbers, sound compression, and encoders and decoders for image compression.
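The profiling scheme described above can be sketched on two toy kernels. This is an illustration under stated assumptions, not the authors' tooling: the single-bit-flip fault model, the `fault_at` and `mask` parameters, and the helper `ivf_percent` are hypothetical, and the kernels are small stand-ins for the image addition and Fibonacci workloads named in the review.

```python
def image_add(a, b, fault_at=None, mask=0x01):
    """Element-wise add; dynamic step i produces output item i."""
    out = []
    for i, (x, y) in enumerate(zip(a, b)):
        r = x + y
        if i == fault_at:
            r ^= mask  # assumed fault model: flip one bit of this result
        out.append(r)
    return out


def fibonacci(n, fault_at=None, mask=0x01):
    """First n Fibonacci numbers; dynamic step k produces item k + 2."""
    out = [0, 1]
    for k in range(n - 2):
        r = out[-1] + out[-2]
        if k == fault_at:
            r ^= mask  # the corrupted value feeds every later addition
        out.append(r)
    return out


def ivf_percent(kernel, steps, *args):
    """Inject one fault per dynamic step and report, for each step,
    the percentage of output items that differ from the fault-free run."""
    golden = kernel(*args)
    result = []
    for k in range(steps):
        faulty = kernel(*args, fault_at=k)
        corrupted = sum(g != f for g, f in zip(golden, faulty))
        result.append(100.0 * corrupted / len(golden))
    return result


image_ivf = ivf_percent(image_add, 4, [10, 20, 30, 40], [1, 2, 3, 4])
fib_ivf = ivf_percent(fibonacci, 8, 10)
print("image addition IVF per step (%):", image_ivf)
print("fibonacci IVF per step (%):", fib_ivf)
```

In the image addition kernel a fault corrupts only the one output item produced by the faulty step, whereas in the Fibonacci kernel an early fault propagates to every later item. Averaging the per-step values of one static instruction would give a single per-instruction figure in the spirit of the averaged IVF mentioned above.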
The experimental results of the evaluation of individual instruction-level vulnerabilities show significant performance improvement over well-known instruction-level fault tolerance configuration techniques [1]. The paper clearly articulates the issues of imprecise IVF assessment due to uncertainties in instruction execution of dynamic real-world applications. The IVF estimation is only appropriate for applications with evenly significant rates, but the authors offer valuable insights into the efficient design and implementation of fault tolerance mechanisms and algorithms.

Online Computing Reviews Service

Published in
CF '10: Proceedings of the 7th ACM international conference on Computing frontiers
May 2010
370 pages
ISBN:9781450300445
DOI:10.1145/1787275
General Chair: Nancy M. Amato, Texas A&M University, USA
Program Chairs: Hubertus Franke, IBM Research, USA; Paul H.J. Kelly, Imperial College London, UK
Copyright © 2010 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
fault detection
instruction vulnerability
performance
redundancy
selective protection
Qualifiers
- research-article
Acceptance Rates
CF '10 Paper Acceptance Rate: 30 of 113 submissions, 27%
Overall Acceptance Rate: 240 of 680 submissions, 35%
Article Metrics
- Total Citations: 36
- Total Downloads: 201
- Downloads (Last 12 months): 6
- Downloads (Last 6 weeks): 1