SlicK: slice-based locality exploitation for efficient redundant multithreading

Published: 20 October 2006

Abstract

Transient faults are expected to be a major design consideration in future microprocessors. Recent proposals for transient fault detection in processor cores have revolved around the idea of redundant threading, which involves redundant execution of a program across multiple execution contexts. This paper presents a new approach to redundant threading by bringing together the concepts of slice-level execution and value and control-flow locality into a novel partial redundant threading mechanism called SlicK.

The purpose of redundant execution is to check the integrity of the outputs propagating out of the core (typically through stores). SlicK implements redundancy at the granularity of backward slices of these output instructions and exploits value and control-flow locality to avoid redundantly executing slices that lead to predictable outputs. This avoids redundant execution of a significant fraction of instructions while maintaining extremely low vulnerabilities for critical processor structures.

We propose the microarchitecture of a backward-slice extractor called SliceEM that is able to identify backward slices without interrupting the instruction flow, and show how this extractor and a set of predictors can be integrated into a redundant threading mechanism to form SlicK. Detailed simulations with SPEC CPU2000 benchmarks show that SlicK can provide around 10.2% performance improvement over a well-known redundant threading mechanism, buying back over 50% of the loss suffered due to redundant execution. SlicK can keep the Architectural Vulnerability Factors of processor structures to typically 0%-2%. More importantly, SlicK's slice-based mechanisms provide future opportunities for exploring interesting points in the performance-reliability design space based on market segment needs.
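The slice-level idea described in the abstract can be illustrated with a small software sketch. This is not the SliceEM hardware design, which the abstract does not detail; it is a minimal model under stated assumptions. The names `Instr`, `backward_slice`, and `instructions_to_recheck` are hypothetical, only register dataflow is tracked (memory and control dependences are ignored), and the set of stores whose outputs were predicted correctly is simply given as input rather than produced by real value or control-flow predictors.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction record for a straight-line trace."""
    op: str                 # e.g. "add", "load", "store"
    dst: Optional[str]      # destination register; None for stores
    srcs: Tuple[str, ...]   # source registers


def backward_slice(trace: List[Instr], store_idx: int) -> List[int]:
    """Indices of instructions the store at store_idx depends on.

    Assumption: only register dataflow is followed.
    """
    needed: Set[str] = set(trace[store_idx].srcs)
    members = [store_idx]
    # Walk backwards; the most recent producer of a needed register
    # joins the slice, and its own sources become needed in turn.
    for i in range(store_idx - 1, -1, -1):
        ins = trace[i]
        if ins.dst is not None and ins.dst in needed:
            needed.discard(ins.dst)
            needed.update(ins.srcs)
            members.append(i)
    return sorted(members)


def instructions_to_recheck(trace: List[Instr],
                            predicted_stores: Set[int]) -> List[int]:
    """Union of the backward slices of all stores that were NOT predicted.

    Assumption: predicted_stores stands in for the outcome of the
    predictors; slices of predicted stores are skipped entirely.
    """
    recheck: Set[int] = set()
    for i, ins in enumerate(trace):
        if ins.op == "store" and i not in predicted_stores:
            recheck.update(backward_slice(trace, i))
    return sorted(recheck)


trace = [
    Instr("load", "r1", ("r10",)),        # 0
    Instr("add", "r2", ("r1", "r1")),     # 1
    Instr("load", "r3", ("r11",)),        # 2
    Instr("store", None, ("r2", "r12")),  # 3: writes r2 to address in r12
    Instr("add", "r4", ("r3", "r3")),     # 4
    Instr("store", None, ("r4", "r13")),  # 5: writes r4 to address in r13
]

# If the store at index 5 is treated as predictable, only the slice of
# the store at index 3 would be executed redundantly in this model.
print(instructions_to_recheck(trace, predicted_stores={5}))
```

In this toy trace, treating one of the two stores as predictable halves the number of instructions marked for redundant execution. Real savings depend on predictor accuracy and program behavior, which this sketch does not model.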

    Published in

      ACM SIGOPS Operating Systems Review, Volume 40, Issue 5
      Proceedings of the 2006 ASPLOS Conference
      December 2006
      425 pages
      ISSN: 0163-5980
      DOI: 10.1145/1168917

      ASPLOS XII: Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
      October 2006
      440 pages
      ISBN: 1595934510
      DOI: 10.1145/1168857

      Copyright © 2006 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States
