ReEnact: using thread-level speculation mechanisms to debug data races in multithreaded codes

Published: 01 May 2003

Abstract

While removing software bugs consumes vast amounts of human time, hardware support for debugging in modern computers remains rudimentary. Fortunately, we show that mechanisms for Thread-Level Speculation (TLS) can be reused to boost debugging productivity. Most notably, TLS's rollback capabilities can be extended to support rolling back recent buggy execution and repeating it as many times as necessary until the bug is fully characterized. These incremental re-executions are deterministic even in multithreaded codes. Importantly, this operation can be done automatically on the fly, and is compatible with production runs.

As a specific implementation of a TLS-based debugging framework, we introduce ReEnact. ReEnact targets a particularly hairy class of bugs: data races in multithreaded programs. ReEnact extends the communication monitoring mechanisms in TLS to also detect data races. It extends TLS's rollback capabilities to be able to roll back and deterministically re-execute the code with races to obtain the race signature. Finally, the signature is compared to a library of race patterns and, if a match occurs, the execution may be repaired. Overall, ReEnact successfully detects, characterizes, and often repairs races automatically on the fly. Moreover, it is fully compatible with always-on use in production runs: the slowdown of race-free execution with ReEnact is on average only 5.8%.
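The abstract does not spell out how a data race is recognized. As a rough software analogue only, the Python sketch below flags two accesses as racing when they touch the same address from different threads, at least one is a write, and no synchronization orders them. Ordering is tracked here with vector clocks, which is an assumption made for illustration; ReEnact itself works in hardware by extending TLS communication monitoring, and this sketch is not its implementation. The names `Access`, `happens_before`, `find_races`, and the sample `trace` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Access:
    """One memory access in a recorded trace (hypothetical format)."""
    thread: int
    addr: str
    is_write: bool
    clock: tuple  # vector clock of the accessing thread at access time


def happens_before(a, b):
    """True if vector clock a is ordered strictly before vector clock b."""
    return all(x <= y for x, y in zip(a, b)) and a != b


def find_races(accesses):
    """Return pairs of conflicting accesses that no synchronization orders."""
    races = []
    for i, first in enumerate(accesses):
        for second in accesses[i + 1:]:
            conflicting = (
                first.addr == second.addr
                and first.thread != second.thread
                and (first.is_write or second.is_write)
            )
            unordered = (
                not happens_before(first.clock, second.clock)
                and not happens_before(second.clock, first.clock)
            )
            if conflicting and unordered:
                races.append((first, second))
    return races


# Assumed two-thread trace: the accesses to "x" are unordered, while the
# read of "y" happens after thread 1 synchronized with thread 0.
trace = [
    Access(thread=0, addr="x", is_write=True, clock=(1, 0)),
    Access(thread=1, addr="x", is_write=False, clock=(0, 1)),
    Access(thread=0, addr="y", is_write=True, clock=(2, 0)),
    Access(thread=1, addr="y", is_write=False, clock=(2, 2)),
]
races = find_races(trace)
```

In this trace only the pair of accesses to `x` is reported. The pair on `y` is conflicting but ordered, so it is not a race under this definition.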

Published in

ACM SIGARCH Computer Architecture News, Volume 31, Issue 2
ISCA 2003
May 2003
422 pages
ISSN: 0163-5964
DOI: 10.1145/871656

ISCA '03: Proceedings of the 30th Annual International Symposium on Computer Architecture
June 2003
432 pages
ISBN: 0769519458
DOI: 10.1145/859618
Conference Chair: Allan Gottlieb
Program Chair: Kai Li

Copyright © 2003 Authors

Publisher

Association for Computing Machinery
New York, NY, United States
