
Replay for concurrent non-deterministic shared-memory applications

Published: 01 May 1996

Abstract

Replay of shared-memory program execution is desirable in many domains including cyclic debugging, fault tolerance and performance monitoring. Past approaches to repeatable execution have focused on the problem of re-executing the shared-memory access patterns in parallel programs. With the proliferation of operating system supported threads and shared memory for uniprocessor programs, there is a clear need for efficient replay of concurrent applications. The solutions for parallel systems can be performance prohibitive when applied to the uniprocessor case. We present an algorithm, called the repeatable scheduling algorithm, combining scheduling and instruction counts to provide an invariant for efficient, language independent replay of concurrent shared-memory applications. The approach is shown to have trace overheads that are independent of the amount of sharing that takes place. An implementation for cyclic debugging on Mach 3.0 is evaluated and benchmarks show typical performance overheads of around 10%. The algorithm implemented is compared with optimal event-based tracing and shown to do better with respect to the number of events monitored or number of events logged, in most cases by several orders of magnitude.
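The abstract does not spell out the repeatable scheduling algorithm, so the following is only a toy sketch of the general idea it describes: on a uniprocessor, logging each thread switch together with the outgoing thread's instruction count is enough to reproduce an interleaving, no matter how many shared-memory accesses occur between switches. Everything here is an assumption made for illustration, not the paper's Mach 3.0 implementation: threads are modelled as lists of Python callables, the "instruction count" is a simple program counter, and the names `Thread`, `make_program`, `run_recorded`, and `run_replayed` are hypothetical.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Thread:
    """A simulated thread: a list of 'instructions' plus a program counter."""
    program: list
    pc: int = 0                      # doubles as the per-thread instruction count
    regs: dict = field(default_factory=dict)

    def done(self):
        return self.pc >= len(self.program)

    def step(self, shared):
        self.program[self.pc](shared, self.regs)
        self.pc += 1


def make_program(name, iterations):
    """Racy increment: the load and the store are separate instructions."""
    def load(shared, regs):
        regs["tmp"] = shared["x"]

    def store(shared, regs):
        shared["x"] = regs["tmp"] + 1   # may overwrite another thread's update
        shared["trace"].append(name)

    return [load, store] * iterations


def fresh_threads(iterations=50):
    return [Thread(make_program("A", iterations)),
            Thread(make_program("B", iterations))]


def run_recorded(seed, preempt_prob=0.2):
    """Run with random preemption; log (thread id, instruction count) per switch."""
    rng = random.Random(seed)
    shared = {"x": 0, "trace": []}
    threads = fresh_threads()
    log = []
    current = 0
    while True:
        t = threads[current]
        t.step(shared)
        if t.done() or rng.random() < preempt_prob:
            runnable = [i for i, th in enumerate(threads) if not th.done()]
            if not runnable:
                log.append((current, t.pc))   # final scheduling slice
                break
            nxt = rng.choice(runnable)
            if nxt != current:
                log.append((current, t.pc))   # only switches are logged
                current = nxt
    return shared, log


def run_replayed(log):
    """Re-execute by running each logged thread up to its logged count."""
    shared = {"x": 0, "trace": []}
    threads = fresh_threads()
    for tid, stop_at in log:
        t = threads[tid]
        while t.pc < stop_at:
            t.step(shared)
    return shared


if __name__ == "__main__":
    original, log = run_recorded(seed=1)
    replayed = run_replayed(log)
    print("identical final state:", replayed == original)
    print("log entries:", len(log), "shared-memory accesses:", 2 * 2 * 50)
```

In this sketch the log grows with the number of thread switches, not with the number of loads and stores, which is the property the abstract claims for the trace overhead. A real system would need a hardware or software instruction counter and kernel scheduler hooks in place of the simulated `pc`.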



Published in

ACM SIGPLAN Notices, Volume 31, Issue 5, May 1996, 300 pages. ISSN: 0362-1340. EISSN: 1558-1160. DOI: 10.1145/249069

PLDI '96: Proceedings of the ACM SIGPLAN 1996 Conference on Programming Language Design and Implementation, May 1996, 300 pages. ISBN: 0897917952. DOI: 10.1145/231379

Copyright © 1996 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher: Association for Computing Machinery, New York, NY, United States
