Abstract
We aim to improve reliability of multithreaded programs by proposing a dynamic detector that detects potentially erroneous program executions and their causes. We design and evaluate a Serializability Violation Detector (SVD) that has two unique goals: (I) triggering automatic recovery from erroneous executions using backward error recovery (BER), or simply alerting users that a software error may have occurred; and (II) helping debug programs by revealing causes of error symptoms.Two properties of SVD help in achieving these goals. First, to detect only erroneous executions, SVD checks serializability of atomic regions, which are code regions that need to be executed atomically. Second, to improve usability, SVD does not require a priori annotations of atomic regions; instead, SVD approximates them using a heuristic. Experimental results on three widely-used multithreaded server programs show that SVD finds real bugs and reports modest false positives. The goal of this paper is to develop a detector suitable for (I) BER-based avoidance of erroneous program executions; and (II) alerting users as software errors occur. We argue that such a detector should have the following two properties.
- H. Agrawal and J. R. Horgan. Dynamic Program Slicing. In Proceedings of the SIGPLAN 1990 Conference on Programming Language Design and Implementation, pages 246--256, June 1990. Google ScholarDigital Library
- A. R. Alameldeen, M. M. K. Martin, C. J. Mauer, K. E. Moore, M. Xu, D. J. Sorin, M. D. Hill, and D. A. Wood. Simulating a 2M Commercial Server on a 2K PC. IEEE Computer, 36(2):50--57, Feb. 2003. Google ScholarDigital Library
- Apache HTTP Server Project. http://www.apache.org/.Google Scholar
- D. F. Bacon and S. C. Goldstein. Hardware-Assisted Replay of Multiprocessor Programs. Proceedings of the ACM/ONR Workshop on Parallel and Distributed Debugging, published in ACM SIGPLAN Notices, pages 194--206, 1991. Google ScholarDigital Library
- P. Barford and M. Crovella. Generating Representative Web Workloads for Network and Server Performance Evaluation. In Proceedings of the 1998 ACM Sigmetrics Conference on Measurement and Modeling of Computer Systems, pages 151--160, June 1998. Google ScholarDigital Library
- M. Burrows and K. R. M. Leino. Finding stale-value errors in concurrent programs. Technical report, Compaq Systems Research Center Technical Note (2002-04), May 2002.Google Scholar
- C.-Y. Cher and T. N. Vijaykumar. Skipper: a microarchitecture for exploiting control-flow independence. In Proceedings of the 34th Annual IEEE/ACM International Symposium on Microarchitecture, pages 4--15, Dec. 2001. Google ScholarDigital Library
- J.-D. Choi, K. Lee, A. Loginov, R. O'Callahan, V. Sarkar, and M. Sridharan. Efficient and precise datarace detection for multithreaded object-oriented programs. In Proceedings of the SIGPLAN 2002 Conference on Programming Language Design and Implementation, pages 258--269, June 2002. Google ScholarDigital Library
- J.-D. Choi and S.-L. Min. Race Frontier: Reproducing Data Races in Parallel-Program Debugging. In Proceedings of the 3rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPOPP), pages 145--154, July 1991. Google ScholarDigital Library
- M. Christiaens and K. D. Bosschere. TRaDe, a topological approach to on-the-fly race detection in java programs. In Proceedings of the Java Virtual Machine Research and Technology Symposium (JVM'01), 2001. Google ScholarDigital Library
- J. D. Collins, D. M. Tullsen, and H. Wang. Control Flow Optimization Via Dynamic Reconvergence Prediction. In Proceedings of the 37th Annual IEEE/ACM International Symposium on Microarchitecture, pages 129--140, Dec. 2004. Google ScholarDigital Library
- A. Dinning and E. Schonberg. The Empirical Comparison of Monitoring Algorithms for Access Anomaly Detection. In Proceedings of the 2nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPOPP), pages 1--10, Mar. 1990. Google ScholarDigital Library
- D. Engler and K. Ashcraft. RacerX: effective, static detection of race conditions and deadlocks. In Proc. 19th Symposium on Operating System Principles, pages 237--252, Oct. 2003. Google ScholarDigital Library
- K. P. Eswaran, J. N. Gray, R. A. Lorie, and I. L. Traiger. The notions of consistency and predicate locks in a database system. Communications of the ACM, 19(11):624--633, 1976. Google ScholarDigital Library
- C. Flanagan and S. N. Freund. Atomizer: a dynamic atomicity checker for multithreaded programs. In Proceedings of the 31st ACM SIGPLAN-SIGACT symposium on Principles of programming languages, pages 256--267, 2004. Google ScholarDigital Library
- C. Flanagan and S. Qadeer. A Type and Effect System for Atomicity. In Proceedings of the SIGPLAN 2003 Conference on Programming Language Design and Implementation, June 2003. Google ScholarDigital Library
- Y.-K. Jun and K. Koh. On-the-Fly Detection of Access Anomalies in Nested Parallel Loops. In Proceedings of the ACM/ONR Workshop on Parallel and Distributed Debugging (PADD), pages 107--117, 1993. Google ScholarDigital Library
- L. Lamport. Time, Clocks and the Ordering of Events in a Distributed System. Communications of the ACM, 21(7):558--565, July 1978. Google ScholarDigital Library
- P. S. Magnusson et al. Simics: A Full System Simulation Platform. IEEE Computer, 35(2):50--58, Feb. 2002. Google ScholarDigital Library
- C. J. Mauer, M. D. Hill, and D. A. Wood. Full System Timing-First Simulation. In Proceedings of the 2002 ACM Sigmetrics Conference on Measurement and Modeling of Computer Systems, pages 108--116, June 2002. Google ScholarDigital Library
- B. P. Miller and J.-D. Choi. A Mechanism for Efficient Debugging of Parallel Programs. In Proceedings of the SIGPLAN 1988 Conference on Programming Language Design and Implementation, pages 135--144, June 1988. Google ScholarDigital Library
- S. L. Min and J.-D. Choi. An Efficient Cache-based Access Anomaly Detection Scheme. In Proceedings of the Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, pages 235--244, Apr. 1991. Google ScholarDigital Library
- MySQL AB. http://www.mysql.com/.Google Scholar
- R. H. B. Netzer and B. P. Miller. What are Race Conditions?: Some Issues and Formalizations. ACM Letters on Programming Languages and Systems, 1(1):74--88, Mar. 1992. Google ScholarDigital Library
- C. Papadimitriou. The Theory of Database Concurrency Control. Computer Science Press, Rockville, Maryland, 1986. Google ScholarDigital Library
- D. Perkovic and P. Keleher. A Protocol-Centric Approach to On-The-Fly Race Detection. IEEE Transactions on Parallel and Distributed Systems, 11(10):1058--1072, Oct. 2000. Google ScholarDigital Library
- PostgreSQL Global Development Group. http://www.postgresql.org/.Google Scholar
- K. Poulsen. SecurityFocus News: Tracking the blackout bug. http://www.securityfocus.com/news/8412.Google Scholar
- M. Prvulovic and J. Torrellas. ReEnact: Using Thread-Level Speculation Mechanisms to Debug Data Races in Multithreaded Codes. In Proceedings of the 30th Annual International Symposium on Computer Architecture, pages 110--121, June 2003. Google ScholarDigital Library
- M. Prvulovic, Z. Zhang, and J. Torrellas. ReVive: Cost-Effective Architectural Support for Rollback Recovery in Shared-Memory Multiprocessors. In Proceedings of the 29th Annual International Symposium on Computer Architecture, pages 111--122, May 2002. Google ScholarDigital Library
- B. Richards and J. R. Larus. Protocol-based Data-race Detection. In SIGMETRICS symposium on Parallel and Distributed Tools, pages 40--47, 1998. Google ScholarDigital Library
- M. Ronsse and K. D. Bosschere. Non-intrusive On-the-fly Data Race Detection using Execution Replay. In Automated and Algorithmic Debugging, Nov. 2000.Google Scholar
- S. Savage, M. Burrows, G. Nelson, P. Sobalvarro, and T. Anderson. Eraser: A Dynamic Data Race Detector for Multithreaded Programs. ACM Transactions on Computer Systems, 15(4):391--411, Nov. 1997. Google ScholarDigital Library
- D. J. Sorin, M. M. K. Martin, M. D. Hill, and D. A. Wood. SafetyNet: Improving the Availability of Shared Memory Multiprocessors with Global Checkpoint/Recovery. In Proceedings of the 29th Annual International Symposium on Computer Architecture, pages 123--134, May 2002. Google ScholarDigital Library
- U.S.-Canada Power System Outage Task Force. Final Report on the August 14th Blackout in the United States and Canada. Technical report, Department of Energy, 2004.Google Scholar
- C. von Praun and T. Gross. Object-Race Detection. In Proceedings of the Conference on Object-Oriented Programming, Systems, Languages and Application (OOPSLA), Oct. 2001. Google ScholarDigital Library
- Wisconsin Multifacet GEMS Simulator. http://www.cs.wisc.edu/gems/.Google Scholar
- M. Xu, R. Bodík, and M. D. Hill. A "Flight Data Recorder" for Enabling Full-system Multiprocessor Deterministic Replay. In Proceedings of the 30th Annual International Symposium on Computer Architecture, pages 122--133, June 2003. Google ScholarDigital Library
Index Terms
- A serializability violation detector for shared-memory server programs
Recommendations
Dynamic detection of atomic-set-serializability violations
ICSE '08: Proceedings of the 30th international conference on Software engineeringPreviously we presented atomic sets, memory locations that share some consistency property, and units of work, code fragments that preserve consistency of atomic sets on which they are declared. We also proposed atomic-set serializability as a ...
A serializability violation detector for shared-memory server programs
PLDI '05: Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementationWe aim to improve reliability of multithreaded programs by proposing a dynamic detector that detects potentially erroneous program executions and their causes. We design and evaluate a Serializability Violation Detector (SVD) that has two unique goals: (...
Sequential verification of serializability
POPL '10Serializability is a commonly used correctness condition in concurrent programming. When a concurrent module is serializable, certain other properties of the module can be verified by considering only its sequential executions. In many cases, concurrent ...
Comments