Abstract
Concurrency bugs are caused by non-deterministic interleavings between shared memory accesses. Their effects propagate through data and control dependences until they cause software to crash, hang, produce incorrect output, etc. The lifecycle of a bug thus consists of three phases: (1) triggering, (2) propagation, and (3) failure.
Traditional techniques for detecting concurrency bugs mostly focus on phase (1)--i.e., on finding certain structural patterns of interleavings that are common triggers of concurrency bugs, such as data races. This paper explores a consequence-oriented approach to improving the accuracy and coverage of state-space search and bug detection. The proposed approach first statically identifies potential failure sites in a program binary (i.e., it first considers a phase (3) issue). It then uses static slicing to identify critical read instructions that are highly likely to affect potential failure sites through control and data dependences (phase (2)). Finally, it monitors a single (correct) execution of a concurrent program and identifies suspicious interleavings that could cause an incorrect state to arise at a critical read and then lead to a software failure (phase (1)).
ConSeq's backwards approach, (3)!(2)!(1), provides advantages in bug-detection coverage and accuracy but is challenging to carry out. ConSeq makes it feasible by exploiting the empirical observationthat phases (2) and (3) usually are short and occur within one thread. Our evaluation on large, real-world C/C++ applications shows that ConSeq detects more bugs than traditional approaches and has a much lower false-positive rate.
- G. Altekar and I. Stoica. ODR: output-deterministic replay for multicore debugging. In SOSP, 2009. Google ScholarDigital Library
- G. Balakrishnan, R. Gruian, T. Reps, and T. Teitelbaum. CodeSurfer/x86 -- A platform for analyzing x86 executables, (tool demonstration paper). In CC, 2005. Google ScholarDigital Library
- G. Balakrishnan and T. Reps. Analyzing memory accesses in x86 executables. In Compiler Construction, 2004.Google ScholarCross Ref
- E. D. Berger, T. Yang, T. Liu, and G. Novark. Grace: safe multithreaded programming for C/C. In OOPSLA, 2009. Google ScholarDigital Library
- A. Bessey, K. Block, B. Chelf, A. Chou, B. Fulton, S. Hallem, C. Henri-Gros, A. Kamsky, S. McPeak, and D. Engler. A few billion lines of code later: using static analysis to find bugs in the real world. Commun. ACM, 53(2):66--75, 2010. Google ScholarDigital Library
- F. Bourdoncle. Efficient chaotic iteration strategies with widenings. In Int. Conf. on Formal Methods in Prog. and their Appl., 1993.Google ScholarCross Ref
- S. Burckhardt, P. Kothari, M. Musuvathi, and S. Nagarakatte. A randomized scheduler with probabilistic guarantees of finding bugs. In ASPLOS, 2010. Google ScholarDigital Library
- J. Burnim and K. Sen. Asserting and checking determinism for multithreaded programs. In FSE, 2009. Google ScholarDigital Library
- C. Cadar, D. Dunbar, and D. R. Engler. Klee: Unassisted and automatic generation of high-coverage tests for complex systems programs. In OSDI, 2008. Google ScholarDigital Library
- Cherokee. Cherokee: The Fastest free Web Server out there! http://www.cherokee-project.com/.Google Scholar
- L. Chew and D. Lie. Kivati: Fast detection and prevention of atomicity violations. In EuroSys, 2010. Google ScholarDigital Library
- J.-D. Choi et al. Efficient and precise datarace detection for multithreaded object-oriented programs. In PLDI, 2002. Google ScholarDigital Library
- Click. The Click Modular Router Projec. http://read.cs.ucla.edu/click/click.Google Scholar
- J. Devietti, B. Lucia, L. Ceze, and M. Oskin. DMP: deterministic shared memory multiprocessing. In ASPLOS, 2009. Google ScholarDigital Library
- M. Dimitrov and H. Zhou. Anomaly-based bug prediction, isolation, and validation: an automated approach for software debugging. In ASPLOS, 2009. Google ScholarDigital Library
- O. Edelstein, E. Farchi, Y. Nir, G. Ratsaby, and S. Ur. Multi-threaded Java program test generation. IBM Systems Journal, 2002. Google ScholarDigital Library
- M. Ernst, A. Czeisler, W. G. Griswold, and D. Notkin. Quickly detecting relevant program invariants. In ICSE, 2000. Google ScholarDigital Library
- M. D. Ernst, J. H. Perkins, P. J. Guo, S. McCamant, C. Pacheco, M. S. Tschantz, and C. Xiao. The daikon system for dynamic detection of likely invariants. Sci. Comput. Program., 69(1-3):35--45, 2007. Google ScholarDigital Library
- C. Flanagan and S. N. Freund. FastTrack: efficient and precise dynamic race detection. In PLDI, 2009. Google ScholarDigital Library
- P. Godefroid, N. Klarlund, and K. Sen. Dart: directed automated random testing. In PLDI, 2005. Google ScholarDigital Library
- W. Gu, Z. Kalbarczyk, R. K. Iyer, and Z.-Y. Yang. Characterization of Linux kernel behavior under errors. In DSN, 2003.Google Scholar
- S. Horwitz, T. Reps, and D. Binkley. Interprocedural slicing using dependence graphs. In TOPLAS, 1990. Google ScholarDigital Library
- R. Jhala and R. Majumdar. Software model checking. Computing Surveys, 41(4), 2009. Google ScholarDigital Library
- H. Jula, D. Tralamazza, C. Zamfir, and G. Candea. Deadlock immunity: Enabling systems to defend against deadlocks. In OSDI, 2008. Google ScholarDigital Library
- N. Kidd, P. Lammich, T. Touilli, and T. Reps. A static technique for checking for multiple-variable data races. Software Tools for Technology Transfer, 2010.Google Scholar
- V. Kuznetsov, V. Chipounov, and G. Candea. Testing closed-source binary device drivers with DDT. In USENIX, 2010. Google ScholarDigital Library
- A. Lal and T. Reps. Reducing concurrent analysis under a context bound to sequential analysis. Form. Methods Syst. Des., 2009. Google ScholarDigital Library
- I. Lee and R. K. Iyer. Faults, symptoms, and software fault tolerance in the Tandem GUARDIAN90 Operating System. IEEE, pages 20--29, 1993.Google ScholarCross Ref
- N. G. Leveson and C. S. Turner. An investigation of the therac-25 accidents. Computer, 26(7):18--41, 1993. Google ScholarDigital Library
- X. Li and D. Yeung. Application-level correctness and its impact on fault tolerance. In HPCA, 2007. Google ScholarDigital Library
- S. Lu, S. Park, C. Hu, X. Ma, W. Jiang, Z. Li, R. A. Popa, and Y. Zhou. MUVI: Automatically inferring multi-variable access correlations and detecting related semantic and concurrency bugs. In SOSP, October 2007. Google ScholarDigital Library
- S. Lu, S. Park, E. Seo, and Y. Zhou. Learning from mistakes -- a comprehensive study of real world concurrency bug characteristics. In ASPLOS, 2008. Google ScholarDigital Library
- S. Lu, J. Tucek, F. Qin, and Y. Zhou. AVIO: detecting atomicity violations via access interleaving invariants. In ASPLOS, 2006. Google ScholarDigital Library
- B. Lucia and L. Ceze. Finding concurrency bugs with context-aware communication graphs. In MICRO, 2009. Google ScholarDigital Library
- B. Lucia, L. Ceze, and K. Strauss. Colorsafe: architectural support for debugging and dynamically avoiding multi-variable atomicity violations. In ISCA, 2010. Google ScholarDigital Library
- B. Lucia, J. Devietti, K. Strauss, and L. Ceze. Atom-aid: Detecting and surviving atomicity violations. In ISCA, 2008. Google ScholarDigital Library
- C.-K. Luk, R. Cohn, R. Muth, H. Patil, A. Klauser, G. Lowney, S. Wallace, V. J. Reddi, and K. Hazelwood. Pin: building customized program analysis tools with dynamic instrumentation. In PLDI, 2005. Google ScholarDigital Library
- M. Musuvathi and S. Qadeer. Iterative context bounding for systematic testing of multithreaded programs. In PLDI, 2007. Google ScholarDigital Library
- M. Musuvathi, S. Qadeer, T. Ball, G. Basler, P. A. Nainar, and I. Neamtiu. Finding and reproducing heisenbugs in concurrent programs. In OSDI, 2008. Google ScholarDigital Library
- S. Narayanasamy, Z. Wang, J. Tigani, A. Edwards, and B. Calder. Automatically classifying benign and harmful data races using replay analysis. In PLDI, 2007. Google ScholarDigital Library
- N. Nethercote and J. Seward. Valgrind: a framework for heavyweight dynamic binary instrumentation. In PLDI, 2007. Google ScholarDigital Library
- R. H. B. Netzer and B. P. Miller. Improving the accuracy of data race detection. In PPoPP, 1991. Google ScholarDigital Library
- M. Olszewski, J. Ansel, and S. P. Amarasinghe. Kendo: Efficient deterministic multithreading in software. In ASPLOS, 2009. Google ScholarDigital Library
- S. Park, S. Lu, and Y. Zhou. Ctrigger: Exposing atomicity violation bugs from their finding places. In ASPLOS, 2009. Google ScholarDigital Library
- S. Park, R. W. Vuduc, and M. J. Harrold. Falcon: fault localization in concurrent programs. In ICSE '10, 2010. Google ScholarDigital Library
- S. Park, Y. Zhou, W. Xiong, Z. Yin, R. Kaushik, K. H. Lee, and S. Lu. PRES: probabilistic replay with execution sketching on multiprocessors. In SOSP, 2009. Google ScholarDigital Library
- D. K. Pradhan. Fault-Tolerant Computer System Design. Prentice-Hall, Incorporated, 1996. Google ScholarDigital Library
- S. Qadeer and D. Wu. Kiss: keep it simple and sequential. In PLDI, 2004. Google ScholarDigital Library
- F. Qin, J. Tucek, J. Sundaresan, and Y. Zhou. Rx: Treating bugs as allergies c a safe method to survive software failures. In SOSP, 2005. Google ScholarDigital Library
- L. Ryzhyk, P. Chubb, I. Kuz, and G. Heiser. Dingo: taming device drivers. In EuroSys, 2009. Google ScholarDigital Library
- S. Savage, M. Burrows, G. Nelson, P. Sobalvarro, and T. Anderson. Eraser: A dynamic data race detector for multithreaded programs. ACM TOCS, 1997. Google ScholarDigital Library
- SecurityFocus. Software bug contributed to blackout. http://www.securityfocus.com/news/8016.Google Scholar
- K. Sen. Race directed random testing of concurrent programs. In PLDI, 2008. Google ScholarDigital Library
- K. Sen, D. Marinov, and G. Agha. Cute: a concolic unit testing engine for c. In ESEC/SIGSOFT FSE, 2005. Google ScholarDigital Library
- Y. Shi, S. Park, Z. Yin, S. Lu, Y. Zhou, W. Chen, and W. Zheng. Do i use the wrong definition? defuse: Definition-use invariants for detecting concurrency and sequential bugs. In OOPSLA, 2010. Google ScholarDigital Library
- M. Vaziri, F. Tip, and J. Dolby. Associating synchronization constraints with data in an object-oriented language. In POPL, 2006. Google ScholarDigital Library
- Y. Wang, T. Kelly, M. Kudlur, S. Lafortune, and S. A. Mahlke. Gadara: Dynamic deadlock avoidance for multithreaded programs. In OSDI, 2008. Google ScholarDigital Library
- D. Weeratunge, X. Zhang, and S. Jagannathan. Analyzing multicore dumps to facilitate concurrency bug reproduction. In ASPLOS, 2010. Google ScholarDigital Library
- M. Weiser. Program slicing. In IEEE Transactions on Software Engineering, 1984.Google ScholarDigital Library
- S. C. Woo, M. Ohara, E. Torrie, J. P. Singh, and A. Gupta. The SPLASH-2 programs: Characterization and methodological considerations. In ISCA, 1995. Google ScholarDigital Library
- W. Xiong, S. Park, J. Zhang, Y. Zhou, and Z. Ma. Ad hoc synchronization considered harmful. In OSDI, 2010. Google ScholarDigital Library
- M. Xu, R. Bodík, and M. D. Hill. A serializability violation detector for shared-memory server programs. In PLDI, 2005. Google ScholarDigital Library
- J. Yu and S. Narayanasamy. A case for an interleaving constrained shared-memory multi-processor. In ISCA, 2009. Google ScholarDigital Library
- Y. Yu, T. Rodeheffer, and W. Chen. Racetrack: Efficient detection of data race conditions via adaptive tracking. In SOSP, 2005. Google ScholarDigital Library
- C. Zamfir and G. Candea. Execution synthesis: A technique for automated software debugging. In EuroSys, 2010. Google ScholarDigital Library
- W. Zhang, C. Sun, and S. Lu. ConMem: Detecting severe concurrency bugs through an effect-oriented approach. In ASPLOS, 2010. Google ScholarDigital Library
Index Terms
- ConSeq: detecting concurrency bugs through sequential errors
Recommendations
ConMem: detecting severe concurrency bugs through an effect-oriented approach
ASPLOS '10Multicore technology is making concurrent programs increasingly pervasive. Unfortunately, it is difficult to deliver reliable concurrent programs, because of the huge and non-deterministic interleaving space. In reality, without the resources to ...
ConMem: detecting severe concurrency bugs through an effect-oriented approach
ASPLOS '10Multicore technology is making concurrent programs increasingly pervasive. Unfortunately, it is difficult to deliver reliable concurrent programs, because of the huge and non-deterministic interleaving space. In reality, without the resources to ...
ConSeq: detecting concurrency bugs through sequential errors
ASPLOS '11Concurrency bugs are caused by non-deterministic interleavings between shared memory accesses. Their effects propagate through data and control dependences until they cause software to crash, hang, produce incorrect output, etc. The lifecycle of a bug ...
Comments