skip to main content
research-article

Isolating and understanding concurrency errors using reconstructed execution fragments

Published:04 June 2011Publication History
Skip Abstract Section

Abstract

In this paper we propose Recon, a new general approach to concurrency debugging. Recon goes beyond just detecting bugs, it also presents to the programmer short fragments of buggy execution schedules that illustrate how and why bugs happened. These fragments, called reconstructions, are inferred from inter-thread communication surrounding the root cause of a bug and significantly simplify the process of understanding bugs.

The key idea in Recon is to monitor executions and build graphs that encode inter-thread communication with enough context information to build reconstructions. Recon leverages reconstructions built from multiple application executions and uses machine learning to identify which ones illustrate the root cause of a bug. Recon's approach is general because it does not rely on heuristics specific to any type of bug, application, or programming model. Therefore, it is able to deal with single- and multiple-variable concurrency bugs regardless of their type (e.g., atomicity violation, ordering, etc). To make graph collection efficient, Recon employs selective monitoring and allows metadata information to be imprecise without compromising accuracy. With these optimizations, Recon's graph collection imposes overheads typically between 5x and 20x for both C/C++ and Java programs, with overheads as low as 13% in our experiments. We evaluate Recon with buggy applications, and show it produces reconstructions that include all code points involved in bugs' causes, and presents them in an accurate order. We include a case study of understanding and fixing a previously unresolved bug to showcase Recon's effectiveness.

References

  1. Z. Anderson, D. Gay, R. Ennals, and E. Brewer. SharC: Checking Data Sharing Strategies for Multithreaded C. In PLDI, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. C. Bienia, S. Kumar, J. P. Singh, and K. Li. The parsec benchmark suite: Characterization and architectural implications. In Proceedings of the 17th International Conference on Parallel Architectures and Compilation Techniques, October 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. S. M. Blackburn et al. The DaCapo Benchmarks: Java Benchmarking Development and Analysis. In OOPSLA, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. H.-J. Boehm and S. V. Adve. Foundations of the C Concurrency Memory Model. In PLDI, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. S. Burckhardt, P. Kothari, M. Musuvathi, and S. Nagarakatte. A Randomized Scheduler with Probabilistic Guarantees of Finding Bugs. In ASPLOS, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. C. Flanagan and S. N. Freund. FastTrack: Efficient and Precise Dynamic Race Detection. In PLDI, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. C. Flanagan and S. N. Freund. The RoadRunner Dynamic Analysis Framework for Concurrent Programs. In PASTE, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. C. Flanagan, S. N. Freund, and J. Yi. Velodrome: A Sound and Complete Dynamic Atomicity Checker for Multithreaded Programs. In PLDI, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann, and I. H. Witten. The WEKA Data Mining Software: An Update. SIGKDD Explorations, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. C. Hammer, J. Dolby, M. Vaziri, and F. Tip. Dynamic Detection of Atomic-Set-Serializability Violations. In ICSE, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. G. Jin, A. Thakur, B. Liblit, and S. Lu. Instrumentation and Sampling Strategies for Cooperative Concurrency Bug Isolation. In OOPSLA, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. P. Joshi and K. Sen. Predictive Typestate Checking of Multithreaded Java Programs. In ASE, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. K. Kawachiya, A. Koseki, and T. Onodera. Lock Reservation: Java Locks Can Mostly Do Without Atomic Operations. In OOPSLA, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. I. Kononenko. Estimating Attributes: Analysis and Extensions of RELIEF. In European Conference on Machine Learning, 1994. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. B. Liblit. Cooperative Bug Isolation, volume 4440 of Lecture Notes in Computer Science. Springer, 2007.Google ScholarGoogle Scholar
  16. S. Lu, S. Park, E. Seo, and Y. Zhou. Learning from Mistakes - A Comprehensive Study on Real World Concurrency Bug Characteristics. In ASPLOS, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. S. Lu, J. Tucek, F. Qin, and Y. Zhou. AVIO: Detecting Atomicity Violations via Access Interleaving Invariants. In ASPLOS, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. B. Lucia and L. Ceze. Finding Concurrency Bugs with Context-Aware Communication Graphs. In MICRO, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. C.-K. Luk et al. Pin: Building Customized Program Analysis Tools with Dynamic Instrumentation. In PLDI, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. J. Manson, W. Pugh, and S. V. Adve. The Java Memory Model. In POPL, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. M. Musuvathi, S. Qadeer, T. Ball, G. Basler, P. A. Nainar, and I. Neamtiu. Finding and Reproducing Heisenbugs in Concurrent Programs. In OSDI, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. S. Park, S. Lu, and Y. Zhou. CTrigger: Exposing Atomicity Violation Bugs from Their Hiding Places. In ASPLOS, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. S. Park, R. W. Vuduc, and M. J. Harrold. Falcon: Fault Localization in Concurrent Programs. In ICSE, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. S. Park, Y. Zhou, W. Xiong, Z. Yin, R. Kaushik, K. H. Lee, and S. Lu. PRES: Probabilistic Replay with Execution Sketching on Multiprocessors. In SOSP, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. M. Ronsee and K. D. Bosschere. RecPlay: A Fully Integrated Practical Record/Replay System. ToCS, 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. S. Savage, M. Burrows, G. Nelson, P. Sobalvarro, and T. Anderson. Eraser: A Dynamic Data Race Detector for Multi-Threaded Programs. ToCS, 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. Y. Shi, S. Park, Z. Yin, S. Lu, Y. Zhou, W. Chen, and W. Zheng. Do I Use the Wrong Definition?: DeFuse: Definition-Use Invariants for Detecting Concurrency and Sequential Bugs. In OOPSLA, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. L. A. Smith, J. M. Bull, and J. Obdrzálek. A parallel Java Grande benchmark suite. In Supercomputing, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. B. P. Wood, A. Sampson, L. Ceze, and D. Grossman. Composable Specifications for Structured Shared-Memory Communication. In OOPSLA, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. M. Xu, R. Bodík, and M. D. Hill. A Serializability Violation Detector for Shared-Memory Server Programs. In PLDI, June 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. M. Xu, M. D. Hill, and R. Bodik. A Regulated Transitive Reduction (RTR) for Longer Memory Race Recording. In ASPLOS, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. J. Yu and S. Narayanasamy. A Case for an Interleaving Constrained Shared-Memory Multi-Processor. In ISCA, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. P. Zhou, R. Teodorescu, and Y. Zhou. HARD: Hardware-Assisted Lockset-based Race Detection. In HPCA, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Isolating and understanding concurrency errors using reconstructed execution fragments

        Recommendations

        Comments

        Login options

        Check if you have access through your login credentials or your institution to get full access on this article.

        Sign in

        Full Access

        • Published in

          cover image ACM SIGPLAN Notices
          ACM SIGPLAN Notices  Volume 46, Issue 6
          PLDI '11
          June 2011
          652 pages
          ISSN:0362-1340
          EISSN:1558-1160
          DOI:10.1145/1993316
          Issue’s Table of Contents
          • cover image ACM Conferences
            PLDI '11: Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation
            June 2011
            668 pages
            ISBN:9781450306638
            DOI:10.1145/1993498
            • General Chair:
            • Mary Hall,
            • Program Chair:
            • David Padua

          Copyright © 2011 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 4 June 2011

          Check for updates

          Qualifiers

          • research-article

        PDF Format

        View or Download as a PDF file.

        PDF

        eReader

        View online with eReader.

        eReader