RACEZ: a lightweight and non-invasive race detection tool for production applications

Published: 21 May 2011
DOI: 10.1145/1985793.1985848

ABSTRACT

Concurrency bugs, particularly data races, are notoriously difficult to debug and are a significant source of unreliability in multithreaded applications. Many tools for catching data races rely on program instrumentation to obtain memory instruction traces. Unfortunately, this instrumentation introduces significant runtime overhead, is extremely invasive, or has a limited domain of applicability, making these tools unsuitable for many production systems. Consequently, they are typically used only during application testing, where many data races go undetected.

This paper proposes RACEZ, a novel race detection mechanism that uses a sampled memory trace collected by the hardware performance monitoring unit rather than invasive instrumentation. The approach introduces only a modest overhead, making it usable in production environments. We validate RACEZ using two open source server applications and the PARSEC benchmarks. Our experiments show that RACEZ catches a set of known bugs with reasonable probability while introducing only a 2.8% runtime slowdown on average.
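The abstract does not spell out the analysis that RACEZ runs over the sampled trace. As a rough, hypothetical illustration only, the Python sketch below applies a classic Eraser-style lockset check to a made-up sampled-access record; it is not the paper's algorithm, and the Sample fields (thread id, address, access type, held locks) are assumptions about what a PMU-based sampler combined with lightweight lock tracking could provide.

# Illustrative sketch only -- not the authors' implementation. It applies an
# Eraser-style lockset check to a hypothetical sampled-access trace; the
# Sample record format stands in for whatever a PMU-based sampler provides.

from collections import defaultdict
from typing import FrozenSet, List, NamedTuple, Tuple


class Sample(NamedTuple):
    thread: int            # id of the thread that issued the access
    addr: int              # sampled effective memory address
    is_write: bool         # True for stores, False for loads
    locks: FrozenSet[str]  # locks held by the thread at sample time


def find_candidate_races(samples: List[Sample]) -> List[Tuple[int, Sample, Sample]]:
    """Flag pairs of sampled accesses to the same address from different
    threads where at least one access is a write and the two locksets are
    disjoint (i.e., no common lock protects the location)."""
    by_addr = defaultdict(list)
    for s in samples:
        by_addr[s.addr].append(s)

    reports = []
    for addr, accesses in by_addr.items():
        for i in range(len(accesses)):
            for j in range(i + 1, len(accesses)):
                a, b = accesses[i], accesses[j]
                if a.thread == b.thread:
                    continue                 # same thread: not a race
                if not (a.is_write or b.is_write):
                    continue                 # read/read pairs are benign
                if a.locks & b.locks:
                    continue                 # a common lock serializes them
                reports.append((addr, a, b))
    return reports


if __name__ == "__main__":
    trace = [
        Sample(thread=1, addr=0x1000, is_write=True, locks=frozenset({"m"})),
        Sample(thread=2, addr=0x1000, is_write=True, locks=frozenset()),   # unprotected write
        Sample(thread=2, addr=0x2000, is_write=False, locks=frozenset({"m"})),
    ]
    for addr, a, b in find_candidate_races(trace):
        print(f"possible race on {hex(addr)}: thread {a.thread} vs thread {b.thread}")

Because the trace is sampled, a check like this can only report a race when both conflicting accesses happen to be sampled, which is consistent with the abstract's claim of catching known bugs "with reasonable probability" rather than deterministically.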


Published in

ICSE '11: Proceedings of the 33rd International Conference on Software Engineering
May 2011, 1258 pages
ISBN: 9781450304450
DOI: 10.1145/1985793
Copyright © 2011 ACM
Publisher

Association for Computing Machinery, New York, NY, United States
