ABSTRACT
Compilers sometimes generate correct sequential code but break the concurrency memory model of the programming language: these subtle compiler bugs are observable only when the miscompiled functions interact with concurrent contexts, making them particularly hard to detect. In this work we design a strategy to reduce the hard problem of hunting concurrency compiler bugs to differential testing of sequential code and build a tool that puts this strategy to work. Our first contribution is a theory of sound optimisations in the C11/C++11 memory model, covering most of the optimisations we have observed in real compilers and validating the claim that common compiler optimisations are sound in the C11/C++11 memory model. Our second contribution is to show how, building on this theory, concurrency compiler bugs can be identified by comparing the memory trace of compiled code against a reference memory trace for the source code. Our tool identified several mistaken write introductions and other unexpected behaviours in the latest release of the gcc compiler.
- The cpptest tool. smallhttp://www.di.ens.fr/~zappa/projects/cmmtest/.Google Scholar
- S. V. Adve and M. D. Hill. Weak ordering - a new definition. In ISCA, 1990. Google ScholarDigital Library
- J. Alglave, L. Maranget, S. Sarkar, and P. Sewell. Litmus: Running tests against hardware. In TACAS, 2011. Google ScholarDigital Library
- M. Batty, S. Owens, S. Sarkar, P. Sewell, and T. Weber. Mathematizing C concurrency. In POPL, 2011. Google ScholarDigital Library
- M. Batty, K. Memarian, S. Owens, S. Sarkar, and P. Sewell. Clarifying and compiling C/C concurrency: from C 11 to POWER. In POPL, 2012. Google ScholarDigital Library
- P. Becker. phStandard for Programming Language C- ISO/IEC 14882, 2011.Google Scholar
- H.-J. Boehm and S. V. Adve. Foundations of the C++concurrency memory model. In PLDI, 2008. Google ScholarDigital Library
- A. S. Boujarwah and K. Saleh. Compiler test case generation methods: a survey and assessment. Information & Software Technology, 39 (9): 617--625, 1997.Google ScholarCross Ref
- C. J. Burgess and M. Saidi. The automatic generation of test cases for optimizing fortran compilers. Information & Software Technology, 38 (2): 111--119, 1996.Google ScholarCross Ref
- E. Eide and J. Regehr. Volatiles are miscompiled and what to do about it. EMSOFT, 2008. Google ScholarDigital Library
- A. C. J. Fox and M. O. Myreen. A trustworthy monadic formalization of the ARMv7 instruction set architecture. In ITP, 2010. Google ScholarDigital Library
- C. Lindig. Random testing of C calling conventions. In AADEBUG, 2005. Google ScholarDigital Library
- C.-K. Luk, R. Cohn, R. Muth, H. Patil, A. Klauser, G. Lowney, S. Wallace, V. J. Reddi, and K. Hazelwood. Pin: building customized program analysis tools with dynamic instrumentation. In PLDI, 2005. Google ScholarDigital Library
- W. M. McKeeman. Differential testing for software. Digital Technical Journal, 10 (1): 100--107, 1998.Google Scholar
- P. E. McKenney and R. Silvera, 2011. smallhttp://www.rdrop.com/users/paulmck/scalability/!paper/N2745r.2011.03.04a.html.Google Scholar
- J. Regehr, Y. Chen, P. Cuoq, E. Eide, C. Ellison, and X. Yang. Test-case reduction for C compiler bugs. In PLDI, 2012. Google ScholarDigital Library
- J. Ševčík. Program Transformations in Weak Memory Models. PhD thesis, University of Edinburgh, 2008.Google Scholar
- J. Ševčík The Sun Hotspot JVM does not conform with the Java memory model. Technical Report EDI-INF-RR-1252, School of Informatics, University of Edinburgh, 2008.Google Scholar
- J. Ševčík. Safe optimisations for shared-memory concurrent programs. In PLDI, 2011. Google ScholarDigital Library
- D. Shasha and M. Snir. Efficient and correct execution of parallel programs that share memory. phACM Transactions on Programming Languages and Systems, 10 (2), 1988. Google ScholarDigital Library
- F. Sheridan. Practical testing of a C99 compiler using output comparison. Software: Practice and Experience, 37 (14): 1475--1488, 2007. Google ScholarDigital Library
- R. L. Solder. A general test data generator for COBOL. In AFIPS Joint Computer Conferences, 1962. Google ScholarDigital Library
- A. Terekhov. Brief tentative example x86 implementation for C/C memory model, 2008. http://www.decadent.org.uk/pipermail/~cpp-threads/2008-December/001933.html.Google Scholar
- X. Yang, Y. Chen, E. Eide, and J. Regehr. Finding and understanding bugs in C compilers. In PLDI, 2011. Google ScholarDigital Library
- C. Zhao, Y. Xue, Q. Tao, L. Guo, and Z. Wang. Automated test program generation for an industrial optimizing compiler. In AST, 2009.Google Scholar
- J. Zhao, S. Nagarakatte, M. M. K. Martin, and S. Zdancewic. Formalizing the LLVM intermediate representation for verified program transformations. In POPL, 2012. Google ScholarDigital Library
Index Terms
- Compiler testing via a theory of sound optimisations in the C11/C++11 memory model
Recommendations
A Survey of Compiler Testing
Virtually any software running on a computer has been processed by a compiler or a compiler-like tool. Because compilers are such a crucial piece of infrastructure for building software, their correctness is of paramount importance. To validate and ...
Compiler testing via a theory of sound optimisations in the C11/C++11 memory model
PLDI '13Compilers sometimes generate correct sequential code but break the concurrency memory model of the programming language: these subtle compiler bugs are observable only when the miscompiled functions interact with concurrent contexts, making them ...
Toward understanding compiler bugs in GCC and LLVM
ISSTA 2016: Proceedings of the 25th International Symposium on Software Testing and AnalysisCompilers are critical, widely-used complex software. Bugs in them have significant impact, and can cause serious damage when they silently miscompile a safety-critical application. An in-depth understanding of compiler bugs can help detect and fix ...
Comments