ABSTRACT
The upcoming C and C++ revised standards add concurrency to the languages, for the first time, in the form of a subtle *relaxed memory model* (the *C++11 model*). This aims to permit compiler optimisation and to accommodate the differing relaxed-memory behaviours of mainstream multiprocessors, combining simple semantics for most code with high-performance *low-level atomics* for concurrency libraries. In this paper, we first establish two simpler but provably equivalent models for C++11, one for the full language and another for the subset without consume operations. Subsetting further to the fragment without low-level atomics, we identify a subtlety arising from atomic initialisation and prove that, under an additional condition, the model is equivalent to sequential consistency for race-free programs.
We then prove our main result, the correctness of two proposed compilation schemes for the C++11 load and store concurrency primitives to Power assembly, having noted that an earlier proposal was flawed. (The main ideas apply also to ARM, which has a similar relaxed memory architecture.)
This should inform the ongoing development of production compilers for C++11 and C1x, clarifies what properties of the machine architecture are required, and builds confidence in the C++11 and Power semantics.
Supplemental Material
- Mustaque Ahamad, Rida A. Bazzi, Ranjit John, Prince Kohli, and Gil Neiger. The power of processor consistency. In Proc. SPAA, 1993. Google ScholarDigital Library
- S. V. Adve and M. D. Hill. Weak ordering -- a new definition. In Proc. ISCA, 1990. Google ScholarDigital Library
- J. Alglave. A Shared Memory Poetics. PhD thesis, Université Paris 7 and INRIA, 2010.Google Scholar
- J. Alglave and L. Maranget. Stability in weak memory models. In Proc. CAV, 2011. Google ScholarDigital Library
- J. Alglave, L. Maranget, S. Sarkar, and P. Sewell. Fences in weak memory models. In Proc. CAV, 2010. Google ScholarDigital Library
- H.-J. Boehm and S.V. Adve. Foundations of the C+ concurrency memory model. In Proc. PLDI, 2008. Google ScholarDigital Library
- P. Becker, editor. Programming Languages -- C+. 2011. ISO/IEC 14882:2011. A non-final but recent version is available at http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2011/n3242.pdf.Google Scholar
- M. Batty, K. Memarian, S. Owens, S. Sarkar, and P. Sewell. http://www.cl.cam.ac.uk/users/pes20/cppppc.Google Scholar
- S. Burckhardt, M. Musuvathi, and V. Singh. Verifying local transformations on relaxed memory models. In CC, 2010. Google ScholarDigital Library
- Hans Boehm. Atomic synchronization sequences, 2011. Mailing list communication, July 18th.Google Scholar
- M. Batty, S. Owens, S. Sarkar, P. Sewell, and T. Weber. Mathematizing C+ concurrency: The post-Rapperswil model. Technical Report N3132, ISO IEC JTC1/SC22/WG21, August 2010. http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3132.pdf.Google Scholar
- M. Batty, S. Owens, S. Sarkar, P. Sewell, and T. Weber. Mathematizing C+ concurrency. In Proc. POPL, 2011. Google ScholarDigital Library
- J. C. Blanchette, T. Weber, M. Batty, S. Owens, and S. Sarkar. Nitpicking C+ concurrency. In Proc. PPDP, 2011. Google ScholarDigital Library
- X. Fang, J. Lee, and S. P. Midkiff. Automatic fence insertion for shared memory multiprocessing. In Proc. ICS, 2003. Google ScholarDigital Library
- L. Lamport. How to make a multiprocessor computer that correctly executes multiprocess programs. IEEE Trans. Comput., C-28(9):690--691, 1979. Google ScholarDigital Library
- D. Lea. The JSR-133 cookbook for compiler writers. http://gee.cs.oswego.edu/dl/jmm/cookbook.html.Google Scholar
- X. Leroy. A formally verified compiler back-end. Journal of Automated Reasoning, 43(4):363--446, 2009. Google ScholarDigital Library
- A. Lochbihler. Verifying a compiler for Java threads. In Proc. ESOP'10, 2010. Google ScholarDigital Library
- P. E. McKenney and R. Silvera. Example POWER implementation for C/C+ memory model. http://www.rdrop.com/users/paulmck/scalability/paper/N2745r.2011.03.04a%.html, 2011.Google Scholar
- S. Owens, P. Böhm, F. Zappa Nardelli, and P. Sewell. Lem: A lightweight tool for heavyweight semantics. In Proc. ITP, LNCS 6898, 2011. "Rough Diamond" section. Google ScholarDigital Library
- J. Sevcik. Safe optimisations for shared-memory concurrent programs. In Proc. PLDI, 2011. Google ScholarDigital Library
- D. Shasha and M. Snir. Efficient and correct execution of parallel programs that share memory. TOPLAS, 10:282--312, 1988. Google ScholarDigital Library
- S. Sarkar, P. Sewell, J. Alglave, L. Maranget, and D. Williams. Understanding POWER multiprocessors. In PLDI, 2011. Google ScholarDigital Library
- P. Sewell, S. Sarkar, S. Owens, F. Zappa Nardelli, and M. O. Myreen. x86-TSO: A rigorous and usable programmer's model for x86 multiprocessors. C. ACM, 53(7):89--97, 2010. Google ScholarDigital Library
- J. Sevcik, V. Vafeiadis, F. Zappa Nardelli, S. Jagannathan, and P. Sewell. Relaxed-memory concurrency and verified compilation. In Proc. POPL, 2011. Google ScholarDigital Library
- V. Vafeiadis and F. Zappa Nardelli. Verifying fence elimination optimisations. In Proc. SAS, 2011. Google ScholarDigital Library
Index Terms
- Clarifying and compiling C/C++ concurrency: from C++11 to POWER
Recommendations
Mathematizing C++ concurrency
POPL '11: Proceedings of the 38th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languagesShared-memory concurrency in C and C++ is pervasive in systems programming, but has long been poorly defined. This motivated an ongoing shared effort by the standards committees to specify concurrent behaviour in the next versions of both languages. ...
Clarifying and compiling C/C++ concurrency: from C++11 to POWER
POPL '12The upcoming C and C++ revised standards add concurrency to the languages, for the first time, in the form of a subtle *relaxed memory model* (the *C++11 model*). This aims to permit compiler optimisation and to accommodate the differing relaxed-memory ...
Mathematizing C++ concurrency
POPL '11Shared-memory concurrency in C and C++ is pervasive in systems programming, but has long been poorly defined. This motivated an ongoing shared effort by the standards committees to specify concurrent behaviour in the next versions of both languages. ...
Comments