Abstract
The shift from single to multiple core architectures means that programmers must write concurrent, multithreaded programs in order to increase application performance. Unfortunately, multithreaded applications are susceptible to numerous errors, including deadlocks, race conditions, atomicity violations, and order violations. These errors are notoriously difficult for programmers to debug.
This paper presents Grace, a software-only runtime system that eliminates concurrency errors for a class of multithreaded programs: those based on fork-join parallelism. By turning threads into processes, leveraging virtual memory protection, and imposing a sequential commit protocol, Grace provides programmers with the appearance of deterministic, sequential execution, while taking advantage of available processing cores to run code concurrently and efficiently. Experimental results demonstrate Grace's effectiveness: with modest code changes across a suite of computationally-intensive benchmarks (1-16 lines), Grace can achieve high scalability and performance while preventing concurrency errors.
- M. Abadi, A. Birrell, T. Harris, and M. Isard. Semantics of transactional memory and automatic mutual exclusion. In POPL '08: Proceedings of the 35th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages, pages 63--74, New York, NY, USA, 2008. ACM. Google ScholarDigital Library
- D. F. Bacon and S. C. Goldstein. Hardware-assisted replay of multiprocessor programs. In PADD '91: Proceedings of the 1991 ACM/ONR workshop on Parallel and distributed debugging, pages 194--206, New York, NY, USA, 1991. ACM. Google ScholarDigital Library
- E. D. Berger, K. S. McKinley, R. D. Blumofe, and P. R. Wilson. Hoard: A scalable memory allocator for multithreaded applications. In Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-IX), pages 117--128, New York, NY, USA, Nov. 2000. ACM. Google ScholarDigital Library
- E. D. Berger, B. G. Zorn, and K. S. McKinley. Composing high-performance memory allocators. In Proceedings of the 2001 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2001), pages 114--124, New York, NY, USA, June 2001. ACM. Google ScholarDigital Library
- R. D. Blumofe, C. F. Joerg, B. C. Kuszmaul, C. E. Leiserson, K. H. Randall, and Y. Zhou. Cilk: an efficient multithreaded runtime system. J. Parallel Distrib. Comput., 37(1):55--69, 1996. Google ScholarDigital Library
- C. Blundell, E. C. Lewis, and M. M. K. Martin. Deconstructing transactions: The subtleties of atomicity. In WDDD '05: 4th Workshop on Duplicating, Deconstructing, and Debunking, June 2005.Google Scholar
- J. B. Carter, J. K. Bennett, and W. Zwaenepoel. Implementation and performance of Munin. In SOSP '91: Proceedings of the Thirteenth ACM Symposium on Operating Systems Principles, pages 152--164, New York, NY, USA, 1991. ACM. Google ScholarDigital Library
- G.-I. Cheng, M. Feng, C. E. Leiserson, K. H. Randall, and A. F. Stark. Detecting data races in cilk programs that use locks. In SPAA '98: Proceedings of the tenth annual ACM symposium on Parallel algorithms and architectures, pages 298--309, New York, NY, USA, 1998. ACM. Google ScholarDigital Library
- J. Dean and S. Ghemawat. MapReduce: simplified data processing on large clusters. In OSDI'04: Proceedings of the 6th conference on Symposium on Opearting Systems Design&Implementation, pages 10--10, Berkeley, CA, USA, 2004. USENIX Association. Google ScholarDigital Library
- J. Devietti, B. Lucia, L. Ceze, and M. Oskin. DMP: deterministic shared memory multiprocessing. In ASPLOS '09: Proceedings of the 14th International Conference on Architectural Support for Programming Languages and Operating Systems, pages 85--96, New York, NY, USA, 2009. ACM. Google ScholarDigital Library
- D. Dice, O. Shalev, and N. Shavit. Transactional locking ii. In S. Dolev, editor, DISC, volume 4167 of Lecture Notes in Computer Science, pages 194--208. Springer, 2006. Google ScholarDigital Library
- C. Ding, X. Shen, K. Kelsey, C. Tice, R. Huang, and C. Zhang. Software behavior oriented parallelization. In PLDI '07: Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation, pages 223--234, New York, NY, USA, 2007. ACM. Google ScholarDigital Library
- M. Feng and C. E. Leiserson. Efficient detection of determinacy races in cilk programs. In SPAA '97: Proceedings of the ninth annual ACM symposium on Parallel algorithms and architectures, pages 1--11, New York, NY, USA, 1997. ACM. Google ScholarDigital Library
- C. Flanagan and S. Qadeer. A type and effect system for atomicity. In PLDI '03: Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation, pages 338--349, New York, NY, USA, 2003. ACM. Google ScholarDigital Library
- T. Harris and K. Fraser. Language support for lightweight transactions. In OOPSLA '03: Proceedings of the 18th annual ACM SIGPLAN conference on Object-oriented programing, systems, languages, and applications, pages 388--402, New York, NY, USA, 2003. ACM. Google ScholarDigital Library
- J. W. Havender. Avoiding deadlock in multitasking systems. IBM Systems Journal, 7(2):74--84, 1968.Google ScholarDigital Library
- M. Herlihy and J. E. B. Moss. Transactional memory: architectural support for lock-free data structures. In ISCA '93: Proceedings of the 20th annual international symposium on Computer architecture, pages 289--300, New York, NY, USA, 1993. ACM. Google ScholarDigital Library
- K. Huang. Data-race detection in transactions-everywhere parallel programming. Master's thesis, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, June 2003.Google Scholar
- R. L. Hudson, B. Saha, A.-R. Adl-Tabatabai, and B. C. Hertzberg. Mcrt-malloc: a scalable transactional memory allocator. In ISMM '06: Proceedings of the 5th International Symposium on Memory Management, pages 74--83, New York, NY, USA, 2006. ACM. Google ScholarDigital Library
- M. Isard and A. Birrell. Automatic mutual exclusion. In HotOS XI: 11th Workshop on Hot Topics in Operating Systems, Berkeley, CA, May 2007. Google ScholarDigital Library
- R. L. B. Jr., V. S. Adve, D. Dig, S. Adve, S. Heumann, R. Komuravelli, J. Overbey, P. Simmons, H. Sung, and M. Vakilian. A type and effect system for deterministic parallel Java. In OOPSLA '09: Proceedings of the 24th ACM SIGPLAN Conference on Object-oriented Programming Systems, Languages, and Applications, New York, NY, USA, 2009. ACM. Google ScholarDigital Library
- P. Keleher, A. L. Cox, S. Dwarkadas, and W. Zwaenepoel. Treadmarks: Distributed shared memory on standard workstations and operating systems. In WTEC'94: Proceedings of the USENIX Winter 1994 Technical Conference, pages 10--10, Berkeley, CA, USA, 1994. USENIX Association. Google ScholarDigital Library
- J. R. Larus and R. Rajwar. Transactional Memory. Morgan&Claypool, 2006.Google Scholar
- D. Lea. A Java fork/join framework. In JAVA '00: Proceedings of the ACM 2000 conference on Java Grande, pages 36--43, New York, NY, USA, 2000. ACM. Google ScholarDigital Library
- S. Lu, S. Park, C. Hu, X. Ma, W. Jiang, Z. Li, R. A. Popa, and Y. Zhou. MUVI: automatically inferring multi-variable access correlations and detecting related semantic and concurrency bugs. In SOSP '07: Proceedings of the Twenty-First ACM SIGOPS Symposium on Operating Systems Principles, pages 103--116, New York, NY, USA, 2007. ACM. Google ScholarDigital Library
- S. Lu, S. Park, E. Seo, and Y. Zhou. Learning from mistakes: a comprehensive study on real world concurrency bug characteristics. In ASPLOS XIII: Proceedings of the 13th international conference on Architectural support for programming languages and operating systems, pages 329--339, New York, NY, USA, 2008. ACM. Google ScholarDigital Library
- B. Lucia, J. Devietti, K. Strauss, and L. Ceze. Atom-Aid: Detecting and surviving atomicity violations. In ISCA '08: Proceedings of the 35th Annual International Symposium on Computer Architecture, New York, NY, USA, June 2008. ACM Press. Google ScholarDigital Library
- C.-K. Luk, R. Cohn, R. Muth, H. Patil, A. Klauser, G. Lowney, S. Wallace, V. J. Reddi, and K. Hazelwood. Pin: building customized program analysis tools with dynamic instrumentation. In PLDI '05: Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation, pages 190--200, New York, NY, USA, 2005. ACM. Google ScholarDigital Library
- C. E. McDowell and D. P. Helmbold. Debugging concurrent programs. ACM Comput. Surv., 21(4):593--622, 1989. Google ScholarDigital Library
- R. H. B. Netzer and B. P. Miller. What are race conditions?: Some issues and formalizations. ACM Lett. Program. Lang. Syst., 1(1):74--88, 1992. Google ScholarDigital Library
- Y. Ni, A. Welc, A.-R. Adl-Tabatabai, M. Bach, S. Berkowits, J. Cownie, R. Geva, S. Kozhukow, R. Narayanaswamy, J. Olivier, S. Preis, B. Saha, A. Tal, and X. Tian. Design and implementation of transactional constructs for C/C++. In OOPSLA '08: Proceedings of the 23rd ACM SIGPLAN Conference on Object-oriented Programming Systems, Languages, and Applications, pages 195--212, New York, NY, USA, 2008. ACM. Google ScholarDigital Library
- M. Olszewski, J. Ansel, and S. Amarasinghe. Kendo: efficient deterministic multithreading in software. In ASPLOS '09: Proceedings of the 14th International Conference on Architectural Support for Programming Languages and Operating Systems, pages 97--108, New York, NY, USA, 2009. ACM. Google ScholarDigital Library
- S. Rajamani, G. Ramalingam, V. P. Ranganath, and K. Vaswani. ISOLATOR: dynamically ensuring isolation in comcurrent programs. In ASPLOS '09: Proceeding of the 14th International Conference on Architectural Support for Programming Languages and Operating Systems, pages 181--192, New York, NY, USA, 2009. ACM. Google ScholarDigital Library
- C. Ranger, R. Raghuraman, A. Penmetsa, G. Bradski, and C. Kozyrakis. Evaluating MapReduce for multi-core and multiprocessor systems. In Proceedings of the 13th Intl. Symposium on High-Performance Computer Architecture (HPCA), feb 2007. Google ScholarDigital Library
- J. Reinders. Intel Threading Building Blocks: Outfitting C++ for Multi-core Processor Parallelism. O'Reilly Media, Inc., 2007. Google ScholarDigital Library
- N. Shavit and D. Touitou. Software transactional memory. In PODC '95: Proceedings of the fourteenth annual ACM symposium on Principles of distributed computing, pages 204--213, New York, NY, USA, 1995. ACM. Google ScholarDigital Library
- T. Shpeisman, V. Menon, A.-R. Adl-Tabatabai, S. Balensiefer, D. Grossman, R. L. Hudson, K. F. Moore, and B. Saha. Enforcing isolation and ordering in S™. In PLDI '07: Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation, pages 78--88, New York, NY, USA, 2007. ACM. Google ScholarDigital Library
- C. von Praun, L. Ceze, and C. Caşcaval. Implicit parallelism with ordered transactions. In PPoPP '07: Proceedings of the 12th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pages 79--89, New York, NY, USA, 2007. ACM. Google ScholarDigital Library
- A. Welc, S. Jagannathan, and A. Hosking. Safe futures for Java. In OOPSLA '05: Proceedings of the 20th annual ACM SIGPLAN Conference on Object oriented Programming, Systems, Languages, and applications, pages 439--453, New York, NY, USA, 2005. ACM. Google ScholarDigital Library
- A. Welc, B. Saha, and A.-R. Adl-Tabatabai. Irrevocable transactions and their applications. In SPAA '08: Proceedings of the Twentieth Annual Symposium on Parallelism in Algorithms and Architectures, pages 285--296, New York, NY, USA, 2008. ACM. Google ScholarDigital Library
Index Terms
- Grace: safe multithreaded programming for C/C++
Recommendations
Grace: safe multithreaded programming for C/C++
OOPSLA '09: Proceedings of the 24th ACM SIGPLAN conference on Object oriented programming systems languages and applicationsThe shift from single to multiple core architectures means that programmers must write concurrent, multithreaded programs in order to increase application performance. Unfortunately, multithreaded applications are susceptible to numerous errors, ...
Adding Concurrency to Smart Contracts
PODC '17: Proceedings of the ACM Symposium on Principles of Distributed ComputingModern cryptocurrency systems, such as Ethereum, permit complex financial transactions through scripts called smart contracts. These smart contracts are executed many, many times, always without real concurrency. First, all smart contracts are serially ...
Nested parallelism in transactional memory
PPoPP '08: Proceedings of the 13th ACM SIGPLAN Symposium on Principles and practice of parallel programmingThis paper investigates adding transactions with nested parallelism and nested transactions to a dynamically multithreaded parallel programming language that generates only series-parallel programs. We describe XConflict, a data structure that ...
Comments