ABSTRACT
Flat combining (FC) and transactional lock elision (TLE) are two techniques that facilitate efficient multi-thread access to a sequentially implemented data structure protected by a lock. FC allows threads to delegate their operations to another (combiner) thread, and benefit from executing multiple operations by that thread under the lock through combining and elimination optimizations tailored to the specific data structure. TLE employs hardware transactional memory (HTM) that allows multiple threads to apply their operations concurrently as long as they do not conflict. This paper explores how these two radically different techniques can complement one another, and introduces the HTM-assisted Combining Framework (HCF). HCF leverages HTM to allow multiple combiners to run concurrently with each other, as well as with other, non-combiner threads. This makes HCF a good fit for data structures and workloads in which some operations may conflict with each other while others may run concurrently without conflicts. HCF achieves all that with changes to the sequential code similar to those required by TLE and FC, and in particular, without requiring the programmer to reason about concurrency.
- Yehuda Afek, Amir Levy, and Adam Morrison. 2014. Software-improved hardware lock elision. In Proceedings of ACM PODC. 212--221. Google ScholarDigital Library
- Trevor Brown, Alex Kogan, Yossi Lev, and Victor Luchangco. 2016. Investigating the Performance of Hardware Transactions on a Multi-Socket Machine. In Proceedings of ACM SPAA. 121--132. Google ScholarDigital Library
- Vladimir Budovsky. 2010. Combining Techniques Application for Tree Search Structures. Master's thesis. Tel Aviv University.Google Scholar
- Dave Dice, Alex Kogan, Yossi Lev, Timothy Merrifield, and Mark Moir. 2014. Adaptive integration of hardware and software lock elision techniques. In Proceedings of the ACM Symposium on Parallelism in Algorithms and Architectures (SPAA). 188--197. Google ScholarDigital Library
- Dave Dice, Yossi Lev, Mark Moir, and Daniel Nussbaum. 2009. Early experience with a commercial hardware transactional memory implementation. In Proceedings of ASPLOS. 157--168. Google ScholarDigital Library
- Nuno Diegues, Paolo Romano, and Stoyan Garbatov. 2015. Seer: Probabilistic Scheduling for Hardware Transactional Memory. In Proceedings of the ACM Symposium on Parallelism in Algorithms and Architectures (SPAA). 224--233. Google ScholarDigital Library
- Nuno Diegues, Paolo Romano, and Luıs Rodrigues. 2014. Virtues and Limitations of Commodity Hardware Transactional Memory. In Proceedings of the International Conference on Parallel Architectures and Compilation (PACT). 3--14. Google ScholarDigital Library
- Dana Drachsler-Cohen and Erez Petrank. 2014. LCD: Local Combining on Demand. In Proceedings of OPODIS. 355--371. Google ScholarCross Ref
- Panagiota Fatourou and Nikolaos D. Kallimanis. 2012. Revisiting the Combining Synchronization Technique. In Proceedings of ACM PPOPP. 257--266. Google ScholarDigital Library
- Steve Heller, Maurice Herlihy, Victor Luchangco, Mark Moir, William N. Scherer, and Nir Shavit. 2005. A Lazy Concurrent List-based Set Algorithm. In Proceedings of the Int. Conference on Principles of Distributed Systems (OPODIS). 3--16.Google Scholar
- Danny Hendler, Itai Incze, Nir Shavit, and Moran Tzafrir. 2010a. Flat Combining and the Synchronization-parallelism Tradeoff. In Proceedings of the ACM Symposium on Parallelism in Algorithms and Architectures (SPAA). 355--364. Google ScholarDigital Library
- Danny Hendler, Itai Incze, Nir Shavit, and Moran Tzafrir. 2010b. Scalable Flat-Combining Based Synchronous Queues. In Proceedings of DISC. 79--93. Google ScholarCross Ref
- Takuya Nakaike, Rei Odaira, Matthew Gaudet, Maged Michael, and Hisanobu Tomari. 2015. Quantitative Comparison of Hardware Transactional Memory for Blue Gene/Q, zEnterprise EC12, Intel Core, and POWER8. In Proceedings of the International Symposium on Computer Architecture (ISCA). 144--157. Google ScholarDigital Library
- Ravi Rajwar and James R. Goodman. 2001. Speculative Lock Elision: Enabling Highly Concurrent Multithreaded Execution. In Proceedings of the ACM/IEEE International Symposium on Microarchitecture (Micro). 294--305. Google ScholarCross Ref
- Richard M. Yoo, Christopher J. Hughes, Konrad Lai, and Ravi Rajwar. 2013. Performance evaluation of Intel® transactional synchronization extensions for high-performance computing. In International Conference for High Performance Computing, Networking, Storage and Analysis (SC). Article 19.Google ScholarDigital Library
Index Terms
- Transactional Lock Elision Meets Combining
Recommendations
Refined transactional lock elision
PPoPP '16Transactional lock elision (TLE) is a well-known technique that exploits hardware transactional memory (HTM) to introduce concurrency into lock-based software. It achieves that by attempting to execute a critical section protected by a lock in an atomic ...
Refined transactional lock elision
PPoPP '16: Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel ProgrammingTransactional lock elision (TLE) is a well-known technique that exploits hardware transactional memory (HTM) to introduce concurrency into lock-based software. It achieves that by attempting to execute a critical section protected by a lock in an atomic ...
Software-improved hardware lock elision
PODC '14: Proceedings of the 2014 ACM symposium on Principles of distributed computingWith hardware transactional memory (HTM) becoming available in mainstream processors, lock-based critical sections may now initiate a hardware transaction instead of taking the lock, enabling their concurrent execution unless a real data conflict ...
Comments