Abstract
Concurrent data-structures, such as stacks, queues, and deques, often implicitly enforce a total order over elements in their underlying memory layout. However, much of this order is unnecessary: linearizability only requires that elements are ordered if the insert methods ran in sequence. We propose a new approach which uses timestamping to avoid unnecessary ordering. Pairs of elements can be left unordered if their associated insert operations ran concurrently, and order imposed as necessary at the eventual removal.
We realise our approach in a new non-blocking data-structure, the TS (timestamped) stack. Using the same approach, we can define corresponding queue and deque data-structures. In experiments on x86, the TS stack outperforms and outscales all its competitors -- for example, it outperforms the elimination-backoff stack by factor of two. In our approach, more concurrency translates into less ordering, giving less-contended removal and thus higher performance and scalability. Despite this, the TS stack is linearizable with respect to stack semantics.
The weak internal ordering in the TS stack presents a challenge when establishing linearizability: standard techniques such as linearization points work well when there exists a total internal order. We present a new stack theorem, mechanised in Isabelle, which characterises the orderings sufficient to establish stack semantics. By applying our stack theorem, we show that the TS stack is indeed linearizable. Our theorem constitutes a new, generic proof technique for concurrent stacks, and it paves the way for future weakly ordered data-structure designs.
Supplemental Material
Available for Download
A Scalable, Correct Time-Stamped Stack We provide an appendix and two artifacts: * A proof in Isabelle/HOL of the Stack Theorem (Theorem 1 in our paper). * The TS stack, a concurrent stack written in C, along with implementations of several other algorithms for comparison purposes.
- Y. Afek and A. Morrison. Fast concurrent queues for x86 processors. In phPPoPP. ACM, 2013. Google ScholarDigital Library
- H. Attiya, R. Guerraoui, D. Hendler, P. Kuznetsov, M. Michael, and M. Vechev. Laws of order: expensive synchronization in concurrent algorithms cannot be eliminated. In phPOPL, 2011. Google ScholarDigital Library
- G. Bar-Nissan, D. Hendler, and A. Suissa. A dynamic elimination-combining stack algorithm. In phOPODIS, 2011. Google ScholarDigital Library
- M. Batty, M. Dodds, and A. Gotsman. Library abstraction for C/CGoogle Scholar
- concurrency. In phPOPL, 2013.Google Scholar
- }scalComputational Systems Group, University of Salzburg. Scal framework. URL http://scal.cs.uni-salzburg.at.Google Scholar
- M. Gorelik and D. Hendler. Brief announcement: an asymmetric flat-combining based queue algorithm. In phPODC, 2013. Google ScholarDigital Library
- A. Haas, C. Kirsch, M. Lippautz, and H. Payer. How FIFO is your concurrent FIFO queue? In phRACES. ACM, 2012. Google ScholarDigital Library
- A. Haas, T. Henzinger, C. Kirsch, M. Lippautz, H. Payer, A. Sezgin, and A. Sokolova. Distributed queues in shared memory--multicore performance and scalability through quantitative relaxation. In phCF. ACM, 2013. Google ScholarDigital Library
- D. Hendler, N. Shavit, and L. Yerushalmi. A scalable lock-free stack algorithm. In phSPAA. ACM, 2004. Google ScholarDigital Library
- D. Hendler, I. Incze, N. Shavit, and M. Tzafrir. Flat combining and the synchronization-parallelism tradeoff. In phSPAA, 2010. Google ScholarDigital Library
- Henzinger, Payer, and Sezgin}Henzinger13bT. Henzinger, H. Payer, and A. Sezgin. Replacing competition with cooperation to achieve scalable lock-free FIFO queues. Technical Report IST-2013--124-v1Google Scholar
- 1, IST Austria, 2013\natexlaba.Google Scholar
- Henzinger, Sezgin, and Vafeiadis}Henzinger13T. A. Henzinger, A. Sezgin, and V. Vafeiadis. Aspect-oriented linearizability proofs. In phCONCUR, 2013\natexlabb. Google ScholarDigital Library
- M. Herlihy and N. Shavit. phThe Art of Multiprocessor Programming. Morgan Kaufmann Publishers Inc., 2008. Google ScholarDigital Library
- M. Herlihy and J. Wing. Linearizability: a correctness condition for concurrent objects. phTOPLAS, 12 (3), 1990. Google ScholarDigital Library
- M. Hoffman, O. Shalev, and N. Shavit. The baskets queue. In phOPODIS. Springer, 2007. Google ScholarDigital Library
- Intel. Intel 64 and ia-32 architectures software developer's manual, volume 3b: System programming guide, part 2, 2013. URL http://download.intel.com/products/processor/manual/253669.pdf.Google Scholar
- L. Lamport. Time, clocks, and the ordering of events in a distributed system. phCommunications ACM, 21, July 1978. Google ScholarDigital Library
- M. Michael and M. Scott. Simple, fast, and practical non-blocking and blocking concurrent queue algorithms. In phPODC. ACM, 1996. Google ScholarDigital Library
- H. H. Nguyen and M. Rinard. Detecting and eliminating memory leaks using cyclic memory allocation. In phISMM. ACM, 2007. Google ScholarDigital Library
- W. Ruan, Y. Liu, and M. Spear. Boosting timestamp-based transactional memory by exploiting hardware cycle counters. In phTRANSACT, 2013. Google ScholarDigital Library
- P. Sewell, S. Sarkar, S. Owens, F. Z. Nardelli, and M. O. Myreen. x86-TSO: a rigorous and usable programmer's model for x86 multiprocessors. phCommun. ACM, 53 (7), 2010. Google ScholarDigital Library
- R. Treiber. Systems programming: Coping with parallelism. Technical Report RJ5118, IBM Almaden Research Center, April 1986.Google Scholar
Index Terms
- A Scalable, Correct Time-Stamped Stack
Recommendations
A scalable lock-free stack algorithm
SPAA '04: Proceedings of the sixteenth annual ACM symposium on Parallelism in algorithms and architecturesThe literature describes two high performance concurrent stack algorithms based on combining funnels and elimination trees. Unfortunately, the funnels are linearizable but blocking, and the elimination trees are non-blocking but not linearizable. ...
A choice of variational stacks: exploring variational data structures
VaMoS '17: Proceedings of the 11th International Workshop on Variability Modelling of Software-Intensive SystemsMany applications require not only representing variability in software and data, but also computing with it. To do so efficiently requires variational data structures that make the variability explicit in the underlying data and the operations used to ...
Linearizability: a correctness condition for concurrent objects
A concurrent object is a data object shared by concurrent processes. Linearizability is a correctness condition for concurrent objects that exploits the semantics of abstract data types. It permits a high degree of concurrency, yet it permits ...
Comments