2009 | OriginalPaper | Buchkapitel
NB-FEB: A Universal Scalable Easy-to-Use Synchronization Primitive for Manycore Architectures
verfasst von : Phuong Hoai Ha, Philippas Tsigas, Otto J. Anshus
Erschienen in: Principles of Distributed Systems
Verlag: Springer Berlin Heidelberg
Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.
Wählen Sie Textabschnitte aus um mit Künstlicher Intelligenz passenden Patente zu finden. powered by
Markieren Sie Textabschnitte, um KI-gestützt weitere passende Inhalte zu finden. powered by
This paper addresses the problem of universal synchronization primitives that can support
scalable
thread synchronization for large-scale manycore architectures. The universal synchronization primitives that have been deployed widely in conventional architectures, are the
compare-and-swap
(CAS) and
load-linked/store-conditional
(LL/SC) primitives. However, such synchronization primitives are expected to reach their scalability limits in the evolution to manycore architectures with thousands of cores.
We introduce a
non-blocking
full/empty bit primitive, or NB-FEB for short, as a promising synchronization primitive for parallel programming on manycore architectures. We show that the NB-FEB primitive is
universal, scalable, feasible
and
easy to use
. NB-FEB, together with registers, can solve the consensus problem for an arbitrary number of processes (
universality
). NB-FEB is
combinable
, namely its memory requests to the same memory location can be combined into only one memory request, which consequently makes NB-FEB scalable (
scalability
). Since NB-FEB is a variant of the original full/empty bit that always returns a value instead of waiting for a conditional flag, it is as feasible as the original full/empty bit, which has been implemented in many computer systems (
feasibility
). We construct, on top of NB-FEB, a non-blocking software transactional memory system called NBFEB-STM, which can be used as an abstraction to handle concurrent threads
easily
. NBFEB-STM is space efficient: the space complexity of each object updated by
N
concurrent threads/transactions is
${\it \Theta}(N)$
, which is optimal.