2003 | OriginalPaper | Book chapter

Busy-Wait Barrier Synchronization Using Distributed Counters with Local Sensor

Written by: Guansong Zhang, Francisco Martínez, Arie Tal, Bob Blainey

Published in: OpenMP Shared Memory Parallel Programming

Publisher: Springer Berlin Heidelberg


Barrier synchronization is an important and performance-critical primitive in many parallel programming models, including the popular OpenMP model. In this paper, we compare the performance of several software implementations of barrier synchronization and introduce a new implementation, distributed counters with local sensor, which considerably reduces overhead on POWER3 and POWER4 SMP systems. Through experiments with the EPCC OpenMP benchmark, we demonstrate a 79% reduction in overhead on a 32-way POWER4 system and an 87% reduction in overhead on a 16-way POWER3 system compared with a fetch-and-add implementation. Since these improvements are primarily attributed to reduced L2 and L3 cache misses, we expect the relative advantage of our implementation to grow with the number of processors in an SMP and as memory latencies lengthen relative to cache latencies.
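The abstract names the two barrier styles but does not give their algorithms. As a rough sketch only, the Python below models a fetch-and-add barrier (every thread updates one shared counter and spins on one shared flag) next to one plausible reading of "distributed counters with local sensor" (each thread writes only its own counter, one designated thread scans the counters, and each waiter spins only on its own flag). The class names, the helper `run_phases`, the choice of thread 0 as the scanner, and the ever-increasing episode numbers are assumptions made for illustration and are not taken from the paper. Python cannot express cache-line padding, so the per-thread list slots merely stand in for per-thread cache lines, and a lock stands in for the hardware fetch-and-add.

```python
import threading
import time


class FetchAndAddBarrier:
    """Baseline sketch: one shared counter and one shared release flag."""

    def __init__(self, n):
        self.n = n
        self.count = 0
        self.episode = 0              # shared flag that every waiter spins on
        self.lock = threading.Lock()  # stands in for an atomic fetch-and-add

    def wait(self, tid):
        target = self.episode + 1
        with self.lock:               # every thread contends on the same counter
            self.count += 1
            last = self.count == self.n
        if last:
            self.count = 0
            self.episode = target     # release all waiters at once
        else:
            while self.episode < target:
                time.sleep(0)         # yield so other threads can make progress


class DistributedCounterBarrier:
    """Hypothetical sketch: per-thread counters plus per-thread sensor flags."""

    def __init__(self, n):
        self.n = n
        self.arrived = [0] * n  # slot i is written only by thread i
        self.sensor = [0] * n   # slot i is spun on only by thread i

    def wait(self, tid):
        e = self.arrived[tid] + 1
        self.arrived[tid] = e   # announce arrival in this thread's own slot
        if tid == 0:
            # Assumed role: thread 0 scans the counters, then sets each sensor.
            for i in range(self.n):
                while self.arrived[i] < e:
                    time.sleep(0)
            for i in range(self.n):
                self.sensor[i] = e
        else:
            while self.sensor[tid] < e:
                time.sleep(0)


def run_phases(barrier, n, phases):
    """Return True if, after each barrier, every thread sees all phase writes."""
    slots = [0] * n
    ok = [True] * n

    def worker(tid):
        for p in range(1, phases + 1):
            slots[tid] = p
            barrier.wait(tid)
            if any(s != p for s in slots):
                ok[tid] = False
            barrier.wait(tid)   # keep fast threads from starting phase p + 1 early

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return all(ok)
```

For example, `run_phases(DistributedCounterBarrier(4), 4, 10)` drives four threads through ten phases and reports whether each thread saw every write from the current phase after crossing the barrier. In the second class no memory location is written by more than one waiting thread, which is the property one would expect to matter for the cache-miss reduction the abstract reports.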

Metadata
Title
Busy-Wait Barrier Synchronization Using Distributed Counters with Local Sensor
Authors
Guansong Zhang
Francisco Martínez
Arie Tal
Bob Blainey
Copyright year
2003
Publisher
Springer Berlin Heidelberg
DOI
https://doi.org/10.1007/3-540-45009-2_7