About this Book

The advent of multi-core architectures and cloud computing has brought parallel programming into the mainstream of software development. Unfortunately, writing scalable parallel programs using traditional lock-based synchronization primitives is well known to be a hard, time-consuming and error-prone task, mastered by only a minority of specialized programmers. Building on the familiar abstraction of atomic transactions, Transactional Memory (TM) promises to free programmers from the complexity of conventional synchronization schemes, simplifying the development and verification of concurrent programs, enhancing code reliability, and boosting productivity. Over the last decade TM has been the subject of intense research on a broad range of aspects, including hardware and operating system support, language integration, as well as algorithms and theoretical foundations. On the industrial side, the major players of the software and hardware markets have been at the forefront of the research and development of prototype products providing support for TM systems. This has recently led to the introduction of hardware TM implementations in mainstream commercial microprocessors and to the integration of TM support into the world's leading open source compiler.

In such a vast interdisciplinary domain, the Euro-TM COST Action (IC1001) has served as a catalyst and a bridge for the various research communities looking at disparate, yet subtly interconnected, aspects of TM. This book emerged from the idea of having Euro-TM experts compile recent results in the TM area in a single and consistent volume. Contributions have been carefully selected and revised to provide broad coverage of several fundamental issues associated with the design and implementation of TM systems, including their theoretical underpinnings and algorithmic foundations, programming language integration and verification tools, hardware support, distributed TM systems, self-tuning mechanisms, as well as lessons learnt from building complex TM-based applications.

Table of Contents

Frontmatter

Theoretical Foundations

Frontmatter

Consistency for Transactional Memory Computing

Abstract
This chapter provides formal definitions for a comprehensive collection of consistency conditions for transactional memory (TM) computing. We express all conditions in a uniform way using a formal framework that we present. For each condition, we provide two versions: one that allows a transaction T to read the value of a data item written by another transaction T′ that may be live and not yet commit-pending, provided that T′ will eventually commit, and one that allows transactions to read values written only by transactions that have either committed before T starts or are commit-pending. Deriving the first version of a consistency condition was not an easy task, but it has the benefit of being weaker than the second, and so it admits a wider universe of algorithms that there is no reason to exclude from being considered correct. The formalism for the presented consistency conditions is not based on unrealistic assumptions, such as that transactional operations execute atomically or that write operations write distinct values for data items. Making such assumptions significantly facilitates the task of formally expressing the consistency conditions, but results in formal presentations that are unrealistic, i.e. that cannot be used to characterize the correctness of most of the executions produced by any reasonable TM algorithm.
Dmytro Dziuma, Panagiota Fatourou, Eleni Kanellou

Liveness in Transactional Memory

Abstract
In this chapter we give a formal overview of liveness properties of transactional memory (TM) systems. Unlike safety properties, which require some 'bad' events not to occur, liveness properties require some 'good' events to eventually occur. Usually, liveness properties of shared memory systems require some operations to eventually return a response (terminate). In the context of TM systems, however, operation termination is not enough to ensure meaningful progress; it is necessary to require that some transactions eventually commit. In this chapter we give precise definitions of liveness properties and of what it means for a TM system to satisfy a liveness property. Using the defined formal framework, we give some impossibility results. We show that, in a fault-prone system, it is impossible to guarantee both local progress, the strongest TM liveness property, which requires every correct transaction to eventually commit, and common TM safety properties such as strict serializability or opacity.
Victor Bushkov, Rachid Guerraoui

Safety and Deferred Update in Transactional Memory

Abstract
Transactional memory allows the user to declare sequences of instructions as speculative transactions that can either commit or abort, providing all-or-nothing semantics. If a transaction commits, it should appear to execute sequentially, so that the committed transactions constitute a correct sequential execution. If a transaction aborts, none of its instructions should affect other transactions. These semantics allow the programmer to incorporate sequential code within transactions and let the transactional memory take care of conflicts between concurrent transactions. In this sense, it is important that the memory is safe, i.e., that every transaction has a consistent view even if it aborts later. Otherwise, inconsistencies not predicted by the sequential program may cause a fatal irrecoverable error or an infinite loop. Furthermore, in a general setting, where a transaction may be explicitly aborted by the user or by an external contention manager, a transaction should not be allowed to read from a transaction that has not yet committed; this is often called deferred-update semantics. This chapter overviews the scope of consistency criteria proposed so far to capture deferred-update semantics, and shows that, under reasonable conditions, the semantics induces a safety property.
Hagit Attiya, Sandeep Hans, Petr Kuznetsov, Srivatsan Ravi
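
The deferred-update rule described above can be made concrete with a small sketch (ours, not the chapter's; all class and method names are illustrative): writes are buffered privately and published only at commit, so an aborted transaction leaves no trace visible to others.

    import java.util.HashMap;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Deferred-update semantics in miniature (all names are ours):
    // writes are buffered privately and become visible to other
    // transactions only at commit, so an aborted transaction leaves
    // no trace.
    final class DeferredUpdateTx {
        // Committed shared state, keyed by an abstract "address".
        static final Map<String, Object> committed = new ConcurrentHashMap<>();

        private final Map<String, Object> writeBuffer = new HashMap<>();

        Object read(String addr) {
            // A transaction sees its own tentative writes...
            if (writeBuffer.containsKey(addr)) {
                return writeBuffer.get(addr);
            }
            // ...but never the uncommitted writes of other transactions.
            return committed.get(addr);
        }

        void write(String addr, Object value) {
            writeBuffer.put(addr, value);       // deferred: not yet visible
        }

        void commit() {
            // A real TM validates the read set and publishes atomically;
            // this sketch only shows when writes become visible.
            synchronized (DeferredUpdateTx.class) {
                committed.putAll(writeBuffer);
            }
        }

        void abort() {
            writeBuffer.clear();                // all-or-nothing: nothing leaks
        }
    }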

Disjoint-Access Parallelism in Software Transactional Memory

Abstract
Disjoint-access parallelism captures the requirement that unrelated transactions progress independently, without interference, even if they occur at the same time. That is, an implementation should not cause two transactions that are unrelated at the high level, i.e., that access disjoint sets of data items, to simultaneously access the same low-level shared memory locations. This chapter formalizes this notion and discusses whether and when STM can achieve disjoint-access parallelism, by presenting impossibility results and discussing some of the disjoint-access parallel STM implementations. For example, no dynamic STM can be disjoint-access parallel if it ensures wait-freedom for read-only transactions and a weak liveness property, known as minimal progress, for update transactions. In fact, even if transactions are static, STM implementations cannot be disjoint-access parallel when read-only transactions must be wait-free and invisible. These impossibility results hold even when only snapshot isolation is required of the STM, rather than stronger conditions like opacity or strict serializability. The second of these impossibility results holds for serializable STM as well.
Hagit Attiya, Panagiota Fatourou
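
As a concrete illustration of how an implementation can violate this requirement, consider the common global-version-clock technique: two transactions accessing disjoint data items still read and update the same low-level counter. A minimal sketch, with hypothetical names:

    import java.util.concurrent.atomic.AtomicLong;

    // A single global version clock, used by many STM designs, is a
    // classic source of disjoint-access-parallelism violations: two
    // transactions touching disjoint data items still both access
    // GLOBAL_CLOCK, the same low-level memory location. (Names are
    // illustrative.)
    final class GlobalClockStm {
        static final AtomicLong GLOBAL_CLOCK = new AtomicLong();

        static long beginTx() {
            return GLOBAL_CLOCK.get();             // shared read on every begin
        }

        static long commitTx() {
            return GLOBAL_CLOCK.incrementAndGet(); // shared write on every commit
        }
    }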

Algorithms

Frontmatter

Algorithmic Techniques in STM Design

Abstract
The Transactional Memory paradigm has gained a lot of momentum in recent years, as evidenced by the plethora of software transactional memory (STM) algorithms that can be found in the literature. Although they share a common goal, offering the transaction abstraction to the programmer, the implementations they provide for this abstraction vary greatly, as different algorithms aim at exhibiting different additional properties, such as specific liveness guarantees, good performance, or both. In this chapter, we identify the basic characteristics of STM algorithms and the mechanisms from which they are built. To outline how these design decisions are used in practice, we present some representative STM algorithms in detail. We also briefly discuss many other STM algorithms found in the literature.
Panagiota Fatourou, Mykhailo Iaremko, Eleni Kanellou, Eleftherios Kosmas
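
One basic mechanism that recurs across many such algorithms is the versioned lock, which lets a reader validate that a location stayed unlocked and unmodified between two reads of its version. The following is a hedged sketch in Java, not an excerpt from any specific algorithm in the chapter:

    import java.util.concurrent.atomic.AtomicLong;

    // A versioned lock, a building block recurring in many STM designs
    // (e.g. TL2-style algorithms); this is a sketch, not an excerpt from
    // any algorithm in the chapter. Even values mean "unlocked", odd
    // values mean "locked by a writer"; the version grows on each update.
    final class VersionedLock {
        private final AtomicLong state = new AtomicLong(0);

        boolean tryLock() {
            long v = state.get();
            return (v & 1) == 0 && state.compareAndSet(v, v + 1);
        }

        void unlock(long newEvenVersion) {
            state.set(newEvenVersion);   // publish the new (even) version
        }

        long version() {
            return state.get();
        }
    }

    // A reader validates that a location stayed unlocked and unmodified
    // while being read:
    //   long v1 = lock.version();
    //   Object value = slot;                       // read the data item
    //   long v2 = lock.version();
    //   boolean consistent = (v1 == v2) && ((v1 & 1) == 0);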

Conflict Detection in Hardware Transactional Memory

Abstract
This chapter is dedicated to the conflict detection mechanism in the context of hardware transactional memory (HTM) systems. An effective mechanism is needed to detect conflicts amongst transactions, thus ensuring atomicity while allowing concurrency. Together with version management and conflict resolution, the conflict detection mechanism is one of the main design choices in HTM systems.
In this chapter, the two most common ways of detecting conflicts are described: eager and lazy. Then, we discuss the main HTM approaches to conflict detection, from the very first system proposed by Herlihy and Moss in 1993 to the commercial systems delivered by Intel and IBM, amongst others. Finally, a survey of conflict detection virtualization, i.e. support for unbounded transactions, is presented, with particular emphasis on signatures.
Ricardo Quislant, Eladio Gutierrez, Emilio L. Zapata, Oscar Plata
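
The signatures mentioned above are typically Bloom-filter-like bit vectors built in hardware; the Java model below (our illustration, with made-up hash functions) shows why signature-based conflict detection is conservative: intersecting two signatures may report a conflict that does not exist, but it never misses a real one.

    import java.util.BitSet;

    // Java model of an address signature (our illustration; real HTMs
    // build this in hardware, and the hash functions here are made up).
    // Signatures summarize read/write sets conservatively. Under eager
    // detection the test runs on each coherence event; under lazy
    // detection, at commit time.
    final class Signature {
        private static final int BITS = 1024;
        private final BitSet bits = new BitSet(BITS);

        void add(long address) {
            bits.set(hash1(address));
            bits.set(hash2(address));
        }

        boolean mayConflict(Signature other) {
            BitSet tmp = (BitSet) bits.clone();
            tmp.and(other.bits);
            return !tmp.isEmpty();       // any shared bit is a potential conflict
        }

        private static int hash1(long a) {
            return (int) Math.floorMod(a, (long) BITS);
        }

        private static int hash2(long a) {
            return (int) Math.floorMod(a * 0x9E3779B97F4A7C15L, (long) BITS);
        }
    }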

Multi-versioning in Transactional Memory

Abstract
Reducing the number of aborts is one of the biggest challenges of most transactional systems: existing TMs may abort many transactions that could, in fact, commit without violating correctness. Historically, the commonly used method for reducing the abort rate has been to maintain multiple object versions. Multiversion concurrency control is a classical approach for providing concurrent access to data in database management systems. Its idea is to let a reading transaction obtain a consistent snapshot corresponding to an arbitrary point in time (e.g., defined at the beginning of the transaction); concurrent updates are isolated through maintaining old versions rather than via scheduling decisions.
Multi-versioning has been adopted by transactional memory algorithms as well. In this chapter we overview the multi-versioning approach by studying the inherent properties of STMs that use multiple versions to guarantee successful commits of all read-only transactions.
We first consider the challenges of garbage collecting old object versions, and show that no STM can be optimal in the number of previous versions kept, while the naïve approach of keeping a constant number of the most recent versions per object may lead to exponential memory growth.
We then show the potential performance challenges of multi-versioned STMs, including disjoint-access parallelism and the visibility of read-only transactions.
Finally, we demonstrate the advantages of implementing multi-versioned STMs in managed memory environments by presenting the Selective Multi-Versioning (SMV) algorithm. SMV relies on automatic garbage collection, and thus deals efficiently with old versions while still allowing invisible read-only transactions.
Idit Keidar, Dmitri Perelman
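
A minimal model of a multi-versioned object (our sketch, not the SMV algorithm itself) shows why read-only transactions always commit, and where the garbage-collection difficulty comes from:

    import java.util.Map;
    import java.util.concurrent.ConcurrentNavigableMap;
    import java.util.concurrent.ConcurrentSkipListMap;

    // Minimal multi-versioned object (illustrative; not SMV itself).
    // Each committed write installs a version stamped with its commit
    // time; a read-only transaction with snapshot time s reads the
    // latest version stamped <= s, so it never conflicts and always
    // commits.
    final class MultiVersionedObject<T> {
        private final ConcurrentNavigableMap<Long, T> versions = new ConcurrentSkipListMap<>();

        void install(long commitTime, T value) {
            versions.put(commitTime, value);
        }

        T readAtSnapshot(long snapshotTime) {
            Map.Entry<Long, T> e = versions.floorEntry(snapshotTime);
            return e == null ? null : e.getValue();
        }

        // The garbage-collection challenge in miniature: we must keep the
        // newest version at or below the oldest active snapshot, and may
        // only discard versions strictly older than that one.
        void prune(long oldestActiveSnapshot) {
            Long keep = versions.floorKey(oldestActiveSnapshot);
            if (keep != null) {
                versions.headMap(keep).clear();  // headMap excludes 'keep'
            }
        }
    }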

Framework Support for the Efficient Implementation of Multi-version Algorithms

Abstract
Software Transactional Memory algorithms associate metadata with the memory locations accessed during a transaction's lifetime. This metadata may be stored in an external table and accessed by way of a function that maps the address of each memory location to the table entry that keeps its metadata (the out-place or external scheme); or it may be stored adjacent to the associated memory cell by wrapping the two together (the in-place scheme). In transactional memory multi-version algorithms, several versions of the same memory location may exist. The efficient implementation of these algorithms requires a one-to-one correspondence between each memory location and its list of past versions, which is stored as metadata. In this chapter we address the efficient implementation of multi-version algorithms in Java by proposing and evaluating a novel in-place metadata scheme for the Deuce framework. The new scheme is based on Java bytecode transformation techniques, and its use requires no changes to the application code. Experimentation indicates that multi-versioning STM algorithms implemented using the new in-place scheme are on average 6× faster than when implemented with the out-place scheme.
Ricardo J. Dias, Tiago M. Vale, João M. Lourenço
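
The two schemes can be contrasted with a short sketch (class and field names are ours, not Deuce's): the out-place scheme pays a table lookup per access, while the in-place scheme co-locates the metadata with the data it guards.

    import java.util.concurrent.ConcurrentHashMap;

    // Contrast between the two metadata schemes (class and field names
    // are ours, not Deuce's).

    // Out-place (external) scheme: a global table maps each location's
    // identity to its metadata; every transactional access pays a lookup.
    final class OutPlaceMetadata {
        static final ConcurrentHashMap<String, Object> TABLE = new ConcurrentHashMap<>();

        static Object metadataFor(String fieldId) {
            return TABLE.computeIfAbsent(fieldId, k -> new Object());
        }
    }

    // In-place scheme: the bytecode transformation injects a metadata
    // field next to each transactional field, giving the one-to-one
    // location-to-metadata correspondence that per-location version
    // lists require, with no lookup on the access path.
    final class Account {
        long balance;
        Object balanceMetadata = new Object(); // would be injected by the framework
    }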

Nested Parallelism in Transactional Memory

Abstract
The parallel power of computers is set to increase for the foreseeable future, which calls for parallel programming tools and models that can take advantage of the growing number of hardware threads. For some applications, reaching such high levels of parallelism requires going beyond the typical monolithic parallel model: it calls for exposing the fine-grained parallel tasks that may exist in a program, possibly nested within memory transactions.
While most current mainstream transactional memory (TM) systems do not yet support nested parallel transactions, recent research has proposed approaches that extend TM with support for fine-grained parallel transactional nesting. These novel solutions promise to unleash the parallel power of TM to unprecedented levels. This chapter addresses parallel nesting models in transactional memory from two distinct perspectives.
We start from the programmer's perspective, studying the spectrum of parallel nesting models available to programmers, and giving a practical tutorial on the utility of each model, as well as on the languages, tools and frameworks that help programmers build nested-parallel programs. We then turn to the perspective of a TM runtime designer, focusing on state-of-the-art algorithms that support nested parallelism.
Ricardo Filipe, João Barreto

Contention Management and Scheduling

Frontmatter

Scheduling-Based Contention Management Techniques for Transactional Memory

Abstract
Contention management refers to the mechanisms used by transactional memory (TM) implementations “to ensure forward progress – to avoid livelock and starvation, and to promote throughput and fairness” [1]. Without effective contention management mechanisms, TM implementations are susceptible to performance degradation caused by numerous transaction collisions.
Early work on contention management focused on the narrower problem of conflict resolution. When two transactions collide, one transaction (the winner) is allowed to proceed, while the other (the loser) must wait and/or be aborted. Conflict resolution policies decide which transaction should win, which should lose, and for how long the losing transaction should be delayed. However, it has been shown that conflict resolution alone is insufficient for guaranteeing reasonable performance for high-contention TM workloads.
The key idea underlying transaction schedulers, introduced a few years ago, is that the execution of conflicting transactions must be serialized in the face of high contention and, more generally, that the level of parallelism between transactional threads should be controlled by the contention manager and dynamically adjusted. Transaction scheduling makes it possible not only to resolve conflicts after they occur, but also to proactively reduce their probability, thus improving performance. This chapter provides a survey of the key approaches and techniques used by transaction schedulers.
Danny Hendler, Adi Suissa-Peleg
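
A minimal sketch of the serialization idea (ours; the schedulers surveyed in the chapter are considerably more sophisticated): run optimistically, and after repeated aborts force the transaction under a serialization lock so it stops colliding with its adversaries.

    import java.util.concurrent.Callable;
    import java.util.concurrent.locks.ReentrantLock;

    // The serialization idea behind transaction schedulers, reduced to a
    // sketch (ours; real schedulers are more sophisticated): run
    // optimistically, and after repeated aborts force the transaction
    // under a serialization lock.
    final class SerializingScheduler {
        private static final int RETRY_LIMIT = 3;
        private final ReentrantLock serializer = new ReentrantLock();

        <T> T run(Callable<T> transaction) throws Exception {
            for (int attempt = 0; attempt < RETRY_LIMIT; attempt++) {
                try {
                    return transaction.call();   // optimistic attempt
                } catch (TxAbortedException e) {
                    // conflict detected: back off and retry
                }
            }
            serializer.lock();                   // high contention: serialize
            try {
                return transaction.call();       // may still abort in a real TM
            } finally {
                serializer.unlock();
            }
        }

        static final class TxAbortedException extends Exception {}
    }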

Proactive Contention Avoidance

Abstract
In current TM systems, both STM and HTM, if two transactions access the same address and at least one of them writes to it, one of the two is aborted. In many cases, however, the aborted transaction was in a valid state, and its work is lost for no good reason. In the first part of the chapter we discuss ways to lower such contention. We focus on methods that never lock out a transaction; we thus exclude, for example, approaches that serialize writing transactions in order to support irrevocable transactions. We are interested only in mechanisms that avoid conflicts, not in contention managers, which resolve them. This part is about using the TM support that already exists, both in hardware and in the compiler. The second part of the chapter is about SemanticTM, an algorithm that eliminates the need for aborts while maintaining parallelism. SemanticTM allows the application to maintain a consistent state without locks and aborts, but is currently restricted to specific scenarios.
Hillel Avni, Shlomi Dolev, Eleftherios Kosmas

Transactional Memory and Reliability

Frontmatter

Safe Exception Handling with Transactional Memory

Abstract
Exception handling is notoriously difficult for programmers, whereas transactional memory has been instrumental in simplifying concurrent programming. In this chapter, we describe how transactional syntactic sugar simplifies exception handling problems when writing both sequential and concurrent applications. We survey exception handling solutions that prevent applications from reaching an inconsistent state in a sequential environment on the one hand, and extend these solutions to also prevent concurrent executions of multiple threads from reaching an inconsistent state on the other hand. The resulting technique greatly simplifies exception handling and is shown to be surprisingly efficient.
Pascal Felber, Christof Fetzer, Vincent Gramoli, Derin Harmanci, Martin Nowack
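
The core interplay between exceptions and transactions can be sketched as follows (a simplified illustration with a hypothetical Transaction API, not the chapter's actual mechanism): an exception escaping an atomic block triggers rollback before propagation, so handlers never observe partial updates.

    // Sketch of failure atomicity for exceptions (simplified; the
    // Transaction API below is hypothetical, not the chapter's).
    final class AtomicBlocks {
        static void atomically(Runnable body) {
            Transaction tx = Transaction.begin();
            try {
                body.run();
                tx.commit();      // normal exit: tentative updates are published
            } catch (RuntimeException e) {
                tx.abort();       // exceptional exit: all tentative updates undone
                throw e;          // the handler never sees a partial state
            }
        }

        // Minimal stand-in so the sketch is self-contained; a real TM
        // runtime supplies the actual implementation.
        static final class Transaction {
            static Transaction begin() { return new Transaction(); }
            void commit() {}
            void abort()  {}
        }
    }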

Transactional Memory for Reliability

Abstract
Technology trends are foreseen to increase the transient and permanent fault rates in future processors. Providing reliability, both for applications running on personal computers and for those running on mission-critical systems, is thus becoming an absolute necessity. A reliable system requires two key capabilities: 1) error detection and 2) error recovery mechanisms. Transactional Memory (TM) provides an ideal base for both. First, TM provides mechanisms to abort transactions in case of a conflict, discarding or undoing all tentative memory updates and restarting execution from the beginning of the transaction. A transaction's start can thus be viewed as a locally checkpointed stable state that can be used for error recovery. Second, transactional semantics allow error detection to be deferred until a transaction commits (or a value becomes externally visible), so that the cost of error detection can be reduced compared to traditional error detection schemes (in which error detection is conducted at every instruction [26]) while its efficiency can be increased.
In this chapter, we first explain hardware faults and the main aspects of reliability schemes, namely error detection and error recovery. Then, we explain the major requirements of reliability schemes and the similarities between these requirements and transactional memory basics. Finally, we present the current research landscape for reliability schemes using transactional memory.
Gulay Yalcin, Osman Unsal
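
One simple way to combine transactional recovery with error detection, sketched below under our own simplifying assumptions (a deterministic transaction body whose write set can be compared), is redundant execution: run the body twice, compare the resulting write sets, and commit only on a match; on a mismatch the transaction restarts from its begin point, which serves as the recovery checkpoint.

    import java.util.Map;

    // Sketch of redundant execution for error detection (our simplified
    // assumptions: the body is deterministic and returns its write set).
    final class RedundantTx {
        interface Body { Map<String, Object> run(); }

        static Map<String, Object> execute(Body body) {
            while (true) {
                Map<String, Object> first  = body.run();   // first execution
                Map<String, Object> second = body.run();   // redundant execution
                if (first.equals(second)) {
                    return first;   // write sets agree: safe to commit
                }
                // Mismatch: a transient fault corrupted one execution.
                // Recovery is cheap: discard both write sets and restart
                // from the transaction's begin point (the checkpoint).
                // A real scheme would escalate after repeated mismatches,
                // suspecting a permanent fault.
            }
        }
    }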

Verification Tools for Transactional Programs

Abstract
While transactional memory has been investigated intensively, its use as a programming primitive by application and system builders has only recently become widespread, especially with the availability of hardware support in mainstream commercial CPUs. One key benefit of using transactional memory when writing applications is the simplicity of not having to reason about synchronization at a low level. For this to be possible, verification tools that are aware of atomic blocks and their semantics are needed. While such tools are clearly needed for the adoption of transactional memory in real systems, research in this area is still preliminary. In this chapter, we provide highlights of our previous work on verification tools for transactional programs.
Adrian Cristal, Burcu Kulahcioglu Ozkan, Ernie Cohen, Gokcen Kestor, Ismail Kuru, Osman Unsal, Serdar Tasiran, Suha Orhun Mutluergil, Tayfun Elmas

Distributed Transactional Memory

Frontmatter

Introduction to Transactional Replication

Abstract
Transactional replication is a new enabling technology for service replication. Service replication means that a service runs on a group of processes (service replicas) that work together to execute requests issued by external clients. The characteristic feature of transactional replication is that client requests can be processed concurrently on a single replica as atomic transactions that may read or modify local state. Our goal is to provide an introduction to transactional replication algorithms. We begin by discussing state machine replication and then present several algorithms that provide full transactional semantics, such as deferred update replication and many variants thereof. Finally, we compare their properties and performance, and point out their strong and weak points.
Tadeusz Kobus, Maciej Kokociński, Paweł T. Wojciechowski
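
At the heart of deferred update replication lies a deterministic certification test that every replica runs on the same totally ordered stream of transaction descriptors. A minimal sketch (ours; the group-communication layer is abstracted away):

    import java.util.HashMap;
    import java.util.Map;
    import java.util.Set;

    // Sketch of the certification step used by deferred update
    // replication (illustrative). Descriptors are delivered to every
    // replica in the same total order, so certify() runs sequentially
    // and each replica reaches the same verdict.
    final class Certifier {
        private long committedVersion = 0;
        private final Map<String, Long> lastWriterVersion = new HashMap<>();

        // Commit iff no item read by the transaction was overwritten
        // after the snapshot the transaction executed against.
        boolean certify(long snapshotVersion, Set<String> readSet, Set<String> writeSet) {
            for (String item : readSet) {
                Long w = lastWriterVersion.get(item);
                if (w != null && w > snapshotVersion) {
                    return false;                        // stale read: abort everywhere
                }
            }
            long newVersion = ++committedVersion;
            for (String item : writeSet) {
                lastWriterVersion.put(item, newVersion); // apply the write set
            }
            return true;
        }
    }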

Transaction Execution Models in Partially Replicated Transactional Memory: The Case for Data-Flow and Control-Flow

Abstract
In this chapter we describe solutions for managing the concurrency of distributed transactional memory accesses in partially replicated deployments. A system is classified as partially replicated if, for each shared object, there is more than one node responsible for storing the object, resulting in multiple copies being available in the system. In contrast to full replication, where all objects are replicated on all nodes, partial replication allows storing huge amounts of data that, by nature, cannot fit on a single node, and improves scalability by (significantly) increasing the number of nodes serving transaction requests. Solutions that assume partially replicated deployments are categorized according to the mobility of shared objects. In the control-flow approach, shared objects are pinned to nodes for the entire system's lifetime, whereas in the data-flow approach, objects are allowed to change their residence node (also called the owner) whenever a transaction commits a new version of the object. Intuitively, under the data-flow model objects follow committing transactions, whereas under the control-flow model the transactions' flow is routed towards the objects. There are a number of key factors to evaluate before preferring one transaction execution model over the other. This chapter surveys all of them and provides solutions suited to different deployments, aiming to help designers understand which execution model better fits their requirements.
Roberto Palmieri, Sebastiano Peluso, Binoy Ravindran

Directory Protocols for Distributed Transactional Memory

Abstract
Distributed directory protocols for shared objects play an important role in providing access to higher-level abstractions like transactional memory. They offer primitives to retrieve data so that it can be read, or to move data so that it can be written. This chapter describes directory protocols for large-scale distributed systems and discusses the subtleties of incorporating them into a large-scale distributed transactional memory. We survey existing protocols, their advantages and drawbacks, and detail one protocol, Combine, which addresses these drawbacks.
Hagit Attiya, Vincent Gramoli, Alessia Milani

Applications and Self-tuning

Frontmatter

Tuning the Level of Concurrency in Software Transactional Memory: An Overview of Recent Analytical, Machine Learning and Mixed Approaches

Abstract
The synchronization transparency offered by Software Transactional Memory (STM) must not come at the expense of run-time efficiency, which demands that the STM designer include mechanisms oriented towards performance and other quality indexes. In particular, one core issue in STM is exploiting parallelism while avoiding thrashing caused by excessive transaction rollbacks, which are in turn caused by excessively high levels of contention on logical resources, namely concurrently accessed data. One way to address run-time efficiency is to dynamically determine the best-suited level of concurrency (number of threads) for running the application (or specific application phases) on top of the STM layer. If the concurrency level is too low, parallelism can be hampered; conversely, over-dimensioning it may give rise to the aforementioned thrashing caused by excessive data contention, an effect that also reduces energy efficiency. In this chapter we overview a set of recent techniques aimed at building "application-specific" performance models that can be exploited to dynamically tune the level of concurrency to the best-suited value. Although these techniques share some base concepts in modeling system performance versus the degree of concurrency, they rely on disparate methods, such as machine learning or analytical methods (or combinations of the two), and achieve different tradeoffs between the precision of the performance model and the latency of model instantiation. Implications of the different tradeoffs in real-life scenarios are also discussed.
Diego Rughetti, Pierangelo Di Sanzo, Alessandro Pellegrini, Bruno Ciciani, Francesco Quaglia
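
The simplest baseline that such model-based techniques improve upon is a feedback loop that hill-climbs on measured throughput. A sketch under assumed names (measure is a hypothetical callback that runs the workload with a given thread count for a sampling interval and reports committed transactions per second):

    import java.util.function.IntToDoubleFunction;

    // Hill-climbing baseline for concurrency tuning (illustrative; the
    // chapter's model-based techniques replace this trial-and-error
    // search with predictions from a performance model).
    final class ConcurrencyTuner {
        static int tune(int minThreads, int maxThreads, IntToDoubleFunction measure) {
            int level = minThreads, step = 1;
            double best = measure.applyAsDouble(level);
            while (true) {
                int next = Math.max(minThreads, Math.min(maxThreads, level + step));
                if (next == level) break;               // hit a bound
                double throughput = measure.applyAsDouble(next);
                if (throughput > best) {                // still climbing
                    best = throughput;
                    level = next;
                } else if (step > 0) {                  // passed the peak going up
                    step = -1;                          // probe the other direction
                } else {
                    break;                              // local optimum found
                }
            }
            return level;                               // best-suited concurrency level
        }
    }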

Self-tuning in Distributed Transactional Memory

Abstract
Many different mechanisms have been developed to implement Distributed Transactional Memory (DTM). Unfortunately, there is no "one-size-fits-all" design that offers the desired performance across all possible workloads and scales. In fact, the performance of these mechanisms is affected by a number of intertwined factors that make it hard, or even impossible, to statically configure a DTM platform for optimal performance. These observations have motivated the emergence of self-tuning schemes for automatically adapting the algorithms and parameters used by the main building blocks of DTM systems. This chapter surveys existing research in the area of autonomic DTM design, with a focus on approaches aimed at answering two fundamental questions: with how many resources (number of nodes, etc.) should a DTM platform be provisioned, and which protocols should be used to ensure data consistency.
Maria Couceiro, Diego Didona, Luís Rodrigues, Paolo Romano

Case Study: Using Transactions in Memcached

Abstract
To synthesize the topics in previous chapters of this book, we now turn to the question of how to use transactions in real-world code. We use a concrete example, transactionalization of the memcached application, as a vehicle for exploring the challenges and benefits that arise from using transactions instead of locks. Specific topics that receive attention in this chapter include irrevocability, contention management, language-level semantics and privatization, write-through and write-back algorithms, and condition synchronization.
Michael Spear, Wenjia Ruan, Yujie Liu, Trilok Vyas

Backmatter
