Invited Paper

Democratizing Transactional Programming

Abstract

The transaction abstraction is arguably one of the most appealing middleware paradigms. It lies typically between the programmer of a concurrent or distributed application on the one hand, and the operating system with the underlying network on the other hand. It encapsulates the complex internals of failure recovery and concurrency control, significantly simplifying thereby the life of a non-expert programmer.

Yet, some programmers are indeed experts and, for those, the transaction abstraction turns out to be inherently restrictive in its classic form. We argue for a genuine democratization of the paradigm, with different transactional semantics to be used by different programmers and composed within the same application.

Vincent Gramoli, Rachid Guerraoui

Social Networks

Scaling Microblogging Services with Divergent Traffic Demands

Abstract

Today’s microblogging services such as Twitter have long outgrown their initial designs as SMS-based social networks. Instead, a massive and steadily-growing user population of more than 100 million is using Twitter for everything from capturing the mood of the country to detecting earthquakes and Internet service failures. It is unsurprising that the traditional centralized client-server architecture has not scaled with user demands, leading to server overload and significant impairment of availability. In this paper, we argue that the divergence in usage models of microblogging services can be best addressed using complementary mechanisms, one that provides reliable messages between friends, and another that delivers events from popular celebrities and media outlets to their thousands or even millions of followers. We present Cuckoo, a new microblogging system that offloads processing and bandwidth costs away from a small centralized server base while ensuring reliable message delivery. We use a 20-day Twitter availability measurement to guide our design, and trace-driven emulation of 30,000 Twitter users to evaluate our Cuckoo prototype. Compared to the centralized approach, Cuckoo achieves 30-50% server bandwidth savings and 50-60% CPU load reduction, while guaranteeing reliable message delivery.

Tianyin Xu, Yang Chen, Lei Jiao, Ben Y. Zhao, Pan Hui, Xiaoming Fu

Contrail: Enabling Decentralized Social Networks on Smartphones

Abstract

Mobile devices are increasingly used for social networking applications, where data is shared between devices belonging to different users. Today, such applications are implemented as centralized services, forcing users to trust corporations with their personal data. While decentralized designs for such applications can provide privacy, they are difficult to achieve on current devices due to constraints on connectivity, energy and bandwidth. Contrail is a communication platform that allows decentralized social networks to overcome these challenges. In Contrail, a user installs content filters on her friends’ devices that express her interests; she subsequently receives new data generated by her friends that match the filters. Both data and filters are exchanged between devices via cloud-based relays in encrypted form, giving the cloud no visibility into either. In addition to providing privacy, Contrail enables applications that are very efficient in terms of energy and bandwidth.

Patrick Stuedi, Iqbal Mohomed, Mahesh Balakrishnan, Z. Morley Mao, Venugopalan Ramasubramanian, Doug Terry, Ted Wobber

Confidant: Protecting OSN Data without Locking It Up

Abstract

Online social networks (OSNs) are immensely popular, but participants are increasingly uneasy with centralized services’ handling of user data. Decentralized OSNs offer the potential to address users’ anxiety while also enhancing the features and scalability offered by existing, centralized services. In this paper, we present Confidant, a decentralized OSN designed to support a scalable application framework for OSN data without compromising users’ privacy. Confidant replicates a user’s data on servers controlled by her friends. Because data is stored on trusted servers, Confidant allows application code to run directly on these storage servers. To manage access-control policies under weakly-consistent replication, Confidant eliminates write conflicts through a lightweight cloud-based state manager and through a simple mechanism for updating the bindings between access policies and replicated data.

Dongtao Liu, Amre Shakimov, Ramón Cáceres, Alexander Varshavsky, Landon P. Cox

Storage and Performance Management

Live Deduplication Storage of Virtual Machine Images in an Open-Source Cloud

Abstract

Deduplication is an approach of avoiding storing data blocks with identical content, and has been shown to effectively reduce the disk space for storing multi-gigabyte virtual machine (VM) images. However, it remains challenging to deploy deduplication in a real system, such as a cloud platform, where VM images are regularly inserted and retrieved. We propose LiveDFS, a live deduplication file system that enables deduplication storage of VM images in an open-source cloud that is deployed under low-cost commodity hardware settings with limited memory footprints. LiveDFS has several distinct features, including spatial locality, prefetching of metadata, and journaling. LiveDFS is POSIX-compliant and is implemented as a Linux kernel-space file system. We deploy our LiveDFS prototype as a storage layer in a cloud platform based on OpenStack, and conduct extensive experiments. Compared to an ordinary file system without deduplication, we show that LiveDFS can save at least 40% of space for storing VM images, while achieving reasonable performance in importing and retrieving VM images. Our work justifies the feasibility of deploying LiveDFS in an open-source cloud.

Chun-Ho Ng, Mingcao Ma, Tsz-Yeung Wong, Patrick P. C. Lee, John C. S. Lui
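The block-level deduplication idea that the LiveDFS abstract builds on can be illustrated with a minimal in-memory sketch. This is not LiveDFS itself: the `DedupStore` class, the 4 KiB block size, and the use of SHA-256 fingerprints are assumptions made for illustration, and the sketch omits the spatial locality, metadata prefetching, and journaling features the abstract mentions.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, chosen for illustration only


class DedupStore:
    """Hypothetical in-memory store that keeps each distinct block once."""

    def __init__(self, block_size=BLOCK_SIZE):
        self.block_size = block_size
        self.blocks = {}  # fingerprint -> block content
        self.files = {}   # file name -> ordered list of fingerprints

    def write(self, name, data):
        fingerprints = []
        for offset in range(0, len(data), self.block_size):
            block = data[offset:offset + self.block_size]
            fingerprint = hashlib.sha256(block).hexdigest()
            # Store the block only if this content has not been seen before.
            self.blocks.setdefault(fingerprint, block)
            fingerprints.append(fingerprint)
        self.files[name] = fingerprints

    def read(self, name):
        # Reassemble the file from its fingerprint list.
        return b"".join(self.blocks[fp] for fp in self.files[name])

    def stored_bytes(self):
        return sum(len(block) for block in self.blocks.values())


# Two "images" that share their first block.
image_a = b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE
image_b = b"A" * BLOCK_SIZE + b"C" * BLOCK_SIZE

store = DedupStore()
store.write("vm-a.img", image_a)
store.write("vm-b.img", image_b)
```

In this toy setting the two images together occupy four blocks of logical space, but the store holds only three, because the shared block is kept once.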

Scalable Load Balancing in Cluster Storage Systems

Abstract

Enterprise and cloud data centers are comprised of tens of thousands of servers providing petabytes of storage to a large number of users and applications. At such a scale, these storage systems face two key challenges: (a) hot-spots due to the dynamic popularity of stored objects and (b) high reconfiguration costs of data migration due to bandwidth oversubscription in the data center network. Existing storage solutions, however, are unsuitable to address these challenges because of the large number of servers and data objects. This paper describes the design, implementation, and evaluation of Ursa, which scales to a large number of storage nodes and objects and aims to minimize latency and bandwidth costs during system reconfiguration. Toward this goal, Ursa formulates an optimization problem that selects a subset of objects from hot-spot servers and performs topology-aware migration to minimize reconfiguration costs. As exact optimization is computationally expensive, we devise scalable approximation techniques for node selection and efficient divide-and-conquer computation. Our evaluation shows Ursa achieves cost-effective load balancing while scaling to large systems and is time-responsive in computing placement decisions, e.g., about two minutes for 10K nodes and 10M objects.

Gae-won You, Seung-won Hwang, Navendu Jain

Predico: A System for What-if Analysis in Complex Data Center Applications

Abstract

Modern data center applications are complex distributed systems with tens or hundreds of interacting software components. An important management task in data centers is to predict the impact of a certain workload or reconfiguration change on the performance of the application. Such predictions require the design of “what-if” models of the application that take as input hypothetical changes in the application’s workload or environment and estimate its impact on performance.

We present Predico, a workload-based what-if analysis system that uses commonly available monitoring information in large scale systems to enable the administrators to ask a variety of workload-based “what-if” queries about the system. Predico uses a network of queues to analytically model the behavior of large distributed applications. It automatically generates node-level queueing models and then uses model composition to build system-wide models. Predico employs a simple what-if query language and an intelligent query execution algorithm that employs on-the-fly model construction and a change propagation algorithm to efficiently answer queries on large scale systems. We have built a prototype of Predico and have used traces from two large production applications from a financial institution as well as real-world synthetic applications to evaluate its what-if modeling framework. Our experimental evaluation validates the accuracy of Predico’s node-level resource usage, latency and workload-models and then shows how Predico enables what-if analysis in two different applications.

Rahul Singh, Prashant Shenoy, Maitreya Natu, Vaishali Sadaphal, Harrick Vin
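To make the idea of a queueing-based what-if query concrete, here is a minimal sketch. The Predico abstract does not say which queueing models it uses, so the M/M/1 response-time formula 1/(μ − λ), the tandem composition by summation, and the function names `mm1_latency` and `pipeline_latency` are all assumptions made for illustration.

```python
def mm1_latency(arrival_rate, service_rate):
    """Mean response time of an M/M/1 queue: 1 / (mu - lambda)."""
    if arrival_rate >= service_rate:
        raise ValueError("queue is unstable: arrival rate >= service rate")
    return 1.0 / (service_rate - arrival_rate)


def pipeline_latency(arrival_rate, service_rates):
    """Compose node-level models: requests visit every node in sequence."""
    return sum(mm1_latency(arrival_rate, mu) for mu in service_rates)


# Hypothetical two-tier application, rates in requests per second.
service_rates = [100.0, 150.0]
baseline = pipeline_latency(50.0, service_rates)

# What-if query: how does latency change if the workload grows by 20%?
what_if = pipeline_latency(50.0 * 1.2, service_rates)
```

The what-if answer comes from re-evaluating the composed model under the hypothetical workload, without running the application at that load.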

Green Computing and Resource Management

GreenWare: Greening Cloud-Scale Data Centers to Maximize the Use of Renewable Energy

Abstract

To reduce the negative environmental implications (e.g., CO₂ emission and global warming) caused by the rapidly increasing energy consumption, many Internet service operators have started taking various initiatives to operate their cloud-scale data centers with renewable energy. Unfortunately, due to the intermittent nature of renewable energy sources such as wind turbines and solar panels, currently renewable energy is often more expensive than brown energy that is produced with conventional fossil-based fuel. As a result, utilizing renewable energy may impose a considerable pressure on the sometimes stringent operation budgets of Internet service operators. Therefore, two key questions faced by many cloud-service operators are 1) how to dynamically distribute service requests among data centers in different geographical locations, based on the local weather conditions, to maximize the use of renewable energy, and 2) how to do that within their allowed operation budgets.

In this paper, we propose GreenWare, a novel middleware system that conducts dynamic request dispatching to maximize the percentage of renewable energy used to power a network of distributed data centers, subject to the desired cost budget of the Internet service operator. Our solution first explicitly models the intermittent generation of renewable energy, e.g., wind power and solar power, with respect to varying weather conditions in the geographical location of each data center. We then formulate the core objective of GreenWare as a constrained optimization problem and propose an efficient request dispatching algorithm based on linear-fractional programming (LFP). We evaluate GreenWare with real-world weather, electricity price, and workload traces. Our experimental results show that GreenWare can significantly increase the use of renewable energy in cloud-scale data centers without violating the desired cost budget, despite the intermittent supplies of renewable energy in different locations and time-varying electricity prices and workloads.

Yanwei Zhang, Yefu Wang, Xiaorui Wang

Resource Provisioning Framework for MapReduce Jobs with Performance Goals

Abstract

Many companies are increasingly using MapReduce for efficient large scale data processing such as personalized advertising, spam detection, and different data mining tasks. Cloud computing offers an attractive option for businesses to rent a suitably sized Hadoop cluster, consume resources as a service, and pay only for resources that were utilized. One of the open questions in such environments is the amount of resources that a user should lease from the service provider. Often, a user targets specific performance goals and the application needs to complete data processing by a certain time deadline. However, currently, the task of estimating required resources to meet application performance goals is solely the users’ responsibility. In this work, we introduce a novel framework and technique to address this problem and to offer a new resource sizing and provisioning service in MapReduce environments. For a MapReduce job that needs to be completed within a certain time, the job profile is built from the job’s past executions or by executing the application on a smaller data set using an automated profiling tool. Then, by applying scaling rules combined with a fast and efficient capacity planning model, we generate a set of resource provisioning options. Moreover, we design a model for estimating the impact of node failures on a job’s completion time to evaluate worst case scenarios. We validate the accuracy of our models using a set of realistic applications. The predicted completion times of generated resource provisioning options are within 10% of the measured times in our 66-node Hadoop cluster.

Abhishek Verma, Ludmila Cherkasova, Roy H. Campbell

Resource-Aware Adaptive Scheduling for MapReduce Clusters

Abstract

We present a resource-aware scheduling technique for MapReduce multi-job workloads that aims at improving resource utilization across machines while observing completion time goals. Existing MapReduce schedulers define a static number of slots to represent the capacity of a cluster, creating a fixed number of execution slots per machine. This abstraction works for homogeneous workloads, but fails to capture the different resource requirements of individual jobs in multi-user environments. Our technique leverages job profiling information to dynamically adjust the number of slots on each machine, as well as workload placement across them, to maximize the resource utilization of the cluster. In addition, our technique is guided by user-provided completion time goals for each job. Source code of our prototype is available at [1].

Jordà Polo, Claris Castillo, David Carrera, Yolanda Becerra, Ian Whalley, Malgorzata Steinder, Jordi Torres, Eduard Ayguadé

Notification and Streaming

A Content-Based Publish/Subscribe Matching Algorithm for 2D Spatial Objects

Abstract

An important concern in the design of a publish/subscribe system is its expressiveness, which is the ability to represent various types of information in publications and to precisely select information of interest through subscriptions. We present an enhancement to existing content-based publish/subscribe systems with support for a 2D spatial data type and eight associated relational operators, including those to reveal overlap, containment, touching, and disjointedness between regions of irregular shape. We describe an algorithm for evaluating spatial relations that is founded on a new dynamic discretization method and region-intersection model. In order to make the data type practical for large-scale applications, we provide an indexing structure for accessing spatial constraints and develop a simplification method for eliminating redundant constraints. Finally, we present the results of experiments evaluating the effectiveness and scalability of our approach.

Athanasios Konstantinidis, Antonio Carzaniga, Alexander L. Wolf

FAIDECS: Fair Decentralized Event Correlation

Abstract

Many distributed applications rely on event correlation. Such applications, when not built as ad-hoc solutions, typically rely on centralized correlators or on broker overlay networks. Centralized correlators constitute performance bottlenecks and single points of failure; straightforwardly duplicating them can hamper performance and cause processes interested in the same correlations to reach different outcomes. The latter problem can manifest also if broker overlays provide redundant paths to tolerate broker failures as events do not necessarily reach all processes via the same path and thus in the same order.

This paper describes FAIDECS, a generic middleware system for fair decentralized correlation of events multicast among processes: processes with identical interests reach identical outcomes, and subsumption relationships among subscriptions are considered for respectively delivered composite events. Based on a generic subset of FAIDECS’s predicate language, we introduce properties for composite event deliveries in the presence of process failures and present novel decentralized algorithms implementing these properties. Our algorithms are compared under various workloads to solutions providing equivalent guarantees.

Gregory Aaron Wilkin, K. R. Jayaram, Patrick Eugster, Ankur Khetrapal

AmbiStream: A Middleware for Multimedia Streaming on Heterogeneous Mobile Devices

Abstract

Multimedia streaming when smartphones act as both clients and servers is difficult. Indeed, multimedia streaming protocols and associated data formats supported by today’s smartphones are highly heterogeneous. At the same time, multimedia processing is resource consuming while smartphones are resource-constrained devices. To overcome this complexity, we present AmbiStream, a lightweight middleware layer solution, which enables applications that run on smartphones to easily handle multimedia streams. In contrast to existing multimedia-oriented middleware that proposes a complete stack for multimedia streaming, our solution leverages the available highly-optimized multimedia software stacks of the smartphones’ platforms and complements them with additional, yet resource-efficient, layers to enable interoperability. We introduce the challenges, present our approach and discuss the experimental results obtained when executing AmbiStream on both Android and iOS smartphones. Our results show that it is possible to perform adaptation at run time and still obtain streams with satisfactory quality.

Emil Andriescu, Roberto Speicys Cardoso, Valérie Issarny

Virtualizing Stream Processing

Abstract

Stream processing systems have evolved into established solutions as standalone engines but they still lack flexibility in terms of large-scale deployment, integration, extensibility, and interoperability. In recent years, a substantial ecosystem of new applications has emerged that can potentially benefit from stream processing but introduces different requirements on how stream processing solutions can be integrated, deployed, extended, and federated. To address these needs, we present an exoengine architecture and the associated ExoP platform. Together, they provide the means for encapsulating components of stream processing systems as well as automating the data exchange between components and their distributed deployment. The proposed solution can be used, e.g., to connect heterogeneous streaming engines, replace operators at runtime, and migrate operators across machines with a negligible overhead.

Michael Duller, Jan S. Rellermeyer, Gustavo Alonso, Nesime Tatbul

Replication and Caching

Leader Election for Replicated Services Using Application Scores

Abstract

Replicated services often rely on a leader to order client requests and broadcast state updates. In this work, we present POLE, a leader election algorithm that selects leaders using application-specific scores. This flexibility given to the application enables the algorithm to tailor leader election according to metrics that are relevant in practical settings and that have been overlooked by existing approaches. Recovery time and request latency are examples of such metrics. To evaluate POLE, we use ZooKeeper, an open-source replicated service used for coordinating Web-scale applications. Our evaluation over realistic wide-area settings shows that application scores can have a significant impact on performance, and that just optimizing the latency of consensus does not translate into lower latency for clients. An important conclusion from our results is that obtaining a general strategy that satisfies a wide range of requirements is difficult, which implies that configurability is indispensable for practical leader election.

Diogo Becker, Flavio Junqueira, Marco Serafini
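The notion of electing a leader by application-specific score can be sketched as follows. This is not the POLE algorithm, whose protocol the abstract does not describe: it only shows the final selection step once every replica's score is known. The names `elect_leader` and `latency_score`, the latency-based score, and the tie-break on replica identifier are assumptions made for illustration.

```python
def latency_score(latency_ms):
    """Hypothetical application score: lower client latency is better."""
    return -latency_ms


def elect_leader(scores):
    """Pick the replica with the highest score.

    Ties are broken by the larger replica identifier, so that every
    replica that sees the same scores picks the same leader.
    """
    if not scores:
        raise ValueError("no candidate replicas")
    return max(scores, key=lambda replica: (scores[replica], replica))


# Hypothetical measured client latencies per replica, in milliseconds.
latencies = {"r1": 40.0, "r2": 12.5, "r3": 12.5}
scores = {replica: latency_score(ms) for replica, ms in latencies.items()}
leader = elect_leader(scores)
```

Swapping `latency_score` for a different function (for example one based on recovery time) changes the election outcome without touching `elect_leader`, which is the kind of configurability the abstract argues for.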

PolyCert: Polymorphic Self-optimizing Replication for In-Memory Transactional Grids

Abstract

In-memory NoSQL transactional data grids are emerging as an attractive alternative to conventional relational distributed databases. In these platforms, replication plays a role of paramount importance, as it represents the key mechanism to ensure data durability. In this work we focus on Atomic Broadcast (AB) based certification replication schemes, which have recently emerged as a much more scalable alternative to classical replication protocols based on active replication or atomic commit protocols. We first show that, among the existing AB-based certification protocols, no “one-fits-all” solution exists that achieves optimal performance in the presence of heterogeneous workloads. Next, we present PolyCert, a polymorphic certification protocol that allows for the concurrent coexistence of different certification protocols, relying on machine-learning techniques to determine the optimal certification scheme on a per transaction basis. We design and evaluate two alternative oracles, based on parameter-free machine learning techniques that rely both on off-line and on-line training approaches. Our experimental results demonstrate the effectiveness of the proposed approach, highlighting that PolyCert is capable of achieving a performance extremely close to that of an optimal non-adaptive certification protocol in the presence of non-heterogeneous workloads, and significantly outperforms any non-adaptive protocol when used with realistic, complex applications that generate heterogeneous workloads.

Maria Couceiro, Paolo Romano, Luis Rodrigues

A Trigger-Based Middleware Cache for ORMs

Abstract

Caching is an important technique in scaling storage for high-traffic web applications. Usually, building caching mechanisms involves significant effort from the application developer to maintain and invalidate data in the cache. In this work we present CacheGenie, a caching middleware which makes it easy for web application developers to use caching mechanisms in their applications. CacheGenie provides high-level caching abstractions for common query patterns in web applications based on Object-Relational Mapping (ORM) frameworks. Using these abstractions, the developer does not have to worry about managing the cache (e.g., insertion and deletion) or maintaining consistency (e.g., invalidation or updates) when writing application code.

We design and implement CacheGenie in the popular Django web application framework, with PostgreSQL as the database backend and memcached as the caching layer. To automatically invalidate or update cached data, we use triggers inside the database. CacheGenie requires no modifications to PostgreSQL or memcached. To evaluate our prototype, we port several Pinax web applications to use our caching abstractions. Our results show that it takes little effort for application developers to use CacheGenie, and that CacheGenie improves throughput by 2-2.5× for read-mostly workloads in Pinax.

Priya Gupta, Nickolai Zeldovich, Samuel Madden
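The trigger-based invalidation that the CacheGenie abstract describes can be illustrated with a small self-contained sketch. It is not CacheGenie's implementation: to stay runnable with the Python standard library alone it substitutes SQLite for PostgreSQL and a plain dictionary for memcached, and the `profile` table, the `invalidate` function, and the `get_name` helper are hypothetical names chosen for illustration.

```python
import sqlite3

cache = {}  # stand-in for memcached: user id -> cached name


def invalidate(user_id):
    """Called from inside the database trigger to drop a stale entry."""
    cache.pop(user_id, None)


conn = sqlite3.connect(":memory:")
# Expose the Python function to SQL so that a trigger can call it.
conn.create_function("invalidate", 1, invalidate)
conn.executescript("""
CREATE TABLE profile (id INTEGER PRIMARY KEY, name TEXT);
CREATE TRIGGER profile_update AFTER UPDATE ON profile
BEGIN
    SELECT invalidate(NEW.id);
END;
""")


def get_name(user_id):
    """Read-through lookup: query the database only on a cache miss."""
    if user_id not in cache:
        row = conn.execute(
            "SELECT name FROM profile WHERE id = ?", (user_id,)
        ).fetchone()
        cache[user_id] = row[0] if row else None
    return cache[user_id]


conn.execute("INSERT INTO profile (id, name) VALUES (1, 'alice')")
first = get_name(1)   # miss: fills the cache from the database
conn.execute("UPDATE profile SET name = 'bob' WHERE id = 1")
second = get_name(1)  # the trigger evicted the entry, so this re-reads
```

The application code never invalidates the cache explicitly; the write path through the database does it, which is the property the abstract highlights.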

Security and Interoperability

Deploy, Adjust and Readjust: Supporting Dynamic Reconfiguration of Policy Enforcement

Abstract

For large distributed applications, security and performance are two requirements often difficult to satisfy together. Addressing them separately leads more often to fast systems with security holes, rather than secure systems with poor performance. For instance, caching data needed for security decisions can lead to security violations when the data changes faster than the cache can refresh it. Retrieving such fresh data without caching it impacts performance. In this paper, we analyze a subproblem: how to dynamically configure a distributed authorization system when both security and performance requirements change. We examine data caching, retrieval and correlation, and propose a runtime management tool that, with external input, finds and enacts the customizations that satisfy both security and performance needs. Preliminary results show it takes around two seconds to find customization solutions in a setting with over one thousand authorization components.

Gabriela Gheorghe, Bruno Crispo, Roberto Carbone, Lieven Desmet, Wouter Joosen

A Middleware Layer for Flexible and Cost-Efficient Multi-tenant Applications

Abstract

Application-level multi-tenancy is an architectural design principle for Software-as-a-Service applications to enable the hosting of multiple customers (or tenants) by a single application instance. Despite the operational cost and maintenance benefits of application-level multi-tenancy, the current middleware component models for multi-tenant application design are inflexible with respect to providing different software variations to different customers.

In this paper we show that this limitation can be solved by a multi-tenancy support layer that combines dependency injection with middleware support for tenant data isolation. Dependency injection enables injecting different software variations on a per tenant basis, while dedicated middleware support facilitates the separation of data and configuration metadata between tenants. We implemented a prototype on top of Google App Engine and we evaluated by means of a case study that the improved flexibility of our approach has little impact on operational costs and upfront application engineering costs.

Stefan Walraven, Eddy Truyen, Wouter Joosen

Bridging the Interoperability Gap: Overcoming Combined Application and Middleware Heterogeneity

Abstract

Interoperability remains a significant challenge in today’s distributed systems; it is necessary to quickly compose and connect (often at runtime) previously developed and deployed systems in order to build more complex systems of systems. However, such systems are characterized by heterogeneity at both the application and middleware-level, where application differences are seen in terms of incompatible interface signatures and data content, and at the middleware level in terms of heterogeneous communication protocols. Consider a Flickr client implemented upon the XML-RPC protocol being composed with Picasa’s Service; here, the Flickr and Picasa APIs differ significantly, and the underlying communication protocols are different. A number of ad-hoc solutions exist to resolve differences at either distinct level, e.g., data translation technologies, service choreography tools, or protocol bridges; however, we argue that middleware solutions to interoperability should support developers in addressing these challenges using a unified framework. For this purpose we present the Starlink framework, which allows an interoperability solution to be specified using domain specific languages that are then used to generate the necessary executable software to enable runtime interoperability. We demonstrate the effectiveness of Starlink using an application case-study and show that it successfully resolves combined application and middleware heterogeneity.

Yérom-David Bromberg, Paul Grace, Laurent Réveillère, Gordon S. Blair

Run-Time (Re)configuration and Inspection

The Role of Ontologies in Emergent Middleware: Supporting Interoperability in Complex Distributed Systems

Abstract

Interoperability is a fundamental problem in distributed systems, and an increasingly difficult problem given the level of heterogeneity and dynamism exhibited by contemporary systems. While progress has been made, we argue that complexity is now at a level such that existing approaches are inadequate and that a major re-think is required to identify principles and associated techniques to achieve this central property of distributed systems. In this paper, we postulate that emergent middleware is the right way forward; emergent middleware is a dynamically generated distributed system infrastructure for the current operating environment and context. In particular, we focus on the key role of ontologies in supporting this process and in providing underlying meaning and associated reasoning capabilities to allow the right run-time choices to be made. The paper presents the Connect middleware architecture as an example of emergent middleware and highlights the role of ontologies as a cross-cutting concern throughout this architecture. Two experiments are described as initial evidence of the potential role of ontologies in middleware. Important remaining challenges are also documented.

Gordon S. Blair, Amel Bennaceur, Nikolaos Georgantas, Paul Grace, Valérie Issarny, Vatsala Nundloll, Massimo Paolucci

Co-managing Software and Hardware Modules through the Juggle Middleware

Abstract

Reprogrammable hardware like Field-Programmable Gate Arrays (FPGAs) is becoming increasingly powerful and affordable. Modern FPGA chips can be reprogrammed at runtime and with low latency which makes them attractive to be used as a dynamic resource in systems. For instance, on mobile devices FPGAs can help to accelerate the performance of critical tasks and at the same time increase the energy-efficiency of the device. The integration of FPGA resources into commodity software, however, is a highly involved task. On the one hand, there is an impedance mismatch between the hardware description languages in which FPGAs are programmed and the high-level languages in which many mobile applications are nowadays developed. On the other hand, the FPGA is a limited and shared resource and as such requires explicit resource management. In this paper, we present the Juggle middleware which leverages the ideas of modularity and service-orientation to facilitate a seamless exchange of hardware and software implementations at runtime. Juggle is built around the well-established OSGi standard for software modules in Java and extends it with support for services implemented in reprogrammable hardware, thereby leveraging the same level of management for both worlds. We show that hardware-accelerated services implemented with Juggle can help to increase the performance of applications and reduce power consumption on mobile devices without requiring any changes to existing program code.

Jan S. Rellermeyer, Ramon Küpfer

A Generic Solution for Agile Run-Time Inspection Middleware

Abstract

Contemporary middleware offers powerful abstractions to construct distributed software systems. However, when inspecting the software at run-time, these abstractions are no longer visible. While inspection, monitoring and management are increasingly important in our always-online world, they are often only possible in terms of the lower-level abstraction of the underlying platform. Due to the complexity of current programming languages and middleware, this low-level information is too complex to handle or understand.

This paper presents a run-time inspection system based on dynamic model transformation capabilities that extends run-time entities with higher-level abstract views, in order to enable inspection in terms of the original and most relevant abstractions. Our solution is lightweight in terms of performance overhead and agile in the sense that it can selectively (and on-demand) generate these high-level views.

Our prototype implementation has been applied to inspect distributed applications using RMI. In this case study, we inspect the distributed RMI system using our integrated overview over the collection of distributed objects that interact using remote method invocation.

Wouter De Borger, Bert Lagaisse, Wouter Joosen

Industry

A Comparison of Secure Multi-Tenancy Architectures for Filesystem Storage Clouds

Abstract

A filesystem-level storage cloud offers network-filesystem access to multiple customers at low cost over the Internet. In this paper, we investigate two alternative architectures for achieving multi-tenancy securely and efficiently in such storage cloud services. They isolate customers in virtual machines at the hypervisor level and through mandatory access-control checks in one shared operating-system kernel, respectively. We compare and discuss the practical security guarantees of these architectures. We have implemented both approaches and compare them using performance measurements we obtained.

Anil Kurmus, Moitrayee Gupta, Roman Pletka, Christian Cachin, Robert Haas

SafeWeb: A Middleware for Securing Ruby-Based Web Applications

Abstract

Web applications in many domains such as healthcare and finance must process sensitive data, while complying with legal policies regarding the release of different classes of data to different parties. Currently, software bugs may lead to irreversible disclosure of confidential data in multi-tier web applications. An open challenge is how developers can guarantee these web applications only ever release sensitive data to authorised users without costly, recurring security audits.

Our solution is to provide a trusted middleware that acts as a “safety net” to event-based enterprise web applications by preventing harmful data disclosure before it happens. We describe the design and implementation of SafeWeb, a Ruby-based middleware that associates data with security labels and transparently tracks their propagation at different granularities across a multi-tier web architecture with storage and complex event processing. For efficiency, maintainability and ease of use, SafeWeb exploits the dynamic features of the Ruby programming language to achieve label propagation and data flow enforcement. We evaluate SafeWeb by reporting our experience of implementing a web-based cancer treatment application and deploying it as part of the UK National Health Service (NHS).

Petr Hosek, Matteo Migliavacca, Ioannis Papagiannis, David M. Eyers, David Evans, Brian Shand, Jean Bacon, Peter Pietzuch

Table of Contents

Frontmatter

Invited Paper

Democratizing Transactional Programming

Social Networks

Scaling Microblogging Services with Divergent Traffic Demands

Contrail: Enabling Decentralized Social Networks on Smartphones

Confidant: Protecting OSN Data without Locking It Up

Storage and Performance Management

Live Deduplication Storage of Virtual Machine Images in an Open-Source Cloud

Scalable Load Balancing in Cluster Storage Systems

Predico: A System for What-if Analysis in Complex Data Center Applications

Green Computing and Resource Management

GreenWare: Greening Cloud-Scale Data Centers to Maximize the Use of Renewable Energy

Resource Provisioning Framework for MapReduce Jobs with Performance Goals

Resource-Aware Adaptive Scheduling for MapReduce Clusters

Notification and Streaming

A Content-Based Publish/Subscribe Matching Algorithm for 2D Spatial Objects

FAIDECS: Fair Decentralized Event Correlation

AmbiStream: A Middleware for Multimedia Streaming on Heterogeneous Mobile Devices

Virtualizing Stream Processing

Replication and Caching

Leader Election for Replicated Services Using Application Scores

PolyCert: Polymorphic Self-optimizing Replication for In-Memory Transactional Grids

A Trigger-Based Middleware Cache for ORMs

Security and Interoperability

Deploy, Adjust and Readjust: Supporting Dynamic Reconfiguration of Policy Enforcement

A Middleware Layer for Flexible and Cost-Efficient Multi-tenant Applications

Bridging the Interoperability Gap: Overcoming Combined Application and Middleware Heterogeneity

Run-Time (Re)configuration and Inspection

The Role of Ontologies in Emergent Middleware: Supporting Interoperability in Complex Distributed Systems

Co-managing Software and Hardware Modules through the Juggle Middleware

A Generic Solution for Agile Run-Time Inspection Middleware

Industry

A Comparison of Secure Multi-Tenancy Architectures for Filesystem Storage Clouds

SafeWeb: A Middleware for Securing Ruby-Based Web Applications

Backmatter
