main-content

This book constitutes the proceedings of the 20th IFIP International Conference on Distributed Applications and Interoperable Systems, DAIS 2020, which was supposed to be held in Valletta, Malta, in June 2020, as part of the 15th International Federated Conference on Distributed Computing Techniques, DisCoTec 2020. The conference was held virtually due to the COVID-19 pandemic.

The 10 full papers presented together with 1 short paper and 1 invited paper were carefully reviewed and selected from 17 submissions. The papers addressed challenges in multiple application areas, such as privacy and security, cloud and systems, fault-tolerance and reproducibility, machine learning for systems, and distributed algorithms.

### On the Trade-Offs of Combining Multiple Secure Processing Primitives for Data Analytics

Abstract
Cloud Computing services for data analytics are increasingly being sought by companies to extract value from large quantities of information. However, processing data from individuals and companies in third-party infrastructures raises several privacy concerns. To this end, different secure analytics techniques and systems have recently emerged. These initial proposals leverage specific cryptographic primitives lacking generality and thus having their application restricted to particular application scenarios. In this work, we contribute to this thriving body of knowledge by combining two complementary approaches to process sensitive data.
We present SafeSpark, a secure data analytics framework that enables the combination of different cryptographic processing techniques with hardware-based protected environments for privacy-preserving data storage and processing. SafeSpark is modular and extensible therefore adapting to data analytics applications with different performance, security and functionality requirements.
We have implemented a SafeSpark’s prototype based on Spark SQL and Intel SGX hardware. It has been evaluated with the TPC-DS Benchmark under three scenarios using different cryptographic primitives and secure hardware configurations. These scenarios provide a particular set of security guarantees and yield distinct performance impact, with overheads ranging from as low as 10% to an acceptable 300% when compared to an insecure vanilla deployment of Apache Spark.
Hugo Carvalho, Daniel Cruz, Rogério Pontes, João Paulo, Rui Oliveira

### Capturing Privacy-Preserving User Contexts with IndoorHash

Abstract
IoT devices are ubiquitous and widely adopted by end-users to gather personal and environmental data that often need to be put into context in order to gain insights. In particular, location is often a critical context information that is required by third parties in order to analyse such data at scale. However, sharing this information is i) sensitive for the user privacy and ii) hard to capture when considering indoor environments.
This paper therefore addresses the challenge of producing a new location hash, named IndoorHash, that captures the indoor location of a user, without disclosing the physical coordinates, thus preserving their privacy. This location hash leverages surrounding infrastructure, such as WiFi access points, to compute a key that uniquely identifies an indoor location.
Location hashes are only known from users physically visiting these locations, thus enabling a new generation of privacy-preserving crowdsourcing mobile applications that protect from third parties re-identification attacks. We validate our results with a crowdsourcing campaign of 31 mobile devices during one month of data collection.
Lakhdar Meftah, Romain Rouvoy, Isabelle Chrisment

### Towards Hypervisor Support for Enhancing the Performance of Virtual Machine Introspection

Abstract
Virtual machine introspection (VMI) is the process of external monitoring of virtual machines. Previous work has demonstrated that VMI can contribute to the security of cloud environments and distributed systems, as it enables, for example, stealthy intrusion detection. One of the biggest challenges for applying VMI in production environments is the performance overhead that certain tracing operations impose on the monitored virtual machine. In this paper, we show how this performance overhead can be significantly minimized by incorporating minor extensions for VMI operations into the hypervisor. In a proof-of-concept implementation, we demonstrate that the pre-processing of VMI events in the Xen hypervisor reduces the monitoring overhead for the use case of VMI-based process-bound monitoring by a factor of 18.
Benjamin Taubmann, Hans P. Reiser

### Fed-DIC: Diagonally Interleaved Coding in a Federated Cloud Environment

Abstract
Coping with failures in modern distributed storage systems that handle massive volumes of heterogeneous and potentially rapidly changing data, has become a very important challenge. A common practice is to utilize fault tolerance methods like Replication and Erasure Coding for maximizing data availability. However, while erasure codes provide better fault tolerance compared to replication with a more affordable storage overhead, they frequently suffer from high reconstruction cost as they require to access all available nodes when a data block needs to be repaired, and also can repair up to a limited number of unavailable data blocks, depending on the number of the code’s parity block capabilities. Furthermore, storing and placing the encoded data in the federated storage system also remains a challenge. In this paper we present Fed-DIC, a framework which combines Diagonally Interleaved Coding on client devices at the edge of the network with organized storage of encoded data in a federated cloud system comprised of multiple independent storage clusters. The erasure coding operations are performed on client devices at the edge while they interact with the federated cloud to store the encoded data. We describe how our solution integrates the functionality of federated clouds alongside erasure coding implemented on edge devices for maximizing data availability and we evaluate the working and benefits of our approach in terms of read access cost, data availability, storage overhead, load balancing and network bandwidth rate compared to popular Replication and Erasure Coding schemes.
Giannis Tzouros, Vana Kalogeraki

### TailX: Scheduling Heterogeneous Multiget Queries to Improve Tail Latencies in Key-Value Stores

Abstract
Users of interactive services such as e-commerce platforms have high expectations for the performance and responsiveness of these services. Tail latency, denoting the worst service times, contributes greatly to user dissatisfaction and should be minimized. Maintaining low tail latency for interactive services is challenging because a request is not complete until all its operations are completed. The challenge is to identify bottleneck operations and schedule them on uncoordinated backend servers with minimal overhead, when the duration of these operations are heterogeneous and unpredictable. In this paper, we focus on improving the latency of multiget operations in cloud data stores. We present TailX, a task-aware multiget scheduling algorithm that improves tail latencies under heterogeneous workloads. TailX schedules operations according to an estimation of the size of the corresponding data, and allows itself to procrastinate some operations to give way to higher priority ones. We implement TailX in Cassandra, a widely used key-value store. The result is an improved overall performance of the cloud data stores for a wide variety of heterogeneous workloads. Specifically, our experiments under heterogeneous YCSB workloads show that TailX outperforms state-of-the-art solutions and reduces tail latencies by up to 70% and median latencies by up to 75%.
Vikas Jaiman, Sonia Ben Mokhtar, Etienne Rivière

### Building a Polyglot Data Access Layer for a Low-Code Application Development Platform

(Experience Report)
Abstract
Low-code application development as proposed by the OutSystems Platform enables fast mobile and desktop application development and deployment. It hinges on visual development of the interface and business logic but also on easy integration with data stores and services while delivering robust applications that scale.
Data integration increasingly means accessing a variety of NoSQL stores. Unfortunately, the diversity of data and processing models, that make them useful in the first place, is difficult to reconcile with the simplification of abstractions exposed to developers in a low-code platform. Moreover, NoSQL data stores also rely on a variety of general purpose and custom scripting languages as their main interfaces.
In this paper we report on building a polyglot data access layer for the OutSystems Platform that uses SQL with optional embedded script snippets to bridge the gap between low-code and full access to NoSQL stores.
Ana Nunes Alonso, João Abreu, David Nunes, André Vieira, Luiz Santos, Tércio Soares, José Pereira

### A Comparison of Message Exchange Patterns in BFT Protocols

(Experience Report)
Abstract
The performance and scalability of byzantine fault-tolerant (BFT) protocols for state machine replication (SMR) have recently come under scrutiny due to their application in the consensus mechanism of blockchain implementations. This led to a proliferation of proposals that provide different trade-offs that are not easily compared as, even if these are all based on message passing, multiple design and implementation factors besides the message exchange pattern differ between each of them. In this paper we focus on the impact of different combinations of cryptographic primitives and the message exchange pattern used to collect and disseminate votes, a key aspect for performance and scalability. By measuring this aspect in isolation and in a common framework, we characterise the design space and point out research directions for adaptive protocols that provide the best trade-off for each environment and workload combination.
Fábio Silva, Ana Alonso, José Pereira, Rui Oliveira

### Kollaps/Thunderstorm: Reproducible Evaluation of Distributed Systems

Tutorial Paper
Abstract
Reproducing experimental results is nowadays seen as one of the greatest impairments for the progress of science in general and distributed systems in particular. This stems from the increasing complexity of the systems under study and the inherent complexity of capturing and controlling all variables that can potentially affect experimental results. We argue that this can only be addressed with a systematic approach to all the stages and aspects of the evaluation process, such as the environment in which the experiment is run, the configuration and software versions used, and the network characteristics among others. In this tutorial paper, we focus on the networking aspect, and discuss our ongoing research efforts and tools to contribute to a more systematic and reproducible evaluation of large scale distributed systems.
Miguel Matos

### Self-tunable DBMS Replication with Reinforcement Learning

Abstract
Fault-tolerance is a core feature in distributed database systems, particularly the ones deployed in cloud environments. The dependability of these systems often relies in middleware components that abstract the DBMS logic from the replication itself. The highly configurable nature of these systems makes their throughput very dependent on the correct tuning for a given workload. Given the high complexity involved, machine learning techniques are often considered to guide the tuning process and decompose the relations established between tuning variables.
This paper presents a machine learning mechanism based on reinforcement learning that attaches to a hybrid replication middleware connected to a DBMS to dynamically live-tune the configuration of the middleware according to the workload being processed. Along with the vision for the system, we present a study conducted over a prototype of the self-tuned replication middleware, showcasing the achieved performance improvements and showing that we were able to achieve an improvement of 370.99% on some of the considered metrics.
Luís Ferreira, Fábio Coelho, José Pereira

### DroidAutoML: A Microservice Architecture to Automate the Evaluation of Android Machine Learning Detection Systems

Abstract
The mobile ecosystem is witnessing an unprecedented increase in the number of malware in the wild. To fight this threat, actors from both research and industry are constantly innovating to bring concrete solutions to improve security and malware protection. Traditional solutions such as signature-based anti viruses have shown their limits in front of massive proliferation of new malware, which are most often only variants specifically designed to bypass signature-based detection. Accordingly, it paves the way to the emergence of new approaches based on Machine Learning (ML) technics to boost the detection of unknown malware variants. Unfortunately, these solutions are most often underexploited due to the time and resource costs required to adequately fine tune machine learning algorithms. In reality, in the Android community, state-of-the-art studies do not focus on model training, and most often go through an empirical study with a manual process to choose the learning strategy, and/or use default values as parameters to configure ML algorithms. However, in the ML domain, it is well known admitted that to solve efficiently a ML problem, the tunability of hyper-parameters is of the utmost importance. Nevertheless, as soon as the targeted ML problem involves a massive amount of data, there is a strong tension between feasibility of exploring all combinations and accuracy. This tension imposes to automate the search for optimal hyper-parameters applied to ML algorithms, that is not anymore possible to achieve manually. To this end, we propose a generic and scalable solution to automatically both configure and evaluate ML algorithms to efficiently detect Android malware detection systems. Our approach is based on devOps principles and a microservice architecture deployed over a set of nodes to scale and exhaustively test a large number of ML algorithms and hyper-parameters combinations. With our approach, we are able to systematically find the best fit to increase up to 11% the accuracy of two state-of-the-art Android malware detection systems.
Yérom-David Bromberg, Louison Gitzinger

### A Resource Usage Efficient Distributed Allocation Algorithm for 5G Service Function Chains

Abstract
Recent evolution of networks introduce new challenges for the allocation of resources. Slicing in 5G networks allows multiple users to share a common infrastructure and the chaining of Network Function (NFs) introduces constraints on the order in which NFs are allocated. We first model the allocation of resources for Chains of NFs in 5G Slices. Then we introduce a distributed mutual exclusion algorithm to address the problem of the allocation of resources. We show with selected metrics that choosing an order of allocation of the resources that differs from the order in which resources are used can give better performances. We then show experimental results where we improve the usage rate of resources by more than 20% compared to the baseline algorithm in some cases. The experiments run on our own simulator based on SimGrid.
Guillaume Fraysse, Jonathan Lejeune, Julien Sopena, Pierre Sens

### A Self-stabilizing One-To-Many Node Disjoint Paths Routing Algorithm in Star Networks

Abstract
The purpose of the paper is to present the first self-stabilizing algorithm for finding $$n-1$$ one-to-many node-disjoint paths in message passing model. Two paths in a network are said to be node disjoint if they do not share any nodes except for the endpoints. Our proposed algorithm works on n-dimensional star networks $$S_n$$. Given a source node s and a set of $$D = \{d_1, d_2, ...,d_{n-1} \}$$ of $$n-1$$ destination nodes in the n-dimensional star network, our algorithm constructs $$n-1$$ node-disjoints paths $$P_1, P_2,...,P_{n-1}$$, where $$P_i$$ is a path from s to $$d_i$$, $$1 \le i \le n-1$$. Since the proposed solution is self-stabilizing [7], it does not require initialization and withstands transient faults. The stabilization time of our algorithm is $$O(n^2)$$ rounds.