
2014 | Book

Software Engineering for Resilient Systems

6th International Workshop, SERENE 2014, Budapest, Hungary, October 15-16, 2014. Proceedings

Edited by: István Majzik, Marco Vieira

Publisher: Springer International Publishing

Book series: Lecture Notes in Computer Science

About this book

This book constitutes the refereed proceedings of the 6th International Workshop on Software Engineering for Resilient Systems, SERENE 2014, held in Budapest, Hungary, in October 2014. The 11 revised technical papers presented together with one project paper and one invited talk were carefully reviewed and selected from 22 submissions. The papers are organized in topical sections on design of resilient systems; analysis of resilience; verification and validation; and monitoring.

Table of Contents

Frontmatter

Invited Talk

Community Resilience Engineering: Reflections and Preliminary Contributions
Abstract
An important challenge for human societies is that of mastering the complexity of Community Resilience, namely “the sustained ability of a community to utilize available resources to respond to, withstand, and recover from adverse situations”. This concise definition puts the accent on an important requirement: a community’s ability to make intelligent use of the available resources, both institutional and spontaneous, in order to match the complex evolution of the “significant multi-hazard threats characterizing a crisis”. Failing to address this requirement exposes a community to extensive failures that are known to exacerbate the consequences of natural and human-induced crises. As a consequence, we experience today an urgent need to respond to the challenges of community resilience engineering. This problem, some reflections, and preliminary prototypical contributions constitute the topics of the present article.
Vincenzo De Florio, Hong Sun, Chris Blondia

Design of Resilient Systems

Enhancing Architecture Design Decisions Evolution with Group Decision Making Principles
Abstract
In order to build resilient systems, robust architectures are needed. The software architecture community clearly recognizes that robust architectures come from a robust decision-making process. The community also acknowledges that software architecture decision making is not an individual activity but a group process where architectural design decisions are made by groups of heterogeneous and dispersed stakeholders. The decision-making process is not just data driven, but also people driven, and group decision making methodologies have been studied from multiple perspectives (e.g., psychology, organizational behavior, economics) with the clear understanding that a poor-quality decision making process is more likely than a high-quality process to lead to undesirable outcomes (including disastrous fiascoes).
In this work, we propose to explicitly include group decision making strategies in the architecting phase, so as to clearly document not only the architectural decisions that may lead to the success or failure of a system, but also the group decision making factors driving the way architecture design decisions are made. In this regard, this work defines a group design decision metamodel (for representing group design decisions and their relationships), together with ways to trace group design decisions towards other system life-cycle artifacts, and a change impact analysis engine for supporting evolving design decisions.
Ivano Malavolta, Henry Muccini, Smrithi Rekha V.
The Role of Parts in the System Behaviour
Abstract
In today’s world, we are surrounded by software-based systems that control many critical activities. Every few years we experience dramatic software failures, which calls for software that gives evidence of resilience and continuity. Moreover, we are observing an unavoidable shift from stand-alone systems to systems of systems, to ecosystems, to cyber-physical systems and in general to systems that are composed of various independent parts that collaborate and cooperate to realise the desired goal.
Our thesis is that the resilience of such systems should be constructed compositionally and incrementally out of the resilience of system parts. Understanding the role of parts in the system behaviour will (i) promote a “divide-and-conquer strategy” on the verification of systems, (ii) enable the verification of systems that continuously evolve during their life-time, (iii) allow the detection and isolation of faults, and (iv) facilitate the definition of suitable reaction strategies. In this paper we propose a methodology that integrates needs of flexibility and agility with needs of resilience. We instantiate the methodology in the domain of a swarm of autonomous quadrotors that cooperate in order to achieve a given goal.
Davide Di Ruscio, Ivano Malavolta, Patrizio Pelliccione
Automatic Generation of Description Files for Highly Available Services
Abstract
Highly available services are becoming a part of our everyday life; yet building highly available (HA) systems remains a challenging task for most system integrators, who are expected to build reliable systems from non-reliable components. The Service Availability Forum (SAForum) defines open standards for building and maintaining HA systems using the SAForum middleware. Nevertheless, this task remains tedious and error prone due to the complexity of configuring this middleware. In this paper, we present a solution to automate the generation of description files for HA systems, which enables the automated generation of the middleware configuration. In order to achieve this, we propose an approach based on a new domain-specific language extending the UML component diagrams, along with a corresponding set of model transformations. We also present our prototype implementation and a case study as a proof of concept.
Maxime Turenne, Ali Kanso, Abdelouahed Gherbi, Ronan Barrett

Analysis of Resilience

Modelling Resilience of Data Processing Capabilities of CPS
Abstract
Modern CPS should process large amounts of data with high speed and reliability. To ensure that the system can handle varying volumes of data, system designers usually rely on architectures with a dynamically scaling degree of parallelism. However, to guarantee resilience of data processing, we should also ensure system fault tolerance, i.e., integrate mechanisms for dynamic reconfiguration. In this paper, we present an approach to formal modelling and assessment of reconfigurable dynamically scaling systems that guarantees resilience of data processing. We rely on modelling in Event-B to formally define the dynamic system architecture with integrated dynamically scaling parallelism and reconfiguration. The formal development allows us to derive a complex system architecture and verify its correctness. To quantitatively assess the resilience of the data processing architecture, we rely on statistical model checking and evaluate the likelihood of successful data processing under different system parameters. The proposed integrated approach facilitates design space exploration and improves predictability in the development of complex data processing capabilities.
Linas Laibinis, Dmitry Klionskiy, Elena Troubitsyna, Anatoly Dorokhov, Johan Lilius, Mikhail Kupriyanov
Formal Fault Tolerance Analysis of Algorithms for Redundant Systems in Early Design Stages
Abstract
Redundancy techniques that use voting principles are often used to increase the reliability of systems by ensuring fault tolerance. In order to increase the efficiency of these redundancy strategies, we propose to exploit the inherent fault masking properties of software algorithms at the application level. An important step in early development stages is to choose, from a class of algorithms that achieve the same goal in different ways, one or more that should be executed redundantly. In order to evaluate the resilience of the algorithm variants, there is a great need for quantitative reasoning about the algorithms' fault tolerance in early design stages.
Here, we propose an approach to analyzing the vulnerability of given algorithm variants to hardware faults in redundant designs by applying a model checker and fault injection modelling. The method is capable of automatically identifying all input and fault combinations that remain undetected by a voting system. This leads to a better understanding of algorithm-specific resilience characteristics.
Andrea Höller, Nermin Kajtazovic, Christopher Preschern, Christian Kreiner
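
The idea in the abstract above, finding every input and fault combination that a voter fails to flag, can be illustrated with a small sketch. This is not the authors' method: it replaces the model checker with brute-force enumeration, and the two algorithm variants (`times3_add`, `times3_shift`), the 6-bit result register, and the stuck-at-0 fault model are all hypothetical choices made for illustration only.

```python
from itertools import product

WIDTH = 6  # assumed width of the result register, in bits


def alu(value, stuck_bit):
    """Model a result register; optionally force one bit to 0 (stuck-at-0)."""
    value &= (1 << WIDTH) - 1
    if stuck_bit is not None:
        value &= ~(1 << stuck_bit)
    return value


def times3_add(x, stuck_bit=None):
    """Hypothetical variant 1: 3x computed as (x + x) + x."""
    doubled = alu(x + x, stuck_bit)
    return alu(doubled + x, stuck_bit)


def times3_shift(x, stuck_bit=None):
    """Hypothetical variant 2: 3x computed as (x << 2) - x."""
    quadrupled = alu(x << 2, stuck_bit)
    return alu(quadrupled - x, stuck_bit)


def undetected_cases(replica_a, replica_b, inputs, faults):
    """List (input, fault, output, expected) where both replicas agree on a wrong value.

    The same fault is injected into both replicas, so a comparing voter
    only notices the fault when the two outputs differ.
    """
    cases = []
    for x, bit in product(inputs, faults):
        expected = replica_a(x)  # fault-free reference result
        out_a, out_b = replica_a(x, bit), replica_b(x, bit)
        if out_a == out_b and out_a != expected:
            cases.append((x, bit, out_a, expected))
    return cases


INPUTS = range(16)    # 3 * 15 = 45 and 4 * 15 = 60 both fit in 6 bits
FAULTS = range(WIDTH)

homogeneous = undetected_cases(times3_add, times3_add, INPUTS, FAULTS)
diverse = undetected_cases(times3_add, times3_shift, INPUTS, FAULTS)
print(len(homogeneous), "undetected cases with identical replicas")
print(len(diverse), "undetected cases with diverse replicas")
```

Under this fault model, two identical replicas always agree, so every erroneous result goes undetected. Pairing different variants can only shrink that set, which is the kind of algorithm-specific characteristic the abstract refers to.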
On Applying FMEA to SOAs: A Proposal and Open Challenges
Abstract
Service Oriented Architectures (SOAs) are being increasingly used to support business-critical systems, raising natural concerns regarding dependability and security attributes. In critical applications, Verification and Validation (V&V) practices are used during system development to achieve the desired level of quality. However, most V&V techniques suit a structured and documented development lifecycle, and assume that the system does not evolve after deployment, contrary to what happens with SOAs. Runtime V&V practices represent one possible solution to this problem, but they cannot be implemented without adjusting traditional V&V techniques.
This paper studies the adaptation of Failure Mode and Effects Analysis (FMEA) to SOA environments. A preliminary technique named FMEA4SOA is proposed and a case study is used to illustrate its usage. This process raises many challenges that must be overcome for FMEA4SOA to become a usable and effective V&V technique in SOA environments. The paper discusses these challenges while proposing a research roadmap.
Cristiana Areias, Nuno Antunes, João Carlos Cunha

Verification and Validation

Verification and Validation of a Pressure Control Unit for Hydraulic Systems
Abstract
This paper describes the development, verification and model-based validation of a safety-critical pressure relief function for a digital hydraulic system. It demonstrates techniques to handle typical challenges that are encountered when verifying and validating cyber-physical systems with complex dynamical behaviour. The system is developed using model-based design in Simulink. The verification part focuses on verification of functional properties of the controller, where formal automated verification tools are employed. The validation part focuses on validating that the controller has the desired impact on the physical system. In the latter part search-based methods are used to find undesired behaviour in a simulation model of the system. The combination of techniques provides confidence in the resilience of the developed system.
Pontus Boström, Mikko Heikkilä, Mikko Huova, Marina Waldén, Matti Linjama
Simulation Testing and Model Checking: A Case Study Comparing these Approaches
Abstract
One of the challenging problems in software development is assuring the correctness of the created software. During our previous research, we developed SimCo, a framework for the simulation-based testing of software components that allows us to test component-based applications or their fragments. SimCo was originally designed to perform tests according to a given scenario in order to determine extra-functional properties of the components. However, it can also be used to test the correctness of the component behaviour. Other means exist for this purpose, namely model checking tools such as Java Pathfinder. We want to compare the strengths and weaknesses of the two approaches as represented by SimCo and Java Pathfinder. In this paper, the results of the comparison of these two testing methods on a case study using an implementation of the FTP protocol are discussed.
Richard Lipka, Marek Paška, Tomáš Potužák
Advanced Modelling, Simulation and Verification for Future Traffic Regulation Optimisation
Abstract
This paper introduces a new project supported by the UK Technical Strategy Leadership Group (TSLG) to contribute to its vision of Future Traffic Regulation Optimisation (FuTRO). In this project Newcastle University will closely cooperate with Siemens Rail Automation on developing novel modelling, verification and simulation techniques and tools that support and explore an integrated approach to the efficient dynamic improvement of capacity and energy consumption of railway networks and nodes while ensuring whole-system safety. The SafeCap+ (or SafeCap for FuTRO) project builds on two previous projects (SafeCap and SafeCap Impact), which have developed a novel modelling environment that helps signalling engineers to design nodes (stations or junctions) in a way that guarantees their safety and allows engineers to explore different design options to select the ones that ensure improved node capacity.
Alexei Iliasov, Roberto Palacin, Alexander Romanovsky

Monitoring

Using Instrumentation for Quality Assessment of Resilient Software in Embedded Systems
Abstract
The obvious growth of complexity in embedded and cyber-physical systems requires developers to be innovative in the way they carry out the verification process. To increase the amount of information available from a system, software instrumentation has previously been used in these domains, thereby solving the problem of observability. In addition, as this kind of system tends to be increasingly involved in safety-critical and dependable applications, ensuring reliability properties must also be considered as a part of the verification process. In this paper, the system observability problem is first introduced. Then, as a solution to overcome this limitation, instrumentation is explored. To address the verification concerns of resilient systems, a three-component model is designed, which explicitly defines degradation and compensation models to capture the resiliency routine. Finally, to conclude the definition of the models, a handful of LTL properties are identified and discussed.
David Lawrence, Didier Buchs, Armin Wellig
Adaptive Domain-Specific Service Monitoring
Abstract
We propose an adaptive and domain-specific service monitoring approach to detect partner service errors in a cost-effective manner. Hereby, we not only consider generic errors such as file not found or connection timed out, but also take domain-specific errors into account. The detection of each type of error entails a different monitoring cost in terms of the consumed resources. To reduce costs, we adapt the monitoring frequency for each service and for each type of error based on the measured error rates and a cost model. We introduce an industrial case study from the broadcasting and content-delivery domain for improving the user-perceived reliability of Smart TV systems. We demonstrate the effectiveness of our approach with real data collected to be relevant for a commercial TV portal application. We present empirical results regarding the trade-off between monitoring overhead and error detection accuracy. Our results show that each service is usually subject to various types of errors with different error rates and exploiting this variation can reduce monitoring costs by up to 30% with negligible compromise on the quality of monitoring.
Arda Ahmet Ünsal, Görkem Sazara, Barış Aktemur, Hasan Sözer
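
To make the cost-driven adaptation described in the abstract above more concrete, here is a minimal sketch of one way a monitoring budget could be split across (service, error type) pairs. It is an assumption-laden illustration, not the authors' cost model: the function `plan_checks`, the rule of weighting each pair by error rate per unit cost, the `min_checks` floor, and the service names and numbers are all hypothetical.

```python
def plan_checks(error_rates, check_costs, budget, min_checks=1.0):
    """Return checks per period for each (service, error_type) pair.

    Assumed rule: every pair first gets `min_checks` so that it is never
    left unmonitored; the remaining budget is split in proportion to
    measured error rate divided by the cost of one check.
    """
    floor_cost = sum(min_checks * check_costs[k] for k in error_rates)
    if floor_cost > budget:
        raise ValueError("budget cannot cover the minimum checks")
    spare = budget - floor_cost

    weights = {k: error_rates[k] / check_costs[k] for k in error_rates}
    total = sum(weights.values())

    plan = {}
    for k in error_rates:
        # With no observed errors at all, fall back to an even split.
        share = weights[k] / total if total > 0 else 1 / len(error_rates)
        plan[k] = min_checks + spare * share / check_costs[k]
    return plan


# Hypothetical measurements: generic timeouts are cheap to detect,
# a domain-specific metadata check is assumed to cost four times as much.
error_rates = {
    ("epg", "timeout"): 0.08,
    ("epg", "bad_metadata"): 0.02,
    ("vod", "timeout"): 0.0,
}
check_costs = {
    ("epg", "timeout"): 1.0,
    ("epg", "bad_metadata"): 4.0,
    ("vod", "timeout"): 1.0,
}

plan = plan_checks(error_rates, check_costs, budget=100.0)
for key, checks in plan.items():
    print(key, round(checks, 2))
```

Re-running `plan_checks` whenever the measured error rates are refreshed yields an adaptive schedule: pairs that rarely fail drift towards the minimum frequency, while the total cost stays at the budget.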
Combined Error Propagation Analysis and Runtime Event Detection in Process-Driven Systems
Abstract
This paper presents an approach and proof-of-concept implementation for combined design-time error propagation analysis and runtime diagnosis in business-process-driven systems. We show how error propagation analysis can be made practical in this context with a qualitative error propagation notation that is approachable for the domain expert. The method uses models of business processes and their supporting IT infrastructure captured in industry-standard tools. Finite domain constraint solving is used to evaluate system alternatives from a dependability point of view. The systematic generation of event detection rules for runtime diagnosis is also supported. A real-life example from the banking domain is used to demonstrate the approach.
Gábor Urbanics, László Gönczy, Balázs Urbán, János Hartwig, Imre Kocsis
Backmatter
Metadata
Title
Software Engineering for Resilient Systems
Edited by
István Majzik
Marco Vieira
Copyright year
2014
Publisher
Springer International Publishing
Electronic ISBN
978-3-319-12241-0
Print ISBN
978-3-319-12240-3
DOI
https://doi.org/10.1007/978-3-319-12241-0