
2023 | Book

Computer Safety, Reliability, and Security. SAFECOMP 2023 Workshops

ASSURE, DECSoS, SASSUR, SENSEI, SRToITS, and WAISE, Toulouse, France, September 19, 2023, Proceedings

Editors: Jérémie Guiochet, Stefano Tonetta, Erwin Schoitsch, Matthieu Roy, Friedemann Bitsch

Publisher: Springer Nature Switzerland

Book Series: Lecture Notes in Computer Science


About this book

This book constitutes the proceedings of the workshops held in conjunction with SAFECOMP 2023 in Toulouse, France, on September 19, 2023.

The 35 full papers included in this volume were carefully reviewed and selected from 49 submissions.

- 8th International Workshop on Assurance Cases for Software-intensive Systems (ASSURE 2023)
- 18th International Workshop on Dependable Smart Embedded and Cyber-Physical Systems and Systems-of-Systems (DECSoS 2023)
- 10th International Workshop on Next Generation of System Assurance Approaches for Critical Systems (SASSUR 2023)
- 2nd International Workshop on Security and Safety Interactions (SENSEI 2023)
- 1st International Workshop on Safety/Reliability/Trustworthiness of Intelligent Transportation Systems (SRToITS 2023)
- 6th International Workshop on Artificial Intelligence Safety Engineering (WAISE 2023)

Table of Contents

Frontmatter

8th International Workshop on Assurance Cases for Software-Intensive Systems (ASSURE 2023)

Frontmatter
Using Assurance Cases to Prevent Malicious Behaviour from Targeting Safety Vulnerabilities

We discuss an approach to modifying a safety assurance case to take malicious intent into account. We show how to analyze an existing assurance case to reveal the additions and modifications needed to deal with the effects of malicious intent aimed at safety-critical applications, and where to make them.

Victor Bandur, Mark Lawford, Sébastien Mosser, Richard F. Paige, Vera Pantelic, Alan Wassyng
Constructing Security Cases Based on Formal Verification of Security Requirements in Alloy

Assuring that security requirements are met during the design phase is less expensive than making changes after system development. Deploying security-critical systems requires providing security cases that demonstrate whether the design adequately incorporates the security requirements. Building arguments and generating evidence to support the claims of an assurance case is of utmost importance and should rest on a rigorous mathematical basis, namely formal methods. In this paper, we propose an approach that uses formal methods to construct security assurance cases. The approach takes a list of security requirements as input and generates security cases to assess their fulfillment. Furthermore, we define security argument patterns, supported by the formal verification results and presented using the GSN pattern notation. The overall approach is validated through a case study involving an autonomous drone.

Marwa Zeroual, Brahim Hamid, Morayo Adedjouma, Jason Jaskolka
Assurance Cases for Timing Properties of Automotive TSN Networks

The problem of configuring an Automotive TSN (Time-Sensitive Networking) Ethernet network with desired timing properties consists of several individually complex problems, each with its own solution landscape. When the chosen solutions come together in the implementation of timing guarantees on these networks, presenting the argument and evidence for the correct behaviour of the network with respect to the timing requirements becomes a difficult problem in itself. In this paper, we present work in progress on demonstrating the use of assurance cases in making this argument explicit for an example TSN Ethernet timing requirement for an automotive powertrain network.

Ryan Kapinski, Vera Pantelic, Victor Bandur, Alan Wassyng, Mark Lawford
Toward Dependability Assurance Framework for Automated Driving Systems

Automated driving systems are advancing towards practical use, and commercial use has already begun in limited environments. Automated driving is a technology that will have a wide-ranging impact on our lives, making it crucial to form a consensus on the dependability of automated driving systems among various stakeholders, including the general public. Since 2022, we have been conducting joint research and development with an automated driving technology startup toward developing a framework for assuring the dependability of automated driving systems using assurance cases. This position paper reports our goals and the current status of our work.

Yutaka Matsuno, Toshinori Takai, Manabu Okada, Tomoyuki Tsuchiya

18th International ERCIM/EWICS Workshop on Dependable Smart Embedded Cyber-Physical Systems and Systems-of-Systems

Frontmatter
A Quantitative Approach for System of Systems’ Resilience Analyzing Based on ArchiMate

With the development of IT technology and the increasing demand for service integration, the widespread application of System of Systems (SoS) is inevitable. Among the many key issues related to SoS, analyzing the resilience of an SoS is a challenging problem. Although many systems engineering studies have provided solutions for this problem, two significant characteristics of SoS are ignored: the independence of the constituent systems (CSs) and the involvement of multiple stakeholders. Based on these two characteristics, this paper proposes a quantitative method for analyzing the resilience of an SoS. The method includes visual modeling of the SoS using the EA tool ArchiMate, quantitative simulation of the model based on defined service capacities of the CSs, and evaluation and design of resilience from multiple stakeholders' perspectives. Finally, a case study based on Mobility as a Service (MaaS) is presented. By analyzing the resilience of MaaS, the critical node is identified, and its resilience redesign improves the resilience of the SoS.

Huanjun Zhang, Yutaka Matsubara, Hiroaki Takada
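A minimal sketch of the kind of quantitative resilience simulation the abstract describes, assuming a simple serial service-dependency model among constituent systems; the service names, capacities, recovery behaviour, and resilience metric below are illustrative stand-ins, not the paper's ArchiMate-based model.

```python
# Illustrative SoS resilience simulation (not the paper's model).
# Each constituent system (CS) delivers a service capacity; the end-to-end MaaS
# capacity is bounded by the weakest CS in the chain. Resilience is the ratio of
# delivered capacity to nominal capacity over the simulation horizon.

NOMINAL = {"ticketing": 1.0, "routing": 1.0, "transport": 1.0}

def delivered_capacity(capacities):
    # Serial dependency chain: the overall service is limited by the weakest CS.
    return min(capacities.values())

def simulate(disrupted_cs, outage_start, recovery_rate, horizon=100):
    capacities = dict(NOMINAL)
    delivered = []
    for t in range(horizon):
        if t == outage_start:
            capacities[disrupted_cs] = 0.0                 # sudden failure
        elif t > outage_start:
            capacities[disrupted_cs] = min(                # gradual recovery
                NOMINAL[disrupted_cs],
                capacities[disrupted_cs] + recovery_rate)
        delivered.append(delivered_capacity(capacities))
    nominal_total = delivered_capacity(NOMINAL) * horizon
    return sum(delivered) / nominal_total                  # resilience in [0, 1]

if __name__ == "__main__":
    for cs in NOMINAL:
        print(cs, round(simulate(cs, outage_start=20, recovery_rate=0.05), 3))
```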
Towards DO-178C Compliance of a Secure Product

An approach to enhancing the cybersecurity of airborne systems is to integrate certified secure products, provided that the secure products demonstrate their compliance with the airworthiness standards. We conduct an evaluation of a COTS (Commercial Off-The-Shelf) secure product against DO-178C, so that it can be certified once integrated into an airborne system. Although the secure product has been certified under Common Criteria (CC), certifying it against DO-178C poses new challenges to the developer due to the different focuses of the two certification standards. While CC primarily focuses on evaluating the security features of a product, DO-178C places greater importance on ensuring the integrity of the development assurance processes. This paper presents the insights that we obtained while addressing the challenges encountered during the evaluation against DO-178C.

Lijun Shan
The Need for Threat Modelling in Unmanned Aerial Systems

Detecting cybersecurity vulnerabilities in Unmanned Aerial Systems (UAS) is essential to ensure the safe operation of drones. This supports the determination of cybersecurity objectives and the description of security requirements needed to achieve these objectives. However, it is challenging to automate this process to identify potential cyber threats and ensure the correctness of the applied security requirements, especially in a complex system such as a UAS network. In this work, we use ThreatGet as a threat modelling tool to identify potential cyber threats in UAS and highlight existing security vulnerabilities. This assists in determining the appropriate security requirements that could be implemented to achieve our security goal. We then develop a novel ontology-based threat modelling approach to infer a set of security threats based on the applied security requirements and then check the effectiveness of these requirements against threats to ensure these requirements are fulfilled.

Abdelkader Magdy Shaaban, Oliver Jung, Christoph Schmittner
Using Runtime Information of Controllers for Safe Adaptation at Runtime: A Process Mining Approach

The increasing complexity of current software systems creates the need for new ways to check the correct functioning of models at runtime. Runtime verification helps ensure that a system keeps working as expected after it has been deployed, which is essential for systems operating in critical or autonomous scenarios. This paper presents an improvement to an existing tool, named CRESCO, linking it with another tool to enable periodic verification based on event logs. These logs help determine whether the system has been functioning correctly since the last periodic check. If the system is found to be working incorrectly, new code files are automatically generated from the traces of the log file, so that they can be swapped in when a faulty scenario is about to occur. Thanks to this improvement, the CRESCO components are able to evaluate their correctness and adapt themselves at runtime, making the system more robust against unforeseen faulty scenarios.

Jorge Da Silva, Miren Illarramendi, Asier Iriarte
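A minimal sketch of the log-based conformance checking step that such periodic verification relies on, assuming the expected controller behaviour can be expressed as a simple state machine; the states, events, and function names are hypothetical and not the CRESCO API.

```python
# Illustrative periodic log-based conformance check (not the CRESCO implementation).
# Each trace from the event log is replayed against an expected state machine and
# flagged if it takes a transition the model does not allow.

EXPECTED = {                      # state -> {event: next_state}
    "idle":    {"start": "running"},
    "running": {"pause": "idle", "stop": "stopped"},
    "stopped": {},
}

def conforms(trace, initial="idle"):
    state = initial
    for event in trace:
        if event not in EXPECTED.get(state, {}):
            return False          # observed behaviour deviates from the model
        state = EXPECTED[state][event]
    return True

def periodic_check(event_log):
    deviating = [t for t in event_log if not conforms(t)]
    return len(deviating) == 0, deviating   # trigger adaptation if any trace deviates

if __name__ == "__main__":
    log = [["start", "pause"], ["start", "stop"], ["start", "stop", "start"]]
    ok, bad = periodic_check(log)
    print("system OK:", ok, "| deviating traces:", bad)
```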
Safety and Robustness for Deep Neural Networks: An Automotive Use Case

Current automotive safety standards are cautious when it comes to utilizing deep neural networks in safety-critical scenarios due to concerns regarding robustness to noise, domain drift, and uncertainty quantification. In this paper, we propose a scenario in which a neural network adjusts the automated driving style to reduce user stress. In this scenario, only certain actions are safety-critical, allowing for greater control over the model’s behavior. To demonstrate how safety can be addressed, we propose a mechanism based on robustness quantification and a fallback plan. This approach enables the model to minimize user stress in safe conditions while avoiding unsafe actions in uncertain scenarios. By exploring this use case, we hope to inspire discussions around identifying safety-critical scenarios and approaches where neural networks can be safely utilized. We also see this as a potential contribution to the development of new standards and best practices for the use of AI in safety-critical scenarios. The work presented here is a result of the TEACHING project, a European research project on the safe, secure and trustworthy use of AI.

Davide Bacciu, Antonio Carta, Claudio Gallicchio, Christoph Schmittner
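A minimal sketch of the robustness-gated fallback idea the abstract outlines: the network's proposed driving-style adjustment is only applied when a robustness score exceeds a threshold, otherwise a conservative default is used. The threshold, score, and names are assumptions for illustration, not the TEACHING implementation.

```python
# Illustrative robustness-gated fallback plan (hypothetical names and threshold).
# The network proposes a driving-style adjustment; it is applied only when a
# robustness score (e.g., derived from input-perturbation analysis) is high enough,
# otherwise a conservative default style is chosen.

CONSERVATIVE_STYLE = "comfort_default"

def select_action(proposed_style, robustness_score, threshold=0.8):
    if robustness_score >= threshold:
        return proposed_style            # model output trusted in this input region
    return CONSERVATIVE_STYLE            # fallback: avoid potentially unsafe actions

if __name__ == "__main__":
    print(select_action("sporty", robustness_score=0.93))   # -> sporty
    print(select_action("sporty", robustness_score=0.41))   # -> comfort_default
```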
Towards Dependable Integration Concepts for AI-Based Systems

AI-based methods are currently on the rise for multiple applications, so their use in autonomous or trustworthy embedded systems is discussed more and more frequently. There are various ways to leverage AI-based technologies, but since AI has the potential to affect the key system properties of trustworthy embedded systems, a far higher level of maturity and dependability is required before emerging AI-based technologies can be deployed in safety-critical domains (such as an autonomous vehicle). The TEACHING project focuses on mission-critical, energy-sensitive autonomous systems and the development of technology bricks for humanistic AI concepts. To enhance the development of these technology bricks, we intend to build a dependable engineering environment that supports the development of a self-adaptive artificial humanistic intelligence in a dependable manner. The paper establishes the body of knowledge and common ground for a workshop discussion on engineering methods and design patterns that can be used for dependable, AI-based autonomous system development. The assurance of dependability remains an open issue with no common solution yet. Therefore, an expert discussion covering all factors of the PESTEL analysis (political, environmental, social, technological, economic, and legal) should provide an updated common view of the diverse fields of expertise and an exchange between domain experts during the workshop.

Georg Macher, Romana Blazevic, Omar Veledar, Eugen Brenner

10th International Workshop on Next Generation of System Assurance Approaches for Critical Systems (SASSUR 2023)

Frontmatter
A Methodology for the Qualification of Operating Systems and Hypervisors for the Deployment in IoT Devices

In an increasingly interconnected world, where critical infrastructures strongly depend on software applications, there is a need to rely on software with demonstrated guarantees of reliability, availability, safety and security. Above all, Operating Systems (OSs) used in critical contexts must have specific characteristics to ensure the correct functioning of software applications and to protect against accidental and malicious failures that could lead to catastrophic consequences. To ensure a secure application layer, applications must run on OSs that possess specific properties, adequate quality and high robustness. This paper presents an OS qualification methodology that helps designers select an operating system (or hypervisor) suitable for a specific critical context. The methodology includes quality, safety, and security evaluations, according to the desired OS properties and the specific context of use. For each evaluation, the procedure is described through the application of different standards (e.g., ISO/IEC 25040, EN 50128, ISO 26262, ISO/IEC 15408), thus covering the necessary aspects with respect to today’s technical and regulatory needs. Finally, an application of the qualification methodology is presented, showing the safety and security evaluation of a Xen hypervisor integrated in a railway infrastructure.

Irene Bicchierai, Enrico Schiavone, Massimiliano Leone Itria, Andrea Bondavalli, Lorenzo Falai
Computer-Aided Generation of Assurance Cases

Assurance cases (ACs) have gained attention in the aerospace, medical, and other heavily-regulated industries as a means for providing structured arguments on why a product is dependable (i.e., safe, secure, etc.) for its intended application. Challenges in AC construction stem from the complexity and uniqueness of the designs, the heterogeneous nature of the required supporting evidence, and the need to assess the quality of an argument. We present an automated AC generation framework that facilitates the construction, validation, and confidence assessment of ACs based on dependability argument patterns and confidence patterns capturing domain knowledge. The ACs are instantiated with a system’s specification and evaluated based on the available design and verification evidence. Aerospace case studies illustrate the framework’s effectiveness, efficiency, and scalability.

Timothy E. Wang, Chanwook Oh, Matthew Low, Isaac Amundson, Zamira Daw, Alessandro Pinto, Massimiliano L. Chiodo, Guoqiang Wang, Saqib Hasan, Ryan Melville, Pierluigi Nuzzo
RACK: A Semantic Model and Triplestore for Curation of Assurance Case Evidence

Certification of large systems requires reasoning over complex, diverse evidential datasets to determine whether their software is fit for purpose. This requires a detailed understanding of the meaning of that data, the context in which it is valid, and the uses to which it may reasonably be put. Unfortunately, current practices for assuring software safety do not scale to accommodate modern Department of Defense (DoD) systems, resulting in unfavorable behaviors such as putting off fixes to defects until the risk of not mitigating them outweighs the high cost of re-certification. In this work, we describe a novel data curation system, RACK, that addresses cost-effective, scalable curation of diverse certification evidence to facilitate the construction of an assurance case.

Abha Moitra, Paul Cuddihy, Kit Siu, David Archer, Eric Mertens, Daniel Russell, Kevin Quick, Valentin Robert, Baoluo Meng

2nd International Workshop on Security and Safety Interaction (SENSEI 2023)

Frontmatter
Patterns for Integrating NIST 800-53 Controls into Security Assurance Cases

It is essential to ensure that critical systems are appropriately secure and protected against malicious threats. In this paper, we present a novel pattern for security assurance cases that integrates security controls from the NIST 800-53 cybersecurity standard into a comprehensive argument about system security. Our framework uses Eliminative Argumentation to increase confidence that these controls have been applied correctly by explicitly considering and addressing doubts in the argument.

Torin Viger, Simon Diemert, Olivia Foster
Analyzing Origins of Safety and Security Interactions Using Feared Events Trees and Multi-level Model

Existing approaches to analyzing safety and security are often limited to a standalone viewpoint and lack a comprehensive mapping of the propagation of concerns, both unwanted (feared events such as faults, failures, hazards, and attacks) and wanted ones (e.g., requirements, properties), and of their interplay across different granular system representations. We address this problem with a novel combination of Fault and Attack Trees (FATs), called Feared Events-Properties Trees (FEPTs), and propose an approach for analyzing safety and security interactions over a multi-level model. The multi-level model facilitates identifying safety- and security-related feared events and associated properties across different system representation levels, viz. system, sub-system, information, and component. Likewise, the FEPT allows modeling and analyzing the inter-dependencies between feared events and properties and their propagation across these levels. We illustrate the use of this approach in a simple, realistic case of trajectory planning at an intersection for autonomous Connected-Driving Vehicles (CDVs), addressing the potential interactions between safety and security.

Megha Quamara, Christina Kolb, Brahim Hamid
Utilising Redundancy to Enhance Security of Safety-Critical Systems

For many safety-critical systems, implementing modern cybersecurity protection mechanisms is hindered by legacy design and high re-certification costs. Since such systems are typically designed to be highly reliable, they usually contain a large number of redundant components used to achieve fault tolerance. In this paper, we discuss challenges in utilising redundancy inherently present in the architectures of safety-critical systems to enhance system cybersecurity protection. We consider classic redundant architectures and analyse their ability to protect against cyberattacks. By evaluating the likelihood of a successful cyberattack on a redundant architecture under different implementation conditions, we conclude that redundancy in combination with diversity has better potential to be utilised for cybersecurity protection.

Elena Troubitsyna
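A worked sketch of the probability argument behind the abstract's conclusion, under the simplifying assumption that exploits against diverse channels are independent; the numbers and the "all channels must be compromised" criterion are illustrative, not the paper's analysis.

```python
# Illustrative comparison of attack success probability for an N-channel redundant
# architecture (assuming independent exploits). Identical replicas share the same
# vulnerability, so one working exploit defeats them all; diverse replicas each need
# their own exploit before all channels are compromised.

def identical_redundancy(p_exploit, n):
    return p_exploit            # one exploit propagates to every identical channel

def diverse_redundancy(p_exploit, n):
    return p_exploit ** n       # attacker needs an independent exploit per channel

if __name__ == "__main__":
    p, n = 0.1, 3
    print("identical 3-channel:", identical_redundancy(p, n))   # 0.1
    print("diverse   3-channel:", diverse_redundancy(p, n))     # 0.001
```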

1st International Workshop on Safety/Reliability/Trustworthiness of Intelligent Transportation Systems (SRToITS 2023)

Frontmatter
Reliability Evaluation of Autonomous Transportation System Architecture Based on Markov Chain

Reliability assessment methods are widely used in the reliability evaluation of transportation networks and specialized transportation equipment. A reliability assessment of the Autonomous Transportation System (ATS) architecture can identify hidden problems and structural deficiencies before the ATS is implemented. The physical architecture of the ATS maps the functional and logical architectures to the real world; hence, a reliability assessment of the ATS architecture requires synthesizing information from all three architectures. However, current reliability assessment methods for the ATS architecture only take the logical architecture into account. To fill this gap, a Markov chain model of the physical objects of the ATS architecture is established to portray the dynamic evolution of the physical object states; the failure rate and occupancy rate of the physical objects are used to describe their reliability; and the reliability and importance of the physical objects in the ATS architecture are taken as the reliability index of the ATS architecture. The method is applied to the reliability evaluation of the Vehicle Environment Awareness Service (VEAS) architecture. The results show that the key physical objects affecting the reliability of the ATS architecture can be found by comparing physical object importance and occupancy rate, and that the reliability of the physical objects can be improved by raising the repair rate and reducing the failure rate, thereby improving the reliability of the ATS architecture.

Bingyv Shen, Guangyun Liu, Shaowu Cheng, Xiantong Li, Kui Li, Chen Liang
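A minimal sketch of the kind of per-object reliability measure such a Markov model yields, using the standard two-state (up/down) continuous-time Markov chain rather than the paper's multi-state model; the failure and repair rates below are illustrative assumptions.

```python
# Illustrative two-state Markov availability model for a single physical object:
# failure rate lam, repair rate mu; steady-state availability = mu / (lam + mu).
# Rates are made up for illustration.

def steady_state_availability(lam, mu):
    return mu / (lam + mu)

def improved(lam, mu, lam_factor=0.5, mu_factor=2.0):
    # Reducing the failure rate and raising the repair rate both raise availability,
    # mirroring the abstract's conclusion.
    return steady_state_availability(lam * lam_factor, mu * mu_factor)

if __name__ == "__main__":
    lam, mu = 1e-3, 1e-1          # per-hour failure and repair rates (illustrative)
    print("baseline :", round(steady_state_availability(lam, mu), 5))
    print("improved :", round(improved(lam, mu), 5))
```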
Uncertainty Quantification for Semantic Segmentation Models via Evidential Reasoning

Deep learning models typically render decisions based on probabilistic outputs. However, in safety-critical applications such as environment perception for autonomous vehicles, erroneous decisions made by semantic segmentation models may lead to catastrophic results. Consequently, it would be beneficial if these models could explicitly indicate the reliability of their predictions. Essentially, stakeholders expect deep learning models to convey the degree of uncertainty associated with their decisions. In this paper, we introduce EviSeg, a predictive uncertainty quantification method for semantic segmentation models based on Dempster-Shafer (DS) theory. Specifically, we extract the discriminative information, i.e., the parameters and the output features of the last convolutional layer of a semantic segmentation model. Subsequently, we map this multi-source evidence to evidential weights, thereby estimating the predictive uncertainty of the semantic segmentation model with Dempster’s rule of combination. Our proposed method does not require any changes to the model architecture, training process, or loss function, so the uncertainty quantification process does not compromise model performance. Validated on the urban road scene dataset CamVid, the proposed method improved computational efficiency by three to four times compared to the baseline method, while maintaining comparable performance. This improvement is critical for real-time applications.

Rui Wang, Mengying Wang, Ci Liang, Zhouxian Jiang
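A small implementation of Dempster's rule of combination, the generic evidence-fusion step the abstract names; the frame of discernment and mass values are illustrative, and this is not the paper's EviSeg pipeline.

```python
# Dempster's rule of combination for two mass functions over a frame of discernment.
# Mass functions map subsets of the frame (frozensets) to masses; the conflict K is
# the total mass assigned to contradictory pairs and is normalized away.

def dempster_combine(m1, m2):
    combined, conflict = {}, 0.0
    for b, mb in m1.items():
        for c, mc in m2.items():
            inter = b & c
            if inter:
                combined[inter] = combined.get(inter, 0.0) + mb * mc
            else:
                conflict += mb * mc
    if conflict >= 1.0:
        raise ValueError("total conflict: sources cannot be combined")
    return {a: v / (1.0 - conflict) for a, v in combined.items()}, conflict

if __name__ == "__main__":
    frame = frozenset({"road", "car"})
    m1 = {frozenset({"road"}): 0.7, frame: 0.3}                       # evidence source 1
    m2 = {frozenset({"road"}): 0.6, frozenset({"car"}): 0.2, frame: 0.2}  # evidence source 2
    fused, k = dempster_combine(m1, m2)
    print("conflict:", round(k, 3))
    for a, v in fused.items():
        print(set(a), round(v, 3))
```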
Research on the Reliability of High-Speed Railway Dispatching and Commanding Personnel with Multi Physiological Signals

In the event of equipment failures, traffic accidents, natural disasters and other abnormal situations, timely emergency handling by the traffic dispatcher is required. In order to accurately evaluate the human reliability of high-speed railway traffic dispatchers in emergency scenarios, this paper proposes a reliability analysis method based on the Phoenix model. To eliminate the dependence of traditional human reliability analysis methods on expert experience, a quantification method based on multiple physiological signals is designed. The paper also gives a specific application of this method to the case of an inbound signal machine failure. With this human reliability analysis method, the human reliability of the traffic dispatcher and the causative behavior with the highest probability of failure can be accurately calculated, providing a reference for the improvement of emergency handling protocols.

Liuxing Hu, Wei Zheng
Research on Brain Load Prediction Based on Machine Learning for High-Speed Railway Dispatching

In this paper, multimodal physiological data of the operator during emergency-scenario handling are collected and processed on a simulation experiment platform. For the ECG signal acquired by the ECG sensor, noise is eliminated using a stationary wavelet transform, R-waves are then labeled with a differential algorithm to obtain the HRV waveform, and time-domain, frequency-domain and nonlinear features are extracted. For the multi-channel brainwave signal acquired by the EEG test system, electrode positioning, potential re-referencing, filtering and noise removal are first performed using the EEGLAB toolkit. For the eye-movement data collected by the eye tracker, the subject’s fixation behavior is extracted using a position-distance threshold algorithm, and the fixation frequency and mean fixation time are calculated, together with the mean and standard deviation of the pupil diameter, as the features of the eye-movement dimension. For the regression prediction, a feature selection method based on an entropy criterion is proposed. The results show that the feature-selected dataset achieves better performance in the regression prediction of the SVR model compared with the original feature set.

Dandan Bi, Wei Zheng, Xiaorong Meng
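A minimal sketch of the regression step, using mutual information (an entropy-based criterion) as a stand-in for the paper's feature selection method, followed by an SVR; the data is synthetic and replaces the ECG/EEG/eye-movement features described in the abstract.

```python
# Illustrative entropy-based feature selection + SVR regression (synthetic data).
import numpy as np
from sklearn.feature_selection import SelectKBest, mutual_info_regression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))                       # 20 candidate physiological features
y = 2.0 * X[:, 0] + X[:, 3] - X[:, 7] + rng.normal(scale=0.2, size=200)  # workload label

model = make_pipeline(
    StandardScaler(),
    SelectKBest(mutual_info_regression, k=5),        # keep the 5 most informative features
    SVR(kernel="rbf", C=1.0),
)
scores = cross_val_score(model, X, y, cv=5, scoring="r2")
print("mean R^2 with feature selection:", round(float(scores.mean()), 3))
```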
Paired Safety Rule Structure for Human-Machine Cooperation with Feature Update and Evolution

Autonomous control systems are used in open environments where humans are present. Therefore, a safety design needs to correspond to evolutions and changes in the behavior of humans and machines in an open, changing environment. In this study, we propose a structure and derivation method for safety rules based on a pairing structure for the cooperation of humans and machines, which can accommodate feature updates and evolutions in the behavior of humans and machines. For a feature update, feature trees based on software product line methods correspond to the evolution of human and machine behavior through the pairing safety rule structure. The results of a case study simulating autonomous driving systems and pedestrians in a city show that the proposed safety rule structure can facilitate rule switching when features change. The results also show that human-machine cooperation efficiency can be improved, and safety maintained, by operating according to the change of safety rules under the proposed structure when the behavior of pedestrians and autonomous vehicles evolves.

Satoshi Otsuka, Natsumi Watanabe, Takehito Ogata, Donato Di Paola, Daniel Hillen, Joshua Frey, Nishanth Laxman, Jan Reich
Towards an Effective Generation of Functional Scenarios for AVs to Guide Sampling

Numerous methods have been developed for testing Connected and Automated Vehicles (CAVs). The scenario-based approach is considered the most promising, as it reduces the number of scenarios required to certify the CAV system. In this study, we propose a refined six-step methodology that includes two additional steps to compute a criticality index for scenarios and use it to guide the sampling process. The methodology starts with the generation of functional scenarios using a 5-layer ontology. Next, driving data is processed to determine the criticality indices of the functional scenarios, using a Latent Dirichlet Allocation technique and a Least Mean Squares method. Finally, the sampling process is built on a scenario reduction based on clustering and a specific metric related to the a priori criticality indices. Overall, our refined approach enhances the scenario-based methodology by incorporating criticality indices to guide the sampling process, which can drastically reduce the number of scenarios needed for certification of CAV systems.

Hugues Blache, Pierre-Antoine Laharotte, Nour-Eddin El Faouzi
Rear-End Collision Risk Analysis for Autonomous Driving

Since there will be a mix of automated vehicles (AVs) and human-driven vehicles (HVs) on future roadways, collisions between the two matter: many existing studies have investigated collisions where an AV hits an HV from behind, but few have focused on scenarios where an HV hits an AV from behind (called HV-AV collisions). In this paper, we investigate the HV-AV collision risk in the Stop-in-Lane (SiL) scenario. To this end, a Human-like Brake (HLB) model is first proposed to simulate driver brake control. In particular, the joint distribution of Off-Road-Glance and Time-Headway is introduced to simulate the glance distraction of drivers during dynamic vehicle control. Subsequently, a case study of HV-AV collisions in the SiL scenario of autonomous driving (AD) is conducted based on the HLB model, to reveal how the collision probability changes with respect to various parameters. The results of the case study provide an in-depth understanding of the dynamic driving conditions that lead to rear-end collisions in the SiL scenario.

Ci Liang, Mohamed Ghazel, Yusheng Ci, Nour-Eddin El Faouzi, Rui Wang, Wei Zheng
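A simplified Monte Carlo sketch of how glance distraction and time headway can be turned into a collision probability in a Stop-in-Lane setting; the distributions, parameters, and collision criterion are invented for illustration and are not the paper's HLB model.

```python
# Illustrative Monte Carlo estimate of rear-end collision probability: the following
# human-driven vehicle "collides" if its off-road glance plus brake reaction time
# exceeds the available time headway to the stopped AV. All distributions are made up.

import random

def collision_probability(n=100_000, brake_reaction=0.8, seed=1):
    random.seed(seed)
    collisions = 0
    for _ in range(n):
        time_headway = random.lognormvariate(0.5, 0.4)                       # seconds
        glance = random.expovariate(1.0) if random.random() < 0.2 else 0.0   # off-road glance
        if glance + brake_reaction > time_headway:
            collisions += 1
    return collisions / n

if __name__ == "__main__":
    print("estimated collision probability:", collision_probability())
```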
Improving Road Traffic Safety and Performance–Barriers and Directions Towards Cooperative Automated Vehicles

The complexity of deploying automated vehicles (AVs) has been grossly underestimated, and vehicles at high levels of automated driving (SAE level 4 and above) have so far only been deployed in very limited areas. Highly automated AVs will face complex traffic, e.g., due to occlusions and unpredictable road-user behaviour, and AVs may shift the distribution of crashes. This has given rise to a renewed interest in connectivity and collaboration, with the need to monitor (emerging) behaviours and risk, and the promise to improve road traffic safety and performance by resolving the “information gap”. This motivates further investigation and research in this direction. In this paper we set out to identify barriers and important directions towards solutions for such collaborative systems, as formed by connected automated vehicles and a supporting cyber-physical infrastructure. Drawing upon a state-of-the-art assessment and interactions with experts, we conclude that the current state of the art is fragmented. We therefore investigate key topics related to collaboration barriers and propose research questions to address them, hoping that the provided structure can also assist in guiding future research. The topics cover (i) the socio-technical and system-of-systems nature of collaborative systems, (ii) the multifaceted design space and architectures with related trade-offs for such systems, including between safety, performance and cost, and (iii) trustworthiness issues, ranging from safety and cybersecurity to privacy and ethics.

Gianfilippo Fornaro, Martin Törngren

6th International Workshop on Artificial Intelligence Safety Engineering (WAISE 2023)

Frontmatter
A Group-Level Learning Approach Using Logistic Regression for Fairer Decisions

Decision-making algorithms are becoming intertwined with every aspect of society. As we automate tasks whose outcomes affect individuals’ lives, the need to assess and understand the ethical consequences of these processes becomes vital. With bias often originating from a dataset’s imbalanced group distributions, we propose a novel approach to in-processing fairness techniques by considering training at the group level. Adapting the standard training process of logistic regression, our approach aggregates coefficient derivatives at the group level to produce fairer outcomes. We demonstrate on two real-world datasets that our approach gives groups more equal weight in defining the model parameters and shows potential to reduce unfairness disparities in group-imbalanced data. Our experimental results indicate a stronger influence on improving fairness for binary sensitive attributes, which may prove beneficial in constructing fair algorithms that reduce biases in decision-making practices. While our group-level approach achieves less fair results than current state-of-the-art directly optimized fairness techniques, we primarily observe improved fairness over fairness-agnostic models. We therefore consider this novel approach a small but crucial step towards developing new methods for fair decision-making algorithms.

Marc Elliott, Deepak P.
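A minimal sketch of one plausible reading of the group-level idea: logistic regression trained by gradient descent where gradients are computed per sensitive group and averaged with equal weight, so a minority group is not drowned out by the majority. This is an illustrative simplification, not the paper's exact algorithm.

```python
# Illustrative group-level gradient aggregation for logistic regression (synthetic data).
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def group_level_fit(X, y, groups, lr=0.1, epochs=500):
    w = np.zeros(X.shape[1])
    group_ids = np.unique(groups)
    for _ in range(epochs):
        grad = np.zeros_like(w)
        for g in group_ids:
            idx = groups == g
            err = sigmoid(X[idx] @ w) - y[idx]
            grad += X[idx].T @ err / idx.sum()      # per-group mean gradient
        w -= lr * grad / len(group_ids)             # each group contributes equally
    return w

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(300, 3))
    groups = np.array([0] * 250 + [1] * 50)         # imbalanced sensitive attribute
    y = (X[:, 0] + 0.5 * groups + rng.normal(scale=0.3, size=300) > 0).astype(float)
    print("weights:", group_level_fit(X, y, groups).round(2))
```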
Conformal Prediction and Uncertainty Wrapper: What Statistical Guarantees Can You Get for Uncertainty Quantification in Machine Learning?

With the increasing use of Artificial Intelligence (AI), the dependability of AI-based software components becomes a key factor, especially in the context of safety-critical applications. However, as current AI-based models are data-driven, there is an inherent uncertainty associated with their outcomes. Some in-model uncertainty quantification (UQ) approaches integrate techniques during model construction to obtain information about the uncertainties during inference, e.g., deep ensembles, but do not provide probabilistic guarantees. Two model-agnostic UQ approaches that both provide probabilistic guarantees are conformal prediction (CP) and uncertainty wrappers (UWs). Yet, they differ in the type of quantification they provide: CP provides sets or regions containing the intended outcome with a given probability, while UWs provide uncertainty estimates for point predictions. To investigate how well they perform compared to each other and to a baseline in-model UQ approach, we provide a side-by-side comparison based on their key characteristics. Additionally, we introduce an approach combining UWs with CP. The UQ approaches are benchmarked with respect to point uncertainty estimates and to prediction sets. Regarding point uncertainty estimates, the UW shows the best reliability, as CP was not designed for this task. For the task of providing prediction sets, the combined approach of UWs with CP outperforms the other approaches with respect to adaptivity and conditional coverage.

Lisa Jöckel, Michael Kläs, Janek Groß, Pascal Gerber
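A minimal sketch of the CP side of the comparison: split conformal prediction for classification, where calibration nonconformity scores yield a threshold and each prediction set keeps every class whose score falls below it. The stand-in model outputs are random; this illustrates the generic method, not the paper's combined UW+CP approach.

```python
# Illustrative split conformal prediction for classification.
import numpy as np

def conformal_threshold(cal_probs, cal_labels, alpha=0.1):
    n = len(cal_labels)
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]       # nonconformity scores
    q = np.ceil((n + 1) * (1 - alpha)) / n                   # finite-sample correction
    return np.quantile(scores, min(q, 1.0))

def prediction_sets(test_probs, threshold):
    # Keep every class whose nonconformity score is within the calibrated threshold.
    return [np.where(1.0 - p <= threshold)[0].tolist() for p in test_probs]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cal_probs = rng.dirichlet(np.ones(3), size=200)          # stand-in model outputs
    cal_labels = rng.integers(0, 3, size=200)
    thr = conformal_threshold(cal_probs, cal_labels, alpha=0.1)
    print(prediction_sets(rng.dirichlet(np.ones(3), size=3), thr))
```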
AIMOS: Metamorphic Testing of AI - An Industrial Application

In this paper, we present the AIMOS tool as well as the results of its application to industrial use cases. Relying on the widely used Metamorphic testing paradigm, we show how the process of verification and validation can benefit from the early testing of models’ robustness to perturbations stemming from the intended operational domain.

Augustin Lemesle, Aymeric Varasse, Zakaria Chihani, Dominique Tachet
AERoS: Assurance of Emergent Behaviour in Autonomous Robotic Swarms

The behaviours of a swarm are not explicitly engineered. Instead, they are an emergent consequence of the interactions of individual agents with each other and their environment. This emergent functionality poses a challenge to safety assurance. The main contribution of this paper is a process for the safety assurance of emergent behaviour in autonomous robotic swarms called AERoS, following the guidance on the Assurance of Machine Learning for use in Autonomous Systems (AMLAS). We explore our proposed process using a case study centred on a robot swarm operating a public cloakroom.

Dhaminda B. Abeywickrama, James Wilson, Suet Lee, Greg Chance, Peter D. Winter, Arianna Manzini, Ibrahim Habli, Shane Windsor, Sabine Hauert, Kerstin Eder
A Reasonable Driver Standard for Automated Vehicle Safety

Current “safe enough” Autonomous Vehicle (AV) metrics focus on overall safety outcomes, such as net losses across a deployed vehicle fleet using driving automation compared to net losses assuming outcomes produced by human-driven vehicles. While such metrics can provide an important report card for measuring the long-term success of the social choice to deploy driving automation systems, they provide weak support for near-term deployment decisions based on safety considerations. Potential risk redistribution onto vulnerable populations remains problematic, even if net societal harm is reduced to create a positive risk balance. We propose a baseline comparison of the outcome expected in a crash scenario from an attentive and unimpaired “reasonable human driver”, applied on a case-by-case basis to each actual loss event proximately caused by an automated vehicle. If the automated vehicle imitates the risk mitigation behaviors of the hypothetical reasonable human driver, no liability attaches for AV performance. Liability attaches if AV performance does not measure up to the human driver risk mitigation performance expected by law. This approach recognizes the importance of tort law in incentivizing developers to continually work on minimizing driving negligence by computer drivers, providing a way to close gaps left by purely statistical approaches.

Philip Koopman, William H. Widen
Structuring Research Related to Dynamic Risk Management for Autonomous Systems

Conventional safety engineering is not sufficient to deal with Artificial Intelligence (AI) and Autonomous Systems (AS). Some authors propose dynamic safety approaches to address the challenges related to AI and AS. These approaches are referred to as dynamic risk management, dynamic safety management, dynamic assurance, or runtime certification [4]. These dynamic safety approaches are related to each other, and research in this field is increasing. In this paper, we structure the research challenges and solution approaches in order to explain why dynamic risk management is needed for the dependability of autonomous systems. We present five research areas in this large research field and name concrete approaches or standardization activities for each of them. We hope the problem decomposition helps foster effective research collaboration and enables researchers to better navigate the challenges surrounding dynamic risk management.

Rasmus Adler, Jan Reich, Richard Hawkins
Towards Safe Machine Learning Lifecycles with ESG Model Cards

Machine Learning (ML) models have played a key role in many decisions that can affect society. However, the inductive and experimental nature of ML exposes it to specific risks. If the latter are not controlled, ML has the potential to wreak havoc by impacting people and the environment. In that context, Environmental, Social and Corporate Governance (ESG) is an approach used to measure a company’s sustainable behavior along those three dimensions. To develop responsible behavior, an organization should employ an ESG framework within its structure. In this paper, we propose a risk-based approach which aims to produce safe ML lifecycles. Its objective is to smoothly implement the ESG strategy throughout the ML process by identifying and mitigating risks. Based on that analysis, we present the ESG model card, a concrete tool to report the ESG impacts of the ML lifecycle, along with the actions used to reach that outcome.

Thomas Bonnier, Benjamin Bosch
Towards Deep Anomaly Detection with Structured Knowledge Representations

Machine learning models tend to only make reliable predictions for inputs that are similar to the training data. Consequently, anomaly detection, which can be used to detect unusual inputs, is critical for ensuring the safety of machine learning agents operating in open environments. In this work, we identify and discuss several limitations of current anomaly detection methods, such as their weak performance on tasks that require abstract reasoning, their inability to integrate background knowledge, and the opaqueness that undermines their trustworthiness in critical applications. Furthermore, we propose an architecture for anomaly detection models that aims to integrate structured knowledge representations to address these limitations. Our hypothesis is that this approach can improve performance and robustness, reduce the required resources (such as data and computation), and provide a higher degree of transparency. As a result, our work contributes to the increased safety of machine learning systems. Our code is publicly available. ( https://github.com/kkirchheim/sumnist )

Konstantin Kirchheim
Evaluating and Increasing Segmentation Robustness in CARLA

Model robustness is a crucial property in safety-critical applications such as autonomous driving and medical diagnosis. In this paper, we use the CARLA simulation environment to evaluate the robustness of various architectures for semantic segmentation to adverse environmental changes. Contrary to previous work, the environmental changes that we test the models against are not applied to existing images, but rendered directly in the simulation, enabling more realistic robustness tests. Surprisingly, we find that Transformers provide only slightly increased robustness compared to some CNNs. Furthermore, we demonstrate that training on a small set of adverse samples can significantly improve the robustness of most models. The code and supplementary results for our experiments are available online ( https://github.com/venkatesh-thiru/weather-robustness ).

Venkatesh Thirugnana Sambandham, Konstantin Kirchheim, Frank Ortmeier
Safety Integrity Levels for Artificial Intelligence

Artificial Intelligence (AI) and Machine Learning (ML) technologies are rapidly being adopted to perform safety-related tasks in critical systems. These AI-based systems pose significant challenges, particularly regarding their assurance. Existing safety approaches defined in internationally recognized standards such as ISO 26262, DO-178C, UL 4600, EN 50126, and IEC 61508 do not provide detailed guidance on how to assure AI-based systems. For conventional (non-AI) systems, these standards adopt a ‘Level of Rigor’ (LoR) approach, where increasingly demanding engineering activities are required as risk associated with the system increases. This paper proposes an extension to existing LoR approaches, which considers the complexity of the task(s) being performed by an AI-based component. Complexity is assessed in terms of input entropy and output non-determinism, and then combined with the allocated Safety Integrity Level (SIL) to produce an AI-SIL. That AI-SIL may be used to identify appropriate measures and techniques for the development and verification of the system. The proposed extension is illustrated by examples from the automotive, aviation, and medical industries.

Simon Diemert, Laure Millet, Jonathan Groves, Jeff Joyce
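A minimal sketch of the combination step the abstract describes: a task-complexity class is derived from input entropy and output non-determinism and then combined with the allocated SIL via a lookup table to yield an AI-SIL. The thresholds and table entries here are hypothetical placeholders, not the paper's proposed values.

```python
# Illustrative AI-SIL derivation: complexity class from input entropy and output
# non-determinism, combined with the allocated SIL via a lookup table.
# All thresholds and table values are assumptions for illustration.

def complexity_class(input_entropy_bits, output_nondeterminism):
    high_input = input_entropy_bits > 20          # e.g., open-world camera input
    high_output = output_nondeterminism > 0.5     # e.g., many acceptable outputs
    return {(False, False): "low",
            (True, False): "medium",
            (False, True): "medium",
            (True, True): "high"}[(high_input, high_output)]

AI_SIL_TABLE = {                                  # (allocated SIL, complexity) -> AI-SIL
    (1, "low"): "AI-SIL 1", (1, "medium"): "AI-SIL 1", (1, "high"): "AI-SIL 2",
    (2, "low"): "AI-SIL 2", (2, "medium"): "AI-SIL 2", (2, "high"): "AI-SIL 3",
    (3, "low"): "AI-SIL 3", (3, "medium"): "AI-SIL 3", (3, "high"): "AI-SIL 4",
    (4, "low"): "AI-SIL 4", (4, "medium"): "AI-SIL 4", (4, "high"): "AI-SIL 4",
}

if __name__ == "__main__":
    c = complexity_class(input_entropy_bits=32, output_nondeterminism=0.7)
    print(c, "->", AI_SIL_TABLE[(3, c)])
```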
Can Large Language Models Assist in Hazard Analysis?

Large Language Models (LLMs), such as GPT-3, have demonstrated remarkable natural language processing and generation capabilities and have been applied to a variety of tasks, such as source code generation. This paper explores the potential of integrating LLMs into hazard analysis for safety-critical systems, a process which we refer to as co-hazard analysis (CoHA). In CoHA, a human analyst interacts with an LLM via a context-aware chat session and uses the responses to support elicitation of possible hazard causes. In a preliminary experiment, we explore CoHA with three increasingly complex versions of a simple system, using OpenAI’s ChatGPT service. The quality of ChatGPT’s responses was systematically assessed to determine the feasibility of CoHA given the current state of LLM technology. The results suggest that LLMs may be useful for supporting human analysts performing hazard analysis.

Simon Diemert, Jens H. Weber
Contextualised Out-of-Distribution Detection Using Pattern Identification

In this work, we propose CODE, an extension of existing work from the field of explainable AI that identifies class-specific recurring patterns to build a robust Out-of-Distribution (OoD) detection method for visual classifiers. CODE does not require any classifier retraining and is OoD-agnostic, i.e., tuned directly to the training dataset. Crucially, pattern identification allows us to provide images from the In-Distribution (ID) dataset as reference data to provide additional context to the confidence scores. In addition, we introduce a new benchmark based on perturbations of the ID dataset that provides a known and quantifiable measure of the discrepancy between the ID and OoD datasets serving as a reference value for the comparison between OoD detection methods.

Romain Xu-Darme, Julien Girard-Satabin, Darryl Hond, Gabriele Incorvaia, Zakaria Chihani
Backmatter
Metadata
Title
Computer Safety, Reliability, and Security. SAFECOMP 2023 Workshops
Editors
Jérémie Guiochet
Stefano Tonetta
Erwin Schoitsch
Matthieu Roy
Friedemann Bitsch
Copyright Year
2023
Electronic ISBN
978-3-031-40953-0
Print ISBN
978-3-031-40952-3
DOI
https://doi.org/10.1007/978-3-031-40953-0
