2022 | Open Access | Book

# Deep Neural Networks and Data for Automated Driving

## Robustness, Uncertainty Quantification, and Insights Towards Safety

Editors: Prof. Tim Fingscheidt, Prof. Hanno Gottschalk, Prof. Sebastian Houben

This open access book brings together the latest developments from industry and research on automated driving and artificial intelligence.

Environment perception for highly automated driving heavily employs deep neural networks, which raises many challenges. How much data do we need for training and testing? How can we use synthetic data to save labeling costs for training? How do we increase robustness and decrease memory usage? For inevitably poor conditions: How do we know that the network is uncertain about its decisions? Can we understand a bit more about what actually happens inside neural networks? This leads to a very practical problem, particularly for DNNs employed in automated driving: What are useful validation techniques, and what about safety?

This book unites the views from both academia and industry, where computer vision and machine learning meet environment perception for highly automated driving. Naturally, aspects of data, robustness, uncertainty quantification, and, last but not least, safety are at the core of it. This book is unique: In its first part, an extended survey of all the relevant aspects is provided. The second part contains the detailed technical elaboration of the various questions mentioned above.

#### Safe AI—An Overview

Open Access

##### Inspect, Understand, Overcome: A Survey of Practical Methods for AI Safety
Abstract
Deployment of modern data-driven machine learning methods, most often realized by deep neural networks (DNNs), in safety-critical applications such as health care, industrial plant control, or autonomous driving is highly challenging due to numerous model-inherent shortcomings. These shortcomings are diverse and range from a lack of generalization over insufficient interpretability and implausible predictions to directed attacks by means of malicious inputs. Cyber-physical systems employing DNNs are therefore likely to suffer from so-called safety concerns, properties that preclude their deployment as no argument or experimental setup can help to assess the remaining risk. In recent years, an abundance of state-of-the-art techniques aiming to address these safety concerns has emerged. This chapter provides a structured and broad overview of them. We first identify categories of insufficiencies to then describe research activities aiming at their detection, quantification, or mitigation. Our work addresses machine learning experts and safety engineers alike: The former ones might profit from the broad range of machine learning topics covered and discussions on limitations of recent methods. The latter ones might gain insights into the specifics of modern machine learning methods. We hope that this contribution fuels discussions on desiderata for machine learning systems and strategies on how to help to advance existing approaches accordingly.
Sebastian Houben, Stephanie Abrecht, Maram Akila, Andreas Bär, Felix Brockherde, Patrick Feifel, Tim Fingscheidt, Sujan Sai Gannamaneni, Seyed Eghbal Ghobadi, Ahmed Hammam, Anselm Haselhoff, Felix Hauser, Christian Heinzemann, Marco Hoffmann, Nikhil Kapoor, Falk Kappel, Marvin Klingner, Jan Kronenberger, Fabian Küppers, Jonas Löhdefink, Michael Mlynarski, Michael Mock, Firas Mualla, Svetlana Pavlitskaya, Maximilian Poretschkin, Alexander Pohl, Varun Ravi-Kumar, Julia Rosenzweig, Matthias Rottmann, Stefan Rüping, Timo Sämann, Jan David Schneider, Elena Schulz, Gesina Schwalbe, Joachim Sicking, Toshika Srivastava, Serin Varghese, Michael Weber, Sebastian Wirkert, Tim Wirtz, Matthias Woehrle

#### Recent Advances in Safe AI for Automated Driving

Open Access

##### Does Redundancy in AI Perception Systems Help to Test for Super-Human Automated Driving Performance?
Abstract
While automated driving is often advertised as offering better-than-human driving performance, this chapter reviews why it is nearly impossible to provide direct statistical evidence at the system level that this is actually the case. The amount of labeled data needed would exceed present-day technical and economic capabilities. A commonly used strategy is therefore to rely on redundancy, together with proof of sufficient performance of the subsystems. As is well known, this strategy is especially efficient when the subsystems operate independently, i.e., when the occurrence of errors is independent in a statistical sense. Here, we give some first considerations and experimental evidence that this strategy is not a free ride, as the errors of neural networks fulfilling the same computer vision task are, at least in some cases, correlated. This remains true if training data, architecture, and training are kept separate, or if independence is trained using special loss functions. Using data from different sensors (realized by up to five 2D projections of the 3D MNIST dataset) in our experiments reduces correlations more efficiently, but not to an extent that realizes the potential reduction in testing data that can be obtained for redundant and statistically independent subsystems.
Hanno Gottschalk, Matthias Rottmann, Maida Saltagic
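
To make the role of statistical independence concrete, the following is a minimal sketch and not the chapter's own analysis. It assumes two redundant subsystems with the same error rate `p` whose error indicators have Pearson correlation `rho`; the function name and the example numbers are hypothetical. Under these assumptions the probability that both fail on the same input is `p*p + rho*p*(1 - p)`.

```python
def joint_failure_probability(p: float, rho: float) -> float:
    """Probability that two subsystems fail on the same input.

    Assumes both subsystems have error rate p and that their Bernoulli
    error indicators have Pearson correlation rho, which gives
    P(both fail) = p^2 + rho * p * (1 - p).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    return p * p + rho * p * (1.0 - p)


# Hypothetical numbers: a 1% error rate per subsystem.
independent = joint_failure_probability(0.01, 0.0)  # errors independent
correlated = joint_failure_probability(0.01, 0.1)   # mildly correlated errors
print(independent, correlated)
```

With these illustrative values, even a small correlation makes the joint failure rate roughly ten times larger than in the independent case, which is why correlated errors erode the testing-data savings that redundancy would otherwise allow.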

Open Access

##### Analysis and Comparison of Datasets by Leveraging Data Distributions in Latent Spaces
Abstract
Automated driving is widely seen as one of the areas where key innovations are driven by the application of deep learning. The development of safe and robust deep neural network (DNN) functions requires new validation methods. A core insufficiency of DNNs is the lack of generalization for out-of-distribution datasets. One path to overcome this insufficiency is through the analysis and comparison of the domains of training and test datasets. This is important because otherwise, deep learning cannot advance automated driving. Variational autoencoders (VAEs) are able to extract meaningful encodings from datasets in their latent space. This chapter examines various methods based on these encodings and presents a broad evaluation on different automotive datasets and potential domain shifts, such as weather changes or new locations. The methods used are based on the distance to the nearest neighbors between datasets and leverage several network architectures and metrics. Several experiments with different domain shifts on different datasets are conducted and compared with a reconstruction-based method. The results show that the presented methods can be a promising alternative to the reconstruction error for detecting automotive-relevant domain shifts between different datasets. It is also shown that VAE loss variants that focus on a disentangled latent space can improve the stability of the domain shift detection quality. The best results were achieved with nearest neighbor methods using VAE and JointVAE, a VAE variant with a discrete and a continuous latent space, in combination with a metric based on the well-known z-score, and with the NVAE, a VAE variant with optimizations regarding reconstruction quality, in combination with the deterministic reconstruction error.
Hanno Stage, Lennart Ries, Jacob Langner, Stefan Otten, Eric Sax
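
The idea of combining nearest-neighbor distances in a latent space with a z-score can be sketched as follows. This is an illustrative assumption about how such a score might be formed, not the chapter's exact metric: latent codes are plain tuples rather than VAE encodings, and the function names are hypothetical.

```python
import math
import statistics


def nearest_neighbor_distance(point, reference):
    """Euclidean distance from point to its closest neighbor in reference."""
    return min(math.dist(point, r) for r in reference)


def domain_shift_z_score(train_codes, test_codes):
    """Z-score of the mean test-to-train nearest-neighbor distance.

    The baseline is the leave-one-out nearest-neighbor distance inside the
    training set; a large positive score hints at a domain shift.
    """
    baseline = [
        nearest_neighbor_distance(p, train_codes[:i] + train_codes[i + 1:])
        for i, p in enumerate(train_codes)
    ]
    mu = statistics.mean(baseline)
    sigma = statistics.stdev(baseline)
    cross = [nearest_neighbor_distance(p, train_codes) for p in test_codes]
    return (statistics.mean(cross) - mu) / sigma


# Hypothetical 2D latent codes standing in for encoder outputs.
train = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.5), (2.0, 2.0)]
shifted = [(10.0, 10.0), (11.0, 10.0)]
print(domain_shift_z_score(train, shifted))
```

In practice the codes would come from the encoder of a trained VAE, and the threshold on the score would have to be chosen per dataset.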

Open Access

##### Optimized Data Synthesis for DNN Training and Validation by Sensor Artifact Simulation
Abstract
Synthetic, i.e., computer-generated imagery (CGI) data is a key component for training and validating deep-learning-based perceptive functions due to its ability to simulate rare cases, avoidance of privacy issues, and generation of pixel-accurate ground truth data. Today, physically based rendering (PBR) engines already simulate a wealth of realistic optical effects but are mainly focused on the human perception system. The perceptive functions, in contrast, require realistic images modeled with sensor artifacts as close as possible to those of the sensor with which the training data has been recorded. This chapter proposes a way to improve the data synthesis process by application of realistic sensor artifacts. To do this, one has to overcome the domain distance between real-world imagery and the synthetic imagery. Therefore, we propose a measure which captures the generalization distance between two distinct datasets on which the same model has been trained. With this measure, the data synthesis pipeline can be improved to produce realistic sensor-simulated images which are closer to the real-world domain. The proposed measure is based on the Wasserstein distance (earth mover’s distance, EMD) over the performance metric mean intersection-over-union (mIoU) on a per-image basis, comparing synthetic and real datasets using deep neural networks (DNNs) for semantic segmentation. This measure is subsequently used to match the characteristics of a real-world camera for the image synthesis pipeline, which considers realistic sensor noise and lens artifacts. Comparing the measure with the well-established Fréchet inception distance (FID) on real and artificial datasets demonstrates the ability to interpret the generalization distance, which is inherently asymmetric and more informative than just a simple distance measure. Furthermore, we use the metric as an optimization criterion to adapt a synthetic dataset to a real dataset, decreasing the EMD between a synthetic dataset and the Cityscapes dataset from 32.67 to 27.48 and increasing the mIoU of our test algorithm (DeeplabV3+) from 40.36% to 47.63%.
Korbinian Hagn, Oliver Grau
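
As a rough illustration of a Wasserstein distance over per-image scores, the sketch below computes the one-dimensional earth mover's distance between two equally sized lists of per-image mIoU values. The scores are invented for the example, the function name is hypothetical, and the sketch does not reproduce how the chapter turns this distance into its asymmetric generalization measure.

```python
def wasserstein_1d(scores_a, scores_b):
    """1D Wasserstein (earth mover's) distance between two samples.

    For two empirical samples of equal size, the distance is the mean
    absolute difference between the sorted values.
    """
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("samples must be non-empty and of equal size")
    pairs = zip(sorted(scores_a), sorted(scores_b))
    return sum(abs(a - b) for a, b in pairs) / len(scores_a)


# Hypothetical per-image mIoU values (in percent) of one segmentation model
# evaluated on a real and on a synthetic dataset.
miou_real = [62.0, 55.5, 70.1, 48.3]
miou_synthetic = [41.2, 38.9, 52.4, 30.0]
print(wasserstein_1d(miou_real, miou_synthetic))
```

Comparing whole score distributions instead of only their means keeps information about how the per-image performance is spread, which a single averaged mIoU would hide.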

Open Access

##### Improved DNN Robustness by Multi-task Training with an Auxiliary Self-Supervised Task
Abstract
Marvin Klingner, Tim Fingscheidt

Open Access

##### Improving Transferability of Generated Universal Adversarial Perturbations for Image Classification and Segmentation
Abstract
Although deep neural networks (DNNs) are high-performance methods for various complex tasks, e.g., environment perception in automated vehicles (AVs), they are vulnerable to adversarial perturbations. Recent works have proven the existence of universal adversarial perturbations (UAPs), which, when added to most images, destroy the output of the respective perception function. Existing attack methods often show a low success rate when attacking target models which are different from the one that the attack was optimized on. To address such weak transferability, we propose a novel learning criterion by combining a low-level feature loss, addressing the similarity of feature representations in the first layer of various model architectures, with a cross-entropy loss. Experimental results on ImageNet and Cityscapes datasets show that our method effectively generates universal adversarial perturbations achieving state-of-the-art fooling rates across different models, tasks, and datasets. Due to their effectiveness, we propose the use of such novel generated UAPs in robustness evaluation of DNN-based environment perception functions for AVs.
Atiye Sadat Hashemi, Andreas Bär, Saeed Mozaffari, Tim Fingscheidt

Open Access

##### Invertible Neural Networks for Understanding Semantics of Invariances of CNN Representations
Abstract
To tackle increasingly complex tasks, it has become an essential ability of neural networks to learn abstract representations. These task-specific representations and, particularly, the invariances they capture turn neural networks into black-box models that lack interpretability. To open such a black box, it is, therefore, crucial to uncover the different semantic concepts a model has learned as well as those that it has learned to be invariant to. We present an approach based on invertible neural networks (INNs) that (i) recovers the task-specific, learned invariances by disentangling the remaining factor of variation in the data and that (ii) invertibly transforms these recovered invariances combined with the model representation into an equally expressive one with accessible semantic concepts. As a consequence, neural network representations become understandable by providing the means to (i) expose their semantic meaning, (ii) semantically modify a representation, and (iii) visualize individual learned semantic concepts and invariances. Our invertible approach significantly extends the abilities to understand black-box models by enabling post hoc interpretations of state-of-the-art networks without compromising their performance. Our implementation is available at https://compvis.github.io/invariances/.
Robin Rombach, Patrick Esser, Andreas Blattmann, Björn Ommer

Open Access

##### Confidence Calibration for Object Detection and Segmentation
Abstract
Calibrated confidence estimates obtained from neural networks are crucial, particularly for safety-critical applications such as autonomous driving or medical image diagnosis. However, although the task of confidence calibration has been investigated on classification problems, thorough investigations on object detection and segmentation problems are still missing. Therefore, we focus on the investigation of confidence calibration for object detection and segmentation models in this chapter. We introduce the concept of multivariate confidence calibration, which is an extension of well-known calibration methods to the task of object detection and segmentation. This allows for an extended confidence calibration that is also aware of additional features such as bounding box/pixel position and shape information. Furthermore, we extend the expected calibration error (ECE) to measure miscalibration of object detection and segmentation models. We examine several network architectures on MS COCO as well as on Cityscapes and show that especially object detection and instance segmentation models are intrinsically miscalibrated given the introduced definition of calibration. Using our proposed calibration methods, we have been able to improve calibration in a way that also has a positive impact on the quality of segmentation masks.
Fabian Küppers, Anselm Haselhoff, Jan Kronenberger, Jonas Schneider
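
For readers unfamiliar with the expected calibration error, the following sketch shows the standard binned ECE as commonly used for classification. It is not the chapter's extension to object detection and segmentation, which additionally conditions on features such as box position and shape; the function name and the sample values are hypothetical.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Standard binned ECE: weighted gap between confidence and accuracy.

    confidences: predicted confidence per sample, each in [0, 1].
    correct:     1 if the corresponding prediction was right, else 0.
    """
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in last bin
        bins[index].append((conf, hit))

    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_confidence = sum(c for c, _ in members) / len(members)
        accuracy = sum(h for _, h in members) / len(members)
        ece += len(members) / total * abs(avg_confidence - accuracy)
    return ece


# Hypothetical detector that is overconfident: 90% confidence, 75% accuracy.
print(expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 1, 1, 0]))
```

A perfectly calibrated model yields an ECE of zero, because within every bin the average confidence matches the observed accuracy.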

Open Access

##### Uncertainty Quantification for Object Detection: Output- and Gradient-Based Approaches
Abstract
Safety-critical applications of deep neural networks require reliable confidence estimation methods with high predictive power. However, evaluating and comparing different methods for uncertainty quantification is oftentimes highly context-dependent. In this chapter, we introduce flexible evaluation protocols which are applicable to a wide range of tasks with an emphasis on object detection. In this light, we investigate uncertainty metrics based on the network output, as well as metrics based on a learning gradient, both of which significantly outperform the confidence score of the network. While output-based uncertainty is produced by post-processing steps and is computationally efficient, gradient-based uncertainty, in principle, allows for localization of uncertainty within the network architecture. We show that both sources of uncertainty are mutually non-redundant and can be combined beneficially. Furthermore, we show direct applications of uncertainty quantification by improving detection accuracy.
Tobias Riedlinger, Marius Schubert, Karsten Kahl, Matthias Rottmann

Open Access

##### Detecting and Learning the Unknown in Semantic Segmentation
Abstract
Semantic segmentation is a crucial component for perception in automated driving. Deep neural networks (DNNs) are commonly used for this task, and they are usually trained on a closed set of object classes appearing in a closed operational domain. However, this contrasts with the open-world setting of automated driving in which DNNs are deployed. Therefore, DNNs necessarily face data that they have never encountered previously, also known as anomalies, and coping with them properly is extremely safety-critical. In this chapter, we first give an overview of anomalies from an information-theoretic perspective. Next, we review research in detecting unknown objects in semantic segmentation. We present a method outperforming recent approaches by training for high entropy responses on anomalous objects, which is in line with our theoretical findings. Finally, we propose a method to assess the occurrence frequency of anomalies in order to select anomaly types to include in a model’s set of semantic categories. We demonstrate that those anomalies can then be learned in an unsupervised fashion, which is particularly suitable in online applications.
Robin Chan, Svenja Uhlemeyer, Matthias Rottmann, Hanno Gottschalk
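
The notion of a high entropy response can be illustrated with a per-pixel anomaly score. The sketch below is a generic softmax-entropy score under assumed logits, not the chapter's training method; the function names and the example logits are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def normalized_entropy(probs):
    """Shannon entropy of a class distribution, scaled to [0, 1]."""
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return entropy / math.log(len(probs))


# Hypothetical logits of a 4-class segmentation network for two pixels.
known_object = softmax([10.0, 0.0, 0.0, 0.0])  # confident prediction
anomaly = softmax([0.0, 0.0, 0.0, 0.0])        # network has no preference
print(normalized_entropy(known_object), normalized_entropy(anomaly))
```

A pixel whose score is close to 1 receives a nearly uniform class distribution, which is the response a network trained for high entropy on anomalous objects is meant to produce.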

Open Access

##### Evaluating Mixture-of-Experts Architectures for Network Aggregation
Abstract
The mixture-of-experts (MoE) architecture is an approach to aggregate several expert components via an additional gating module, which learns to predict the most suitable distribution of the experts’ outputs for each input. An MoE thus not only relies on redundancy for increased robustness; we also demonstrate how this architecture can provide additional interpretability while retaining performance similar to a standalone network. As an example, we train expert networks to perform semantic segmentation of traffic scenes and combine them into an MoE with an additional gating network. Our experiments with two different expert model architectures (FRRN and DeepLabv3+) reveal that the MoE is able to reach, and for certain data subsets even surpass, the baseline performance and also outperforms a simple aggregation via ensembling. A further advantage of an MoE is the increased interpretability: a comparison of the pixel-wise predictions of the whole MoE model and of the participating experts helps to identify regions of high uncertainty in an input.
Svetlana Pavlitskaya, Christian Hubschneider, Michael Weber
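
The aggregation step of a mixture of experts can be sketched for a single pixel as a gate-weighted sum of the experts' class probabilities. This is a simplified illustration with hypothetical names and values; the chapter's gating network and experts are full segmentation models, which are not reproduced here.

```python
def mixture_of_experts(expert_probs, gate_weights):
    """Combine per-expert class probabilities with gate weights.

    expert_probs: one class-probability list per expert (same length each).
    gate_weights: one non-negative weight per expert, summing to 1.
    """
    n_classes = len(expert_probs[0])
    return [
        sum(w * probs[c] for w, probs in zip(gate_weights, expert_probs))
        for c in range(n_classes)
    ]


def argmax(values):
    """Index of the largest entry."""
    return max(range(len(values)), key=values.__getitem__)


def experts_disagree(expert_probs):
    """True if the experts do not all predict the same class."""
    return len({argmax(probs) for probs in expert_probs}) > 1


# Hypothetical pixel: two experts, two classes, gate favors the first expert.
experts = [[0.9, 0.1], [0.2, 0.8]]
gate = [0.75, 0.25]
print(mixture_of_experts(experts, gate), experts_disagree(experts))
```

Pixels where the experts disagree with each other or with the combined output are natural candidates for the regions of high uncertainty mentioned in the abstract.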

Open Access

##### Safety Assurance of Machine Learning for Perception Functions
Abstract
The latest generation of safety standards applicable to automated driving systems require both qualitative and quantitative safety acceptance criteria to be defined and argued. At the same time, the use of machine learning (ML) functions is increasingly seen as a prerequisite to achieving the necessary levels of perception performance in the complex operating environments of these functions. This inevitably leads to the question of which supporting evidence must be presented to demonstrate the safety of ML-based automated driving systems. This chapter discusses the challenge of deriving suitable acceptance criteria for the ML function and describes how such evidence can be structured in order to support a convincing safety assurance case for the system. In particular, we show how a combination of methods can be used to estimate the overall machine learning performance, as well as to evaluate and reduce the impact of ML-specific insufficiencies, both during design and operation.
Simon Burton, Christian Hellert, Fabian Hüger, Michael Mock, Andreas Rohatschek

Open Access

##### A Variational Deep Synthesis Approach for Perception Validation
Abstract
This chapter introduces a novel data synthesis framework for validation of perception functions based on machine learning to ensure the safety and functionality of these systems, specifically in the context of automated driving. The main contributions are the introduction of a generative, parametric description of three-dimensional scenarios in a validation parameter space, and a layered scene generation process to reduce the computational effort. Specifically, we combine a module for probabilistic scene generation, a variation engine for scene parameters, and a more realistic sensor artifacts simulation. The work demonstrates the effectiveness of the framework for the perception of pedestrians in urban environments based on various deep neural networks (DNNs) for semantic segmentation and object detection. Our approach allows a systematic evaluation of a high number of different objects, and combined with our variational approach, we can effectively simulate and test a wide range of additional conditions, such as various illuminations. We demonstrate that our generative approach produces a better approximation of the spatial object distribution of real datasets than hand-crafted 3D scenes.
Oliver Grau, Korbinian Hagn, Qutub Syed Sha

Open Access

##### The Good and the Bad: Using Neuron Coverage as a DNN Validation Technique
Abstract
Verification and validation (V&V) is a crucial step for the certification and deployment of deep neural networks (DNNs). Neuron coverage, inspired by code coverage in software testing, has been proposed as one such V&V method. We provide a summary of different neuron coverage variants and their inspiration from traditional software engineering V&V methods. Our first experiment shows that novelty and granularity are important considerations when assessing a coverage metric. Building on these observations, we provide an illustrative example for studying the advantages of pairwise coverage over simple neuron coverage. Finally, we show that there is an upper bound of realizable neuron coverage when test data are sampled from inside the operational design domain (in-ODD) instead of the entire input space.
Sujan Sai Gannamaneni, Maram Akila, Christian Heinzemann, Matthias Woehrle
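
Simple neuron coverage, the baseline variant referred to in the abstract, can be sketched as the fraction of neurons that exceed an activation threshold for at least one test input. The activation values, the threshold, and the function name below are assumptions for illustration; the chapter's variants, such as pairwise coverage, are not shown.

```python
def neuron_coverage(activations, threshold=0.0):
    """Fraction of neurons activated above threshold by at least one input.

    activations: one list of neuron activations per test input, all of the
                 same length (for example, flattened layer outputs).
    """
    n_neurons = len(activations[0])
    covered = set()
    for sample in activations:
        for index, value in enumerate(sample):
            if value > threshold:
                covered.add(index)
    return len(covered) / n_neurons


# Hypothetical activations of four neurons for two test inputs.
test_activations = [
    [0.5, 0.0, 0.0, 0.0],
    [0.0, 0.7, 0.0, 0.0],
]
print(neuron_coverage(test_activations))
```

If some neurons can only be activated by inputs outside the operational design domain, no in-ODD test set can cover them, which is one way to read the upper bound discussed in the abstract.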

Open Access

##### Joint Optimization for DNN Model Compression and Corruption Robustness
Abstract
Modern deep neural networks (DNNs) are achieving state-of-the-art results due to their capability to learn a faithful representation of the data they are trained on. In this chapter, we address two insufficiencies of DNNs that need to be resolved to enable their safe and efficient deployment in real-time environments, namely, the lack of robustness to corruptions in the data and the lack of real-time deployment capabilities. We introduce hybrid corruption-robustness focused compression (HCRC), an approach that jointly optimizes a neural network for compression and for improved robustness to corruptions, such as the commonly observed noise and blurring artifacts. For this study, we primarily consider the task of semantic segmentation for automated driving and focus on the interactions between robustness and compression of the network. HCRC improves the robustness of the DeepLabv3+ network by 8.39% absolute mean performance under corruption (mPC) on the Cityscapes dataset, and by 2.93% absolute mPC on the Sim KI-A dataset, while generalizing even to augmentations not seen by the network in the training process. This is achieved with only minor degradations on undisturbed data. Our approach is evaluated over two strong compression ratios (30% and 50%) and consistently outperforms all considered baseline approaches. Additionally, we perform extensive ablation studies to further leverage and extend existing state-of-the-art methods.
Serin Varghese, Christoph Hümmer, Andreas Bär, Fabian Hüger, Tim Fingscheidt