

Software Quality as a Foundation for Security

16th International Conference on Software Quality, SWQD 2024, Vienna, Austria, April 23–25, 2024, Proceedings


About this book

This book constitutes the refereed proceedings of the 16th Software Quality Days Conference, SWQD 2024, held in Vienna, Austria, during April 23–25, 2024.

The Software Quality Days (SWQD) conference started in 2009 and has grown into the largest conference on software quality in Europe. The SWQD program is designed to offer a stimulating mixture of practice-oriented presentations and scientific presentations of new research topics. The guiding topic of SWQD 2024 is “Software Quality as a Foundation for Security”.

The 7 full papers and 2 short papers presented in this volume were carefully reviewed and selected from 16 submissions. The papers are organized in the following topical sections: requirements engineering; software quality; continuous integration and deployment; communication and collaboration; artificial intelligence; and security and compliance.

Table of Contents

Frontmatter

Requirements Engineering

A Process Proposal for Requirements Engineering for Third-Party Products and a Preliminary Evaluation at Munich Re
Abstract
Context: When in need of a software solution, companies of all sizes prefer buying an existing commercial off-the-shelf (COTS) product over investing the time and effort to develop and maintain their own. However, purchasing the wrong COTS solution can itself turn into a painful and company-critical process. Problem: Within this context, the absence of a selection approach for third-party tools that is both repeatable and pragmatic remains a common problem at many companies, including Munich Re. Approach: To address this problem, this work combines and extends established methodologies into an efficient and effective requirements engineering approach. To validate the feasibility of the approach, we furthermore report on a pilot study at Munich Re, in which we apply the process in vivo to select a requirements modeling tool intended for use across all development teams of the organization. Results: The application at Munich Re indicates the feasibility of the approach for selecting a medium-sized software solution. Impact: We encourage practitioners to extend the presented method and incorporate it into their own decision-making process for third-party tools, with the aim of making buy decisions more objective and more efficient in the future.
Marcel Koschinsky, Henning Femmer, Claudia Schindler

Software Quality

Source Code Clone Detection Using Unsupervised Similarity Measures
Abstract
Assessing similarity in source code has gained significant attention in recent years due to its importance in software engineering tasks such as clone detection and code search and recommendation. This work presents a comparative analysis of unsupervised similarity measures for source code clone detection. The goal is to provide an overview of the current state-of-the-art techniques, their strengths, and their weaknesses. To that end, we compile the existing unsupervised strategies and evaluate their performance on a benchmark dataset to guide software engineers in selecting appropriate methods for their specific use cases. The source code of this study is available at https://github.com/jorge-martinez-gil/codesim.
Jorge Martinez-Gil
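
To make the notion of an unsupervised similarity measure concrete, here is a minimal Python sketch of one classic measure, Jaccard similarity over lexical token sets. It is an illustration only, not the paper’s benchmark code; the measures actually evaluated are available in the authors’ repository linked above.

import re

def tokenize(code: str) -> set[str]:
    """Split source code into a set of identifier and operator tokens."""
    return set(re.findall(r"[A-Za-z_]\w*|[^\s\w]", code))

def jaccard_similarity(code_a: str, code_b: str) -> float:
    """Return |A ∩ B| / |A ∪ B| over the two token sets."""
    a, b = tokenize(code_a), tokenize(code_b)
    return len(a & b) / len(a | b) if a | b else 1.0

snippet_1 = "def add(x, y): return x + y"
snippet_2 = "def add(a, b): return a + b"
print(jaccard_similarity(snippet_1, snippet_2))  # higher score -> more likely a clone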

Continuous Integration and Deployment

Using Datalog for Effective Continuous Integration Policy Evaluation
Abstract
Containerisation and microservices have introduced unprecedented complexity in system configurations, expanding the blast radius of misconfigurations and system failures. This complexity is further amplified within the DevOps paradigm, where developers are entrusted with the entire software development lifecycle, often without comprehensive insight into the impact of their configurations. This article explores the use of the declarative logic programming language Datalog to automate and optimize configuration validation and thereby mitigate these challenges.
We present an overview of a real-world case involving a software company with approximately 300 engineers, highlighting the challenges that led to delegating mission-critical configuration validation to a declarative language.
With Datalog, we spearheaded an initiative to fully deprecate a non-declarative solution in an attempt to circumvent the problem of writing business logic alongside its evaluation. The outcome revealed a substantial reduction in maintenance effort and user complaints, providing further evidence of Datalog’s potential for streamlining internal policy enforcement.
We propose a set of best practices, extrapolated from our findings, to guide organizations in both implementing and optimizing automatic configuration validation. These insights offer a strategic roadmap for harnessing declarative languages like Datalog to effectively navigate the intricate configuration landscapes of contemporary software systems.
Kaarel Loide, Bruno Rucy Carneiro Alves de Lima, Pelle Jakovits, Jevgeni Demidov
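
For readers unfamiliar with Datalog-style policy evaluation, the following Python sketch emulates the kind of declarative rule described in the paper. The predicates and the example policy (every deployed service must declare resource limits) are hypothetical illustrations, not the company’s actual rules.

# Datalog rule being emulated:
#   violation(Svc) :- deployment(Svc), not resource_limit(Svc).

deployment = {"checkout", "billing", "search"}   # facts: deployed services
resource_limit = {"checkout", "search"}          # facts: services declaring limits

def violations(deployed: set[str], limited: set[str]) -> set[str]:
    """Derive violation(Svc) for every deployed service lacking limits."""
    return deployed - limited

print(violations(deployment, resource_limit))    # {'billing'} -> policy violation

The appeal of the declarative form is visible even at this scale: the rule states what a violation is, while the evaluation strategy stays out of the business logic.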

Communication and Collaboration

On the Interaction Between Software Engineers and Data Scientists When Building Machine Learning-Enabled Systems
Abstract
In recent years, Machine Learning (ML) components have been increasingly integrated into the core systems of organizations. Engineering such systems presents various challenges from both a theoretical and a practical perspective. One of the key challenges is the effective interaction between actors with different backgrounds who need to work closely together, such as software engineers and data scientists. This paper presents an exploratory case study that aims to understand the current interaction and collaboration dynamics between these two roles in ML projects. We conducted semi-structured interviews with four practitioners from a large ML-enabled system project who have experience in software engineering and data science, and analyzed the data using reflexive thematic analysis. Our findings reveal several challenges that can hinder collaboration between software engineers and data scientists, including differences in technical expertise, unclear definitions of each role’s duties, and a lack of documents supporting the specification of the ML-enabled system. We also point out potential solutions to address these challenges, such as fostering a collaborative culture, encouraging team communication, and producing concise system documentation. This study contributes to understanding the complex dynamics between software engineers and data scientists in ML projects and provides insights for improving collaboration and communication in this context. We encourage future studies to investigate this interaction in other projects.
Gabriel Busquim, Hugo Villamizar, Maria Julia Lima, Marcos Kalinowski
Towards Integrating Knowledge Graphs into Process-Oriented Human-AI Collaboration in Industry
Abstract
Human-AI collaboration in industrial manufacturing promises to overcome current limitations by combining the flexibility of human intelligence with the scaling and processing capabilities of machine intelligence. To ensure effective collaboration between human and AI team members, we envision a software-driven coordination mechanism that orchestrates the interactions between the participants in human-AI teaming scenarios and helps to synchronize the information flow between them. A structured, process-oriented approach to systems engineering aims at generalizability, deployment efficiency, and enhanced quality of the resulting software by formalizing the human-AI interaction as a BPMN process model. During runtime, this process model is executed by the teaming engine, one of the core components of the Teaming.AI software platform. By incorporating dynamic execution traces of these process models into a knowledge graph structure and linking them to contextual background knowledge, we facilitate the monitoring of variations in process executions and the inference of new insights at runtime. Knowledge graphs are a powerful tool for the semantic integration of diverse data, thereby significantly improving data quality, which is still one of the biggest issues in AI-driven software solutions. We present the Teaming.AI software platform and its key components as a framework for enabling transparent teamwork between humans and AI in industry, and discuss its application in the context of an industrial use case in plastic injection molding production. Overall, the Teaming.AI platform provides a robust, flexible, and accountable solution for human-AI collaboration in manufacturing.
Bernhard Heinzl, Agastya Silvina, Franz Krause, Nicole Schwarz, Kabul Kurniawan, Elmar Kiesling, Mario Pichler, Bernhard Moser
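
As a rough illustration of how execution traces can be incorporated into a knowledge graph, the sketch below records one BPMN task execution as RDF triples using the rdflib Python library. The namespace and property names are hypothetical; the abstract does not specify the platform’s actual vocabulary.

# Requires: pip install rdflib
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, XSD

EX = Namespace("http://example.org/teaming#")    # hypothetical vocabulary
g = Graph()

# One execution trace of a BPMN task, linked to its process instance.
g.add((EX.trace42, RDF.type, EX.TaskExecution))
g.add((EX.trace42, EX.partOfProcess, EX.injectionMoldingRun7))
g.add((EX.trace42, EX.executedBy, EX.qualityInspectionAI))
g.add((EX.trace42, EX.durationSeconds, Literal(12.4, datatype=XSD.double)))

# Querying the accumulated traces later enables monitoring of variations
# in process executions across runs.
for trace in g.subjects(RDF.type, EX.TaskExecution):
    print(trace)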

Artificial Intelligence

Impact of Image Data Splitting on the Performance of Automotive Perception Systems
Abstract
Context: Training image recognition systems is one of the crucial elements of the AI engineering process in general and for automotive systems in particular. The quality of the data and the training process can have a profound impact on the quality, performance, and safety of automotive software. Objective: Splitting data between training and test sets is a crucial element of this process, as it can determine both how well the system learns and how well it generalizes to new data. Typical data splits take into consideration either the randomness or the timeliness of data points. However, in image recognition systems, the similarity of images is of equal importance. Methods: In this computational experiment, we study the impact of six data-splitting techniques. We use an industrial dataset with high-definition color images of driving sequences to train a YOLOv7 network. Results: The mean average precision (mAP) was 0.943 and 0.841 when the similarity-based and the frame-based splitting techniques were applied, respectively. The object-based splitting technique, however, produced the worst mAP score (0.118). Conclusion: There are significant differences in the performance of object detection methods when different data-splitting techniques are applied. The best results come from random selection, whereas the most objective splits are those based on sequences representing different geographical locations.
Md. Abu Ahammed Babu, Sushant Kumar Pandey, Darko Durisic, Ashok Chaitanya Koppisetty, Miroslaw Staron
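
The following Python sketch contrasts two of the splitting strategies under comparison: random frame splitting, which can let near-duplicate frames from the same driving sequence leak across the train/test boundary, and sequence-based splitting, which holds out whole sequences. The frame and sequence structure is a hypothetical stand-in for the industrial dataset used in the study.

import random

# Each frame is tagged with the driving sequence it was captured in.
frames = [{"id": i, "sequence": i // 100} for i in range(1000)]

def random_split(data, test_ratio=0.2, seed=0):
    """Shuffle all frames and cut; similar neighboring frames may end up
    on both sides of the split, inflating test scores."""
    shuffled = random.Random(seed).sample(data, len(data))
    cut = int(len(shuffled) * (1 - test_ratio))
    return shuffled[:cut], shuffled[cut:]

def sequence_split(data, test_ratio=0.2):
    """Hold out whole driving sequences, so near-duplicate frames never
    appear in both the training and the test set."""
    sequences = sorted({f["sequence"] for f in data})
    held_out = set(sequences[int(len(sequences) * (1 - test_ratio)):])
    train = [f for f in data if f["sequence"] not in held_out]
    test = [f for f in data if f["sequence"] in held_out]
    return train, test

train_r, test_r = random_split(frames)
train_s, test_s = sequence_split(frames)
print(len(train_r), len(test_r), len(train_s), len(test_s))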
ML-Enabled Systems Model Deployment and Monitoring: Status Quo and Problems
Abstract
[Context] Systems that incorporate Machine Learning (ML) models, often referred to as ML-enabled systems, have become commonplace. However, empirical evidence on how ML-enabled systems are engineered in practice is still limited; this is especially true for activities surrounding ML model dissemination. [Goal] We investigate contemporary industrial practices and problems related to ML model dissemination, focusing on the model deployment and monitoring phases of the ML life cycle. [Method] We conducted an international survey to gather practitioner insights on how ML-enabled systems are engineered, collecting a total of 188 complete responses from 25 countries. We analyzed the status quo and the problems reported for the model deployment and monitoring phases, examining contemporary practices using bootstrapping with confidence intervals and conducting qualitative analyses of the reported problems with open and axial coding procedures. [Results] Practitioners perceive the model deployment and monitoring phases as relevant and difficult. With respect to model deployment, models are typically deployed as separate services, with limited adoption of MLOps principles. Reported problems include difficulties in designing the architecture of the infrastructure for production deployment and in integrating with legacy applications. Concerning model monitoring, many models in production are not monitored at all; the aspects monitored most often are inputs, outputs, and decisions. Reported problems involve the absence of monitoring practices, the need to create custom monitoring tools, and the selection of suitable metrics. [Conclusion] Our results help provide a better understanding of the adopted practices and the problems faced in practice, and support guiding ML deployment and monitoring research in a problem-driven manner.
Eduardo Zimelewicz, Marcos Kalinowski, Daniel Mendez, Görkem Giray, Antonio Pedro Santos Alves, Niklas Lavesson, Kelly Azevedo, Hugo Villamizar, Tatiana Escovedo, Helio Lopes, Stefan Biffl, Juergen Musil, Michael Felderer, Stefan Wagner, Teresa Baldassarre, Tony Gorschek
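
As a rough sketch of the monitoring practice the survey reports as most common, the Python snippet below logs a deployed model’s inputs, outputs, and decisions. The model, threshold, and log format are hypothetical placeholders, not practices prescribed by the paper.

import json
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("model-monitor")

def fake_model(features: list[float]) -> float:
    """Hypothetical stand-in for a deployed ML model."""
    return sum(features) / len(features)

def monitored_predict(features: list[float], threshold: float = 0.5) -> bool:
    score = fake_model(features)
    decision = score >= threshold
    # Record the three aspects practitioners most often monitor:
    # inputs, outputs, and decisions.
    log.info(json.dumps({"inputs": features, "output": score, "decision": decision}))
    return decision

monitored_predict([0.2, 0.9, 0.7])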

Security and Compliance

Challenges of Assuring Compliance of Information Systems in Finance
Abstract
Assuring the regulatory compliance of information systems (IS), as a bundle of software systems and business processes, is an important but costly and continuous effort. Laws formulate demands for quality properties in ambiguous language, requiring substantial interpretation. Industry standards provide support but remain generic, as they must be applicable to heterogeneous company IS contexts. Before compliance measures can be implemented in software assets and processes, a specific interpretation based on each company’s context is a prerequisite. Compliance experts such as auditors support this process by accounting for the perspectives of company stakeholders. Ultimately, however, the complexity of the required knowledge and its legal and technical facets prevents organizations from continuously establishing situational awareness or guarantees and from answering the question: is the company currently compliant? We illustrate the complexity of assuring compliance in a qualitative case study with a European, software-driven corporation in the financial industry. By modeling an example of annual audits and analyzing the literature, we describe the perspectives of the involved stakeholders with their roles, knowledge needs, and facets. We observe six challenges: (1) a large number of items and links; (2) unclear and implicit links; (3) siloing of knowledge; (4) multiple sources of truth; (5) high costs of learning from audits; and (6) uncertain results of traditional auditing. We discuss the implications of these challenges and briefly explore potential avenues for resolution.
Tomas Bueno Momčilović, Dian Balta
A PUF-Based Approach for Copy Protection of Intellectual Property in Neural Network Models
Abstract
More and more companies’ Intellectual Property (IP) is being integrated into Neural Network (NN) models. This IP has considerable value for companies and therefore requires adequate protection. For example, an attacker might replicate a production machine’s hardware and then simply copy the associated software and NN models onto the cloned hardware. To make copying NN models onto cloned hardware infeasible, we present an approach that binds NN models, and thus the IP contained within them, to their underlying hardware. For this purpose, we link an NN model’s weights, which are crucial for its operation, to unique and unclonable hardware properties by leveraging Physically Unclonable Functions (PUFs). As a result, sufficient accuracy can only be achieved when the original weights are restored on the target hardware, rendering proper execution of the NN model on cloned hardware impossible. We demonstrate that our approach accomplishes the desired degradation of accuracy on various NN models and outline possible future improvements.
Daniel Dorfmeister, Flavio Ferrarotti, Bernhard Fischer, Martin Schwandtner, Hannes Sochor
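
The following Python sketch illustrates the core idea under simplifying assumptions: a mask derived from a hardware-unique PUF response is added to the weights before distribution, and only the genuine device can reproduce the mask and restore them. The PUF is simulated here with fixed bytes; the paper’s actual scheme is more elaborate.

import hashlib
import numpy as np

def mask_from_puf(puf_response: bytes, shape) -> np.ndarray:
    """Expand a PUF response into a deterministic, weight-shaped mask."""
    seed = int.from_bytes(hashlib.sha256(puf_response).digest()[:8], "big")
    return np.random.default_rng(seed).standard_normal(shape)

weights = np.random.default_rng(0).standard_normal((4, 4))  # toy NN layer
device_puf = b"\x13\x37\xbe\xef"  # hypothetical; real responses are hardware-unique

protected = weights + mask_from_puf(device_puf, weights.shape)

# On the genuine device, the PUF reproduces the same mask and the
# original weights are restored exactly.
restored = protected - mask_from_puf(device_puf, weights.shape)
assert np.allclose(restored, weights)

# A cloned device yields a different PUF response, so the recovered
# weights are noisy and model accuracy degrades.
cloned = protected - mask_from_puf(b"clone-response", weights.shape)
print(np.abs(cloned - weights).mean())  # large mean error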
Backmatter
Metadata
Title
Software Quality as a Foundation for Security
Editors
Peter Bludau
Rudolf Ramler
Dietmar Winkler
Johannes Bergsmann
Copyright Year
2024
Electronic ISBN
978-3-031-56281-5
Print ISBN
978-3-031-56280-8
DOI
https://doi.org/10.1007/978-3-031-56281-5
