
2021 | Book

Quality of Information and Communications Technology

14th International Conference, QUATIC 2021, Algarve, Portugal, September 8–11, 2021, Proceedings

Edited by: Prof. Ana C. R. Paiva, Ana Rosa Cavalli, Paula Ventura Martins, Ricardo Pérez-Castillo

Publisher: Springer International Publishing

Book Series: Communications in Computer and Information Science


About this book

This book constitutes the refereed proceedings of the 14th International Conference on the Quality of Information and Communications Technology, QUATIC 2021, held in Algarve, Portugal*, in September 2021.
The 30 full papers and 9 short papers were carefully reviewed and selected from 98 submissions. The papers are organized in topical sections: ICT verification and validation; software evolution; process modeling, improvement and assessment; quality aspects in quantum computing; safety, security, and privacy; quality aspects in machine learning, AI and data analytics; evidence-based software quality engineering; quality in cyber-physical systems; software quality education and training.
*The conference was held virtually due to the COVID-19 pandemic.

Table of Contents

Frontmatter

ICT Verification and Validation

Frontmatter
Reducing Flakiness in End-to-End Test Suites: An Experience Report

End-to-end (E2E) testing, a technique employed to assure the quality of web applications, is cost-effective only if the test suite is not flaky. Flaky test scripts produce non-deterministic results that undermine testers’ trust and thus the usefulness of the entire test suite. Recently, we were involved in the refactoring of an existing automated flaky E2E test suite for a large web application. In this paper, we report on our experience. During the refactoring, we computed the effort made and formalized the procedure we followed in an algorithmic way so that our experience can also be of help to other developers/testers. Our procedure allowed us to reduce flakiness to virtually zero w.r.t. the original flaky test suite. Moreover, as a positive side effect, the execution time of the test suite was reduced by 57%.

Dario Olianas, Maurizio Leotta, Filippo Ricca, Luca Villa
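
The refactoring procedure itself is only summarized above; as a small, hedged illustration of how flakiness is usually measured before and after such a refactoring, the sketch below re-runs a suite several times and flags tests with non-deterministic verdicts. The `run_suite` callable and the repetition count are assumptions for illustration, not the authors' tooling.

```python
from collections import defaultdict

def find_flaky_tests(run_suite, runs=10):
    """Re-run the whole suite several times and report tests whose
    verdict (pass/fail) is not the same on every execution.

    `run_suite` is assumed to return a dict {test_name: passed_bool}.
    """
    outcomes = defaultdict(set)
    for _ in range(runs):
        for test, passed in run_suite().items():
            outcomes[test].add(passed)
    # A test is flaky if it both passed and failed across identical runs.
    return sorted(t for t, verdicts in outcomes.items() if len(verdicts) > 1)
```
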
Mutation Subsumption as Relative Incorrectness

This paper attempts to link two lines of research that have proceeded independently so far: mutant subsumption, which is used to identify redundant mutants, and relative correctness, which is used to define and analyze software faults. We say that a mutant M′ of a program P subsumes a mutant M of P if and only if any test datum that kills M kills M′. On the other hand, we say that a program P′ is more-correct than a program P with respect to a specification R if and only if whenever program P behaves correctly with respect to R on some input datum, so does program P′. We highlight the relationships between these two concepts and consider some potential synergies between these two research directions.

Besma Khaireddine, Amani Ayad, Imen Marsit, Ali Mili
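
The two definitions quoted in the abstract can be written compactly as follows; the kills/correct predicate notation is introduced here only for readability and is not taken from the paper.

```latex
% t ranges over test data (for subsumption) or input data (for relative correctness).
\begin{align*}
  M' \text{ subsumes } M \;&\Longleftrightarrow\;
      \forall t :\ \mathit{kills}(t, M) \Rightarrow \mathit{kills}(t, M') \\[2pt]
  P' \text{ is more-correct than } P \text{ w.r.t. } R \;&\Longleftrightarrow\;
      \forall t :\ \mathit{correct}_R(P, t) \Rightarrow \mathit{correct}_R(P', t)
\end{align*}
```
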
What We Talk About When We Talk About Software Test Flakiness

Software test flakiness is drawing increasing interest among both academic researchers and practitioners. In this work we report our findings from a scoping review of white and grey literature, highlighting variations across key concepts of flaky tests. Our study clearly indicates the need for a unifying definition as well as for a more comprehensive analysis to establish a conceptual map that can better guide future research.

Morena Barboni, Antonia Bertolino, Guglielmo De Angelis
Looking for the Needle in the Haystack: End-to-end Tests in Open Source Projects

There is common agreement in the industry that integration and end-to-end (e2e) tests are a challenge for many teams wanting to enable frequent deployments while at the same time guaranteeing quality. What this means for the research community is that there are open research problems that might be interesting to solve. However, academia puts little effort into these integration and e2e tests, and the datasets available for research in software testing are focused on unit tests. In this paper we propose an approach to build datasets of e2e tests from active open source projects. The approach is based on mining open source repositories from GitHub in order to find those projects containing e2e tests. We defined 12 different criteria to find those tests. We investigated which of the 12 criteria are more reliable for detecting this kind of test by manually analyzing the results of these criteria on 100 projects from GitHub. Then we performed a search on 1,800 projects (900 Java-specific, and 900 not constrained to Java) and used the three most promising criteria to detect e2e tests within all of them. Our results show that it is easier to detect this kind of test in Java projects than in projects using other programming languages. Also, more than 500 projects were reported as having e2e tests. We hypothesize that good e2e test datasets could be built out of these results.

Francisco Gortázar, Michel Maes-Bermejo, Micael Gallego, Jorge Contreras Padilla
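
The paper's 12 detection criteria are not listed in the abstract; the sketch below only illustrates the general idea of scanning a cloned repository for indicator directories and dependencies. The concrete indicators here are assumptions, not the authors' criteria.

```python
from pathlib import Path

# Illustrative indicators only; the paper defines 12 concrete criteria.
E2E_DIR_HINTS = {"e2e", "end2end", "integration-tests", "ui-tests"}
E2E_DEP_HINTS = {"selenium", "cypress", "playwright", "testcontainers"}

def looks_like_e2e_project(repo_root: str) -> dict:
    """Return the indicator hits found in an already-cloned repository."""
    root = Path(repo_root)
    hits = {"dirs": [], "deps": []}
    for path in root.rglob("*"):
        if path.is_dir() and path.name.lower() in E2E_DIR_HINTS:
            hits["dirs"].append(str(path.relative_to(root)))
    for build_file in ("pom.xml", "build.gradle", "package.json"):
        f = root / build_file
        if f.exists():
            text = f.read_text(errors="ignore").lower()
            hits["deps"] += [dep for dep in E2E_DEP_HINTS if dep in text]
    return hits
```
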
Evaluating Sensor Interaction Failures in Mobile Applications

Mobile devices have a rich set of small-scale sensors which expand the functional possibilities of applications. The growing use of mobile applications has aroused the interest of researchers in testing them. However, sensor interaction failures are a challenging and still little-explored aspect of research. Unexpected behavior caused by sensor interactions can introduce failures that manifest themselves only in specific sensor configurations. Sensor interaction failures can compromise the mobile application’s quality and harm the user’s experience. We propose an approach for extending test suites of mobile applications in order to evaluate their sensor interaction aspects. We used eight sensors to verify the occurrence of sensor interaction failures. We generated all configurations with each sensor enabled or disabled. We observed that some pairs of sensors cause failures in some applications, including pairs that are not so obvious.

Euler Horta Marinho, João P. Diniz, Fischer Ferreira, Eduardo Figueiredo
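
The abstract states that all enabled/disabled configurations of eight sensors were generated; a minimal sketch of that enumeration follows. The sensor names are placeholders, not necessarily the eight sensors used in the study.

```python
from itertools import product

SENSORS = ["accelerometer", "gyroscope", "magnetometer", "gps",
           "light", "proximity", "barometer", "microphone"]  # placeholder names

def all_sensor_configurations(sensors=SENSORS):
    """Yield every enabled/disabled assignment (2**8 = 256 configurations)."""
    for flags in product([False, True], repeat=len(sensors)):
        yield dict(zip(sensors, flags))

# Each configuration could then be applied to the device or emulator
# before re-running the application's test suite.
```
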

Software Evolution

Frontmatter
Feature-Oriented Clone and Pull for Distributed Development and Evolution

Product line engineering aims at quickly delivering individual solutions to customers by customizing and evolving products based on a common platform. Engineers commonly follow a distributed and feature-oriented process, supported by version control systems, to track implementation-level changes. For instance, feature branches are widely used to add new or modify existing features. However, when merging features back into the product line, the information about how features map to code is usually lost. Furthermore, the granularity of merging is limited to branches, making it hard to transfer individual features from one product to another. This paper thus presents feature-oriented clone and pull operations for the distributed development and evolution of product lines, which are implemented in the FORCE2 platform. Our evaluation uses the ArgoUML product line to assess the correctness and performance of our approach. The results show that the feature-oriented operations work with high precision and recall for different cases of feature interactions. The performance measurements demonstrate that the clone and pull operations can be integrated in typical workflows of engineers.

Daniel Hinterreiter, Lukas Linsbauer, Herbert Prähofer, Paul Grünbacher
Detecting Sudden Variations in Web Apps Code Smells’ Density: A Longitudinal Study

Code smells are considered potentially harmful to software maintenance. Their introduction is dependent on the production of new code or the addition of smelly code produced by another team. Code smells survive until being refactored or until the code where they stand is removed. Under normal conditions, we expect code smell density to be relatively stable over time. Anomalous (sudden) increases in this density are expected to increase maintenance costs, and vice versa. In the case of sudden increases, especially in pre-release tests in an automation server pipeline, detecting those outlier situations can trigger refactoring actions before releasing the new version. This paper presents a longitudinal study on the sudden variations in the introduction and removal of 18 server code smells in 8 PHP web apps, across several years. The study regards web applications but can be generalized to other domains, using other code smells and tools. We propose a standardized detection criterion for this kind of code smell anomaly. Besides providing a retrospective view of the code smell evolution phenomenon, our detection approach, which is particularly amenable to graphical monitoring, can make software project managers aware of the need for enforcing refactoring actions.

Américo Rio, Fernando Brito e Abreu
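
The paper's standardized detection criterion is not spelled out in the abstract; as a stand-in, the sketch below flags sudden variations in a smell-density series with a simple z-score rule over release-to-release deltas. The threshold and the density unit are assumptions.

```python
import statistics

def sudden_density_changes(density_per_release, z_threshold=2.0):
    """Flag releases whose change in code-smell density is an outlier.

    `density_per_release` is a list of smells-per-KLOC values, one per release.
    Returns the indices of releases following an anomalous jump or drop.
    """
    deltas = [b - a for a, b in zip(density_per_release, density_per_release[1:])]
    if len(deltas) < 2:
        return []
    mu, sigma = statistics.mean(deltas), statistics.pstdev(deltas)
    if sigma == 0:
        return []
    return [i + 1 for i, d in enumerate(deltas) if abs(d - mu) / sigma > z_threshold]
```
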
Risk and Complexity Assessment on the Context of Language Migration

Language Migration is a highly risky and complex process. Many authors have provided different ways to tackle the problem, but it is still not completely resolved; moreover, it is considered almost impossible in many circumstances. Despite the approaches and solutions available, no work has been done on measuring the risks and complexity of a migration process based on the technological gap. In this article we contribute a first iteration of language migration complexity metrics, and we apply and interpret these metrics on an industrial project. We end the article with a discussion and a proposal of future work.

Santiago Bragagnolo, Abderrahmane Seriai, Stéphane Ducasse, Mustapha Derras
Automatically Assessing Complexity of Contributions to Git Repositories

Lehman’s second law of software evolution suggests that under certain conditions software “becomes more difficult to evolve”. Similarly, Technical Debt (TD) is often considered as technical compromises that render future changes of software more costly. But how does one actually assess whether modifying software becomes more difficult or costly? So far, research has studied this question indirectly by assessing the internal structural complexity of successive software versions, arguing that increasing internal complexity renders evolution tasks more difficult and costly too. Our goal is to assess the complexity of evolution tasks directly. Therefore, we present an algorithm and tool that allow us to automatically assess Contribution Complexity (CC), which is the complexity of a contribution with respect to the difficulty of integration work. Our initial evaluation suggests that our proposed algorithm and readily available tool are suitable to automatically assess the complexity of contributions to software in Git repositories, and the results of applying them to 8,686 contributions to two open-source systems indicate that evolution tasks actually become slightly more difficult.

Rolf-Helge Pfeiffer
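
The Contribution Complexity algorithm itself is not given in the abstract; the sketch below only shows how one might gather per-contribution change statistics from a Git repository with GitPython as raw input for such a measure. The combining score is a naive placeholder, not the paper's definition.

```python
import git  # GitPython

def contribution_stats(repo_path, rev="HEAD", max_count=100):
    """Collect raw size/scatter statistics per commit; a contribution-complexity
    measure would combine inputs like these. The `naive_score` is a placeholder."""
    repo = git.Repo(repo_path)
    rows = []
    for commit in repo.iter_commits(rev, max_count=max_count):
        files = commit.stats.files          # {path: {insertions, deletions, lines}}
        changed_lines = sum(f["lines"] for f in files.values())
        rows.append({
            "sha": commit.hexsha[:10],
            "files_touched": len(files),
            "changed_lines": changed_lines,
            "naive_score": len(files) * (1 + changed_lines) ** 0.5,  # illustration only
        })
    return rows
```
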

Process Modeling, Improvement and Assessment

Frontmatter
Scrum for Safety: Agile Development in Safety-Critical Software Systems

The adoption of agile methodologies in all domains of software development is a desired goal. Unfortunately, many obstacles have been met in the past to full adoption in secure and safe systems, where different standards and operational constraints apply. In this paper we propose a novel agile methodology to be applied in the development of safety-critical systems. In particular, we developed an extension of the well-known Scrum methodology and discuss the complete workflow. We finally validated the applicability of the methodology on a real case study from the railway domain.

Riccardo Carbone, Salvatore Barone, Mario Barbareschi, Valentina Casola
Empirical Evaluation of Agile Teamwork

During the fall of 2020, we observed and tracked several student teams working remotely and independently to develop a non-trivial software product as the capstone project for a Software Engineering course at our university. The teams used an integrated open-source development environment that we designed to support and measure Agile development efforts, storing all artifacts and logging productivity and interaction data. Moreover, teams were required to use the Essence visual language during the retrospectives in order to analyze and improve their Scrum-like process. The tools used by the teams stored and collected several kinds of process data, which were integrated post-mortem with the answers given by the students to questionnaires. This paper proposes an empirical evaluation of the process followed by the teams, using a teamwork quality model and an Agile maturity model. The two models highlight different facets of the teamwork. We have studied and compared the development and interaction activities of the teams, and found a correlation between the results of the two models.

Paolo Ciancarini, Marcello Missiroli, Sofia Zani
STAMP 4 NLP – An Agile Framework for Rapid Quality-Driven NLP Applications Development

The progress in natural language processing (NLP) research over the last years offers novel business opportunities for companies, such as automated user interaction or improved data analysis. Building sophisticated NLP applications requires dealing with modern machine learning (ML) technologies, which impedes enterprises from establishing successful NLP projects. Our experience in applied NLP research projects shows that the continuous integration of research prototypes in production-like environments with quality assurance builds trust in the software and shows its convenience and usefulness regarding the business goal. We introduce STAMP 4 NLP as an iterative and incremental process model for developing NLP applications. With STAMP 4 NLP, we merge software engineering principles with best practices from data science. Instantiating our process model allows prototypes to be created efficiently by utilizing templates, conventions, and implementations, enabling developers and data scientists to focus on the business goals. Due to our iterative-incremental approach, businesses can deploy an enhanced version of the prototype to their software environment after every iteration, maximizing potential business value and trust early and avoiding the cost of successful yet never deployed experiments.

Philipp Kohl, Oliver Schmidts, Lars Klöser, Henri Werth, Bodo Kraft, Albert Zündorf
Evaluating Predictive Business Process Monitoring Approaches on Small Event Logs

Predictive business process monitoring is concerned with predicting how a running process instance will unfold up to its completion at runtime. Most of the proposed approaches rely on a wide range of machine learning techniques. In recent years, numerous studies have revealed that these methods can be successfully applied for different prediction targets. However, these techniques require a qualitatively and quantitatively sufficient dataset. Unfortunately, there are many situations in business process management where only a quantitatively insufficient dataset is available. The problem of insufficient data in the context of BPM is still neglected: none of the comparative studies investigates the performance of predictive business process monitoring techniques in environments with small datasets. In this paper, an evaluation framework for comparing existing approaches with regard to their suitability for small datasets is developed and exemplarily applied to state-of-the-art approaches in next activity prediction.

Martin Käppel, Stefan Jablonski, Stefan Schönig
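
Next-activity prediction, the target used in the evaluation, can be illustrated with the minimal prefix-frequency baseline below; it is not one of the state-of-the-art approaches the paper compares, just a sketch of the task itself.

```python
from collections import Counter, defaultdict

def train_next_activity_baseline(traces, prefix_len=2):
    """traces: list of activity-name sequences, e.g. [["register", "check", "pay"], ...].
    Learns the most frequent successor of each length-`prefix_len` prefix."""
    successors = defaultdict(Counter)
    for trace in traces:
        for i in range(len(trace) - prefix_len):
            prefix = tuple(trace[i:i + prefix_len])
            successors[prefix][trace[i + prefix_len]] += 1
    return {p: c.most_common(1)[0][0] for p, c in successors.items()}

def predict_next(model, running_trace, prefix_len=2):
    """Return the predicted next activity, or None for an unseen prefix."""
    return model.get(tuple(running_trace[-prefix_len:]))
```
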
Analyzing a Process Core Ontology and Its Usefulness for Different Domains

A well-specified strategy should define and integrate consistently three capabilities: process, method, and common vocabulary specifications. The domain vocabularies of different strategies should be built on common reference terminologies. For example, a process ontology should be a common reference since it considers cross-cutting concerns for different domains. This work specifies and defines the main terms of ProcessCO (Process Core Ontology). This is an ontology placed at the core level in the context of a four-layered ontological architecture. A practical use of an upper-level ontology is to semantically enrich the lowest-level ontologies. For example, ThingFO (an ontology at the foundational level in that architecture) enriches ProcessCO. Since ProcessCO is at the core level, ontologies at the domain level benefit from reusing and extending its concepts. Therefore, ProcessCO can be seen as a reusable resource to semantically enrich domain ontologies. To illustrate its applicability, this work shows the semantic enrichment of two top-domain ontologies. By using ProcessCO (and other core ontologies) as a common terminological reference, the domain ontologies used in the different strategies are conceptually harmonized. Hence, strategies ensure terminological uniformity and consistency, thus facilitating the understanding of process and method specifications.

Pablo Becker, Fernanda Papa, Guido Tebes, Luis Olsina
Towards Understanding Quality-Related Characteristics in Knowledge-Intensive Processes - A Systematic Literature Review

Context: Contemporary process management systems have been supporting users during the execution of repetitive, predefined business processes. Many business processes are no longer limited to explicit business rules, as processes can be unpredictable, knowledge-driven and emergent. In recent years, knowledge-intensive processes (KIPs) have become more important for many businesses. However, research on quality-related aspects of these processes is still scarce. Therefore, it is hard to evaluate these types of processes in terms of their quality. Objective: In this paper, we present a Systematic Literature Review aiming at investigating and reporting quality-related aspects of KIPs. Results: We identified in the selected studies the characteristics and methods related to KIPs. Although several papers present quality aspects of processes, the literature still lacks directions on quality-related approaches in KIPs.

Rachel Vital Simões, Glaucia Melo, Fernando Brito e Abreu, Toacy Oliveira

Quality Aspects in Quantum Computing

Frontmatter
KDM to UML Model Transformation for Quantum Software Modernization

Thanks to the latest engineering advances, quantum computing is gaining increasing importance in many sectors that will benefit from its superior computational power. Before achieving all those promising benefits, companies must be able to combine their classical information systems and the new quantum software to operate with the so-called hybrid information systems. This implies, at some point of such a modernization process, that hybrid information systems will have to be (re)designed. UML can be used for defining abstract design models, not only for the classical part as done before, but also for the quantum software in an integrated manner. This paper proposes a model transformation for generating UML models that represent quantum circuits as activity diagrams. Thanks to the usage of UML, these designs are technology-independent, which contributes to the modernization of hybrid information systems. The resulting UML models are compatible with a vast number of design tools and can be understood by a large community.

Luis Jiménez-Navajas, Ricardo Pérez-Castillo, Mario Piattini
Hybrid Classical-Quantum Software Services Systems: Exploration of the Rough Edges

The development that quantum computing technologies are achieving is beginning to attract the interest of companies that could potentially be users of quantum software. Thus, it is perfectly feasible that during the next few years hybrid systems will start to appear integrating both the classical software systems of companies and new quantum ones providing solutions to problems that still remain unmanageable today. A natural way to support such integration is Service-Oriented Computing. While conceptually the invocation of a quantum software service is similar to that of a classical one, technically there are many differences. To highlight these differences and the difficulties to develop quality quantum services, this paper takes a well-known problem to which a quantum solution can be provided, integer factorization, and the Amazon Braket quantum service platform. The exercise of trying to provide the factorization as a quantum service is carried out. This case study is used to show the rough edges that arise in the integration of classical-quantum hybrid systems using Service-Oriented Computing. The conclusion of the study allows us to point out directions in which to focus research efforts in order to achieve effective Quantum Service-Oriented Computing.

David Valencia, Jose Garcia-Alonso, Javier Rojo, Enrique Moguel, Javier Berrocal, Juan Manuel Murillo
Towards a Set of Metrics for Quantum Circuits Understandability

Quantum computing is the basis of a new revolution. Several quantum computers are already available and, with them, quantum programming languages, quantum software development kits and platforms, and quantum error correction and optimization tools are being proposed and presented continuously. In connection with this, disciplines such as Quantum Software Engineering are appearing, applying the knowledge acquired over time in their corresponding classical relatives. Besides, measurement is well known as a key factor for assessing, and improving if needed, the quality of any model in terms of, for instance, its understandability. The easier a model is to understand, the easier it is to maintain, reuse, etc. In this work, we present the definition of a set of metrics for assessing the understandability of quantum circuits. Some examples of the calculation of the metrics are also presented. This is just the beginning of a more thorough process in which they will be empirically validated through empirical studies, especially experiments.

José A. Cruz-Lemus, Luis A. Marcelo, Mario Piattini
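
The metric set itself is not listed in the abstract; the sketch below computes three generic circuit measurements (width, gate count, depth) over a toy gate-list representation, purely to illustrate the kind of quantities such understandability metrics could build on. The representation and the choice of measurements are assumptions.

```python
def circuit_metrics(gates):
    """gates: list of (gate_name, qubit_indices) tuples, e.g. [("h", (0,)), ("cx", (0, 1))].
    Returns generic measurements; the paper defines its own metric set."""
    width = len({q for _, qubits in gates for q in qubits})
    gate_count = len(gates)
    # Depth: longest chain of gates that must run sequentially on some qubit.
    busy_until, depth = {}, 0
    for _, qubits in gates:
        layer = max((busy_until.get(q, 0) for q in qubits), default=0) + 1
        for q in qubits:
            busy_until[q] = layer
        depth = max(depth, layer)
    return {"width": width, "gate_count": gate_count, "depth": depth}

print(circuit_metrics([("h", (0,)), ("cx", (0, 1)), ("x", (1,))]))
# {'width': 2, 'gate_count': 3, 'depth': 3}
```
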

Safety, Security and Privacy

Frontmatter
A Critique on the Use of Machine Learning on Public Datasets for Intrusion Detection

Intrusion detection has become an open challenge in any modern ICT system due to the ever-growing urge towards assuring security of present day networks. Various machine learning methods have been proposed for finding an effective solution to detect and prevent network intrusions. Many approaches, tuned and tested by means of public datasets, capitalize on well-known classifiers, which often reach detection accuracy close to 1. However, these results strongly depend on the training data, which may not be representative of real production environments and ever-evolving attacks. This paper is an initial exploration around this problem. After having learned a detector on the top of a public intrusion detection dataset, we test it against held-out data not used for learning and additional data gathered by attack emulation in a controlled network. The experiments presented are focused on Denial of Service attacks and based on the CICIDS2017 dataset. Overall, the figures gathered confirm that results obtained in the context of synthetic datasets may not generalize in practice.

Marta Catillo, Andrea Del Vecchio, Antonio Pecchia, Umberto Villano
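
As a schematic of the experimental setup described (train on a public dataset, then test on independently collected traffic), consider the hedged sketch below; the file paths, column names and classifier choice are placeholders, not the authors' pipeline.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report

# Placeholder paths/columns; CICIDS2017 ships flow-level features plus a label column.
train_df = pd.read_csv("cicids2017_dos_flows.csv")
emulated_df = pd.read_csv("emulated_attack_flows.csv")  # traffic from a controlled network

feature_cols = [c for c in train_df.columns if c != "Label"]
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(train_df[feature_cols], train_df["Label"])

# Accuracy close to 1 on held-out CICIDS2017 data often does not carry over here.
print(classification_report(emulated_df["Label"], clf.predict(emulated_df[feature_cols])))
```
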
A Comparison of Different Source Code Representation Methods for Vulnerability Prediction in Python

In the age of big data and machine learning, at a time when the techniques and methods of software development are evolving rapidly, a problem has arisen: programmers can no longer detect all the security flaws and vulnerabilities in their code manually. To overcome this problem, developers can now rely on automatic techniques, like machine learning based prediction models, to detect such issues. An inherent property of such approaches is that they work with numeric vectors (i.e., feature vectors) as inputs. Therefore, one needs to transform the source code into such feature vectors, often referred to as code embedding. A popular approach for code embedding is to adapt natural language processing techniques, like text representation, to automatically derive the necessary features from the source code. However, the suitability and comparison of different text representation techniques for solving Software Engineering (SE) problems are rarely studied systematically. In this paper, we present a comparative study on three popular text representation methods (word2vec, fastText, and BERT) applied to the SE task of detecting vulnerabilities in Python code. Using a data mining approach, we collected a large volume of Python source code in both vulnerable and fixed forms that we embedded into vectors with word2vec, fastText, and BERT and used a Long Short-Term Memory network to train on them. Using the same LSTM architecture, we could compare the efficiency of the different embeddings in deriving meaningful feature vectors. Our findings show that all the text representation methods are suitable for code representation in this particular task, but the BERT model is the most promising as it is the least time-consuming and the LSTM model based on it achieved the best overall accuracy (93.8%) in predicting Python source code vulnerabilities.

Amirreza Bagheri, Péter Hegedűs
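
A minimal Keras outline of the embedding-plus-LSTM classifier family compared in the study follows; it mirrors the static-embedding case (word2vec/fastText), whereas BERT would feed contextual vectors directly. Vocabulary size, dimensions and the embedding matrix are placeholders, not the authors' configuration.

```python
import numpy as np
import tensorflow as tf

vocab_size, embed_dim = 20_000, 128                         # placeholder sizes
embedding_matrix = np.random.rand(vocab_size, embed_dim)    # stand-in for pre-trained code-token vectors

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(
        vocab_size, embed_dim,
        embeddings_initializer=tf.keras.initializers.Constant(embedding_matrix),
        trainable=False),                                   # frozen, pre-computed embedding
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(1, activation="sigmoid"),         # vulnerable vs. fixed
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
# model.fit(X_token_ids, y_labels, validation_split=0.2, epochs=10)
```
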
Threat Modeling of Edge-Based IoT Applications

The Multi-access Edge Computing (MEC) computing model provides on-demand cloud resources and services to the edge of the network, to offer storage and computing capacity, mobility, and context awareness support for emerging Internet of Things (IoT) applications. On the other hand, its complex hierarchical model introduces new vulnerabilities, which can influence the security of IoT applications. The use of different enabling technologies at the edge of the network, such as various wireless access and virtualization technologies, implies several threats and challenges that make the security analysis and the deployment of security mechanisms a technically challenging problem. This paper proposes a technique to model Edge-based systems and automatically extract security threats and plan possible security tests. The proposed approach is tested against a simple, but significant case study. The main contribution consists of a threat catalog that can be used to derive a threat model and perform a risk analysis process of specific MEC-based IoT scenarios.

Massimo Ficco, Daniele Granata, Massimiliano Rak, Giovanni Salzillo
Enforcing Mutual Authentication and Confidentiality in Wireless Sensor Networks Using Physically Unclonable Functions: A Case Study

The technological progress we witnessed in recent years has led to a pervasive usage of smart and embedded devices in many application domains. The monitoring of Power Delivery Networks (PDNs) is an example: the use of interconnected sensors makes it possible to detect faults and to dynamically adapt the network topology to isolate and compensate for them. In this paper we discuss how Fault-Detection, Isolation and Service Recovery (FDISR) for PDNs can be modeled according to the fog-computing paradigm, which distributes part of the computation among edge nodes and the cloud. In particular, we consider an FDISR application on Medium-Voltage PDNs (MV-PDNs) based on a Wireless Sensor Network (WSN) whose nodes make use of the Long Range (LoRa) technology to communicate with each other. Security concerns and the attack model of such application are discussed, then the use of a communication protocol based on the Physically Unclonable Functions (PUFs) mechanism is proposed to achieve both mutual authentication and confidentiality. Finally, an implementation of the proposal is presented and evaluated w.r.t. security concerns and communication overhead.

Mario Barbareschi, Salvatore Barone, Alfonso Fezza, Erasmo La Montagna
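
The concrete protocol is not given in the abstract; the sketch below models a generic PUF-style challenge-response exchange, with HMAC standing in for the physical function and for session-key derivation, purely to illustrate the mutual-authentication-plus-confidentiality idea. All names and the protocol shape are assumptions.

```python
import hashlib
import hmac
import os

class SimulatedPUF:
    """Software stand-in: a real PUF derives responses from physical chip variations."""
    def __init__(self, secret: bytes):
        self._secret = secret
    def response(self, challenge: bytes) -> bytes:
        return hmac.new(self._secret, challenge, hashlib.sha256).digest()

def authenticate(node_puf: SimulatedPUF, crp_table: dict, node_id: str):
    """One challenge-response round. `crp_table` holds the server's enrolled
    challenge/response pairs (CRPs) per node."""
    challenge, expected = crp_table[node_id]      # server picks a stored CRP
    answer = node_puf.response(challenge)         # node answers with its PUF
    if not hmac.compare_digest(answer, expected):
        return None                               # node rejected
    # Server proves knowledge of the same response by deriving a session key from it;
    # the node derives the identical key from `answer`, giving a rough form of mutual
    # assurance and a shared secret usable for confidentiality.
    nonce = os.urandom(16)
    return nonce, hmac.new(expected, nonce, hashlib.sha256).digest()
```
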
GRADUATION: A GDPR-Based Mutation Methodology

The adoption of the General Data Protection Regulation (GDPR) is enhancing different business and research opportunities that evidence the necessity of appropriate solutions supporting the specification, processing, testing, and assessment of the overall (personal) data management. This paper proposes the GRADUATION (GdpR-bAseD mUtATION) methodology for mutation analysis of data protection policy test cases. The new methodology provides generic mutation operators in reference to the currently applicable EU Data Protection Regulation. The preliminary implementation of the steps involved in GDPR-based mutant derivation is also described.

Said Daoudagh, Eda Marchetti
A Proposal for the Classification of Methods for Verification and Validation of Safety, Cybersecurity, and Privacy of Automated Systems

As our dependence on automated systems grows, so does the need for guaranteeing their safety, cybersecurity, and privacy (SCP). Dedicated methods for verification and validation (V&V) must be used to this end and it is necessary that the methods and their characteristics can be clearly differentiated. This can be achieved via method classifications. However, we have experienced that existing classifications are not suitable to categorise V&V methods for SCP of automated systems. They do not pay enough attention to the distinguishing characteristics of this system type and of these quality concerns. As a solution, we present a new classification developed in the scope of a large-scale industry-academia project. The classification considers both the method type, e.g., testing, and the concern addressed, e.g., safety. Over 70 people have successfully used the classification on 53 methods. We argue that the classification is a more suitable means to categorise V&V methods for SCP of automated systems and that it can help other researchers and practitioners.

Jose Luis de la Vara, Thomas Bauer, Bernhard Fischer, Mustafa Karaca, Henrique Madeira, Martin Matschnig, Silvia Mazzini, Giann Spilere Nandi, Fabio Patrone, David Pereira, José Proença, Rupert Schlick, Stefano Tonetta, Ugur Yayan, Behrooz Sangchoolie
Risk Identification Based on Architectural Patterns

We present a novel approach for the identification of risks for IT-based systems, where we base risk identification on the system architecture, in particular, the architectural principles a system is built on. Such principles can be expressed as architectural patterns, which are amenable to specific risks. We represent those risks – concerning e.g. safety, security or fault tolerance – as Risk Issue Questionnaires (RIQs). A RIQ enumerates the typical risks associated with a given architectural pattern. Risk identification proceeds by identifying the architectural patterns contained in a system architecture and processing the associated RIQs, i.e., for each issue in the RIQ it has to be assessed whether it is relevant for the system under analysis or not. We present an example of a RIQ, a RIQ-driven risk identification method, an application example, and the results of an initial experiment evaluating the RIQ method.

Maritta Heisel, Aida Omerovic
Expressing Structural Temporal Properties of Safety Critical Hierarchical Systems

Software-intensive safety-critical systems are becoming more and more widespread and are involved in many aspects of our daily lives. Since a failure of these systems could lead to unacceptable consequences, it is imperative to guarantee high safety standards. In practice, as a way to handle their increasing complexity, these systems are often modelled as hierarchical systems. To date, a good deal of work has focused on the definition and analysis of hierarchical modelling languages and on their integration within model-driven development frameworks. Less work, however, has been directed towards formalisms to effectively express, in a precise and rigorous way, relevant behavioural properties of such systems (e.g., safety requirements). In this work, we propose a novel extension of classic Linear Temporal Logic (LTL) called Hierarchical Linear Temporal Logic (HLTL), designed to express, in a natural yet rigorous way, behavioural properties of hierarchical systems. The formalism we propose does not commit to any specific modelling language, and can be used to predicate over a large variety of hierarchical systems.

Massimo Benerecetti, Fabio Mogavero, Adriano Peron, Luigi Libero Lucio Starace

Quality Aspects in Machine Learning, AI and Data Analytics

Frontmatter
Facing Many Objectives for Fairness in Machine Learning

Fairness is an increasingly important topic in the world of Artificial Intelligence. Machine learning techniques are widely used nowadays to solve huge numbers of problems, but those techniques may be biased against certain social groups due to different reasons. Using fair classification methods we can attenuate this source of discrimination. Nevertheless, there are many valid fairness definitions which may be mutually incompatible. The aim of this paper is to propose a method which generates fair solutions for machine learning binary classification problems with one sensitive attribute. As we want accurate, fair and interpretable solutions, our method is based on Many-Objective Evolutionary Algorithms (MaOEAs). The decision space represents hyperparameters for training our classifiers, which are decision trees, while the objective space is a four-dimensional space representing the quality of the classifier in terms of an accuracy measure, two contradictory fairness criteria and an interpretability indicator. Experimentation has been done using four well-known fairness datasets. As we will see, our algorithm generates good solutions compared to previous work, and a presumably well-populated Pareto-optimal population is found, so that different classifiers could be used depending on our needs.

David Villar, Jorge Casillas
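
The four-dimensional objective space described (accuracy, two fairness criteria, interpretability) can be illustrated with the evaluation function below; the concrete fairness measures and the interpretability proxy are illustrative choices, not necessarily those used in the paper. Inputs are assumed to be NumPy arrays with binary labels and a binary sensitive attribute.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.tree import DecisionTreeClassifier

def evaluate_objectives(X, y, sensitive, **tree_params):
    """Objective vector for one hyperparameter setting (larger is better for each entry)."""
    clf = DecisionTreeClassifier(random_state=0, **tree_params).fit(X, y)
    pred = clf.predict(X)
    g0, g1 = (sensitive == 0), (sensitive == 1)

    acc = accuracy_score(y, pred)
    # Two (often conflicting) group-fairness measures:
    dem_parity_diff = abs(pred[g0].mean() - pred[g1].mean())
    def tpr(mask):
        pos = mask & (y == 1)
        return pred[pos].mean() if pos.any() else 0.0
    equal_opp_diff = abs(tpr(g0) - tpr(g1))
    interpretability = 1.0 / clf.tree_.node_count   # smaller trees read more easily
    return np.array([acc, -dem_parity_diff, -equal_opp_diff, interpretability])
```

A many-objective evolutionary algorithm would then search the hyperparameter space using vectors like this as fitness.
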
A Streaming Approach for Association Rule Analysis of Spanish Politics on Twitter

The technological era in which we live has brought an exponential rise in the quantity of data generated daily on the Internet. Social networks, and particularly Twitter, have been one of the most disruptive factors in this era, allowing people to easily share opinions and ideas. Data generated in this social network are an example of streams, which are characterized by the challenges that arise from their particular features: continuous, unlimited, high-speed arrivals that demand fast reaction and change over time (known as concept drifts). The dynamism that characterizes this type of problem requires a streaming analysis in order to perform an adequate treatment. In this situation, data stream mining appears as an emergent field of data science with specialized machine learning techniques suited to the nature of streams. One of the most prominent tasks in this field is association stream mining, which focuses on the problem of dynamically extracting interesting association rules from data features in a situation where it is not possible to assume an a priori data structure and these data features evolve over time. This paper aims to carry out a proof of concept focused on politics by studying a real collection of tweets related to the 2019 Spanish Investiture process. Thereby, the Fuzzy-CSar-AFP algorithm has been applied in order to carry out an online analysis of association rules among a collection of terms of interest from our Twitter database.

Pedro J. López, Elena Ruiz, Jorge Casillas
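
Fuzzy-CSar-AFP itself is not described in the abstract; the minimal sliding-window co-occurrence sketch below only illustrates what extracting associations among terms of interest from a tweet stream means in this setting. Window size, thresholds and the rule form are assumptions.

```python
from collections import Counter, deque
from itertools import combinations

class SlidingWindowAssociations:
    """Toy stream miner: keep the last `window` tweets (as sets of terms of interest)
    and report term pairs whose confidence exceeds a threshold."""
    def __init__(self, window=1000):
        self.window = deque(maxlen=window)

    def add(self, terms: set):
        self.window.append(frozenset(terms))

    def rules(self, min_support=20, min_confidence=0.6):
        singles, pairs = Counter(), Counter()
        for tweet in self.window:
            singles.update(tweet)
            pairs.update(frozenset(p) for p in combinations(sorted(tweet), 2))
        found = []
        for pair, count in pairs.items():
            if count < min_support:
                continue
            a, b = tuple(pair)
            for lhs, rhs in ((a, b), (b, a)):
                confidence = count / singles[lhs]
                if confidence >= min_confidence:
                    found.append((lhs, rhs, confidence))
        return found
```
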
On the Trade-off Between Robustness and Complexity in Data Pipelines

Data pipelines play an important role throughout the data management process, whether they are used for data analytics or machine learning. Data-driven organizations can make use of data pipelines for producing good quality data applications. Moreover, data pipelines ensure end-to-end velocity by automating the processes involved in extracting, transforming, combining, validating, and loading data for further analysis and visualization. However, the robustness of data pipelines is equally important, since unhealthy data pipelines can add more noise to the input data. This paper identifies the essential elements for a robust data pipeline and analyses the trade-off between data pipeline robustness and complexity.

Aiswarya Raj Munappy, Jan Bosch, Helena Homström Olsson
Big Data Quality Models: A Systematic Mapping Study

In the last decade, we have witnessed a considerable increase in projects based on big data applications and an evident growing interest in implementing these kinds of systems. It has become a great challenge to assure the expected quality in Big Data contexts. In this paper, a Systematic Mapping Study (SMS) is conducted to reveal which quality models have been analyzed and proposed in the context of Big Data in the last decade, and which quality dimensions support those quality models. The results are presented and analyzed for further research.

Osbel Montero, Yania Crespo, Mario Piatini
Business Process and Organizational Data Quality Model (BPODQM) for Integrated Process and Data Mining

Data Quality (DQ) is a key element in any Data Science project to guarantee that its results provide consistent and reliable information. Both process mining and data mining, as part of Data Science, operate over large sets of data from the organization, carrying out the analysis effort. In the first case, data represent the daily execution of business processes (BPs) in the organization, such as sales process or health process, and in the second case, they correspond to organizational data regarding the organization’s domain such as clients, sales, patients, among others. This separate view on the data prevents organizations from having a complete view of their daily operation and corresponding evaluation, probably hiding useful information to improve their processes. Although there are several DQ approaches and models for organizational data, and a few DQ proposals for business process data, none of them takes an integrated view over process and organizational data. In this paper we present a quality model named Business Process and Organizational Data Quality Model (BPODQM) defining specific dimensions, factors and metrics for quality evaluation of integrated process and organizational data, in order to detect key issues in datasets used for process and data mining efforts.

Francisco Betancor, Federico Pérez, Adriana Marotta, Andrea Delgado
A Checklist for Explainable AI in the Insurance Domain

Artificial intelligence (AI) is a powerful tool to accomplish a great many tasks. This exciting branch of technology is being adopted increasingly across varying sectors, including the insurance domain. With that power arise several complications, one of which is a lack of transparency and explainability of an algorithm for experts and non-experts alike. This brings into question both the usefulness as well as the accuracy of the algorithm, coupled with an added difficulty to assess potential biases within the data or the model. In this paper, we investigate the current usage of AI algorithms in the Dutch insurance industry and the adoption of explainable artificial intelligence (XAI) techniques. Armed with this knowledge, we design a checklist for insurance companies that should help assure quality standards regarding XAI and a solid foundation for cooperation between organisations. This checklist extends an existing checklist developed by SIVI, the standardisation institute for digital cooperation and innovation in Dutch insurance.

Olivier Koster, Ruud Kosman, Joost Visser

Evidence-Based Software Quality Engineering

Frontmatter
Where the Bugs are: A Quasi-replication Study of the Effect of Inheritance Depth and Width in Java Systems

The role of inheritance in the OO paradigm and its inherent complexity has caused conflicting results in the software engineering community. In a seminal empirical study, Basili et al. suggest that, based on a critique of the Chidamber and Kemerer OO metrics suite, a class located deeper in an inheritance hierarchy will introduce more bugs because it inherits a large number of definitions from its ancestors. Equally, classes with a large number of children (i.e., descendants) are difficult to modify and usually require more testing because the class potentially affects all of its children. In this paper, we use a large data set containing bug and inheritance data from eleven Java systems (seven open-source and four commercial) to explore these two research questions. We explore whether it is the case that a class deeper in the hierarchy is more buggy because of its deep position. Equally, we explore whether there is a positive relationship between the number of children and bugs, i.e., whether classes with large numbers of children are indeed more difficult to modify. Results showed no specific trend for classes deeper in the hierarchy to be more buggy vis-a-vis shallower classes; the four commercial systems actually showed a negative relationship. The majority of classes across the hierarchy were also found to have no children, and those classes included the most buggy ones.

Steve Counsell, Stephen Swift, Amjed Tahir
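
The relationship examined here is essentially a correlation between per-class inheritance metrics (DIT, NOC) and bug counts; a minimal analysis sketch over such a dataset follows, with assumed file path and column names rather than the study's actual data.

```python
import pandas as pd
from scipy.stats import spearmanr

# Assumed columns: one row per class with its Depth of Inheritance Tree (DIT),
# Number of Children (NOC) and number of bugs recorded against it.
df = pd.read_csv("class_metrics_and_bugs.csv")   # placeholder path

for metric in ("DIT", "NOC"):
    rho, p = spearmanr(df[metric], df["bugs"])
    print(f"{metric} vs. bugs: Spearman rho={rho:.2f} (p={p:.3f})")

# Contrast classes with and without children, as the study does:
print(df.groupby(df["NOC"] > 0)["bugs"].describe())
```
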
30 Years of Automated GUI Testing: A Bibliometric Analysis

Context: Over the last 30 years, GUIs have changed considerably, becoming an everyday part of our lives through smart phones and other devices. More complex GUIs and a multitude of platforms have increased the challenges when testing software through the GUI. Objective: To visualise how the field of automated GUI testing has evolved by studying the growth of the field; types of publications; influential events, papers and authors; collaboration among authors; and trends in GUI testing. Method: To conduct a bibliometric analysis of automated GUI testing by performing a systematic search of primary studies in Scopus from 1990 to 2020. Results: 744 publications were selected as primary studies. The majority of them were conference papers, the most cited paper was published in 2013, and the most published author has 53 papers. Conclusions: Automated GUI testing has continuously grown. Keywords show that testing applied to mobile interfaces will be the trend in the coming years, along with the integration of Artificial Intelligence and automated exploration techniques.

Olivia Rodríguez-Valdés, Tanja E. J. Vos, Pekka Aho, Beatriz Marín
A Large-Scale Investigation of Local Variable Names in Java Programs: Is Longer Name Better for Broader Scope Variable?

Variables are fundamental elements of software, and their names hold vital clues to comprehending the source code. Ideally, a variable’s name should be informative enough that anyone can quickly understand its role. When a variable’s scope gets broader, the demand for such an informative name becomes higher. Although the standard naming conventions provide valuable guidelines for naming variables, there is a lack of concrete and quantitative criteria regarding what makes a better name. That challenge in naming variables is the motivation of the quantitative investigation conducted in this paper. The investigation collects 637,077 local variables from 1,000 open-source Java projects to get a detailed view of the variable naming trend. The data analysis reveals frequently used terms for variable names, the naming styles, and the length of names when the variable scopes are broad. The results showed that developers prefer to use fully spelled English words or compounded names for broad-scope variables, but they tend to avoid long names; developers often use simple words or abbreviations shorter than seven or eight characters.

Hirohisa Aman, Sousuke Amasaki, Tomoyuki Yokogawa, Minoru Kawahara
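
The reported analysis boils down to relating name length and style to scope breadth over hundreds of thousands of variables; a small pandas sketch of that kind of aggregation follows, with assumed file path and column names (the extraction of variables from Java sources is outside this sketch).

```python
import pandas as pd

# Assumed columns: variable name and the scope size (in lines of code)
# within which the variable is visible.
df = pd.read_csv("local_variables.csv")          # placeholder path
df["name_len"] = df["name"].str.len()
df["scope_bucket"] = pd.cut(df["scope_loc"], bins=[0, 5, 20, 50, 10_000],
                            labels=["tiny", "small", "medium", "broad"])
print(df.groupby("scope_bucket", observed=True)["name_len"].describe())
```
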

Quality in Cyber-physical Systems

Frontmatter
KNN-Averaging for Noisy Multi-objective Optimisation

Multi-objective optimisation is a popular approach for finding solutions to complex problems with large search spaces that reliably yields good optimisation results. However, with the rise of cyber-physical systems emerges a new challenge of noisy fitness functions, whose objective value for a given configuration is non-deterministic, producing varying results on each execution. This leads to an optimisation process that is based on stochastically sampled information, ultimately favouring solutions with fitness values that have coincidentally high outlier noise. In turn, the results are unfaithful due to the large discrepancies between sampled and expectable objective values. Motivated by our work on noisy automated driving systems, we present the results of our ongoing research to counteract the effect of noisy fitness functions without requiring repeated executions of each solution. Our method kNN-Avg identifies the k-nearest neighbours of a solution point and uses the weighted average value as a surrogate for its actually sampled fitness. We demonstrate the viability of kNN-Avg on common benchmark problems and show that it produces comparably good solutions whose fitness values are closer to the expected value.

Stefan Klikovits, Paolo Arcaini
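
A minimal sketch of the kNN-Avg idea as described in the abstract: replace a solution's sampled fitness with a weighted average over its k nearest already-evaluated neighbours. The distance metric and the inverse-distance weighting are assumptions, not necessarily the paper's exact choices.

```python
import numpy as np

def knn_avg_fitness(x, archive_X, archive_f, k=5, eps=1e-9):
    """Surrogate fitness for configuration `x`.

    archive_X: (n, d) array of previously evaluated configurations
    archive_f: (n,) array of their (noisy) sampled fitness values
    """
    d = np.linalg.norm(archive_X - x, axis=1)
    nearest = np.argsort(d)[:k]
    w = 1.0 / (d[nearest] + eps)       # closer neighbours weigh more
    return float(np.average(archive_f[nearest], weights=w))

# Usage inside an optimiser: evaluate each candidate once, store (x, f) in the
# archive, but rank solutions by knn_avg_fitness to damp outlier noise.
```
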

Software Quality Education and Training

Frontmatter
Exercise Perceptions: Experience Report from a Secure Software Development Course

The ubiquitous use of software in critical systems necessitates integrating cybersecurity concepts into the software engineering curriculum so that students studying software engineering have adequate knowledge to securely develop software projects, which could potentially secure critical systems. An experience report of developing and conducting a course can help educators to gain an understanding of student preferences on topics related to secure software development. We provide an experience report related to the ‘Secure Software Development’ course conducted at Tennessee Technological University. We discuss student motivations, as well as positive and negative perceptions of students towards exercises. Based on our findings, we recommend educators to integrate real-world exercises into a secure software development course with careful consideration of tool documentation, balance in exercise diversity, and student background.

Akond Rahman, Shahriar Hossain, Dibyendu Brinto Bose
A Software Quality Course: The Breadth Approach

We present a Software Quality course taught in a MSc program in Computer Science and Engineering. The course takes an overview (‘breadth’) approach, reviewing the most important topics that contribute to the quality of software. The course has been taught traditionally as well as online; we discuss the advantages and disadvantages of both styles and point out what should be kept from the online experience. We also discuss the students’ evaluation and feedback.

Luigia Petre
Students Projects’ Source Code Changes Impact on Software Quality Through Static Analysis

Monitoring and examining source code and quality metrics is an essential task in software development projects. Still, it is challenging to evaluate for educational projects due to the time and effort required from instructors and the constant change during the evolution of the software projects. In this paper, we used an automated approach to analyze the evolution and impact of source code and quality metrics in software engineering projects using static code analysis on each software change (commits and merges). We examined five undergraduate software engineering projects’ changed modules, compilability, and source code and quality metrics (size, complexity, duplication, maintainability, and security). In total, we assessed 12,103 changes from 103 students contributing to the projects. Our approach allowed us to identify trends in the impact of the source code changes in students’ projects, providing insights into behaviors such as technology knowledge deficiencies, issues in continuous integration practices, and software quality degradation. We believe that early, constant feedback on the quality of student software engineering projects can help instructors improve their courses and students enhance their development practices. Tracking of source code evolution could be done via static analysis, and instructors could use the analysis results for teaching.

Sivana Hamer, Christian Quesada-López, Marcelo Jenkins
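
The measurement setup (running static analysis on every change in the history) can be approximated with GitPython plus an off-the-shelf analyzer; in the sketch below, radon's cyclomatic-complexity API stands in for the metric suite actually used in the course, and the approach is limited to Python files purely for illustration.

```python
import git                              # GitPython
from radon.complexity import cc_visit   # stand-in analyzer (Python files only)

def complexity_per_commit(repo_path, max_count=200):
    """Total cyclomatic complexity of the Python code at each commit."""
    repo = git.Repo(repo_path)
    history = []
    for commit in repo.iter_commits("HEAD", max_count=max_count):
        total = 0
        for blob in commit.tree.traverse():
            if blob.type == "blob" and blob.path.endswith(".py"):
                source = blob.data_stream.read().decode("utf-8", errors="ignore")
                try:
                    total += sum(block.complexity for block in cc_visit(source))
                except SyntaxError:     # non-compilable change at this commit
                    continue
        history.append((commit.hexsha[:10], commit.committed_datetime, total))
    return list(reversed(history))      # oldest first
```
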
Backmatter
Metadata
Title
Quality of Information and Communications Technology
Edited by
Prof. Ana C. R. Paiva
Ana Rosa Cavalli
Paula Ventura Martins
Ricardo Pérez-Castillo
Copyright Year
2021
Electronic ISBN
978-3-030-85347-1
Print ISBN
978-3-030-85346-4
DOI
https://doi.org/10.1007/978-3-030-85347-1