2023 | Book

Research Challenges in Information Science: Information Science and the Connected World

17th International Conference, RCIS 2023, Corfu, Greece, May 23–26, 2023, Proceedings

About this Book

This book constitutes the proceedings of the 17th International Conference on Research Challenges in Information Science, RCIS 2023, which took place in Corfu, Greece, during May 23–26, 2023. It focused on the special theme "Information Science and the Connected World".

The scope of RCIS is summarized by the thematic areas of information systems and their engineering; user-oriented approaches; data and information management; business process management; domain-specific information systems engineering; data science; information infrastructures, and reflective research and practice.

The 28 full papers presented in this volume were carefully reviewed and selected from a total of 87 submissions. The book also includes 15 Forum papers and 6 Doctoral Consortium papers. The contributions were organized in topical sections named: Requirements; conceptual modeling and ontologies; machine learning and analytics; conceptual modeling and semantic networks; business process design and computing in the continuum; requirements and evaluation; monitoring and recommending; business process analysis and improvement; user interface and experience; forum papers; doctoral consortium papers. Two-page abstracts of the tutorials can be found in the back matter of the volume.

Table of Contents

Frontmatter

Requirements

Frontmatter
Goal Modelling: Design and Manufacturing in Aeronautics

In aeronautics, the development of a new aircraft is usually organised in sequence: the aircraft is designed first, then its industrial system. Therefore, the industrial system may endure stringent constraints due to aircraft design choices, which can result in suboptimal manufacturing performance. Approaches such as Collaborative Engineering or Concurrent Engineering instead invite different engineering teams to work simultaneously and together, opening up new prospects for product design. In the context of a project that aims at developing methods and tools for co-designing an aircraft and its industrial system, we use Goal-Oriented Requirements Engineering (GORE) to model and understand their respective expectations as well as their dependencies. In this paper, we describe our application of goal modelling through three iterative attempts. We start with an exploratory stage to obtain a global picture of the dependencies between the design of an aircraft nose section and its industrial system, and finish with a focus on a smaller problem in which we identify the key elements of assembly line performance for a given nose design. For each attempt, we describe our results and feedback, and show how we overcame issues raised at the previous stage. We also highlight links to known issues in the practical application of GORE.

Anouck Chan, Anthony Fernandes Pires, Thomas Polacsek, Stéphanie Roussel, François Bouissière, Claude Cuiller, Pierre-Eric Dereux
Cloud Migration High-Level Requirements

With the increasing adoption of Cloud Computing in the industry, new challenges have emerged for information systems design. In this context, many requirements are met by choosing adequate cloud providers, cloud services, and service configurations. To provide design support, it is necessary to understand what drives the selection of each of these elements. It is therefore useful to elicit these high-level requirements in order to gain an overview of what significantly impacts cloud environment selection. Here we focus on a particular case of cloud system design: migrations. Through a qualitative study with cloud migration experts, we identify eleven high-level requirements that drive design decisions. We propose an analysis of these results and two classifications to support the elicitation and analysis of requirements in cloud migrations.

Antoine Aubé, Thomas Polacsek
Idea Browsing on Digital Participation Platforms: A Mixed-Methods Requirements Study

Digital participation platforms (DPP) are websites initiated by local governments through which citizens can post and react to ideas for their city. In practice, the majority of DPP users browse the posted ideas without contributing any. This activity, referred to as lurking, has widely recognized positive outcomes, especially in a citizen participation context. However, it has received little attention. In practice, the idea browsing features available on current DPP are limited, and the literature has neither evaluated the available approaches nor studied the requirements for idea browsing. In this paper, we report on an evaluation of the filterable list, which is the most common idea browsing approach on DPP. Our findings show that it lacks stimulation (a hedonic quality) and call for a more stimulating approach. We therefore conducted 11 semi-structured interviews to collect requirements and found that idea browsing on DPP should be supported by the combination of (1) a stimulating interactive representation, such as circle packing or thematic trees, displayed as an entry point and (2) a filterable list for deeper exploration. This article is the first to study requirements for idea browsing features on DPP.

Antoine Clarinval, Julien Albert, Clémentine Schelings, Catherine Elsen, Bruno Dumas, Annick Castiaux

Conceptual Modeling and Ontologies

Frontmatter
What Do Users Think About Abstractions of Ontology-Driven Conceptual Models?

In a previous paper, we proposed an algorithm for ontology-driven conceptual model abstractions [18]. We have implemented and tested this algorithm over a FAIR Catalog of such models represented in the OntoUML language. This provided evidence for the correctness of the algorithm’s implementation, i.e., that it correctly implements the model transformation rules prescribed by the algorithm, and its effectiveness, i.e., it is able to achieve high compression (summarization) rates over these models. However, in addition to these properties, it is fundamental to test the validity of this algorithm, i.e., that it achieves what it is intended to do, namely provide summarizing abstractions over these models whilst preserving the gist of the conceptualization being represented. We performed three user studies to evaluate the usefulness of the resulting abstractions as perceived by modelers. This paper reports on the findings of these user studies and reflects on how they can be exploited to improve the existing algorithm.

Elena Romanenko, Diego Calvanese, Giancarlo Guizzardi
On the Semantics of Risk Propagation

Risk propagation encompasses a plethora of techniques for analyzing how risk "spreads" in a given system. Albeit commonly used in the technical literature, the very notion of risk propagation turns out to be conceptually imprecise and overloaded. This might also explain the multitude of modeling solutions that have been proposed in the literature. Having a clear understanding of what exactly risk is, how it can be quantified, and in what sense it can be propagated is fundamental for devising high-quality risk assessment and decision-making solutions. In this paper, we build on well-established previous work about the nature of risk and related notions with the goal of providing a proper interpretation of the different notions of risk propagation, as well as revealing and harmonizing the alternative semantics for the links used in common risk propagation graphs. Finally, we discuss how these results can be leveraged in practice to model risk propagation scenarios.

Mattia Fumagalli, Gal Engelberg, Tiago Prince Sales, Ítalo Oliveira, Dan Klein, Pnina Soffer, Riccardo Baratella, Giancarlo Guizzardi
The Omnipresent Role of Technology in Social-Ecological Systems
Ontological Discussion and Updated Integrated Framework

Technology-driven development is one of the main causes of the triple planetary crises of climate change, biodiversity loss and pollution, yet it is also an important factor in the potential mitigation of and adaptation to these crises. In spite of its omnipresence, technology is often overlooked in the discourses of social and environmental sustainability, while in practice sustainability initiatives often draw criticism for favouring technical solutions or oversimplifying the relationships between society, environment and technology. This article extends our RCIS 2022 publication "Conceptual integration for social-ecological systems: an ontological approach" with an ontological examination of technology in two prominent social-ecological systems paradigms, the social-ecological system framework (SESF) and the ecosystem services (ESs) cascade. We ground the ontological analysis of technology in analytical and postphenomenological philosophical literature and effect several re-designs to the initially proposed integrated framework. The main aim of this work is to provide a clearer and theoretically founded semantics of technology within SESF and ESs to improve knowledge representation and facilitate comparability of results in support of decision-making for sustainability.

Greta Adamo, Max Willis

Machine Learning and Analytics

Frontmatter
Detection of Fishing Activities from Vessel Trajectories

This work is part of a design science project where the aim is to develop Machine Learning (ML) tools for analyzing tracks of fishing vessels. The ML models can potentially be used to automatically analyse Automatic Identification System (AIS) data for ships to identify fishing activity. Creating such technology is dependent on having labeled data, but the vast amounts of AIS data produced every day do not include any labels about the activities. We propose a labeling method based on verified heuristics, where we use an auxiliary source of data to label training data. In an evaluation, a series of tests have been done on the labeled data using deep learning architectures such as Long Short-Term Memory (LSTM), Recurrent Neural Network (RNN), 1D Convolutional Neural Network (1D CNN), and Fully Connected Neural Network (FCNN). The data consists of AIS data and daily fishing activity reports from Norwegian waters with a focus on bottom trawlers. Accuracy is higher than or equal to 87% for all deep learning models. Example applications of the trained models show how they can be used in a practical setting to identify likely unreported fishing activities.

Aida Ashrafi, Bjørnar Tessem, Katja Enberg
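
To make the sequence-classification setup concrete, here is a minimal sketch (not the authors' implementation): an LSTM that classifies fixed-length AIS track segments as fishing or non-fishing. The feature set, segment length, and all hyperparameters are illustrative assumptions; real labels would come from the verified heuristics over the daily activity reports.

```python
# Minimal sketch, assuming toy data: an LSTM over fixed-length AIS segments.
import numpy as np
from tensorflow import keras

SEQ_LEN, N_FEATURES = 60, 4  # e.g. 60 AIS points x (speed, course, lat, lon)

model = keras.Sequential([
    keras.layers.Input(shape=(SEQ_LEN, N_FEATURES)),
    keras.layers.LSTM(64),
    keras.layers.Dense(1, activation="sigmoid"),  # P(segment is fishing)
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Stand-in data: X holds AIS segments, y the heuristically derived labels.
X = np.random.rand(256, SEQ_LEN, N_FEATURES).astype("float32")
y = np.random.randint(0, 2, size=256)
model.fit(X, y, epochs=3, batch_size=32, validation_split=0.2)
```
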
A General Framework for Blockchain Data Analysis

Blockchain is a foundational technology that allows application paradigms to shift from trusting humans to trusting machines and from centralized to decentralized control. Along with its explosive growth, blockchain data analysis is becoming increasingly important for both scientific research and commercial applications. Current blockchain analysis systems and frameworks have limitations and weaknesses; they have focused excessively on Bitcoin and a small set of features. This paper presents a framework for blockchain data analysis. The framework is general and can be applied to a wide range of data analyses. Our main contributions are as follows: (i) we formulate the requirements of the framework; (ii) we present the detailed design of the framework, with multiple components to collect, extract, enrich, store, and further process blockchain data; (iii) we implement the framework and evaluate its performance in a specific use case that analyzes token-transferring transactions. We also discuss the potential of the framework for a number of blockchain data analyses.

Anh Luu, Tuan-Dat Trinh, Van-Thanh Nguyen
Reinforcement Learning for Scriptless Testing: An Empirical Investigation of Reward Functions

Testing web applications through the GUI can be complex and time-consuming, as it involves checking the functionality of the system under test (SUT) from the user’s perspective. Random testing can improve test efficiency by automating the process, but achieving good exploration is difficult because it requires uniform distribution over a large search space while also taking into account the dynamic content commonly found in web applications. Reinforcement learning can improve the efficiency of random testing by guiding the generation of test sequences. This is achieved by assigning rewards to specific actions and using them to determine which actions are most likely to lead to a desired outcome. While rewards based on the difference between consecutive states are commonly used in modern tools, they can lead to the Jumping Between States (JBS) problem, where large rewards are generated without significantly increasing exploration. We propose a solution to the JBS problem by combining rewards based on the change of state and a metric to estimate the level of exploration reached in the next state based on the frequency of actions executed. Our results show that this multi-faceted approach increases the exploration efficiency.

Olivia Rodríguez-Valdés, Tanja E. J. Vos, Beatriz Marín, Pekka Aho
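
To illustrate the idea, the sketch below combines a state-difference reward with a frequency-based exploration bonus, mitigating the JBS problem by rewarding rarely executed actions. The state representation, weighting, and names are assumptions, not the paper's exact formula.

```python
# Illustrative sketch of a multi-faceted reward (assumed names and weights).
from collections import Counter

action_counts: Counter = Counter()

def reward(prev_state: frozenset, next_state: frozenset,
           action: str, alpha: float = 0.5) -> float:
    """States are modelled here as frozensets of widget identifiers."""
    action_counts[action] += 1
    # State-change component: fraction of widgets differing between
    # consecutive GUI states (used alone, large jumps cause the JBS problem).
    union = prev_state | next_state
    diff = len(prev_state ^ next_state) / max(len(union), 1)
    # Exploration component: rarely executed actions earn a higher bonus.
    novelty = 1.0 / action_counts[action]
    return alpha * diff + (1 - alpha) * novelty

# Example: an action that changes half the widgets, executed for the first time.
print(reward(frozenset({"a", "b"}), frozenset({"a", "c"}), "click_menu"))
```
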

Conceptual Modeling and Semantic Networks

Frontmatter
DBSpark: A System for Natural Language to SPARQL Translation

Knowledge bases offer clear advantages when compared to traditional databases, mainly due to semantic connections and automated reasoning over large datasets. However, limited knowledge of the specialized knowledge base query language (SPARQL) makes it difficult for most users to freely access these resources. To solve this issue, we propose a question-answering system able to translate natural language questions into SPARQL queries. The presented method is a rule-based approach that integrates information regarding dependency and constituency parsing, WordNet and named entity recognition to capture the structural and semantic representation of the question. The proposed solution is able to handle a wide variety of question types (list, count, yes/no, wh-questions, questions involving rankings, ordinals, and/or superlatives). Moreover, all involved steps except the phrase mapping phase (in which properties and entities from the ontological model are mapped to words from the natural language question) are knowledge base independent. Tests performed over the QALD-9 question-answering dataset using the DBpedia knowledge base have shown that our system obtains state-of-the-art results and a very good time-performance balance.

Laura-Maria Cornei, Diana Trandabat
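
As an illustration of what such a translation produces, consider a hypothetical question/query pair (not DBSpark's actual output): "Who was born in Berlin?" mapped to a DBpedia SPARQL query, executed here with the SPARQLWrapper Python library.

```python
# Hypothetical NL-to-SPARQL output, executed against the public DBpedia endpoint.
from SPARQLWrapper import SPARQLWrapper, JSON

# Question: "Who was born in Berlin?"  ->  assumed translated query:
query = """
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>
SELECT DISTINCT ?person WHERE {
  ?person dbo:birthPlace dbr:Berlin .
} LIMIT 5
"""

endpoint = SPARQLWrapper("https://dbpedia.org/sparql")
endpoint.setQuery(query)
endpoint.setReturnFormat(JSON)
for row in endpoint.query().convert()["results"]["bindings"]:
    print(row["person"]["value"])  # DBpedia URIs of people born in Berlin
```
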
An Automated Patterns-Based Model-to-Model Mapping and Transformation System for Labeled Property Graphs

Due to the increasing collection of highly interconnected and complex datasets, Labeled Property Graphs are gaining importance in extracting meaningful information for decision support. In addition, UML Class Diagrams are still a commonly used modeling technique for representing the main concepts of a domain. Although there are several model-to-model transformation approaches, these mainly focus on moving from class diagrams to relational databases; less work has been done on transforming class diagrams into labeled property graphs. This work constitutes a step forward in filling this gap by (i) using a method that defines a set of patterns to improve the transformation process from class diagrams to labeled property graphs, considering the analytical requirements of a domain, and (ii) proposing a technological system as an instantiation of the method, demonstrating its feasibility and enabling the assessment of its suitability. This system is grounded in a collection of templates for specifying the domain concepts and a library of transformation rules and patterns, and it was evaluated using a widely known dataset to demonstrate the proposed model-to-model transformation approach.

Pedro Guimarães, Ana León, Maribel Yasmina Santos
Improving Conceptual Domain Characterization in Ontology Networks

The perception within the Conceptual Modeling (CM) community that Semantic Interoperability cannot be achieved without the support of an ontology-driven approach has become increasingly consensual. Moreover, the more complex and extensive the application domain of conceptual models, the harder it is to achieve semantic consensus. Therefore, the perception has emerged that ontologies built to describe complex domains should not be overly large or be used in isolation. Ontology Networks arose to cover this issue. The community then had to deal with issues such as different ontologies of the network using the same concept with different meanings, or the same term being used to designate distinct concepts. We developed a framework for classifying ontologies that provides a stable and homogeneous environment to facilitate the ontological analysis process by dealing simultaneously with ontological and domain perspectives. This article presents our proposal, in which conceptualization is used to identify the relationships among the evaluated ontologies. Our goal is to facilitate semantic consensus, providing guidelines and best practices supported by a stable, homogeneous, and repeatable environment.

Beatriz Franco Martins, José Fabián Reyes Román, Oscar Pastor, Moshe Hadad

Business Process Design and Computing in the Continuum

Frontmatter

Open Access

Digital Technology-Driven Business Process Redesign: A Classification Framework

Organizations constantly seek ways to improve their business processes. This often involves using digital technologies to enable process improvements. However, simply substituting existing technology with newer technology has limited value as compared to using the capabilities of digital technologies to introduce changes to business processes. Therefore, process analysts need to understand how the capabilities of digital technologies can be used to redesign business processes. In this paper, we conducted a systematic literature review and examined 40 case studies where digital technologies were used to redesign business processes. We identified that, within the context of business process improvement, capabilities of digitalization, communication, analytics, digital representation, and connectivity can enable business process redesign. Furthermore, we note that these capabilities enable applying nine redesign heuristics. Based on our review, we map how each capability can facilitate the implementation of specific redesign heuristics to improve a business process. Thus, our mapping can aid analysts in identifying candidate redesigns that capitalize on the capabilities of digital technologies.

Kateryna Kubrak, Fredrik Milani, Juuli Nava
Supporting the Implementation of Digital Twins for IoT-Enhanced BPs

IoT-Enhanced Business Processes make use of Internet of Things technology to integrate physical devices into the process as digital actors. Closely related to this topic is the concept of the Digital Twin: a virtual representation of real-world entities and processes that connects to its physical counterpart to represent, simulate, or predict changes in the physical system. Many works focus on supporting the high-fidelity implementation of Digital Twins for specific physical devices; however, few of them consider the process itself as a real-world entity to be integrated into the Digital Twin. In this work, we present a microservice architecture to support the implementation of Digital Twins for IoT-Enhanced Business Processes, considering not only the physical devices but also the process itself and the relationships among them. This architectural solution is supported by a model-driven development approach, which proposes (1) the construction of a BPMN model to represent an IoT-Enhanced Business Process and (2) the application of model transformations to automatically generate both Digital Twin Definition Language (DTDL) models and microservice Java code templates. The DTDL models are used in the implementation of the Digital Twins for the IoT-Enhanced Business Process. The Java code templates are used to facilitate the implementation of the microservices required to deploy the IoT-Enhanced Business Process and its Digital Twins into the proposed architecture and to keep the digital and physical parts synchronised.

Pedro Valderas
Context-Aware Digital Twins to Support Software Management at the Edge

With millions of connected edge gateways, there is a pressing challenge of remotely maintaining containerised software components after the initial release. To support remote update operations, edge software providers have increasingly adopted digital twin-based device management platforms for run-time monitoring and interaction. A common limitation of these solutions is the lack of support for modelling the multi-dimensional context of edge devices deployed in the field, which hinders managing the software in a tailored and context-aware manner. This paper addresses the lack of context-awareness in digital twins required for edge software assignment by introducing two modelling principles, which allow focusing on the device fleet as a whole and capturing the diverse cyber-physical-social context of individual devices. As a proof of concept, these principles were incorporated into an existing digital twin platform. This prototype implementation demonstrates the viability of the proposed modelling principles via a running example in the context of a telemedicine application system.

Rustem Dautov, Hui Song
Adoption of Virtual Agents in Healthcare E-Commerce: A Perceived Value Perspective

Virtual agents help their users find what they need through an interactive dialog. In the healthcare e-commerce market, virtual agents make "virtual consultations" available on the web, which lead to a recommendation for a personalized treatment. While the intention of users to adopt these virtual agents is critical for many organizations, traditional explanatory models such as UTAUT-2 miss key elements specific to virtual agents in healthcare e-commerce. Filling this gap can help organizations better understand the factors leading to the adoption of such solutions and take them into account in the design and launch of their virtual agents. This paper adopts a perceived value perspective and proposes an extended model explaining the adoption of virtual agents and of their recommendations in a healthcare context. We test this model with 903 observations collected via an online survey in collaboration with a major European actor in the food supplement market. Our model provides highly actionable recommendations for practitioners and offers a complementary view on the adoption mechanisms of virtual agents, leading to further research recommendations.

Claire Deventer, Pietro Zidda

Requirements and Evaluation

Frontmatter
Addressing Trust Issues in Supply-Chain Management Systems Through Blockchain Software Patterns

Blockchain technology is a decentralized and distributed ledger that allows for secure, transparent, and immutable tracking of transactions. However, it is not a one-size-fits-all solution for addressing trust issues in the supply chain. In software engineering, design patterns provide a blueprint that developers can follow to solve a specific problem in a structured and efficient manner. In this paper, we identify and discuss reusable blockchain software patterns that can be applied to design trustworthy solutions in supply chain management (SCM). Based on a literature analysis, we define a comprehensive taxonomy of SCM-specific trust issues. We then apply a requirements engineering technique to translate these issues into trust requirements and demonstrate how these requirements can be met by specific blockchain software patterns.

Eddy Kiomba Kambilo, Irina Rychkova, Nicolas Herbaut, Carine Souveyet
Evaluating Process Efficiency with Data Envelopment Analysis: A Case in the Automotive Industry

In some industries, small improvements to processes and profit margins may lead to a significant change in profit; this holds especially in the automotive industry. A popular approach to achieving process improvement is benchmarking, in which the execution of a process is measured and compared between different work units so that improvement opportunities can be identified. This paper reports on our efforts to improve car dealership benchmarking by designing a benchmarking tool for the automotive workshop department that calculates the efficiency of its main process, which we call the Standard Service Process (SSP) in this paper. We achieved this by designing a Data Envelopment Analysis (DEA) Network Slacks-Based Measure (NSBM) model and using it to measure the SSP efficiency of each workshop, taking its sub-processes into account. The model was programmed in R, after which it was verified and extended based on the literature, and the results of the verified model were validated using a real case. In this paper, we show that this has enabled a more insightful assessment of the workshops, so that suggestions for improvement can be automated. In this way, we demonstrate that our approach is appropriate for ranking the efficiency of work units that perform a certain process.

Rutger Kerkhof, Luís Ferreira Pires, Renata Guizzardi

Open Access

A Model of Qualitative Factors in Forensic-Ready Software Systems

Forensic-ready software systems enhance the security posture of an organisation by being designed and prepared for the potential investigation of incidents. Yet, the principal obstacle is defining their exact requirements, i.e., what they should implement. Such requirements need to be on-point and verifiable. However, what exactly comprises a forensic readiness requirement is not fully understood, due to the distinct fields of expertise in software engineering and digital forensics. This paper describes a forensic readiness qualitative factor reference model that enables the formulation of specific requirements for forensic-ready software systems. It organises the qualitative properties of forensic readiness into a taxonomy, which can then be used to formulate a verifiable requirement targeted at a specific quality. The model is then applied to an automated valet parking service to define requirements addressing identified inadequacies regarding a potential incident investigation.

Lukas Daubner, Raimundas Matulevičius, Barbora Buhnova

Monitoring and Recommending

Frontmatter
Monitoring Object-Centric Business Processes: An Empirical Study

Monitoring dashboards provide an appropriate way of presenting a multitude of information on running business processes to the actors involved. Essential components of a monitoring dashboard are charts that visualise this information in an aggregated, intuitive and useful way. A popular representative is the Sunburst Chart, a pie chart with several colour-coded circles that can visualise a hierarchical structure. This visualisation technique seems particularly suited for monitoring object-centric processes. In this paper, a procedure is presented for automatically deriving a sunburst chart from the patterns of a relational process structure describing an object-centric process. To investigate the readability, comprehension, and general acceptance of sunburst charts in the context of monitoring an object-centric process, an empirical study with 157 participants was conducted. As a key observation of this study, the majority of the participants could read and comprehend the sunburst chart very well; for example, on average more than 90% of the multiple-choice questions were answered correctly. Overall, sunburst charts offer a promising perspective for the monitoring of large object-centric processes.

Lisa Arnold, Marius Breitmayer, Manfred Reichert
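
For readers unfamiliar with the chart type, the toy sketch below renders a small object-centric hierarchy (object type → object instance → activity) as a sunburst chart; the data and the use of plotly are illustrative assumptions, not the paper's tooling.

```python
# Toy sunburst over an assumed object-centric hierarchy.
import plotly.express as px

data = {
    "object":   ["Order", "Order", "Order", "Invoice", "Invoice"],
    "instance": ["O1", "O1", "O2", "I1", "I2"],
    "activity": ["Create", "Approve", "Create", "Send", "Send"],
    "count":    [1, 1, 1, 1, 1],
}
# Inner circle: object types; middle ring: instances; outer ring: activities.
fig = px.sunburst(data, path=["object", "instance", "activity"], values="count")
fig.show()
```
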
A Peek into the Working Day: Comparing Techniques for Recording Employee Behaviour

Detailed recordings of employee behaviour can give organisations valuable insights into their work processes. However, recording techniques each have their advantages and disadvantages in terms of their obtrusiveness for participants, the richness of information they capture, and the risks that are involved. In an effort to systematically compare recording techniques, we conducted a multiple-case study at a multinational professional services organisation. We followed six participants for a working day, comparing the outcomes from non-participant observation, screen recording, and timesheet techniques. We generated 136:04 h of data and 849 records of activities. We identified 58 differences between the techniques. The results show that the use of only one technique will not produce a complete and accurate record of the activities that occur on the screen (online), in the hallway (offline), and in the extra hours (overtime). Therefore, it is vital to choose a technique wisely, taking into account the type of information it does not capture. Furthermore, this study identifies some open challenges with respect to accurately recording employee behaviour.

Tea Šinik, Iris Beerepoot, Hajo A. Reijers
Context-Aware Recommender Systems: Aggregation-Based Dimensionality Reduction

Context-aware recommender systems (CARS) rest on a multidimensional rating function: Users × Items × Context → Ratings. This multidimensional modelling should improve the quality of the recommendation process, but unfortunately, it is rare or even impossible to have ratings for all possible cases of context. Our objective is therefore twofold: (i) to reduce the dimensionality of the contextual information (in order to reduce the sparsity), which leads us to (ii) propose a technique for aggregating the ratings associated with the aggregated dimensions. To do this, we organize the contextual information in the CARS utility matrix according to hierarchical dimensions, as is done in OLAP (OnLine Analytical Processing), and we use a regression-based approach for the rating aggregation according to previously defined hierarchies. Our approach supports multiple dimensions and hierarchical aggregation of ratings. It was validated on two real-world datasets.

Elsa Negre, Franck Ravat, Olivier Teste
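
A minimal sketch of the dimensionality-reduction step, under assumed toy data: ratings attached to a detailed context value are rolled up one level of an OLAP-style hierarchy and then aggregated (a simple mean is used here for brevity, where the paper uses a regression-based aggregation along the predefined hierarchies).

```python
# Rolling contextual ratings up one level of an assumed context hierarchy.
import pandas as pd

ratings = pd.DataFrame({
    "user":    [1, 1, 2, 2],
    "item":    [10, 10, 10, 20],
    "weather": ["sunny", "rainy", "sunny", "rainy"],  # context dimension
    "rating":  [5, 3, 4, 2],
})

# One level of the hierarchy: leaf value -> parent level ("any weather").
hierarchy = {"sunny": "any", "rainy": "any"}
ratings["weather"] = ratings["weather"].map(hierarchy)

# Aggregate the ratings that now fall into the same (user, item, context) cell.
aggregated = ratings.groupby(["user", "item", "weather"],
                             as_index=False)["rating"].mean()
print(aggregated)
```
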

Business Process Analysis and Improvement

Frontmatter

Open Access

Discovery of Improvement Opportunities in Knock-Out Checks of Business Processes

Overprocessing is a source of waste that occurs when unnecessary work is performed in a process. Overprocessing is often found in application-to-approval processes since a rejected application does not add value, and thus, work that leads to the rejection constitutes overprocessing. Analyzing how the knock-out checks are executed can help analysts to identify opportunities to reduce overprocessing waste and time. This paper proposes an interpretable process mining approach for discovering improvement opportunities in the knock-out checks and recommending redesigns to address them. Experiments on synthetic and real-life event logs show that the approach successfully identifies improvement opportunities while attaining a performance comparable to black-box approaches. Moreover, by leveraging interpretable machine learning techniques, our approach provides further insights on knock-out check executions, explaining to analysts the logic behind the suggested redesigns. The approach is implemented as a software tool and its applicability is demonstrated on a real-life process.

Katsiaryna Lashkevich, Lino Moises Mediavilla Ponce, Manuel Camargo, Fredrik Milani, Marlon Dumas

Open Access

Persuasive Visual Presentation of Prescriptive Business Processes

Prescriptive process monitoring methods recommend interventions during the execution of a case that, if followed, can improve performance. Research on prescriptive process monitoring so far has focused mainly on improving the underlying algorithms and providing suitable explanations for recommendations. Empirical works indicate, though, that process workers often do not follow recommendations even if they understand them. Drawing inspiration from the field of persuasive technology, we developed and evaluated a visualization that nudges process workers towards accepting a recommendation, following a design science approach. Our evaluation points towards the feasibility of the visualization and provides insights into how users perceive different persuasive elements, thus providing a basis for the design of future systems.

Janna-Liina Leemets, Kateryna Kubrak, Fredrik Milani, Alexander Nolte
TraVaG: Differentially Private Trace Variant Generation Using GANs

Process mining is rapidly growing in the industry. Consequently, privacy concerns regarding sensitive and private information included in event data, used by process mining algorithms, are becoming increasingly relevant. State-of-the-art research mainly focuses on providing privacy guarantees, e.g., differential privacy, for trace variants that are used by the main process mining techniques, e.g., process discovery. However, privacy preservation techniques for releasing trace variants still do not fulfill all the requirements of industry-scale usage. Moreover, providing privacy guarantees when there exists a high rate of infrequent trace variants is still a challenge. In this paper, we introduce TraVaG as a new approach for releasing differentially private trace variants based on Generative Adversarial Networks (GANs) that provides industry-scale benefits and enhances the level of privacy guarantees when there exists a high ratio of infrequent variants. Moreover, TraVaG overcomes shortcomings of conventional privacy preservation techniques such as bounding the length of variants and introducing fake variants. Experimental results on real-life event data show that our approach outperforms state-of-the-art techniques in terms of privacy guarantees, plain data utility preservation, and result utility preservation.

Majid Rafiei, Frederik Wangelik, Mahsa Pourbafrani, Wil M. P. van der Aalst
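
The generator/discriminator setup can be sketched as follows. This schematic is non-private and illustrative only: TraVaG's actual architecture, variant encoding, and differential-privacy mechanism are not reproduced, and all sizes are assumptions.

```python
# Schematic (non-private) GAN over one-hot encoded trace variants.
import torch
import torch.nn as nn

N_VARIANTS, NOISE_DIM, BATCH = 100, 16, 32  # assumed sizes

G = nn.Sequential(nn.Linear(NOISE_DIM, 64), nn.ReLU(),
                  nn.Linear(64, N_VARIANTS), nn.Softmax(dim=1))
D = nn.Sequential(nn.Linear(N_VARIANTS, 64), nn.LeakyReLU(0.2),
                  nn.Linear(64, 1), nn.Sigmoid())
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCELoss()

# Stand-in for real variant data drawn from an event log.
real = torch.eye(N_VARIANTS)[torch.randint(0, N_VARIANTS, (BATCH,))]

for step in range(200):
    # Train discriminator: real variants vs. generated ones.
    fake = G(torch.randn(BATCH, NOISE_DIM))
    loss_d = (bce(D(real), torch.ones(BATCH, 1)) +
              bce(D(fake.detach()), torch.zeros(BATCH, 1)))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()
    # Train generator: try to fool the discriminator.
    fake = G(torch.randn(BATCH, NOISE_DIM))
    loss_g = bce(D(fake), torch.ones(BATCH, 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()

# Sampling the trained generator yields synthetic variant distributions.
```
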

User Interface and Experience

Frontmatter
When Dashboard’s Content Becomes a Barrier - Exploring the Effects of Cognitive Overloads on BI Adoption

Decision makers in organizations strive to improve the quality of their decisions. One way to improve that process is to objectify the decisions with facts. Big data, business analytics, business intelligence, and more generally data-driven Decision Support Systems (data-driven DSS) intend to achieve this. Organizations invest massively in the development of data-driven DSS and expect them to be adopted and to effectively support decision makers. This raises many technical and methodological challenges, especially regarding the design of dashboards, which can be seen as the visible tip of the data-driven DSS iceberg and which play a major role in the adoption of the entire system. This paper advances early empirical research on one possible root cause for data-driven DSS dashboard adoption or rejection, namely the dashboard content. We study the effect of dashboard over- and underloading on traditional technology adoption models and try to uncover the trade-offs with which data-driven DSS interface designers are confronted when creating new dashboards. The result is a Dashboard Adoption Model, enriching the seminal TAM model with new content-oriented variables to support the design of more supportive data-driven DSS dashboards.

Corentin Burnay, Sarah Bouraga, Mathieu Lega
The Effect of Visual Information Complexity on Urban Mobility Intention and Behavior

Encouraging soft mobility practices is a central issue for the ecological transition. Green information systems, and more specifically self-tracking applications, are tools that can be used to raise awareness and change behavior. Based on the theoretical framework of visual complexity, this paper examines how the level of visual complexity of a mobile application influences users' urban mobility intentions and behaviors. We conducted two experimental studies. The first investigated how the visual complexity of homepages affects mobility intentions, using an application that measures one's carbon footprint in a situational setting. The first result of our research is that moderate visual complexity of information positively influences the acceptability of a mobile application as well as mobility intentions. The second experimental study was divided into two parts: first, participants responded to our questionnaire; second, in a longitudinal approach, 51 subjects used the application over a 3-month period. The conceptual framework was tested using regression analyses. We find that the intention to change behavior influences responsible urban mobility behavior. However, our experiment shows that the visual complexity of information does not have a significant influence on behavior. We then propose theoretical implications.

Thomas Chambon, Ulysse Soulat, Jeanne Lallement, Jean-Loup Guillaume
Interoperability of Open Science Metadata: What About the Reality?

Open Science aims at sharing results and data widely between different research domains. Interoperability is one of the keys to enabling the exchange and crossing of data between different research communities. In this paper, we assess the state of the interoperability of Open Science datasets from various communities. The diversity of metadata schemata of these datasets from different sources does not allow for native interoperability, highlighting the need for matching tools. The question is whether current metadata schema matching tools are sufficiently efficient to achieve interoperability between existing datasets. In our study, we first define our vision of interoperability by transversally considering the technical and semantic aspects when dealing with metadata schemata coming from various domains. We then evaluate the interoperability of some datasets from the medical domain and the Earth system study domain using acknowledged matching tools. We evaluate the performance of the tools and then investigate the correlation between various metrics characterizing the schemata and the performance of their mapping. This paper identifies complementary ways to improve dataset interoperability: (1) adapting mapping algorithms to the issues raised by metadata schema matching; (2) adapting metadata schemata, for instance by sharing a core vocabulary and/or reusing existing standards; (3) combining various trends in a more complex interoperability approach that would also make the (RDA) crosswalks between schemata available and operational and would promote good practices in metadata labeling and documentation.

Vincent-Nam Dang, Nathalie Aussenac-Gilles, Imen Megdiche, Franck Ravat

Forum Papers

Frontmatter
Online-Notes System: Real-Time Speech Recognition and Translation of Lectures

Student mobility gives students the opportunity to visit different universities across the world. Not all courses are offered in English or in other languages foreign students might understand, so they often face problems following the lectures. To resolve these problems, we propose Online Notes (ON), a real-time speech recognition and translation system. The system is trained using existing course materials. During lectures, the lecturer wears a microphone, while the students can follow the lecture through the ON system. After the lecture, the professor can edit and update the transcripts and translations, and students have the option to listen to the lecture and read the materials. We have conducted a series of one-time tests, and we are currently in the middle of two whole-semester pilot tests at the University of Ljubljana. In the tests, a speech recognition accuracy of up to 87.4% was achieved. Preliminary results have shown that the tool is especially useful for students who either do not understand the language of the course or understand it to a limited extent. Additionally, the transcripts of the lectures have proven useful for creating additional learning materials.

Tjaša Jelovšek, Marko Bajec, Iztok Lebar Bajec, Kaja Gantar, Slavko Žitnik
Temporal Relation Extraction from Clinical Texts Using Knowledge Graphs

An integral task for many natural language processing applications is the extraction of the narrative process described in a document. To understand such processes, we need to recognize the mentioned events and their temporal component. With this information, we can understand the sequence of events, i.e., construct a timeline. The main task dealing with the temporal component of events is temporal relation extraction, whose goal is to determine how the times of two events are related to one another. For example, such a relation would tell us whether one event happened before or after another. In this paper, we propose a novel architecture for a temporal relation extraction model that combines text information with information captured in the form of a temporal event graph. We present our initial results on the domain of clinical documents. Using a temporal event graph with only correct relations, the model achieves an F1 score of 83.6%, which is higher than any of our state-of-the-art baseline models. This shows the promise of our proposed approach.

Timotej Knez, Slavko Žitnik
Domain TILEs: Test Informed Learning with Examples from the Testing Domain

Test Informed Learning with Examples (TILE) helps educators add testing to their programming courses early, easily and in a subtle way. Currently, TILE describes how to add informed examples of testing to test runs, test cases, and messages. In this paper, we extend TILE by incorporating information from the testing domain itself into the examples. Our non-conclusive results from a survey with 300 participants indicate that using TILE results in students creating more tests that cover more parts of the code.

Niels Doorn, Tanja Vos, Beatriz Marín, Christoph Bockisch, Steffen Dick, Erik Barendsen
Using GUI Change Detection for Delta Testing

Current software development processes in the industry are designed to respond to rapid modification or changes in software features. Delta testing is a technique used to check that the identified changes are deliberate and neither compromise existing functionality nor result in introducing new defects. This paper proposes a technique for delta testing at the Graphical User Interface (GUI) level. We employ scriptless testing and state-model inference to automatically detect and visualize GUI changes between different versions of the same application. Our proposed offline change detection algorithm compares two existing GUI state models to detect changes. We present a proof of concept experiment with the open-source application Notepad++, which allows automatic inference and highlights GUI changes. The results show that our technique is a valuable amplification of scriptless testing tools for delta testing.

Fernando Pastor Ricós, Rick Neeft, Beatriz Marín, Tanja E. J. Vos, Pekka Aho
Enterprise Modeling for Machine Learning: Case-Based Analysis and Initial Framework Proposal

Artificial Intelligence (AI) continuously paves its way into even the most traditional business domains. This particularly applies to data-driven AI, like machine learning (ML). Several data-driven approaches like CRISP-DM and KDD exist that help develop and engineer new ML-enhanced solutions. A new breed of approaches, often called canvas-driven or visual ideation approaches, extends the scope with a perspective on the business value an ML-enhanced solution shall enable. In this paper, we reflect on two recent ML projects. We show that the data-driven and canvas-driven approaches cover only some of the information necessary for developing and operating ML-enhanced solutions. Consequently, we propose to put ML into an enterprise context, for which we sketch a first framework and discuss the role enterprise modeling can play.

Dominik Bork, Panagiotis Papapetrou, Jelena Zdravkovic
Towards Creating a Secure Framework for Building Mirror World Applications

The growing interest in Virtual Reality and Augmented Reality technologies has led to the development of the concepts of Mirror Worlds, the Metaverse, and the AR Cloud. Different companies are creating their own versions of the Metaverse, and Mirror Worlds serve as a foundation for these experiences. As these technologies continue to grow, there is a need for a systematic way of designing and implementing applications in virtual spaces. There are emerging technologies and standards that could form the basis of such a framework; however, privacy concerns like data privacy and location tracking need to be addressed while designing any potential framework or solution.

Panos Mazarakis
Towards a Secure and Privacy Compliant Framework for Educational Data Mining

Digital education technology has become an essential component of the educational process and has greatly impacted the learning experience in schools and higher education. In recent years especially, there has been a surge in the adoption of educational applications and the use of e-learning platforms, resulting in a vast amount of educational data being generated. Learning Analytics and Educational Data Mining can assist in extracting valuable information from educational data, which can enhance the learning experience for both learners and educators. However, there is a growing need to address privacy concerns and ensure data security, and research focus has thus shifted to the development of privacy frameworks and principles. This paper highlights the significance of data protection and privacy regulations, such as the GDPR, in utilizing data mining methods in educational environments, aiming to establish the foundations for the creation of a privacy-focused framework for data processing within educational platforms.

Polydorou Eleni
Internet of Cloud (IoC): The Need of Raising Privacy and Security Awareness

In our fast-paced society, the interconnection of millions of devices on the Internet means that a great amount of complex data is shared, received, and managed, facilitating communication among users and devices globally. As a result, users from different educational backgrounds and professions are exposed to cyber threats and risks. Current megatrends such as Industry 4.0, the Internet of Things (IoT), Cloud Computing, the Metaverse, and 6G technology introduce a new era of user experience through strong connectivity and cost-free, highly adaptable services; however, they raise major data privacy and security concerns. These trends can improve people's lives, yet they are also "attractive vulnerable targets" for cyber-attacks intent on harming and disrupting people's daily lives. This paper focuses on two main pillars: the fundamental privacy and security challenges of the synergy between the Internet of Things (IoT) and the Cloud, which creates the Internet of Cloud (IoC), and the need to raise awareness to prevent violations of users' data on the IoC. The scope of this paper is to indicate the importance of privacy and security concerns in the IoC and to explain why privacy awareness training should be a priority.

Asimina Tsouplaki
Comparative Study of Unsupervised Keyword Extraction Methods for Job Recommendation in an Industrial Environment

Automatic keyword extraction has important applications in various fields such as information retrieval, text mining and automatic text summarization. Different models of keyword extraction exist in the literature. In most cases, these models are designed for English-language documents, including scientific journals, news articles, or web pages. In this work, we evaluate state-of-the-art unsupervised approaches for extracting keywords from French-language Curricula Vitae (CVs) and job offers. The goal is to use these keywords to match a candidate and a job offer as part of a job recommendation system. Our evaluation showed that statistical baselines obtain good results with an interesting processing time in an industrial context. It also allowed us to highlight, on the one hand, biases related to pre-trained word embedding models on corpora of a different nature than CVs and job offers, and on the other hand, the difficulties of annotation within the framework of job search platforms.

Bissan Audeh, Maia Sutter, Christine Largeron
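
As a concrete instance of the statistical baselines mentioned above, the sketch below ranks candidate keywords by TF-IDF score over toy French snippets; the corpora, preprocessing, and evaluation protocol of the study are not reproduced.

```python
# TF-IDF keyword ranking over toy French text (illustrative only).
from sklearn.feature_extraction.text import TfidfVectorizer

docs = [
    "Ingénieur logiciel avec expérience en Python et apprentissage automatique.",
    "Offre d'emploi : développeur web, maîtrise de JavaScript exigée.",
]
vec = TfidfVectorizer()
tfidf = vec.fit_transform(docs)

# Top-5 candidate keywords for the first document (the CV).
terms = vec.get_feature_names_out()
scores = tfidf[0].toarray().ravel()
print(sorted(zip(terms, scores), key=lambda t: -t[1])[:5])
```
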
A Meta-model for Digital Business Ecosystem Design

The Digital Business Ecosystem (DBE) theory has evolved to facilitate the functioning of open business networks by adopting the ecosystem paradigm from nature in a shared digital environment. While a DBE enables its actors to pursue diverse interests, it also places high demands on managing the DBE's resilience. Current research lacks support for integrating a DBE-based business model with the data structure of its supporting information system, which is needed to design and monitor resilience and to indicate the adaptations required by changes in the DBE's actors, their engagement, or their performance balance. We propose and instantiate a meta-model that describes the entities relevant to a DBE's design, using the CIVIS ecosystem. The meta-model provides a foundation for a modelling language for managing the resilience of DBEs.

Chen Hsi Tsai, Jelena Zdravkovic, Janis Stirna
Supporting Students in Team-Based Software Development Projects: An Exploratory Study

Team-based software development projects (TBSDP) are a useful instrument for exposing students to teamwork in an industry-like working context. However, TBSDP expose students to a number of challenges. This paper has a twofold objective: first, to understand the practices and challenges that students face in TBSDP in an Agile context; second, to investigate whether the use of a teaching domain-specific information system, a learning dashboard, could help them improve these practices and face these challenges. We conducted a multi-instrument exploratory study at the Polytechnic University of Catalonia. We gathered information about the progress of 39 students organised in 6 teams during one semester by mining two software repositories and conducting a number of questionnaires and interviews related both to working practices and to students' perception of learning dashboard adoption. Results show that many students do not follow adequate practices for some TBSDP activities. On the other hand, metrics informing about the use of code repositories and task management were generally well understood and perceived as potentially useful by students when shown in a learning dashboard. We conclude that the adoption of learning dashboards is a viable approach to improving student practices in TBSDP, but it needs to be carefully considered which metrics provide the most value for facing the identified challenges.

Carles Farré, Xavier Franch, Marc Oriol, Alexandra Volkova
Ontology of Product Provenance for Value Networks

The economic issues surrounding value networks have received much attention from the research community in business modeling. However, other aspects can also influence the success of a network. One of them is sharing subjacent information that has value for the actors involved and fulfills a consumer’s business need, such as product provenance. Thus, it is also essential to address these aspects in business modeling. Considering that provenance is a significant value to be explored, this work proposes an ontology for modeling value networks with an indication of provenance, focused on geographical indications. The ontology allows configuring models that show different ways of sharing information to assist business analysts in making strategic decisions about their value propositions for consumers. The Design Science Methodology and an Ontology Engineering methodology (SAMOD) guided the design of the ontology. Technical Action Research (TAR) supported the validation of the ontology in a Brazilian biotechnology company providing organic beverages. Expert opinion helped evaluate the utility of the ontology according to the support to decision-making provided by its derived models. This research also provides new insights into the importance of considering provenance information essential for modeling business networks, especially in economic and environmental sustainability cases.

Lohanna Saraiva, Patricio Silva, Angelica Castro, Claudia Ribeiro, Joao Moreira
A Data Value Matrix: Linking FAIR Data with Business Models

Data is the raw material of digitization, but its economic use and potential economic benefits are not always clear. We therefore show that the processing of data, especially according to the FAIR principles, plays an enormous role in enabling new business models and improving existing ones. This is being tested within the EU-funded project Marispace-X, part of the Gaia-X initiative. The maritime domain in particular still suffers from a lack of digitization: while more data is being collected than ever before, this data is often kept in silos and hardly reused or even shared. This work therefore involved linking the FAIR principles to a data value chain, which together with business model dimensions forms a data value matrix. The application of this matrix was carried out together with practice partners using the example of maritime data processing in the "offshore wind" use case, and the matrix can be used and adapted as an analysis tool for data-driven business models.

Ben Hellmanzik, Kurt Sandkuhl
An Information Privacy Competency Model for Online Consumers

E-commerce has taken a prominent role in our everyday life. However, the increasing disclosure of personal information as a prerequisite for joining such services, in combination with consumers' low privacy knowledge, has raised serious concerns regarding the protection of information privacy. To address this issue, the current study proposes a novel framework for the design of an information privacy competency model for online consumers, incorporating Protection Motivation Theory and the Big Five personality theory. Additionally, it synthesizes the results into an indicative privacy competency model for online consumers. The results of this paper can act as a guide for the development of privacy awareness and training programs.

Aikaterini Soumelidou, Thanos Papaioannou
DECENT: A Domain Specific Language to Design Governance Decisions

Decentralized ecosystems, such as Bitcoin, claim to be decentralized to avoid power concentrations such as commercial banks. This is perhaps true for their operations, but often not for their governance, which is about deciding the rules for monitoring and controlling protocols. In previous work, we developed DECENT, a domain-specific language (DSL) to conceptualize the domain of decentralized governance design. In this paper, we focus on deriving governance design decisions based on the DECENT language. We do so by taking the case of Fractional Reserve Banking (FRB), which concerns the governance rules allowing commercial banks to create and destroy money. As many banks are licensed to do FRB under the control of national central banks and the European Central Bank (ECB), this is already a case of a decentralized ecosystem. The governance design decisions are developed in close cooperation with our case study partner, a commercial bank.

Fadime Kaya, Francisco Perez, Joris Dekker, Jaap Gordijn

Doctoral Consortium Papers

Frontmatter
Business User-Oriented Recommender System of Data

Companies nowadays are increasingly dependent on data. In an environment that is more dynamic than ever, they are looking for tools to leverage those data and obtain valuable information in a rapid and flexible way. One way to achieve this is by using Data-Driven Decision Support Systems (Data-Driven DSS). In this project, I focus on one such type of DSS, namely Self-Service Business Intelligence (SSBI) systems. These systems are designed specifically to avoid the involvement of the IT department when creating business reports by empowering businesspeople to produce their own reports, thereby reducing the time-to-release of a given report and improving the responsiveness of companies. However, business decision-makers face barriers when developing their own reports. These challenges are related to current self-service features that are not sufficiently adapted to their business needs and to their lack of technical knowledge. The objective of my project is to build a framework based on Artificial Intelligence (AI) techniques, such as Natural Language Processing, Semantic Systems, and Recommender Systems, to solve one of the main challenges faced by businesspeople: data picking within large, technical databases. These AI systems offer a number of benefits that are strongly linked to the problems encountered by business users in the data picking process. This paper introduces the three main research questions of my thesis and positions them in the current literature. It then elaborates on the different theoretical, methodological and empirical contributions I plan to advance as part of my project.

Sarah Pinon
Secure Infrastructure for Cyber-Physical Ranges

Industrial systems (IS), including critical ones, are swiftly moving towards integrating elements of modern Information Technology (IT) into their formerly air-gapped Operational Technology (OT) architectures. Naturally, the more interconnected such systems become, the more alluring they are to attackers. Concurrently, the twenty-four-seven availability of these systems makes it harder for defenders to promptly apply contemporary security controls. In this context, cyber ranges have emerged as a suitable complementary solution for better comprehending and subsequently tackling the relevant risks without endangering the operation of the real systems. This work aspires to contribute a reference architecture for designing and developing cross-sector critical infrastructure (CI) cyber-physical ranges and security testbeds. A second key goal is to demonstrate the soundness of the proposed reference architecture through the implementation and evaluation of a number of cyber range instances specifically tailored for CIs of interest, including manufacturing, energy, and healthcare.

Vyron Kampourakis
Guidelines for Developers and Recommendations for Users to Mitigate Phishing Attacks: An Interdisciplinary Research Approach

Phishing attacks are common these days. If successful, these attacks cause psychological, emotional, and financial damage to the victims. Such damages may have a long-term impact. The overall objective of this Ph.D. research is to contribute to mitigating phishing victimization risks by exploring phishing prevalence, user-related risk factors, and vulnerable target groups and by designing (1) guidelines for social website developers focused on internet user vulnerabilities and (2) recommendations for users to avoid such attacks. The Ph.D. research acknowledges that phishing attacks are technical in nature, while the impact is financial and psychological. Therefore, an interdisciplinary research approach focusing on empirical research methods from social sciences (i.e., focus groups and surveys) and computer science (i.e., data-driven techniques such as machine learning) is adopted for the research. In particular, we aim to use a machine learning model for data analytics and quantitative and qualitative research design for psychological analysis. The research outcome of this Ph.D. work is expected to provide recommendations for internet users and organizations developing social-media-based software systems through more phishing aware development practices.

Javara Allah Bukhsh
Leveraging Exogeneous Data for the Predictive Monitoring of IT Service Management Processes

Accurate prediction of process execution time in IT Service Management (ITSM) is essential for service providers to meet service-level agreements (SLA). However, traditional predictive process monitoring methods struggle with processes delivering complex process artifacts, where event log data is insufficient to understand the flow of instances. To overcome this challenge, exogenous predictive process monitoring is proposed, utilizing exogenous data sources available in IT organizations to improve the accuracy of ITSM process predictions. This approach leverages a wide range of exogenous data sources, such as the service knowledge management system, to enhance the predictions and decision-making process. The resulting planning and decision support system, incorporating exogenous data, improves SLA compliance through better resource allocation and decisions throughout the ITSM process instance lifecycles.

Marc C. Hennig
Predicting Privacy Decisions in Mobile Applications and Raising Users’ Privacy Awareness

Smartphones and mobile applications are now an integral part of our daily lives. Managing mobile privacy is a challenging task, that overwhelms users with numerous and complex privacy decisions related to permission settings. Many users do not have sufficient knowledge and understanding of how applications use their personal data, and others do not spend enough time configuring these settings. Various approaches to assist users by automating their decisions have been proposed in the literature. This paper presents a literature review of the current state of knowledge in the area of permission settings and the solutions proposed by different researchers, focusing on the use of machine learning techniques. Machine learning can address the challenges of mobile privacy management by learning users' preferences and predicting their decisions based on a relatively small number of factors. We then describe our future research plans to reduce the user's burden in configuring application permissions, to increase their awareness, and to protect their privacy.

Rena Lavranou
Information Overload: Coping Mechanisms and Tools Impact

The issue of information overload is becoming increasingly prevalent in society and the workplace with the widespread use of digital technologies and big data. This phenomenon has negative effects at the organizational level, such as time loss, decreased efficiency, and poor employee health, among others. In the academic literature, information overload is part of a new research trend in information systems management known as the "dark side of IT," which focuses on studying the negative effects of organizational ICT use in order to propose solutions. Our research objective is thus to explore the role of software tools and their features as coping strategies in response to information overload. Specifically, we want to study the features and uses of the software tools that managers employ to reduce their information overload. The literature on information overload identifies its determinants, consequences, and mechanisms for alleviation. Research on alleviation mechanisms, particularly on the role of tools, is still in its early stages and deserves greater attention. This research is important because it addresses the call from the information systems research community for the urgent need to promote healthy management of the interaction between humans and technologies and its implications for workplace health.

Philippe Aussu
Backmatter
Metadata
Title
Research Challenges in Information Science: Information Science and the Connected World
edited by
Selmin Nurcan
Andreas L. Opdahl
Haralambos Mouratidis
Aggeliki Tsohou
Copyright year
2023
Electronic ISBN
978-3-031-33080-3
Print ISBN
978-3-031-33079-7
DOI
https://doi.org/10.1007/978-3-031-33080-3