Machine Learning and Knowledge Extraction
7th IFIP TC 5, TC 12, WG 8.4, WG 8.9, WG 12.9 International Cross-Domain Conference, CD-MAKE 2023, Benevento, Italy, August 29 – September 1, 2023, Proceedings
- 2023
- Book
- Edited by
- Andreas Holzinger
- Peter Kieseberg
- Federico Cabitza
- Andrea Campagner
- A Min Tjoa
- Edgar Weippl
- Book series
- Lecture Notes in Computer Science
- Publisher
- Springer Nature Switzerland
About this book
This LNCS-IFIP volume constitutes the refereed proceedings of the 7th IFIP TC 5, TC 12, WG 8.4, WG 8.9, WG 12.9 International Cross-Domain Conference, CD-MAKE 2023, held in Benevento, Italy, during August 29 – September 1, 2023.
The 18 full papers presented were carefully reviewed and selected from 30 submissions. The conference focuses on an integrative machine learning approach, considering the importance of data science and visualization for the algorithmic pipeline, with a strong emphasis on privacy, data protection, safety, and security.
Table of contents
Frontmatter
Controllable AI - An Alternative to Trustworthiness in Complex AI Systems?
- Open Access
Abstract: The release of ChatGPT to the general public has sparked discussions about the dangers of artificial intelligence (AI) among the public. The European Commission’s draft of the AI Act has further fueled these discussions, particularly in relation to the definition of AI and the assignment of risk levels to different technologies. Security concerns in AI systems arise from the need to protect against potential adversaries and to safeguard individuals from AI decisions that may harm their well-being. However, ensuring secure and trustworthy AI systems is challenging, especially with deep learning models that lack explainability. This paper proposes the concept of Controllable AI as an alternative to Trustworthy AI and explores the major differences between the two. The aim is to initiate discussions on securing complex AI systems without sacrificing practical capabilities or transparency. The paper provides an overview of techniques that can be employed to achieve Controllable AI. It discusses the background definitions of explainability, Trustworthy AI, and the AI Act. The principles and techniques of Controllable AI are detailed, including detecting and managing control loss, implementing transparent AI decisions, and addressing intentional bias or backdoors. The paper concludes by discussing the potential applications of Controllable AI and its implications for real-world scenarios.
Efficient Approximation of Asymmetric Shapley Values Using Functional Decomposition
- Open Access
Abstract: Asymmetric Shapley values (ASVs) are an extension of Shapley values that allow a user to incorporate partial causal knowledge into the explanation process. Unfortunately, computing ASVs requires sampling permutations, which quickly becomes computationally expensive. We propose A-PDD-SHAP, an algorithm that employs a functional decomposition approach to approximate ASVs orders of magnitude faster than permutation sampling, which significantly reduces the amortized complexity of computing ASVs when many explanations are needed. Apart from this, once the A-PDD-SHAP model is trained, it can be used to compute both symmetric and asymmetric Shapley values without having to re-train or re-sample, allowing for very efficient comparisons between different types of explanations.
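For readers unfamiliar with ASVs: they restrict the usual Shapley permutation average to orderings consistent with a causal partial order. Below is a minimal sketch of the permutation-sampling baseline that A-PDD-SHAP accelerates; the `value_fn` and `precedes` callbacks are illustrative assumptions, not the paper's API.

```python
import random
import numpy as np

def asymmetric_shapley(value_fn, n_features, precedes, n_perms=1000, seed=0):
    """Estimate asymmetric Shapley values by sampling only those feature
    permutations that respect a causal partial order.

    value_fn(S): model value for feature subset S (e.g. the expected
    prediction with features outside S marginalized out).
    precedes(i, j): True if feature i is a causal ancestor of j and must
    therefore appear before j in every admissible ordering."""
    rng = random.Random(seed)
    phi = np.zeros(n_features)
    accepted = 0
    while accepted < n_perms:
        perm = list(range(n_features))
        rng.shuffle(perm)
        # Rejection step: discard permutations that violate the partial order.
        if any(precedes(perm[b], perm[a])
               for a in range(n_features) for b in range(a + 1, n_features)):
            continue
        accepted += 1
        S = frozenset()
        v_prev = value_fn(S)
        for f in perm:
            S = S | {f}
            v_new = value_fn(S)
            phi[f] += v_new - v_prev  # marginal contribution of feature f
            v_prev = v_new
    return phi / accepted
```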
Domain-Specific Evaluation of Visual Explanations for Application-Grounded Facial Expression Recognition
- Open Access
Abstract: Research in the field of explainable artificial intelligence has produced a vast amount of visual explanation methods for deep learning-based image classification in various domains of application. However, there is still a lack of domain-specific evaluation methods to assess an explanation’s quality and a classifier’s performance with respect to domain-specific requirements. In particular, evaluation methods could benefit from integrating human expertise into quality criteria and metrics. Such domain-specific evaluation methods can help to assess the robustness of deep learning models more precisely. In this paper, we present an approach for domain-specific evaluation of visual explanation methods in order to enhance the transparency of deep learning models and estimate their robustness accordingly. As an example use case, we apply our framework to facial expression recognition. We show that the domain-specific evaluation is especially beneficial for challenging use cases such as facial expression recognition and provides application-grounded quality criteria that are not covered by standard evaluation methods. Our comparison of the domain-specific evaluation method with standard approaches thus shows that the quality of the expert knowledge is of great importance for assessing a model’s performance precisely.
Human-in-the-Loop Integration with Domain-Knowledge Graphs for Explainable Federated Deep Learning
- Open Access
Abstract: We explore the integration of domain knowledge graphs into Deep Learning for improved interpretability and explainability using Graph Neural Networks (GNNs). Specifically, a protein-protein interaction (PPI) network is masked over a deep neural network for classification, with patient-specific multi-modal genomic features enriched into the PPI graph’s nodes. Subnetworks that are relevant to the classification (referred to as “disease subnetworks”) are detected using explainable AI. Federated learning is enabled by dividing the knowledge graph into relevant subnetworks, constructing an ensemble classifier, and allowing domain experts to analyze and manipulate detected subnetworks using a developed user interface. Furthermore, the human-in-the-loop principle can be applied with the incorporation of experts, interacting through a sophisticated User Interface (UI) driven by Explainable Artificial Intelligence (xAI) methods, changing the datasets to create counterfactual explanations. The adapted datasets could influence the local model’s characteristics and thereby create a federated version that distils their diverse knowledge in a centralized scenario. This work demonstrates the feasibility of the presented strategies, which were originally envisaged in 2021, most of which have now been materialized into actionable items. In this paper, we report on some lessons learned during this project.
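One way to read "a PPI network is masked over a deep neural network" is that weights without a corresponding interaction edge are zeroed out. The PyTorch sketch below shows that masking idea under this assumption; the paper's actual GNN-based architecture is more elaborate.

```python
import torch
import torch.nn as nn

class PPIMaskedLayer(nn.Module):
    """Linear layer whose weights are elementwise-masked by a PPI adjacency
    matrix, so input gene i can only feed node j if an interaction edge
    exists (a generic sketch, not the paper's architecture)."""
    def __init__(self, adjacency: torch.Tensor):
        super().__init__()
        n_out, n_in = adjacency.shape
        self.weight = nn.Parameter(torch.randn(n_out, n_in) * 0.01)
        self.bias = nn.Parameter(torch.zeros(n_out))
        # Register the mask as a buffer so it moves with the model but
        # receives no gradients.
        self.register_buffer("mask", adjacency.float())

    def forward(self, x):
        return x @ (self.weight * self.mask).T + self.bias
```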
The Tower of Babel in Explainable Artificial Intelligence (XAI)
- Open Access
Abstract: As machine learning (ML) has emerged as the predominant technological paradigm for artificial intelligence (AI), complex black-box models such as GPT-4 have gained widespread adoption. Concurrently, explainable AI (XAI) has risen in significance as a counterbalancing force. But the rapid expansion of this research domain has led to a proliferation of terminology and an array of diverse definitions, making it increasingly challenging to maintain coherence. This confusion of languages also stems from the plethora of different perspectives on XAI, e.g. ethics, law, standardization and computer science. This situation threatens to create a “tower of Babel” effect, whereby a multitude of languages impedes the establishment of a common (scientific) ground. In response, this paper first maps the different vocabularies used in ethics, law and standardization. It shows that despite a quest for standardized, uniform XAI definitions, there is still a confusion of languages. Drawing lessons from these viewpoints, it subsequently proposes a methodology for identifying a unified lexicon from a scientific standpoint. This could aid the scientific community in presenting a more unified front to better influence ongoing definition efforts in law and standardization, often without enough scientific representation, which will shape the nature of AI and XAI in the future.
Hyper-Stacked: Scalable and Distributed Approach to AutoML for Big Data
Ryan Dave, Juan S. Angarita-Zapata, Isaac Triguero
Abstract: The emergence of Machine Learning (ML) has altered how researchers and business professionals value data. Applicable to almost every industry, considerable amounts of time are wasted creating bespoke applications and repetitively hand-tuning models to reach optimal performance. For some, the outcome may be desired; however, the complexity and lack of knowledge in the field of ML become a hindrance. This, in turn, has led to increasing demand for automation of the complete ML workflow (from data preprocessing to model selection), known as Automated Machine Learning (AutoML). Although AutoML solutions have been developed, Big Data is now seen as an impediment for large organisations with massive data outputs. Current methods cannot extract value from large volumes of data due to tight coupling with centralised ML libraries, leading to limited scaling potential. This paper introduces Hyper-Stacked, a novel AutoML component built natively on Apache Spark. Hyper-Stacked combines multi-fidelity hyperparameter optimisation with the Super Learner stacking technique to produce a strong and diverse ensemble. Integration with Spark allows for a parallelised and distributed approach, capable of handling the volume and complexity associated with Big Data. Scalability is demonstrated through an in-depth analysis of speedup, sizeup and scaleup.
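For illustration, the Super Learner idea (training a meta-learner on cross-validated, out-of-fold base-model predictions) can be sketched on a single node with scikit-learn; Hyper-Stacked's distributed Spark implementation and multi-fidelity tuning are not reproduced here.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC

# Synthetic stand-in data for the sketch.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

# Base learners are fit per fold; the meta-learner is trained on their
# out-of-fold predictions (the Super Learner construction).
ensemble = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ("svm", LinearSVC(random_state=0)),
    ],
    final_estimator=LogisticRegression(),
    cv=5,  # produces the out-of-fold meta-features
)
ensemble.fit(X, y)
print(ensemble.score(X, y))
```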
Transformers are Short-Text Classifiers
Fabian Karl, Ansgar Scherp
Abstract: Short text classification is a crucial and challenging aspect of Natural Language Processing. For this reason, there are numerous highly specialized short text classifiers. A variety of approaches have been employed in short text classifiers, such as convolutional and recurrent networks, and many short text classifiers based on graph neural networks have emerged in recent years. However, in recent short text research, State of the Art (SOTA) methods for traditional text classification, particularly the pure use of Transformers, have gone unexploited. In this work, we examine the performance of a variety of short text classifiers as well as the top-performing traditional text classifier on benchmark datasets. We further investigate the effects on two new real-world short text datasets in an effort to address the issue of becoming overly dependent on benchmark datasets with a limited number of characteristics. The datasets are motivated by a real-world use case on classifying goods and services for tax auditing. NICE is a classification system for goods and services that divides them into 45 classes and is based on the Nice Classification of the World Intellectual Property Organization. The Short Texts Of Products and Services (STOPS) dataset is based on Amazon product descriptions and Yelp business entries. Our experiments unambiguously demonstrate that Transformers achieve SOTA accuracy on short text classification tasks, raising the question of whether specialized short text techniques are necessary. The NICE dataset proved to be particularly challenging and makes a good benchmark for future advancements. A preprint can also be found on arXiv [14]. Source code is available here: https://github.com/FKarl/short-text-classification.
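The paper's central point, that a plainly fine-tuned Transformer is a strong short-text classifier, can be sketched with the Hugging Face API. The file `stops_train.csv` and its `text`/`label` columns are hypothetical placeholders, not the released datasets.

```python
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Fine-tune a standard Transformer; no short-text-specific architecture.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=45)  # e.g. the 45 NICE classes

# Hypothetical CSV with "text" and "label" columns.
dataset = load_dataset("csv", data_files={"train": "stops_train.csv"})

def tokenize(batch):
    # Short texts: a small max_length suffices.
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=64)

dataset = dataset.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=3),
    train_dataset=dataset["train"],
)
trainer.train()
```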
Reinforcement Learning with Temporal-Logic-Based Causal Diagrams
Yash Paliwal, Rajarshi Roy, Jean-Raphaël Gaglione, Nasim Baharisangari, Daniel Neider, Xiaoming Duan, Ufuk Topcu, Zhe Xu
Abstract: We study a class of reinforcement learning (RL) tasks where the objective of the agent is to accomplish temporally extended goals. In this setting, a common approach is to represent the tasks as deterministic finite automata (DFA) and integrate them into the state-space for RL algorithms. However, while these machines model the reward function, they often overlook the causal knowledge about the environment. To address this limitation, we propose the Temporal-Logic-based Causal Diagram (TL-CD) in RL, which captures the temporal causal relationships between different properties of the environment. We exploit the TL-CD to devise an RL algorithm in which an agent requires significantly less exploration of the environment. To this end, based on a TL-CD and a task DFA, we identify configurations where the agent can determine the expected rewards early during an exploration. Through a series of case studies, we demonstrate the benefits of using TL-CDs, particularly the faster convergence of the algorithm to an optimal policy due to reduced exploration of the environment.
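As background, the standard construction the paper builds on runs the agent on the product of the environment state and the task-DFA state. A minimal sketch follows, with a hypothetical `env.step` interface; the TL-CD machinery itself (pruning exploration via causal knowledge) is not shown.

```python
class TaskDFA:
    """Task automaton over atomic-proposition labels."""
    def __init__(self, transitions, start, accepting):
        self.transitions = transitions  # dict: (state, label) -> state
        self.start = start
        self.accepting = accepting      # set of accepting states

    def step(self, q, label):
        # Stay in place if no transition is defined for this label.
        return self.transitions.get((q, label), q)

def product_step(env, dfa, s, q, action, labeling_fn):
    """One step in the product MDP: the environment moves, and the DFA
    tracks progress toward the temporally extended goal. Reward is issued
    when the DFA reaches an accepting state."""
    s_next = env.step(s, action)            # hypothetical env interface
    q_next = dfa.step(q, labeling_fn(s_next))
    reward = 1.0 if q_next in dfa.accepting else 0.0
    return (s_next, q_next), reward
```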
Using Machine Learning to Generate a Dictionary for Environmental Issues
Daniel E. O’Leary, Yangin Yoon
Abstract: The purpose of this paper is to investigate the use of machine learning approaches to build a dictionary of terms for analyzing text for ESG content with a bag-of-words approach, where ESG stands for “environment, social and governance.” Specifically, the paper reviews experiments performed to develop a dictionary for information about the environment, namely “carbon footprint.” We investigate using Word2Vec trained on Form 10-K text and on earnings calls, as well as queries to ChatGPT, and compare the results. As part of the development of our dictionaries, we find that bigrams and trigrams are more likely to be found when using ChatGPT, suggesting that bigrams and trigrams provide a “better” approach for the dictionaries developed with Word2Vec. We also find that terms provided by ChatGPT were not as likely to appear in Form 10-Ks or other business disclosures as those generated using Word2Vec. In addition, we explored different question approaches to ChatGPT to find different perspectives on carbon footprint, such as “reducing carbon footprint” or “negative effects of carbon footprint.” We then discuss combining the findings from each of these approaches to build a dictionary that could be used alone or with other ESG concept dictionaries.
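A minimal sketch of the Word2Vec side of such a pipeline with gensim: train on tokenized disclosure text and expand a seed term into dictionary candidates by embedding similarity. The toy corpus and similarity threshold are placeholders; a real run would use a large corpus of Form 10-K or earnings-call text.

```python
from gensim.models import Word2Vec

# Toy stand-in for tokenized sentences from business disclosures.
corpus = [["reduce", "carbon", "footprint", "emissions"],
          ["greenhouse", "gas", "emissions", "report"]]

model = Word2Vec(corpus, vector_size=100, window=5, min_count=1, seed=0)

# Expand a seed term into dictionary candidates by embedding similarity.
candidates = model.wv.most_similar("emissions", topn=10)
dictionary = {term for term, score in candidates if score > 0.5}
print(dictionary)
```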
Let Me Think! Investigating the Effect of Explanations Feeding Doubts About the AI Advice
Federico Cabitza, Andrea Campagner, Lorenzo Famiglini, Chiara Natali, Valerio Caccavella, Enrico Gallazzi
Abstract: Augmented Intelligence (AuI) refers to the use of artificial intelligence (AI) to amplify certain cognitive tasks performed by human decision-makers. However, there are concerns that AI’s increasing capability and alignment with human values may undermine user agency, autonomy, and responsible decision-making. To address these concerns, we conducted a user study in the field of orthopedic radiology diagnosis, introducing a reflective XAI (explainable AI) support system that aimed to stimulate human reflection, and we evaluated its impact in terms of decision performance, decision confidence and perceived utility. Specifically, the reflective XAI support system prompted users to reflect on the dependability of AI-generated advice by presenting evidence both in favor of and against its recommendation. This evidence was presented via two cases that closely resembled a given base case, along with pixel attribution maps. These cases were associated with the same AI advice as the base case, but one case was accurate while the other was erroneous with respect to the ground truth. While the introduction of this support system did not significantly enhance diagnostic accuracy, it was highly valued by more experienced users. Based on the findings of this study, we advocate for further research to validate the potential of reflective XAI in fostering more informed and responsible decision-making, ultimately preserving human agency.
Enhancing Trust in Machine Learning Systems by Formal Methods
With an Application to a Meteorological Problem
Christina Tavolato-Wötzl, Paul Tavolato
Abstract: With the deployment of applications based on machine learning techniques, the need for understandable explanations of these systems’ results becomes evident. This paper clarifies the concept of an “explanation”: the main goal of an explanation is to build trust in the recipient of the explanation. This can only be achieved by creating an understanding of the results of the AI system in terms of the users’ domain knowledge. In contrast to most of the approaches found in the literature, which base the explanation of the AI system’s results on the model provided by the machine learning algorithm, this paper tries to find an explanation in the specific expert knowledge of the system’s users. The domain knowledge is defined as a formal model derived from a set of if-then rules provided by experts. The result from the AI system is represented as a proposition in a temporal logic. We then attempt to formally prove this proposition within the domain model, using model checking algorithms and tools. If the proof is successful, the result of the AI system is consistent with the model of the domain knowledge. The model contains the rules it is based on, and hence the path representing the proof can be translated back to the rules: this explains why the proposition is consistent with the domain knowledge. The paper describes the application of this approach to a real-world example from meteorology, the short-term forecasting of cloud coverage for particular locations.
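To make the proof idea concrete: if the rule base induces a finite transition system, an "eventually P"-style claim can be checked by explicit-state search, and the witness path is exactly the trace that can be translated back to the rules. The toy EF-style reachability sketch below illustrates this; the authors use established model-checking tools, not this search.

```python
from collections import deque

def reachable_within(initial, successors, prop, horizon):
    """Return a witness path on which `prop` becomes true within `horizon`
    steps from `initial`, or None. The path doubles as the explanation
    trace that can be mapped back to the underlying if-then rules."""
    queue = deque([(initial, (initial,))])
    seen = {initial}
    while queue:
        state, path = queue.popleft()
        if prop(state):
            return path
        if len(path) > horizon:
            continue  # bounded search depth
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + (nxt,)))
    return None
```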
Sustainability Effects of Robust and Resilient Artificial Intelligence
- Open Access
Abstract: It is commonly understood that the resilience of critical information technology (IT) systems based on artificial intelligence (AI) must be ensured. In this regard, we consider resilience both in terms of IT security threats, such as cyberattacks, as well as the ability to robustly persist under uncertain and changing environmental conditions, such as climate change or economic crises. This paper explores the relationship between resilience and sustainability with regard to AI systems, develops fields of action for resilient AI, and elaborates direct and indirect influences on the achievement of the United Nations Sustainable Development Goals. Indirect in this case means that a sustainability effect is reached by taking resilience measures when applying AI in a sustainability-relevant application area, for example precision agriculture or smart health.
The Split Matters: Flat Minima Methods for Improving the Performance of GNNs
Nicolas Lell, Ansgar Scherp
Abstract: When training a Neural Network, it is optimized using the available training data with the hope that it generalizes well to new or unseen testing data. At the same absolute value, a flat minimum in the loss landscape is presumed to generalize better than a sharp minimum. Methods for determining flat minima have mostly been researched for independent and identically distributed (i.i.d.) data such as images. Graphs are inherently non-i.i.d. since the vertices are edge-connected. We investigate flat minima methods and combinations of those methods for training graph neural networks (GNNs). We use GCN and GAT as well as extend Graph-MLP to work with more layers and larger graphs. We conduct experiments on small and large citation, co-purchase, and protein datasets with different train-test splits in both the transductive and inductive training procedure. Results show that flat minima methods can improve the performance of GNN models by over 2 points if the train-test split is randomized. Following Shchur et al., randomized splits are essential for a fair evaluation of GNNs, as other (fixed) splits like “Planetoid” are biased. Overall, we provide important insights for improving and fairly evaluating flat minima methods on GNNs. We recommend that practitioners always use weight averaging techniques, in particular EWA when using early stopping. While weight averaging techniques are only sometimes the best performing method, they are less sensitive to hyperparameters, need no additional training, and keep the original model unchanged. All source code is available at https://github.com/Foisunt/FMMs-in-GNNs.
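For readers wanting to try the recommended weight averaging: a minimal PyTorch sketch of exponential weight averaging (EWA), keeping an exponential moving average copy of the parameters alongside training. The decay value is illustrative; PyTorch's `torch.optim.swa_utils.AveragedModel` provides related functionality.

```python
import copy
import torch

def make_ema(model):
    """Create a frozen copy of the model to hold the averaged weights."""
    ema = copy.deepcopy(model)
    for p in ema.parameters():
        p.requires_grad_(False)
    return ema

@torch.no_grad()
def update_ema(ema, model, decay=0.999):
    """Call once per training step: ema <- decay * ema + (1 - decay) * model.
    Evaluate the `ema` copy; the original model is left unchanged."""
    for p_ema, p in zip(ema.parameters(), model.parameters()):
        p_ema.mul_(decay).add_(p, alpha=1 - decay)
```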
Probabilistic Framework Based on Deep Learning for Differentiating Ultrasound Movie View Planes
Andrei Gabriel Nascu, Smaranda Belciug, Anca-Maria Istrate-Ofiteru, Dominic Gabriel Iliescu
Abstract: Fetal death, infant morbidity and mortality are generally caused by the presence of congenital anomalies. By performing a fetal morphology scan, the sonographer can detect their presence and have a thorough conversation with the soon-to-be parents. Diagnosing congenital anomalies is a difficult task even for an experienced sonographer. A more accurate diagnosis can be reached by combining the doctor’s knowledge with Artificial Intelligence. The aim of this paper is to present an intelligent framework that is able to differentiate accurately between the view planes of the fetal abdomen in an ultrasound movie. Deep learning methods, such as ResNet50, DenseNet121, and InceptionV3, have been trained to classify each movie frame. A thorough statistical analysis is used to benchmark the neural networks and to build a hierarchy. The best performing algorithm is used to classify each frame of the movie, followed by a synergetic weighted voting system that sets the label of the entire ultrasound video. We have tested our proposed framework on several fetal morphology videos. The experimental results show that the framework differentiates well between the fetal abdomen view planes, even though the underlying networks classify the static images with accuracies ranging only between 46.01% and 77.80%.
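A sketch of the video-level aggregation step: combine per-frame class probabilities by weighted voting. The confidence-based weights below are an illustrative choice, not the paper's exact "synergetic" scheme.

```python
import numpy as np

def label_video(frame_probs, frame_weights=None):
    """Label a whole video from per-frame class probabilities.

    frame_probs: (n_frames, n_classes) softmax outputs of the best model.
    frame_weights: optional per-frame weights; defaults to each frame's
    own top-class confidence (an illustrative heuristic)."""
    frame_probs = np.asarray(frame_probs)
    if frame_weights is None:
        frame_weights = frame_probs.max(axis=1)
    scores = (frame_probs * frame_weights[:, None]).sum(axis=0)
    return int(scores.argmax())

# Usage: three frames, two view-plane classes.
print(label_video([[0.9, 0.1], [0.4, 0.6], [0.8, 0.2]]))  # -> 0
```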
Standing Still Is Not an Option: Alternative Baselines for Attainable Utility Preservation
Sebastian Eresheim, Fabian Kovac, Alexander Adrowitzer
Abstract: Specifying reward functions without causing side effects is still a challenge to be solved in Reinforcement Learning. Attainable Utility Preservation (AUP) seems promising for preserving the ability to optimize for a correct reward function while minimizing negative side effects. Current approaches, however, assume the existence of a no-op action in the environment’s action space, which limits AUP to tasks where doing nothing for a single time-step is a valuable option. Depending on the environment, this cannot always be guaranteed. We introduce four different baselines that do not build on such actions and therefore extend the concept of AUP to a broader class of environments. We evaluate all introduced variants on different AI safety gridworlds and show that this approach generalizes AUP to a broader range of tasks, with only small performance losses.
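For context, AUP penalizes actions that change the attainable value of auxiliary reward functions relative to a baseline action; the paper's contribution is replacing the no-op baseline. A sketch with a generic baseline action follows; the function name, λ, and array layout are illustrative.

```python
import numpy as np

def aup_reward(task_reward, q_aux, action, baseline_action, lam=0.1):
    """AUP-shaped reward with a generic baseline action.

    q_aux: (n_aux, n_actions) action-values of the auxiliary reward
    functions at the current state.
    The penalty is the mean absolute change in attainable auxiliary value
    relative to the baseline action."""
    q_aux = np.asarray(q_aux)
    penalty = np.abs(q_aux[:, action] - q_aux[:, baseline_action]).mean()
    return task_reward - lam * penalty

# Usage: two auxiliary reward functions, three actions.
print(aup_reward(1.0, [[0.2, 0.9, 0.2], [0.5, 0.1, 0.5]], action=1,
                 baseline_action=0))
```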
Memorization of Named Entities in Fine-Tuned BERT Models
Andor Diera, Nicolas Lell, Aygul Garifullina, Ansgar Scherp
Abstract: Privacy preserving deep learning is an emerging field in machine learning that aims to mitigate the privacy risks in the use of deep neural networks. One such risk is training data extraction from language models that have been trained on datasets containing personal and privacy-sensitive information. In our study, we investigate the extent of named entity memorization in fine-tuned BERT models. We use single-label text classification as a representative downstream task and employ three different fine-tuning setups in our experiments, including one with Differential Privacy (DP). We create a large number of text samples from the fine-tuned BERT models utilizing a custom sequential sampling strategy with two prompting strategies. We search these samples for named entities and check whether they are also present in the fine-tuning datasets. We experiment with two benchmark datasets in the domains of emails and blogs. We show that the application of DP has a detrimental effect on the text generation capabilities of BERT. Furthermore, we show that a fine-tuned BERT does not generate more named entities specific to the fine-tuning dataset than a BERT model that is only pre-trained. This suggests that BERT is unlikely to emit personal or privacy-sensitive named entities. Overall, our results are important for understanding to what extent BERT-based services are prone to training data extraction attacks. (Source code and datasets are available at: https://github.com/drndr/bert_ent_attack. An extended version of this paper can also be found on arXiv [12].)
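The entity-checking side of such an experiment can be sketched with the Hugging Face NER pipeline: extract named entities from model-generated samples and intersect them with entities known to occur in the fine-tuning data. The sample list and entity set below are placeholders, and the paper's custom sequential sampling from BERT is not reproduced.

```python
from transformers import pipeline

# Aggregated NER pipeline (uses a default pretrained NER model).
ner = pipeline("ner", aggregation_strategy="simple")

generated_samples = ["Alice Smith wrote the quarterly report."]  # placeholder
training_entities = {"Alice Smith"}  # entities known to be in the training set

leaked = set()
for text in generated_samples:
    for ent in ner(text):
        # ent["word"] holds the aggregated entity span.
        if ent["word"] in training_entities:
            leaked.add(ent["word"])
print(f"{len(leaked)} training entities reproduced by the model")
```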
Event and Entity Extraction from Generated Video Captions
Johannes Scherer, Deepayan Bhowmik, Ansgar Scherp
Abstract: Annotation of multimedia data by humans is time-consuming and costly, while reliable automatic generation of semantic metadata is a major challenge. We propose a framework to extract semantic metadata solely from automatically generated video captions. As metadata, we consider entities, the entities’ properties, relations between entities, and the video category. Our framework combines automatic video captioning models with natural language processing (NLP) methods. We use state-of-the-art dense video captioning models with masked transformer (MT) and parallel decoding (PVDC) to generate captions for videos of the ActivityNet Captions dataset. We analyze the output of the video captioning models using NLP methods. We evaluate the performance of our framework for each metadata type, while varying the amount of information the video captioning model provides. Our experiments show that it is possible to extract high-quality entities, their properties, and relations between entities. In terms of categorizing a video based on generated captions, the results can be improved. We observe that the quality of the extracted information is mainly influenced by the dense video captioning model’s capability to locate events in the video and to generate the event captions. An earlier version of this paper has been published on arXiv [20]. We provide the source code here:
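A generic sketch of the NLP stage: extract entities and simple subject-verb-object relations from a generated caption with spaCy. This illustrates the kind of metadata extracted, not the paper's exact method.

```python
import spacy

# Requires: python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")

caption = "A man is riding a horse on the beach."
doc = nlp(caption)

# Named entities with their labels.
entities = [(ent.text, ent.label_) for ent in doc.ents]

# Simple subject-verb-object triples from the dependency parse.
relations = []
for token in doc:
    if token.pos_ == "VERB":
        subjects = [c for c in token.children if c.dep_ == "nsubj"]
        objects = [c for c in token.children if c.dep_ in ("dobj", "obj")]
        for s in subjects:
            for o in objects:
                relations.append((s.text, token.lemma_, o.text))
print(entities, relations)
```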
Fine-Tuning Language Models for Scientific Writing Support
Justin Mücke, Daria Waldow, Luise Metzger, Philipp Schauz, Marcel Hoffman, Nicolas Lell, Ansgar Scherp
Abstract: We support scientific writers in determining whether a written sentence is scientific, to which section it belongs, and suggest paraphrasings to improve the sentence. First, we propose a regression model trained on a corpus of scientific sentences extracted from peer-reviewed scientific papers and non-scientific text to assign a score that indicates the scientificness of a sentence. We investigate the effect of equations and citations on this score to test the model for potential biases. Second, we create a mapping of section titles to a standard paper layout in AI and machine learning to classify a sentence to its most likely section. We study the impact of context, i.e., surrounding sentences, on the section classification performance. Finally, we propose a paraphraser, which suggests an alternative for a given sentence that includes word substitutions, additions to the sentence, and structural changes to improve the writing style. We train various large language models on sentences extracted from arXiv papers that were peer-reviewed and published at A*, A, B, and C ranked conferences. On the scientificness task, all models achieve an MSE smaller than 2%. For the section classification, BERT outperforms WideMLP and SciBERT in most cases. We demonstrate that using context enhances the classification of a sentence, achieving up to a 90% F1-score. Although the paraphrasing models make comparatively few alterations, they produce output sentences close to the gold standard. Large fine-tuned models such as T5 Large perform best in experiments considering various measures of difference between input sentence and gold standard. Code is provided here: https://github.com/JustinMuecke/SciSen.
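One detail worth illustrating is the use of context for section classification: the target sentence is combined with its neighbors before tokenization. The `[SEP]`-joined format below is an assumption about the input encoding, not taken from the paper.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

def build_input(sentences, i, context=1):
    """Join the target sentence with `context` neighbors on each side,
    separated by the tokenizer's SEP token (an assumed encoding)."""
    window = sentences[max(0, i - context): i + context + 1]
    return tokenizer.sep_token.join(window)

sents = ["We propose a new model.", "It outperforms the baseline.",
         "Results are shown in Table 2."]
print(build_input(sents, 1))
```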
Backmatter
- Title
- Machine Learning and Knowledge Extraction
- Edited by
Andreas Holzinger
Peter Kieseberg
Federico Cabitza
Andrea Campagner
A Min Tjoa
Edgar Weippl
- Copyright year
- 2023
- Publisher
- Springer Nature Switzerland
- Electronic ISBN
- 978-3-031-40837-3
- Print ISBN
- 978-3-031-40836-6
- DOI
- https://doi.org/10.1007/978-3-031-40837-3