2024 | Book

Domain Adaptation and Representation Transfer

5th MICCAI Workshop, DART 2023, Held in Conjunction with MICCAI 2023, Vancouver, BC, Canada, October 12, 2023, Proceedings

Editors: Lisa Koch, M. Jorge Cardoso, Enzo Ferrante, Konstantinos Kamnitsas, Mobarakol Islam, Meirui Jiang, Nicola Rieke, Sotirios A. Tsaftaris, Dong Yang

Publisher: Springer Nature Switzerland

Book Series: Lecture Notes in Computer Science


About this book

This book constitutes the refereed proceedings of the 5th MICCAI Workshop on Domain Adaptation and Representation Transfer, DART 2023, which was held in conjunction with MICCAI 2023, in October 2023.
The 16 full papers presented in this book were carefully reviewed and selected from 32 submissions. They discuss methodological advancements and ideas that can improve the applicability of machine learning (ML)/deep learning (DL) approaches to clinical settings by making them robust and consistent across different domains.

Table of Contents

Domain Adaptation of MRI Scanners as an Alternative to MRI Harmonization
Combining large multi-center datasets can enhance statistical power, particularly in the field of neurology, where data can be scarce. However, applying a deep learning model trained on existing neuroimaging data often leads to inconsistent results when tested on new data due to domain shift caused by differences between the training (source domain) and testing (target domain) data. Existing literature offers several solutions based on domain adaptation (DA) techniques, which ignore complex practical scenarios where heterogeneity may exist in the source or target domain. This study proposes a new perspective in solving the domain shift issue for MRI data by identifying and addressing the dominant factor causing heterogeneity in the dataset. We design an unsupervised DA method leveraging the maximum mean discrepancy and correlation alignment loss in order to align domain-invariant features. Instead of regarding the entire dataset as a source or target domain, the dataset is processed based on the dominant factor of data variations, which is the scanner manufacturer. Afterwards, the target domain’s feature space is aligned pairwise with respect to each source domain’s feature map. Experimental results demonstrate significant performance gain for multiple inter- and intra-study neurodegenerative disease classification tasks. Source code is available at https://github.com/rkushol/DAMS.
Rafsanjany Kushol, Richard Frayne, Simon J. Graham, Alan H. Wilman, Sanjay Kalra, Yee-Hong Yang
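The abstract above names two alignment losses, maximum mean discrepancy (MMD) and correlation alignment (CORAL). The following is a minimal sketch of how these two quantities are commonly computed on batches of feature vectors. It is not the authors' implementation (their repository is the reference for that); the Gaussian kernel, the bandwidth `sigma`, and all function names are assumptions made for illustration.

```python
import math


def rbf(x, y, sigma=1.0):
    """Gaussian kernel between two feature vectors (kernel choice is an assumption)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def mmd2(source, target, sigma=1.0):
    """Biased estimate of the squared MMD between two batches of feature vectors."""
    def mean_kernel(a, b):
        return sum(rbf(x, y, sigma) for x in a for y in b) / (len(a) * len(b))

    return (mean_kernel(source, source)
            + mean_kernel(target, target)
            - 2.0 * mean_kernel(source, target))


def covariance(batch):
    """Sample covariance matrix of a batch (rows are samples, columns are features)."""
    n, d = len(batch), len(batch[0])
    mean = [sum(row[j] for row in batch) / n for j in range(d)]
    return [[sum((row[i] - mean[i]) * (row[j] - mean[j]) for row in batch) / (n - 1)
             for j in range(d)] for i in range(d)]


def coral(source, target):
    """CORAL loss: squared Frobenius distance between covariances, scaled by 1/(4 d^2)."""
    d = len(source[0])
    cov_s, cov_t = covariance(source), covariance(target)
    frobenius = sum((cov_s[i][j] - cov_t[i][j]) ** 2
                    for i in range(d) for j in range(d))
    return frobenius / (4.0 * d * d)
```

In this sketch MMD reacts to any difference between the two feature distributions, including a plain shift of the mean, whereas CORAL only compares second-order statistics and is therefore unchanged by such a shift.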
MultiVT: Multiple-Task Framework for Dentistry
Current image understanding methods on dental data are often trained end-to-end on inputs and labels, with focus on using state-of-the-art neural architectures. Such approaches, however, typically ignore domain specific peculiarities and lack the ability to generalize outside their training dataset. We observe that, in RGB images, teeth display a weak or unremarkable texture while exhibiting strong boundaries; similarly, in panoramic radiographs root tips and crowns are generally visible, while other parts of the teeth appear blurry. In their respective image type, these features are robust to the domain shift given by different sources or acquisition tools. Therefore, we formulate a method which we call MultiVT, able to leverage these low level image features to achieve results with good domain generalization properties. We demonstrate with experiments that, by focusing on such domain-robust features, we can achieve better segmentation and detection results. Additionally, MultiVT improves generalization capabilities without applying domain adaptive techniques - a characteristic which renders our method suitable for use in real-world applications.
Edoardo Mello Rella, Ajad Chhatkuli, Ender Konukoglu, Luc Van Gool
Black-Box Unsupervised Domain Adaptation for Medical Image Segmentation
Unsupervised Domain Adaptation (UDA) is one of the key technologies to solve the problem of obtaining ground truth labels needed for supervised learning. In general, UDA assumes that information about the source model, such as its architecture and weights, and all samples from the source domains are available when a target domain model is trained. However, this is not a realistic assumption in applications where privacy and white-box attacks are a concern, or where the model is only accessible through an API. To overcome this limitation, UDA without source model information and source data, called Black-Box Unsupervised Domain Adaptation (BBUDA), has recently been proposed. Here, we propose an improved BBUDA method for medical image segmentation. Our main contribution is the introduction of a mean teacher algorithm during the training of the target domain model. We conduct experiments on datasets containing different types of source-target domain combinations to demonstrate the versatility and robustness of our method. We confirm that our method outperforms the state-of-the-art on all datasets.
Satoshi Kondo
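The mean teacher algorithm mentioned in the abstract above is generally understood as keeping a teacher model whose weights are an exponential moving average (EMA) of the student's weights. A minimal sketch of that update follows. The dictionary-of-lists weight layout, the function name, and the `decay` value are assumptions for illustration; how the paper combines this update with the black-box setting is not described in the abstract.

```python
def ema_update(teacher, student, decay=0.99):
    """Update teacher weights in place: t <- decay * t + (1 - decay) * s.

    `teacher` and `student` map a parameter name to a flat list of floats
    (a hypothetical layout chosen to keep the sketch dependency-free).
    """
    for name, student_values in student.items():
        teacher_values = teacher[name]
        for i, s in enumerate(student_values):
            teacher_values[i] = decay * teacher_values[i] + (1.0 - decay) * s
    return teacher
```

A `decay` close to 1 makes the teacher change slowly, which is the usual reason its predictions are treated as more stable training targets than those of the student.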
PLST: A Pseudo-labels with a Smooth Transition Strategy for Medical Site Adaptation
This study addresses the challenge of medical image segmentation when transferring a pre-trained model from one medical site to another without access to pre-existing labels. The method involves utilizing a self-training approach by generating pseudo-labels of the target domain data. To do so, a strategy that is based on a smooth transition between domains is implemented where we initially feed easy examples to the network and gradually increase the difficulty of the examples. To identify the level of difficulty, we use a binary classifier trained to distinguish between the two domains, treating target images as easier if they are classified as source examples. We demonstrate the improved performance of our method on a range of medical MRI image segmentation tasks. When integrating our approach as a post-processing step in several standard Unsupervised Domain Adaptation (UDA) algorithms, we consistently observed significant improvements in the segmentation results on test images from the target site.
Tomer Bar Natan, Hayit Greenspan, Jacob Goldberger
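The easy-to-hard ordering described in the abstract above can be sketched as follows: target images are ranked by how strongly a domain classifier believes they come from the source site, and training subsets grow from the most source-like images to the full target set. The function name, the number of stages, and the choice of cumulative subsets are assumptions for illustration, not details taken from the paper.

```python
import math


def curriculum_stages(source_probs, num_stages=3):
    """Return growing lists of target sample indices, easiest first.

    `source_probs[i]` is the domain classifier's probability that target
    sample i belongs to the source domain; a higher value is treated as easier.
    """
    order = sorted(range(len(source_probs)),
                   key=lambda i: source_probs[i], reverse=True)
    stages = []
    for k in range(1, num_stages + 1):
        cutoff = math.ceil(len(order) * k / num_stages)
        stages.append(order[:cutoff])
    return stages
```

For example, `curriculum_stages([0.2, 0.9, 0.5, 0.7], num_stages=2)` yields the two most source-like samples first and all four samples in the second stage.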
Compositional Representation Learning for Brain Tumour Segmentation
For brain tumour segmentation, deep learning models can achieve human expert-level performance given a large amount of data and pixel-level annotations. However, the expensive exercise of obtaining pixel-level annotations for large amounts of data is not always feasible, and performance is often heavily reduced in a low-annotated data regime. To tackle this challenge, we adapt a mixed supervision framework, vMFNet, to learn robust compositional representations using unsupervised learning and weak supervision alongside non-exhaustive pixel-level pathology labels. In particular, we use the BraTS dataset to simulate a collection of 2-point expert pathology annotations indicating the top and bottom slice of the tumour (or tumour sub-regions: peritumoural edema, GD-enhancing tumour, and the necrotic/non-enhancing tumour) in each MRI volume, from which weak image-level labels that indicate the presence or absence of the tumour (or the tumour sub-regions) in the image are constructed. Then, vMFNet models the encoded image features with von-Mises-Fisher (vMF) distributions, via learnable and compositional vMF kernels which capture information about structures in the images. We show that good tumour segmentation performance can be achieved with a large amount of weakly labelled data but only a small amount of fully-annotated data. Interestingly, emergent learning of anatomical structures occurs in the compositional representation even given only supervision relating to pathology (tumour).
Xiao Liu, Antanas Kascenas, Hannah Watson, Sotirios A. Tsaftaris, Alison Q. O’Neil
Hierarchical Compositionality in Hyperbolic Space for Robust Medical Image Segmentation
Deep learning based medical image segmentation models need to be robust to domain shifts and image distortion for the safe translation of these models into clinical practice. The most popular methods for improving robustness are centred around data augmentation and adversarial training. Many image segmentation tasks exhibit regular structures with only limited variability. We aim to exploit this notion by learning a set of base components in the latent space whose composition can account for the entire structural variability of a specific segmentation task. We enforce a hierarchical prior in the composition of the base components and consider the natural geometry in which to build our hierarchy. Specifically, we embed the base components on a hyperbolic manifold which we claim leads to a more natural composition. We demonstrate that our method improves model robustness under various perturbations and in the task of single domain generalisation.
Ainkaran Santhirasekaram, Mathias Winkler, Andrea Rockall, Ben Glocker
Realistic Data Enrichment for Robust Image Segmentation in Histopathology
Poor performance of quantitative analysis in histopathological Whole Slide Images (WSI) has been a significant obstacle in clinical practice. Annotating large-scale WSIs manually is a demanding and time-consuming task, unlikely to yield the expected results when used for fully supervised learning systems. Rarely observed disease patterns and large differences in object scales are difficult to model through conventional patient intake. Prior methods either fall back to direct disease classification, which only requires learning a few factors per image, or report on average image segmentation performance, which is highly biased towards majority observations. Geometric image augmentation is commonly used to improve robustness for average case predictions and to enrich limited datasets. So far, no method has provided sampling of a realistic posterior distribution to improve stability, e.g. for the segmentation of imbalanced objects within images. Therefore, we propose a new approach, based on diffusion models, which can enrich an imbalanced dataset with plausible examples from underrepresented groups by conditioning on segmentation maps. Our method can simply expand limited clinical datasets making them suitable to train machine learning pipelines, and provides an interpretable and human-controllable way of generating histopathology images that are indistinguishable from real ones to human experts. We validate our findings on two datasets, one from the public domain and one from a Kidney Transplant study. (The source code and trained models will be publicly available at the time of the conference, on Hugging Face and GitHub.)
Sarah Cechnicka, James Ball, Hadrien Reynaud, Callum Arthurs, Candice Roufosse, Bernhard Kainz
Boosting Knowledge Distillation via Random Fourier Features for Prostate Cancer Grading in Histopathology Images
There has been a growing number of pathology image datasets, in particular for cancer diagnosis. Although these datasets permit easy access and development of computational pathology tools, the current computational models still struggle to handle unseen datasets due to various reasons. Transfer learning and fine-tuning are standard techniques to adapt an existing model that was trained on one dataset to another. However, this approach does not fully exploit the existing model and the target dataset. Inspired by knowledge distillation, we propose a student-teacher strategy that distills knowledge from a well-trained teacher model, generally trained on a larger dataset, to a student model to be tested on a small dataset. To facilitate efficient and effective knowledge distillation and transfer, we employ contrastive learning and non-parameterized random Fourier features for compressed feature mapping into a lower-dimensional space. We evaluated our proposed method using three prostate cancer datasets, including a teacher dataset, a target student dataset, and an independent test dataset. The experimental results demonstrate that the proposed approach outperforms other transfer learning and state-of-the-art knowledge distillation methods. Code is available at: https://github.com/trinhvg/KD_CoRFF.
Trinh Thi Le Vuong, Jin Tae Kwak
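The random Fourier features mentioned in the abstract above are, in their standard form, a fixed (untrained) random projection followed by a cosine, whose inner products approximate a Gaussian kernel. A minimal sketch of that standard construction follows; it is not the paper's code, and `make_rff`, `sigma`, and `seed` are hypothetical names and parameters. Choosing `out_dim` smaller than `in_dim` gives the kind of compression into a lower-dimensional space that the abstract refers to.

```python
import math
import random


def make_rff(in_dim, out_dim, sigma=1.0, seed=0):
    """Build a fixed random Fourier feature map z(x) = sqrt(2/D) * cos(Wx + b).

    W has entries drawn from N(0, 1/sigma^2) and b from U(0, 2*pi), so that
    z(x) . z(y) approximates exp(-||x - y||^2 / (2 * sigma^2)).
    Nothing here is trained, which is why the map has no learnable parameters.
    """
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 1.0 / sigma) for _ in range(in_dim)]
               for _ in range(out_dim)]
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(out_dim)]
    scale = math.sqrt(2.0 / out_dim)

    def transform(x):
        return [scale * math.cos(sum(w * v for w, v in zip(row, x)) + b)
                for row, b in zip(weights, offsets)]

    return transform
```

The approximation of the kernel improves as `out_dim` grows, so a very small output dimension trades kernel fidelity for compactness.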
Semi-supervised Domain Adaptation for Automatic Quality Control of FLAIR MRIs in a Clinical Data Warehouse
Domain adaptation is a very useful approach to exploit the potential of clinical data warehouses, which gather a vast amount of medical imaging encompassing various modalities, sequences, manufacturers and machines. In this study, we propose a semi-supervised domain adaptation (SSDA) framework for automatically detecting poor quality FLAIR MRIs within a clinical data warehouse. Leveraging a limited number of labelled FLAIR and a large number of labelled T1-weighted MRIs, we introduce a novel architecture based on the well-known Domain Adversarial Neural Network (DANN) that incorporates a specific classifier for the target domain. Our method effectively addresses the covariate shift and class distribution shift between T1-weighted and FLAIR MRIs, surpassing existing SSDA approaches by more than 10 percentage points.
Sophie Loizillon, Olivier Colliot, Lydia Chougar, Sebastian Stroer, Yannick Jacob, Aurélien Maire, Didier Dormont, Ninon Burgos
Towards Foundation Models Learned from Anatomy in Medical Imaging via Self-supervision
Human anatomy is the foundation of medical imaging and boasts one striking characteristic: its hierarchy in nature, exhibiting two intrinsic properties: (1) locality: each anatomical structure is morphologically distinct from the others; and (2) compositionality: each anatomical structure is an integrated part of a larger whole. We envision a foundation model for medical imaging that is consciously and purposefully developed upon this foundation to gain the capability of “understanding” human anatomy and to possess the fundamental properties of medical imaging. As our first step in realizing this vision towards foundation models in medical imaging, we devise a novel self-supervised learning (SSL) strategy that exploits the hierarchical nature of human anatomy. Our extensive experiments demonstrate that the SSL pretrained model, derived from our training strategy, not only outperforms state-of-the-art (SOTA) fully/self-supervised baselines but also enhances annotation efficiency, offering potential few-shot segmentation capabilities with performance improvements ranging from 9% to 30% for segmentation tasks compared to SSL baselines. This performance is attributed to the significance of anatomy comprehension via our learning strategy, which encapsulates the intrinsic attributes of anatomical structures (locality and compositionality) within the embedding space, attributes that are overlooked in existing SSL methods. All code and pretrained models are available at GitHub.com/JLiangLab/Eden.
Mohammad Reza Hosseinzadeh Taher, Michael B. Gotway, Jianming Liang
The Performance of Transferability Metrics Does Not Translate to Medical Tasks
Transfer learning boosts the performance of medical image analysis by enabling deep learning (DL) on small datasets through the knowledge acquired from large ones. As the number of DL architectures explodes, exhaustively attempting all candidates becomes unfeasible, motivating cheaper alternatives for choosing them. Transferability scoring methods emerge as an enticing solution, allowing the efficient calculation of a score that correlates with the architecture accuracy on any target dataset. However, since transferability scores have not been evaluated on medical datasets, their use in this context remains uncertain, preventing them from benefiting practitioners. We fill that gap in this work, thoroughly evaluating seven transferability scores in three medical applications, including out-of-distribution scenarios. Despite promising results in general-purpose datasets, our results show that no transferability score can reliably and consistently estimate target performance in medical contexts, inviting further work in that direction.
Levy Chaves, Alceu Bissoto, Eduardo Valle, Sandra Avila
DGM-DR: Domain Generalization with Mutual Information Regularized Diabetic Retinopathy Classification
The domain shift between training and testing data presents a significant challenge for training generalizable deep learning models. As a consequence, the performance of models trained with the independent and identically distributed (i.i.d.) assumption deteriorates when deployed in the real world. This problem is exacerbated in the medical imaging context due to variations in data acquisition across clinical centers, medical apparatus, and patients. Domain generalization (DG) aims to address this problem by learning a model that generalizes well to any unseen target domain. Many domain generalization techniques were unsuccessful in learning domain-invariant representations due to the large domain shift. Furthermore, multiple tasks in medical imaging are not yet extensively studied in existing literature from a DG point of view. In this paper, we introduce a DG method that re-establishes the model objective function as a maximization of mutual information with a large pretrained model to the medical imaging field. We re-visit the problem of DG in Diabetic Retinopathy (DR) classification to establish a clear benchmark with a correct model selection strategy and to achieve robust domain-invariant representation for an improved generalization. Moreover, we conduct extensive experiments on public datasets to show that our proposed method consistently outperforms the previous state-of-the-art by a margin of 5.25% in average accuracy and a lower standard deviation. Source code is available at https://github.com/BioMedIA-MBZUAI/DGM-DR.
Aleksandr Matsun, Dana O. Mohamed, Sharon Chokuwa, Muhammad Ridzuan, Mohammad Yaqub
SEDA: Self-ensembling ViT with Defensive Distillation and Adversarial Training for Robust Chest X-Rays Classification
Deep Learning methods have recently seen increased adoption in medical imaging applications. However, elevated vulnerabilities have been explored in recent Deep Learning solutions, which can hinder future adoption. Particularly, the vulnerability of Vision Transformer (ViT) to adversarial, privacy, and confidentiality attacks raises serious concerns about their reliability in medical settings. This work aims to enhance the robustness of self-ensembling ViTs for the tuberculosis chest x-ray classification task. We propose Self-Ensembling ViT with defensive Distillation and Adversarial training (SEDA). SEDA utilizes efficient CNN blocks to learn spatial features with various levels of abstraction from feature representations extracted from intermediate ViT blocks, that are largely unaffected by adversarial perturbations. Furthermore, SEDA leverages adversarial training in combination with defensive distillation for improved robustness against adversaries. Training using adversarial examples leads to better model generalizability and improves its ability to handle perturbations. Distillation using soft probabilities introduces uncertainty and variation into the output probabilities, making it more difficult for adversarial and privacy attacks. Extensive experiments performed with the proposed architecture and training paradigm on a publicly available Tuberculosis x-ray dataset show SOTA efficacy of SEDA compared to SEViT in terms of computational efficiency, with a 70× lighter framework and enhanced robustness of +9%. Code: GitHub.
Raza Imam, Ibrahim Almakky, Salma Alrashdi, Baketah Alrashdi, Mohammad Yaqub
A Continual Learning Approach for Cross-Domain White Blood Cell Classification
Accurate classification of white blood cells in peripheral blood is essential for diagnosing hematological diseases. Due to constantly evolving clinical settings, data sources, and disease classifications, it is necessary to update machine learning classification models regularly for practical real-world use. Such models significantly benefit from sequentially learning from incoming data streams without forgetting previously acquired knowledge. However, models can suffer from catastrophic forgetting, causing a drop in performance on previous tasks when fine-tuned on new data. Here, we propose a rehearsal-based continual learning approach for class incremental and domain incremental scenarios in white blood cell classification. To choose representative samples from previous tasks, we employ exemplar set selection based on the model’s predictions. This involves selecting the most confident samples and the most challenging samples identified through uncertainty estimation of the model. We thoroughly evaluated our proposed approach on three white blood cell classification datasets that differ in color, resolution, and class composition, including scenarios where new domains or new classes are introduced to the model with every task. We also test a long class incremental experiment with both new domains and new classes. Our results demonstrate that our approach outperforms established baselines in continual learning, including existing iCaRL and EWC methods for classifying white blood cells in cross-domain environments.
Ario Sadafi, Raheleh Salehi, Armin Gruber, Sayedali Shetab Boushehri, Pascal Giehr, Nassir Navab, Carsten Marr
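The exemplar selection described in the abstract above keeps both the most confident and the most challenging samples from earlier tasks. A minimal sketch of one way to do this follows. It uses the entropy of the predicted class probabilities as the uncertainty measure, which is an assumption made here for illustration; the abstract does not state which uncertainty estimate the authors use, and the function names are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy of a predicted class distribution (higher = more uncertain)."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_exemplars(predictions, num_confident, num_uncertain):
    """Return indices of the most confident and the most uncertain samples.

    `predictions[i]` is the model's class-probability vector for sample i.
    The two groups do not overlap: uncertain samples are drawn only from
    those not already picked as confident.
    """
    ranked = sorted(range(len(predictions)),
                    key=lambda i: entropy(predictions[i]))
    confident = ranked[:num_confident]
    remaining = ranked[num_confident:]
    uncertain = remaining[-num_uncertain:] if num_uncertain > 0 else []
    return confident + uncertain
```

The returned indices would then populate the rehearsal memory that is replayed alongside new data when the model is fine-tuned on the next task.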
Metadata Improves Segmentation Through Multitasking Elicitation
Metainformation is a common companion to biomedical images. However, this potentially powerful additional source of signal from image acquisition has had limited use in deep learning methods, for semantic segmentation in particular. Here, we incorporate metadata by employing a channel modulation mechanism in convolutional networks and study its effect on semantic segmentation tasks. We demonstrate that metadata as additional input to a convolutional network can improve segmentation results while being inexpensive in implementation as a nimble add-on to popular models. We hypothesize that this benefit of metadata can be attributed to facilitating multitask switching. This aspect of metadata-driven systems is explored and discussed in detail.
Iaroslav Plutenko, Mikhail Papkov, Kaupo Palo, Leopold Parts, Dmytro Fishman
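The channel modulation mechanism mentioned in the abstract above can be illustrated with a feature-wise affine modulation, in which a metadata vector is mapped to one scale and one shift per channel. This is a sketch under that assumption only: the abstract does not specify the exact form of modulation used in the paper, and the function names and the linear mapping from metadata to scale and shift are hypothetical.

```python
def linear(matrix, bias, vector):
    """Dense layer without activation: matrix @ vector + bias."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(matrix, bias)]


def modulate(feature_maps, metadata, scale_w, scale_b, shift_w, shift_b):
    """Scale and shift each channel using values predicted from metadata.

    `feature_maps` is a list of channels, each a 2D list (height x width).
    `metadata` is a flat list of floats, e.g. an encoded acquisition setting.
    """
    gamma = linear(scale_w, scale_b, metadata)  # one scale per channel
    beta = linear(shift_w, shift_b, metadata)   # one shift per channel
    return [[[g * value + b for value in row] for row in channel]
            for channel, g, b in zip(feature_maps, gamma, beta)]
```

Because the same convolutional features are rescaled differently for different metadata values, a single network can behave like several task-specific ones, which is consistent with the multitask switching hypothesis stated in the abstract.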
Self-prompting Large Vision Models for Few-Shot Medical Image Segmentation
Recent advancements in large foundation models have shown promising potential in the medical industry due to their flexible prompting capability. One such model, the Segment Anything Model (SAM), a prompt-driven segmentation model, has shown remarkable performance improvements, surpassing state-of-the-art approaches in medical image segmentation. However, existing methods primarily rely on tuning strategies that require extensive data or prior prompts tailored to the specific task, making it particularly challenging when only a limited number of data samples are available. In this paper, we propose a novel perspective on self-prompting in medical vision applications. Specifically, we harness the embedding space of SAM to prompt itself through a simple yet effective linear pixel-wise classifier. By preserving the encoding capabilities of the large model, the contextual information from its decoder, and leveraging its interactive promptability, we achieve competitive results on multiple datasets (i.e. improvement of more than 15% compared to fine-tuning the mask decoder using a few images). Our code is available at https://github.com/PeterYYZhang/few-shot-self-prompt-SAM.
Qi Wu, Yuyao Zhang, Marawan Elbatel