2019 | Book

Domain Adaptation and Representation Transfer and Medical Image Learning with Less Labels and Imperfect Data

First MICCAI Workshop, DART 2019, and First International Workshop, MIL3ID 2019, Shenzhen, Held in Conjunction with MICCAI 2019, Shenzhen, China, October 13 and 17, 2019, Proceedings

Edited by: Qian Wang, Fausto Milletari, Hien V. Nguyen, Shadi Albarqouni, M. Jorge Cardoso, Nicola Rieke, Prof. Ziyue Xu, Konstantinos Kamnitsas, Prof. Vishal Patel, Badri Roysam, Steve Jiang, Kevin Zhou, Khoa Luu, Ngan Le

Publisher: Springer International Publishing

Book series: Lecture Notes in Computer Science


About this book

This book constitutes the refereed proceedings of the First MICCAI Workshop on Domain Adaptation and Representation Transfer, DART 2019, and the First International Workshop on Medical Image Learning with Less Labels and Imperfect Data, MIL3ID 2019, held in conjunction with MICCAI 2019, in Shenzhen, China, in October 2019.

DART 2019 accepted 12 papers for publication out of 18 submissions. The papers deal with methodological advancements and ideas that can improve the applicability of machine learning and deep learning approaches to clinical settings by making them robust and consistent across different domains.

MIL3ID accepted 16 papers out of 43 submissions for publication, dealing with best practices in medical image learning with label scarcity and data imperfection.



DART 2019

Noise as Domain Shift: Denoising Medical Images by Unpaired Image Translation
We cast the problem of image denoising as a domain translation problem between high and low noise domains. By modifying the cycleGAN model, we are able to learn a mapping between these domains on unpaired retinal optical coherence tomography images. In quantitative measurements and a qualitative evaluation by ophthalmologists, we show how this approach outperforms other established methods. The results indicate that the network differentiates subtle changes in the level of noise in the image. Further investigation of the model’s feature maps reveals that it has learned to distinguish retinal layers and other distinct regions of the images.
Ilja Manakov, Markus Rohm, Christoph Kern, Benedikt Schworm, Karsten Kortuem, Volker Tresp
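The abstract above treats denoising as unpaired translation between a high-noise and a low-noise domain. As a minimal illustration, the cycle-consistency penalty central to CycleGAN-style training can be sketched as below; the generators `G` and `F` are toy stand-ins, not the paper's actual networks.

```python
# Sketch of the cycle-consistency penalty used in CycleGAN-style unpaired
# translation (illustrative only; the paper's model and losses differ).
# G maps noisy -> clean, F maps clean -> noisy.

def l1_loss(a, b):
    """Mean absolute error between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def cycle_consistency_loss(G, F, noisy_batch, clean_batch):
    """L_cyc = E[|F(G(x)) - x|] + E[|G(F(y)) - y|]."""
    loss = 0.0
    for x in noisy_batch:
        loss += l1_loss(F(G(x)), x)
    for y in clean_batch:
        loss += l1_loss(G(F(y)), y)
    return loss / (len(noisy_batch) + len(clean_batch))

# Toy generators that are exact inverses incur zero cycle loss.
G = lambda v: [2 * t for t in v]
F = lambda v: [t / 2 for t in v]
print(cycle_consistency_loss(G, F, [[1.0, 2.0]], [[4.0, 6.0]]))  # 0.0
```

In the full model this term is combined with adversarial losses on both domains, which is what allows training without paired noisy/clean images.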
Temporal Consistency Objectives Regularize the Learning of Disentangled Representations
There has been an increasing focus on learning interpretable feature representations, particularly in applications such as medical image analysis that require explainability, whilst relying less on annotated data (since annotations can be tedious and costly). Here we build on recent innovations in style-content representations to learn anatomy, imaging characteristics (appearance) and temporal correlations. By introducing a self-supervised objective of predicting future cardiac phases we improve disentanglement. We propose a temporal transformer architecture that, given an image and conditioned on a phase difference, predicts a future frame. This forces the anatomical decomposition to be consistent with the temporal cardiac contraction in cine MRI and to have semantic meaning with less need for annotations. We demonstrate that using this regularization, we achieve competitive results and improve semi-supervised segmentation, especially when very few labelled data are available. Specifically, we show Dice increases of up to 19% and 7% compared to supervised and semi-supervised approaches respectively on the ACDC dataset. Code is available at: https://github.com/gvalvano/sdtnet.
Gabriele Valvano, Agisilaos Chartsias, Andrea Leo, Sotirios A. Tsaftaris
Multi-layer Domain Adaptation for Deep Convolutional Networks
Despite their success in many computer vision tasks, convolutional networks tend to require large amounts of labeled data to achieve generalization. Furthermore, the performance is not guaranteed on a sample from an unseen domain at test time, if the network was not exposed to similar samples from that domain at training time. This hinders the adoption of these techniques in clinical settings where the imaging data is scarce, and where the intra- and inter-domain variance of the data can be substantial. We propose a domain adaptation technique that is especially suitable for deep networks to alleviate this requirement of labeled data. Our method utilizes gradient reversal layers [4] and Squeeze-and-Excite modules [6] to stabilize the training in deep networks. The proposed method was applied to publicly available histopathology and chest X-ray databases and achieved superior performance to existing state-of-the-art networks with and without domain adaptation. Depending on the application, our method can improve multi-class classification accuracy by 5–20% compared to DANN introduced in [4].
Ozan Ciga, Jianan Chen, Anne Martel
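The gradient reversal layer mentioned above acts as the identity in the forward pass and flips (and scales) gradients in the backward pass, so that the shared feature extractor learns domain-invariant features. A minimal sketch with hand-written forward/backward passes, rather than a real autodiff framework:

```python
# Minimal sketch of a gradient reversal layer (in the style of DANN [4]);
# `lam` is the reversal strength. Illustrative only, not the paper's code.

class GradientReversal:
    def __init__(self, lam=1.0):
        self.lam = lam

    def forward(self, x):
        # Identity in the forward pass.
        return x

    def backward(self, grad_output):
        # Gradients are multiplied by -lam, so the upstream feature
        # extractor is pushed to *confuse* the domain classifier.
        return [-self.lam * g for g in grad_output]

grl = GradientReversal(lam=0.5)
print(grl.forward([1.0, 2.0]))    # [1.0, 2.0]
print(grl.backward([0.2, -0.4]))  # [-0.1, 0.2]
```

In a framework such as PyTorch this would be implemented as a custom autograd function, but the two-line contract above is the whole idea.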
Intramodality Domain Adaptation Using Self Ensembling and Adversarial Training
Advances in deep learning techniques have led to compelling achievements in medical image analysis. However, performance of neural network models degrades drastically if the test data is from a domain different from training data. In this paper, we present and evaluate a novel unsupervised domain adaptation (DA) framework for semantic segmentation which uses self ensembling and adversarial training methods to effectively tackle domain shift between MR images. We evaluate our method on two publicly available MRI datasets to address two different types of domain shifts: on the BraTS dataset [11] to mitigate domain shift between high grade and low grade gliomas, and on the SCGM dataset [13] to tackle cross-institutional domain shift. Through extensive evaluation, we show that our method achieves favorable results on both datasets.
Zahil Shanis, Samuel Gerber, Mingchen Gao, Andinet Enquobahrie
Learning Interpretable Disentangled Representations Using Adversarial VAEs
Learning interpretable representations in medical applications is becoming essential for adopting data-driven models into clinical practice. It has been recently shown that learning a disentangled feature representation is important for a more compact and explainable representation of the data. In this paper, we introduce a novel adversarial variational autoencoder with a total correlation constraint to enforce independence on the latent representation while preserving the reconstruction fidelity. Our proposed method is validated on a publicly available dataset showing that the learned disentangled representation is not only interpretable, but also superior to the state-of-the-art methods. We report a relative improvement of 81.50% in terms of disentanglement, 11.60% in clustering, and 2% in supervised classification with a small amount of labeled data.
Mhd Hasan Sarhan, Abouzar Eslami, Nassir Navab, Shadi Albarqouni
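For reference, the total correlation term constrained above is commonly defined (as in related disentanglement work such as β-TCVAE) as the KL divergence between the aggregate latent posterior and the product of its marginals; this is the standard formulation and may differ in detail from the paper's exact constraint:

```latex
\mathrm{TC}(z) \;=\; D_{\mathrm{KL}}\!\left( q(z) \,\middle\|\, \textstyle\prod_{j} q(z_j) \right)
```

A total correlation of zero means the latent dimensions \(z_j\) are statistically independent, which is what the adversarial constraint drives towards.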
Synthesising Images and Labels Between MR Sequence Types with CycleGAN
Real-time (RT) sequences for cardiac magnetic resonance imaging (CMR) have recently been proposed as alternatives to standard cine CMR sequences for subjects unable to hold their breath or suffering from arrhythmia. RT image acquisitions during free breathing produce comparatively poor quality images, a trade-off necessary to achieve the high temporal resolution needed for RT imaging, and hence are less suitable in the clinical assessment of cardiac function. We demonstrate the application of a CycleGAN architecture to train autoencoder networks for synthesising cine-like images from RT images and vice versa. Applying this conversion to real-time data produces clearer images with sharper distinctions between myocardial and surrounding tissues, giving clinicians a more precise means of visually inspecting subjects. Furthermore, applying the transformation to segmented cine data to produce pseudo-real-time images allows this label information to be transferred to the real-time image domain. We demonstrate the feasibility of this approach by training a U-net based architecture using these pseudo-real-time images which can effectively segment actual real-time images.
Eric Kerfoot, Esther Puyol-Antón, Bram Ruijsink, Rina Ariga, Ernesto Zacur, Pablo Lamata, Julia Schnabel
Multi-domain Adaptation in Brain MRI Through Paired Consistency and Adversarial Learning
Supervised learning algorithms trained on medical images will often fail to generalize across changes in acquisition parameters. Recent work in domain adaptation addresses this challenge and successfully leverages labeled data in a source domain to perform well on an unlabeled target domain. Inspired by recent work in semi-supervised learning we introduce a novel method to adapt from one source domain to n target domains (as long as there is paired data covering all domains). Our multi-domain adaptation method utilises a consistency loss combined with adversarial learning. We provide results on white matter lesion hyperintensity segmentation from brain MRIs using the MICCAI 2017 challenge data as the source domain and two target domains. The proposed method significantly outperforms other domain adaptation baselines.
Mauricio Orbes-Arteaga, Thomas Varsavsky, Carole H. Sudre, Zach Eaton-Rosen, Lewis J. Haddow, Lauge Sørensen, Mads Nielsen, Akshay Pai, Sébastien Ourselin, Marc Modat, Parashkev Nachev, M. Jorge Cardoso
Cross-Modality Knowledge Transfer for Prostate Segmentation from CT Scans
Creating large scale high-quality annotations is a known challenge in medical imaging. In this work, based on the CycleGAN algorithm, we propose leveraging annotations from one modality to be useful in other modalities. More specifically, the proposed algorithm creates highly realistic synthetic CT images (SynCT) from prostate MR images using unpaired data sets. By using SynCT images (without segmentation labels) and MR images (with segmentation labels available), we have trained a deep segmentation network for precise delineation of prostate from real CT scans. For the generator in our CycleGAN, the cycle consistency term is used to guarantee that SynCT shares the identical manually-drawn, high-quality masks originally delineated on MR images. Further, we introduce a cost function based on the structural similarity index (SSIM) to improve the anatomical similarity between real and synthetic images. For segmentation, following SynCT generation by the CycleGAN, automatic delineation is achieved through a 2.5D Residual U-Net. Quantitative evaluation demonstrates comparable segmentation results between our SynCT and radiologist-drawn masks for real CT images, solving an important problem in the medical image segmentation field when ground truth annotations are not available for the modality of interest.
Yucheng Liu, Naji Khosravan, Yulin Liu, Joseph Stember, Jonathan Shoag, Ulas Bagci, Sachin Jambawalikar
A Pulmonary Nodule Detection Method Based on Residual Learning and Dense Connection
Pulmonary nodule detection using chest CT scan is an essential but challenging step towards the early diagnosis of lung cancer. Although a number of deep learning-based methods have been published in the literature, these methods still suffer from limited accuracy. In this paper, we propose a novel pulmonary nodule detection method, which uses a 3D residual U-Net (3D RU-Net) for nodule candidate detection and a 3D densely connected CNN (3D DC-Net) for false positive reduction. 3D RU-Net contains residual blocks in both contracting and expansive paths, and 3D DC-Net leverages three dense blocks to facilitate gradient flow. We evaluated our method on the benchmark LUng Nodule Analysis 2016 (LUNA16) dataset and achieved a CPM score of 0.941, which is higher than those achieved by five competing methods. Our results suggest that the proposed method can effectively detect pulmonary nodules on chest CT.
Feng Zhang, Yutong Xie, Yong Xia, Yanning Zhang
Harmonization and Targeted Feature Dropout for Generalized Segmentation: Application to Multi-site Traumatic Brain Injury Images
While learning based methods have brought extremely promising results in medical imaging, a major bottleneck is the lack of generalizability. Medical images are often collected from multiple sites and/or protocols for increasing statistical power, while a CNN trained on one site typically cannot be well-transferred to others. Further, expert-defined manual labels for medical images are typically rare, making training a dedicated CNN for each site impractical, so it is important to make the best use of the limited labeled source data. To address this problem, we harmonize the target data using adversarial learning, and propose targeted feature dropout (TFD) to enhance the robustness of the model to variations in target images. Specifically, TFD is guided by attention to stochastically remove some of the most discriminative features. Essentially, this technique combines the benefits of attention mechanism and dropout, while it does not increase parameters and computational costs, making it well-suited for small neuroimaging datasets. We evaluated our method on a challenging Traumatic Brain Injury (TBI) dataset collected from 13 sites, using labeled source data of only 14 healthy subjects. Experimental results confirmed the feasibility of using the Cycle-consistent adversarial network for harmonizing multi-site MR images, and demonstrated that TFD further improved the generalization of the vanilla segmentation model on TBI data, reaching accuracy comparable to that of supervised learning. The code is available at https://github.com/YilinLiu97/Targeted-Feature-Dropout.git.
Yilin Liu, Gregory R. Kirk, Brendon M. Nacewicz, Martin A. Styner, Mingren Shen, Dong Nie, Nagesh Adluru, Benjamin Yeske, Peter A. Ferrazzano, Andrew L. Alexander
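The attention-guided dropout described above can be sketched as: rank feature channels by an attention score and stochastically zero the most discriminative ones. The function name, ranking rule and drop rule below are illustrative assumptions, not the paper's exact implementation.

```python
# Hedged sketch of "targeted feature dropout": attention ranks channels,
# and each of the top-k attended channels is dropped with probability p.
import random

def targeted_feature_dropout(features, attention, k=2, p=0.5, rng=None):
    """Zero out each of the k highest-attention channels with probability p."""
    rng = rng or random.Random()
    # Indices of the k channels with the highest attention scores.
    top_k = sorted(range(len(attention)), key=lambda i: -attention[i])[:k]
    out = list(features)
    for i in top_k:
        if rng.random() < p:
            out[i] = 0.0  # remove a highly discriminative feature
    return out

feats = [1.0, 2.0, 3.0, 4.0]
attn = [0.1, 0.4, 0.2, 0.3]
# With p=1.0 the two most-attended channels (indices 1 and 3) are zeroed.
print(targeted_feature_dropout(feats, attn, k=2, p=1.0))  # [1.0, 0.0, 3.0, 0.0]
```

Unlike standard dropout, which removes units uniformly at random, this removes the features the model leans on most, forcing it to spread its evidence and, as the abstract argues, adds no parameters or extra computation.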
Improving Pathological Structure Segmentation via Transfer Learning Across Diseases
One of the biggest challenges in developing robust machine learning techniques for medical image analysis is the lack of access to large-scale annotated image datasets needed for supervised learning. When the task is to segment pathological structures (e.g. lesions, tumors) from patient images, training on a dataset with few samples is very challenging due to the large class imbalance and inter-subject variability. In this paper, we explore how to best leverage a segmentation model that has been pre-trained on a large dataset of patient images with one disease in order to successfully train a deep learning pathology segmentation model for a different disease, for which only a relatively small patient dataset is available. Specifically, we train a UNet model on a large-scale, proprietary, multi-center, multi-scanner Multiple Sclerosis (MS) clinical trial dataset containing over 3500 multi-modal MRI samples with expert-derived lesion labels. We explore several transfer learning approaches to leverage the learned MS model for the task of multi-class brain tumor segmentation on the BraTS 2018 dataset. Our results indicate that adapting and fine-tuning the encoder and decoder of the network trained on the larger MS dataset leads to improvement in brain tumor segmentation when few instances are available. This type of transfer learning outperforms training and testing the network on the BraTS dataset from scratch as well as several other transfer learning approaches, particularly when only a small subset of the dataset is available.
Barleen Kaur, Paul Lemaître, Raghav Mehta, Nazanin Mohammadi Sepahvand, Doina Precup, Douglas Arnold, Tal Arbel
Generating Virtual Chromoendoscopic Images and Improving Detectability and Classification Performance of Endoscopic Lesions
Endoscopy is a standard method for the diagnosis and detection of colorectal lesions. As a method to enhance the detectability of lesions, the effectiveness of pancolonic chromoendoscopy with indigocarmine has been reported. On the other hand, computer-aided diagnosis (CAD) has attracted attention. However, existing CAD systems are mainly for white light imaging (WLI) endoscopy, and the effect of combining CAD and indigocarmine dye spraying is not clear. Moreover, it is difficult to gather many indigocarmine dye-sprayed (IC) images for training. Here, we propose image-to-image translation from WLI to virtual indigocarmine dye-sprayed (VIC) images based on unpaired cycle-consistent Generative Adversarial Networks. Using this generator as a preprocessing step, we constructed detection models to evaluate the effectiveness of VIC translation for localization and classification of lesions. We also compared the localization and classification performance with and without image augmentation by using generated VIC images. Our results show that the model trained on IC and VIC images had the highest performance in both localization and classification. Therefore, VIC images are useful for the augmentation of IC images.
Akihiro Fukuda, Tadashi Miyamoto, Shunsuke Kamba, Kazuki Sumiyama

MIL3ID 2019

Self-supervised Learning of Inverse Problem Solvers in Medical Imaging
In the past few years, deep learning-based methods have demonstrated enormous success for solving inverse problems in medical imaging. In this work, we address the following question: Given a set of measurements obtained from real imaging experiments, what is the best way to use a learnable model and the physics of the modality to solve the inverse problem and reconstruct the latent image? Standard supervised learning based methods approach this problem by collecting data sets of known latent images and their corresponding measurements. However, these methods are often impractical due to the lack of availability of appropriately sized training sets, and, more generally, due to the inherent difficulty in measuring the “groundtruth” latent image. In light of this, we propose a self-supervised approach to training inverse models in medical imaging in the absence of aligned data. Our method requires access only to the measurements and the forward model at training time. We showcase its effectiveness on inverse problems arising in accelerated magnetic resonance imaging (MRI).
Ortal Senouf, Sanketh Vedula, Tomer Weiss, Alex Bronstein, Oleg Michailovich, Michael Zibulevsky
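Schematically, a self-supervised objective of this kind trains a reconstruction network \(f_\theta\) using only the measurements \(y\) and the known forward operator \(A\) of the modality, by asking the reconstruction to reproduce the measurements it came from (symbols here are generic, not necessarily the paper's notation):

```latex
\min_{\theta} \; \mathbb{E}_{y} \left\| A\, f_{\theta}(y) \,-\, y \right\|_2^2
```

No ground-truth latent image appears in the objective, which is what makes training possible in the absence of aligned image/measurement pairs.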
Weakly Supervised Segmentation of Vertebral Bodies with Iterative Slice-Propagation
Vertebral body (VB) segmentation is an important preliminary step towards medical visual diagnosis for spinal diseases. However, most previous works require pixel/voxel-wise strong supervisions, which are expensive, tedious and time-consuming for experts to annotate. In this paper, we propose a Weakly supervised Iterative Spinal Segmentation (WISS) method leveraging only four corner landmark weak labels on a single sagittal slice to achieve automatic volumetric segmentation from CT images for VBs. WISS first segments VBs on an annotated sagittal slice in an iterative self-training manner. This self-training method alternates between training and refining labels in the training set. Then WISS proceeds to segment the whole VBs slice by slice with a slice-propagation method to obtain volumetric segmentations. We evaluate the performance of WISS on a private spinal metastases CT dataset and the public lumbar CT dataset. On the first dataset, WISS achieves distinct improvements with regard to two different backbones. For the second dataset, WISS achieves dice coefficients of 91.7% and 83.7% for mid-sagittal slices and 3D CT volumes, respectively, greatly reducing labeling costs while sacrificing only a little segmentation performance.
Shiqi Peng, Bolin Lai, Guangyu Yao, Xiaoyun Zhang, Ya Zhang, Yan-Feng Wang, Hui Zhao
A Cascade Attention Network for Liver Lesion Classification in Weakly-Labeled Multi-phase CT Images
Focal liver lesion classification is important to the diagnostics of liver disease. In clinics, lesion type is usually determined by multi-phase contrast-enhanced CT images. Previous methods of automatic liver lesion classification are conducted on lesion-level, which rely heavily on lesion-level annotations. In order to reduce the burden of annotation, in this paper, we explore automatic liver lesion classification with weakly-labeled CT images (i.e. with only image-level labels). The major challenge is how to localize the regions of interest (ROIs) accurately by using only coarse image-level annotations and accordingly make the right lesion classification decision. We propose a cascade attention network to address the challenge in two stages: Firstly, a dual-attention dilated residual network (DADRN) is proposed to generate a class-specific lesion localization map, which incorporates spatial attention and channel attention blocks for capturing the high-level feature map’s long-range dependencies and helps to synthesize a more semantic-consistent feature map, thereby boosting weakly-supervised lesion localization and classification performance; secondly, a multi-channel dilated residual network (MCDRN) embedded with a convolutional long short-term memory (CLSTM) block is proposed to extract temporal enhancement information and make the final classification decision. The experiment results show that our method could achieve a mean classification accuracy of 89.68%, which significantly mitigates the performance gap between weakly-supervised approaches and fully supervised counterparts.
Xiao Chen, Lanfen Lin, Hongjie Hu, Qiaowei Zhang, Yutaro Iwamoto, Xianhua Han, Yen-Wei Chen, Ruofeng Tong, Jian Wu
CT Data Curation for Liver Patients: Phase Recognition in Dynamic Contrast-Enhanced CT
As the demand for more descriptive machine learning models grows within medical imaging, bottlenecks due to data paucity will exacerbate. Thus, collecting enough large-scale data will require automated tools to harvest data/label pairs from messy and real-world datasets, such as hospital picture archiving and communication systems (PACSs). This is the focus of our work, where we present a principled data curation tool to extract multi-phase computed tomography (CT) liver studies and identify each scan’s phase from a real-world and heterogeneous hospital PACS dataset. Emulating a typical deployment scenario, we first obtain a set of noisy labels from our institutional partners that are text mined using simple rules from DICOM tags. We train a deep learning system, using a customized and streamlined 3D squeeze and excitation (SE) architecture, to identify non-contrast, arterial, venous, and delay phase dynamic CT liver scans, filtering out anything else, including other types of liver contrast studies. To exploit as much training data as possible, we also introduce an aggregated cross entropy loss that can learn from scans only identified as “contrast”. Extensive experiments on a dataset of 43K scans of 7680 patient imaging studies demonstrate that our 3DSE architecture, armed with our aggregated loss, can achieve a mean F1 of 0.977 and can correctly harvest up to 92.7% of studies, which significantly outperforms the text-mined and standard-loss approach, and also outperforms other, more complex model architectures.
Bo Zhou, Adam P. Harrison, Jiawen Yao, Chi-Tung Cheng, Jing Xiao, Chien-Hung Liao, Le Lu
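One natural reading of an aggregated cross entropy for coarse labels is: sum the predicted probabilities of every class consistent with the coarse label (e.g. all contrast phases for a scan known only to be "contrast") and take the negative log of that sum. The sketch below follows that reading; the function name and exact form are assumptions, not necessarily the paper's loss.

```python
# Hedged sketch of an aggregated cross entropy for coarse labels:
# -log of the total probability mass on the admissible classes.
import math

def aggregated_ce(probs, label_set):
    """probs: class probabilities summing to 1;
    label_set: indices of classes consistent with the coarse label."""
    return -math.log(sum(probs[c] for c in label_set))

# Phases: 0 = non-contrast, 1 = arterial, 2 = venous, 3 = delay.
probs = [0.1, 0.5, 0.3, 0.1]
# A scan labeled only "contrast" may be any of phases 1-3.
print(round(aggregated_ce(probs, {1, 2, 3}), 4))  # -log(0.9), i.e. 0.1054
```

When the label set is a single class this reduces to the ordinary cross entropy, so fully and coarsely labeled scans can share one loss.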
Active Learning Technique for Multimodal Brain Tumor Segmentation Using Limited Labeled Images
Image segmentation is an essential step in biomedical image analysis. In recent years, deep learning models have achieved significant success in segmentation. However, deep learning requires the availability of large annotated data to train these models, which can be challenging in the biomedical imaging domain. In this paper, we aim to accomplish biomedical image segmentation with limited labeled data using active learning. We present a deep active learning framework that selects additional data points to be annotated by combining U-Net with an efficient and effective query strategy to capture the most uncertain and representative points. This algorithm decouples the representative part by first finding the core points in the unlabeled pool and then selecting the most uncertain points from the reduced pool, which are different from the labeled pool. In our experiment, only 13% of the dataset was required with active learning to outperform the model trained on the entire 2018 MICCAI Brain Tumor Segmentation (BraTS) dataset. Thus, active learning reduced the amount of labeled data required for image segmentation without a significant loss in the accuracy.
Dhruv Sharma, Zahil Shanis, Chandan K. Reddy, Samuel Gerber, Andinet Enquobahrie
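The two-step query strategy above (representativeness first, uncertainty second) can be sketched as: restrict the candidate pool to pre-computed "core" points, then rank those by predictive entropy and annotate the top few. The names and the entropy-based uncertainty measure here are illustrative assumptions.

```python
# Hedged sketch of a decoupled active-learning query: filter to core
# (representative) points, then pick the most uncertain among them.
import math

def entropy(p):
    """Shannon entropy of a probability vector (natural log)."""
    return -sum(q * math.log(q) for q in p if q > 0)

def select_queries(unlabeled_probs, core_indices, budget):
    """Return `budget` indices: the highest-entropy samples among the core points."""
    ranked = sorted(core_indices, key=lambda i: -entropy(unlabeled_probs[i]))
    return ranked[:budget]

probs = [
    [0.99, 0.01],  # confident prediction
    [0.55, 0.45],  # uncertain
    [0.90, 0.10],
    [0.50, 0.50],  # maximally uncertain
]
# Only indices 0, 1, 3 were kept as core points; pick the 2 most uncertain.
print(select_queries(probs, core_indices=[0, 1, 3], budget=2))  # [3, 1]
```

Decoupling the two criteria keeps the expensive representativeness computation off the per-round critical path: only the cheap uncertainty ranking is repeated each query round.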
Semi-supervised Learning of Fetal Anatomy from Ultrasound
Semi-supervised learning methods have achieved excellent performance on standard benchmark datasets using very few labelled images. Anatomy classification in fetal 2D ultrasound is an ideal problem setting to test whether these results translate to non-ideal data. Our results indicate that inclusion of a challenging background class can be detrimental and that semi-supervised learning mostly benefits classes that are already distinct, sometimes at the expense of more similar classes.
Jeremy Tan, Anselm Au, Qingjie Meng, Bernhard Kainz
Multi-modal Segmentation with Missing MR Sequences Using Pre-trained Fusion Networks
Missing data is a common problem in machine learning and in retrospective imaging research it is often encountered in the form of missing imaging modalities. We propose to take into account missing modalities in the design and training of neural networks, to ensure that they are capable of providing the best possible prediction even when multiple images are not available. The proposed network combines three modifications to the standard 3D UNet architecture: a training scheme with dropout of modalities, a multi-pathway architecture with fusion layer in the final stage, and the separate pre-training of these pathways. These modifications are evaluated incrementally in terms of performance on full and missing data, using the BraTS multi-modal segmentation challenge. The final model shows significant improvement with respect to the state of the art on missing data and requires less memory during training.
Karin van Garderen, Marion Smits, Stefan Klein
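The "dropout of modalities" training scheme above can be sketched as: on each training sample, randomly zero out entire input modalities while guaranteeing at least one survives, so the network learns to predict from any subset. The function and the zero-filling convention below are illustrative assumptions.

```python
# Hedged sketch of modality dropout for multi-modal training: each
# modality is dropped independently, but never all at once.
import random

def drop_modalities(volumes, p=0.5, rng=None):
    """volumes: dict of modality name -> image (a flat list here for brevity).
    Returns a copy with some modalities replaced by zeros."""
    rng = rng or random.Random()
    kept = [m for m in volumes if rng.random() >= p]
    if not kept:  # never drop everything: keep one modality at random
        kept = [rng.choice(sorted(volumes))]
    return {m: (v if m in kept else [0.0] * len(v)) for m, v in volumes.items()}

scans = {"T1": [1.0, 1.0], "T2": [2.0, 2.0], "FLAIR": [3.0, 3.0]}
out = drop_modalities(scans, p=0.5, rng=random.Random(42))
# At least one modality always survives dropout.
print(sum(any(v) for v in out.values()) >= 1)  # True
```

At test time, a missing MR sequence is then handled the same way it was seen in training: its channel is simply zeroed out.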
More Unlabelled Data or Label More Data? A Study on Semi-supervised Laparoscopic Image Segmentation
Improving a semi-supervised image segmentation task has the option of adding more unlabelled images, labelling the unlabelled images or combining both, as neither image acquisition nor expert labelling can be considered trivial in most clinical applications. With a laparoscopic liver image segmentation application, we investigate the performance impact by altering the quantities of labelled and unlabelled training data, using a semi-supervised segmentation algorithm based on the mean teacher learning paradigm. We first report a significantly higher segmentation accuracy, compared with supervised learning. Interestingly, this comparison reveals that the training strategy adopted in the semi-supervised algorithm is also responsible for this observed improvement, in addition to the added unlabelled data. We then compare different combinations of labelled and unlabelled data set sizes for training semi-supervised segmentation networks, to provide a quantitative example of the practically useful trade-off between the two data planning strategies in this surgical guidance application.
Yunguan Fu, Maria R. Robu, Bongjin Koo, Crispin Schneider, Stijn van Laarhoven, Danail Stoyanov, Brian Davidson, Matthew J. Clarkson, Yipeng Hu
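The mean teacher paradigm the study builds on maintains a "teacher" network whose weights are an exponential moving average (EMA) of the "student's" weights; the student is trained to be consistent with the teacher on unlabelled images. A minimal sketch of the EMA update, with weights as flat lists for brevity:

```python
# Sketch of the mean-teacher EMA update: the teacher tracks a smoothed
# copy of the student. Illustrative, not the paper's training code.

def ema_update(teacher, student, alpha=0.99):
    """teacher <- alpha * teacher + (1 - alpha) * student, element-wise."""
    return [alpha * t + (1 - alpha) * s for t, s in zip(teacher, student)]

teacher = [0.0, 1.0]
student = [1.0, 0.0]
teacher = ema_update(teacher, student, alpha=0.75)
print(teacher)  # [0.25, 0.75]
```

Because the teacher averages over many student checkpoints, its predictions on unlabelled data are more stable, which is one explanation for the abstract's observation that the training strategy itself, not only the extra unlabelled images, contributes to the accuracy gain.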
Few-Shot Learning with Deep Triplet Networks for Brain Imaging Modality Recognition
Image modality recognition is essential for efficient imaging workflows in current clinical environments, where multiple imaging modalities are used to better comprehend complex diseases. Emerging biomarkers from novel, rare modalities are being developed to aid in such understanding, however the availability of these images is often limited. This scenario raises the necessity of recognising new imaging modalities without them being collected and annotated in large amounts. In this work, we present a few-shot learning model for limited training examples based on Deep Triplet Networks. We show that the proposed model is more accurate in distinguishing different modalities than a traditional Convolutional Neural Network classifier when limited samples are available. Furthermore, we evaluate the performance of both classifiers when presented with noisy samples and provide an initial inspection of how the proposed model can incorporate measures of uncertainty to be more robust against out-of-sample examples.
Santi Puch, Irina Sánchez, Matt Rowe
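The core of a deep triplet network is the triplet loss: embeddings of an anchor and a same-class positive are pulled together, while a different-class negative is pushed at least a margin further away. A self-contained sketch on plain vectors (the embedding network itself is omitted):

```python
# Standard triplet loss: max(0, d(a, p) - d(a, n) + margin).
# Illustrative sketch; the paper's distances/margin may differ.
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Zero once the negative is at least `margin` farther than the positive."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

a, p, n = [0.0, 0.0], [0.0, 1.0], [5.0, 0.0]
print(triplet_loss(a, p, n, margin=1.0))  # 0.0: the negative is already far enough
```

Because the loss compares relative distances rather than fitting per-class decision boundaries, a few examples of a new modality suffice to place it in the embedding space, which is what enables the few-shot recognition described above.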
A Convolutional Neural Network Method for Boundary Optimization Enables Few-Shot Learning for Biomedical Image Segmentation
Obtaining large amounts of annotated biomedical data to train convolutional neural networks (CNNs) for image segmentation is expensive. We propose a method that requires only a few segmentation examples to accurately train a semi-automated segmentation algorithm. Our algorithm, a convolutional neural network method for boundary optimization (CoMBO), can be used to rapidly outline object boundaries using orders of magnitude less annotation than full segmentation masks, i.e., only a few pixels per image. We found that CoMBO is significantly more accurate than state-of-the-art machine learning methods such as Mask R-CNN. We also show how we can use CoMBO predictions, when CoMBO is trained on just 3 images, to rapidly create large amounts of accurate training data for Mask R-CNN. Our few-shot method is demonstrated on ISBI cell tracking challenge datasets.
Erica M. Rutter, John H. Lagergren, Kevin B. Flores
Transfer Learning from Partial Annotations for Whole Brain Segmentation
Brain MR image segmentation is a key task in neuroimaging studies. It is commonly conducted using standard computational tools, such as FSL, SPM and multi-atlas segmentation, which are often registration-based and suffer from high computational cost. Recently, there has been increased interest in using deep neural networks for brain image segmentation, which have demonstrated advantages in both speed and performance. However, neural network-based approaches normally require a large amount of manual annotations for optimising the massive number of network parameters. For 3D networks used in volumetric image segmentation, this has become a particular challenge, as a 3D network consists of many more parameters than its 2D counterpart. Manual annotation of 3D brain images is extremely time-consuming and requires extensive involvement of trained experts. To address the challenge of limited manual annotations, here we propose a novel multi-task learning framework for brain image segmentation, which utilises a large amount of automatically generated partial annotations together with a small set of manually created full annotations for network training. Our method yields a high performance comparable to state-of-the-art methods for whole brain segmentation.
Chengliang Dai, Yuanhan Mo, Elsa Angelini, Yike Guo, Wenjia Bai
Learning to Segment Skin Lesions from Noisy Annotations
Deep convolutional neural networks have driven substantial advancements in the automatic understanding of images. Requiring a large collection of images and their associated annotations is one of the main bottlenecks limiting the adoption of deep networks. In the task of medical image segmentation, the need for pixel-level semantic annotations performed by human experts exacerbates this difficulty. This paper proposes a new framework to train a fully convolutional segmentation network from a large set of cheap unreliable annotations and a small set of expert-level clean annotations. We propose a spatially adaptive reweighting approach to treat clean and noisy pixel-level annotations commensurately in the loss function. We deploy a meta-learning approach to assign higher importance to pixels whose loss gradient direction is closer to those of clean data. Our experiments on training the network using segmentation ground truth corrupted with different levels of annotation noise show how spatial reweighting improves the robustness of deep networks to noisy annotations.
Zahra Mirikharaji, Yiqi Yan, Ghassan Hamarneh
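The reweighting principle — upweight noisy pixels whose gradient agrees with the clean-data gradient — can be sketched without the surrounding network. This is a simplified stand-in, not the paper's meta-learning procedure: it just scores each pixel's flattened loss gradient by cosine similarity to the mean clean gradient and clips negative alignments to zero.

```python
import numpy as np

def gradient_alignment_weights(noisy_grads, clean_grad, eps=1e-12):
    """Per-pixel weights from the cosine similarity between each noisy
    pixel's loss gradient and the mean gradient over clean annotations.
    Pixels whose gradient points away from the clean direction get
    weight zero; the rest are normalised to sum to one.

    noisy_grads: (N, D) one flattened gradient per noisy pixel
    clean_grad:  (D,) mean gradient over the clean set
    """
    num = noisy_grads @ clean_grad
    den = (np.linalg.norm(noisy_grads, axis=1)
           * np.linalg.norm(clean_grad) + eps)
    w = np.maximum(num / den, 0.0)   # keep only aligned pixels
    s = w.sum()
    return w / s if s > 0 else w
```

In the full method these weights would be recomputed each step and multiplied into the per-pixel loss, so mislabeled pixels that pull the model away from the clean data are effectively silenced.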
A Weakly Supervised Method for Instance Segmentation of Biological Cells
We present a weakly supervised deep learning method to perform instance segmentation of cells present in microscopy images. Annotation of biomedical images in the lab can be scarce, incomplete, and inaccurate. This is of concern when supervised learning is used for image analysis, as the discriminative power of a learning model might be compromised in these situations. To overcome the curse of poor labeling, our method focuses on three aspects to improve learning: (i) we propose a loss function operating on three classes to facilitate separating adjacent cells and to drive the optimizer to properly classify underrepresented regions; (ii) a contour-aware weight map model is introduced to strengthen contour detection while improving the network's generalization capacity; and (iii) we augment data by carefully modulating local intensities on edges shared by adjoining regions, to account for possibly weak signals on these edges. Generated probability maps are segmented using different methods, with the watershed-based one generally offering the best solutions, especially in regions where the prevalence of a single class is not clear. The combination of these contributions allows segmenting individual cells in challenging images. We demonstrate our method on sparse and crowded cell images, showing improvements in the learning process for a fixed network architecture.
Fidel A. Guerrero-Peña, Pedro D. Marrero Fernandez, Tsang Ing Ren, Alexandre Cunha
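A three-class target of this kind can be derived directly from an instance label map: pixels where the dilations of two different cells meet become a separate "touching border" class that the loss can then emphasise. A minimal scipy sketch, assuming a simple one-pixel dilation to find shared borders (the paper's exact construction may differ):

```python
import numpy as np
from scipy import ndimage as ndi

BG, CELL, TOUCH = 0, 1, 2  # background, cell interior, touching border

def three_class_target(instances):
    """Build a 3-class training target from an integer instance label map.
    Pixels where the 1-pixel dilations of two different cells overlap are
    marked TOUCH, giving the network an explicit class for the thin
    boundaries that separate adjacent cells.
    """
    target = np.where(instances > 0, CELL, BG).astype(np.int64)
    grown = np.zeros_like(instances)
    overlap = np.zeros(instances.shape, dtype=bool)
    for lab in np.unique(instances):
        if lab == 0:
            continue
        d = ndi.binary_dilation(instances == lab)
        overlap |= (grown > 0) & d   # meets a previously grown cell
        grown[d] = lab
    target[overlap] = TOUCH
    return target
```

Predicting the TOUCH class as its own channel is what lets a later watershed step cut crowded cells apart along those borders.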
Towards Practical Unsupervised Anomaly Detection on Retinal Images
Supervised deep learning approaches provide state-of-the-art performance on medical image classification tasks for disease screening. However, these methods require large labeled datasets that involve resource-intensive expert annotation. Further, disease screening applications have a low prevalence of abnormal samples; this class imbalance makes the task more akin to anomaly detection. While the machine learning community has proposed unsupervised deep learning methods for anomaly detection, they have yet to be characterized on medical images, where normal vs. anomaly distinctions may be more subtle and variable. In this work, we characterize existing unsupervised anomaly detection methods on retinal fundus images, and find that they require significant fine-tuning and offer unsatisfactory performance. We thus propose an efficient and effective transfer-learning based approach for unsupervised anomaly detection. Our method employs a deep convolutional neural network trained on ImageNet as a feature extractor, and subsequently feeds the learned feature representations into an existing unsupervised anomaly detection method. We show that our approach significantly outperforms baselines on two natural image datasets and two retinal fundus image datasets, all with minimal fine-tuning. We further show the ability to leverage very small numbers of labeled anomalies to improve performance. Our work establishes a strong unsupervised baseline for image-based anomaly detection, alongside a flexible and scalable approach for screening applications.
Khalil Ouardini, Huijuan Yang, Balagopal Unnikrishnan, Manon Romain, Camille Garcin, Houssam Zenati, J. Peter Campbell, Michael F. Chiang, Jayashree Kalpathy-Cramer, Vijay Chandrasekhar, Pavitra Krishnaswamy, Chuan-Sheng Foo
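The pipeline is: extract feature vectors with a pretrained CNN, then hand them to an off-the-shelf unsupervised detector. As a dependency-light stand-in for the detector, this sketch scores each image by the mean distance to its k nearest neighbours among normal-only training features; the specific detector and k are assumptions, not the paper's choices.

```python
import numpy as np

def knn_anomaly_scores(train_feats, test_feats, k=5):
    """Anomaly score = mean Euclidean distance from a test feature vector
    to its k nearest neighbours among normal training features.
    In the transfer-learning setting, the feature vectors would come from
    a CNN pretrained on ImageNet (e.g. a globally pooled ResNet layer).
    """
    # pairwise distances, shape (n_test, n_train)
    d = np.linalg.norm(test_feats[:, None, :] - train_feats[None, :, :],
                       axis=-1)
    d.sort(axis=1)
    return d[:, :k].mean(axis=1)   # higher = more anomalous
```

Because the detector only ever sees fixed feature vectors, swapping in a different unsupervised method (isolation forests, one-class SVMs, density estimators) requires no retraining of the feature extractor.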
Fine Tuning U-Net for Ultrasound Image Segmentation: Which Layers?
Fine-tuning a network that has been trained on a large dataset is an alternative to full training for overcoming the problem of scarce and expensive data in medical applications. While the shallow layers of the network are usually kept unchanged, deeper layers are modified according to the new dataset. This approach may not work for ultrasound images due to their drastically different appearance. In this study, we investigated the effect of fine-tuning different layers of a U-Net, originally trained on segmentation of natural images, for breast ultrasound image segmentation. Tuning the contracting part while fixing the expanding part gave substantially better results than fixing the contracting part and tuning the expanding part. Furthermore, we showed that starting to fine-tune the U-Net from the shallow layers and gradually including more layers leads to better performance than fine-tuning the network from the deep layers back towards the shallow layers. We did not observe the same results for segmentation of X-ray images, which have different salient features compared to ultrasound; for ultrasound it may therefore be more appropriate to fine-tune the shallow rather than the deep layers. Shallow layers learn lower-level features (including the speckle pattern, and probably the noise and artifact properties) that are critical for automatic segmentation in this modality.
Mina Amiri, Rupert Brooks, Hassan Rivaz
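The gradual shallow-to-deep strategy can be written down as a simple unfreezing plan: stage t fine-tunes the t+1 shallowest blocks and keeps everything else frozen. A minimal sketch with hypothetical block names (the paper does not prescribe this exact schedule function):

```python
def unfreeze_schedule(blocks, stages):
    """Gradual fine-tuning plan: start by tuning only the shallowest
    block and add one deeper block per stage, mirroring the
    shallow-first ordering that performed best in this study.

    blocks: layer-block names ordered shallow -> deep,
            e.g. ["enc1", "enc2", "enc3", "enc4", "bottleneck"]
    Returns one list of trainable blocks per training stage.
    """
    return [blocks[:min(t + 1, len(blocks))] for t in range(stages)]
```

In a framework such as PyTorch, each stage would set `requires_grad = False` on every parameter outside the stage's trainable blocks before resuming optimisation.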
Multi-task Learning for Neonatal Brain Segmentation Using 3D Dense-Unet with Dense Attention Guided by Geodesic Distance
The deep convolutional neural network has achieved outstanding performance on neonatal brain MRI tissue segmentation. However, it may fail to produce reasonable results on unseen datasets whose imaging appearance distributions differ from the training data. The main reason is that deep learning models tend to fit the training dataset well but generalize poorly to unseen datasets. To address this problem, we propose a multi-task learning method that simultaneously learns both tissue segmentation and geodesic distance regression to regularize a shared encoder network. Furthermore, a dense attention gate is explored to force the network to learn rich contextual information. Using three neonatal brain datasets acquired with different imaging protocols on different scanners, our experimental results demonstrate the superior performance of our proposed method over existing deep learning-based methods on unseen datasets.
Toan Duc Bui, Li Wang, Jian Chen, Weili Lin, Gang Li, Dinggang Shen
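The auxiliary regression target and the two-headed loss can be sketched compactly. As an assumption for the sketch, a Euclidean distance transform stands in for the geodesic distance map, and the weighting `lam` is illustrative:

```python
import numpy as np
from scipy import ndimage as ndi

def distance_regression_target(seg, normalise=True):
    """Distance-to-boundary map used as the auxiliary regression target.
    Each foreground pixel stores its distance to the nearest background
    pixel (Euclidean here, as a stand-in for geodesic distance)."""
    d = ndi.distance_transform_edt(seg > 0)
    if normalise and d.max() > 0:
        d = d / d.max()
    return d

def joint_loss(seg_loss, pred_dist, true_dist, lam=0.5):
    """Total loss = segmentation loss + lam * MSE on the distance map;
    both heads share one encoder, so the regression term regularizes
    the features the segmentation head also uses."""
    mse = float(((pred_dist - true_dist) ** 2).mean())
    return seg_loss + lam * mse
```

Because the distance target is computed from the same label maps already used for segmentation, the auxiliary task adds no annotation cost.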