
Resource-Efficient Medical Image Analysis

First MICCAI Workshop, REMIA 2022, Singapore, September 22, 2022, Proceedings

  • 2022
  • Book

About this Book

This book constitutes the proceedings of the First MICCAI Workshop on Resource-Efficient Medical Image Analysis, REMIA 2022, held in conjunction with MICCAI 2022 in September 2022 as a hybrid event. REMIA 2022 accepted 13 of the 19 submissions received. The workshop aims to foster discussion on the practical application of medical imaging systems under constraints on data, labels, and hardware.

Table of Contents

Frontmatter
Multi-task Semi-supervised Learning for Vascular Network Segmentation and Renal Cell Carcinoma Classification
Abstract
Vascular network analysis is crucial to defining the tumoral architecture and thus diagnosing the cancer subtype. However, automatic vascular network segmentation from Hematoxylin and Eosin (H&E)-stained histopathological images remains a challenge due to background complexity. Moreover, large manually annotated vascular network databases are lacking. In this paper, we propose a method that reduces reliance on labeled data through semi-supervised learning (SSL). Additionally, considering the correlation between tumor classification and vascular segmentation, we propose a multi-task learning (MTL) model that can simultaneously segment the vascular network using SSL and predict the tumor class in a supervised context. This multi-task learning procedure offers an end-to-end machine learning solution to joint vascular network segmentation and tumor classification. Experiments were carried out on a database of histopathological images of renal cell carcinoma (RCC) and then tested on both our own RCC dataset and the open-source TCGA dataset. The results show that the proposed MTL-SSL model outperforms the conventional supervised-learning segmentation approach.
Rudan Xiao, Damien Ambrosetti, Xavier Descombes
Self-supervised Antigen Detection Artificial Intelligence (SANDI)
Abstract
Multiplexed pathology imaging techniques allow spatially resolved analysis of cell phenotypes for interrogating disease biology. Existing methods for cell phenotyping in multiplex images require an extensive annotation workload due to the need for fully supervised training. To overcome this challenge, we develop SANDI, a self-supervision-based pipeline that learns intrinsic similarities in unlabeled cell images to mitigate the requirement for expert supervision. The capability of SANDI to efficiently classify cells with minimal manual annotation is demonstrated through the analysis of 3 different multiplexed immunohistochemistry datasets. We show that, coupled with representations learnt by SANDI from unlabeled cell images, a linear Support Vector Machine classifier trained on 10 annotations per cell type yields a weighted F1-score higher than or comparable to that of a supervised classifier trained on an average of about 300–1000 annotations per cell type. By striking a fine balance between minimal expert guidance and the power of deep learning to learn similarity within abundant data, SANDI presents new opportunities for efficient, large-scale learning on multiplexed imaging data.
Hanyun Zhang, Khalid AbdulJabbar, Tami Grunewald, Ayse Akarca, Yeman Hagos, Catherine Lecat, Dominic Pate, Lydia Lee, Manuel Rodriguez-Justo, Kwee Yong, Jonathan Ledermann, John Le Quesne, Teresa Marafioti, Yinyin Yuan
RadTex: Learning Efficient Radiograph Representations from Text Reports
Abstract
Automated analysis of chest radiography using deep learning has tremendous potential to enhance the clinical diagnosis of diseases in patients. However, deep learning models typically require large amounts of annotated data to achieve high performance – often an obstacle to medical domain adaptation. In this paper, we build a data-efficient learning framework that utilizes radiology reports to improve medical image classification performance with limited labeled data (fewer than 1000 examples). Specifically, we examine image-captioning pretraining to learn high-quality medical image representations that train on fewer examples. Following joint pretraining of a convolutional encoder and transformer decoder, we transfer the learned encoder to various classification tasks. Averaged over 9 pathologies, we find that our model achieves higher classification performance than ImageNet-supervised and in-domain supervised pretraining when labeled training data is limited.
Keegan Quigley, Miriam Cha, Ruizhi Liao, Geeticka Chauhan, Steven Horng, Seth Berkowitz, Polina Golland
Single Domain Generalization via Spontaneous Amplitude Spectrum Diversification
Abstract
Due to the domain shift problem, deep learning models trained on one domain often fail to generalize well to others. Researchers have formulated this realistic-yet-challenging scenario as a new research line, termed single domain generalization (single-DG), which aims to generalize a model trained on a single source domain to multiple target domains. Existing single-DG approaches try to address the problem by generating diverse samples using extra trainable network modules. However, due to the limited amount of medical data, the extra network parameters are difficult to train, and the generated samples often fail to improve model generalization satisfactorily. In this paper, we propose a simple-yet-effective Fourier-based approach that augments data via spontaneous Amplitude SPECTrum diverSification (ASPECTS) for single domain generalization. Concretely, the proposed approach first converts the image into the frequency domain using the Fourier transform, and then spontaneously generates diverse samples by editing the amplitude spectrum with a pool of randomization operations. The approach rests on the assumption that high-level semantic information (domain-invariant) is embedded in the phase spectrum of the Fourier-transformed image, while the amplitude spectrum mainly contains domain-variant information. We evaluate the proposed ASPECTS approach on both publicly available and private multi-domain datasets. Compared to existing single-DG approaches, our method is much easier to implement (i.e., it requires no training of extra network modules) and yields superior improvement.
Yuexiang Li, Nanjun He, Yawen Huang
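The core Fourier manipulation described in the ASPECTS abstract can be sketched in a few lines. This is an illustrative reconstruction, not the authors' code: the function name, the uniform-scaling edit, and the `alpha_range` parameter are assumptions standing in for the paper's full pool of randomization operations.

```python
import numpy as np

def aspects_augment(image, rng=None, alpha_range=(0.5, 1.5)):
    """Sketch of amplitude-spectrum diversification: randomly rescale
    the FFT amplitude while leaving the phase spectrum (assumed to
    carry domain-invariant semantics) untouched."""
    rng = np.random.default_rng() if rng is None else rng
    spectrum = np.fft.fft2(image)
    amplitude, phase = np.abs(spectrum), np.angle(spectrum)
    # One possible edit from a randomization pool: per-frequency
    # uniform rescaling of the amplitude.
    scale = rng.uniform(*alpha_range, size=amplitude.shape)
    perturbed = (amplitude * scale) * np.exp(1j * phase)
    # Reconstruct the spatial image; the imaginary residue is
    # numerical noise since the input was real.
    return np.real(np.fft.ifft2(perturbed))
```

Because only the amplitude is edited, the augmented image keeps the structure encoded in the phase, which is what makes the generated samples usable as extra "domains" during training.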
Triple-View Feature Learning for Medical Image Segmentation
Abstract
Deep learning models, e.g. supervised Encoder-Decoder style networks, exhibit promising performance in medical image segmentation but come with a high labelling cost. We propose TriSegNet, a semi-supervised semantic segmentation framework. It uses triple-view feature learning on a limited amount of labelled data and a large amount of unlabeled data. The triple-view architecture consists of three pixel-level classifiers and a low-level shared-weight learning module. The model is first initialized with labelled data. Label processing, including data perturbation, confidence label voting, and detection of unconfident labels for annotation, enables the model to train on labelled and unlabeled data simultaneously. Each model's confidence is improved through the other two views of feature learning. This process is repeated until each model reaches the same confidence level as its counterparts. This strategy enables triple-view learning on generic medical image datasets. Bespoke overlap-based and boundary-based loss functions are tailored to the different stages of training. The segmentation results are evaluated on four publicly available benchmark datasets comprising Ultrasound, CT, MRI, and Histology images. Repeated experiments demonstrate the effectiveness of the proposed network compared against other semi-supervised algorithms, across a large set of evaluation measures.
Ziyang Wang, Irina Voiculescu
Classification of 4D fMRI Images Using ML, Focusing on Computational and Memory Utilization Efficiency
Abstract
Resting state functional magnetic resonance images (fMRI) based on BOLD signals are commonly used for classifying patients as having Alzheimer’s disease (AD), mild cognitive impairment (MCI), or being cognitively normal (CN). In this research, we represent Resting-State brain activity in Regions-of-Interest (ROI) by subsets of anatomical region voxels formed by segments of a whole-brain bounding-box Hilbert curve, resulting in on average 5× fewer voxels per ROI than the average number of AAL90 region voxels. We represent each 4D ROI data set with a vector that on average reduces an ROI data set from about 55,000 voxel signal values to 100–200 aggregated values in our spatial representation and to 15,000–30,000 in our spatial-temporal representation. We show that a Convolutional Neural Network (CNN) with a model size of about 168 kiB and a Transformer model of only 37 kiB yield classification accuracies of 80–90% for AD, MCI, and CN subject binary classification. Training the CNN and Transformer models on a data set of 551 subjects required 188 and 27 s respectively using PyTorch 1.5.0, Python 3.7.7, and CUDA 10.1 on a system with two 10-core 2.8 GHz Intel Xeon E5-2670v2 CPUs and one NVIDIA K40 GPU.
Nazanin Beheshti, Lennart Johnsson
An Efficient Defending Mechanism Against Image Attacking on Medical Image Segmentation Models
Abstract
Image attacks have been studied for a long time. However, research on defending segmentation models against such attacks is still limited, especially in medical imaging. To fill this research gap, we propose a novel mechanism for defending segmentation models against adversarial attacks. We focus on segmentation because improving robustness for segmentation is much more challenging due to its dense prediction nature, and because segmentation is central to medical imaging tasks. In this paper, we are the first to employ the Transformer as a technique to protect segmentation models from attacks. Our results on several well-known medical benchmark datasets show that the proposed defending mechanism effectively enhances the segmentation models, achieving high scores and outperforming other strong methods.
Linh D. Le, Huazhu Fu, Xinxing Xu, Yong Liu, Yanyu Xu, Jiawei Du, Joey T. Zhou, Rick Goh
Leverage Supervised and Self-supervised Pretrain Models for Pathological Survival Analysis via a Simple and Low-cost Joint Representation Tuning
Abstract
The large-scale pretrained models from terabyte-level (TB) data are now broadly used in feature extraction, model initialization, and transfer learning in pathological image analyses. Most existing studies have focused on developing more powerful pretrained models, which are increasingly unscalable for academic institutes. Very few, if any, studies have investigated how to take advantage of existing, yet heterogeneous, pretrained models for downstream tasks. As an example, our experiments elucidated that self-supervised models (e.g., contrastive learning on the entire The Cancer Genome Atlas (TCGA) dataset) achieved a superior performance compared with supervised models (e.g., ImageNet pretraining) on a classification cohort. Surprisingly, it yielded an inferior performance when translated to a cancer prognosis task. This phenomenon inspired us to explore how to leverage already trained supervised and self-supervised models for pathological survival analysis. In this paper, we present a simple and low-cost joint representation tuning (JRT) method to aggregate task-agnostic vision representations (supervised ImageNet pretrained models) and pathology-specific feature representations (self-supervised TCGA pretrained models) for downstream tasks. Our contribution is three-fold: (1) we adapt and aggregate classification-based supervised and self-supervised representations for survival prediction via joint representation tuning, (2) comprehensive analyses of prevalent pretrained-model strategies are conducted, and (3) the joint representation tuning provides a simple, yet computationally efficient, perspective on leveraging large-scale pretrained models for both cancer diagnosis and prognosis. The proposed JRT method improved the c-index from 0.705 to 0.731 on the TCGA brain cancer survival dataset. The feature-direct JRT (f-JRT) method achieved 60× training speedup while maintaining a 0.707 c-index score.
Quan Liu, Can Cui, Ruining Deng, Zuhayr Asad, Tianyuan Yao, Zheyu Zhu, Yuankai Huo
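The aggregation step behind feature-direct JRT can be sketched as a simple fusion of two precomputed feature sets. This is a minimal illustration under assumptions (function names, L2 normalisation as the fusion choice are hypothetical, not from the paper):

```python
import numpy as np

def joint_representation(sup_feats, ssl_feats):
    """Sketch of joint representation fusion: L2-normalise features
    from a supervised (e.g. ImageNet) backbone and a self-supervised
    (e.g. TCGA-pretrained) backbone, then concatenate them so a
    lightweight survival head can be trained on top without touching
    either frozen backbone."""
    def l2norm(x):
        # Normalising per sample keeps both feature sets on a
        # comparable scale before concatenation.
        return x / (np.linalg.norm(x, axis=1, keepdims=True) + 1e-8)
    return np.concatenate([l2norm(sup_feats), l2norm(ssl_feats)], axis=1)
```

Training only a small head on such concatenated, cached features is what makes a large speedup over full fine-tuning plausible: the expensive backbones are evaluated once, offline.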
Pathological Image Contrastive Self-supervised Learning
Abstract
Self-supervised learning methods have received wide attention in recent years, and contrastive learning has started to show encouraging performance in many computer vision tasks. Contrastive learning methods build pre-trained weight parameters by crafting positive/negative samples and optimizing their distance in the feature space. It is easy to construct positive/negative samples for natural images, but these methods cannot be directly applied to histopathological images because of unique image characteristics such as staining invariance and vertical-flip invariance. This paper proposes a general method for constructing clinically equivalent positive sample pairs on histopathological images, enabling contrastive learning on such data. Results on the PatchCamelyon benchmark show that our method can improve model accuracy by up to 6% while reducing training costs as well as reliance on labeled data.
Wenkang Qin, Shan Jiang, Lin Luo
Investigation of Training Multiple Instance Learning Networks with Instance Sampling
Abstract
One challenge of training deep neural networks with gigapixel whole-slide images (WSIs) is the lack of annotation at pixel level or patch (instance) level, due to the high cost and time-consuming labeling effort. Multiple instance learning (MIL), a typical weakly supervised learning method, aims to resolve this challenge by using only the slide-level label, without needing patch labels. Not all patches/instances are predictive of the outcome. The attention-based MIL method leverages this fact to enhance performance by weighting instances according to their contribution to predicting the outcome. A WSI typically contains hundreds of thousands of image patches. Training a deep neural network with thousands of image patches per slide is computationally expensive and requires a long time to converge. One way to alleviate this issue is to sample a subset of the available instances/patches within each bag for training. While the benefit of sampling strategies for decreasing computing time is evident, little effort has been made to investigate their performance. This project proposes an adaptive sampling strategy and compares it with other sampling strategies. Although all sampling strategies substantially reduce computation time, their performance is influenced by the number of selected instances. We show that if we are limited to selecting only a few instances (e.g., on the order of 1–10 instances), adaptive sampling outperforms the other strategies. However, if we are allowed to select more instances (e.g., on the order of 100–1000 instances), random sampling outperforms the other strategies.
Aliasghar Tarkhan, Trung Kien Nguyen, Noah Simon, Jian Dai
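The contrast between random and attention-guided instance sampling can be made concrete with a small sketch. This is an assumed illustration (function name, strategies, and the weighted-without-replacement scheme are my own), not the paper's adaptive algorithm:

```python
import random

def sample_instances(attn_weights, k, strategy="random", rng=None):
    """Sketch: pick k instance indices from a MIL bag.
    'random' samples uniformly; 'adaptive' favours instances with
    high attention weights, as an attention-based MIL model would."""
    rng = rng or random.Random()
    idx = list(range(len(attn_weights)))
    if strategy == "random":
        return rng.sample(idx, k)
    # Adaptive: weighted sampling without replacement, so highly
    # attended (more predictive) patches are picked first in
    # expectation.
    chosen, pool, w = [], idx[:], list(attn_weights)
    for _ in range(k):
        pick = rng.choices(range(len(pool)), weights=w)[0]
        chosen.append(pool.pop(pick))
        w.pop(pick)
    return chosen
```

The abstract's finding then reads naturally: with a tiny budget k, concentrating on high-attention patches matters, while with a large budget random sampling already covers the informative patches and avoids attention bias.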
Masked Video Modeling with Correlation-Aware Contrastive Learning for Breast Cancer Diagnosis in Ultrasound
Abstract
Breast cancer is one of the leading causes of cancer deaths in women. As the primary output of breast screening, breast ultrasound (US) video contains exclusive dynamic information for cancer diagnosis. However, training models for video analysis is non-trivial, as it requires a voluminous dataset that is also expensive to annotate. Furthermore, the diagnosis of breast lesions faces unique challenges such as inter-class similarity and intra-class variation. In this paper, we propose a pioneering approach that directly utilizes US videos in computer-aided breast cancer diagnosis. It leverages masked video modeling as pretraining to reduce reliance on dataset size and detailed annotations. Moreover, a correlation-aware contrastive loss is developed to facilitate identification of the internal and external relationships between benign and malignant lesions. Experimental results show that our proposed approach achieves promising classification performance and can outperform other state-of-the-art methods.
Zehui Lin, Ruobing Huang, Dong Ni, Jiayi Wu, Baoming Luo
A Self-attentive Meta-learning Approach for Image-Based Few-Shot Disease Detection
Abstract
In this paper, we propose a few-shot medical image classification framework for low-prevalence disease detection. The proposed method learns to transfer medical knowledge from common diseases to low-prevalence cases using meta-learning. Indeed, compared to natural images, medical images vary less from one image to another, with complex patterns and less semantic information. Hence, extracting clinically relevant features and learning a disease-specific signature from few images is challenging. Inspired by clinicians' cognitive and visual diagnosis, we integrate an attention mechanism into the meta-learning process and revise the meta-loss function to learn clinically disease-specific features across tasks. The proposed approach has been evaluated on two image-based diagnosis problems, namely low-prevalence skin and low-prevalence thorax disease diagnosis. We obtained 84.3% and 73.4% average AUC, respectively, for these two use cases in a 2-way 5-shot classification setting. The results demonstrate the effectiveness of the proposed framework compared to baselines and state-of-the-art few-shot disease detection methods.
Achraf Ouahab, Olfa Ben-Ahmed, Christine Fernandez-Maloigne
Facing Annotation Redundancy: OCT Layer Segmentation with only 10 Annotated Pixels per Layer
Abstract
Retinal layer segmentation from OCT images is a fundamental and important task in the diagnosis and monitoring of eye-related diseases. The quest for improved accuracy is driving the use of increasingly large datasets with full pixel-level layer annotations. But the manual annotation process is expensive and tedious; furthermore, the annotators need sufficient medical knowledge, which places a great burden on doctors. We observe that flattened OCT images contain a large number of repetitive texture patterns. More surprisingly, when the annotation is reduced from 100% to 10%, and even to 1%, the performance of a segmentation model drops only a little, i.e., the error rises from 2.53 µm to 2.76 µm and to 3.27 µm on a validation set, respectively. This observation motivates us to investigate the redundancies of the annotation in the feature space, which would facilitate annotation for medical images. To greatly reduce expensive annotation costs, we propose a new annotation-efficient learning paradigm that annotates a fixed and limited number of pixels for each layer in each image. Considering the redundancies in the repetitive patterns in each layer of OCT images, we employ a VQ memory bank that stores features extracted across the whole dataset to augment the visual representation. Experimental results on two public datasets validate the effectiveness of our model. With only 10 annotated pixels for each layer in an image, our performance is very close to that of previous methods trained on the fully annotated dataset.
Yanyu Xu, Xinxing Xu, Huazhu Fu, Meng Wang, Rick Siow Mong Goh, Yong Liu
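The VQ memory bank lookup at the heart of the abstract above can be sketched as a nearest-codebook query. This is an assumed minimal illustration (names and shapes are hypothetical; the paper's bank is learned and used to augment features, not to replace them outright):

```python
import numpy as np

def vq_lookup(features, codebook):
    """Sketch of a vector-quantisation memory bank query: map each
    pixel feature (N, D) to its nearest codebook entry (K, D).
    Repetitive texture patterns across the dataset mean many pixels
    share a small set of codes, which is the redundancy the method
    exploits."""
    # Squared Euclidean distance between every feature and every code.
    d2 = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    # Replace each feature by its closest stored code.
    return codebook[d2.argmin(axis=1)]
```

Because a handful of labeled pixels can anchor the codes they quantise to, supervision on 10 pixels per layer can propagate to every unlabeled pixel sharing those codes.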
Backmatter
Title
Resource-Efficient Medical Image Analysis
Edited by
Xinxing Xu
Xiaomeng Li
Dwarikanath Mahapatra
Li Cheng
Caroline Petitjean
Huazhu Fu
Copyright Year
2022
Electronic ISBN
978-3-031-16876-5
Print ISBN
978-3-031-16875-8
DOI
https://doi.org/10.1007/978-3-031-16876-5

