2017 | Book

Medical Image Computing and Computer Assisted Intervention − MICCAI 2017

20th International Conference, Quebec City, QC, Canada, September 11-13, 2017, Proceedings, Part III

Edited by: Maxime Descoteaux, Lena Maier-Hein, Alfred Franz, Pierre Jannin, D. Louis Collins, Simon Duchesne

Publisher: Springer International Publishing

Book series: Lecture Notes in Computer Science

About this book

The three-volume set LNCS 10433, 10434, and 10435 constitutes the refereed proceedings of the 20th International Conference on Medical Image Computing and Computer-Assisted Intervention, MICCAI 2017, held in Quebec City, Canada, in September 2017.

The 255 revised full papers presented were carefully reviewed and selected from 800 submissions in a two-phase review process. The papers have been organized in the following topical sections: Part I: atlas and surface-based techniques; shape and patch-based techniques; registration techniques, functional imaging, connectivity, and brain parcellation; diffusion magnetic resonance imaging (dMRI) and tensor/fiber processing; and image segmentation and modelling. Part II: optical imaging; airway and vessel analysis; motion and cardiac analysis; tumor processing; planning and simulation for medical interventions; interventional imaging and navigation; and medical image computing. Part III: feature extraction and classification techniques; and machine learning in medical image computing.

Table of Contents

Frontmatter
Correction to: Retinal Microaneurysm Detection Using Clinical Report Guided Multi-Sieving CNN

In the originally published version, important references were omitted.

Ling Dai, Bin Sheng, Qiang Wu, Huating Li, Xuhong Hou, Weiping Jia, Ruogu Fang

Feature Extraction and Classification Techniques

Frontmatter
Deep Multi-task Multi-channel Learning for Joint Classification and Regression of Brain Status

Jointly identifying brain diseases and predicting clinical scores have attracted increasing attention in the domain of computer-aided diagnosis using magnetic resonance imaging (MRI) data, since these two tasks are highly correlated. Although several joint learning models have been developed, most existing methods focus on human-engineered features extracted from MRI data. Due to the possible heterogeneity between human-engineered features and subsequent classification/regression models, those methods may lead to sub-optimal learning performance. In this paper, we propose a deep multi-task multi-channel learning (DM$^2$L) framework for simultaneous classification and regression in brain disease diagnosis, using MRI data and personal information (i.e., age, gender, and education level) of subjects. Specifically, we first identify discriminative anatomical landmarks from MR images in a data-driven manner, and then extract multiple image patches around these detected landmarks. A deep multi-task multi-channel convolutional neural network is then developed for joint disease classification and clinical score regression. We train our model on a large multi-center cohort (i.e., ADNI-1) and test it on an independent cohort (i.e., ADNI-2). Experimental results demonstrate that DM$^2$L is superior to state-of-the-art approaches in brain disease diagnosis.

Mingxia Liu, Jun Zhang, Ehsan Adeli, Dinggang Shen
Nonlinear Feature Space Transformation to Improve the Prediction of MCI to AD Conversion

Accurate identification of patients with Mild Cognitive Impairment (MCI) at high risk of conversion to Alzheimer’s Disease (AD) offers an opportunity to target the disease process early. In this paper, we present a novel nonlinear feature transformation scheme to improve the prediction of MCI-to-AD conversion through semi-supervised learning. Using Laplacian SVM (LapSVM) as a host classifier, the proposed method learns a smooth, spatially varying transformation that makes the input data more linearly separable. Our approach has broad applicability and can boost the classification performance of many other semi-supervised learning solutions. Using baseline MR images from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database, we evaluate the effectiveness of the proposed semi-supervised framework and demonstrate improvements over state-of-the-art solutions within the same category.

Pin Zhang, Bibo Shi, Charles D. Smith, Jundong Liu
Kernel Generalized-Gaussian Mixture Model for Robust Abnormality Detection

Typical methods for abnormality detection in medical images rely on principal component analysis (PCA), kernel PCA (KPCA), or their robust variants. However, typical robust-KPCA methods use heuristics for model fitting and perform outlier detection ignoring the variances of the data within principal subspaces. In this paper, we propose a novel method for robust statistical learning by extending the multivariate generalized-Gaussian distribution to a reproducing kernel Hilbert space and employing it within a mixture model. We propose expectation maximization to fit our kernel generalized-Gaussian mixture model (KGGMM), using solely the Gram matrix and without the explicit lifting map. We exploit the KGGMM, including component means, principal directions, and variances, for abnormality detection in images. The results on 4 large publicly available datasets, involving retinopathy and cancer, show that our method outperforms the state of the art.

Nitin Kumar, Ajit V. Rajwade, Sharat Chandran, Suyash P. Awate
Latent Processes Governing Neuroanatomical Change in Aging and Dementia

Clinically normal aging and pathological processes cause structural changes in the brain. These changes likely occur in overlapping regions that accommodate neural systems with high susceptibility to deleterious factors. Due to this overlap, separating aging from pathological processes is challenging when analyzing brain structures independently. We propose to identify multivariate latent processes that govern cross-sectional and longitudinal neuroanatomical changes across the brain in aging and dementia. A discriminative representation of neuroanatomy is obtained from spectral shape descriptors in the BrainPrint. We identify latent factors by maximizing the covariance between morphological change and response variables of age and a proxy for dementia. Our results reveal cross-sectional and longitudinal patterns of change in neuroanatomy that distinguish aging processes from disease processes. Finally, the latent processes not only yield a parsimonious model but also significantly improve prediction accuracy.

Christian Wachinger, Anna Rieckmann, Martin Reuter
A Multi-armed Bandit to Smartly Select a Training Set from Big Medical Data

With the availability of big medical image data, the selection of an adequate training set is becoming increasingly important to address the heterogeneity of different datasets. Simply including all the data not only incurs high processing costs but can even harm the prediction. We formulate the smart and efficient selection of a training dataset from big medical image data as a multi-armed bandit problem, solved by Thompson sampling. Our method assumes that image features are not available at the time of the selection of the samples, and therefore relies only on meta information associated with the images. Our strategy simultaneously exploits data sources with high chances of yielding useful samples and explores new data regions. For our evaluation, we focus on the application of estimating age from a brain MRI. Our results on 7,250 subjects from 10 datasets show that our approach leads to higher accuracy while requiring only a fraction of the training data.

Benjamín Gutiérrez, Loïc Peter, Tassilo Klein, Christian Wachinger
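
To make the selection strategy concrete, here is a minimal sketch of Thompson sampling over data sources with Beta-Bernoulli posteriors. The arm definition, the reward (whether a drawn sample helped validation accuracy), and all names such as `draw_sample_and_evaluate` are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: each arm is a data source (e.g., a study site);
# reward = 1 if the drawn sample improved validation accuracy, else 0.
n_sources, n_rounds = 10, 500
successes = np.ones(n_sources)   # Beta prior alpha
failures = np.ones(n_sources)    # Beta prior beta

def draw_sample_and_evaluate(source):
    # Placeholder for "add a sample from this source, retrain, check gain".
    return rng.random() < 0.3 + 0.04 * source  # toy per-source usefulness

for _ in range(n_rounds):
    # Thompson sampling: draw a usefulness estimate from each posterior,
    # then pick the best draw (this implicitly balances explore/exploit).
    theta = rng.beta(successes, failures)
    arm = int(np.argmax(theta))
    reward = draw_sample_and_evaluate(arm)
    successes[arm] += reward
    failures[arm] += 1 - reward
```
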
Multi-level Multi-task Structured Sparse Learning for Diagnosis of Schizophrenia Disease

Multi-frequency band analysis for the diagnosis of schizophrenia (SZ) has attracted increasing attention in recent studies. However, most existing feature selection methods designed for multi-frequency band analysis do not take into account the inherent structure (i.e., both frequency specificity and complementary information) across the frequency bands, and are thus limited to identifying the discriminative feature subset in a single step. To address this problem, we propose a multi-level multi-task structured sparse learning (MLMT-TS) framework to explicitly consider the common features within a hierarchical structure. Specifically, we introduce two regularization terms into the hierarchical framework to impose commonality across different bands and specificity within individual bands. The selected features are then used to construct multiple support vector machine (SVM) classifiers. Finally, we adopt an ensemble strategy to combine the outputs of all SVM classifiers into the final decision. Our method has been evaluated on 46 subjects, and the superior classification results demonstrate the effectiveness of our proposed method compared to other methods.

Mingliang Wang, Xiaoke Hao, Jiashuang Huang, Kangcheng Wang, Xijia Xu, Daoqiang Zhang
An Unbiased Penalty for Sparse Classification with Application to Neuroimaging Data

We present a novel formulation for discriminative anatomy detection in high-dimensional neuroimaging data. While most studies solve this problem using mass-univariate approaches, recent works show better accuracy and variable selection using a sparse classification model. Such methods typically use an $l_1$ penalty for imposing sparseness and a graph net (GN) or a total variation (TV) penalty for ensuring spatial continuity and interpretability of the results. However, it is known that the $l_1$ and TV penalties have an inherent bias that leads to less stable region detection and less accurate prediction. To overcome these limitations, we propose a novel variable selection method in the context of classification, based on the Smoothly Clipped Absolute Deviation (SCAD) penalty. We experimentally show the superiority of three models based on the SCAD and SCAD-TV penalties when compared to the classical $l_1$ and TV penalties in both simulated and real MRI data from a multiple sclerosis study.

Li Zhang, Dana Cobzas, Alan Wilman, Linglong Kong
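
For reference, the Smoothly Clipped Absolute Deviation penalty the paper builds on is the standard piecewise form of Fan and Li (2001), with tuning constant $a > 2$ (commonly $a = 3.7$):

$$
p_\lambda(\theta) =
\begin{cases}
\lambda\,|\theta|, & |\theta| \le \lambda,\\
\dfrac{2a\lambda|\theta| - \theta^2 - \lambda^2}{2(a-1)}, & \lambda < |\theta| \le a\lambda,\\
\dfrac{(a+1)\lambda^2}{2}, & |\theta| > a\lambda.
\end{cases}
$$

Because the penalty is constant for $|\theta| > a\lambda$, large coefficients are not shrunk at all, which is the source of the unbiasedness claimed in the title, in contrast to the $l_1$ penalty, which shrinks every coefficient, however large, by the same amount.
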
Unsupervised Feature Learning for Endomicroscopy Image Retrieval

Learning the visual representation of medical images is a critical task in computer-aided diagnosis. In this paper, we propose Unsupervised Multimodal Graph Mining (UMGM) to learn discriminative features for probe-based confocal laser endomicroscopy (pCLE) mosaics of breast tissue. We build a multiscale multimodal graph based on both pCLE mosaics and histology images. Positive pairs are mined via cycle consistency and negative pairs are extracted based on geodesic distance. Given the positive and negative pairs, the latent feature space is discovered by reconstructing the similarity between pCLE and histology images. Experiments on a database with 700 pCLE mosaics demonstrate that the proposed method outperforms previous works on pCLE feature learning. Specifically, the top-1 accuracy in an eight-class retrieval task is 0.659, a 10% improvement over the state-of-the-art method.

Yun Gu, Khushi Vyas, Jie Yang, Guang-Zhong Yang
Maximum Mean Discrepancy Based Multiple Kernel Learning for Incomplete Multimodality Neuroimaging Data

It is challenging to use incomplete multimodality data for Alzheimer’s Disease (AD) diagnosis. The current methods to address this challenge, such as low-rank matrix completion (i.e., imputing the missing values and unknown labels simultaneously) and multi-task learning (i.e., defining one regression task for each combination of modalities and then learning them jointly), are unable to model the complex data-to-label relationship in AD diagnosis and also ignore the heterogeneity among the modalities. In light of this, we propose a new Maximum Mean Discrepancy (MMD) based Multiple Kernel Learning (MKL) method for AD diagnosis using incomplete multimodality data. Specifically, we map all the samples from different modalities into a Reproducing Kernel Hilbert Space (RKHS) by devising a new MMD algorithm. The proposed MMD method incorporates data distribution matching, pair-wise sample matching, and feature selection in a unified formulation, thus alleviating the modality heterogeneity issue and making all the samples comparable so that they can share a common classifier in the RKHS. The resulting classifier naturally captures the nonlinear data-to-label relationship. We have tested our method using MRI and PET data from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) dataset for AD diagnosis. The experimental results show that our method outperforms other methods.

Xiaofeng Zhu, Kim-Han Thung, Ehsan Adeli, Yu Zhang, Dinggang Shen
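
The core quantity above, the squared MMD between two samples, is the RKHS distance between their mean embeddings, $\mathrm{MMD}^2(P,Q) = \lVert \mu_P - \mu_Q \rVert_{\mathcal{H}}^2$, and has a simple empirical estimate from kernel evaluations alone. A minimal numpy sketch with an RBF kernel follows; the feature dimensions and `gamma` value are arbitrary placeholders, and the paper's full method additionally couples this with pair-wise sample matching and feature selection.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    # Pairwise squared Euclidean distances -> Gaussian kernel matrix
    sq = (X**2).sum(1)[:, None] + (Y**2).sum(1)[None, :] - 2 * X @ Y.T
    return np.exp(-gamma * sq)

def mmd2(X, Y, gamma=1.0):
    """Biased empirical estimate of squared MMD between samples X and Y."""
    Kxx = rbf_kernel(X, X, gamma)
    Kyy = rbf_kernel(Y, Y, gamma)
    Kxy = rbf_kernel(X, Y, gamma)
    return Kxx.mean() + Kyy.mean() - 2 * Kxy.mean()

# Toy check: two "modalities" with shifted means -> clearly nonzero MMD
rng = np.random.default_rng(1)
mri_feats = rng.normal(0.0, 1.0, size=(100, 16))
pet_feats = rng.normal(0.5, 1.0, size=(80, 16))
print(mmd2(mri_feats, pet_feats))
```
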
Liver Tissue Classification in Patients with Hepatocellular Carcinoma by Fusing Structured and Rotationally Invariant Context Representation

This work addresses multi-class liver tissue classification from multi-parameter MRI in patients with hepatocellular carcinoma (HCC), and is among the first to do so. We propose a structured prediction framework to simultaneously classify parenchyma, blood vessels, viable tumor tissue, and necrosis, which overcomes limitations related to classifying these tissue classes individually and consecutively. A novel classification framework is introduced, based on the integration of multi-scale shape and appearance features to initiate the classification, which is iteratively refined by augmenting the feature space with both structured and rotationally invariant label context features. We further study rotationally invariant label context feature representations, and introduce a method for this purpose based on the energies of spherical harmonic decompositions computed at different frequencies and radii. We test our method on full 3D multi-parameter MRI volumes from 47 patients with HCC and achieve promising results.

John Treilhard, Susanne Smolka, Lawrence Staib, Julius Chapiro, MingDe Lin, Georgy Shakirin, James S. Duncan
DOTE: Dual cOnvolutional filTer lEarning for Super-Resolution and Cross-Modality Synthesis in MRI

Cross-modal image synthesis is a topical problem in medical image computing. Existing methods for image synthesis are either tailored to a specific application, require large scale training sets, or are based on partitioning images into overlapping patches. In this paper, we propose a novel Dual cOnvolutional filTer lEarning (DOTE) approach to overcome the drawbacks of these approaches. We construct a closed loop joint filter learning strategy that generates informative feedback for model self-optimization. Our method can leverage data more efficiently thus reducing the size of the required training set. We extensively evaluate DOTE in two challenging tasks: image super-resolution and cross-modality synthesis. The experimental results demonstrate superior performance of our method over other state-of-the-art methods.

Yawen Huang, Ling Shao, Alejandro F. Frangi
Supervised Intra-embedding of Fisher Vectors for Histopathology Image Classification

In this paper, we present a histopathology image classification method with supervised intra-embedding of Fisher vectors. Recently, in general computer vision, Fisher encoding combined with convolutional neural networks (ConvNets) has become popular as a highly discriminative feature descriptor. However, Fisher vectors have two intrinsic problems that can limit their performance: high dimensionality and bursty visual elements. To address these problems, we design a novel supervised intra-embedding algorithm with a multilayer neural network model to transform the ConvNet-based Fisher vectors into a more discriminative feature representation. We apply this feature encoding method on two public datasets: the BreaKHis image dataset of benign and malignant breast tumors, and the IICBU 2008 lymphoma dataset of three malignant lymphoma subtypes. The results demonstrate that our supervised intra-embedding method effectively enhances the ConvNet-based Fisher vectors, and our classification results substantially outperform the state-of-the-art approaches on these datasets.

Yang Song, Hang Chang, Heng Huang, Weidong Cai
GSplit LBI: Taming the Procedural Bias in Neuroimaging for Disease Prediction

In voxel-based neuroimage analysis, lesion features have been the main focus in disease prediction due to their interpretability with respect to the related diseases. However, we observe that there exists another type of feature, introduced during the preprocessing steps, which we call “Procedural Bias”; moreover, such bias can be leveraged to improve classification accuracy. Nevertheless, most existing models suffer either from under-fitting, by not considering the procedural bias, or from poor interpretability, by failing to differentiate such bias from lesion features. In this paper, a novel dual-task algorithm named GSplit LBI is proposed to resolve this problem. By introducing an augmented variable enforced to have structural sparsity via a variable splitting term, the estimators for prediction and for selecting lesion features can be optimized separately while mutually monitoring each other in an iterative scheme. Empirical experiments have been conducted on the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database. The advantage of the proposed model is verified by the improved stability of the selected lesion features and better classification results.

Xinwei Sun, Lingjing Hu, Yuan Yao, Yizhou Wang
MRI-Based Surgical Planning for Lumbar Spinal Stenosis

The most common reason for spinal surgery in elderly patients is lumbar spinal stenosis (LSS). For LSS, treatment decisions based on clinical and radiological information as well as the personal experience of the surgeon show large variance. Thus, a standardized support system is of high value for a more objective and reproducible decision. In this work, we develop an automated algorithm to localize the stenosis causing the symptoms of the patient in magnetic resonance imaging (MRI). Using 22 MRI features for each of five spinal levels of 321 patients, we show it is possible to predict the location of the lesion triggering the symptoms. To support this hypothesis, we conduct an automated analysis of labeled and unlabeled MRI scans extracted from 788 patients. We quantitatively confirm the importance of radiological information and provide an algorithmic pipeline for working with raw MRI scans. Both code and data are provided for further research at www.spinalstenosis.ethz.ch.

Gabriele Abbati, Stefan Bauer, Sebastian Winklhofer, Peter J. Schüffler, Ulrike Held, Jakob M. Burgstaller, Johann Steurer, Joachim M. Buhmann
Pattern Visualization and Recognition Using Tensor Factorization for Early Differential Diagnosis of Parkinsonism

Idiopathic Parkinson’s disease (PD) and atypical parkinsonian syndromes may have similar symptoms at the early disease stage. Pattern recognition on metabolic imaging has been confirmed to be of distinct value in the early differential diagnosis of Parkinsonism. However, the principal component analysis (PCA) based method ends up with a single probability score for each disease pattern. This restricts the exploration of heterogeneous characteristic features for differentiation, and offers no visualization of the underlying mechanism to assist the radiologist/neurologist. We propose a tensor factorization based method to extract the characteristic patterns of the diseases. By decomposing the 3D data, we can capture the intrinsic characteristic pattern in the data. In particular, the disease-related patterns can be visualized individually for inspection by physicians. The test on PET images of 206 early parkinsonian patients confirmed differential patterns on the feature images visualized using the proposed method. Computer-aided diagnosis based on a multi-class support vector machine (SVM) showed improved diagnostic accuracy of Parkinsonism using the tensor-factorized feature images compared to the state-of-the-art PCA-based scores [Tang et al., Lancet Neurol. 2010].

Rui Li, Ping Wu, Igor Yakushev, Jian Wang, Sibylle I. Ziegler, Stefan Förster, Sung-Cheng Huang, Markus Schwaiger, Nassir Navab, Chuantao Zuo, Kuangyu Shi
Physiological Parameter Estimation from Multispectral Images Unleashed

Multispectral imaging in laparoscopy can provide tissue reflectance measurements for each point in the image at multiple wavelengths of light. These reflectances encode information on important physiological parameters not visible to the naked eye. Fast decoding of the data during surgery, however, remains challenging. While model-based methods suffer from inaccurate base assumptions, a major bottleneck for competing machine learning-based solutions is the lack of labelled training data. In this paper, we address this issue with the first transfer learning-based method for physiological parameter estimation from multispectral images. It relies on a highly generic tissue model that aims to capture the full range of optical tissue parameters that can potentially be observed in vivo. Adaptation of the model to a specific clinical application based on unlabelled in vivo data is achieved using a new concept of domain adaptation that explicitly addresses the high variance often introduced by conventional covariance-shift correction methods. According to comprehensive in silico and in vivo experiments, our approach enables accurate parameter estimation for various tissue types without the need to incorporate specific prior knowledge on optical properties, and could thus pave the way for many exciting applications in multispectral laparoscopy.

Sebastian J. Wirkert, Anant S. Vemuri, Hannes G. Kenngott, Sara Moccia, Michael Götz, Benjamin F. B. Mayer, Klaus H. Maier-Hein, Daniel S. Elson, Lena Maier-Hein
Segmentation of Cortical and Subcortical Multiple Sclerosis Lesions Based on Constrained Partial Volume Modeling

We propose a novel method to automatically detect and segment multiple sclerosis lesions, located both in white matter and in the cortex. The algorithm consists of two main steps: (i) a supervised approach that outputs an initial bitmap locating candidates of lesional tissue and (ii) a Bayesian partial volume estimation framework that estimates the lesion concentration in each voxel. By using a “mixel” approach, potential partial volume effects especially affecting small lesions can be modeled, thus yielding improved lesion segmentation. The proposed method is tested on multiple MR image sequences including 3D MP2RAGE, 3D FLAIR, and 3D DIR. Quantitative evaluation is done by comparison with manual segmentations on a cohort of 39 multiple sclerosis early-stage patients.

Mário João Fartaria, Alexis Roche, Reto Meuli, Cristina Granziera, Tobias Kober, Meritxell Bach Cuadra
Classification of Pancreatic Cysts in Computed Tomography Images Using a Random Forest and Convolutional Neural Network Ensemble

There are many different types of pancreatic cysts. These range from completely benign to malignant, and identifying the exact cyst type can be challenging in clinical practice. This work describes an automatic classification algorithm that classifies the four most common types of pancreatic cysts using computed tomography images. The proposed approach utilizes general demographic information about a patient as well as the imaging appearance of the cyst. It is based on a Bayesian combination of a random forest classifier, which learns subclass-specific demographic, intensity, and shape features, and a new convolutional neural network that relies on fine texture information. Quantitative assessment of the proposed method was performed using 10-fold cross-validation on 134 patients and yielded a classification accuracy of 83.6%.

Konstantin Dmitriev, Arie E. Kaufman, Ammar A. Javed, Ralph H. Hruban, Elliot K. Fishman, Anne Marie Lennon, Joel H. Saltz
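
The exact Bayesian combination rule is not spelled out in the abstract; a common choice, shown below as an assumed stand-in, multiplies the two classifiers' class posteriors under a conditional-independence (naive Bayes) assumption and renormalizes.

```python
import numpy as np

def bayes_combine(p_rf, p_cnn, prior=None):
    """Fuse two classifiers' class posteriors assuming conditional
    independence: p(c|x) is proportional to p_rf(c|x) * p_cnn(c|x) / p(c)."""
    p_rf, p_cnn = np.asarray(p_rf, float), np.asarray(p_cnn, float)
    if prior is None:
        prior = np.ones_like(p_rf) / p_rf.shape[-1]  # uniform class prior
    fused = p_rf * p_cnn / prior
    return fused / fused.sum(axis=-1, keepdims=True)

# Toy example with four cyst classes
print(bayes_combine([0.6, 0.2, 0.1, 0.1], [0.3, 0.4, 0.2, 0.1]))
```
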
Classification of Major Depressive Disorder via Multi-site Weighted LASSO Model

Large-scale collaborative analysis of brain imaging data, in psychiatry and neurology, offers a new source of statistical power to discover features that boost accuracy in disease classification, differential diagnosis, and outcome prediction. However, due to data privacy regulations or limited accessibility to large datasets across the world, it is challenging to efficiently integrate distributed information. Here we propose a novel classification framework through multi-site weighted LASSO: each site performs an iterative weighted LASSO for feature selection separately. Within each iteration, the classification result and the selected features are collected to update the weighting parameters for each feature. This new weight is used to guide the LASSO process at the next iteration. Only the features that help to improve the classification accuracy are preserved. In tests on data from five sites (299 patients with major depressive disorder (MDD) and 258 normal controls), our method boosted classification accuracy for MDD by 4.9% on average. This result shows the potential of the proposed new strategy as an effective and practical collaborative platform for machine learning on large scale distributed imaging and biobank data.

Dajiang Zhu, Brandalyn C. Riedel, Neda Jahanshad, Nynke A. Groenewold, Dan J. Stein, Ian H. Gotlib, Matthew D. Sacchet, Danai Dima, James H. Cole, Cynthia H. Y. Fu, Henrik Walter, Ilya M. Veer, Thomas Frodl, Lianne Schmaal, Dick J. Veltman, Paul M. Thompson
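
A sketch of one site's iterative weighted-LASSO step is given below. The abstract does not state the exact weight-update rule, so this stand-in uses the classic adaptive-lasso reweighting, in which a per-feature penalty is emulated by rescaling the feature columns; `weighted_lasso_iteration`, the toy data, and the update loop are all illustrative.

```python
import numpy as np
from sklearn.linear_model import Lasso

def weighted_lasso_iteration(X, y, weights, alpha=0.05):
    """One weighted-LASSO step: a per-feature penalty w_j is emulated by
    rescaling column j by 1/w_j, fitting a plain Lasso, and mapping back."""
    Xw = X / weights                   # columns with small weight cost less
    model = Lasso(alpha=alpha).fit(Xw, y)
    return model.coef_ / weights       # coefficients on the original scale

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))
y = (X[:, :3].sum(1) > 0).astype(float)

weights = np.ones(50)
for it in range(5):
    coef = weighted_lasso_iteration(X, y, weights)
    # In the multi-site setting each site would now share its selection and
    # accuracy, and weights would be updated from the pooled feedback;
    # here we update from a single site for brevity.
    weights = 1.0 / (np.abs(coef) + 1e-3)
    weights /= weights.mean()
print((np.abs(coef) > 1e-6).sum(), "features selected")
```
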
A Multi-atlas Approach to Region of Interest Detection for Medical Image Classification

A common approach to image classification is based on image feature extraction and supervised discriminative learning. For medical image classification problems where discriminative image features are spatially distributed around certain anatomical structures, localizing the region of interest (ROI) essential for the classification task is a key to success. To address this problem, we develop a multi-atlas label fusion technique for automatic ROI detection. Given a set of training images with class labels, our method infers voxel-wise scores for each image showing how discriminative each voxel is for categorizing the image. We apply our method to a 2D cardiac CT body-part classification task and show the effectiveness of the detected ROIs.

Hongzhi Wang, Mehdi Moradi, Yaniv Gur, Prasanth Prasanna, Tanveer Syeda-Mahmood
Spectral Graph Convolutions for Population-Based Disease Prediction

Exploiting the wealth of imaging and non-imaging information for disease prediction tasks requires models capable of representing, at the same time, individual features as well as data associations between subjects from potentially large populations. Graphs provide a natural framework for such tasks, yet previous graph-based approaches focus on pairwise similarities without modelling the subjects’ individual characteristics and features. On the other hand, relying solely on subject-specific imaging feature vectors fails to model the interaction and similarity between subjects, which can reduce performance. In this paper, we introduce the novel concept of Graph Convolutional Networks (GCN) for brain analysis in populations, combining imaging and non-imaging data. We represent populations as a sparse graph whose vertices are associated with image-based feature vectors and whose edges encode phenotypic information. This structure was used to train a GCN model on partially labelled graphs, aiming to infer the classes of unlabelled nodes from the node features and pairwise associations between subjects. We demonstrate the potential of the method on the challenging ADNI and ABIDE databases, as a proof of concept of the benefit of integrating contextual information in classification tasks. This has a clear impact on the quality of the predictions, leading to 69.5% accuracy for ABIDE (outperforming the current state of the art of 66.8%) and 77% for ADNI for prediction of MCI conversion, significantly outperforming standard linear classifiers where only individual features are considered.

Sarah Parisot, Sofia Ira Ktena, Enzo Ferrante, Matthew Lee, Ricardo Guerrero Moreno, Ben Glocker, Daniel Rueckert
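
The paper builds on Chebyshev-polynomial spectral filters; for intuition, the sketch below shows the simpler first-order propagation rule (Kipf-and-Welling style) on a toy population graph, where the adjacency matrix encodes phenotypic similarity and node features are imaging-derived vectors. All sizes are illustrative.

```python
import numpy as np

def gcn_layer(A, X, W):
    """One graph-convolution layer in the first-order form:
    H = ReLU(D^{-1/2} (A + I) D^{-1/2} X W)."""
    A_hat = A + np.eye(A.shape[0])              # add self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(1)))
    return np.maximum(d_inv_sqrt @ A_hat @ d_inv_sqrt @ X @ W, 0.0)

# Toy population graph: 4 subjects, edges from shared phenotype
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X = np.random.default_rng(0).normal(size=(4, 8))   # imaging features
W = np.random.default_rng(1).normal(size=(8, 2))   # learnable weights
print(gcn_layer(A, X, W).shape)                    # (4, 2)
```
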
Predicting Future Disease Activity and Treatment Responders for Multiple Sclerosis Patients Using a Bag-of-Lesions Brain Representation

The growth of lesions and the development of new lesions in MRI are markers of new disease activity in Multiple Sclerosis (MS) patients. Successfully predicting future lesion activity could lead to a better understanding of disease worsening, as well as prediction of treatment efficacy. We introduce the first, fully automatic, probabilistic framework for the prediction of future lesion activity in relapsing-remitting MS patients, based only on baseline multi-modal MRI, and use it to successfully identify responders to two different treatments. We develop a new Bag-of-Lesions (BoL) representation for patient images based on a variety of features extracted from lesions. A probabilistic codebook of lesion types is created by clustering features using Gaussian mixture models. Patients are represented as a probabilistic histogram of lesion-types. A Random Forest classifier is trained to automatically predict future MS activity up to two years ahead based on the patient’s baseline BoL representation. The framework is trained and tested on a large, proprietary, multi-centre, multi-modal clinical trial dataset consisting of 1048 patients. Testing based on 50-fold cross validation shows that our framework compares favourably to several other classifiers. Automated identification of responders in two different treated groups of patients leads to sensitivity of 82% and 84% and specificity of 92% and 94% respectively, showing that this is a very promising approach towards personalized treatment for MS patients.

Andrew Doyle, Doina Precup, Douglas L. Arnold, Tal Arbel
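
The Bag-of-Lesions pipeline maps naturally onto standard tooling: fit a GMM codebook over pooled lesion features, represent each patient as an averaged posterior (a soft histogram) over lesion types, then train a Random Forest. The sketch below uses random stand-in data; the lesion features, codebook size, and labels are all illustrative assumptions.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Hypothetical data: each patient has a variable number of lesions,
# each lesion described by a feature vector.
lesions_per_patient = [rng.normal(size=(rng.integers(3, 12), 6))
                       for _ in range(40)]
labels = rng.integers(0, 2, size=40)   # e.g., responder vs. non-responder

# 1) Probabilistic codebook: GMM over all lesions pooled together
all_lesions = np.vstack(lesions_per_patient)
codebook = GaussianMixture(n_components=8, random_state=0).fit(all_lesions)

# 2) Patient representation: soft histogram of lesion-type posteriors
def bag_of_lesions(lesions):
    return codebook.predict_proba(lesions).sum(0) / len(lesions)

X = np.array([bag_of_lesions(l) for l in lesions_per_patient])

# 3) Random Forest classifier on the BoL histograms
clf = RandomForestClassifier(random_state=0).fit(X, labels)
```
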
Sparse Multi-kernel Based Multi-task Learning for Joint Prediction of Clinical Scores and Biomarker Identification in Alzheimer’s Disease

Machine learning methods have been used to predict clinical scores and identify image biomarkers from individual MRI scans. Recently, multi-task learning (MTL) with sparsity-inducing norms has been widely studied to investigate the predictive power of neuroimaging measures by incorporating inherent correlations among multiple clinical cognitive measures. However, most of the existing MTL algorithms are formulated as linear sparse models, in which the response (e.g., a cognitive score) is a linear function of the predictors (e.g., neuroimaging measures). To exploit the nonlinear relationship between the neuroimaging measures and cognitive measures, we assume that the tasks to be learned share a common subset of features in the kernel space as well as the kernel functions. Specifically, we propose multi-kernel based multi-task learning with a mixed sparsity-inducing norm to better capture the complex relationship between the cognitive scores and the neuroimaging measures. The formulation can be efficiently solved by mirror-descent optimization. Experiments on the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database showed that the proposed algorithm achieves better prediction performance than state-of-the-art linear methods, both on single MRI and on multiple modalities.

Peng Cao, Xiaoli Liu, Jinzhu Yang, Dazhe Zhao, Osmar Zaiane

Machine Learning in Medical Image Computing

Frontmatter
Personalized Diagnosis for Alzheimer’s Disease

Current learning-based methods for the diagnosis of Alzheimer’s Disease (AD) rely on training a general classifier to recognize abnormal structural alterations from a homogeneously distributed dataset derived from a large population. However, due to diverse disease pathology, the real imaging data in routine clinical practice is highly complex and heterogeneous. Hence, prototype methods that perform well in the laboratory may not achieve the expected outcome in real clinical settings. To address this issue, we propose a novel personalized model for AD diagnosis. We customize a subject-specific AD classifier for new testing data by iteratively reweighting the training data to reveal the latent testing data distribution and refining the classifier based on the weighted training data. Furthermore, to improve the estimation of the diagnosis result and clinical scores at the individual level, we extend our personalized AD diagnosis model to a joint classification and regression scenario. Our model shows improved classification and regression accuracy when applied to Magnetic Resonance Imaging (MRI) data selected from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database. Our work pinpoints the clinical potential of a personalized diagnosis framework for AD.

Yingying Zhu, Minjeong Kim, Xiaofeng Zhu, Jin Yan, Daniel Kaufer, Guorong Wu
GP-Unet: Lesion Detection from Weak Labels with a 3D Regression Network

We propose a novel convolutional neural network for lesion detection from weak labels. Only a single, global label per image - the lesion count - is needed for training. We train a regression network with a fully convolutional architecture combined with a global pooling layer to aggregate the 3D output into a scalar indicating the lesion count. When testing on unseen images, we first run the network to estimate the number of lesions. Then we remove the global pooling layer to compute localization maps of the size of the input image. We evaluate the proposed network on the detection of enlarged perivascular spaces in the basal ganglia in MRI. Our method achieves a sensitivity of 62% with on average 1.5 false positives per image. Compared with four other approaches based on intensity thresholding, saliency and class maps, our method has a 20% higher sensitivity.

Florian Dubost, Gerda Bortsova, Hieab Adams, Arfan Ikram, Wiro J. Niessen, Meike Vernooij, Marleen De Bruijne
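
The count-to-localization trick can be captured in a few lines: a fully convolutional trunk ends in a one-channel map, a global pooling layer reduces it to a scalar for training against lesion counts, and skipping the pooling at test time returns the map itself. The PyTorch sketch below is a minimal stand-in; the channel counts, pooling type, and omission of the paper's U-Net-style details are assumptions.

```python
import torch
import torch.nn as nn

class GPRegressor(nn.Module):
    """Sketch: fully convolutional trunk + global pooling head.
    Trained on lesion counts; dropping the pooling yields voxel maps."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv3d(16, 16, 3, padding=1), nn.ReLU(),
            nn.Conv3d(16, 1, 1),            # 1-channel "lesion evidence" map
        )
        self.pool = nn.AdaptiveAvgPool3d(1)

    def forward(self, x, return_map=False):
        m = self.features(x)
        if return_map:                       # localization map at test time
            return m
        return self.pool(m).flatten(1)       # scalar per image ~ lesion count

net = GPRegressor()
vol = torch.randn(2, 1, 32, 32, 32)
count = net(vol)                             # training target: lesion count
heatmap = net(vol, return_map=True)          # same weights, no pooling
```
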
Deep Supervision for Pancreatic Cyst Segmentation in Abdominal CT Scans

Automatic segmentation of an organ and its cystic region is a prerequisite of computer-aided diagnosis. In this paper, we focus on pancreatic cyst segmentation in abdominal CT scans. This task is important and very useful in clinical practice, yet challenging due to the low boundary contrast and the variability in location, shape, and stage of the pancreatic cancer. Inspired by the high relevance between the location of a pancreas and its cystic region, we introduce extra deep supervision into the segmentation network, so that cyst segmentation can be improved with the help of the relatively easier pancreas segmentation. Under a reasonable transformation function, our approach can be factorized into two stages, and each stage can be efficiently optimized via gradient back-propagation throughout the deep networks. We collect a new dataset with 131 pathological samples, which, to the best of our knowledge, is the largest set for pancreatic cyst segmentation. Without human assistance, our approach reports a 63.44% average accuracy, measured by the Dice-Sørensen coefficient (DSC), which is higher than the 60.46% achieved without deep supervision.

Yuyin Zhou, Lingxi Xie, Elliot K. Fishman, Alan L. Yuille
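
For reference, the Dice-Sørensen coefficient used as the evaluation measure is $\mathrm{DSC}(A, B) = 2|A \cap B| / (|A| + |B|)$; a small numpy implementation with a worked check:

```python
import numpy as np

def dice(pred, target, eps=1e-7):
    """Dice-Sorensen coefficient between two binary masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return 2.0 * inter / (pred.sum() + target.sum() + eps)

a = np.zeros((8, 8), bool); a[2:6, 2:6] = True
b = np.zeros((8, 8), bool); b[3:7, 3:7] = True
print(dice(a, b))   # 0.5625: 9 overlapping pixels, 16 + 16 in total
```
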
Error Corrective Boosting for Learning Fully Convolutional Networks with Limited Data

Training deep fully convolutional neural networks (F-CNNs) for semantic image segmentation requires access to abundant labeled data. While large datasets of unlabeled image data are available in medical applications, access to manually labeled data is very limited. We propose to automatically create auxiliary labels on initially unlabeled data with existing tools and to use them for pre-training. For the subsequent fine-tuning of the network with manually labeled data, we introduce error corrective boosting (ECB), which emphasizes parameter updates on classes with lower accuracy. Furthermore, we introduce SkipDeconv-Net (SD-Net), a new F-CNN architecture for brain segmentation that combines skip connections with the unpooling strategy for upsampling. The SD-Net addresses challenges of severe class imbalance and errors along boundaries. With application to whole-brain MRI T1 scan segmentation, we generate auxiliary labels on a large dataset with FreeSurfer and fine-tune on two datasets with manual annotations. Our results show that the inclusion of auxiliary labels and ECB yields significant improvements. SD-Net segments a 3D scan in 7 s in comparison to 30 h for the closest multi-atlas segmentation method, while reaching similar performance. It also outperforms the latest state-of-the-art F-CNN models.

Abhijit Guha Roy, Sailesh Conjeti, Debdoot Sheet, Amin Katouzian, Nassir Navab, Christian Wachinger
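
The abstract does not give the exact error-corrective-boosting formula, so the sketch below is only a plausible stand-in: class weights for the fine-tuning loss grow with each class's validation-accuracy shortfall, matching the stated intent of emphasizing updates on classes with lower accuracy.

```python
import numpy as np

def ecb_class_weights(per_class_acc, t=1.0):
    """Hypothetical error-corrective weighting: classes with lower
    validation accuracy receive larger loss weights.
    (The exact formula in the paper may differ.)"""
    acc = np.asarray(per_class_acc, dtype=float)
    shortfall = acc.max() - acc                  # 0 for the best class
    w = 1.0 + t * shortfall / max(shortfall.max(), 1e-7)
    return w / w.mean()                          # keep the average loss scale

print(ecb_class_weights([0.95, 0.80, 0.60, 0.92]))
```

Weights of this kind would then feed a class-weighted loss during fine-tuning, e.g. `torch.nn.CrossEntropyLoss(weight=...)`.
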
Direct Detection of Pixel-Level Myocardial Infarction Areas via a Deep-Learning Algorithm

Accurate detection of the myocardial infarction (MI) area is crucial for early diagnosis planning and follow-up management. In this study, we propose an end-to-end deep-learning algorithm framework (OF-RNN) to accurately detect the MI area at the pixel level. Our OF-RNN consists of three different function layers: the heart localization layers, which can accurately and automatically crop the region-of-interest (ROI) sequences, including the left ventricle, using the whole cardiac magnetic resonance image sequences; the motion statistical layers, which are used to build a time-series architecture to capture two types of motion features (at the pixel-level) by integrating the local motion features generated by long short-term memory-recurrent neural networks and the global motion features generated by deep optical flows from the whole ROI sequence, which can effectively characterize myocardial physiologic function; and the fully connected discriminate layers, which use stacked auto-encoders to further learn these features, and they use a softmax classifier to build the correspondences from the motion features to the tissue identities (infarction or not) for each pixel. Through the seamless connection of each layer, our OF-RNN can obtain the area, position, and shape of the MI for each patient. Our proposed framework yielded an overall classification accuracy of 94.35% at the pixel level, from 114 clinical subjects. These results indicate the potential of our proposed method in aiding standardized MI assessments.

Chenchu Xu, Lei Xu, Zhifan Gao, Shen Zhao, Heye Zhang, Yanping Zhang, Xiuquan Du, Shu Zhao, Dhanjoo Ghista, Shuo Li
Skin Disease Recognition Using Deep Saliency Features and Multimodal Learning of Dermoscopy and Clinical Images

Skin cancer is the most common cancer worldwide; among its forms, melanoma, the most fatal, accounts for more than 10,000 deaths annually in Australia and the United States. The 5-year survival rate for melanoma exceeds 90% if it is detected at an early stage. However, the intrinsic visual similarity across various skin conditions makes the diagnosis challenging both for clinicians and for automated classification methods. Many automated skin cancer diagnostic systems have been proposed in the literature, all of which consider solely dermoscopy images in their analysis. In reality, however, clinicians consider two modalities of imaging: an initial screening using clinical photography to capture a macro view of the mole, followed by dermoscopy imaging, which visualizes morphological structures within the skin lesion. Evidence shows that these two modalities provide complementary visual features that can empower the decision-making process. In this work, we propose a novel deep convolutional neural network (DCNN) architecture along with a saliency feature descriptor to capture discriminative features of the two modalities for skin lesion classification. The proposed DCNN accepts a pair of images, the clinical and dermoscopic views of a single lesion, and is capable of learning single-modality and cross-modality representations simultaneously. Using one of the largest collected skin lesion datasets, we demonstrate that the proposed multi-modality method significantly outperforms single-modality methods on three tasks: differentiation between 15 skin diseases, distinguishing cancerous moles (3 cancer types including melanoma) from non-cancerous ones, and detecting melanoma among benign cases.

Zongyuan Ge, Sergey Demyanov, Rajib Chakravorty, Adrian Bowling, Rahil Garnavi
Boundary Regularized Convolutional Neural Network for Layer Parsing of Breast Anatomy in Automated Whole Breast Ultrasound

A boundary regularized deep convolutional encoder-decoder network (ConvEDNet) is developed in this study to address the difficult problem of anatomical layer parsing in noisy Automated Whole Breast Ultrasound (AWBUS) images. To achieve better network initialization, a two-stage adaptive domain transfer (2DT) is employed to land the VGG-16 encoder on the AWBUS domain, using the training of an AWBUS edge detector as a bridge. The knowledge-transferred encoder is denoted VGG-USEdge. To further augment the training of ConvEDNet, a deep boundary supervision (DBS) strategy is introduced to regularize the feature learning for better robustness to speckle noise and shadowing effects. We argue that relying solely on the image context cue, which can be learnt with the guidance of label maps, may not be sufficient to deal with the intrinsically noisy nature of ultrasound images. With the regularization of the boundary cue, the segmentation learning can be boosted. The efficacy of the proposed 2DT-DBS ConvEDNet is corroborated by extensive comparison to state-of-the-art deep learning segmentation methods. The segmentation results may assist clinical image reading, particularly for junior medical doctors and residents, and help to reduce false-positive findings from computer-aided detection schemes.

Cheng Bian, Ran Lee, Yi-Hong Chou, Jie-Zhi Cheng
Zoom-in-Net: Deep Mining Lesions for Diabetic Retinopathy Detection

We propose a convolutional neural network based algorithm for simultaneously diagnosing diabetic retinopathy and highlighting suspicious regions. Our contributions are twofold: (1) a network termed Zoom-in-Net, which mimics the zoom-in process a clinician uses to examine retinal images. Trained with only image-level supervision, Zoom-in-Net generates attention maps that highlight suspicious regions, and predicts the disease level accurately based on both the whole image and its high-resolution suspicious patches. (2) Only four bounding boxes generated from the automatically learned attention maps are enough to cover 80% of the lesions labeled by an experienced ophthalmologist, which shows the good localization ability of the attention maps. By clustering features at high-response locations on the attention maps, we discover meaningful clusters which contain potential lesions in diabetic retinopathy. Experiments show that our algorithm outperforms state-of-the-art methods on two datasets, EyePACS and Messidor.

Zhe Wang, Yanxin Yin, Jianping Shi, Wei Fang, Hongsheng Li, Xiaogang Wang
Full Quantification of Left Ventricle via Deep Multitask Learning Network Respecting Intra- and Inter-Task Relatedness

Cardiac left ventricle (LV) quantification is among the most clinically important tasks for the identification and diagnosis of cardiac disease, yet it remains challenging due to the high variability of cardiac structure and the complexity of temporal dynamics. Full quantification, i.e., simultaneously quantifying all LV indices, including two areas (cavity and myocardium), six regional wall thicknesses (RWT), three LV dimensions, and one cardiac phase, is even more challenging, since the uncertain relatedness within and between the different types of indices may hinder the learning procedure from better convergence and generalization. In this paper, we propose a newly-designed multitask learning network (FullLVNet), which is constituted by a deep convolutional neural network (CNN) for expressive feature embedding of cardiac structure, two parallel recurrent neural network (RNN) modules for temporal dynamic modeling, and four linear models for the final estimation. During the final estimation, both intra- and inter-task relatedness are modeled to improve generalization: (1) respecting intra-task relatedness, group lasso is applied to each of the regression tasks for sparse and common feature selection and consistent prediction; (2) respecting inter-task relatedness, three phase-guided constraints are proposed to penalize violation of the temporal behavior of the obtained LV indices. Experiments on MR sequences of 145 subjects show that FullLVNet achieves highly accurate predictions, with an MAE of 190 mm$^2$ for areas, 1.41 mm for RWT, 2.68 mm for dimensions, and an error rate of 10.4% for phase classification. This endows our method with great potential for comprehensive clinical assessment of global, regional, and dynamic cardiac function.

Wufeng Xue, Andrea Lum, Ashley Mercado, Mark Landis, James Warrington, Shuo Li
Scalable Multimodal Convolutional Networks for Brain Tumour Segmentation

Brain tumour segmentation plays a key role in computer-assisted surgery. Deep neural networks have increased the accuracy of automatic segmentation significantly; however, these models tend to generalise poorly to imaging modalities other than those for which they were designed, thereby limiting their applications. For example, a network architecture initially designed for brain parcellation of monomodal T1 MRI cannot be easily translated into an efficient tumour segmentation network that jointly utilises T1, T1c, FLAIR and T2 MRI. To tackle this, we propose a novel scalable multimodal deep learning architecture using new nested structures that explicitly leverage deep features within or across modalities. This aims at making the early layers of the architecture structured and sparse so that the final architecture becomes scalable to the number of modalities. We evaluate the scalable architecture for brain tumour segmentation and give evidence of its regularisation effect compared to the conventional concatenation approach.

Lucas Fidon, Wenqi Li, Luis C. Garcia-Peraza-Herrera, Jinendra Ekanayake, Neil Kitchen, Sebastien Ourselin, Tom Vercauteren
Pathological OCT Retinal Layer Segmentation Using Branch Residual U-Shape Networks

The automatic segmentation of retinal layer structures enables clinically-relevant quantification and monitoring of eye disorders over time in OCT imaging. Eyes with late-stage diseases are particularly challenging to segment, as their shape is highly warped due to pathological biomarkers. In this context, we propose a novel fully-Convolutional Neural Network (CNN) architecture which combines dilated residual blocks in an asymmetric U-shape configuration, and can segment multiple layers of highly pathological eyes in one shot. We validate our approach on a dataset of late-stage AMD patients and demonstrate lower computational costs and higher performance compared to other state-of-the-art methods.

Stefanos Apostolopoulos, Sandro De Zanet, Carlos Ciller, Sebastian Wolf, Raphael Sznitman
Quality Assessment of Echocardiographic Cine Using Recurrent Neural Networks: Feasibility on Five Standard View Planes

Echocardiography (echo) is a clinical imaging technique which is highly dependent on operator experience. We aim to reduce operator variability in data acquisition by automatically computing an echo quality score for real-time feedback. We achieve this with a deep neural network model, with convolutional layers to extract hierarchical features from the input echo cine and recurrent layers to leverage the sequential information in the echo cine loop. Using data from 509 separate patient studies, containing 2,450 echo cines across five standard echo imaging planes, we achieved a mean quality score accuracy of 85% compared to the gold-standard score assigned by experienced echosonographers. The proposed approach calculates the quality of a given 20-frame echo sequence within 10 ms, sufficient for real-time deployment.

Amir H. Abdi, Christina Luong, Teresa Tsang, John Jue, Ken Gin, Darwin Yeung, Dale Hawley, Robert Rohling, Purang Abolmaesumi
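
The convolutional-plus-recurrent design can be sketched compactly: per-frame CNN features feed an LSTM over the cine loop, and the final hidden state is mapped to a quality score. The PyTorch model below is an illustrative miniature; the layer sizes, sigmoid output, and input resolution are assumptions rather than the paper's architecture.

```python
import torch
import torch.nn as nn

class EchoQualityNet(nn.Module):
    """Sketch: per-frame conv features -> LSTM over the cine ->
    quality score in [0, 1]. Sizes are illustrative."""
    def __init__(self, feat=64):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, feat),
        )
        self.rnn = nn.LSTM(feat, 32, batch_first=True)
        self.head = nn.Linear(32, 1)

    def forward(self, cine):                     # cine: (B, T, 1, H, W)
        B, T = cine.shape[:2]
        f = self.cnn(cine.flatten(0, 1)).view(B, T, -1)  # per-frame features
        _, (h, _) = self.rnn(f)                  # last hidden state
        return torch.sigmoid(self.head(h[-1]))  # quality score per cine

score = EchoQualityNet()(torch.randn(2, 20, 1, 64, 64))  # 20-frame cines
```
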
Semi-supervised Deep Learning for Fully Convolutional Networks

Deep learning usually requires large amounts of labeled training data, but annotating data is costly and tedious. The framework of semi-supervised learning provides the means to use both labeled data and arbitrary amounts of unlabeled data for training. Recently, semi-supervised deep learning has been intensively studied for standard CNN architectures. However, Fully Convolutional Networks (FCNs) set the state-of-the-art for many image segmentation tasks. To the best of our knowledge, there is no existing semi-supervised learning method for such FCNs yet. We lift the concept of auxiliary manifold embedding for semi-supervised learning to FCNs with the help of Random Feature Embedding. In our experiments on the challenging task of MS Lesion Segmentation, we leverage the proposed framework for the purpose of domain adaptation and report substantial improvements over the baseline model.

Christoph Baur, Shadi Albarqouni, Nassir Navab
TandemNet: Distilling Knowledge from Medical Images Using Diagnostic Reports as Optional Semantic References

In this paper, we exploit the semantic knowledge of medical images from their diagnostic reports to provide inspirational network training and an interpretable prediction mechanism with our proposed novel multimodal neural network, TandemNet. Inside TandemNet, a language model is used to represent the report text, which cooperates with the image model in a tandem scheme. We propose a novel dual-attention model that facilitates high-level interactions between visual and semantic information and effectively distills useful features for prediction. In the testing stage, TandemNet can make accurate image predictions with an optional report text input. It also interprets its predictions by producing attention on informative image and text feature pieces, and further by generating diagnostic report paragraphs. On a dataset of pathological bladder cancer images and their diagnostic reports (BCIDR), extensive experiments demonstrate that our method effectively learns and integrates knowledge from multiple modalities and obtains significantly improved performance over baselines.

Zizhao Zhang, Pingjun Chen, Manish Sapkota, Lin Yang
BRIEFnet: Deep Pancreas Segmentation Using Binary Sparse Convolutions

Dense prediction using deep convolutional neural networks (CNNs) has recently advanced the field of segmentation in computer vision and medical imaging. In contrast to patch-based classification, it requires only a single pass through a deep network to segment every voxel in an image. However, it is difficult to incorporate contextual information without using contracting (pooling) layers, which would reduce the spatial accuracy for thinner structures. Consequently, huge receptive fields are required, which might lead to disproportionate computational demand. Here, we propose to use binary sparse convolutions in the first layer as a particularly effective approach to reduce complexity while achieving high accuracy. The concept is inspired by the successful BRIEF descriptors and is complemented with $1\times1$ convolutions (cf. network in network) to further reduce the number of trainable parameters. Sparsity is particularly important for the small datasets often found in medical imaging. Our experimental validation demonstrates accuracies for pancreas segmentation in CT that are comparable with state-of-the-art deep learning approaches and with registration-based multi-atlas segmentation with label fusion. The whole network, which also includes a classic CNN path to improve local details, can be trained in 10 min. Segmenting a new scan takes 3 s even without using a GPU.

Mattias P. Heinrich, Ozan Oktay
Supervised Action Classifier: Approaching Landmark Detection as Image Partitioning

In medical imaging, landmarks have significant clinical and scientific importance. Clinical measurements derived from landmarks are used for diagnosis, therapy planning, and interventional guidance in many cases. Automatic algorithms have been studied to reduce the need for manual placement of landmarks. Traditional machine learning techniques provide reasonable results; however, they are limited in either robustness or precision, given the complexity and variability of medical images. Recently, deep learning technologies have emerged to tackle these problems. Among them, a deep reinforcement learning (DRL) approach has been shown to successfully detect landmark locations by implicitly learning the optimized path from a starting location; however, its learning process can only include subsets of the almost infinite paths across the image context, and may lead to major failures if not trained with adequate dataset variations. Here, we propose a new landmark detection approach inspired by DRL. Instead of learning limited action paths in an image in a greedy manner, we construct a global action map across the whole image, which divides the image into four action regions (left, right, up and bottom) depending on the relative location towards the target landmark. The action map guides how to move to reach the target landmark from any location in the input image. This effectively translates the landmark detection problem into an image partition problem, which enables us to leverage a deep image-to-image network to train a supervised action classifier for detection of the landmarks. We discuss experimental results obtained with the proposed algorithm on two ultrasound datasets (cardiac and obstetric), where it shows consistent improvement over traditional machine learning based and deep learning based methods.

Zhoubing Xu, Qiangui Huang, JinHyeong Park, Mingqing Chen, Daguang Xu, Dong Yang, David Liu, S. Kevin Zhou
Robust Multi-modal MR Image Synthesis

We present a multi-input encoder-decoder neural network model able to perform MR image synthesis from any subset of its inputs, outperforming prior methods in both single and multi-input settings. This is achieved by encouraging the network to learn a modality invariant latent embedding during training. We demonstrate that a spatial transformer module [7] can be included in our model to automatically correct misalignment in the input data. Thus, our model is robust both to missing and misaligned data at test time. Finally, we show that the model’s modular nature allows transfer learning to different datasets.

Thomas Joyce, Agisilaos Chartsias, Sotirios A. Tsaftaris
Segmentation of Intracranial Arterial Calcification with Deeply Supervised Residual Dropout Networks

Intracranial carotid artery calcification (ICAC) is a major risk factor for stroke, and might contribute to dementia and cognitive decline. Reliance on time-consuming manual annotation of ICAC hampers much demanded further research into the relationship between ICAC and neurological diseases. Automation of ICAC segmentation is therefore highly desirable, but difficult due to the proximity of the lesions to bony structures with a similar attenuation coefficient. In this paper, we propose a method for automatic segmentation of ICAC; the first to our knowledge. Our method is based on a 3D fully convolutional neural network that we extend with two regularization techniques. Firstly, we use deep supervision to encourage discriminative features in the hidden layers. Secondly, we augment the network with skip connections, as in the recently developed ResNet, and dropout layers, inserted in a way that skip connections circumvent them. We investigate the effect of skip connections and dropout. In addition, we propose a simple problem-specific modification of the network objective function that restricts the focus to the most important image regions and simplifies the optimization. We train and validate our model using 882 CT scans and test on 1,000. Our regularization techniques and objective improve the average Dice score by 7.1%, yielding an average Dice of 76.2% and 97.7% correlation between predicted ICAC volumes and manual annotations.

Gerda Bortsova, Gijs van Tulder, Florian Dubost, Tingying Peng, Nassir Navab, Aad van der Lugt, Daniel Bos, Marleen De Bruijne
Clinical Target-Volume Delineation in Prostate Brachytherapy Using Residual Neural Networks

Low dose-rate prostate brachytherapy is commonly used to treat early stage prostate cancer. This intervention involves implanting radioactive seeds inside a volume containing the prostate. Planning the intervention requires obtaining a series of ultrasound images from the prostate. This is followed by delineation of a clinical target volume, which mostly traces the prostate boundary in the ultrasound data, but can be modified based on institution-specific clinical guidelines. Here, we aim to automate the delineation of clinical target volume by using a new deep learning network based on residual neural nets and dilated convolution at deeper layers. In addition, we propose to include an exponential weight map in the optimization to improve local prediction. We train the network on 4,284 expert-labeled transrectal ultrasound images and test it on an independent set of 1,081 ultrasound images. With respect to the gold-standard delineation, we achieve a mean Dice similarity coefficient of 94%, a mean surface distance error of 1.05 mm and a mean Hausdorff distance error of 3.0 mm. The obtained results are statistically significantly better than two previous state-of-the-art techniques.

Emran Mohammad Abu Anas, Saman Nouranian, S. Sara Mahdavi, Ingrid Spadinger, William J. Morris, Septimiu E. Salcudean, Parvin Mousavi, Purang Abolmaesumi
Using Convolutional Neural Networks to Automatically Detect Eye-Blink Artifacts in Magnetoencephalography Without Resorting to Electrooculography

Magnetoencephalography (MEG) is a functional neuroimaging tool that records the magnetic fields induced by neuronal activity; however, signal from muscle activity often corrupts the data. Eye-blinks are one of the most common types of muscle artifact. They can be recorded by affixing electrodes near the eyes, as in electrooculography (EOG); however, this complicates patient preparation and decreases comfort. Moreover, it can induce further muscular artifacts from facial twitching. We propose an EOG-free, data-driven approach. We begin with Independent Component Analysis (ICA), a well-known preprocessing approach that factors the observed signal into statistically independent components. When applied to MEG, ICA can help separate neuronal components from non-neuronal ones; however, the components are randomly ordered. Thus, we develop a method to assign one of two labels, non-eye-blink or eye-blink, to each component. Our contributions are two-fold. First, we develop a 10-layer Convolutional Neural Network (CNN), which directly labels eye-blink artifacts. Second, we visualize the learned spatial features using attention mapping, to reveal what the network has learned and bolster confidence in the method's ability to generalize to unseen data. We acquired 8-min, eyes-open, resting-state MEG from 44 subjects. We trained our method on the spatial maps from ICA of 14 randomly selected subjects with expertly labeled ground truth. We then tested on the remaining 30 subjects. Our approach achieves a test classification accuracy of 99.67%, sensitivity of 97.62%, specificity of 99.77%, and ROC AUC of 98.69%. We also show that the learned spatial features correspond to those human experts typically use, which corroborates our model's validity. This work (1) facilitates the creation of fully automated MEG processing pipelines that need to remove motion artifacts related to eye blinks, and (2) potentially obviates the use of additional EOG electrodes for the recording of eye-blinks in MEG studies.

Prabhat Garg, Elizabeth Davenport, Gowtham Murugesan, Ben Wagner, Christopher Whitlow, Joseph Maldjian, Albert Montillo
Image Super Resolution Using Generative Adversarial Networks and Local Saliency Maps for Retinal Image Analysis

We propose an image super resolution (ISR) method using generative adversarial networks (GANs) that takes a low resolution input fundus image and generates a high resolution super-resolved (SR) image up to a scaling factor of 16. This facilitates more accurate automated image analysis, especially for small or blurred landmarks and pathologies. Local saliency maps, which define each pixel's importance, are used to define a novel saliency loss in the GAN cost function. Experimental results show the resulting SR images have perceptual quality very close to the original images and perform better than competing methods that do not weigh pixels according to their importance. When used for retinal vasculature segmentation, our SR images result in accuracy levels close to those obtained when using the original images.
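
A minimal sketch of what a saliency-weighted loss term of this kind could look like, assuming PyTorch tensors; the L1 form and the weighting factor `lam` are illustrative assumptions, not the paper's exact cost function.

```python
import torch

def saliency_weighted_loss(sr, hr, saliency, lam=1.0):
    # saliency: per-pixel importance map, broadcastable to hr, values in [0, 1]
    # weights each pixel's reconstruction error by its importance
    return lam * (saliency * (sr - hr).abs()).mean()

sr, hr = torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64)
sal = torch.rand(1, 1, 64, 64)  # e.g., higher near vessels and pathologies
print(saliency_weighted_loss(sr, hr, sal))
```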

Dwarikanath Mahapatra, Behzad Bozorgtabar, Sajini Hewavitharanage, Rahil Garnavi
Synergistic Combination of Learned and Hand-Crafted Features for Prostate Lesion Classification in Multiparametric Magnetic Resonance Imaging

In this paper, we propose and evaluate a new method for classifying between malignant and benign prostate cancer lesions in multiparametric magnetic resonance imaging (MRI). We show that synergistically combining automatically-learned and handcrafted features can significantly improve the classification performance. Our method utilizes features extracted from convolutional neural networks (CNNs), texture features learned via a discriminative sparsity-regularized approach, and hand-crafted statistical features. To assess the efficacy of different feature sets, we use AdaBoost with decision trees to classify prostate cancer lesions using different sets of features. CNN-derived, texture, and statistical features achieved area under the receiver operating characteristic curve (AUC) of 0.75, 0.68, and 0.70, respectively. Augmenting CNN features with texture and statistical features increased the AUC to 0.84 and 0.82, respectively. Combining all three feature types led to an AUC of 0.87. Our results indicate that in medical applications where training data is scarce, the classification performance achieved by CNNs or sparsity-regularized classification methods alone can be sub-optimal. Alternatively, one can treat these methods as implicit feature extraction mechanisms and combine their learned features with hand-crafted features using meta-classifiers to obtain superior classification performance.
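
The meta-classifier stage can be pictured with the scikit-learn snippet below, using synthetic stand-in features; `estimator=` assumes scikit-learn ≥ 1.2 (older releases use `base_estimator=`).

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
cnn_f, tex_f, stat_f = (rng.normal(size=(200, d)) for d in (64, 32, 16))
y = rng.integers(0, 2, size=200)           # stand-in lesion labels

X = np.hstack([cnn_f, tex_f, stat_f])      # combine all three feature types
clf = AdaBoostClassifier(estimator=DecisionTreeClassifier(max_depth=2),
                         n_estimators=100).fit(X, y)
print(clf.score(X, y))
```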

Davood Karimi, Dan Ruan
Suggestive Annotation: A Deep Active Learning Framework for Biomedical Image Segmentation

Image segmentation is a fundamental problem in biomedical image analysis. Recent advances in deep learning have achieved promising results on many biomedical image segmentation benchmarks. However, due to large variations in biomedical images (different modalities, image settings, objects, noise, etc.), utilizing deep learning on a new application usually requires a new set of training data. This can incur a great deal of annotation effort and cost, because only biomedical experts can annotate effectively, and often there are too many instances in images (e.g., cells) to annotate. In this paper, we aim to address the following question: with limited effort (e.g., time) for annotation, what instances should be annotated in order to attain the best performance? We present a deep active learning framework that combines a fully convolutional network (FCN) and active learning to significantly reduce annotation effort by making judicious suggestions on the most effective annotation areas. We utilize uncertainty and similarity information provided by the FCN and formulate a generalized version of the maximum set cover problem to determine the most representative and uncertain areas for annotation. Extensive experiments using the 2015 MICCAI Gland Challenge dataset and a lymph node ultrasound image segmentation dataset show that, using annotation suggestions by our method, state-of-the-art segmentation performance can be achieved with only 50% of the training data.
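
A toy sketch of the suggestion step, assuming per-area uncertainty scores and a pairwise similarity matrix have already been computed from the FCN; the greedy loop is a standard surrogate for maximum set cover, not necessarily the authors' exact procedure.

```python
import numpy as np

def suggest_annotations(uncertainty, similarity, k):
    """Pick k areas: restrict to the most uncertain candidates, then greedily
    maximize how well the chosen set 'covers' (is similar to) all data."""
    pool = list(np.argsort(-uncertainty)[: 4 * k])   # uncertainty filter
    covered = np.zeros(similarity.shape[0])
    chosen = []
    for _ in range(k):
        gains = [np.maximum(covered, similarity[c]).sum() for c in pool]
        best = pool.pop(int(np.argmax(gains)))
        chosen.append(best)
        covered = np.maximum(covered, similarity[best])
    return chosen

rng = np.random.default_rng(0)
sim, unc = rng.random((50, 50)), rng.random(50)
print(suggest_annotations(unc, sim, k=5))
```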

Lin Yang, Yizhe Zhang, Jianxu Chen, Siyuan Zhang, Danny Z. Chen
Deep Adversarial Networks for Biomedical Image Segmentation Utilizing Unannotated Images

Semantic segmentation is a fundamental problem in biomedical image analysis. In biomedical practice, it is often the case that only limited annotated data are available for model training. Unannotated images, on the other hand, are easier to acquire. How to utilize unannotated images for training effective segmentation models is an important issue. In this paper, we propose a new deep adversarial network (DAN) model for biomedical image segmentation, aiming to attain consistently good segmentation results on both annotated and unannotated images. Our model consists of two networks: (1) a segmentation network (SN) to conduct segmentation; (2) an evaluation network (EN) to assess segmentation quality. During training, EN is encouraged to distinguish between segmentation results of unannotated images and annotated ones (by giving them different scores), while SN is encouraged to produce segmentation results of unannotated images such that EN cannot distinguish these from the annotated ones. Through an iterative adversarial training process, because EN is constantly “criticizing” the segmentation results of unannotated images, SN can be trained to produce more and more accurate segmentation for unannotated and unseen samples. Experiments show that our proposed DAN model is effective in utilizing unannotated image data to obtain considerably better segmentation.
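
The alternating objective can be sketched as below (PyTorch); the interfaces `sn(x)` returning per-pixel foreground probabilities and `en(x, seg)` returning a scalar quality score are assumptions made for illustration.

```python
import torch
import torch.nn.functional as F

def en_loss(sn, en, x_ann, y_ann, x_un):
    real = en(x_ann, y_ann)                        # annotated pairs scored as "good"
    fake = en(x_un, sn(x_un).detach())             # unannotated results scored as "bad"
    return F.binary_cross_entropy(real, torch.ones_like(real)) + \
           F.binary_cross_entropy(fake, torch.zeros_like(fake))

def sn_loss(sn, en, x_ann, y_ann, x_un):
    sup = F.binary_cross_entropy(sn(x_ann), y_ann) # supervised term
    adv = en(x_un, sn(x_un))                       # make EN unable to tell the difference
    return sup + F.binary_cross_entropy(adv, torch.ones_like(adv))
```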

Yizhe Zhang, Lin Yang, Jianxu Chen, Maridel Fredericksen, David P. Hughes, Danny Z. Chen
Medical Image Synthesis with Context-Aware Generative Adversarial Networks

Computed tomography (CT) is critical for various clinical applications, e.g., radiation treatment planning and PET attenuation correction in MRI/PET scanners. However, CT exposes patients to radiation during acquisition, which may cause side effects. Compared to CT, magnetic resonance imaging (MRI) is much safer and does not involve radiation. Therefore, researchers have recently been motivated to estimate a CT image from the corresponding MR image of the same subject for radiation treatment planning. In this paper, we propose a data-driven approach to address this challenging problem. Specifically, we train a fully convolutional network (FCN) to generate CT given the MR image. To better model the nonlinear mapping from MRI to CT and produce more realistic images, we propose to use an adversarial training strategy to train the FCN. Moreover, we propose an image-gradient-difference based loss function to alleviate the blurriness of the generated CT. We further apply the Auto-Context Model (ACM) to implement a context-aware generative adversarial network. Experimental results show that our method is accurate and robust for predicting CT images from MR images, and also outperforms three state-of-the-art methods under comparison.
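
A minimal sketch of an image-gradient-difference loss of the kind described, assuming PyTorch; the paper's exact formulation may differ.

```python
import torch

def gradient_difference_loss(pred, target):
    """Penalizes mismatch between spatial gradients of the synthesized and the
    real CT, counteracting the blurriness of plain voxel-wise losses."""
    loss = 0.0
    for dim in (-2, -1):                 # image rows and columns
        dp = pred.diff(dim=dim).abs()
        dt = target.diff(dim=dim).abs()
        loss = loss + ((dp - dt) ** 2).mean()
    return loss

print(gradient_difference_loss(torch.rand(1, 1, 64, 64), torch.rand(1, 1, 64, 64)))
```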

Dong Nie, Roger Trullo, Jun Lian, Caroline Petitjean, Su Ruan, Qian Wang, Dinggang Shen
Joint Detection and Diagnosis of Prostate Cancer in Multi-parametric MRI Based on Multimodal Convolutional Neural Networks

This paper presents an automated method for jointly localizing prostate cancer (PCa) in multi-parametric MRI (mp-MRI) images and assessing the aggressiveness of detected lesions. Our method employs multimodal multi-label convolutional neural networks (CNNs), which are trained in a weakly-supervised manner by providing a set of prostate images with image-level labels and no priors on lesion locations. By distinguishing images with different labels, discriminative visual patterns related to indolent PCa and clinically significant (CS) PCa are automatically learned from clutters of prostate tissues. Cancer response maps (CRMs), with each pixel indicating the likelihood of being part of an indolent/CS lesion, are explicitly generated at the last convolutional layer. We define a new back-propagated error for the CNN to enforce both optimized classification results and consistent CRMs across modalities. Our method enables the feature learning processes of different modalities to mutually influence each other and, in turn, yield more representative features. Comprehensive evaluation on 402 lesions demonstrates the superior performance of our method over the state-of-the-art method [13].

Xin Yang, Zhiwei Wang, Chaoyue Liu, Hung Minh Le, Jingyu Chen, Kwang-Ting (Tim) Cheng, Liang Wang
SD-Layer: Stain Deconvolutional Layer for CNNs in Medical Microscopic Imaging

Convolutional Neural Networks (CNNs) are typically trained in the RGB color space. However, in medical imaging, we believe that pixel stain quantities offer a fundamental view of the interaction between tissues and stain chemicals. Since the optical density (OD) colorspace allows pixel stain quantities to be computed from pixel RGB intensities using the Beer-Lambert law, we propose a stain deconvolutional layer, named SD-Layer, affixed to the front of a CNN, that performs two functions: (1) it transforms the input RGB microscopic images to optical density (OD) space, and (2) it deconvolves the OD image with a stain basis learned through backpropagation and provides tissue-specific stain absorption quantities as input to the following CNN layers. With the introduction of only nine additional learnable parameters in the proposed SD-Layer, we obtain considerably improved performance on two standard CNN architectures: AlexNet and T-CNN. Using the T-CNN architecture prefixed with the proposed SD-Layer, we obtain a 5-fold cross-validation accuracy of 93.2% on the problem of differentiating malignant immature white blood cells (WBCs) from normal immature WBCs for cancer detection.
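
A sketch of the two functions, assuming PyTorch; initializing the 3x3 stain basis with the identity is an illustrative assumption (any standard stain matrix could seed it).

```python
import torch
import torch.nn as nn

class SDLayer(nn.Module):
    def __init__(self):
        super().__init__()
        # the nine learnable parameters: a 3x3 stain basis refined by backprop
        self.stain_basis = nn.Parameter(torch.eye(3))

    def forward(self, rgb):                            # rgb in (0, 1], shape (B, 3, H, W)
        od = -torch.log(rgb.clamp(min=1e-6))           # Beer-Lambert: RGB -> optical density
        inv = torch.linalg.inv(self.stain_basis)       # deconvolve with the stain basis
        return torch.einsum('sc,bchw->bshw', inv, od)  # per-pixel stain quantities

x = torch.rand(2, 3, 32, 32).clamp(min=0.05)
print(SDLayer()(x).shape)                              # torch.Size([2, 3, 32, 32])
```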

Rahul Duggal, Anubha Gupta, Ritu Gupta, Pramit Mallick
X-Ray In-Depth Decomposition: Revealing the Latent Structures

X-ray is the most readily available imaging modality and has a broad range of applications that spans from diagnosis to intra-operative guidance in cardiac, orthopedic, and trauma procedures. Proper interpretation of the hidden and obscured anatomy in X-ray images remains a challenge and often requires high radiation dose and imaging from several perspectives. In this work, we aim at decomposing the conventional X-ray image into d X-ray components of independent, non-overlapping, clipped sub-volumes that separate rigid structures into distinct layers, leaving all deformable organs in one layer, such that the sum resembles the original input. Our proposed model is validated on 6 clinical datasets ($$\sim $$7200 X-ray images) in addition to 615 real chest X-ray images. Despite the challenging aspects of modeling such a highly ill-posed problem, exciting and encouraging results are obtained, paving the way for further contributions in this direction.

Shadi Albarqouni, Javad Fotouhi, Nassir Navab
Fast Prospective Detection of Contrast Inflow in X-ray Angiograms with Convolutional Neural Network and Recurrent Neural Network

Automatic detection of contrast inflow in X-ray angiographic sequences can facilitate image guidance in computer-assisted cardiac interventions. In this paper, we propose two different approaches for prospective contrast inflow detection. The methods were developed and evaluated to detect contrast frames in X-ray sequences. The first approach trains a convolutional neural network (CNN) to distinguish whether a frame contains contrast agent or not. The second method extracts contrast features from images with enhanced vessel structures; the contrast frames are then detected based on changes in the feature curve using long short-term memory (LSTM), a recurrent neural network architecture. Our experiments show that both approaches achieve good performance in detecting the first contrast frame of an X-ray sequence and are more robust than a state-of-the-art method. As the proposed methods work in prospective settings and run fast, they have the potential to be used in clinical practice.

Hua Ma, Pierre Ambrosini, Theo van Walsum
Quantification of Metabolites in Magnetic Resonance Spectroscopic Imaging Using Machine Learning

Magnetic Resonance Spectroscopic Imaging (MRSI) is a clinical imaging modality for measuring tissue metabolite levels in vivo. An accurate estimation of spectral parameters allows for better assessment of spectral quality and metabolite concentration levels. The current gold-standard quantification method is LCModel, a commercial fitting tool. However, it fails for spectra having a poor signal-to-noise ratio (SNR) or a large number of artifacts. This paper introduces a framework based on random forest regression for accurate estimation of the output parameters of a model-based analysis of MR spectroscopy data. The goal of our proposed framework is to learn spectral features from a training set comprising different variations of both simulated and in-vivo brain spectra, and then use this learning for the subsequent metabolite quantification. Experiments involve training and testing on simulated and in-vivo human brain spectra. We estimate parameters such as metabolite concentrations and compare our results with those from LCModel.
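
The regression step reduces to something like the scikit-learn snippet below, with synthetic stand-in spectra; feature extraction and the specific parameter set are simplified away.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
spectra = rng.normal(size=(500, 512))  # stand-in spectral feature vectors
conc = rng.uniform(size=(500, 3))      # stand-in concentrations of 3 metabolites

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(spectra[:400], conc[:400])              # learn spectrum -> parameters
print(model.predict(spectra[400:]).shape)         # (100, 3) estimated concentrations
```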

Dhritiman Das, Eduardo Coello, Rolf F. Schulte, Bjoern H. Menze
Building Disease Detection Algorithms with Very Small Numbers of Positive Samples

Although deep learning can provide promising results in medical image analysis, the lack of very large annotated datasets confines its full potential. Furthermore, limited positive samples also create unbalanced datasets which limit the true positive rates of trained models. As unbalanced datasets are mostly unavoidable, it is greatly beneficial if we can extract useful knowledge from negative samples to improve classification accuracy on limited positive samples. To this end, we propose a new strategy for building medical image analysis pipelines that target disease detection. We train a discriminative segmentation model only on normal images to provide a source of knowledge to be transferred to a disease detection classifier. We show that, using the feature maps of a trained segmentation network, deviations from normal anatomy can be learned by a two-class classification network on an extremely unbalanced training dataset with as few as one positive sample for every 17 negative samples. We demonstrate that even though the segmentation network is trained only on normal cardiac computed tomography images, the resulting feature maps can be used to detect pericardial effusion and cardiac septal defects with two-class convolutional classification networks.

Ken C. L. Wong, Alexandros Karargyris, Tanveer Syeda-Mahmood, Mehdi Moradi
Hierarchical Multimodal Fusion of Deep-Learned Lesion and Tissue Integrity Features in Brain MRIs for Distinguishing Neuromyelitis Optica from Multiple Sclerosis

Neuromyelitis optica spectrum disorder (NMOSD) is a disease of the central nervous system that is often misdiagnosed as multiple sclerosis (MS) because they share similar clinical and radiological characteristics. Two key pathological signs of NMOSD and MS that are detectable on magnetic resonance imaging (MRI) are white matter lesions and alterations in tissue integrity as measured by fractional anisotropy (FA) values on diffusion tensor images (DTIs). This paper proposes a multimodal deep learning model that discovers latent features in brain lesion masks and DTIs for distinguishing NMOSD from MS. The main technical challenge is to optimally extract and integrate features from two very heterogeneous image types (lesion masks and FA maps). Our solution is to first build two modality-specific pathways, each designed to accommodate the expected feature density and scale, then integrate them into a hierarchical multimodal fusion (HMF) model. The HMF model contains two multimodal fusion layers operating at two different scales, which in turn are joined by a multi-scale fusion layer. We hypothesize that the HMF approach would allow the automatic extraction of joint-features of heterogeneous image types to be optimized with greater efficiency and accuracy than the traditional multimodal approach of combining only the top-layer modality-specific features with a single fusion layer. The proposed model gives an average diagnostic accuracy of 81.3% (85.3% sensitivity and 75.0% specificity) on 82 NMOSD patients and 52 MS patients in a seven-fold cross-validation, which significantly outperforms the user-defined MRI features previously used in clinical studies, as well as deep-learned features using the conventional fusion approach.

Youngjin Yoo, Lisa Y. W. Tang, Su-Hyun Kim, Ho Jin Kim, Lisa Eunyoung Lee, David K. B. Li, Shannon Kolind, Anthony Traboulsee, Roger Tam
Deep Convolutional Encoder-Decoders for Prostate Cancer Detection and Classification

Prostate cancer accounts for approximately 11% of all cancer cases. Definitive diagnosis is made by histopathological examination of tissue biopsies. Recently, strong correlations have been established between pre-biopsy multi-parametric MR image findings and histopathology results. We investigate novel deep learning networks that provide tumor localization and classification based solely on prostate multi-parametric MR images with biopsy-confirmed lesions. We propose to use multi-channel image-to-image convolutional encoder-decoders, where responses signify localized lesions and output channels represent different tumor classes. We take simple point locations in the labeled ground truth data and train the networks to output Gaussian kernels around those points across multiple channels. This approach allows for both localization and classification within a single run. The input data consist of axial T2-weighted images, apparent diffusion coefficient maps, high b-value diffusion-weighted images, and K-trans parameter maps from 202 patients. The images were co-registered on a per-patient basis, and exhaustive comparisons were performed with 5-fold cross-validation across three models of increasing complexity. The highest average classification area under the curve (AUC) achieved was 83.4%, using a medium-complexity model in which no skip connections were used across layers. In individual folds, AUCs above 90% were achieved. The results demonstrate promise for directly determining tumor malignancy without performing an invasive biopsy procedure.

Atilla P. Kiraly, Clement Abi Nader, Ahmet Tuysuzoglu, Robert Grimm, Berthold Kiefer, Noha El-Zehiry, Ali Kamen
Deep Image-to-Image Recurrent Network with Shape Basis Learning for Automatic Vertebra Labeling in Large-Scale 3D CT Volumes

Automatic vertebra localization and identification in 3D medical images plays an important role in many clinical tasks, including pathological diagnosis, surgical planning, and postoperative assessment. In this paper, we propose an automatic and efficient algorithm to localize and label the vertebra centroids in 3D CT volumes. First, a deep image-to-image network (DI2IN) is deployed to initialize vertebra locations, employing a convolutional encoder-decoder architecture. Next, the centroid probability maps from the DI2IN are modeled as a sequence according to the spatial relationship of the vertebrae, and evolved with a convolutional long short-term memory (ConvLSTM) model. Finally, the landmark positions are further refined and regularized by another neural network with a learned shape basis. The whole pipeline can be conducted in an end-to-end manner. The proposed method outperforms other state-of-the-art methods on a public database of 302 spine CT volumes with various pathologies. To further boost the performance and validate that large labeled training data can benefit deep learning algorithms, we leverage an additional 1,000 3D CT volumes from different patients. Our experimental results show that training with a large database improves the performance of the proposed framework by a large margin and achieves an identification rate of 89%.

Dong Yang, Tao Xiong, Daguang Xu, S. Kevin Zhou, Zhoubing Xu, Mingqing Chen, JinHyeong Park, Sasa Grbic, Trac D. Tran, Sang Peter Chin, Dimitris Metaxas, Dorin Comaniciu
Automatic Liver Segmentation Using an Adversarial Image-to-Image Network

Automatic liver segmentation in 3D medical images is essential in many clinical applications, such as pathological diagnosis of hepatic diseases, surgical planning, and postoperative assessment. However, it is still a very challenging task due to the complex background, fuzzy boundary, and varied appearance of the liver. In this paper, we propose an automatic and efficient algorithm to segment the liver from 3D CT volumes. A deep image-to-image network (DI2IN) is first deployed to generate the liver segmentation, employing a convolutional encoder-decoder architecture combined with multi-level feature concatenation and deep supervision. An adversarial network is then utilized during the training process to discriminate the output of the DI2IN from the ground truth, which further boosts the performance of the DI2IN. The proposed method is trained on an annotated dataset of 1,000 CT volumes with various scanning protocols (e.g., contrast and non-contrast, various resolutions and positions) and large variations in populations (e.g., ages and pathology). Our approach outperforms state-of-the-art solutions in terms of segmentation accuracy and computational efficiency.

Dong Yang, Daguang Xu, S. Kevin Zhou, Bogdan Georgescu, Mingqing Chen, Sasa Grbic, Dimitris Metaxas, Dorin Comaniciu
Transfer Learning for Domain Adaptation in MRI: Application in Brain Lesion Segmentation

Magnetic Resonance Imaging (MRI) is widely used in routine clinical diagnosis and treatment. However, variations in MRI acquisition protocols result in different appearances of normal and diseased tissue in the images. Convolutional neural networks (CNNs), which have been shown to be successful in many medical image analysis tasks, are typically sensitive to variations in imaging protocols. Therefore, in many cases, networks trained on data acquired with one MRI protocol do not perform satisfactorily on data acquired with different protocols. This limits the use of models trained on large annotated legacy datasets for new datasets from a different domain, a recurring situation in clinical settings. In this study, we aim to answer the following central questions regarding domain adaptation in medical image analysis: given a fitted legacy model, (1) how much data from the new domain is required for a decent adaptation of the original network? and (2) what portion of the pre-trained model parameters should be retrained given a certain number of new-domain training samples? To address these questions, we conducted extensive experiments on a white matter hyperintensity segmentation task. We trained a CNN on legacy MR images of the brain and evaluated the performance of the domain-adapted network on the same task with images from a different domain. We then compared the performance of the model to surrogate scenarios where either the same trained network is used directly or a new network is trained from scratch on the new dataset. The domain-adapted network, tuned with only two training examples, achieved a Dice score of 0.63, substantially outperforming a similar network trained on the same set of examples from scratch.
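
A minimal sketch of the "what portion to retrain" knob, assuming PyTorch; `prepare_for_adaptation` and the parameter-tensor granularity are illustrative simplifications.

```python
import torch.nn as nn

def prepare_for_adaptation(model: nn.Module, n_tune: int):
    """Freeze everything except the last n_tune parameter tensors, so only
    that portion of the legacy model is retrained on new-domain samples."""
    params = list(model.parameters())
    for p in params[:-n_tune]:
        p.requires_grad = False
    return [p for p in params[-n_tune:]]  # hand these to the optimizer

legacy = nn.Sequential(nn.Conv2d(1, 8, 3), nn.ReLU(), nn.Conv2d(8, 2, 3))
tunable = prepare_for_adaptation(legacy, n_tune=2)  # retrain only the last conv layer
print(sum(p.numel() for p in tunable))
```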

Mohsen Ghafoorian, Alireza Mehrtash, Tina Kapur, Nico Karssemeijer, Elena Marchiori, Mehran Pesteie, Charles R. G. Guttmann, Frank-Erik de Leeuw, Clare M. Tempany, Bram van Ginneken, Andriy Fedorov, Purang Abolmaesumi, Bram Platel, William M. Wells III
Retinal Microaneurysm Detection Using Clinical Report Guided Multi-sieving CNN

Timely detection and treatment of microaneurysms (MA) is a critical step to prevent the development of vision-threatening eye diseases such as diabetic retinopathy. However, detecting MAs in fundus images is a highly challenging task due to the large variation of imaging conditions. In this paper, we focus on developing an interleaved deep mining technique to cope intelligently with the unbalanced MA detection problem. Specifically, we present a clinical report guided multi-sieving convolutional neural network (MS-CNN) which leverages a small amount of supervised information in clinical reports to identify the potential MA regions via a text-to-image mapping in the feature space. These potential MA regions are then interleaved with the fundus image information for multi-sieving deep mining in a highly unbalanced classification problem. Critically, the clinical reports are employed to bridge the semantic gap between low-level image features and high-level diagnostic information. Extensive evaluations show our framework achieves 99.7% precision and 87.8% recall, comparing favorably with the state-of-the-art algorithms. Integration of expert domain knowledge and image information demonstrates the feasibility to reduce the training difficulty of the classifiers under extremely unbalanced data distribution.

Ling Dai, Bin Sheng, Qiang Wu, Huating Li, Xuhong Hou, Weiping Jia, Ruogu Fang
Lesion Detection and Grading of Diabetic Retinopathy via Two-Stages Deep Convolutional Neural Networks

We propose an automatic diabetic retinopathy (DR) analysis algorithm based on two-stage deep convolutional neural networks (DCNNs). Compared to existing DCNN-based DR detection methods, the proposed algorithm has the following advantages: (1) our algorithm not only points out the lesions in fundus color images, but also gives the severity grades of DR; (2) by introducing an imbalanced weighting scheme, more attention is paid to lesion patches for DR grading, which significantly improves the performance of DR grading under the same implementation setup. In this study, we label 12,206 lesion patches and re-annotate the DR grades of 23,595 fundus images from the Kaggle competition dataset. Under the guidance of clinical ophthalmologists, the experimental results show that our lesion detection net achieves performance comparable to trained human observers, and the proposed imbalanced weighting scheme is also shown to significantly enhance the capability of our DCNN-based DR grading algorithm.

Yehui Yang, Tao Li, Wensi Li, Haishan Wu, Wei Fan, Wensheng Zhang
Hashing with Residual Networks for Image Retrieval

We propose a novel deeply learnt convolutional neural network architecture for supervised hashing of medical images through residual learning, coined Deep Residual Hashing (DRH). It offers maximal separability of classes in hashing space while preserving semantic similarities in local embedding neighborhoods. We also introduce a new optimization formulation comprising complementary loss terms and regularizations that best suit hashing objectives by controlling quantization error. We conduct extensive validations on 2,599 chest X-ray images with co-morbidities against eight state-of-the-art hashing techniques and demonstrate the improved performance and computational benefits of the proposed algorithm for fast and scalable retrieval.

Sailesh Conjeti, Abhijit Guha Roy, Amin Katouzian, Nassir Navab
Deep Multiple Instance Hashing for Scalable Medical Image Retrieval

In this paper, for the first time, we introduce a multiple instance (MI) deep hashing technique for learning discriminative hash codes with weak bag-level supervision suited for large-scale retrieval. We learn such hash codes by aggregating deeply learnt hierarchical representations across bag members through an MI pool layer. For better trainability and retrieval quality, we propose a two-pronged approach that includes robust optimization and training with an auxiliary single instance hashing arm which is down-regulated gradually. We pose retrieval for tumor assessment as an MI problem because tumors often coexist with benign masses and could exhibit complementary signatures when scanned from different anatomical views. Experimental validations demonstrate improved retrieval performance over the state-of-the-art methods.

Sailesh Conjeti, Magdalini Paschali, Amin Katouzian, Nassir Navab
Accurate Pulmonary Nodule Detection in Computed Tomography Images Using Deep Convolutional Neural Networks

Early detection of lung cancer is the most promising way to enhance a patient's chance of survival. Accurate pulmonary nodule detection in computed tomography (CT) images is a crucial step in diagnosing lung cancer. In this paper, inspired by the successful use of deep convolutional neural networks (DCNNs) in natural image recognition, we propose a novel pulmonary nodule detection approach based on DCNNs. We first introduce a deconvolutional structure to the Faster Region-based Convolutional Neural Network (Faster R-CNN) for candidate detection on axial slices. Then, a three-dimensional DCNN is presented for the subsequent false positive reduction. Experimental results on the LUng Nodule Analysis 2016 (LUNA16) Challenge demonstrate the superior detection performance of the proposed approach (average FROC score of 0.893, ranking first among all submitted results), which outperforms the best result on the leaderboard of the LUNA16 Challenge (average FROC score of 0.864).

Jia Ding, Aoxue Li, Zhiqiang Hu, Liwei Wang
Discriminative Localization in CNNs for Weakly-Supervised Segmentation of Pulmonary Nodules

Automated detection and segmentation of pulmonary nodules on lung computed tomography (CT) scans can facilitate early lung cancer diagnosis. Existing supervised approaches for automated nodule segmentation on CT scans require voxel-based annotations for training, which are labor- and time-consuming to obtain. In this work, we propose a weakly-supervised method that generates accurate voxel-level nodule segmentation trained with image-level labels only. By adapting a convolutional neural network (CNN) trained for image classification, our proposed method learns discriminative regions from the activation maps of convolution units at different scales, and identifies the true nodule location with a novel candidate-screening framework. Experimental results on the public LIDC-IDRI dataset demonstrate that our weakly-supervised nodule segmentation framework achieves competitive performance compared to a fully-supervised CNN-based segmentation method.
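
Learning discriminative regions from activation maps is commonly done with class activation mapping; a minimal sketch follows (PyTorch; illustrative, not the paper's exact multi-scale scheme).

```python
import torch

def class_activation_map(feature_maps, fc_weights, cls):
    """feature_maps: (C, H, W) activations of the last conv layer;
    fc_weights: (n_classes, C) weights of a global-average-pooling classifier.
    Returns the (H, W) map highlighting regions that drive the class score."""
    return torch.einsum('c,chw->hw', fc_weights[cls], feature_maps)

cam = class_activation_map(torch.rand(64, 8, 8), torch.rand(2, 64), cls=1)
print(cam.shape)  # torch.Size([8, 8])
```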

Xinyang Feng, Jie Yang, Andrew F. Laine, Elsa D. Angelini
Liver Lesion Detection Based on Two-Stage Saliency Model with Modified Sparse Autoencoder

Liver lesion detection is an important task for diagnosis and surgical planning in focal liver disease. The large number of images in routine liver CT studies, in addition to their high diversity in appearance, makes detecting all lesions by visual inspection difficult. Automated methods for lesion identification are desirable, but the results of current approaches are limited due to the diversity of training sets and extensive parameter tuning. In this paper, we propose a novel saliency model for lesion detection in CT images. First, we segment the image into patches at multiple scales. Then, a two-stage saliency model is applied to detect liver lesions. In the first stage, we calculate a gray-level contrast saliency map based on prior knowledge to reduce the influence of blood vessels in CT images. In the second stage, we propose a modified sparse autoencoder (SAE) with neighbourhood information to learn discriminative features directly from raw patch features, and adopt the Locality-constrained Linear Coding (LLC) method to encode the obtained discriminative features of each patch. A second saliency map is then calculated based on feature uniqueness and the spatial distribution of patches. After an appropriate fusion of the two maps, liver lesions can be detected reliably. With $$7\times 7$$ patches, a 120-word visual dictionary, and 14 feature dimensions, our model achieved 90.81% accuracy for lesion detection.

Yixuan Yuan, Max Q.-H. Meng, Wenjian Qin, Lei Xing
Manifold Learning of COPD

Analysis of CT scans for studying Chronic Obstructive Pulmonary Disease (COPD) is generally limited to mean scores of disease extent. However, the evolution of local pulmonary damage may vary between patients with discordant effects on lung physiology. This limits the explanatory power of mean values in clinical studies. We present local disease and deformation distributions to address this limitation. The disease distribution aims to quantify two aspects of parenchymal damage: locally diffuse/dense disease and global homogeneity/heterogeneity. The deformation distribution links parenchymal damage to local volume change. These distributions are exploited to quantify inter-patient differences. We used manifold learning to model variations of these distributions in 743 patients from the COPDGene study. We applied manifold fusion to combine distinct aspects of COPD into a single model. We demonstrated the utility of the distributions by comparing associations between learned embeddings and measures of severity. We also illustrated the potential to identify trajectories of disease progression in a manifold space of COPD.

Felix J. S. Bragman, Jamie R. McClelland, Joseph Jacob, John R. Hurst, David J. Hawkes
Hybrid Mass Detection in Breast MRI Combining Unsupervised Saliency Analysis and Deep Learning

To interpret a breast MRI study, a radiologist has to examine over 1000 images, and integrate spatial and temporal information from multiple sequences. The automated detection and classification of suspicious lesions can help reduce the workload and improve accuracy. We describe a hybrid mass-detection algorithm that combines unsupervised candidate detection with deep learning-based classification. The detection algorithm first identifies image-salient regions, as well as regions that are cross-salient with respect to the contralateral breast image. We then use a convolutional neural network (CNN) to classify the detected candidates into true-positive and false-positive masses. The network uses a novel multi-channel image representation; this representation encompasses information from the anatomical and kinetic image features, as well as saliency maps. We evaluated our algorithm on a dataset of MRI studies from 171 patients, with 1957 annotated slices of malignant (59%) and benign (41%) masses. Unsupervised saliency-based detection provided a sensitivity of 0.96 with 9.7 false-positive detections per slice. Combined with CNN classification, the number of false positive detections dropped to 0.7 per slice, with 0.85 sensitivity. The multi-channel representation achieved higher classification performance compared to single-channel images. The combination of domain-specific unsupervised methods and general-purpose supervised learning offers advantages for medical imaging applications, and may improve the ability of automated algorithms to assist radiologists.

Guy Amit, Omer Hadad, Sharon Alpert, Tal Tlusty, Yaniv Gur, Rami Ben-Ari, Sharbell Hashoul
Deep Multi-instance Networks with Sparse Label Assignment for Whole Mammogram Classification

Mammogram classification is directly related to computer-aided diagnosis of breast cancer. Traditional methods rely on regions of interest (ROIs) which require great efforts to annotate. Inspired by the success of using deep convolutional features for natural image analysis and multi-instance learning (MIL) for labeling a set of instances/patches, we propose end-to-end trained deep multi-instance networks for mass classification based on whole mammogram without the aforementioned ROIs. We explore three different schemes to construct deep multi-instance networks for whole mammogram classification. Experimental results on the INbreast dataset demonstrate the robustness of proposed networks compared to previous work using segmentation and detection annotations. (Code: https://github.com/wentaozhu/deep-mil-for-whole-mammogram-classification.git).
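
One of the simplest such schemes is max-pooling over instance scores; a minimal sketch assuming PyTorch, with `backbone` standing in for the convolutional trunk.

```python
import torch
import torch.nn as nn

class MaxInstanceMIL(nn.Module):
    """backbone maps a whole mammogram (B, 1, H, W) to a grid of per-patch
    malignancy logits (B, 1, h, w); the bag (image) label is the max over
    instances, so training needs image-level labels only."""
    def __init__(self, backbone):
        super().__init__()
        self.backbone = backbone

    def forward(self, x):
        instance_logits = self.backbone(x)
        return instance_logits.flatten(1).max(dim=1).values

mil = MaxInstanceMIL(nn.Conv2d(1, 1, kernel_size=16, stride=16))
print(mil(torch.rand(2, 1, 224, 224)).shape)  # torch.Size([2])
```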

Wentao Zhu, Qi Lou, Yeeleng Scott Vang, Xiaohui Xie
Segmentation-Free Kidney Localization and Volume Estimation Using Aggregated Orthogonal Decision CNNs

Kidney volume is an important bio-marker in the clinical diagnosis of various renal diseases. For example, it plays an essential role in follow-up evaluation of kidney transplants. Most existing methods for volume estimation rely on kidney segmentation as a prerequisite step, which has various limitations such as initialization-sensitivity and computationally-expensive optimization. In this paper, we propose a hybrid localization-volume estimation deep learning approach capable of (i) localizing kidneys in abdominal CT images, and (ii) estimating renal volume without requiring segmentation. Our approach involves multiple levels of self-learning of image representation using convolutional neural layers, which we show better capture the rich and complex variability in kidney data, demonstrably outperforming hand-crafted feature representations. We validate our method on clinical data of 100 patients with a total of 200 kidney samples (left and right). Our results demonstrate a 55% increase in kidney boundary localization accuracy, and a 30% increase in volume estimation accuracy compared to recent state-of-the-art methods deploying regression-forest-based learning for the same tasks.

Mohammad Arafat Hussain, Alborz Amir-Khalili, Ghassan Hamarneh, Rafeef Abugharbieh
Progressive and Multi-path Holistically Nested Neural Networks for Pathological Lung Segmentation from CT Images

Pathological lung segmentation (PLS) is an important, yet challenging, medical image application due to the wide variability of pathological lung appearance and shape. Because PLS is often a prerequisite for other imaging analytics, methodological simplicity and generality are key factors in usability. Along those lines, we present a bottom-up deep-learning based approach that is expressive enough to handle variations in appearance, while remaining unaffected by any variations in shape. We incorporate the deeply supervised learning framework, but enhance it with a simple, yet effective, progressive multi-path scheme, which more reliably merges outputs from different network stages. The result is a deep model able to produce finer detailed masks, which we call progressive holistically-nested networks (P-HNNs). Using extensive cross-validation, our method is tested on a multi-institutional dataset comprising 929 CT scans (848 publicly available) of pathological lungs, reporting mean Dice scores of 0.985 and demonstrating significant qualitative and quantitative improvements over state-of-the-art approaches.

Adam P. Harrison, Ziyue Xu, Kevin George, Le Lu, Ronald M. Summers, Daniel J. Mollura
Automated Pulmonary Nodule Detection via 3D ConvNets with Online Sample Filtering and Hybrid-Loss Residual Learning

In this paper, we propose a novel framework with 3D convolutional networks (ConvNets) for automated detection of pulmonary nodules from low-dose CT scans, which is a challenging yet crucial task for lung cancer early diagnosis and treatment. Different from previous standard ConvNets, we try to tackle the severe hard/easy sample imbalance problem in medical datasets and explore the benefits of localized annotations to regularize the learning, and hence boost the performance of ConvNets to achieve more accurate detections. Our proposed framework consists of two stages: (1) candidate screening, and (2) false positive reduction. In the first stage, we establish a 3D fully convolutional network, effectively trained with an online sample filtering scheme, to sensitively and rapidly screen the nodule candidates. In the second stage, we design a hybrid-loss residual network which harnesses the location and size information as important cues to guide the nodule recognition procedure. Experimental results on the public large-scale LUNA16 dataset demonstrate superior performance of our proposed method compared with state-of-the-art approaches for the pulmonary nodule detection task.

Qi Dou, Hao Chen, Yueming Jin, Huangjing Lin, Jing Qin, Pheng-Ann Heng
CASED: Curriculum Adaptive Sampling for Extreme Data Imbalance

We introduce CASED, a novel curriculum sampling algorithm that facilitates the optimization of deep learning segmentation or detection models on data sets with extreme class imbalance. We evaluate the CASED learning framework on the task of lung nodule detection in chest CT. In contrast to two-stage solutions, wherein nodule candidates are first proposed by a segmentation model and refined by a second detection stage, CASED improves the training of deep nodule segmentation models (e.g., UNet) to the point where state-of-the-art results are achieved using only a trivial detection stage. CASED improves the optimization of deep segmentation models by allowing them to first learn how to distinguish nodules from their immediate surroundings, while continuously adding a greater proportion of difficult-to-classify global context, until uniformly sampling from the empirical data distribution. Using CASED during training yields a minimalist approach to the lung nodule detection problem that tops the LUNA16 nodule detection benchmark with an average sensitivity score of 88.35%. Furthermore, we find that models trained using CASED are robust to nodule annotation quality: comparable results can be achieved when only a point and radius for each ground truth nodule are provided during training. Finally, the CASED learning framework makes no assumptions with regard to imaging modality or segmentation target and should generalize to other medical imaging problems where class imbalance is a persistent problem.
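
A toy sketch of the sampling curriculum, with a linear annealing schedule assumed for illustration; `nodule_patches` and `sample_uniform` stand in for the actual patch sources.

```python
import numpy as np

def cased_batch(nodule_patches, sample_uniform, batch_size, step, anneal_steps, rng):
    """Curriculum: begin with nodule-centered ("easy to separate") patches
    only, then raise the share of uniformly drawn patches linearly until the
    batch matches the empirical (heavily background) data distribution."""
    p_uniform = min(1.0, step / anneal_steps)
    n_uniform = rng.binomial(batch_size, p_uniform)
    easy = [nodule_patches[i] for i in
            rng.integers(len(nodule_patches), size=batch_size - n_uniform)]
    return easy + [sample_uniform() for _ in range(n_uniform)]

rng = np.random.default_rng(0)
nodules = [f"nodule_{i}" for i in range(10)]  # stand-ins for patch arrays
print(cased_batch(nodules, lambda: "background", 8, step=500, anneal_steps=1000, rng=rng))
```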

Andrew Jesson, Nicolas Guizard, Sina Hamidi Ghalehjegh, Damien Goblot, Florian Soudan, Nicolas Chapados
Intra-perinodular Textural Transition (Ipris): A 3D Descriptor for Nodule Diagnosis on Lung CT

This paper presents Ipris (Intra-perinodular textural transition), a new radiomic method to automatically distinguish between benign and malignant nodules on routine lung CT scans. Ipris represents a minimal set of quantitative measurements that attempt to capture the transition in textural appearance going from the inside to the outside of the nodule. Briefly, the approach involves partitioning the 3D volume and interface of the nodule into K nested shells. A set of 48 Ipris features is then extracted from 2D slices of the shells. The features pertain to spiculations, intensity, and gradient sharpness, obtained from intensity differences between the inner and outer voxels of an interface voxel. The Ipris features were used to train a support vector machine classifier to distinguish benign (granulomas) from malignant (adenocarcinomas) nodules on non-contrast CT scans. We used CT scans of 290 patients from multiple institutions, one cohort for training (N = 145) and the other (N = 145) for independent validation. Independent validation of the Ipris approach yielded an AUC of 0.83, whereas established textural and shape radiomic features yielded an AUC of 0.75, and two human experts (a pulmonologist and a radiologist) achieved AUCs of 0.69 and 0.73, respectively.

Mehdi Alilou, Mahdi Orooji, Anant Madabhushi
Transferable Multi-model Ensemble for Benign-Malignant Lung Nodule Classification on Chest CT

The classification of benign versus malignant lung nodules using chest CT plays a pivotal role in the early detection of lung cancer, and early detection offers the best chance of cure. Although deep learning is now the most successful solution for image classification problems, it requires a large amount of training data, which is not usually readily available for most routine medical imaging applications. In this paper, we propose the transferable multi-model ensemble (TMME) algorithm to separate malignant from benign lung nodules using limited chest CT data. This algorithm transfers the image representation abilities of three ResNet-50 models, which were pre-trained on the ImageNet database, to characterize the overall appearance, heterogeneity of voxel values, and heterogeneity of shape of lung nodules, respectively, and jointly utilizes them to classify lung nodules with an adaptive weighting scheme learned during error back-propagation. Experimental results on the benchmark LIDC-IDRI dataset show that our proposed TMME algorithm achieves a lung nodule classification accuracy of 93.40%, which is markedly higher than the accuracy of seven state-of-the-art approaches.

Yutong Xie, Yong Xia, Jianpeng Zhang, David Dagan Feng, Michael Fulham, Weidong Cai
Deep Reinforcement Learning for Active Breast Lesion Detection from DCE-MRI

We present a novel methodology for the automated detection of breast lesions in dynamic contrast-enhanced magnetic resonance volumes (DCE-MRI). Our method, based on deep reinforcement learning, significantly reduces the inference time for lesion detection compared to an exhaustive search, while retaining state-of-the-art accuracy. This speed-up is achieved via an attention mechanism that progressively focuses the search for a lesion (or lesions) on the appropriate region(s) of the input volume. The attention mechanism is implemented by training an artificial agent to learn a search policy, which is then exploited during inference. Specifically, we extend the deep Q-network approach, previously demonstrated on simpler problems such as anatomical landmark detection, in order to detect lesions that have a significant variation in shape, appearance, location, and size. We demonstrate our results on a dataset containing 117 DCE-MRI volumes, validating the run-time and accuracy of lesion detection.

Gabriel Maicas, Gustavo Carneiro, Andrew P. Bradley, Jacinto C. Nascimento, Ian Reid
Pancreas Segmentation in MRI Using Graph-Based Decision Fusion on Convolutional Neural Networks

Deep neural networks have demonstrated very promising performance on accurate segmentation of challenging organs (e.g., the pancreas) in abdominal CT and MRI scans. Current deep learning approaches conduct pancreas segmentation by processing sequences of 2D image slices independently through deep, dense per-pixel masking for each image, without explicitly enforcing a spatial consistency constraint on the segmentation of successive slices. We propose a new convolutional/recurrent neural network architecture to address this contextual learning and segmentation consistency problem. A deep convolutional sub-network is first designed and pre-trained from scratch. The output layer of this network module is then connected to recurrent layers and can be fine-tuned for contextual learning in an end-to-end manner. Our recurrent sub-network is a type of long short-term memory (LSTM) network that performs segmentation on an image by integrating its neighboring slice segmentation predictions, in the form of dependent sequence processing. Additionally, a novel segmentation-direct loss function (named Jaccard loss) is proposed, and the deep networks are trained to optimize the Jaccard index (JI) directly. Extensive experiments are conducted to validate our proposed deep models on quantitative pancreas segmentation using both CT and MRI scans. Our method outperforms the state-of-the-art work on CT [11] and MRI [1] pancreas segmentation, respectively.
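
A minimal sketch of a differentiable Jaccard surrogate of the kind described (PyTorch; the paper's exact Jaccard loss may differ in details).

```python
import torch

def jaccard_loss(probs, target, eps=1e-6):
    """Differentiable surrogate of the Jaccard index: probs are foreground
    probabilities and target is the binary mask, both shaped (B, H, W)."""
    inter = (probs * target).sum(dim=(1, 2))
    union = probs.sum(dim=(1, 2)) + target.sum(dim=(1, 2)) - inter
    return (1.0 - (inter + eps) / (union + eps)).mean()

p = torch.rand(4, 64, 64, requires_grad=True)
t = (torch.rand(4, 64, 64) > 0.5).float()
jaccard_loss(p, t).backward()   # trainable end to end
print(p.grad.shape)
```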

Jinzheng Cai, Le Lu, Yuanpu Xie, Fuyong Xing, Lin Yang
Modeling Cognitive Trends in Preclinical Alzheimer’s Disease (AD) via Distributions over Permutations

This paper presents an algorithm to identify subsets of subjects who share similarities in the context of imaging and clinical measurements within a cohort of cognitively healthy individuals at risk for Alzheimer’s disease (AD). In particular, we wish to evaluate how patterns in the subjects’ cognitive scores or PIB-PET image measurements are associated with a clinical assessment of risk of developing AD, image based measures, and future cognitive decline. The challenge here is that all the participants are asymptomatic, our predictors are noisy and heterogeneous, and the disease specific signal, when present, is weak. As a result, off-the-shelf methods do not work well. We develop a model that uses a probability distribution over the set of permutations to represent the data; this yields a distance measure robust to these issues. We then show that our algorithm produces consistent and meaningful groupings of subjects based on their cognitive scores and that it provides a novel and interesting representation of measurements from PIB-PET images.

Gregory Plumb, Lindsay Clark, Sterling C. Johnson, Vikas Singh
Does Manual Delineation only Provide the Side Information in CT Prostate Segmentation?

Prostate segmentation, for accurate prostate localization in CT images, is regarded as a crucial yet challenging task. Due to inevitable factors (e.g., low contrast, large appearance and shape changes), the most important problem is how to learn an informative feature representation to distinguish the prostate from non-prostate regions. We address this challenging feature learning problem by leveraging the manual delineation as guidance: the manual delineation not only indicates the category of patches, but also helps enhance the appearance of the prostate. This is realized by the proposed cascaded deep domain adaptation (CDDA) model. Specifically, CDDA constructs several consecutive source domains by employing a mask of the manual delineation overlaid on the original CT images with different mask ratios. Upon these source domains, a convnet guides increasingly transferable feature learning toward the target domain. In particular, we implement two typical variants: patch-to-scalar (CDDA-CNN) and patch-to-patch (CDDA-FCN). We also theoretically analyze the generalization error bound of CDDA. Experimental results show the promise of our method.

Yinghuan Shi, Wanqi Yang, Yang Gao, Dinggang Shen
Backmatter
Metadata
Title
Medical Image Computing and Computer Assisted Intervention − MICCAI 2017
Edited by
Maxime Descoteaux
Lena Maier-Hein
Alfred Franz
Pierre Jannin
D. Louis Collins
Dr. Simon Duchesne
Copyright Year
2017
Electronic ISBN
978-3-319-66179-7
Print ISBN
978-3-319-66178-0
DOI
https://doi.org/10.1007/978-3-319-66179-7