
2019 | Book

Medical Image Computing and Computer Assisted Intervention – MICCAI 2019

22nd International Conference, Shenzhen, China, October 13–17, 2019, Proceedings, Part I

Edited by: Dinggang Shen, Prof. Tianming Liu, Dr. Terry M. Peters, Lawrence H. Staib, Caroline Essert, Sean Zhou, Prof. Dr. Pew-Thian Yap, Ali Khan

Publisher: Springer International Publishing

Book series: Lecture Notes in Computer Science


About this book

The six-volume set LNCS 11764, 11765, 11766, 11767, 11768, and 11769 constitutes the refereed proceedings of the 22nd International Conference on Medical Image Computing and Computer-Assisted Intervention, MICCAI 2019, held in Shenzhen, China, in October 2019.

The 539 revised full papers presented were carefully reviewed and selected from 1730 submissions in a double-blind review process. The papers are organized in the following topical sections:

Part I: optical imaging; endoscopy; microscopy.

Part II: image segmentation; image registration; cardiovascular imaging; growth, development, atrophy and progression.

Part III: neuroimage reconstruction and synthesis; neuroimage segmentation; diffusion weighted magnetic resonance imaging; functional neuroimaging (fMRI); miscellaneous neuroimaging.

Part IV: shape; prediction; detection and localization; machine learning; computer-aided diagnosis; image reconstruction and synthesis.

Part V: computer assisted interventions; MIC meets CAI.

Part VI: computed tomography; X-ray imaging.

Table of Contents

Frontmatter

Optical Imaging

Frontmatter
Enhancing OCT Signal by Fusion of GANs: Improving Statistical Power of Glaucoma Clinical Trials

Accurately monitoring the efficacy of disease-modifying drugs in glaucoma therapy is of critical importance. Although high-resolution spectral-domain optical coherence tomography (SDOCT) is now in widespread clinical use, past landmark glaucoma clinical trials used time-domain optical coherence tomography (TDOCT), which suffers from poor statistical power due to low signal-to-noise characteristics. Here, we propose a probabilistic ensemble model for improving the statistical power of imaging-based clinical trials. TDOCT images are converted to synthesized SDOCT images and segmented via Bayesian fusion of an ensemble of generative adversarial networks (GANs). The proposed model integrates super resolution (SR) and multi-atlas segmentation (MAS) in a principled way. Experiments on the UK Glaucoma Treatment Study (UKGTS) show that the model successfully combines the strengths of both techniques (the improved image quality of SR and the effective label propagation of MAS), and produces a significantly better separation between treatment arms than conventional segmentation of TDOCT.

Georgios Lazaridis, Marco Lorenzi, Sebastien Ourselin, David Garway-Heath
A Deep Reinforcement Learning Framework for Frame-by-Frame Plaque Tracking on Intravascular Optical Coherence Tomography Image

Intravascular Optical Coherence Tomography (IVOCT) is considered the gold standard for atherosclerotic plaque analysis in clinical applications. A continuous and accurate plaque tracking algorithm is critical for coronary heart disease diagnosis and treatment. However, continuous and accurate frame-by-frame plaque tracking is very challenging because of difficulties inherent to IVOCT imaging, such as speckle noise, complex and varied intravascular morphology, and the large number of IVOCT images in a pullback. To address this challenging problem, we propose, for the first time, a novel Reinforcement Learning (RL) based framework for accurate and continuous frame-by-frame plaque tracking on IVOCT images. In this framework, eight transformation actions are designed for IVOCT images to fit any possible change of a plaque's location and scale, and the spatio-temporal location correlation of adjacent frames is modeled into the RL state representation to achieve continuous and accurate plaque detection, avoiding potential omissions. Moreover, the proposed method is readily extensible, as both fully-automated and semi-automated tracking patterns are supported to fit clinical practice. Experiments on large-scale IVOCT data show that the plaque-level accuracy of the proposed method reaches 0.89 and 0.94 for the fully-automated and semi-automated tracking patterns, respectively, demonstrating great potential for future clinical practice. The code is openly available at https://github.com/luogongning/PlaqueRL .

Gongning Luo, Suyu Dong, Kuanquan Wang, Dong Zhang, Yue Gao, Xin Chen, Henggui Zhang, Shuo Li
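The abstract does not enumerate the eight transformation actions. As a purely illustrative sketch, a plausible action set covering "location and scale" changes is four translations plus four axis-wise scalings of the tracked bounding box; the paper's actual actions may differ:

```python
from dataclasses import dataclass

@dataclass
class Box:
    cx: float   # plaque bounding-box centre (normalized image coordinates)
    cy: float
    w: float
    h: float

# Hypothetical 8-action set: 4 translations and 4 scale changes.
STEP, FACTOR = 0.05, 1.1

ACTIONS = {
    0: lambda b: Box(b.cx - STEP, b.cy, b.w, b.h),      # move left
    1: lambda b: Box(b.cx + STEP, b.cy, b.w, b.h),      # move right
    2: lambda b: Box(b.cx, b.cy - STEP, b.w, b.h),      # move up
    3: lambda b: Box(b.cx, b.cy + STEP, b.w, b.h),      # move down
    4: lambda b: Box(b.cx, b.cy, b.w * FACTOR, b.h),    # widen
    5: lambda b: Box(b.cx, b.cy, b.w / FACTOR, b.h),    # narrow
    6: lambda b: Box(b.cx, b.cy, b.w, b.h * FACTOR),    # taller
    7: lambda b: Box(b.cx, b.cy, b.w, b.h / FACTOR),    # shorter
}

box = Box(0.5, 0.5, 0.2, 0.2)
box = ACTIONS[1](box)   # the agent picks one action per step until convergence
```

At each frame the agent would select actions from the RL state, which, per the abstract, encodes the spatio-temporal correlation with adjacent frames.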
Multi-index Optic Disc Quantification via MultiTask Ensemble Learning

Accurate quantification of the optic disc (OD) is clinically significant for the assessment and diagnosis of ophthalmic disease. Multi-index OD quantification, i.e., simultaneously quantifying a set of clinical indices including 2 vertical diameters (cup and disc), 2 whole areas (disc and rim), and 16 regional areas, is a previously unaddressed challenge due to the complexity of the multi-dimensional nonlinear mapping and the varied visual appearance across patients. In this paper, we propose a novel multitask ensemble learning framework (DMTFs) to automatically achieve accurate multi-type, multi-index OD quantification. DMTFs creates an ensemble of multiple OD quantification tasks (OD segmentation and indices estimation) that are individually accurate and mutually complementary, and then learns the ensemble under a multi-task learning framework formed as a tree structure with a root network for shared feature representation, two branches for task-specific prediction, and a multitask ensemble module that aggregates the multi-index OD quantification. DMTFs models the consistency correlation between the OD segmentation and indices estimation tasks to conform to accurate multi-index OD quantification. Experiments on the ORIGA datasets show that the proposed method achieves impressive performance, with average mean absolute errors on the 20 indices of 0.99 ± 0.20, 0.73 ± 0.14 and 1.23 ± 0.24 for diameters, whole areas and regional areas, respectively. Besides, the obtained quantitative indices achieve competitive performance (AUC = 0.8623) on glaucoma diagnosis. As the first multi-index OD quantification method, the proposed DMTFs demonstrates great potential for clinical application.

Rongchang Zhao, Zailiang Chen, Xiyao Liu, Beiji Zou, Shuo Li
Retinal Abnormalities Recognition Using Regional Multitask Learning

The number of people suffering from retinal diseases increases with population aging and the popularity of electronic screens. Previous studies on deep learning based automatic screening have generally focused on specific types of retinal diseases, such as diabetic retinopathy and glaucoma. Since patients may suffer from various types of retinal diseases simultaneously, these solutions are not clinically practical. To address this issue, we propose a novel deep learning based method that can recognise 36 different retinal diseases with a single model. More specifically, the proposed method uses a region-specific multi-task recognition model, learning diseases affecting different regions of the retina with three sub-networks. The three sub-networks are semantically trained to recognise diseases affecting the optic disc, the macula and the entire retina. Our contribution is two-fold. First, we use multitask learning for retinal disease classification and achieve significant improvements for recognising the three main groups of retinal diseases in the general, macular and optic-disc regions. Second, we collect a multi-label retinal dataset and release it to the community as a standard benchmark for further research opportunities.

Xin Wang, Lie Ju, Xin Zhao, Zongyuan Ge
Unifying Structure Analysis and Surrogate-Driven Function Regression for Glaucoma OCT Image Screening

Optical Coherence Tomography (OCT) imaging plays an important role in glaucoma diagnosis in clinical practice. Early detection and timely treatment can prevent glaucoma patients from suffering permanent vision loss. However, few automated methods based on OCT images have been developed for glaucoma study. In this paper, we present a novel framework to effectively classify glaucoma OCT images from normal ones. A semi-supervised learning strategy with a smoothness assumption is applied for surrogate assignment of missing function regression labels. Besides, the proposed multi-task learning network is capable of exploring the relationship between structure and function from OCT images and visual field measurements simultaneously, which boosts classification performance. Essentially, we are the first to unify structure analysis and function regression for glaucoma screening. It is also worth noting that we build the largest glaucoma OCT image dataset, involving 4877 volumes, to develop and evaluate the proposed method. Extensive experiments demonstrate that our framework outperforms the baseline methods and two glaucoma experts by a large margin, achieving 93.2%, 93.2% and 97.8% on accuracy, F1 score and AUC, respectively.

Xi Wang, Hao Chen, Luyang Luo, An-ran Ran, Poemen P. Chan, Clement C. Tham, Carol Y. Cheung, Pheng-Ann Heng
Evaluation of Retinal Image Quality Assessment Networks in Different Color-Spaces

Retinal image quality assessment (RIQA) is essential for controlling the quality of retinal imaging and guaranteeing the reliability of diagnoses by ophthalmologists or automated analysis systems. Existing RIQA methods focus on the RGB color-space and are developed on small datasets with binary quality labels (i.e., 'Accept' and 'Reject'). In this paper, we first re-annotate an Eye-Quality (EyeQ) dataset with 28,792 retinal images from the EyePACS dataset, based on a three-level quality grading system (i.e., 'Good', 'Usable' and 'Reject') for evaluating RIQA methods. Our RIQA dataset is characterized by its large scale, multi-level grading, and multi-modality. Then, we analyze the influence of different color-spaces on RIQA, and propose a simple yet efficient deep network, named Multiple Color-space Fusion Network (MCF-Net), which integrates the different color-space representations at both the feature level and the prediction level to predict image quality grades. Experiments on our EyeQ dataset show that MCF-Net achieves state-of-the-art performance, outperforming other deep learning methods. Furthermore, we also evaluate diabetic retinopathy (DR) detection methods on images of different quality, and demonstrate that the performance of automated diagnostic systems is highly dependent on image quality.

Huazhu Fu, Boyang Wang, Jianbing Shen, Shanshan Cui, Yanwu Xu, Jiang Liu, Ling Shao
3D Surface-Based Geometric and Topological Quantification of Retinal Microvasculature in OCT-Angiography via Reeb Analysis

3D optical coherence tomography angiography (OCT-A) is a novel, non-invasive imaging modality for studying important retina-related diseases. Current work has mainly focused on microvascular analysis of 2D en-face OCT-A projections, while direct 3D analysis using the rich depth-resolved microvascular information is rarely considered. In this work, we set up an innovative 3D microvascular modeling framework via Reeb analysis to explore rich geometric and topological information. We first use effective vessel extraction and surface reconstruction techniques to establish a complete 3D mesh representation of the retinal OCT-A microvasculature. We propose to use geodesic distance as a feature function to build level contours with smooth transitions on the mesh surface. Intrinsic Reeb graphs are thereby constructed through level contours to represent the general OCT-A microvascular topology. Afterwards, specific geometric and topological analyses are performed on the Reeb graphs to quantify critical microvascular characteristics. The proposed Reeb analysis framework is evaluated on a clinical DR dataset and shows great advantages in describing 3D microvascular changes. It is able to produce important surface-based microvascular biomarkers with high statistical power for disease studies.

Jiong Zhang, Amir H. Kashani, Yonggang Shi
Limited-Angle Diffuse Optical Tomography Image Reconstruction Using Deep Learning

Diffuse optical tomography (DOT) leverages near-infrared light propagation through in vivo tissue to assess its optical properties and identify abnormalities such as cancerous lesions. While this relatively new optical imaging modality is cost-effective and non-invasive, its inverse problem (i.e., recovering an image from raw signal measurements) is ill-posed, due to the highly diffusive nature of light propagation in biological tissues and limited boundary measurements. Solving the inverse problem becomes even more challenging in the case of limited-angle data acquisition, given the restricted number of sources and sensors, the sparsity of the recovered information, and the presence of noise, representative of real-world acquisition environments. Traditional optimization-based reconstruction methods are computationally intensive and thus too slow for real-time imaging applications. We propose a novel image reconstruction method for breast cancer DOT imaging. Our method is highlighted by two components: (i) a deep learning network with a novel hybrid loss, and (ii) a distribution transfer learning module. Our model is designed to focus on lesion-specific information and small reconstruction details to reduce reconstruction loss and lesion localization errors. The transfer learning module alleviates the need for real training data by taking advantage of cross-domain learning. Both quantitative and qualitative results demonstrate that the proposed method's accuracy surpasses that of existing methods in detecting tissue abnormalities.

Hanene Ben Yedder, Majid Shokoufi, Ben Cardoen, Farid Golnaraghi, Ghassan Hamarneh
Data-Driven Enhancement of Blurry Retinal Images via Generative Adversarial Networks

In this paper, we aim to improve the quality of blurry retinal images caused by ocular diseases. Blurry images can affect clinical diagnosis by both ophthalmologists and automated aided systems. Inspired by the great success of generative adversarial networks, we propose a data-driven approach to enhance blurry images in a weakly supervised manner. That is, instead of paired blurry and high-quality images, our approach can be trained with two sets of unpaired images. This unpaired training setting makes our approach easily applicable, since annotated data are very limited for medical images. Compared with traditional methods, our model is an end-to-end approach without hand-designed adjustments or prior knowledge, yet it achieves superior performance on blurry images. Besides, a dynamic retinal image feature constraint is proposed to guide the generator, improving performance and avoiding over-enhancement of extremely blurry regions. Our approach works at large image resolutions, which makes it widely beneficial for clinical images.

He Zhao, Bingyu Yang, Lvchen Cao, Huiqi Li
Dual Encoding U-Net for Retinal Vessel Segmentation

Retinal vessel segmentation is an essential step for the early diagnosis of eye-related diseases, such as those caused by diabetes and hypertension. Segmentation of blood vessels requires both a sizeable receptive field and rich spatial information. In this paper, we propose a novel Dual Encoding U-Net (DEU-Net), which has two encoders: a spatial path with large kernels to preserve spatial information, and a context path with multiscale convolution blocks to capture more semantic information. On top of the two paths, we introduce a feature fusion module to combine the different levels of feature representation. Besides, we apply channel attention to select useful feature maps in the skip connections. Furthermore, low-level and high-level predictions are combined in a multiscale prediction module for better accuracy. We evaluated this model on the digital retinal images for vessel extraction (DRIVE) dataset and the child heart and health study (CHASEDB1) dataset. Results show that the proposed DEU-Net model achieves state-of-the-art retinal vessel segmentation accuracy on both datasets.

Bo Wang, Shuang Qiu, Huiguang He
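The abstract mentions channel attention on the skip connections but does not give its exact form. A common squeeze-and-excitation style gate matching the description of "selecting useful feature maps" looks like this; a minimal PyTorch sketch, not the authors' exact module:

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style gate for re-weighting skip-connection features."""

    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                          # squeeze: global context per channel
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),                                     # per-channel weights in (0, 1)
        )

    def forward(self, skip: torch.Tensor) -> torch.Tensor:
        return skip * self.gate(skip)                         # emphasize useful feature maps

# usage inside a U-Net style decoder stage
skip = torch.randn(1, 64, 56, 56)
attended = ChannelAttention(64)(skip)   # same shape, channels re-weighted
```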
A Deep Learning Design for Improving Topology Coherence in Blood Vessel Segmentation

The segmentation of blood vessels in medical images has been heavily studied, given its impact on several clinical practices. Deep Learning methods have been applied to the supervised segmentation of blood vessels, mainly retinal ones, due to the availability of manual annotations. Despite their success, they typically minimize the Binary Cross Entropy loss, which does not penalize topological mistakes. These errors are relevant in graph-like structures such as blood vessel trees, since a missing segment or an inadequate merging or splitting of branches may severely change the topology of the network and put at risk the extraction of vessel pathways and their characterization. In this paper, we propose an end-to-end network design comprising a cascade of a typical segmentation network and a Variational Auto-Encoder which, by learning a rich but compact latent space, is able to correct many topological incoherences. Our experiments on three of the most commonly used retinal databases, DRIVE, STARE, and CHASEDB1, show that the proposed model effectively learns representations inducing better segmentations in terms of topology, without hurting the usual pixel-wise metrics. The implementation is available at https://github.com/rjtaraujo/dvae-refiner .

Ricardo J. Araújo, Jaime S. Cardoso, Hélder P. Oliveira
Boundary and Entropy-Driven Adversarial Learning for Fundus Image Segmentation

Accurate segmentation of the optic disc (OD) and cup (OC) in fundus images from different datasets is critical for glaucoma disease screening. The cross-domain discrepancy (domain shift) hinders the generalization of deep neural networks to different domain datasets. In this work, we present an unsupervised domain adaptation framework, called Boundary and Entropy-driven Adversarial Learning (BEAL), to improve OD and OC segmentation performance, especially in the ambiguous boundary regions. In particular, our proposed BEAL framework utilizes adversarial learning to encourage the boundary prediction and the mask probability entropy map (uncertainty map) of the target domain to be similar to those of the source domain, generating more accurate boundaries and suppressing high-uncertainty predictions of OD and OC segmentation. We evaluate the proposed BEAL framework on two public retinal fundus image datasets (Drishti-GS and RIM-ONE-r3), and the experimental results demonstrate that our method outperforms the state-of-the-art unsupervised domain adaptation methods. Our code is available at https://github.com/EmmaW8/BEAL .

Shujun Wang, Lequan Yu, Kang Li, Xin Yang, Chi-Wing Fu, Pheng-Ann Heng
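The "mask probability entropy map (uncertainty map)" that BEAL aligns adversarially is the standard pixel-wise Shannon entropy of the softmax output; a short sketch (the adversarial discriminators themselves are omitted):

```python
import torch

def entropy_map(probs: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Pixel-wise prediction entropy.

    probs: (batch, classes, H, W) softmax probabilities.
    Returns: (batch, H, W) entropy per pixel; high values mark ambiguous regions.
    """
    return -(probs * torch.log(probs + eps)).sum(dim=1)

probs = torch.softmax(torch.randn(1, 3, 64, 64), dim=1)  # e.g., background/OD/OC
uncertainty = entropy_map(probs)                          # (1, 64, 64)
```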
Unsupervised Ensemble Strategy for Retinal Vessel Segmentation

Retinal vessel segmentation is a fundamental step in retinal image analysis for diagnosis. Though many segmentation methods have been proposed, little research considers how to ensemble their results to fully exploit the advantages of each method. In this work, we propose a novel unsupervised ensemble strategy that automatically combines multiple segmentation results into an accurate one. At its core is a no-reference network that assesses vessel segmentation quality without knowing the ground truth. We then optimize the weights of the individual results to maximize this segmentation quality score and enhance the final result. Through extensive experiments, our method shows superior performance over the state-of-the-art on the DRIVE, STARE and CHASE_DB1 datasets.

Bo Liu, Lin Gu, Feng Lu
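One way to realize the described weight optimization is to treat the fusion weights as learnable parameters and ascend the quality score by gradient; a sketch under that assumption, where `score_fn` is a stand-in for the paper's no-reference quality network:

```python
import torch

def ensemble_weights(seg_maps: torch.Tensor, score_fn, steps: int = 100):
    """Learn convex fusion weights over candidate segmentations by maximizing
    a no-reference quality score.

    seg_maps: (n_methods, H, W) probability maps from different methods.
    """
    logits = torch.zeros(seg_maps.shape[0], requires_grad=True)
    opt = torch.optim.Adam([logits], lr=0.1)
    for _ in range(steps):
        weights = torch.softmax(logits, dim=0)                # weights sum to one
        fused = (weights[:, None, None] * seg_maps).sum(dim=0)
        loss = -score_fn(fused)                               # ascend the quality score
        opt.zero_grad()
        loss.backward()
        opt.step()
    return torch.softmax(logits, dim=0).detach()

# toy stand-in score that rewards confident, near-binary maps
score_fn = lambda seg: -(seg * (1 - seg)).mean()
weights = ensemble_weights(torch.rand(3, 64, 64), score_fn)   # (3,) fusion weights
```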
Fully Convolutional Boundary Regression for Retina OCT Segmentation

A major goal of analyzing retinal optical coherence tomography (OCT) images is retinal layer segmentation. Accurate automated algorithms for segmenting smooth continuous layer surfaces with the correct hierarchy (topology) are desired for monitoring disease progression. State-of-the-art methods use a trained classifier to label each pixel as background, layer, or surface. The final step of extracting the desired smooth surfaces with correct topology is mostly performed by graph methods (e.g., shortest path, graph cut). However, manually building a graph with constraints that vary by retinal region and pathology, and solving the minimization with specialized algorithms, degrades the flexibility and time efficiency of the whole framework. In this paper, we directly model the distribution of surface positions using a deep network with a fully differentiable soft argmax to obtain smooth, continuous surfaces in a single feed-forward operation. A special topology module is used in the deep network in both the training and testing stages to guarantee the surface topology. An extra output branch of the deep network is also used for predicting lesions and layers in a pixel-wise labeling scheme. The proposed method was evaluated on two publicly available data sets of healthy controls, subjects with multiple sclerosis, and diabetic macular edema; it achieves state-of-the-art sub-pixel results.

Yufan He, Aaron Carass, Yihao Liu, Bruno M. Jedynak, Sharon D. Solomon, Shiv Saidha, Peter A. Calabresi, Jerry L. Prince
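The fully differentiable soft argmax mentioned in the abstract can be written compactly as a softmax over candidate depth positions followed by an expectation. A minimal PyTorch sketch; the tensor layout and the temperature `tau` are illustrative choices, not taken from the paper:

```python
import torch

def soft_argmax_1d(logits: torch.Tensor, tau: float = 1.0) -> torch.Tensor:
    """Differentiable surface-position estimate along each A-scan column.

    logits: (batch, width, depth) unnormalized scores per depth position.
    Returns: (batch, width) continuous surface depth per column.
    """
    probs = torch.softmax(logits / tau, dim=-1)               # per-column distribution
    positions = torch.arange(logits.shape[-1], dtype=logits.dtype,
                             device=logits.device)            # candidate depths 0..D-1
    return (probs * positions).sum(dim=-1)                    # expected depth

# toy usage: a network head would produce `logits`; here random values
logits = torch.randn(2, 128, 496, requires_grad=True)
surface = soft_argmax_1d(logits)        # (2, 128), fully differentiable
surface.mean().backward()               # gradients flow back to the logits
```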
PM-Net: Pyramid Multi-label Network for Joint Optic Disc and Cup Segmentation

Accurate segmentation of the optic disc (OD) and optic cup (OC) is a fundamental task in fundus image analysis. Most existing methods focus on segmenting the OD and OC inside the optic nerve head (ONH) area but pay little attention to accurate ONH localization. In this paper, we propose a Mask-RCNN based paradigm to localize the ONH and jointly segment the OD and OC in a whole fundus image. However, directly using Mask-RCNN faces some critical issues. First, for some glaucoma cases, the high overlap between OD and OC may lead to missing OC proposals. Second, some proposals may not fully surround the object, so the segmentation can be incomplete. Last, the instance head in Mask-RCNN cannot easily incorporate priors such as the fact that the OC lies inside the OD. To address these issues, we first propose a segmentation based region proposal network (RPN) to improve the accuracy of proposals, and then propose a pyramid RoIAlign module to aggregate multi-level information for a better feature representation. Furthermore, we employ a multi-label head strategy to incorporate the prior for better performance. Extensive experiments verify our method.

Pengshuai Yin, Qingyao Wu, Yanwu Xu, Huaqing Min, Ming Yang, Yubing Zhang, Mingkui Tan
Biological Age Estimated from Retinal Imaging: A Novel Biomarker of Aging

Biological age (BA) is widely used as a biomarker of aging that objectively reflects individual differences underlying the aging process. Recently, a new type of BA, the 'brain age' predicted from brain neuroimaging, has proven to be a novel and effective biomarker of aging. The retina is considered to share anatomical and physiological similarities with the brain, and rich information related to aging can be visualized non-invasively through retinal imaging. However, very few studies have explored BA estimation from retinal imaging. In this paper, we conducted a pilot study to explore the potential of using fundus images to estimate BA. Modeling BA estimation as a multi-class classification problem, we developed a convolutional neural network (CNN)-based classifier using 12,000 fundus images from healthy subjects. An image detail enhancement method was introduced to enhance global anatomical and physiological features. A joint loss function with label distribution and error tolerance was proposed to improve the model's ability to learn the time-continuous nature of aging within an acceptable range of ambiguity. The proposed methods were evaluated on healthy subjects from a clinical dataset using the VGG-19 network. The optimal model achieved a mean absolute error of 3.73 years, outperforming existing 'brain age' models. An additional individual-based validation was conducted on another real-world dataset, which showed an increasing BA difference between healthy and unhealthy subjects with aging. The results of our study indicate that retinal imaging–based BA could potentially be used as a novel candidate biomarker of aging.

Chi Liu, Wei Wang, Zhixi Li, Yu Jiang, Xiaotong Han, Jason Ha, Wei Meng, Mingguang He
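The joint loss with label distribution and error tolerance is not spelled out in the abstract. A common way to encode the time-continuous nature of aging is to soften the one-hot age label into a Gaussian over neighbouring age bins and minimize a KL divergence; a sketch under that assumption (the bin count and sigma are illustrative):

```python
import torch
import torch.nn.functional as F

def label_distribution_loss(logits: torch.Tensor, ages: torch.Tensor,
                            sigma: float = 2.0) -> torch.Tensor:
    """KL divergence between the predicted age distribution and a Gaussian
    centred on the true age, tolerating small errors instead of penalizing
    every off-by-one prediction as a full misclassification.

    logits: (batch, num_ages) classifier outputs, one bin per year.
    ages:   (batch,) integer chronological ages.
    """
    num_ages = logits.shape[1]
    bins = torch.arange(num_ages, dtype=logits.dtype, device=logits.device)
    target = torch.exp(-(bins[None, :] - ages[:, None].to(logits.dtype)) ** 2
                       / (2 * sigma ** 2))
    target = target / target.sum(dim=1, keepdim=True)   # normalized label distribution
    log_pred = F.log_softmax(logits, dim=1)
    return F.kl_div(log_pred, target, reduction="batchmean")

loss = label_distribution_loss(torch.randn(4, 80), torch.tensor([25, 40, 63, 58]))
```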
Task Adaptive Metric Space for Medium-Shot Medical Image Classification

In the medical domain, one challenge of deep learning is to build sample-efficient models from a small number of labeled data. In recent years, meta-learning has become an important approach to few-shot image classification. However, current research on meta-learning focuses on learning from a few examples; we propose to extend few-shot learning to medium-shot to evaluate medical classification tasks in a more realistic setup. We build a baseline evaluation procedure by analyzing two representative meta-learning methods through the lens of the bias-variance tradeoff, and propose to fuse the two techniques for a better bias-variance equilibrium. The proposed method, Task Adaptive Metric Space (TAMS), fine-tunes the parameters of a metric space to represent medical data in a more semantically meaningful way. Our empirical studies suggest that TAMS outperforms other baselines. Visualizations of the metric space show that TAMS leads to better-separated clusters. Our baselines and the evaluation procedure of the proposed TAMS open the door to more research on medium-shot medical image classification.

Xiang Jiang, Liqiang Ding, Mohammad Havaei, Andrew Jesson, Stan Matwin
Two-Stream CNN with Loose Pair Training for Multi-modal AMD Categorization

This paper studies automated categorization of age-related macular degeneration (AMD) given a multi-modal input, which consists of a color fundus image and an optical coherence tomography (OCT) image from a specific eye. Previous work uses a traditional method comprising feature extraction and classifier training, which cannot be optimized jointly. By contrast, we propose a two-stream convolutional neural network (CNN) that is end-to-end. The CNN's fusion layer is tailored to the need of fusing information from the fundus and OCT streams. To generate more multi-modal training instances, we introduce Loose Pair training, where a fundus image and an OCT image are paired based on class labels rather than eyes. Moreover, for a visual interpretation of how the individual modalities contribute, we extend the class activation mapping technique to the multi-modal scenario. Experiments on a real-world dataset collected from an outpatient clinic justify the viability of our proposal for multi-modal AMD categorization.

Weisen Wang, Zhiyan Xu, Weihong Yu, Jianchun Zhao, Jingyuan Yang, Feng He, Zhikun Yang, Di Chen, Dayong Ding, Youxin Chen, Xirong Li
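Loose Pair training, as described, amounts to sampling cross-modal pairs by class label rather than by eye, which multiplies the number of multi-modal training instances. A minimal sketch with hypothetical file names and label strings:

```python
import random
from collections import defaultdict

def loose_pairs(fundus_samples, oct_samples, n_pairs):
    """Pair fundus and OCT images by class label rather than by eye.

    fundus_samples / oct_samples: lists of (image_path, label) tuples.
    Returns a list of (fundus_path, oct_path, label) training instances.
    """
    oct_by_label = defaultdict(list)
    for path, label in oct_samples:
        oct_by_label[label].append(path)

    pairs = []
    for _ in range(n_pairs):
        f_path, label = random.choice(fundus_samples)
        o_path = random.choice(oct_by_label[label])   # any OCT sharing the label
        pairs.append((f_path, o_path, label))
    return pairs

# toy usage with hypothetical file names
fundus = [("f1.jpg", "AMD"), ("f2.jpg", "normal")]
octs = [("o1.png", "AMD"), ("o2.png", "AMD"), ("o3.png", "normal")]
print(loose_pairs(fundus, octs, 4))
```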
Deep Multi-label Classification in Affine Subspaces

Multi-label classification (MLC) problems are becoming increasingly popular in the context of medical imaging. This has in part been driven by the fact that acquiring annotations for MLC is far less burdensome than for semantic segmentation, yet provides more expressiveness than multi-class classification. However, to train MLCs, most methods have resorted to objective functions similar to those of traditional multi-class classification settings. We show in this work that such approaches are not optimal and instead propose a novel deep MLC classification method in affine subspaces. At its core, the method attempts to pull features of class labels towards different affine subspaces while maximizing the distance between them. We evaluate the method using two MLC medical imaging datasets and show a large performance increase compared to previous multi-label frameworks. This method can be seen as a plug-in replacement loss function and is trainable in an end-to-end fashion.

Thomas Kurmann, Pablo Márquez-Neila, Sebastian Wolf, Raphael Sznitman
Multi-scale Microaneurysms Segmentation Using Embedding Triplet Loss

Deep learning techniques have recently been used in fundus image analysis and diabetic retinopathy detection. Microaneurysms are an important indicator of diabetic retinopathy progression. We introduce a two-stage deep learning approach for microaneurysm segmentation using multiple scales of the input, with selective sampling and an embedding triplet loss. The model first segments at two scales, and the segmentations are then refined with a classification model. To enhance the discriminative power of the classification model, we incorporate a triplet embedding loss with a selective sampling routine. The model is evaluated quantitatively to assess the segmentation performance and qualitatively to analyze the model predictions. This approach achieves a 30.29% relative improvement over the fully convolutional neural network.

Mhd Hasan Sarhan, Shadi Albarqouni, Mehmet Yigitsoy, Nassir Navab, Abouzar Eslami
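The embedding triplet loss pulls same-class patch embeddings together and pushes different-class ones apart. A minimal sketch using PyTorch's built-in margin formulation; the paper's selective sampling routine, which chooses informative triplets, is not reproduced here:

```python
import torch
import torch.nn as nn

# anchor and positive share a class (e.g., two microaneurysm patch embeddings);
# the negative comes from the other class
triplet_loss = nn.TripletMarginLoss(margin=1.0, p=2)

emb_anchor = torch.randn(16, 128, requires_grad=True)   # embeddings from the model
emb_positive = torch.randn(16, 128)
emb_negative = torch.randn(16, 128)

loss = triplet_loss(emb_anchor, emb_positive, emb_negative)
loss.backward()   # pulls positives toward anchors, pushes negatives away
```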
A Divide-and-Conquer Approach Towards Understanding Deep Networks

Deep neural networks have achieved tremendous success in various fields including medical image segmentation. However, they have long been criticized for being black boxes, in that interpreting, understanding and correcting architectures is difficult, as there is no general theory for deep neural network design. Previously, precision learning was proposed to fuse deep architectures and traditional approaches. Deep networks constructed in this way benefit from the original known operators, have fewer parameters, and offer improved interpretability. However, they do not yield state-of-the-art performance in all applications. In this paper, we propose to analyze deep networks using known operators by adopting a divide-and-conquer strategy to replace network components whilst retaining network performance. The task of retinal vessel segmentation is investigated for this purpose. We start with a high-performance U-Net and show by step-by-step conversion that we are able to divide the network into modules of known operators. The results indicate that a combination of a trainable guided filter and a trainable version of the Frangi filter yields performance at the level of U-Net (AUC 0.974 vs. 0.972) with a tremendous reduction in parameters (111,536 vs. 9,575). In addition, the trained layers can be mapped back to their original algorithmic interpretation and analyzed using standard tools of signal processing.

Weilin Fu, Katharina Breininger, Roman Schaffert, Nishant Ravikumar, Andreas Maier
Multiclass Segmentation as Multitask Learning for Drusen Segmentation in Retinal Optical Coherence Tomography

Automated drusen segmentation in retinal optical coherence tomography (OCT) scans is relevant for understanding age-related macular degeneration (AMD) risk and progression. This task is usually performed by segmenting the top and bottom anatomical interfaces that define drusen: the outer boundary of the retinal pigment epithelium (OBRPE) and Bruch's membrane (BM), respectively. In this paper we propose a novel multi-decoder architecture that tackles drusen segmentation as a multitask problem. Instead of training a multiclass model for OBRPE/BM segmentation, we use one decoder per target class and an extra one for the area between the layers. We also introduce connections between each class-specific branch and the additional decoder to increase the regularization effect of this surrogate task. We validated our approach on a private data set of 166 early/intermediate AMD Spectralis volumes and a public data set of 200 AMD and control Bioptigen OCT volumes. Our method consistently outperformed several baselines in both layer and drusen segmentation evaluations.

Rhona Asgari, José Ignacio Orlando, Sebastian Waldstein, Ferdinand Schlanitz, Magdalena Baratsits, Ursula Schmidt-Erfurth, Hrvoje Bogunović
Active Appearance Model Induced Generative Adversarial Network for Controlled Data Augmentation

Data augmentation is an important strategy for enlarging training datasets in deep learning-based medical image analysis. This is because large, annotated medical datasets are not only difficult and costly to generate, but also quickly become obsolete due to rapid advances in imaging technology. Image-to-image conditional generative adversarial networks (C-GAN) provide a potential solution for data augmentation. However, the annotations used as inputs to C-GAN are typically based only on shape information, which can result in undesirable intensity distributions in the artificially created images. In this paper, we introduce an active cell appearance model (ACAM) that measures statistical distributions of shape and intensity, and we use this model to guide C-GAN to generate more realistic images, a combination we call A-GAN. A-GAN provides an effective means of conveying anisotropic intensity information to C-GAN, using the statistical model to determine how transformations are applied for data augmentation. Traditional data augmentation approaches based on arbitrary transformations can lead to unrealistic shape variations that are not representative of real data; A-GAN is designed to ameliorate this. To validate the effectiveness of A-GAN for data augmentation, we assessed its performance on cell analysis in adaptive optics retinal imaging, a rapidly changing medical imaging modality. Compared to C-GAN, A-GAN achieved stability in fewer iterations. The cell detection and segmentation accuracy when assisted by A-GAN augmentation was higher than that achieved with C-GAN. These findings demonstrate the potential of A-GAN to substantially improve existing data augmentation methods in medical image analysis.

Jianfei Liu, Christine Shen, Tao Liu, Nancy Aguilera, Johnny Tam
Biomarker Localization by Combining CNN Classifier and Generative Adversarial Network

This paper proposes a novel deep neural network architecture to effectively localize potential biomarkers in medical images, when only the image-level labels are available during model training. The proposed architecture combines a CNN classifier and a generative adversarial network (GAN) in a novel way, such that the CNN classifier and the discriminator in the GAN can effectively help the encoder-decoder in the GAN to remove biomarkers. Biomarkers in abnormal images can then be easily localized and segmented by subtracting the output of the encoder-decoder from its original input. The proposed approach was evaluated on diabetic retinopathy images with real biomarkers and on skin images with simulated biomarkers, showing state-of-the-art performance in localizing biomarkers even if biomarkers are irregularly scattered and are of various sizes in images.

Rong Zhang, Shuhan Tan, Ruixuan Wang, Siyamalan Manivannan, Jingjing Chen, Haotian Lin, Wei-Shi Zheng
Probabilistic Atlases to Enforce Topological Constraints

Probabilistic atlases (PAs) have long been used in standard segmentation approaches and, more recently, in conjunction with Convolutional Neural Networks (CNNs). However, their use has been restricted to relatively standardized structures such as the brain or heart, which have a limited or predictable range of deformations. Here we propose an encoding-decoding CNN architecture that can exploit rough atlases encoding only the topology of the target structures, which can appear in any pose and have arbitrarily complex shapes, to improve segmentation results. It relies on the output of the encoder to compute both the pose parameters used to deform the atlas and the segmentation mask itself, which makes it effective and end-to-end trainable.

Udaranga Wickramasinghe, Graham Knott, Pascal Fua
Synapse-Aware Skeleton Generation for Neural Circuits

Reconstructed terabyte and petabyte electron microscopy image volumes contain fully-segmented neurons at resolutions fine enough to identify every synaptic connection. After manual or automatic reconstruction, neuroscientists want to extract wiring diagrams and connectivity information to analyze the data at a higher level. Despite significant advances in image acquisition, neuron segmentation, and synapse detection techniques, the extracted wiring diagrams are still quite coarse, and often do not take into account the wealth of information in the densely reconstructed volumes. We propose a synapse-aware skeleton generation strategy to transform the reconstructed volumes into an information-rich yet abstract format on which neuroscientists can perform biological analysis and run simulations. Our method extends existing topological thinning strategies and guarantees a one-to-one correspondence between skeleton endpoints and synapses while simultaneously generating vital geometric statistics on the neuronal processes. We demonstrate our results on three large-scale connectomic datasets and compare against current state-of-the-art skeletonization algorithms.

Brian Matejek, Donglai Wei, Xueying Wang, Jinglin Zhao, Kálmán Palágyi, Hanspeter Pfister
Seeing Under the Cover: A Physics Guided Learning Approach for In-bed Pose Estimation

Human in-bed pose estimation has huge practical value in medical and healthcare applications, yet it still mainly relies on expensive pressure mapping (PM) solutions. In this paper, we introduce a novel physics-inspired, vision-based approach that addresses the challenging issues associated with in-bed pose estimation, including monitoring a fully covered person in complete darkness. We reformulate this problem using our proposed Under the Cover Imaging via Thermal Diffusion (UCITD) method, which captures high-resolution pose information of the body even when it is fully covered, by using a long-wavelength IR technique. We propose a physical hyperparameter concept through which we obtain high-quality ground-truth pose labels in different modalities. A fully annotated in-bed pose dataset called Simultaneously-collected multimodal Lying Pose (SLP) is also formed and released, of the same order of magnitude as most existing large-scale human pose datasets, to support the training and evaluation of complex models. A network trained from scratch on it and tested in two diverse settings, one in a living room and the other in a hospital room, showed pose estimation performance of 98.0% and 96.0% under the PCK0.2 standard, respectively. Moreover, in a multi-factor comparison with a state-of-the-art PM-based in-bed pose monitoring solution, our solution showed significant superiority in all practical aspects, being 60 times cheaper and 300 times smaller, while having higher pose recognition granularity and accuracy.

Shuangjun Liu, Sarah Ostadabbas
EDA-Net: Dense Aggregation of Deep and Shallow Information Achieves Quantitative Photoacoustic Blood Oxygenation Imaging Deep in Human Breast

Accurately and quantitatively imaging blood oxygen saturation (sO2) is a very meaningful application of photoacoustic tomography (PAT), as sO2 is an important indicator for assessing physiological disease and assisting cancer diagnosis and treatment. Yet, due to the complex optical properties of heterogeneous biological tissues, diffusely scattered light in the tissue undergoes unknown wavelength-dependent optical attenuation, causing an uncertain fluence distribution that fundamentally limits the quantification accuracy of PAT for imaging sO2. To tackle this problem, we propose an architecture, named EDA-Net, with Encoder, Decoder and Aggregator, which aggregates features for a richer representation. We argue that the densely aggregated information helps to extract comprehensive context information from multi-wavelength PA images and thereby accurately infer the quantitative distribution of sO2. Numerical experiments are performed using PA images obtained by Monte Carlo optical preprocessing and k-Wave acoustic preprocessing, based on a clinically obtained female breast phantom. We also explore the effect of different wavelength combinations on the accuracy of estimating sO2, to guide the design of PA imaging systems that meet clinical needs. The experimental results demonstrate the efficacy and robustness of our proposed method, and comparisons with other methods further confirm the reliability of our quantitative sO2 results.

Changchun Yang, Fei Gao
Fused Detection of Retinal Biomarkers in OCT Volumes

Optical Coherence Tomography (OCT) is the primary imaging modality for detecting pathological biomarkers associated with retinal diseases such as Age-Related Macular Degeneration. In practice, clinical diagnosis and treatment strategies are closely linked to the biomarkers visible in OCT volumes, and the ability to identify these plays an important role in the development of ophthalmic pharmaceutical products. In this context, we present a method that automatically predicts the presence of biomarkers in OCT cross-sections by incorporating information from the entire volume. We do so by adding a bidirectional LSTM to fuse the outputs of a Convolutional Neural Network that predicts individual biomarkers. We thus avoid the need for pixel-wise annotations to train our method, while still providing fine-grained biomarker information. On a dataset of 416 volumes, we show that our approach imposes coherence between biomarker predictions across volume slices and that our predictions are superior to several existing approaches.

Thomas Kurmann, Pablo Márquez-Neila, Siqing Yu, Marion Munk, Sebastian Wolf, Raphael Sznitman
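The described fusion, per-slice CNN features passed through a bidirectional LSTM over the volume, can be sketched as follows; the slice encoder, feature sizes, and biomarker count are illustrative stand-ins, not the paper's architecture:

```python
import torch
import torch.nn as nn

class VolumeBiomarkerNet(nn.Module):
    """Per-slice CNN features fused across the volume with a bidirectional LSTM,
    yielding biomarker presence logits for every cross-section."""

    def __init__(self, feat_dim: int = 256, hidden: int = 128, n_biomarkers: int = 11):
        super().__init__()
        self.cnn = nn.Sequential(                    # stand-in for the slice encoder
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, feat_dim),
        )
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, n_biomarkers)

    def forward(self, volume: torch.Tensor) -> torch.Tensor:
        b, s, h, w = volume.shape                    # (batch, slices, H, W)
        feats = self.cnn(volume.reshape(b * s, 1, h, w)).reshape(b, s, -1)
        fused, _ = self.lstm(feats)                  # context from neighbouring slices
        return self.head(fused)                      # (batch, slices, n_biomarkers)

logits = VolumeBiomarkerNet()(torch.randn(2, 49, 224, 224))
```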
Vessel-Net: Retinal Vessel Segmentation Under Multi-path Supervision

Due to the complex morphology of fine vessels, it remains challenging for most existing models to accurately segment them, particularly the capillaries in color fundus retinal images. In this paper, we propose a novel and lightweight deep learning model called Vessel-Net for retinal vessel segmentation. First, we design an efficient inception-residual convolutional block that combines the advantages of the Inception model and the residual module for improved feature representation. Next, we embed the inception-residual blocks inside a U-like encoder-decoder architecture for vessel segmentation. Then, we introduce four supervision paths, including the traditional supervision path, a richer feature supervision path, and two multi-scale supervision paths, to preserve the rich multi-scale deep features during model optimization. We evaluated Vessel-Net against several recent methods on two benchmark retinal databases and achieved new state-of-the-art performance (i.e., AUC of 98.21%/98.60% on the DRIVE and CHASE databases, respectively). Our ablation studies also demonstrate that the proposed inception-residual block and the multi-path supervision both produce impressive performance gains for retinal vessel segmentation.

Yicheng Wu, Yong Xia, Yang Song, Donghao Zhang, Dongnan Liu, Chaoyi Zhang, Weidong Cai
Ki-GAN: Knowledge Infusion Generative Adversarial Network for Photoacoustic Image Reconstruction In Vivo

Photoacoustic computed tomography (PACT) breaks through the depth restriction of optical imaging and the contrast restriction of ultrasound imaging by receiving thermoelastically induced ultrasound signals triggered by an ultrashort laser pulse. Photoacoustic (PA) images are usually reconstructed from the raw PA signals with conventional algorithms, e.g., filtered back-projection. However, the performance of conventional reconstruction algorithms is usually limited by complex and uncertain physical parameters due to heterogeneous tissue structure. In recent years, deep learning has shown great potential for the reconstruction problem. In this work, for the first time to the best of our knowledge, we propose to infuse classical signal processing and certified knowledge into deep learning for PA image reconstruction. Specifically, we propose a novel Knowledge Infusion Generative Adversarial Network (Ki-GAN) architecture that combines the conventional delay-and-sum algorithm to reconstruct PA images. We train the network on a public clinical database. Our method shows better image reconstruction performance in cases of both full-sampled and sparse-sampled data compared with state-of-the-art methods. Lastly, our proposed approach also shows high potential for other imaging modalities beyond PACT.

Hengrong Lan, Kang Zhou, Changchun Yang, Jun Cheng, Jiang Liu, Shenghua Gao, Fei Gao
Uncertainty Guided Semi-supervised Segmentation of Retinal Layers in OCT Images

Deep convolutional neural networks have shown outstanding performance in medical image segmentation tasks. The usual problem when training supervised deep learning methods is the lack of labeled data, which is time-consuming and costly to obtain. In this paper, we propose a novel uncertainty-guided semi-supervised learning method, based on a student-teacher approach, for training a segmentation network using limited labeled samples and a large number of unlabeled images. First, a teacher segmentation model is trained from the labeled samples using Bayesian deep learning. The trained model is used to generate soft segmentation labels and an uncertainty map for the unlabeled set. The student model is then updated using the softly segmented samples and the corresponding pixel-wise confidence in the segmentation quality, estimated from the uncertainty of the teacher model using a newly designed loss function. Experimental results on a retinal layer segmentation task show that the proposed method improves segmentation performance in comparison to the fully supervised approach and is on par with the expert annotator. The proposed semi-supervised segmentation framework is a key contribution and is applicable to biomedical image segmentation across various imaging modalities where access to annotated medical images is challenging.

Suman Sedai, Bhavna Antony, Ravneet Rai, Katie Jones, Hiroshi Ishikawa, Joel Schuman, Wollstein Gadi, Rahil Garnavi
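The abstract's pipeline, an MC-dropout teacher producing soft labels plus an uncertainty map and a student trained with confidence-weighted supervision, can be sketched as below. The exact published loss may differ; here confidence is taken as one minus normalized predictive entropy, which is an assumption:

```python
import torch
import torch.nn.functional as F

def mc_dropout_teacher(teacher, image, n_samples: int = 10):
    """Monte-Carlo dropout inference: soft labels plus a pixel-wise uncertainty map."""
    teacher.train()                       # keep dropout layers active at inference
    with torch.no_grad():
        probs = torch.stack([torch.softmax(teacher(image), dim=1)
                             for _ in range(n_samples)])
    mean = probs.mean(dim=0)                                    # soft labels (B,C,H,W)
    uncertainty = -(mean * torch.log(mean + 1e-8)).sum(dim=1)   # predictive entropy
    return mean, uncertainty

def student_loss(student_logits, soft_labels, uncertainty, max_entropy):
    """Per-pixel cross entropy against the teacher's soft labels,
    down-weighted where the teacher is uncertain."""
    confidence = 1.0 - uncertainty / max_entropy                # (B,H,W) in [0, 1]
    ce = -(soft_labels * F.log_softmax(student_logits, dim=1)).sum(dim=1)
    return (confidence * ce).mean()
```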

Endoscopy

Frontmatter
Triple ANet: Adaptive Abnormal-aware Attention Network for WCE Image Classification

Accurate detection of abnormal regions in Wireless Capsule Endoscopy (WCE) images is crucial for early intestinal cancer diagnosis and treatment, but it remains challenging due to the relatively low contrast and ambiguous boundaries between abnormalities and normal regions. Additionally, the huge intra-class variance, along with the high degree of visual similarity shared by inter-class abnormalities, prevents the network from robust classification. To tackle these dilemmas, we propose an Adaptive Abnormal-aware Attention Network (Triple ANet) with an Adaptive Dense Block (ADB) and an Abnormal-aware Attention Module (AAM) for automatic WCE image analysis. ADB is designed to assign an attention score to each dense connection in the dense blocks and to enhance useful features, while AAM adaptively adjusts the receptive field according to the abnormal regions and helps the network pay attention to abnormalities. Moreover, we propose a novel Angular Contrastive loss (AC Loss) to reduce intra-class variance and enlarge inter-class differences effectively. Our method achieved 89.41% overall accuracy and showed better performance compared with state-of-the-art WCE image classification methods. The source code is available at https://github.com/Guo-Xiaoqing/Triple-ANet .

Xiaoqing Guo, Yixuan Yuan
Selective Feature Aggregation Network with Area-Boundary Constraints for Polyp Segmentation

Automatic polyp segmentation is considered indispensable in modern polyp screening systems. It can help clinicians accurately locate polyp areas for further diagnosis or surgery. Benefiting from the advancement of deep learning techniques, various neural networks have been developed for handling the polyp segmentation problem. However, most of these methods neither aggregate multi-scale or multi-receptive-field features nor consider area-boundary constraints. To address these issues, we propose a novel selective feature aggregation network with area and boundary constraints. The network contains a shared encoder and two mutually constrained decoders for predicting polyp areas and boundaries, respectively. Feature aggregation is achieved by (1) introducing three up-concatenations between the encoder and decoders and (2) embedding Selective Kernel Modules into convolutional layers, which can adaptively extract features from kernels of different sizes. We call these two operations Selective Feature Aggregation. Furthermore, a new boundary-sensitive loss function is proposed to take into account the dependency between the area and boundary branches, so that the two branches can influence each other reciprocally and enable more accurate area predictions. We evaluate our method on the EndoScene dataset and achieve state-of-the-art results, with a Dice of 83.08% and an Accuracy of 96.68%.

Yuqi Fang, Cheng Chen, Yixuan Yuan, Kai-yu Tong
Deep Sequential Mosaicking of Fetoscopic Videos

Twin-to-twin transfusion syndrome treatment requires fetoscopic laser photocoagulation of placental vascular anastomoses to regulate blood flow to both fetuses. The limited field-of-view (FoV) and low visual quality during fetoscopy make it challenging to identify all vascular connections. Mosaicking can align multiple overlapping images to generate an image with an increased FoV; however, existing techniques apply poorly to fetoscopy due to the low visual quality and texture paucity, and fail on longer sequences due to the drift accumulated over time. Deep learning techniques can help overcome these challenges. We therefore present a new generalized Deep Sequential Mosaicking (DSM) framework for fetoscopic videos captured in different settings, such as simulation, phantom, and real environments. DSM extends an existing deep image-based homography model to sequential data by proposing controlled data augmentation and outlier rejection methods. Unlike existing methods, DSM can handle visual variations due to specular highlights and reflections across adjacent frames, hence reducing the accumulated drift. We perform experimental validation and comparison using 5 diverse fetoscopic videos to demonstrate the robustness of our framework.

Sophia Bano, Francisco Vasconcelos, Marcel Tella Amo, George Dwyer, Caspar Gruijthuijsen, Jan Deprest, Sebastien Ourselin, Emmanuel Vander Poorten, Tom Vercauteren, Danail Stoyanov
Landmark-Guided Deformable Image Registration for Supervised Autonomous Robotic Tumor Resection

Oral squamous cell carcinoma (OSCC) is the most common cancer in the head and neck region and is associated with high morbidity and mortality rates. Surgical resection is usually the primary treatment strategy for OSCC, and maintaining effective tumor resection margins is paramount to surgical outcomes. In practice, wide tumor excisions impair post-surgical organ function, while narrow resection margins are associated with tumor recurrence. Identification and tracking of these resection margins remain challenging because they migrate and shrink due to pre-operative chemotherapy or radiation therapy, and deform intra-operatively. This paper reports a novel near-infrared (NIR) fluorescent marking and landmark-based deformable image registration (DIR) method to precisely predict deformed margins. The accuracy of DIR-predicted resection margins on porcine cadaver tongues is compared with rigid image registration and a surgeon's manual prediction. Furthermore, our tracking and registration technique is integrated into a robotic system and tested using ex vivo porcine cadaver tongues to demonstrate the feasibility of supervised autonomous tumor bed resections.

Jiawei Ge, Hamed Saeidi, Justin D. Opfermann, Arjun S. Joshi, Axel Krieger
Multi-view Learning with Feature Level Fusion for Cervical Dysplasia Diagnosis

In this paper, we propose a novel multi-view deep learning approach for cervical dysplasia diagnosis (CDD), using multi-views of image data (acetic images and iodine images) from colposcopy. In general, a major challenge to analyzing multi-view medical image data is how to effectively exploit meaningful correlations among such views. We develop a new feature level fusion (FLF) method, which captures comprehensive correlations between the acetic and iodine image views and sufficiently utilizes information from these two views. Our FLF method is based on attention mechanisms and allows one view to assist another view or allows both views to assist mutually to better facilitate feature learning. Specifically, we explore deep networks for two kinds of FLF methods, uni-directional fusion (UFNet) and bi-directional fusion (BFNet). Experimental results show that our methods are effective for characterizing features of cervical lesions and outperform known methods for CDD.

Tingting Chen, Xinjun Ma, Xuechen Liu, Wenzhe Wang, Ruiwei Feng, Jintai Chen, Chunnv Yuan, Weiguo Lu, Danny Z. Chen, Jian Wu
Real-Time Surface Deformation Recovery from Stereo Videos

Tissue deformation during surgery may significantly decrease the accuracy of surgical navigation systems. In this paper, we propose an approach to estimate the deformation of the tissue surface from stereo videos in real time, which is capable of handling occlusion, smooth surfaces and fast deformation. We first use a stereo matching method to extract depth information from stereo video frames and generate the tissue template, and then estimate the deformation of the obtained template by minimizing ICP, ORB feature matching and as-rigid-as-possible (ARAP) costs. The main novelties are twofold: (1) Due to non-rigid deformation, feature matching outliers are difficult to remove with traditional RANSAC methods; we therefore propose a novel 1-point RANSAC and reweighting method to preselect matching inliers, which handles smooth surfaces and fast deformations. (2) We propose a novel ARAP cost function based on dense connections between the control points to achieve better smoothing performance with a limited number of iterations. The algorithms are designed and implemented for GPU parallel computing. Experiments on ex vivo and in vivo data showed that this approach runs at an update rate of 15 Hz with an error of less than 2.5 mm on an NVIDIA Titan X GPU.

Haoyin Zhou, Jayender Jagadeesan
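For reference, the standard as-rigid-as-possible energy that the proposed cost builds on penalizes deviations of deformed edge vectors from locally rigid rotations:

```latex
E_{\text{ARAP}} = \sum_{i} \sum_{j \in \mathcal{N}(i)} w_{ij}
\left\| \left(\mathbf{p}'_i - \mathbf{p}'_j\right) - R_i \left(\mathbf{p}_i - \mathbf{p}_j\right) \right\|^2
```

where p_i and p'_i are the rest and deformed positions of control point i, R_i is the best-fitting local rotation, w_ij are edge weights, and N(i) is the neighbourhood of control point i, which the paper makes densely connected rather than restricted to nearby points.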

Microscopy

Frontmatter
Rectified Cross-Entropy and Upper Transition Loss for Weakly Supervised Whole Slide Image Classifier

Convolutional neural networks (CNNs) have achieved promising results in classifying histopathology images. However, most clinical data only carry label information for the whole tissue slide, and annotating every region of each tissue type is prohibitively expensive. Hence, computer-aided diagnosis of whole slide images (WSIs) is challenging because: (1) a WSI contains tissues of different types but is classified by the most malignant tissue; and (2) the gigapixel size of WSIs makes loading the whole image and end-to-end CNN training computationally infeasible. Previous works tended to classify WSIs patch-wise using the whole-slide label, overlooking one useful piece of information: it is an error to classify a patch into a higher-grade class. To address this, we propose a rectified cross-entropy loss as a combination of soft pooling and hard pooling of discriminative patches. We also introduce an upper transition loss to restrain such errors. Our experimental results on colon polyp WSIs show that the two new losses effectively guide the CNN optimization. With only WSI class information available for training, the patch-wise classification results on the testing set largely agree with human experts' domain knowledge.

Hanbo Chen, Xiao Han, Xinjuan Fan, Xiaoying Lou, Hailing Liu, Junzhou Huang, Jianhua Yao
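The abstract describes the rectified cross-entropy loss as a combination of soft pooling and hard pooling of discriminative patches; the exact pooling and mixing weights are not given. A sketch under the assumption of an equal-weight blend of a top-k (hard) and an all-patch (soft) average:

```python
import torch
import torch.nn.functional as F

def rectified_ce(patch_logits: torch.Tensor, slide_label: int,
                 k: int = 8) -> torch.Tensor:
    """Slide-level loss from patch predictions, blending hard pooling
    (top-k most confident patches) with soft pooling (all patches).

    patch_logits: (n_patches, n_classes) for one whole-slide image.
    """
    log_probs = F.log_softmax(patch_logits, dim=1)[:, slide_label]
    hard = -log_probs.topk(k).values.mean()      # most discriminative patches
    soft = -log_probs.mean()                     # every patch contributes
    return 0.5 * hard + 0.5 * soft               # equal weights are an assumption

loss = rectified_ce(torch.randn(200, 4), slide_label=2)
```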
From Whole Slide Imaging to Microscopy: Deep Microscopy Adaptation Network for Histopathology Cancer Image Classification

Deep learning (DL) has achieved remarkable performance on digital pathology image classification with whole slide images (WSIs). Unfortunately, the high acquisition costs of WSIs hinder their application in practical scenarios, and most pathologists still use microscopy images (MSIs) in their workflows. However, it is especially challenging to train DL models on MSIs, given their limited image quality and high annotation costs. Alternatively, directly applying a WSI-trained DL model to MSIs usually performs poorly due to the huge gap between WSIs and MSIs. To address these issues, we propose to exploit deep unsupervised domain adaptation to adapt DL models trained on the labeled WSI domain to the unlabeled MSI domain. Specifically, we propose a novel Deep Microscopy Adaptation Network (DMAN). By reducing domain discrepancies via adversarial learning and entropy minimization, and alleviating class imbalance with sample reweighting, DMAN can classify MSIs effectively even without MSI annotations. Extensive experiments on colon cancer diagnosis demonstrate the effectiveness of DMAN and its potential in customizing models for each pathologist's microscope.

Yifan Zhang, Hanbo Chen, Ying Wei, Peilin Zhao, Jiezhang Cao, Xinjuan Fan, Xiaoying Lou, Hailing Liu, Jinlong Hou, Xiao Han, Jianhua Yao, Qingyao Wu, Mingkui Tan, Junzhou Huang
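
The two domain-alignment ingredients named above are standard building blocks; below is a minimal sketch of both (a gradient-reversal layer for adversarial feature alignment, and entropy minimization on unlabeled MSI predictions), not the authors' DMAN code:

```python
import torch

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; negated, scaled gradient in the
    backward pass, so the feature extractor is trained to fool a
    WSI-vs-MSI domain classifier."""
    @staticmethod
    def forward(ctx, x, lamb):
        ctx.lamb = lamb
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lamb * grad_output, None

def entropy_minimization(logits):
    """Mean Shannon entropy of target-domain (MSI) predictions;
    minimizing it pushes the classifier toward confident decisions
    on unlabeled images."""
    p = logits.softmax(dim=1)
    return -(p * p.clamp_min(1e-8).log()).sum(dim=1).mean()

# usage: reversed_features = GradReverse.apply(features, 1.0)
```
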
Multi-scale Cell Instance Segmentation with Keypoint Graph Based Bounding Boxes

Most existing methods handle cell instance segmentation directly, without relying on additional detection boxes. These methods generally fail to separate touching cells due to the lack of global understanding of the objects. In contrast, box-based instance segmentation solves this problem by combining object detection with segmentation. However, existing methods typically utilize anchor box-based detectors, which leads to inferior instance segmentation performance due to the class imbalance issue. In this paper, we propose a new box-based cell instance segmentation method. In particular, we first detect the five pre-defined points of a cell via keypoint detection. Then we group these points according to a keypoint graph and subsequently extract the bounding box for each cell. Finally, cell segmentation is performed on feature maps within the bounding boxes. We validate our method on two cell datasets with distinct object shapes, and empirically demonstrate the superiority of our method compared to other instance segmentation techniques. Code is available at: https://github.com/yijingru/KG_Instance_Segmentation .

Jingru Yi, Pengxiang Wu, Qiaoying Huang, Hui Qu, Bo Liu, Daniel J. Hoeppner, Dimitris N. Metaxas
Improving Nuclei/Gland Instance Segmentation in Histopathology Images by Full Resolution Neural Network and Spatial Constrained Loss

Image segmentation plays an important role in pathology image analysis as the accurate separation of nuclei or glands is crucial for cancer diagnosis and other clinical analyses. The networks and cross entropy loss in current deep learning-based segmentation methods originate from image classification tasks and have drawbacks for segmentation. In this paper, we propose a full resolution convolutional neural network (FullNet) that maintains full resolution feature maps to improve the localization accuracy. We also propose a variance constrained cross entropy (varCE) loss that encourages the network to learn the spatial relationship between pixels in the same instance. Experiments on a nuclei segmentation dataset and the 2015 MICCAI Gland Segmentation Challenge dataset show that the proposed FullNet with the varCE loss achieves state-of-the-art performance. The code is publicly available ( https://github.com/huiqu18/FullNet-varCE ).

Hui Qu, Zhennan Yan, Gregory M. Riedlinger, Subhajyoti De, Dimitris N. Metaxas
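
The precise varCE formulation is given in the paper and its code repository; the sketch below conveys the idea by adding, to the usual cross-entropy, the variance of the predicted foreground probability within each ground-truth instance (the two-class setup and the weight `beta` are illustrative assumptions):

```python
import torch
import torch.nn.functional as F

def var_ce_loss(logits, labels, instance_map, beta=1.0):
    """Cross-entropy plus a per-instance variance penalty that pulls
    predictions of pixels belonging to the same instance together.

    logits       : (B, C, H, W) network outputs
    labels       : (B, H, W) class indices
    instance_map : (B, H, W) integer instance ids, 0 = background
    """
    ce = F.cross_entropy(logits, labels)
    fg_prob = logits.softmax(dim=1)[:, 1]      # assume class 1 = foreground
    var_term, n = 0.0, 0
    for inst in instance_map.unique():
        if inst == 0:
            continue
        p = fg_prob[instance_map == inst]
        var_term = var_term + ((p - p.mean()) ** 2).mean()
        n += 1
    return ce + beta * var_term / max(n, 1)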
Synthetic Augmentation and Feature-Based Filtering for Improved Cervical Histopathology Image Classification

Cervical intraepithelial neoplasia (CIN) grade of histopathology images is a crucial indicator in cervical biopsy results. Accurate CIN grading of epithelium regions helps pathologists with precancerous lesion diagnosis and treatment planning. Although an automated CIN grading system has long been desired, supervised training of such a system would require a large amount of expert annotations, which are expensive and time-consuming to collect. In this paper, we investigate the CIN grade classification problem on segmented epithelium patches. We propose to use conditional Generative Adversarial Networks (cGANs) to expand the limited training dataset by synthesizing realistic cervical histopathology images. While the synthetic images are visually appealing, they are not guaranteed to contain meaningful features for data augmentation. To tackle this issue, we propose a synthetic-image filtering mechanism based on the divergence in feature space between generated images and class centroids, in order to control the feature quality of the synthetic images selected for data augmentation. Our models are evaluated on a cervical histopathology image dataset with a limited number of patch-level CIN grade annotations. Extensive experimental results show a significant improvement in classification accuracy, from 66.3% to 71.7% with the same ResNet18 baseline classifier, after leveraging our cGAN-generated images with feature-based filtering, which demonstrates the effectiveness of our models.

Yuan Xue, Qianying Zhou, Jiarong Ye, L. Rodney Long, Sameer Antani, Carl Cornwell, Zhiyun Xue, Xiaolei Huang
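
The filtering mechanism is described only at a high level; here is a minimal sketch under the assumption that "divergence in feature space" is the Euclidean distance between CNN features of generated images and the centroid of real-image features of the target class (`keep_ratio` is an illustrative knob, not a value from the paper):

```python
import torch

def filter_synthetic(gen_feats, real_feats, keep_ratio=0.5):
    """Keep the generated images whose features lie closest to the
    class centroid computed from real training images.

    gen_feats  : (N, D) features of cGAN-generated images
    real_feats : (M, D) features of real images of the same CIN class
    """
    centroid = real_feats.mean(dim=0)                 # (D,)
    dist = (gen_feats - centroid).norm(dim=1)         # (N,)
    k = max(1, int(keep_ratio * gen_feats.size(0)))
    return dist.topk(k, largest=False).indices        # indices to keep
```
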
Cell Tracking with Deep Learning for Cell Detection and Motion Estimation in Low-Frame-Rate

Cell behavior analysis in high-throughput biological experiments is important for research and discovery in biology and medicine. To perform high-throughput experiments, images must be captured at a low frame rate so that multiple positions can be recorded. In such low-frame-rate image sequences, the movement of a cell between successive frames is often larger than the distance to its neighboring cells, so current methods based on proximity do not work properly. In this study, we propose a cell tracking method that can track cells at low frame rates by simultaneously estimating all cell motions between successive frames. In experiments under dense conditions at a low frame rate, our method outperformed competing methods.

Junya Hayashida, Ryoma Bise
Accelerated ML-Assisted Tumor Detection in High-Resolution Histopathology Images

Color normalization is one of the main tasks in the processing pipeline of computer-aided diagnosis (CAD) systems in histopathology. This task reduces the color and intensity variations that are typically present in stained whole-slide images (WSI) due to, e.g., non-standardization of staining protocols. Moreover, it increases the accuracy of machine learning (ML) based CAD systems. Given the vast amount of gigapixel-sized WSI data, and the need to reduce the time-to-insight, there is an increasing demand for efficient ML systems. In this work, we present a high-performance pipeline that enables big data analytics for WSIs in histopathology. As an exemplary ML inference pipeline, we employ a convolutional neural network (CNN), used to detect prostate cancer in WSIs, with stain normalization preprocessing. We introduce a set of optimizations across the whole pipeline: (i) we parallelize and optimize the stain normalization process, (ii) we introduce a multi-threaded I/O framework optimized for fast non-volatile memory (NVM) storage, and (iii) we integrate the stain normalization optimizations and the enhanced I/O framework in the ML pipeline to minimize the data transfer overheads and the overall prediction time. Our combined optimizations accelerate the end-to-end ML pipeline by 7.2× and 21.2×, on average, for low and high resolution levels of WSIs, respectively. Significantly, this allows for seamless integration of ML-assisted diagnosis with state-of-the-art whole slide scanners, by reducing the prediction time for high-resolution histopathology images from about 30 min to under 80 s.

Nikolas Ioannou, Milos Stanisavljevic, Andreea Anghel, Nikolaos Papandreou, Sonali Andani, Jan Hendrik Rüschoff, Peter Wild, Maria Gabrani, Haralampos Pozidis
Pre-operative Overall Survival Time Prediction for Glioblastoma Patients Using Deep Learning on Both Imaging Phenotype and Genotype

Glioblastoma (GBM) is the most common and deadly malignant brain tumor, with short yet varied overall survival (OS) time. To support personalized treatment, accurate pre-operative prognosis for GBM patients is highly desired. Currently, many machine learning-based studies have been conducted to predict OS time based on pre-operative multimodal MR images of brain tumor patients. However, tumor genotype, such as MGMT and IDH status, which has been proven to have a strong relationship with OS, is not considered in pre-operative prognosis at all, as the genotype information is unavailable until craniotomy. In this paper, we propose a new deep learning-based method for OS time prediction. It derives genotype-related features from pre-operative multimodal MR images of brain tumor patients to guide OS time prediction. In particular, we propose a multi-task convolutional neural network (CNN) to accomplish the tumor genotype and OS time prediction tasks jointly. As the network can benefit from learning genotype-related features toward genotype prediction, we verify it on a dataset of 120 GBM patients and conclude that the multi-task learning can effectively improve the accuracy of OS time prediction in personalized prognosis.

Zhenyu Tang, Yuyun Xu, Zhicheng Jiao, Junfeng Lu, Lei Jin, Abudumijiti Aibaidula, Jinsong Wu, Qian Wang, Han Zhang, Dinggang Shen
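
A minimal sketch of the multi-task setup described above, assuming a generic `backbone` that outputs pooled feature vectors; the head sizes, the L1 regression loss and the weight `lamb` are illustrative choices, not the authors' exact design:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GenotypeOSNet(nn.Module):
    """Shared feature extractor with two heads: tumor genotype
    classification and overall-survival (OS) time regression."""
    def __init__(self, backbone, feat_dim=512, n_genotypes=2):
        super().__init__()
        self.backbone = backbone
        self.genotype_head = nn.Linear(feat_dim, n_genotypes)
        self.os_head = nn.Linear(feat_dim, 1)

    def forward(self, x):
        f = self.backbone(x)                       # (B, feat_dim)
        return self.genotype_head(f), self.os_head(f).squeeze(1)

def joint_loss(geno_logits, geno_y, os_pred, os_y, lamb=0.5):
    # genotype supervision shapes the shared features that guide OS prediction
    return F.cross_entropy(geno_logits, geno_y) + lamb * F.l1_loss(os_pred, os_y)
```
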
Pathology-Aware Deep Network Visualization and Its Application in Glaucoma Image Synthesis

The past few years have witnessed the great success of applying deep neural networks (DNNs) in computer-aided diagnosis. However, little attention has been paid to providing pathological evidence in existing DNNs for medical diagnosis. In fact, feature visualization in DNNs can help us understand how the model makes decisions, and thus shows promise for finding pathological evidence in computer-aided diagnosis. In this paper, we propose a novel pathology-aware visualization approach for DNN-based glaucoma classification, which is used to locate the pathological evidence for glaucoma in fundus images. Besides, we apply the visualization framework to the glaucoma image synthesis task, through which specific pathological areas of synthesized images can be enhanced. Finally, experimental results show that the visualization heat maps can pinpoint different glaucoma pathologies with high accuracy, and that the generated glaucoma images are pathophysiologically clearer in rim loss (RL) and retinal nerve fiber layer damage (RNFLD), as verified by an ophthalmologist.

Xiaofei Wang, Mai Xu, Liu Li, Zulin Wang, Zhenyu Guan
CORAL8: Concurrent Object Regression for Area Localization in Medical Image Panels

This work tackles the problem of generating a medical report for multi-image panels. We apply our solution to the Renal Direct Immunofluorescence (RDIF) assay, which requires a pathologist to generate a report based on observations across eight different whole slide images (WSIs) in concert with existing clinical features. To this end, we propose a novel attention-based multi-modal generative recurrent neural network (RNN) architecture capable of dynamically sampling image data concurrently across the RDIF panel. The proposed methodology incorporates text from the clinical notes of the requesting physician to regulate the output of the network so that it aligns with the overall clinical context. In addition, we found it important to regularize the attention weights during word generation, because the system can otherwise ignore the attention mechanism by assigning equal weights to all members. We therefore propose two regularizations that encourage efficient use of the attention mechanism. Experiments on our novel collection of RDIF WSIs provided by Sullivan Nicolaides Pathology demonstrate that our framework offers significant improvements over existing methods.

Sam Maksoud, Arnold Wiliem, Kun Zhao, Teng Zhang, Lin Wu, Brian Lovell
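
The two regularizers are not spelled out in the abstract; one plausible sketch penalizes attention distributions that stay close to uniform, using the (negated) KL divergence from the uniform distribution so that minimizing the penalty pushes attention toward actually selecting panel images:

```python
import torch

def non_uniform_penalty(attn, eps=1e-8):
    """Largest (zero) when attention over the N panel images is uniform,
    increasingly negative as attention becomes peaked.

    attn : (B, T, N) attention weights per generated word, rows sum to 1
    """
    n = attn.size(-1)
    # KL(attn || uniform) = sum_i attn_i * log(attn_i * n)
    kl_from_uniform = (attn * ((attn + eps) * n).log()).sum(dim=-1)
    return -kl_from_uniform.mean()
```
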
ET-Net: A Generic Edge-aTtention Guidance Network for Medical Image Segmentation

Segmentation is a fundamental task in medical image analysis. However, most existing methods focus on primary region extraction and ignore edge information, which is useful for obtaining accurate segmentation. In this paper, we propose a generic medical segmentation method, called Edge-aTtention guidance Network (ET-Net), which embeds edge-attention representations to guide the segmentation network. Specifically, an edge guidance module is utilized to learn the edge-attention representations in the early encoding layers, which are then transferred to the multi-scale decoding layers, fused using a weighted aggregation module. The experimental results on four segmentation tasks (i.e., optic disc/cup and vessel segmentation in retinal images, and lung segmentation in chest X-Ray and CT images) demonstrate that preserving edge-attention representations contributes to the final segmentation accuracy, and our proposed method outperforms current state-of-the-art segmentation methods. The source code of our method is available at https://github.com/ZzzJzzZ/ETNet .

Zhijie Zhang, Huazhu Fu, Hang Dai, Jianbing Shen, Yanwei Pang, Ling Shao
Instance Segmentation of Biomedical Images with an Object-Aware Embedding Learned with Local Constraints

Automatic instance segmentation is a problem that occurs in many biomedical applications. State-of-the-art approaches either perform semantic segmentation or refine object bounding boxes obtained from detection methods. Both suffer from crowded objects to varying degrees, merging adjacent objects or suppressing valid objects. In this work, we assign an embedding vector to each pixel through a deep neural network. The network is trained to output embedding vectors of similar directions for pixels from the same object, while embeddings of adjacent objects are orthogonal in the embedding space, which effectively avoids the fusion of objects in a crowd. Our method yields state-of-the-art results even with a lightweight backbone network on a cell segmentation data set (BBBC006 + DSB2018) and a leaf segmentation data set (CVPPP2017). The code and model weights are publicly available ( https://github.com/looooongChen/instance_segmentation_with_pixel_embeddings/ ).

Long Chen, Martin Strauch, Dorit Merhof
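
A minimal sketch of the training signal described above: pixel embeddings of one object are pulled toward a common direction, while the mean embeddings of adjacent objects are pushed toward orthogonality. The adjacency list is assumed to be precomputed from the label image; this is a sketch, not the authors' released code:

```python
import torch
import torch.nn.functional as F

def embedding_loss(emb, inst, adj_pairs):
    """emb       : (D, H, W) pixel embeddings
    inst      : (H, W) instance ids, 0 = background
    adj_pairs : list of (id_a, id_b) tuples of adjacent instances"""
    emb = F.normalize(emb, dim=0)
    ids = [i for i in inst.unique().tolist() if i != 0]
    means, intra = {}, 0.0
    for i in ids:
        e = emb[:, inst == i]                          # (D, P) pixels of object i
        mu = F.normalize(e.mean(dim=1), dim=0)
        means[i] = mu
        # pull pixels toward their object's mean direction
        intra = intra + (1 - (mu.unsqueeze(1) * e).sum(dim=0)).mean()
    inter = 0.0
    for a, b in adj_pairs:
        inter = inter + (means[a] * means[b]).sum().abs()   # cosine -> 0
    return intra / max(len(ids), 1) + inter / max(len(adj_pairs), 1)
```
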
Diverse Multiple Prediction on Neuron Image Reconstruction

Neuron reconstruction from anisotropic 3D Electron Microscopy (EM) images is a challenging problem. One often considers an input image as a stack of 2D image slices and uses both intra- and inter-slice segment information. In this paper, we present a new segmentation algorithm which builds a unified energy function and jointly optimizes the per-slice segmentation and the inter-slice consistency. To find an optimal solution in the huge solution space, we propose a novel diverse multiple prediction method which also encourages diversity in partial solutions. We demonstrate the strength of our method on several public datasets.

Ze Ye, Cong Chen, Changhe Yuan, Chao Chen
Deep Segmentation-Emendation Model for Gland Instance Segmentation

Accurate and automated gland instance segmentation of histology microscopy images can assist pathologists in diagnosing the malignancy degree of colorectal adenocarcinoma. To address this problem, many deep convolutional neural network (DCNN) based methods have been proposed, most of which aim to generate better segmentation by improving the model structure and loss function. Few of them, however, focus on further emending the inferred predictions, thus missing a chance to refine the obtained segmentation results. In this paper, we propose the deep segmentation-emendation (DSE) model for gland instance segmentation. This model consists of a segmentation network (Seg-Net) and an emendation network (Eme-Net). The Seg-Net is dedicated to generating segmentation results, and the Eme-Net learns to predict the inconsistency between the ground truth and the segmentation results generated by Seg-Net. The predictions made by Eme-Net can in turn be used to refine the segmentation results. We evaluated our DSE model against five recent deep learning models on the 2015 MICCAI Gland Segmentation challenge (GlaS) dataset and against two deep learning models on the colorectal adenocarcinoma (CRAG) dataset. Our results indicate that using Eme-Net yields significant improvement in segmentation accuracy, and the proposed DSE model substantially outperforms all other models in gland instance segmentation on both datasets.

Yutong Xie, Hao Lu, Jianpeng Zhang, Chunhua Shen, Yong Xia
Fast and Accurate Electron Microscopy Image Registration with 3D Convolution

We propose an unsupervised deep learning method for serial electron microscopy (EM) image registration with fast speed and high accuracy. Current registration methods are time-consuming in practice due to the iterative optimization procedure. We model the registration process as a parametric function in the form of convolutional neural networks, and optimize its parameters based on features extracted from serial EM images in a training set. Given a new series of EM images, the deformation field of each serial image can be rapidly generated through the learned function. Specifically, we adopt a spatial transformer layer to reconstruct features in the subject image from the reference ones while constraining smoothness on the deformation field. Moreover, for the first time, we introduce a 3D convolution layer to learn the relationship between several adjacent images, which effectively reduces error accumulation in serial EM image registration. Experiments on two popular EM datasets, Cremi and FIB25, demonstrate that our method can operate at an unprecedented speed while providing competitive registration accuracy compared with state-of-the-art methods, including learning-based ones.

Shenglong Zhou, Zhiwei Xiong, Chang Chen, Xuejin Chen, Dong Liu, Yueyi Zhang, Zheng-Jun Zha, Feng Wu
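
The abstract names a spatial transformer layer plus a smoothness constraint; the 2D sketch below shows that combination in its simplest form (the paper itself works on serial EM sections and adds 3D convolutions, omitted here; the flow is assumed to be expressed in normalized grid coordinates):

```python
import torch
import torch.nn.functional as F

def registration_loss(moving, fixed, flow, lamb=0.1):
    """Warp `moving` with `flow`, compare to `fixed`, and penalize
    non-smooth deformation fields.

    moving, fixed : (B, 1, H, W) image pair
    flow          : (B, 2, H, W) (x, y) displacements, normalized coords
    """
    B, _, H, W = moving.shape
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, H),
                            torch.linspace(-1, 1, W), indexing="ij")
    base = torch.stack((xs, ys), dim=-1).to(moving)       # identity grid
    grid = base.unsqueeze(0) + flow.permute(0, 2, 3, 1)   # (B, H, W, 2)
    warped = F.grid_sample(moving, grid, align_corners=True)  # spatial transformer
    sim = F.mse_loss(warped, fixed)
    # first-order smoothness of the deformation field
    dx = (flow[:, :, :, 1:] - flow[:, :, :, :-1]).abs().mean()
    dy = (flow[:, :, 1:, :] - flow[:, :, :-1, :]).abs().mean()
    return sim + lamb * (dx + dy)
```
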
PlacentaNet: Automatic Morphological Characterization of Placenta Photos with Deep Learning

Analysis of the placenta is extremely useful for evaluating health risks of the mother and baby after delivery. In this paper, we tackle the problem of automatic morphological characterization of placentas, including the tasks of placenta image segmentation, umbilical cord insertion point localization, and maternal/fetal side classification. We curated an existing dataset consisting of around 1,000 placenta images taken at Northwestern Memorial Hospital, together with their pixel-level segmentation map. We propose a novel pipeline, PlacentaNet, which consists of three encoder-decoder convolutional neural networks with a shared encoder, to address these morphological characterization tasks by employing a transfer learning training strategy. We evaluated its effectiveness using the curated dataset as well as the pathology reports in the medical record. The system produced accurate morphological characterization, which enabled subsequent feature analysis of placentas. In particular, we show promising results for detection of retained placenta (i.e., incomplete placenta) and umbilical cord insertion type categorization, both of which may possess clinical impact.

Yukun Chen, Chenyan Wu, Zhuomin Zhang, Jeffery A. Goldstein, Alison D. Gernand, James Z. Wang
Deep Multi-instance Learning for Survival Prediction from Whole Slide Images

Recent image-based survival models rely on discriminative patch labeling, which is both time-consuming and infeasible to extend to large-scale cancer datasets. Different from existing works that learn from key patches or clusters of WSIs, we take advantage of deep multiple instance learning to encode all possible patterns from WSIs and consider the joint effects of different patterns on clinical outcome prediction. We evaluate our model on its ability to predict patients' survival risks for lung and brain tumors on two large whole-slide pathology image datasets. The proposed framework improves prediction performance compared with existing state-of-the-art survival analysis approaches. Results also demonstrate the effectiveness of the proposed method as a recommender system that provides personalized recommendations based on an individual's calculated risk.

Jiawen Yao, Xinliang Zhu, Junzhou Huang
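
The abstract does not state the training objective; WSI survival models of this kind are commonly trained with the negative Cox partial log-likelihood on bag-level risk scores, sketched here as a reference point (ties between event times are ignored for simplicity):

```python
import torch

def cox_ph_loss(risk, time, event):
    """Negative Cox partial log-likelihood.

    risk  : (N,) predicted risk scores, higher = worse prognosis
    time  : (N,) survival or censoring times
    event : (N,) 1 if death observed, 0 if censored
    """
    order = time.sort(descending=True).indices     # builds risk sets by time
    risk, event = risk[order], event[order].float()
    log_risk_set = torch.logcumsumexp(risk, dim=0)
    return -((risk - log_risk_set) * event).sum() / event.sum().clamp_min(1.0)
```
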
High-Resolution Diabetic Retinopathy Image Synthesis Manipulated by Grading and Lesions

Diabetic retinopathy (DR) is a complication of diabetes that severely affects the eyes, and can be graded into five levels according to international protocol. However, optimizing a grading model with strong generalization ability requires a large amount of balanced training data, which is difficult to collect, particularly for the high severity levels. Typical data augmentation methods, such as flipping and rotation, cannot generate data with high diversity. In this paper, we propose a diabetic retinopathy generative adversarial network (DR-GAN) to synthesize high-resolution fundus images which can be manipulated with arbitrary grading and lesion information. Thus, large-scale generated data can be used for more meaningful augmentation to train a DR grading model. The proposed retina generator is conditioned on vessel and lesion masks, and on adaptive grading vectors sampled from the latent grading space, which can be adopted to control the synthesized grading severity. Moreover, multi-scale discriminators are designed to operate from large to small receptive fields, and joint adversarial losses are adopted to optimize the whole network in an end-to-end manner. With extensive experiments on the EyePACS dataset available on Kaggle, we validate the effectiveness of our method, which can both synthesize highly realistic (1280 × 1280) controllable fundus images and contribute to the DR grading task.

Yi Zhou, Xiaodong He, Shanshan Cui, Fan Zhu, Li Liu, Ling Shao
Deep Instance-Level Hard Negative Mining Model for Histopathology Images

Histopathology image analysis can be considered a multiple instance learning (MIL) problem, where the whole slide histopathology image (WSI) is regarded as a bag of instances (i.e., patches) and the task is to predict a single class label for the WSI. However, in many real-life applications such as computational pathology, discovering the key instances that trigger the bag label is of great interest because it provides reasons for the decision made by the system. In this paper, we propose a deep convolutional neural network (CNN) model that addresses the primary task of bag classification on a histopathology image and also learns to identify the response of each instance, providing interpretable support for the final prediction. We incorporate an attention mechanism into the proposed model to transform instance features and learn attention weights that allow us to find key patches. To achieve balanced training, we introduce adaptive weighting in each training bag to explicitly adjust the weight distribution and concentrate more on the contribution of hard samples. Based on the learned attention weights, we further develop a solution that boosts the classification performance by generating bags with hard negative instances. We conduct extensive experiments on colon and breast cancer histopathology data and show that our framework achieves state-of-the-art performance.

Meng Li, Lin Wu, Arnold Wiliem, Kun Zhao, Teng Zhang, Brian Lovell
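
The attention mechanism described above follows the general attention-MIL recipe; here is a minimal sketch with illustrative dimensions (the paper's adaptive hard-sample weighting and hard-negative bag generation are built on top of weights like `a` below):

```python
import torch
import torch.nn as nn

class AttentionMILPooling(nn.Module):
    """Each instance (patch) embedding receives a learned weight;
    the bag embedding is their weighted sum, and the weights expose
    the key patches behind the bag-level decision."""
    def __init__(self, dim=512, hidden=128):
        super().__init__()
        self.attn = nn.Sequential(
            nn.Linear(dim, hidden), nn.Tanh(), nn.Linear(hidden, 1))

    def forward(self, h):                        # h: (num_instances, dim)
        a = torch.softmax(self.attn(h), dim=0)   # (num_instances, 1)
        bag = (a * h).sum(dim=0)                 # (dim,)
        return bag, a.squeeze(1)
```
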
Synthetic Patches, Real Images: Screening for Centrosome Aberrations in EM Images of Human Cancer Cells

Recent advances in high-throughput electron microscopy imaging enable detailed study of centrosome aberrations in cancer cells. While the image acquisition in such pipelines is automated, manual detection of centrioles is still necessary to select cells for re-imaging at higher magnification. In this contribution we propose an algorithm which performs this step automatically and with high accuracy. From the image labels produced by human experts and a 3D model of a centriole we construct an additional training set with patch-level labels. A two-level DenseNet is trained on the hybrid training data with synthetic patches and real images, achieving much better results on real patient data than training only at the image-level. The code can be found at https://github.com/kreshuklab/centriole_detection .

Artem Lukoyanov, Isabella Haberbosch, Constantin Pape, Alwin Krämer, Yannick Schwab, Anna Kreshuk
Patch Transformer for Multi-tagging Whole Slide Histopathology Images

Automated whole slide image (WSI) tagging has become a growing demand due to the increasing volume and diversity of WSIs collected nowadays in histopathology. Various methods have been studied to classify WSIs with single tags but none of them focuses on labeling WSIs with multiple tags. To this end, we propose a novel end-to-end trainable deep neural network named Patch Transformer which can effectively predict multiple slide-level tags from WSI patches based on both the correlations and the uniqueness between the tags. Specifically, the proposed method learns patch characteristics considering (1) patch-wise relations through a patch transformation module and (2) tag-wise uniqueness for each tagging task through a multi-tag attention module. Extensive experiments on a large and diverse dataset consisting of 4,920 WSIs prove the effectiveness of the proposed model.

Weijian Li, Viet-Duy Nguyen, Haofu Liao, Matt Wilder, Ke Cheng, Jiebo Luo
Pancreatic Cancer Detection in Whole Slide Images Using Noisy Label Annotations

We propose an approach to accurately predict regions of pancreatic cancer in whole-slide images (WSIs) by leveraging a relatively large, but noisy, dataset. We employ a noisy label classification (NLC) method (called the NLC model) that utilizes a small set of clean training samples and assigns the appropriate weights to training samples to deal with sample noise. The weights are assigned online so that the network loss approximates the loss for the clean samples. This method results in a 9.7% performance improvement over the baseline non-NLC method (the Baseline-Noisy model). We use both methods in an ensemble setup to generate labels for a large training dataset to train a classifier. This classifier outperforms a classifier trained with manually annotated data by 2.94%–3.74% in terms of AUC for testing patches in WSIs.

Han Le, Dimitris Samaras, Tahsin Kurc, Rajarsi Gupta, Kenneth Shroyer, Joel Saltz
Encoding Histopathological WSIs Using GNN for Scalable Diagnostically Relevant Regions Retrieval

Research on content-based histopathological image retrieval (CBHIR) has become popular in recent years. CBHIR systems provide auxiliary diagnosis information for pathologists by searching for, and returning, regions that are similar in content to the region of interest (ROI) in a pre-established database. Retrieving diagnostically relevant regions from a database consisting of histopathological whole slide images (WSIs) for query ROIs is challenging, yet significant, for clinical applications. In this paper, we propose a novel CBHIR framework for region retrieval from a WSI database based on hierarchical graph neural networks (GNNs). Compared to present CBHIR frameworks, the structural information of the WSI is preserved by the proposed model, which makes the retrieval framework more sensitive to regions that are similar in tissue distribution. Moreover, benefiting from the hierarchical GNN structures, the proposed framework is scalable to both size and shape variation of ROIs, allowing the pathologist to define the query region with free curves. Finally, the retrieval is achieved with binary codes and hashing methods, which makes it very efficient and thereby adequate for practical large-scale WSI databases. The proposed method was validated on a lung cancer dataset and compared to state-of-the-art methods. It achieved precisions above 82.4% in the irregular region retrieval task, superior to the state-of-the-art methods, with an average retrieval time of 0.514 ms.

Yushan Zheng, Bonan Jiang, Jun Shi, Haopeng Zhang, Fengying Xie
Local and Global Consistency Regularized Mean Teacher for Semi-supervised Nuclei Classification

Nucleus classification is a fundamental task in pathology diagnosis for cancers, e.g., Ki-67 index estimation. Supervised deep learning methods have achieved promising classification accuracy. However, the success of these methods heavily relies on massive amounts of manually annotated data, and manual annotation for nucleus classification is usually time-consuming and laborious. In this paper, we propose a novel semi-supervised deep learning method that can learn from a small portion of labeled data and large-scale unlabeled data for nucleus classification. Our method is inspired by the recent state-of-the-art self-ensembling (SE) methods. These methods learn from unlabeled data by enforcing consistency of predictions under different perturbations while ignoring the local and global consistency hidden in the data structure. In our work, a label propagation (LP) step is integrated into the SE method, and a graph is constructed using the LP predictions, which encodes the local and global data structure. Finally, a Siamese loss is used to learn the local and global consistency from the graph. Our implementation is based on the state-of-the-art SE method Mean Teacher. Extensive experiments on two nucleus datasets demonstrate that our method outperforms the state-of-the-art SE methods, and achieves F1 scores close to the supervised methods using only 5%–25% labeled data.

Hai Su, Xiaoshuang Shi, Jinzheng Cai, Lin Yang
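
For readers unfamiliar with Mean Teacher, the two core operations that this work builds its label-propagation graph and Siamese loss on are the EMA weight update and the perturbation-consistency loss, sketched below (parameters only; normalization buffers are omitted for brevity):

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def ema_update(teacher, student, alpha=0.99):
    """Teacher weights track an exponential moving average of the student."""
    for t, s in zip(teacher.parameters(), student.parameters()):
        t.mul_(alpha).add_(s, alpha=1 - alpha)

def consistency_loss(student_logits, teacher_logits):
    """Penalize disagreement between student and teacher predictions
    computed under different input perturbations."""
    return F.mse_loss(student_logits.softmax(dim=1),
                      teacher_logits.softmax(dim=1).detach())
```
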
Perceptual Embedding Consistency for Seamless Reconstruction of Tilewise Style Transfer

Style transfer is a field with growing interest and use cases in deep learning. Recent work has shown Generative Adversarial Networks (GANs) can be used to create realistic images of virtually stained slide images in digital pathology with clinically validated interpretability. Digital pathology images are typically of extremely high resolution, making tilewise analysis necessary for deep learning applications. It has been shown that image generators with instance normalization can cause a tiling artifact when a large image is reconstructed from the tilewise analysis. We introduce a novel perceptual embedding consistency loss significantly reducing the tiling artifact created in the reconstructed whole slide image (WSI). We validate our results by comparing virtually stained slide images with consecutive real stained tissue slide images. We also demonstrate that our model is more robust to contrast, color and brightness perturbations by running comparative sensitivity analysis tests.

Amal Lahiani, Nassir Navab, Shadi Albarqouni, Eldad Klaiman
Precise Separation of Adjacent Nuclei Using a Siamese Neural Network

Nuclei segmentation in digital histopathology images plays an important role in distinguishing stages of cancer. Recently, deep learning-based methods for segmenting nuclei have been proposed, but precise boundary delineation of adjacent nuclei is still challenging. To address this problem, we propose a post-processing method which can accurately separate adjacent nuclei given an image and a predicted nuclei segmentation. Specifically, we propose a novel deep neural network which predicts whether two adjacent instances belong to a single nucleus or to separate nuclei. By borrowing the idea of decision making with Siamese networks, the proposed network learns the affinity between two adjacent instances and the surrounding features from a large number of adjacent nuclei, even though the training data is limited. Furthermore, we estimate the segmentation of instances through a decoding network and then use their overlapping Dice score for class prediction to improve the classification accuracy. The proposed method effectively alleviates the over-fitting problem and is compatible with any cell segmentation algorithm. Experimental results show that our proposed method significantly improves the cell separation accuracy.

Miguel Luna, Mungi Kwon, Sang Hyun Park
PFA-ScanNet: Pyramidal Feature Aggregation with Synergistic Learning for Breast Cancer Metastasis Analysis

Automatic detection of cancer metastasis from whole slide images (WSIs) is a crucial step for subsequent patient staging and prognosis. Recent convolutional neural network based approaches struggle with the trade-off between accuracy and computational efficiency due to the difficulty of processing large-scale gigapixel WSIs. To meet this challenge, we propose a novel Pyramidal Feature Aggregation ScanNet (PFA-ScanNet) for robust and fast analysis of breast cancer metastasis. Our method mainly benefits from the aggregation of extracted local-to-global features with diverse receptive fields, as well as the proposed synergistic learning for training the main detector and an extra decoder with semantic guidance. Furthermore, a high-efficiency inference mechanism is designed with dense pooling layers, which allows dense and fast scanning for gigapixel WSI analysis. As a result, the proposed PFA-ScanNet achieved a state-of-the-art FROC of 89.1% on the Camelyon16 dataset, as well as a competitive kappa score of 0.905 on the Camelyon17 leaderboard without model ensembling. In addition, our method shows a leading speed advantage over other methods, at about 7.2 min per WSI with a single GPU, making automatic analysis of breast cancer metastasis more applicable in clinical usage.

Zixu Zhao, Huangjing Lin, Hao Chen, Pheng-Ann Heng
DeepACE: Automated Chromosome Enumeration in Metaphase Cell Images Using Deep Convolutional Neural Networks

Chromosome enumeration is an important but tedious procedure in karyotyping analysis. In this paper, to automate the enumeration process, we developed a chromosome enumeration framework, DeepACE, based on the region-based object detection scheme. First, the ability of the region proposal network is enhanced by a newly proposed Hard Negative Anchors Sampling to extract unapparent but important information about highly confusing partial chromosomes. Next, to alleviate serious occlusion problems, we introduce a novel weakly-supervised mechanism that adds a Template Module into the classification branch to heuristically separate overlapping chromosomes. The template features are further incorporated into the NMS procedure to improve the detection of overlapping chromosomes. On a newly collected clinical dataset, the proposed method outperforms all previous methods, yielding an mAP of 99.45 with respect to chromosomes and an error rate of about 2.4%.

Li Xiao, Chunlong Luo, Yufan Luo, Tianqi Yu, Chan Tian, Jie Qiao, Yi Zhao
Unsupervised Subtyping of Cholangiocarcinoma Using a Deep Clustering Convolutional Autoencoder

Unlike in common cancers, such as those of the prostate and breast, tumor grading in rare cancers is difficult and largely undefined because of small sample sizes, the sheer volume of time and experience needed to undertake such a task, and the inherent difficulty of extracting human-observed patterns. One of the most challenging examples is intrahepatic cholangiocarcinoma (ICC), a primary liver cancer arising from the biliary system, for which there is well-recognized tumor heterogeneity and no grading paradigm or prognostic biomarkers. In this paper, we propose a new unsupervised deep convolutional autoencoder-based clustering model that groups together cellular and structural morphologies of tumor in 246 digitized whole slides, based on visual similarity. Clusters based on this visual dictionary of histologic patterns are interpreted as new ICC subtypes and evaluated by training Cox-proportional hazard survival models, resulting in statistically significant patient stratification.

Hassan Muhammad, Carlie S. Sigel, Gabriele Campanella, Thomas Boerner, Linda M. Pak, Stefan Büttner, Jan N. M. IJzermans, Bas Groot Koerkamp, Michael Doukas, William R. Jarnagin, Amber L. Simpson, Thomas J. Fuchs
Evidence Localization for Pathology Images Using Weakly Supervised Learning

Although deep convolutional neural networks boost the performance of image classification and segmentation in digital pathology analysis, they are usually weak in interpretability for clinical applications, or require heavy annotations to achieve object localization. To overcome this problem, we propose a weakly supervised learning-based approach that can effectively learn to localize the discriminative evidence for a diagnostic label from weakly labeled training data. Experimental results show that our proposed method can reliably pinpoint the location of cancerous evidence supporting the decision of interest, while still achieving competitive performance on glimpse-level and slide-level histopathologic cancer detection tasks.

Yongxiang Huang, Albert C. S. Chung
Nuclear Instance Segmentation Using a Proposal-Free Spatially Aware Deep Learning Framework

Nuclear segmentation in histology images is a challenging task due to significant variations in the shape and appearance of nuclei. One of the main hurdles in nuclear instance segmentation is overlapping nuclei, where a smart algorithm is needed to separate each nucleus. In this paper, we introduce a proposal-free deep learning based framework to address these challenges. To this end, we propose a spatially-aware network (SpaNet) to capture spatial information in a multi-scale manner. A dual-head variation of the SpaNet is first utilized to predict the pixel-wise segmentation and centroid detection maps of nuclei. Based on these outputs, a single-head SpaNet predicts the positional information related to each nucleus instance. A spectral clustering method is applied to the output of the last SpaNet, which utilizes the nuclear mask and the Gaussian-like detection map for determining the connected components and associated cluster identifiers, respectively. The output of the clustering method is the final nuclear instance segmentation mask. We applied our method on a publicly available multi-organ data set ( https://nucleisegmentationbenchmark.weebly.com/ ) and achieved state-of-the-art performance for nuclear segmentation.

Navid Alemi Koohbanani, Mostafa Jahanifar, Ali Gooya, Nasir Rajpoot
GAN-Based Image Enrichment in Digital Pathology Boosts Segmentation Accuracy

We introduce the idea of ‘image enrichment’ whereby the information content of images is increased in order to enhance segmentation accuracy. Unlike in data augmentation, the focus is not on increasing the number of training samples (by adding new virtual samples), but on increasing the information for each sample. For this purpose, we use a GAN-based image-to-image translation approach to generate corresponding virtual samples from a given (original) image. The virtual samples are then merged with the original sample to create a multi-channel image, which serves as the enriched image. We train and test a segmentation network on enriched images showing kidney pathology and obtain segmentation scores exhibiting an improvement compared to conventional processing of the original images only. We perform an extensive evaluation and discuss the reasons for the improvement.

Laxmi Gupta, Barbara M. Klinkhammer, Peter Boor, Dorit Merhof, Michael Gadermayr
IRNet: Instance Relation Network for Overlapping Cervical Cell Segmentation

Cell instance segmentation in Pap smear images remains challenging due to the wide existence of occlusion among the translucent cytoplasm in cell clumps. Conventional methods heavily rely on accurate nuclei detection results and are easily disturbed by miscellaneous objects. In this paper, we propose a novel Instance Relation Network (IRNet) for robust overlapping cell segmentation by exploring instance relation interaction. Specifically, we propose the Instance Relation Module to construct the cell association matrix for transferring information among individual cell-instance features. With the collaboration of different instances, the augmented features gain benefits from contextual information and improve semantic consistency. Meanwhile, we propose a sparsity-constrained Duplicate Removal Module to eliminate the misalignment between classification and localization accuracy for candidate selection. The largest cervical Pap smear (CPS) dataset, with more than 8000 cell annotations, was constructed for comprehensive evaluation. Our method outperforms other methods by a large margin, demonstrating the effectiveness of exploring instance relations.

Yanning Zhou, Hao Chen, Jiaqi Xu, Qi Dou, Pheng-Ann Heng
Weakly Supervised Cell Instance Segmentation by Propagating from Detection Response

Cell shape analysis is important in biomedical research. Deep learning methods can segment individual cells if sufficient training data is available in which the boundary of each cell is annotated. However, preparing such detailed annotations for many cell culture conditions is very time-consuming. In this paper, we propose a weakly supervised method that can segment individual cell regions that touch each other with unclear boundaries in dense conditions, without training data for cell regions. We demonstrated the efficacy of our method on several datasets including multiple cell types captured with several types of microscopy. Our method achieved the highest accuracy compared with several conventional methods. In addition, we demonstrated that our method can work without any manual annotation by using fluorescence images in which cell nuclei were stained as training data. Code is publicly available at https://github.com/naivete5656/WSISPDR .

Kazuya Nishimura, Dai Fei Elmer Ker, Ryoma Bise
Robust Non-negative Tensor Factorization, Diffeomorphic Motion Correction, and Functional Statistics to Understand Fixation in Fluorescence Microscopy

Fixation is essential for preserving cellular morphology in biomedical research. However, it may also affect spectra captured in multispectral fluorescence microscopy, impacting molecular interpretations. To investigate fixation effects on tissue, multispectral fluorescence microscopy images of pairs of samples with and without fixation are captured. Each pixel might exhibit overlapping spectra, creating a blind source separation problem approachable with linear unmixing. With multiple excitation wavelengths, unmixing is intuitively extended to tensor factorizations. Yet these approaches are limited by nonlinear effects like attenuation. Further, light exposure during image acquisition introduces subtle Brownian motion between image channels of non-fixed tissue. Finally, hypothesis testing for spectral differences due to fixation is non-trivial as retrieved spectra are paired sequential samples. To these ends, we present three contributions: (1) a novel robust non-negative tensor factorization using the β-divergence and L2,1-norm, which decomposes the data into a low-rank multilinear and group-sparse non-multilinear tensor without making any explicit nonlinear modeling choices or assumptions on noise statistics; (2) a diffeomorphic atlas-based strategy for motion correction; (3) a non-parametric hypothesis testing framework for paired sequential data using functional principal component analysis. PyTorch code for robust non-negative tensor factorization is available at https://github.com/neel-dey/robustNTF .

Neel Dey, Jeffrey Messinger, R. Theodore Smith, Christine A. Curcio, Guido Gerig
ConCORDe-Net: Cell Count Regularized Convolutional Neural Network for Cell Detection in Multiplex Immunohistochemistry Images

In digital pathology, cell detection and classification are often prerequisites for quantifying cell abundance and exploring tissue spatial heterogeneity. However, these tasks are particularly challenging for multiplex immunohistochemistry (mIHC) images due to high levels of variability in staining and expression intensity, and the inherent noise resulting from preprocessing artefacts. We propose a deep learning method to detect and classify cells in mIHC whole-tumor slide images of breast cancer. Inspired by inception-v3, we developed the Cell COunt RegularizeD Convolutional neural Network (ConCORDe-Net), which integrates a conventional dice overlap loss and a new cell count loss function for optimizing cell detection, followed by a multi-stage convolutional neural network for cell classification. In total, 20,447 cells belonging to five cell classes were annotated by experts in 175 patches extracted from 6 whole-tumor mIHC images. These patches were randomly split into training, validation and testing sets. Using ConCORDe-Net, we obtained a cell detection F1 score of 0.873, the best score compared to three state-of-the-art methods. In particular, ConCORDe-Net excels at detecting closely located and weakly stained cells compared to other methods. Incorporating the cell count loss in the objective function regularizes the network to learn weak gradient boundaries and separate weakly stained cells from background artefacts. Moreover, a cell classification accuracy of 96.5% was achieved. These results support that incorporating problem-specific knowledge, such as cell count, into deep learning based cell detection architectures improves the robustness of the algorithm.

Yeman Brhane Hagos, Priya Lakshmi Narayanan, Ayse U. Akarca, Teresa Marafioti, Yinyin Yuan
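
The exact cell count loss is defined in the paper; the sketch below conveys the idea under the illustrative simplification that each annotated cell contributes roughly a fixed mass `cell_mass` to the predicted detection map:

```python
import torch
import torch.nn.functional as F

def cell_count_loss(pred_map, gt_count, cell_mass=1.0):
    """Regularizer matching the total mass of the detection map to the
    annotated number of cells, which pushes the network to keep weakly
    stained cells and drop background artefacts.

    pred_map : (B, 1, H, W) predicted detection probability map
    gt_count : (B,) ground-truth number of cells per patch
    """
    pred_count = pred_map.sum(dim=(1, 2, 3)) / cell_mass
    return F.smooth_l1_loss(pred_count, gt_count.float())
```
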
Multi-task Learning of a Deep K-Nearest Neighbour Network for Histopathological Image Classification and Retrieval

Deep neural networks have achieved tremendous success in image recognition, classification and object detection. However, deep learning is often criticised for its lack of transparency and general inability to rationalize its predictions. The issue of poor model interpretability becomes critical in medical applications: a model that is not understood and trusted by physicians is unlikely to be used in daily clinical practice. In this work, we develop a novel multi-task deep learning framework for simultaneous histopathology image classification and retrieval, leveraging the classic concept of k-nearest neighbours to improve model interpretability. For a test image, we retrieve the most similar images from our training databases. These retrieved nearest neighbours can be used to classify the test image with a confidence score, and provide a human-interpretable explanation of our classification. Our original framework can be built on top of any existing classification network (and therefore benefit from pretrained models), by (i) combining a triplet loss function with a novel triplet sampling strategy to compare distances between samples and (ii) adding a Cauchy hashing loss function to accelerate neighbour searching. We evaluate our method on colorectal cancer histology slides and show that the confidence estimates are strongly correlated with model performance. Nearest neighbours are intuitive and useful for expert evaluation. They give insights into possible model failures, and can support clinical decision making by comparing archived images and patient records with the actual case.

Tingying Peng, Melanie Boxberg, Wilko Weichert, Nassir Navab, Carsten Marr
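
Component (i) above builds on the standard triplet objective, sketched here for reference (the paper's novel triplet sampling strategy and the Cauchy hashing loss are not reproduced):

```python
import torch
import torch.nn.functional as F

def triplet_loss(anchor, positive, negative, margin=0.5):
    """An anchor embedding should be closer to a same-class (positive)
    embedding than to a different-class (negative) one by `margin`.

    anchor, positive, negative : (B, D) embedding batches
    """
    d_pos = F.pairwise_distance(anchor, positive)
    d_neg = F.pairwise_distance(anchor, negative)
    return F.relu(d_pos - d_neg + margin).mean()
```
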
Multiclass Deep Active Learning for Detecting Red Blood Cell Subtypes in Brightfield Microscopy

The recent success of deep learning approaches relies partly on large amounts of well-annotated training data. For natural images, object annotation is easy and cheap. For biomedical images, however, annotation crucially depends on the availability of a trained expert whose time is typically expensive and scarce. To ensure efficient annotation, only the most relevant objects should be presented to the expert. Currently, no approach exists that allows selecting these for a multiclass detection problem. Here, we present an active learning framework that identifies the most relevant samples from a large set of unannotated data for further expert annotation. Applied to brightfield images of red blood cells with seven subtypes, we train a faster R-CNN for single-cell identification and classification, calculate a novel confidence score using dropout variational inference, and select relevant images for annotation based on (i) the confidence of the single-cell detection and (ii) the rareness of the classes contained in the image. We show that our approach leads to a drastic increase of prediction accuracy with only a few annotated images. Our approach improves classification of red blood cell subtypes and speeds up annotation. This important step in diagnosing blood diseases will profit from our framework, as will many other clinical challenges that suffer from the lack of annotated training data.

Ario Sadafi, Niklas Koehler, Asya Makhro, Anna Bogdanova, Nassir Navab, Carsten Marr, Tingying Peng
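
The confidence score itself is novel to the paper, but its dropout-variational-inference backbone is the usual Monte-Carlo dropout procedure, sketched below (safe as written for dropout-only networks; models with batch normalization need their norm layers kept in eval mode):

```python
import torch

def mc_dropout_predict(model, x, T=20):
    """Run T stochastic forward passes with dropout active and return
    the mean prediction plus its predictive entropy (low entropy =
    high confidence)."""
    model.train()   # keeps dropout sampling active at test time
    with torch.no_grad():
        probs = torch.stack([model(x).softmax(dim=1) for _ in range(T)])
    mean_p = probs.mean(dim=0)
    entropy = -(mean_p * mean_p.clamp_min(1e-8).log()).sum(dim=1)
    return mean_p, entropy
```
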
Enhanced Cycle-Consistent Generative Adversarial Network for Color Normalization of H&E Stained Images

Due to differences in tissue preparations, staining protocols and scanner models, stain colors of digitized histological images are excessively diverse. Color normalization is almost a necessary procedure for quantitative digital pathology analysis. Though several color normalization methods have been proposed, most of them depend on selection of representative templates and may fail in regions not matching the templates. We propose an enhanced cycle-GAN based method with a novel auxiliary input for the generator by computing a stain color matrix for every H&E image in the training set. The matrix guides the translation in the generator, and thus stabilizes the cycle consistency loss. We applied our proposed method as a pre-processing step for a breast metastasis classification task on a dataset from five medical centers and achieved the highest performance compared to other color normalization methods. Furthermore, our method is template-free and may be applied to other datasets without finetuning.

Niyun Zhou, De Cai, Xiao Han, Jianhua Yao
Nuclei Segmentation in Histopathological Images Using Two-Stage Learning

Nuclei segmentation is a fundamental and important task in histopathological image analysis. However, it still has some challenges such as difficulty in segmenting the overlapping or touching nuclei, and limited ability of generalization to different organs and tissue types. In this paper, we propose a novel nuclei segmentation approach based on a two-stage learning framework and Deep Layer Aggregation (DLA). We convert the original binary segmentation task into a two-step task by adding nuclei-boundary prediction (3-classes) as an intermediate step. To solve our two-step task, we design a two-stage learning framework by stacking two U-Nets. The first stage estimates nuclei and their coarse boundaries while the second stage outputs the final fine-grained segmentation map. Furthermore, we also extend the U-Nets with DLA by iteratively merging features across different levels. We evaluate our proposed method on two public diverse nuclei datasets. The experimental results show that our proposed approach outperforms many standard segmentation architectures and recently proposed nuclei segmentation methods, and can be easily generalized across different cell types in various organs.

Qingbo Kang, Qicheng Lao, Thomas Fevens
ACE-Net: Biomedical Image Segmentation with Augmented Contracting and Expansive Paths

Nowadays, U-Net-like FCNs predominate in various biomedical image segmentation applications and attain promising performance, largely due to their elegant architectures, e.g., symmetric contracting and expansive paths as well as lateral skip-connections. Devising novel architectures to further benefit segmentation remains an active research direction. In this paper, we develop an ACE-net that aims to enhance feature representation and utilization by augmenting the contracting and expansive paths. In particular, we augment the paths with recently proposed advanced techniques, including ASPP, dense connections and deep supervision mechanisms, and with novel connections such as directly connecting the raw image to the expansive side. With these augmentations, ACE-net can utilize features from multiple sources, scales and receptive fields for segmentation while still maintaining a relatively simple architecture. Experiments on two typical biomedical segmentation tasks validate its effectiveness, where highly competitive results are obtained in both tasks while ACE-net still runs fast at inference.

Yanhao Zhu, Zhineng Chen, Shuai Zhao, Hongtao Xie, Wenming Guo, Yongdong Zhang
CS-Net: Channel and Spatial Attention Network for Curvilinear Structure Segmentation

The detection of curvilinear structures in medical images, e.g., blood vessels or nerve fibers, is important in aiding the management of many diseases. In this work, we propose a general unifying curvilinear structure segmentation network that works on different medical imaging modalities: optical coherence tomography angiography (OCT-A), color fundus images, and corneal confocal microscopy (CCM). Instead of the U-Net based convolutional neural network, we propose a novel network (CS-Net) which includes a self-attention mechanism in the encoder and decoder. Two types of attention modules, spatial attention and channel attention, are utilized to further integrate local features with their global dependencies adaptively. The proposed network has been validated on five datasets: two color fundus datasets, two corneal nerve datasets and one OCT-A dataset. Experimental results show that our method outperforms state-of-the-art methods; for example, sensitivities for corneal nerve fiber segmentation were at least 2% higher than those of the competitors. As a complementary output, we made manual annotations of two corneal nerve datasets, which have been released for public access.

Lei Mou, Yitian Zhao, Li Chen, Jun Cheng, Zaiwang Gu, Huaying Hao, Hong Qi, Yalin Zheng, Alejandro Frangi, Jiang Liu
PseudoEdgeNet: Nuclei Segmentation only with Point Annotations

Nuclei segmentation is one of the important tasks for whole slide image analysis in digital pathology. With the drastic advance of deep learning, recent deep networks have demonstrated successful performance on the nuclei segmentation task. However, a major bottleneck to achieving good performance is the cost of annotation. A large network requires a large number of segmentation masks, and this annotation task falls to pathologists, not the public. In this paper, we propose a weakly supervised nuclei segmentation method which requires only point annotations for training. This method can scale to large training sets, as marking a point on a nucleus is much cheaper than drawing a fine segmentation mask. To this end, we introduce a novel auxiliary network, called PseudoEdgeNet, which guides the segmentation network to recognize nuclei edges even without edge annotations. We evaluate our method on two public datasets, and the results demonstrate that it consistently outperforms other weakly supervised methods.

Inwan Yoo, Donggeun Yoo, Kyunghyun Paeng
Adversarial Domain Adaptation and Pseudo-Labeling for Cross-Modality Microscopy Image Quantification

Cell or nucleus quantification has recently achieved state-of-the-art performance by using convolutional neural networks (CNNs). In general, training CNNs requires a large amount of annotated microscopy image data, which is prohibitively expensive or even impossible to obtain in some applications. Additionally, when applying a deep supervised model to new datasets, it is common to annotate individual cells in those target datasets for model re-training or fine-tuning, leading to low-throughput image analysis. In this paper, we propose a novel adversarial domain adaptation method for cell/nucleus quantification across multimodality microscopy image data. Specifically, we learn a fully convolutional network detector with task-specific cycle-consistent adversarial learning, which conducts pixel-level adaptation between source and target domains and then completes a cell/nucleus detection task. Next, we generate pseudo-labels on target training data using the detector trained with adapted source images and further fine-tune the detector towards the target domain to boost the performance. We evaluate the proposed method on multiple cross-modality microscopy image datasets and obtain a significant improvement in cell/nucleus detection compared to the reference baselines and a recent state-of-the-art deep domain adaptation approach. In addition, our method is very competitive with the fully supervised models trained with all real target training labels.

Fuyong Xing, Tell Bennett, Debashis Ghosh
Progressive Learning for Neuronal Population Reconstruction from Optical Microscopy Images

Reconstruction of 3D neuronal populations from optical microscopy images is essential to investigate neural pathways and functions. This task is challenging because of the low signal-to-noise ratio and non-continuous intensities of neurite segments in optical microscopy images. Recently, significant improvement has been made in neuron reconstruction due to the development of deep neural networks (DNNs). Training such a DNN usually relies on a large number of images with voxel-wise annotations, and annotating these 3D images is very costly in both money and labor. In this paper, we propose a progressive learning strategy to take advantage of both traditional neuron tracing methods and deep learning techniques. Traditional neuron tracing techniques, which do not require expensive manual annotations for dense neurites, are employed to produce pseudo labels for neuron voxels. With the pseudo labels, a deep segmentation network is trained to learn discriminative and comprehensive features for neuron voxel extraction from noisy backgrounds. The neuron tracing module and the segmentation network complement each other and are progressively improved to reconstruct more complete neuronal populations without using manual annotations. Moreover, we build a dataset called "VISoR-40" that consists of 40 optical microscopy 3D images from mouse cortical regions to demonstrate the superior performance of our progressive learning method. This dataset will be available at https://braindata.bitahub.com to support further study of deep learning techniques for brain exploration.

Jie Zhao, Xuejin Chen, Zhiwei Xiong, Dong Liu, Junjie Zeng, Yueyi Zhang, Zheng-Jun Zha, Guoqiang Bi, Feng Wu
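
The progressive strategy alternates between classical tracing and network training; a schematic sketch, with trace and train as hypothetical stand-ins for any tracer and training routine, could be:

def progressive_reconstruction(volumes, net, trace, train, rounds=3):
    # trace: classical tracer, volume -> voxel mask (no manual annotation)
    # train: fits the segmentation network on (volume, pseudo-label) pairs
    inputs = list(volumes)
    for _ in range(rounds):
        pseudo = [(v, trace(x)) for v, x in zip(volumes, inputs)]  # step 1
        train(net, pseudo)                                         # step 2
        # Network predictions become cleaner inputs for the next tracing pass.
        inputs = [net(v) for v in volumes]
    return [trace(x) for x in inputs]
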
Whole-Sample Mapping of Cancerous and Benign Tissue Properties

Structural and mechanical differences between cancerous and healthy tissue give rise to variations in macroscopic properties such as visual appearance and elastic modulus that show promise as signatures for early cancer detection. Atomic force microscopy (AFM) has been used to measure significant differences in stiffness between cancerous and healthy cells owing to its high force sensitivity and spatial resolution; however, due to absorption and scattering of light, it is often challenging to accurately locate where AFM measurements have been made on a bulk tissue sample. In this paper we describe an image registration method that localizes AFM elastic stiffness measurements with high-resolution images of haematoxylin and eosin (H&E)-stained tissue to within ±1.5 µm. RGB color images are segmented into three structure types (lumen, cells and stroma) by a neural network classifier trained on ground-truth pixel data obtained through k-means clustering in HSV color space. Using the localized stiffness maps and corresponding structural information, a whole-sample stiffness map is generated with a region matching and interpolation algorithm that associates similar structures with measured stiffness values. We present results showing significant differences in stiffness between healthy and cancerous liver tissue and discuss potential applications of this technique.

Lydia Neary-Zajiczek, Clara Essmann, Neil Clancy, Aiman Haider, Elena Miranda, Michael Shaw, Amir Gander, Brian Davidson, Delmiro Fernandez-Reyes, Vijay Pawar, Danail Stoyanov
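
The ground-truth labelling step (k-means in HSV space with k=3 for lumen, cells and stroma) is straightforward to sketch; the snippet below is an illustrative version using scikit-image and scikit-learn, not the authors' pipeline:

import numpy as np
from skimage.color import rgb2hsv
from sklearn.cluster import KMeans

def hsv_kmeans_labels(rgb_image, k=3, seed=0):
    hsv = rgb2hsv(rgb_image)              # (H, W, 3), values in [0, 1]
    pixels = hsv.reshape(-1, 3)
    km = KMeans(n_clusters=k, random_state=seed, n_init=10)
    labels = km.fit_predict(pixels)
    return labels.reshape(hsv.shape[:2])  # per-pixel cluster index
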
Multi-task Neural Networks with Spatial Activation for Retinal Vessel Segmentation and Artery/Vein Classification

Retinal artery/vein (A/V) classification plays a critical role in the clinical biomarker study of how various systemic and cardiovascular diseases affect the retinal vessels. Conventional methods for automated A/V classification are generally complicated and depend heavily on accurate vessel segmentation. In this paper, we propose a multi-task deep neural network with a spatial activation mechanism that segments the full retinal vessel tree, arteries and veins simultaneously, without requiring vessel segmentation as a pre-processing step. The input module of the network integrates the domain knowledge of widely used retinal preprocessing and vessel enhancement techniques. We specially customize the output block of the network with a spatial activation mechanism, which takes advantage of the relatively easier vessel segmentation task to boost the performance of A/V classification. In addition, deep supervision is introduced to help the low-level layers extract more semantic information. The proposed network achieves a pixel-wise accuracy of 95.70% for vessel segmentation and an A/V classification accuracy of 94.50%, which is state-of-the-art performance for both tasks on the AV-DRIVE dataset. Furthermore, we have also tested the model on the INSPIRE-AVR dataset, where it achieves a skeletal A/V classification accuracy of 91.6%.

Wenao Ma, Shuang Yu, Kai Ma, Jiexiang Wang, Xinghao Ding, Yefeng Zheng
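
One plausible reading of the spatial activation mechanism, sketched here as a toy PyTorch head (the authors' exact block may differ), is to let the vessel probability map re-weight features before A/V prediction:

import torch
import torch.nn as nn

class SpatialActivationHead(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.vessel = nn.Conv2d(channels, 1, kernel_size=1)  # easier task
        self.av = nn.Conv2d(channels, 3, kernel_size=1)      # artery/vein/background

    def forward(self, feats):
        vessel_prob = torch.sigmoid(self.vessel(feats))
        # Emphasize features at likely-vessel pixels before A/V prediction.
        activated = feats * (1 + vessel_prob)
        return vessel_prob, self.av(activated)
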
Fine-Scale Vessel Extraction in Fundus Images by Registration with Fluorescein Angiography

We present a new framework for fine-scale vessel segmentation from fundus images through registration and segmentation of corresponding fluorescein angiography (FA) images. In FA, fluorescent dye is used to highlight the vessels and increase their contrast. Since these highlights are temporally dispersed among multiple FA frames, we first register the FA frames and aggregate the per-frame segmentations to construct a detailed vessel mask. The constructed FA vessel mask is then registered to the fundus image based on an initial fundus vessel mask, and postprocessing is performed to refine the final vessel mask. Registration of the FA frames, as well as registration of the FA vessel mask to the fundus image, is performed by similar hierarchical coarse-to-fine frameworks, both comprising rigid and non-rigid registration. Two CNNs with identical network structures, both trained on public datasets but with different settings, are used for vessel segmentation. The resulting final vessel segmentation contains fine-scale, filamentary vessels extracted from FA and corresponding to the fundus image. We provide quantitative evaluation as well as qualitative examples that support the robustness and accuracy of the proposed method.

Kyoung Jin Noh, Sang Jun Park, Soochahn Lee
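
The aggregation of per-frame FA segmentations can be sketched as follows, with register(moving, fixed) standing in for the hierarchical coarse-to-fine registration (a placeholder, not the paper's implementation):

import numpy as np

def aggregate_fa_masks(frame_probs, register, threshold=0.5):
    # frame_probs: per-frame vessel probability maps from the CNN
    fixed = frame_probs[0]
    warped = [fixed] + [register(p, fixed) for p in frame_probs[1:]]
    # Dye highlights are dispersed over time, so the per-pixel maximum
    # across registered frames recovers the full vessel tree.
    fused = np.maximum.reduce(warped)
    return fused > threshold
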
DME-Net: Diabetic Macular Edema Grading by Auxiliary Task Learning

Diabetic macular edema (DME) is a consequence of diabetic retinopathy (DR), characterized by the abnormal accumulation of fluid and protein deposits in the macular region of the retina. Early detection and grading of DME is of great clinical significance, yet remains a challenging problem. In this work, we propose a highly accurate DME grading model that exploits macula and hard exudate detection results in an auxiliary learning manner. Specifically, we adopt XGBoost [4] as the classifier, which allows us to use different types of multi-scale features extracted by the multi-scale feature extraction models from the image, hard exudate mask, macula mask, and macula image. Experiments have been conducted on the IDRiD and Messidor datasets. Our model achieves a large improvement over previous methods, yielding an accuracy of 0.9417 on IDRiD and beating the champion method of the “Diabetic Retinopathy: Segmentation and Grading Challenge” [1]. It also produces a high overall performance on Messidor, obtaining scores of 0.9591, 0.9712, 0.9824 and 0.9633 in terms of sensitivity, specificity, AUC and accuracy, respectively.

Xiaodong He, Yi Zhou, Boyang Wang, Shanshan Cui, Ling Shao
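
Assuming the four feature sources named in the abstract have already been extracted as per-image vectors, the XGBoost grading stage might be set up roughly like this (hyperparameters are illustrative, not the paper's):

import numpy as np
from xgboost import XGBClassifier

def fit_dme_grader(feat_image, feat_exudate, feat_macula_mask,
                   feat_macula_image, grades):
    # Each feat_* is an (n_images, d_i) array from a feature extractor.
    X = np.concatenate([feat_image, feat_exudate,
                        feat_macula_mask, feat_macula_image], axis=1)
    clf = XGBClassifier(n_estimators=300, max_depth=4, learning_rate=0.1)
    clf.fit(X, grades)   # grades: one DME grade per image
    return clf
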
Attention Guided Network for Retinal Image Segmentation

Learning structural information is critical for producing an ideal result in retinal image segmentation. Recently, convolutional neural networks have shown a powerful ability to extract effective representations. However, convolutional and pooling operations filter out some useful structural information. In this paper, we propose an Attention Guided Network (AG-Net) to preserve structural information and guide the expanding operation. In our AG-Net, a guided filter is exploited as a structure-sensitive expanding path to transfer structural information from previous feature maps, and an attention block is introduced to exclude noise and further reduce the negative influence of the background. Extensive experiments on two retinal image segmentation tasks (i.e., blood vessel segmentation, and optic disc and cup segmentation) demonstrate the effectiveness of the proposed method.

Shihao Zhang, Huazhu Fu, Yuguang Yan, Yubing Zhang, Qingyao Wu, Ming Yang, Mingkui Tan, Yanwu Xu
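
A toy version of an attention block in this spirit (the guided-filter expanding path is omitted; this is an assumption-laden sketch, not AG-Net itself):

import torch.nn as nn

class AttentionBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, 1, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, feats):
        # Per-pixel weights near zero suppress noisy background features.
        return feats * self.gate(feats)
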
An Unsupervised Domain Adaptation Approach to Classification of Stem Cell-Derived Cardiomyocytes

The use of human embryonic stem cell-derived cardiomyocytes (hESC-CMs) in applications such as cardiac regenerative medicine requires understanding them in the context of adult CMs. Their classification in terms of the major adult CM phenotypes is a crucial step to build this understanding. However, this is a challenging problem due to the lack of labels for hESC-CMs. Adult CM phenotypes are easily distinguishable based on the shape of their action potentials (APs), but it is still unclear how these phenotypes are expressed in the APs of hESC-CM populations. Recently, a metamorphosis distance was proposed to measure similarities between hESC-CM APs and adult CM APs, which led to state-of-the-art performance when used in a one-nearest-neighbor scheme. However, its computation is prohibitively expensive for large datasets. A recurrent neural network (RNN) classifier was recently shown to be computationally more efficient than the metamorphosis-based method, but at the expense of accuracy. In this paper we argue that the APs of adult CMs and hESC-CMs intrinsically belong to different domains, and propose an unsupervised domain adaptation approach to train the RNN classifier. The idea is to capture the domain shift between hESC-CMs and adult CMs by adding a term to the loss function that penalizes their maximum mean discrepancy (MMD) in feature space. Experimental results on an unlabeled dataset of 6940 hESC-CMs show that our approach outperforms the state of the art in terms of both clustering quality and computational efficiency. Moreover, it achieves state-of-the-art classification accuracy on a completely different dataset without retraining, which demonstrates the generalization capacity of the proposed method.

Carolina Pacheco, René Vidal
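
The MMD penalty can be sketched directly; the Gaussian kernel bandwidth and loss weighting below are assumptions rather than the paper's settings:

import torch

def gaussian_mmd(source, target, sigma=1.0):
    # source, target: (n, d) feature batches from the RNN encoder.
    def k(a, b):
        return torch.exp(-torch.cdist(a, b) ** 2 / (2 * sigma ** 2))
    return (k(source, source).mean() + k(target, target).mean()
            - 2 * k(source, target).mean())

def total_loss(cls_loss, feats_adult, feats_hesc, lam=0.1):
    # Penalizing MMD pulls hESC-CM features towards the labeled adult domain.
    return cls_loss + lam * gaussian_mmd(feats_adult, feats_hesc)
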
Backmatter
Metadata
Title
Medical Image Computing and Computer Assisted Intervention – MICCAI 2019
Edited by
Dinggang Shen
Prof. Tianming Liu
Dr. Terry M. Peters
Lawrence H. Staib
Caroline Essert
Sean Zhou
Prof. Dr. Pew-Thian Yap
Ali Khan
Copyright year
2019
Electronic ISBN
978-3-030-32239-7
Print ISBN
978-3-030-32238-0
DOI
https://doi.org/10.1007/978-3-030-32239-7
