2020 | Book

Medical Image Computing and Computer Assisted Intervention – MICCAI 2020

23rd International Conference, Lima, Peru, October 4–8, 2020, Proceedings, Part V

Edited by: Prof. Anne L. Martel, Purang Abolmaesumi, Danail Stoyanov, Diana Mateus, Maria A. Zuluaga, S. Kevin Zhou, Daniel Racoceanu, Prof. Leo Joskowicz

Publisher: Springer International Publishing

Book series: Lecture Notes in Computer Science

About this book

The seven-volume set LNCS 12261, 12262, 12263, 12264, 12265, 12266, and 12267 constitutes the refereed proceedings of the 23rd International Conference on Medical Image Computing and Computer-Assisted Intervention, MICCAI 2020, held in Lima, Peru, in October 2020. The conference was held virtually due to the COVID-19 pandemic.

The 542 revised full papers presented were carefully reviewed and selected from 1809 submissions in a double-blind review process. The papers are organized in the following topical sections:

Part I: machine learning methodologies

Part II: image reconstruction; prediction and diagnosis; cross-domain methods and reconstruction; domain adaptation; machine learning applications; generative adversarial networks

Part III: CAI applications; image registration; instrumentation and surgical phase detection; navigation and visualization; ultrasound imaging; video image analysis

Part IV: segmentation; shape models and landmark detection

Part V: biological, optical, microscopic imaging; cell segmentation and stain normalization; histopathology image analysis; ophthalmology

Part VI: angiography and vessel analysis; breast imaging; colonoscopy; dermatology; fetal imaging; heart and lung imaging; musculoskeletal imaging

Part VII: brain development and atlases; DWI and tractography; functional brain networks; neuroimaging; positron emission tomography

Table of Contents

Frontmatter

Biological, Optical, Microscopic Imaging

Frontmatter
Channel Embedding for Informative Protein Identification from Highly Multiplexed Images

Interest is growing rapidly in using deep learning to classify biomedical images, and interpreting these deep-learned models is necessary for life-critical decisions and scientific discovery. Effective interpretation techniques accelerate biomarker discovery and provide new insights into the etiology, diagnosis, and treatment of disease. Most interpretation techniques aim to discover spatially-salient regions within images, but few techniques consider imagery with multiple channels of information. For instance, highly multiplexed tumor and tissue images have 30–100 channels and require interpretation methods that work across many channels to provide deep molecular insights. We propose a novel channel embedding method that extracts features from each channel. We then use these features to train a classifier for prediction. Using this channel embedding, we apply an interpretation method to rank the most discriminative channels. To validate our approach, we conduct an ablation study on a synthetic dataset. Moreover, we demonstrate that our method aligns with biological findings on highly multiplexed images of breast cancer cells while outperforming baseline pipelines. Code is available at https://sabdelmagid.github.io/miccai2020-project/ .

Salma Abdel Magid, Won-Dong Jang, Denis Schapiro, Donglai Wei, James Tompkin, Peter K. Sorger, Hanspeter Pfister
Demixing Calcium Imaging Data in C. elegans via Deformable Non-negative Matrix Factorization

Extracting calcium traces from the neurons of C. elegans is an important problem, enabling the study of individual neuronal activity and the large-scale dynamics that govern behavior. Traditionally, non-negative matrix factorization (NMF) methods have been successful in demixing and denoising cellular calcium activity in relatively motionless or pre-registered videos. However, in the case of C. elegans or other animal models where motion compensation methods fail to stabilize the effect of even mild motion in the imaging data, standard NMF methods fail to capture cellular footprints since these footprints are variable in time. In this work, we introduce deformable non-negative matrix factorization (dNMF), which models the motion trajectory of the underlying image space using a polynomial basis function. Spatial footprints and neural activity are optimized jointly with motion trajectories in a matrix tri-factorization setting. On simulated data, dNMF is demonstrated to outperform currently available demixing methods as well as methods that account for motion and demixing separately. Furthermore, we display the practical utility of our approach in extracting calcium traces from C. elegans microscopy videos. The extracted traces elucidate spontaneous neural activity as well as responses to stimuli. Open source code implementing this pipeline is available at https://github.com/amin-nejat/dNMF

Amin Nejatbakhsh, Erdem Varol, Eviatar Yemini, Vivek Venkatachalam, Albert Lin, Aravinthan D. T. Samuel, Oliver Hobert, Liam Paninski
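For orientation, the factorization at the core of dNMF can be contrasted with plain NMF demixing. The following is a minimal NumPy sketch of standard NMF with Lee-Seung multiplicative updates only; dNMF additionally warps the spatial footprints over time with a polynomial motion model inside a tri-factorization, which is omitted here, and all shapes and names are illustrative.

```python
import numpy as np

def nmf_demix(Y, k, n_iter=200, eps=1e-9):
    """Plain NMF baseline: factor a non-negative (pixels x time) movie Y
    into spatial footprints A (pixels x k) and activity traces C (k x time)
    via Lee-Seung multiplicative updates. dNMF additionally models a
    time-varying deformation of A (not shown)."""
    rng = np.random.default_rng(0)
    p, t = Y.shape
    A = rng.random((p, k))
    C = rng.random((k, t))
    for _ in range(n_iter):
        C *= (A.T @ Y) / (A.T @ A @ C + eps)   # update traces
        A *= (Y @ C.T) / (A @ C @ C.T + eps)   # update footprints
    return A, C

# toy usage: a 100-pixel, 50-frame movie demixed into 3 sources
Y = np.random.default_rng(1).random((100, 50))
A, C = nmf_demix(Y, k=3)
print(A.shape, C.shape)  # (100, 3) (3, 50)
```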
Automated Measurements of Key Morphological Features of Human Embryos for IVF

A major challenge in clinical In-Vitro Fertilization (IVF) is selecting the highest quality embryo to transfer to the patient in the hopes of achieving a pregnancy. Time-lapse microscopy provides clinicians with a wealth of information for selecting embryos. However, the resulting movies of embryos are currently analyzed manually, which is time consuming and subjective. Here, we automate feature extraction of time-lapse microscopy of human embryos with a machine-learning pipeline of five convolutional neural networks (CNNs). Our pipeline consists of (1) semantic segmentation of the regions of the embryo, (2) regression predictions of fragment severity, (3) classification of the developmental stage, and object instance segmentation of (4) cells and (5) pronuclei. Our approach greatly speeds up the measurement of quantitative, biologically relevant features that may aid in embryo selection.

Brian D. Leahy, Won-Dong Jang, Helen Y. Yang, Robbert Struyven, Donglai Wei, Zhe Sun, Kylie R. Lee, Charlotte Royston, Liz Cam, Yael Kalma, Foad Azem, Dalit Ben-Yosef, Hanspeter Pfister, Daniel Needleman
A Novel Approach to Tongue Standardization and Feature Extraction

Fungiform papillae are large protrusions on the human tongue and contain many taste-buds. Most are found on the tip and the sides of the tongue, and their distribution varies from person to person. In this paper, we introduce a tongue-based coordinate system to investigate the density and other features of fungiform papillae on the surface of the tongue. A traditional method for estimating the density of fungiform papillae is to count the papillae in either a manually selected area or a predefined grid of areas on the tongue. However, depending on how a person presents his or her tongue in a specific image (such as narrowing, widening, and bending), this can cause visual variations in both the papillae’s apparent positions and apparent shapes, which in turn also affects the counts obtained within an area. By transforming the individual tongues into a standardized tongue, our tongue coordinate system minimizes these variations more effectively than current alignment-based methods. We further hypothesize an underlying fungiform papillae distribution for each tongue, which we estimate and use to perform statistical analysis on the different tongue categories. For this, we consider a cohort of 152 persons and the following variables: gender, ethnicity, ability to taste 6-n-propylthiouracil, and texture preference. Our results indicate possible new relations between the distribution of fungiform papillae and some of the aforementioned variables.

Chenhao Wang, Camilla Cattaneo, Jing Liu, Wender Bredie, Ella Pagliarini, Jon Sporring
Patch-Based Non-local Bayesian Networks for Blind Confocal Microscopy Denoising

Confocal microscopy is essential for histopathologic cell visualization and quantification. Despite its significant role in biology, fluorescence confocal microscopy suffers from the presence of inherent noise during image acquisition. Non-local patch-wise Bayesian mean filtering (NLB) was until recently the state-of-the-art denoising approach. However, classic denoising methods have been outperformed by neural networks in recent years. In this work, we propose to exploit the strengths of NLB in the framework of Bayesian deep learning. We do so by designing a convolutional neural network and training it to learn parameters of a Gaussian model approximating the prior on noise-free patches given their nearest, similar yet non-local, neighbors. We then apply Bayesian reasoning to leverage the prior and information from the noisy patch in the process of approximating the noise-free patch. Specifically, we use the closed-form analytic maximum a posteriori (MAP) estimate in the NLB algorithm to obtain the noise-free patch that maximizes the posterior distribution. The performance of our proposed method is evaluated on confocal microscopy images with real Poisson-Gaussian noise. Our experiments reveal the competitive performance of our approach compared to the state-of-the-art.

Saeed Izadi, Ghassan Hamarneh
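The closed-form MAP step mentioned in the abstract above is the standard Gaussian conjugate update. A minimal NumPy sketch, assuming the network has already produced a prior mean mu and covariance Sigma for the patch (random placeholders below) and i.i.d. Gaussian noise of variance sigma2:

```python
import numpy as np

def nlb_map_estimate(y, mu, Sigma, sigma2):
    """MAP estimate of a noise-free patch x from a noisy patch
    y ~ N(x, sigma2*I) under a Gaussian prior x ~ N(mu, Sigma):
        x_hat = mu + Sigma (Sigma + sigma2*I)^-1 (y - mu)."""
    d = y.shape[0]
    gain = Sigma @ np.linalg.inv(Sigma + sigma2 * np.eye(d))
    return mu + gain @ (y - mu)

# toy usage on a flattened 5x5 patch with placeholder prior parameters
rng = np.random.default_rng(0)
d = 25
A = rng.standard_normal((d, d))
Sigma = A @ A.T / d                 # placeholder SPD prior covariance
mu = rng.standard_normal(d)         # placeholder prior mean
y = mu + 0.1 * rng.standard_normal(d)
print(nlb_map_estimate(y, mu, Sigma, sigma2=0.01).shape)  # (25,)
```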
Attention-Guided Quality Assessment for Automated Cryo-EM Grid Screening

Cryogenic electron microscopy (cryo-EM) has become an enabling technology in drug discovery and in understanding molecular bases of disease by producing near-atomic resolution (less than 0.4 nm) 3D reconstructions of biological macro-molecules. The imaging process required for 3D reconstructions involves a highly iterative and empirical screening process, starting with the acquisition of low magnification images of the cryo-EM grids. These images are inspected for squares that are likely to contain useful molecular signals. Potentially useful squares within the grid are then imaged at progressively higher magnifications, with the goal of identifying sub-micron areas within circular holes (bounded by the squares) for imaging at high magnification. This arduous, multi-step data acquisition process represents a bottleneck for high-throughput data collection. Here, we focus on automating the early decision making for the microscope operator, scoring low magnification images of squares, and proposing the first deep learning framework, XCryoNet, for automated cryo-EM grid screening. XCryoNet is a semi-supervised, attention-guided deep learning approach that provides explainable scoring of automatically extracted square images using limited amounts of labeled data. Results show up to 8% and 37% improvements over a fully supervised and a no-attention solution, respectively, when labeled data is scarce.

Hong Xu, David E. Timm, Shireen Y. Elhabian
MitoEM Dataset: Large-Scale 3D Mitochondria Instance Segmentation from EM Images

Electron microscopy (EM) allows the identification of intracellular organelles such as mitochondria, providing insights for clinical and scientific studies. However, public mitochondria segmentation datasets only contain hundreds of instances with simple shapes. It is unclear if existing methods achieving human-level accuracy on these small datasets are robust in practice. To this end, we introduce the MitoEM dataset, a 3D mitochondria instance segmentation dataset with two (30 μm)³ volumes from human and rat cortices respectively, 3,600× larger than previous benchmarks. With around 40K instances, we find a great diversity of mitochondria in terms of shape and density. For evaluation, we tailor the implementation of the average precision (AP) metric for 3D data with a 45× speedup. On MitoEM, we find existing instance segmentation methods often fail to correctly segment mitochondria with complex shapes or close contacts with other instances. Thus, our MitoEM dataset poses new challenges to the field. We release our code and data: https://donglaiw.github.io/page/mitoEM/index.html .

Donglai Wei, Zudi Lin, Daniel Franco-Barranco, Nils Wendt, Xingyu Liu, Wenjie Yin, Xin Huang, Aarush Gupta, Won-Dong Jang, Xueying Wang, Ignacio Arganda-Carreras, Jeff W. Lichtman, Hanspeter Pfister
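At the heart of the AP metric is the matching of predicted to ground-truth instances by intersection-over-union. The toy sketch below shows a naive greedy matcher on small label volumes; the paper's 3D implementation is heavily optimized, and this version is purely illustrative.

```python
import numpy as np

def match_instances(pred, gt, iou_thr=0.5):
    """Greedy IoU matching of instance label volumes (0 = background).
    Returns (TP, FP, FN), the ingredients of precision/recall and AP."""
    pred_ids = [i for i in np.unique(pred) if i != 0]
    gt_ids = [i for i in np.unique(gt) if i != 0]
    matched, tp = set(), 0
    for p in pred_ids:
        pm = pred == p
        best_iou, best_g = 0.0, None
        for g in gt_ids:
            if g in matched:
                continue
            gm = gt == g
            union = np.logical_or(pm, gm).sum()
            iou = np.logical_and(pm, gm).sum() / union if union else 0.0
            if iou > best_iou:
                best_iou, best_g = iou, g
        if best_iou >= iou_thr:
            tp += 1
            matched.add(best_g)
    return tp, len(pred_ids) - tp, len(gt_ids) - tp

pred = np.zeros((8, 8), int); pred[1:4, 1:4] = 1; pred[5:8, 5:8] = 2
gt = np.zeros((8, 8), int); gt[1:4, 1:4] = 1
print(match_instances(pred, gt))  # (1, 1, 0): one TP, one FP, no FN
```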
Learning Guided Electron Microscopy with Active Acquisition

Single-beam scanning electron microscopes (SEM) are widely used to acquire massive datasets for biomedical study, material analysis, and fabrication inspection. Datasets are typically acquired with uniform acquisition: applying the electron beam with the same power and duration to all image pixels, even if there is great variety in the pixels’ importance for eventual use. Many SEMs are now able to move the beam to any pixel in the field of view without delay, enabling them, in principle, to invest their time budget more effectively with non-uniform imaging. In this paper, we show how to use deep learning to accelerate and optimize single-beam SEM acquisition of images. Our algorithm rapidly collects an information-lossy image (e.g. low resolution) and then applies a novel learning method to identify a small subset of pixels to be collected at higher resolution based on a trade-off between saliency and spatial diversity. We demonstrate the efficacy of this novel technique for active acquisition by speeding up the task of collecting connectomic datasets for neurobiology by up to an order of magnitude. Code is available at https://github.com/lumi9587/learning-guided-SEM .

Lu Mi, Hao Wang, Yaron Meirovitch, Richard Schalek, Srinivas C. Turaga, Jeff W. Lichtman, Aravinthan D. T. Samuel, Nir Shavit
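The saliency-versus-diversity trade-off described above can be illustrated with a simple greedy selection rule. In the sketch below the saliency map is taken as given (the paper learns it from the lossy image), and the trade-off weight lam is an assumed parameter:

```python
import numpy as np

def select_pixels(saliency, coords, k, lam=0.05):
    """Greedily pick k acquisition locations scoring high on saliency
    while staying spatially spread out: each step maximizes
    saliency + lam * distance-to-nearest-already-chosen-point."""
    chosen = [int(np.argmax(saliency))]          # first pick: pure saliency
    min_dist = np.linalg.norm(coords - coords[chosen[0]], axis=1)
    for _ in range(k - 1):
        score = saliency + lam * min_dist
        score[chosen] = -np.inf                  # never re-pick a pixel
        i = int(np.argmax(score))
        chosen.append(i)
        min_dist = np.minimum(min_dist,
                              np.linalg.norm(coords - coords[i], axis=1))
    return chosen

# toy usage on a 32x32 saliency map
rng = np.random.default_rng(0)
H = W = 32
sal = rng.random(H * W)
xy = np.stack(np.meshgrid(np.arange(H), np.arange(W), indexing="ij"),
              axis=-1).reshape(-1, 2).astype(float)
print(select_pixels(sal, xy, k=10)[:5])
```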
Neuronal Subcompartment Classification and Merge Error Correction

Recent advances in 3d electron microscopy are yielding ever larger reconstructions of brain tissue, encompassing thousands of individual neurons interconnected by millions of synapses. Interpreting reconstructions at this scale demands advances in the automated analysis of neuronal morphologies, for example by identifying morphological and functional subcompartments within neurons. We present a method that for the first time uses full 3d input (voxels) to automatically classify reconstructed neuron fragments as axon, dendrite, or somal subcompartments. Based on 3d convolutional neural networks, this method achieves a mean f1-score of 0.972, exceeding the previous state of the art of 0.955. The resulting predictions can support multiple analysis and proofreading applications. In particular, we leverage finely localized subcompartment predictions for automated detection and correction of merge errors in the volume reconstruction, successfully detecting 90.6% of inter-class merge errors with a false positive rate of only 2.7%.

Hanyu Li, Michał Januszewski, Viren Jain, Peter H. Li
Microtubule Tracking in Electron Microscopy Volumes

We present a method for microtubule tracking in electron microscopy volumes. Our method first identifies a sparse set of voxels that likely belong to microtubules. Similar to prior work, we then enumerate potential edges between these voxels, which we represent in a candidate graph. Tracks of microtubules are found by selecting nodes and edges in the candidate graph by solving a constrained optimization problem incorporating biological priors on microtubule structure. For this, we present a novel integer linear programming formulation, which results in speed-ups of three orders of magnitude and an increase of 53% in accuracy compared to prior art (evaluated on three 1.2 × 4 × 4 μm volumes of Drosophila neural tissue). We also propose a scheme to solve the optimization problem in a block-wise fashion, which allows distributed tracking and is necessary to process very large electron microscopy volumes. Finally, we release a benchmark dataset for microtubule tracking, here used for training, testing and validation, consisting of eight 30 × 1000 × 1000 voxel blocks (1.2 × 4 × 4 μm) of densely annotated microtubules in the CREMI data set ( https://github.com/nilsec/micron ).

Nils Eckstein, Julia Buhmann, Matthew Cook, Jan Funke
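To give a flavor of the constrained optimization, here is a toy node-and-edge selection ILP over a hand-made candidate graph, written with the PuLP solver. The paper's actual formulation, variables, and biological priors differ; the scores below are placeholders standing in for learned evidence.

```python
# pip install pulp
from pulp import LpProblem, LpMaximize, LpVariable, lpSum, PULP_CBC_CMD

# Toy candidate graph: scores would come from a microtubule evidence model.
nodes = {0: 0.9, 1: 0.8, 2: 0.7, 3: 0.2}
edges = {(0, 1): 0.9, (1, 2): 0.6, (2, 3): -0.5}

prob = LpProblem("microtubule_tracking", LpMaximize)
x = {n: LpVariable(f"node_{n}", cat="Binary") for n in nodes}
y = {e: LpVariable(f"edge_{e[0]}_{e[1]}", cat="Binary") for e in edges}

# Objective: total evidence of the selected nodes and edges.
prob += (lpSum(s * x[n] for n, s in nodes.items())
         + lpSum(s * y[e] for e, s in edges.items()))

for (u, v) in edges:                 # an edge needs both of its endpoints
    prob += y[(u, v)] <= x[u]
    prob += y[(u, v)] <= x[v]
for n in nodes:                      # chain-like prior: degree <= 2
    prob += lpSum(y[e] for e in edges if n in e) <= 2 * x[n]

prob.solve(PULP_CBC_CMD(msg=0))
print([e for e in edges if y[e].value() == 1])  # selected track edges
```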
Leveraging Tools from Autonomous Navigation for Rapid, Robust Neuron Connectivity

As biological imaging datasets continue to grow in size, extracting information from large image volumes presents a computationally intensive challenge. State-of-the-art algorithms are almost entirely dominated by the use of convolutional neural network approaches that may be difficult to run at scale given schedule, cost, and resource limitations. We demonstrate a novel solution for high-resolution electron microscopy brain image volumes that permits the identification of individual neurons and synapses. Instead of conventional approaches where voxels are labelled according to the neuron or neuron segment to which they belong, we instead focus on extracting the underlying brain graph represented by synaptic connections between individual neurons, while also identifying key features like skeleton similarity and path length. This graph represents a critical step and scaffold for understanding the structure of neuronal circuitry. Our approach, which we call Agents, recasts the segmentation problem to one of path finding between keypoints (i.e., connectivity) in an information sharing framework using virtual agents. We create a family of sensors which follow local decision-making rules that perform computationally cheap operations on potential fields to perform tasks such as avoiding cell membranes and finding synapses. These enable a swarm of virtual agents to efficiently and robustly traverse three-dimensional datasets, create a sparse segmentation of pathways, and capture connectivity information. We achieve results that meet or exceed state-of-the-art performance at a substantially lower computational cost. Agents offers a categorically different approach to connectome estimation that can augment how we extract connectivity information at scale. Our method is generalizable and may be extended to biomedical imaging problems such as tracing the bronchial trees in lungs or road networks in natural images.

Nathan Drenkow, Justin Joyce, Jordan Matelsky, Jennifer Heiko, Reem Larabi, Brock Wester, Dean Kleissas, William Gray-Roncal
Statistical Atlas of C. elegans Neurons

Constructing a statistical atlas of neuron positions in the nematode Caenorhabditis elegans enables a wide range of applications that require neural identity. These applications include annotating gene expression, extracting calcium activity, and evaluating nervous-system mutations. Large complete sets of neural annotations are necessary to determine canonical neuron positions and their associated confidence regions. Recently, a transgene of C. elegans (“NeuroPAL”) has been introduced to assign correct identities to all neurons in the worm via a deterministic, fluorescent colormap. This strain has enabled efficient and accurate annotation of worm neurons. Using a dataset of 10 worms, we propose a statistical model that captures the latent means and covariances of neuron locations, with efficient optimization strategies to infer model parameters. We demonstrate the utility of this model in two critical applications. First, we use our trained atlas to automatically annotate neuron identities in C. elegans at the state-of-the-art rate. Second, we use our atlas to compute correlations between neuron positions, thereby determining covariance in neuron placement. The code to replicate the statistical atlas is distributed publicly at https://github.com/amin-nejat/StatAtlas .

Erdem Varol, Amin Nejatbakhsh, Ruoxi Sun, Gonzalo Mena, Eviatar Yemini, Oliver Hobert, Liam Paninski
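The core of such an atlas is the per-neuron mean and covariance of positions across annotated worms. A minimal NumPy sketch, assuming the worms are already registered to a common coordinate frame and fully annotated (the paper's model also handles alignment and inference jointly):

```python
import numpy as np

def fit_position_atlas(positions):
    """Fit a Gaussian per neuron from annotated worms.
    positions: (n_worms, n_neurons, 3) coordinates, assumed registered.
    Returns means (n_neurons, 3) and covariances (n_neurons, 3, 3)."""
    mu = positions.mean(axis=0)
    centered = positions - mu
    cov = np.einsum("wni,wnj->nij", centered, centered) / (len(positions) - 1)
    return mu, cov

# toy usage: 10 worms, 5 neurons
P = np.random.default_rng(0).standard_normal((10, 5, 3))
mu, cov = fit_position_atlas(P)
print(mu.shape, cov.shape)  # (5, 3) (5, 3, 3)
```

The per-neuron covariances give the confidence regions mentioned above; the cross-neuron correlations the paper reports require covariances between different neurons, which are not computed in this sketch.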
Probabilistic Joint Segmentation and Labeling of C. elegans Neurons

Automatic identification and segmentation of the neurons of C. elegans enables evaluating nervous system mutations, positional variability, and allows us to conduct high-throughput population studies employing many animals. A recently introduced transgene of C. elegans, named “NeuroPAL” has enabled the efficient annotation of neurons and the construction of a statistical atlas of their positions. Previous atlas-based segmentation approaches have modeled images of cells as a mixture model. The expectation-maximization (EM) algorithm and its variants are used to find the (local) maximum likelihood parameters for this class of models. We present a variation of the EM algorithm called Sinkhorn-EM (sEM) that uses regularized optimal transport Sinkhorn iterations to enforce constraints on the marginals of the joint distribution of observed variables and latent assignments in order to incorporate our prior information about cell sizes into the cluster-data assignment proportions. We apply our method to the problem of segmenting and labeling neurons in fluorescent microscopy images of C. elegans specimens. We show empirically that sEM outperforms vanilla EM and a recently proposed 3-step (filter, detect, identify) labeling approach. Open source code implementing this method is available at https://github.com/amin-nejat/SinkhornEM .

Amin Nejatbakhsh, Erdem Varol, Eviatar Yemini, Oliver Hobert, Liam Paninski
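The Sinkhorn iterations at the heart of sEM rescale an assignment kernel until its marginals match prescribed masses. A minimal NumPy sketch, where the kernel K and cluster-mass vector c are toy placeholders for the per-pixel cluster likelihoods and prior cell-size proportions:

```python
import numpy as np

def sinkhorn(K, r, c, n_iter=100):
    """Sinkhorn scaling: find diagonal scalings u, v so that
    T = diag(u) K diag(v) has row sums r and column sums c."""
    u = np.ones(K.shape[0])
    for _ in range(n_iter):
        v = c / (K.T @ u)
        u = r / (K @ v)
    return u[:, None] * K * v[None, :]

# toy usage: 6 pixels, 2 clusters whose prior masses are 2:1
K = np.random.default_rng(0).random((6, 2)) + 1e-3
T = sinkhorn(K, r=np.full(6, 1 / 6), c=np.array([2 / 3, 1 / 3]))
print(T.sum(axis=0))  # approximately [0.667, 0.333]
```

In sEM, a plan of this form replaces the unconstrained posterior responsibilities of the E-step, which is how the cell-size prior enters the cluster-data assignment proportions.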
Segmenting Continuous but Sparsely-Labeled Structures in Super-Resolution Microscopy Using Perceptual Grouping

Super Resolution (SR) microscopy leverages a variety of optical and computational techniques for overcoming the optical diffraction limit to acquire additional spatial details. However, added spatial details challenge existing segmentation tools. Confounding features include protein distributions that form membranes and boundaries, such as cellular and nuclear surfaces. We present a segmentation pipeline that retains the benefits provided by SR in surface separation while providing a tensor field to overcome these confounding features. The proposed technique leverages perceptual grouping to generate a tensor field that enables robust evolution of active contours despite ill-defined membrane boundaries.

Jiabing Li, Camille Artur, Jason Eriksen, Badrinath Roysam, David Mayerich
DISCo: Deep Learning, Instance Segmentation, and Correlations for Cell Segmentation in Calcium Imaging

Calcium imaging is one of the most important tools in neurophysiology as it enables the observation of neuronal activity for hundreds of cells in parallel and at single-cell resolution. In order to use the data gained with calcium imaging, it is necessary to extract individual cells and their activity from the recordings. We present DISCo, a novel approach for the cell segmentation in calcium imaging videos. We use temporal information from the recordings in a computationally efficient way by computing correlations between pixels and combine it with shape-based information to identify active as well as non-active cells. We first learn to predict whether two pixels belong to the same cell; this information is summarized in an undirected, edge-weighted graph which we then partition. Evaluating our method on the Neurofinder public benchmark shows that DISCo outperforms all existing models trained on these datasets.

Elke Kirschbaum, Alberto Bailoni, Fred A. Hamprecht
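The temporal correlations DISCo starts from are cheap to compute between neighboring pixels. A minimal NumPy sketch on a (T, H, W) movie; the full method feeds such correlations, together with shape cues, into a learned edge weighting and a graph partitioning step:

```python
import numpy as np

def neighbor_correlations(movie):
    """Pearson correlation over time between each pixel and its right /
    bottom neighbor; high correlation suggests the two pixels belong
    to the same active cell."""
    z = movie - movie.mean(axis=0)
    z /= z.std(axis=0) + 1e-8
    corr_right = (z[:, :, :-1] * z[:, :, 1:]).mean(axis=0)  # (H, W-1)
    corr_down = (z[:, :-1, :] * z[:, 1:, :]).mean(axis=0)   # (H-1, W)
    return corr_right, corr_down

movie = np.random.default_rng(0).random((100, 16, 16))
r, d = neighbor_correlations(movie)
print(r.shape, d.shape)  # (16, 15) (15, 16)
```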
Isotropic Reconstruction of 3D EM Images with Unsupervised Degradation Learning

The isotropic reconstruction of 3D electron microscopy (EM) images with low axial resolution is of great importance for biological analysis. Existing deep learning-based methods rely on handcrafted down-scaled training data, which does not model the real degradation accurately and thus leads to unsatisfying performance in practice. To address this problem, we propose a universal and unsupervised framework to simultaneously learn the real axial degradation and the isotropic reconstruction of 3D EM images. First, we train a degradation network using unpaired low-resolution (LR) and high-resolution (HR) slices, both of which are from real data, in an adversarial manner. Then, the degradation network is further used to generate realistic LR data from HR labels to form paired training data. In this way, the generated degraded data is consistent with the real axial degradation process, which guarantees the generalization ability of subsequent reconstruction networks to the real data. Our framework has the flexibility to work with different existing reconstruction methods. Experiments on both simulated and real anisotropic EM images validate the superiority of our framework.

Shiyu Deng, Xueyang Fu, Zhiwei Xiong, Chang Chen, Dong Liu, Xuejin Chen, Qing Ling, Feng Wu
Background and Illumination Correction for Time-Lapse Microscopy Data with Correlated Foreground

Due to the inherent imperfections in the optical path, microscopy images, particularly fluorescence microscopy images, are often skewed by uneven illumination and hence have spurious intensity variation, also known as shading or vignetting effect. Besides spatial intensity inhomogeneity, time-lapse microscopy imaging further suffers from background variation in time, mostly due to photo-bleaching of the background medium. Moreover, the temporal background variation is often experiment-specific and hence cannot be easily corrected, in contrast to shading, where a prospective calibration method can be used. Existing retrospective illumination correction methods, ranging from simple multi-image averaging to sophisticated optimisation based methods such as CIDRE and BaSiC, all assume that the foreground of all images is uncorrelated between each other. However, this assumption is violated in e.g. long-term time-lapse microscopy imaging of adherent stem cells, in which a strong foreground correlation is observed from frame to frame. In this paper, we propose a new illumination and background correction method for time-lapse imaging, based on low-rank and sparse decomposition. We incorporate binary segmentation masks that inform the weighting scheme of our reweighted L1 norm minimisation about foreground vs background pixels in the image. This yields a better separation of the low-rank and sparse component, hence improving the estimation of illumination profiles. Experiments on both simulated and real time-lapse data demonstrate that our approach is superior to existing illumination correction methods and improves single cell quantification.

Tingying Peng, Lorenz Lamm, Dirk Loeffler, Nouraiz Ahmed, Nassir Navab, Timm Schroeder, Carsten Marr
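A toy version of the low-rank + sparse decomposition, with a segmentation mask steering the sparsity weights, can be written with alternating thresholding. This is a crude stand-in for the paper's reweighted-L1 solver; the weighting scheme and all parameters below are assumptions for illustration.

```python
import numpy as np

def soft(x, t):
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def masked_rpca(D, fg_mask, lam=0.1, rank_thr=1.0, n_iter=50):
    """Decompose D (pixels x frames) into low-rank L (illumination and
    background) plus sparse S (foreground) by alternating singular-value
    thresholding and weighted soft-thresholding. fg_mask (1 = segmented
    foreground) lowers the L1 weight there so cells end up in S
    (an assumed stand-in for the paper's reweighting)."""
    S = np.zeros_like(D)
    w = lam * (1.0 - 0.9 * fg_mask)     # smaller threshold on foreground
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(D - S, full_matrices=False)
        L = (U * soft(s, rank_thr)) @ Vt
        S = soft(D - L, w)
    return L, S

# toy usage: rank-1 background plus sparse spots, 64 pixels x 20 frames
rng = np.random.default_rng(0)
D = np.outer(rng.random(64), np.ones(20)) + 0.3 * (rng.random((64, 20)) > 0.9)
L, S = masked_rpca(D, fg_mask=np.zeros_like(D))
print(np.linalg.matrix_rank(np.round(L, 3)), (np.abs(S) > 0).mean())
```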
Joint Spatial-Wavelet Dual-Stream Network for Super-Resolution

Super-Resolution (SR) techniques can compensate for the missing information of low-resolution images and further help experts and algorithms to make accurate diagnosis decisions. Although existing pixel-loss based SR works produce high-resolution images with impressive objective metrics, the over-smoothed content that loses high-frequency information can disturb the visual experience and the subsequent diagnosis. To address this issue, we propose a joint Spatial-Wavelet super-resolution Network (SWD-Net) with a collaborative dual stream. In the spatial stage, a Refined Context Fusion (RCF) is proposed to iteratively rectify the features by a counterpart stream with compensative receptive fields. After that, the wavelet stage enhances the reconstructed images, especially the structural boundaries. Specifically, we design a tailor-made Wavelet Features Adaptation (WFA) to adjust the wavelet coefficients for better compatibility with networks and Wavelet-Aware Convolutional blocks (WAC) to exploit features in the wavelet domain efficiently. We further introduce wavelet coefficient supervision together with the traditional spatial loss to jointly optimize the network and obtain high-frequency enhanced SR images. To evaluate SR for medical images, we build a benchmark dataset with histopathology images and evaluate the proposed SWD-Net under different settings. Comprehensive experiments demonstrate that our SWD-Net outperforms state-of-the-art methods. Furthermore, SWD-Net is shown to improve medical image diagnosis by a large margin. The source code and dataset are available at https://github.com/franciszchen/SWD-Net .

Zhen Chen, Xiaoqing Guo, Chen Yang, Bulat Ibragimov, Yixuan Yuan
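The wavelet-coefficient supervision can be illustrated as an L1 loss over DWT sub-bands. A NumPy/PyWavelets sketch; a real training pipeline would use a differentiable wavelet transform, and the Haar wavelet choice here is an assumption:

```python
# pip install PyWavelets
import numpy as np
import pywt

def wavelet_loss(sr, hr, wavelet="haar"):
    """Compare a super-resolved image with its ground truth in the
    wavelet domain: mean absolute error over the approximation and
    the three detail sub-bands, which emphasizes structural edges."""
    sA, (sH, sV, sD) = pywt.dwt2(sr, wavelet)
    hA, (hH, hV, hD) = pywt.dwt2(hr, wavelet)
    bands = zip((sA, sH, sV, sD), (hA, hH, hV, hD))
    return sum(np.abs(a - b).mean() for a, b in bands)

rng = np.random.default_rng(0)
hr = rng.random((64, 64))
sr = hr + 0.05 * rng.standard_normal((64, 64))
print(round(float(wavelet_loss(sr, hr)), 4))
```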
Towards Neuron Segmentation from Macaque Brain Images: A Weakly Supervised Approach

The advance of microscopic imaging technology has enabled the observation of the primate brain in its entirety and at single-neuron resolution. There is thus an urgent need to develop means for automated analyses of these brain images, e.g. neuron segmentation. Deep learning has proven an appealing approach for the segmentation of natural images, but its success depends heavily on large-scale, well-built training data that are costly to collect. In this paper, we take a step towards the goal of neuron segmentation from primate brain images using a weakly supervised approach. We build – to the best of our knowledge – the first dual-channel three-dimensional image dataset of macaque brain for neuron segmentation. We propose two kinds of “weak” labels, i.e. central points and rough masks, to prepare training data at an affordable cost. Accordingly, we design a weakly supervised learning method for neuron instance segmentation where instances can be easily extracted from the predicted peak-shape probability maps. Experimental results show the effectiveness of our approach. We also verify the efficiency of the proposed method on a public nuclei dataset. Our dataset and code have been published at https://braindata.bitahub.com/ .

Meng Dong, Dong Liu, Zhiwei Xiong, Xuejin Chen, Yueyi Zhang, Zheng-Jun Zha, Guoqiang Bi, Feng Wu
3D Reconstruction and Segmentation of Dissection Photographs for MRI-Free Neuropathology

Neuroimaging to neuropathology correlation (NTNC) promises to enable the transfer of microscopic signatures of pathology to in vivo imaging with MRI, ultimately enhancing clinical care. NTNC traditionally requires a volumetric MRI scan, acquired either ex vivo or a short time prior to death. Unfortunately, ex vivo MRI is difficult and costly, and recent premortem scans of sufficient quality are seldom available. To bridge this gap, we present methodology to 3D reconstruct and segment full brain image volumes from brain dissection photographs, which are routinely acquired at many brain banks and neuropathology departments. The 3D reconstruction is achieved via a joint registration framework, which uses a reference volume other than MRI. This volume may represent either the sample at hand (e.g., a surface 3D scan) or the general population (a probabilistic atlas). In addition, we present a Bayesian method to segment the 3D reconstructed photographic volumes into 36 neuroanatomical structures, which is robust to nonuniform brightness within and across photographs. We evaluate our methods on a dataset with 24 brains, using Dice scores and volume correlations. The results show that dissection photography is a valid replacement for ex vivo MRI in many volumetric analyses, opening an avenue for MRI-free NTNC, including retrospective data. The code is available at https://github.com/htregidgo/DissectionPhotoVolumes .

Henry F. J. Tregidgo, Adrià Casamitjana, Caitlin S. Latimer, Mitchell D. Kilgore, Eleanor Robinson, Emily Blackburn, Koen Van Leemput, Bruce Fischl, Adrian V. Dalca, Christine L. Mac Donald, C. Dirk Keene, Juan Eugenio Iglesias
DistNet: Deep Tracking by Displacement Regression: Application to Bacteria Growing in the Mother Machine

The mother machine is a popular microfluidic device that allows long-term time-lapse imaging of thousands of cells in parallel by microscopy. It has become a valuable tool for single-cell level quantitative analysis and characterization of many cellular processes such as gene expression and regulation, mutagenesis or response to antibiotics. The automated and quantitative analysis of the massive amount of data generated by such experiments is now the limiting step. In particular, the segmentation and tracking of bacteria cells imaged in phase-contrast microscopy—with error rates compatible with high-throughput data—is a challenging problem. In this work, we describe a novel formulation of the multi-object tracking problem, in which tracking is performed by a regression of the bacteria’s displacement, allowing simultaneous tracking of multiple bacteria, despite their growth and division over time. Our method jointly performs segmentation and tracking, leveraging sequential information to increase segmentation accuracy. We introduce a Deep Neural Network (DNN) architecture taking advantage of a self-attention mechanism which yields extremely low tracking and segmentation error rates. We demonstrate superior performance and speed compared to state-of-the-art methods. Our method is named DiSTNet, which stands for DISTance+DISplacement Segmentation and Tracking Network. While this method is particularly well suited for mother machine microscopy data, its general joint tracking and segmentation formulation could be applied to many other problems with different geometries.

Jean Ollion, Charles Ollion
A Weakly Supervised Deep Learning Approach for Detecting Malaria and Sickle Cells in Blood Films

Machine vision analysis of blood films imaged under a brightfield microscope could provide scalable malaria diagnosis solutions in resource constrained endemic urban settings. The major bottleneck in successfully analyzing blood films with deep learning vision techniques is a lack of object-level annotations of disease markers such as parasites or abnormal red blood cells. To overcome this challenge, this work proposes a novel weakly supervised deep learning approach that leverages weak labels readily available from routine clinical microscopy to diagnose malaria in thick blood film microscopy. This approach is based on aggregating the convolutional features of multiple objects present in one hundred high resolution image fields. We show that this method not only achieves expert-level malaria diagnostic accuracy without any hard object-level labels but can also identify individual malaria parasites in digitized thick blood films, which is useful in assessing disease severity and response to treatment. We demonstrate another application scenario where our approach is able to detect sickle cells in thin blood films. We discuss the wider applicability of the approach in automated analysis of thick blood films for the diagnosis of other blood disorders.

Petru Manescu, Christopher Bendkowski, Remy Claveau, Muna Elmi, Biobele J. Brown, Vijay Pawar, Mike J. Shaw, Delmiro Fernandez-Reyes
Imaging Scattering Characteristics of Tissue in Transmitted Microscopy

Scattering properties play an important role in optical imaging and diagnostic applications, such as the analysis of cancerous processes and the diagnosis of dysplasia or cancer. Existing methods have focused on removing scattering components in order to visualize the spatial distribution of the reflection and absorption properties. We propose a novel method for estimating the spatial distribution of scattering properties by measuring, at each point, a set of intensities of the directly scattered light at each angle. Our key contribution is to decompose the captured light into the direct scattered light at each angle by varying the spatial frequency of the illumination patterns, which controls the range of scattering angles. By applying the method to observe a spatially inhomogeneous translucent object, we can extract a map of the angular distribution of scattering. To the best of our knowledge, this is the first method to enable visualizing a spatial map of scattering properties using a conventional transmitted microscope setup. Experimental results on synthetic data and real complex materials demonstrate the effectiveness of our method for the estimation of scattering distributions.

Mihoko Shimano, Yuta Asano, Shin Ishihara, Ryoma Bise, Imari Sato
Attention Based Multiple Instance Learning for Classification of Blood Cell Disorders

Red blood cells are highly deformable and present in various shapes. In blood cell disorders, only a subset of all cells is morphologically altered and relevant for the diagnosis. However, manual labeling of all cells is laborious, complicated and introduces inter-expert variability. We propose an attention based multiple instance learning method to classify blood samples of patients suffering from blood cell disorders. Cells are detected using an R-CNN architecture. With the features extracted for each cell, a multiple instance learning method classifies patient samples into one out of four blood cell disorders. The attention mechanism provides a measure of the contribution of each cell to the overall classification and significantly improves the network’s classification accuracy as well as its interpretability for the medical expert.

Ario Sadafi, Asya Makhro, Anna Bogdanova, Nassir Navab, Tingying Peng, Shadi Albarqouni, Carsten Marr
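Attention-based MIL pooling of per-cell features fits in a few lines of PyTorch. In the sketch below the feature dimension and four-class head are placeholders, and random tensors stand in for the R-CNN cell features used in the paper:

```python
import torch
import torch.nn as nn

class AttentionMIL(nn.Module):
    """Attention pooling in the spirit of Ilse et al. (2018): each cell
    feature receives a learned weight, the weighted sum is classified at
    the patient level, and the weights show each cell's contribution."""
    def __init__(self, in_dim=128, hid=64, n_classes=4):
        super().__init__()
        self.att = nn.Sequential(nn.Linear(in_dim, hid), nn.Tanh(),
                                 nn.Linear(hid, 1))
        self.cls = nn.Linear(in_dim, n_classes)

    def forward(self, feats):                       # feats: (n_cells, in_dim)
        a = torch.softmax(self.att(feats), dim=0)   # (n_cells, 1)
        bag = (a * feats).sum(dim=0)                # (in_dim,)
        return self.cls(bag), a.squeeze(-1)

model = AttentionMIL()
logits, attn = model(torch.randn(57, 128))          # one sample with 57 cells
print(logits.shape, attn.shape)  # torch.Size([4]) torch.Size([57])
```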
A Generative Modeling Approach for Interpreting Population-Level Variability in Brain Structure

Understanding how neural structure varies across individuals is critical for characterizing the effects of disease, learning, and aging on the brain. However, disentangling the different factors that give rise to individual variability is still an outstanding challenge. In this paper, we introduce a deep generative modeling approach to find different modes of variation across many individuals. Our approach starts with training a variational autoencoder on a collection of auto-fluorescence images from a little over 1,700 mouse brains at 25 μm resolution. We then tap into the learned factors and validate the model’s expressiveness via a novel bi-directional technique that makes structured perturbations to both the high-dimensional inputs of the network and the low-dimensional latent variables in its bottleneck. Our results demonstrate that by coupling generative modeling frameworks with structured perturbations, it is possible to probe the latent space of the generative model to provide insights into the representations of brain structure formed in deep networks.

Ran Liu, Cem Subakan, Aishwarya H. Balwani, Jennifer Whitesell, Julie Harris, Sanmi Koyejo, Eva L. Dyer
Processing-Aware Real-Time Rendering for Optimized Tissue Visualization in Intraoperative 4D OCT

Intraoperative Optical Coherence Tomography (iOCT) has advanced in recent years to provide real-time high resolution volumetric imaging for ophthalmic surgery. It enables real-time 3D feedback during precise surgical maneuvers. Intraoperative 4D OCT generally exhibits lower signal-to-noise ratio compared to diagnostic OCT, and visualization is complicated by instrument shadows occluding retinal tissue. Additional constraints of processing data rates upwards of 6 GB/s create unique challenges for advanced visualization of 4D OCT. Prior approaches to real-time 4D iOCT rendering have been limited to applying simple denoising filters and colorization to improve visualization. We present a novel real-time rendering pipeline that provides enhanced intraoperative visualization and is specifically designed for the high data rates of 4D iOCT. We decompose the volume into a static part consisting of the retinal tissue and a dynamic part including the instrument. Aligning the static parts over time allows temporal compounding of these structures for improved image quality. We employ a translational motion model and use axial projection images to reduce the dimensionality of the alignment. A model-based instrument segmentation on the projections discriminates static from dynamic parts and is used to exclude instruments from the compounding. Our real-time rendering method combines the compounded static information with the latest iOCT data to provide a visualization which compensates for instrument shadows and improves instrument visibility. We evaluate the individual parts of our pipeline on pre-recorded OCT volumes and demonstrate the effectiveness of our method on a recorded volume sequence with a moving retinal forceps.

Jakob Weiss, Michael Sommersperger, Ali Nasseri, Abouzar Eslami, Ulrich Eck, Nassir Navab

Cell Segmentation and Stain Normalization

Frontmatter
Boundary-Assisted Region Proposal Networks for Nucleus Segmentation

Nucleus segmentation is an important task in medical image analysis. However, machine learning models often perform poorly because histology images contain many clusters of crowded nuclei. To handle this problem, existing approaches typically resort to sophisticated hand-crafted post-processing strategies; therefore, they are vulnerable to the variation of post-processing hyper-parameters. Accordingly, in this paper, we devise a Boundary-assisted Region Proposal Network (BRP-Net) that achieves robust instance-level nucleus segmentation. First, we propose a novel Task-aware Feature Encoding (TAFE) network that efficiently extracts respective high-quality features for semantic segmentation and instance boundary detection tasks. This is achieved by carefully considering the correlation and differences between the two tasks. Second, coarse nucleus proposals are generated based on the predictions of the above two tasks. Third, these proposals are fed into instance segmentation networks for more accurate prediction. Experimental results demonstrate that the performance of BRP-Net is robust to the variation of post-processing hyper-parameters. Furthermore, BRP-Net achieves state-of-the-art performance on both the Kumar and CPM17 datasets. The code of BRP-Net will be released at https://github.com/csccsccsccsc/brpnet .

Shengcong Chen, Changxing Ding, Dacheng Tao
BCData: A Large-Scale Dataset and Benchmark for Cell Detection and Counting

Breast cancer is a major malignancy in women, and its incidence continues to rise. Detecting positive and negative tumor cells in immunohistochemically stained sections of breast tissue to compute the Ki-67 index is an essential means to determine the degree of malignancy of breast cancer. However, there are scarcely any public datasets for cell detection in Ki-67 stained images. In this paper, we introduce a large-scale Breast tumor Cell Dataset (BCData) for cell detection and counting, which contains 1,338 images with 181,074 annotated cells belonging to two categories, i.e., positive and negative tumor cells. (We state that our dataset can only be used for non-commercial research.) Our dataset varies widely in both the distribution density of tumor cells and the Ki-67 index. We evaluate several cell detection and counting methods on this dataset to establish a first benchmark. We believe that our dataset will facilitate further research on cell detection and counting under clustered, overlapping, and variably stained conditions. Our dataset is available at https://sites.google.com/view/bcdataset

Zhongyi Huang, Yao Ding, Guoli Song, Lin Wang, Ruizhe Geng, Hongliang He, Shan Du, Xia Liu, Yonghong Tian, Yongsheng Liang, S. Kevin Zhou, Jie Chen
Weakly-Supervised Nucleus Segmentation Based on Point Annotations: A Coarse-to-Fine Self-Stimulated Learning Strategy

Nucleus segmentation is a fundamental task in digital pathology analysis. However, it is labor-intensive and time-consuming to manually annotate pixel-level full nucleus masks, while it is much easier to make point annotations. In this paper, we propose a coarse-to-fine weakly-supervised framework to train the segmentation model from only point annotations to reduce the labor cost of generating pixel-level masks. Our coarse-to-fine strategy can improve segmentation performance progressively in a self-stimulated learning manner. Specifically, to generate coarse segmentation masks, we employ a self-supervision strategy using clustering to perform the binary classification. To avoid trivial solutions, our model is sparsely supervised by annotated positive points and geometric-constrained negative boundaries, via point-to-region spatial expansion and Voronoi partition, respectively. Then, to generate fine segmentation masks, the prior knowledge of edges in the unadorned image is additionally utilized by our proposed contour-sensitive constraint to further tune the nucleus contours. Experimental results on two public datasets show that our model trained with weakly-supervised data (i.e., point annotations) achieves competitive performance compared with the model trained with fully supervised data (i.e., full nucleus masks). The code is made publicly available at https://github.com/tiankuan93/C2FNet .

Kuan Tian, Jun Zhang, Haocheng Shen, Kezhou Yan, Pei Dong, Jianhua Yao, Shannon Che, Pifu Luo, Xiao Han
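The Voronoi partition used for geometric-constrained negative supervision assigns every pixel to its nearest annotated point and treats the borders between cells as background evidence. A minimal SciPy sketch, with toy point coordinates:

```python
import numpy as np
from scipy.spatial import cKDTree

def voronoi_partition(points, shape):
    """Label each pixel with the index of its nearest annotated nucleus
    point; pixels where neighboring labels differ form the Voronoi
    borders used as sparse negative (background) supervision."""
    yy, xx = np.mgrid[0:shape[0], 0:shape[1]]
    grid = np.stack([yy.ravel(), xx.ravel()], axis=1)
    _, owner = cKDTree(points).query(grid)
    labels = owner.reshape(shape)
    border = np.zeros(shape, bool)
    border[:, :-1] |= labels[:, :-1] != labels[:, 1:]
    border[:-1, :] |= labels[:-1, :] != labels[1:, :]
    return labels, border

pts = np.array([[10, 10], [10, 40], [40, 25]], float)  # toy annotations
labels, border = voronoi_partition(pts, (50, 50))
print(labels.max(), int(border.sum()))
```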
Structure Preserving Stain Normalization of Histopathology Images Using Self Supervised Semantic Guidance

Although generative adversarial network (GAN) based style transfer is the state of the art in histopathology color-stain normalization, such methods do not explicitly integrate structural information of tissues. We propose a self-supervised approach to incorporate semantic guidance into a GAN based stain normalization framework and preserve detailed structural information. Our method does not require manual segmentation maps, which is a significant advantage over existing methods. We integrate semantic information at different layers between a pre-trained semantic network and the stain color normalization network. The proposed scheme outperforms other color normalization methods, leading to better classification and segmentation performance.

Dwarikanath Mahapatra, Behzad Bozorgtabar, Jean-Philippe Thiran, Ling Shao
A Novel Loss Calibration Strategy for Object Detection Networks Training on Sparsely Annotated Pathological Datasets

Recently, object detection frameworks based on Convolutional Neural Networks (CNNs) have become powerful methods for various tasks of medical image analysis; however, they often struggle with pathological datasets, in which it is impossible to annotate all the cells. Such sparse annotations lead to a seriously miscalculated loss in training, which limits the performance of networks. To address this limitation, we investigate the internal training process of object detection networks. Our core observation is that there is a significant density difference between the regression boxes of positive instances and negative instances. Our novel Boxes Density Energy (BDE) utilizes the densities of regression boxes to calibrate the loss, reducing the miscalculated loss while penalizing mispredictions with a relatively larger loss. BDE can thus guide network training in the right direction. Extensive experiments have demonstrated that BDE on a sparsely annotated pathological dataset can significantly boost the performance of networks, even yielding 1.0–1.5% higher recall than networks trained on the fully annotated dataset.

Hansheng Li, Xin Han, Yuxin Kang, Xiaoshuang Shi, Mengdi Yan, Zixu Tong, Qirong Bu, Lei Cui, Jun Feng, Lin Yang
Histopathological Stain Transfer Using Style Transfer Network with Adversarial Loss

Deep learning models that are trained on histopathological images obtained from a single lab and/or scanner give poor inference performance on images obtained from another scanner/lab with a different staining protocol. In recent years, there has been considerable research on image stain normalization to address this issue. In this work, we present a novel approach to the stain normalization problem using fast neural style transfer coupled with adversarial loss. We also propose a novel stain transfer generator network based on the High-Resolution Network (HRNet) which requires less training time and generalizes well with few paired training images of reference stain and test stain. This approach has been tested on Whole Slide Images (WSIs) obtained from 8 different labs, where images from one lab were treated as the reference stain. A deep learning model was trained on this stain, and the rest of the images were transferred to it using the corresponding stain transfer generator network. Experimentation suggests that this approach is able to successfully perform stain normalization with good visual quality and provides better inference performance compared to not applying stain normalization.

Harshal Nishar, Nikhil Chavanke, Nitin Singhal
Instance-Aware Self-supervised Learning for Nuclei Segmentation

Due to the ubiquity and large morphological variance of nuclei, accurate nuclei instance segmentation is still one of the most challenging tasks in computational pathology. Annotating nuclei instances, which requires experienced pathologists to manually draw the contours, is extremely laborious and expensive, and often results in a deficiency of annotated data. Deep learning based segmentation approaches, which rely heavily on the quantity of training data, therefore struggle to demonstrate their full capacity in this area. In this paper, we propose a novel self-supervised learning framework to deeply exploit the capacity of widely-used convolutional neural networks (CNNs) on the nuclei instance segmentation task. The proposed approach involves two sub-tasks (i.e., scale-wise triplet learning and count ranking), which enable neural networks to implicitly leverage prior knowledge of nuclei size and quantity, and accordingly mine instance-aware feature representations from the raw data. Experimental results on the publicly available MoNuSeg dataset show that the proposed self-supervised learning approach can remarkably boost the segmentation accuracy of nuclei instances: a new state-of-the-art average Aggregated Jaccard Index (AJI) of 70.63% is achieved by our self-supervised ResUNet-101. To the best of our knowledge, this is the first work focusing on self-supervised learning for instance segmentation.

Xinpeng Xie, Jiawei Chen, Yuexiang Li, Linlin Shen, Kai Ma, Yefeng Zheng
StyPath: Style-Transfer Data Augmentation for Robust Histology Image Classification

The classification of Antibody Mediated Rejection (AMR) in kidney transplant remains challenging even for experienced nephropathologists; this is partly because histological tissue stain analysis is often characterized by low inter-observer agreement and poor reproducibility. One of the implicated causes for inter-observer disagreement is the variability of tissue stain quality between (and within) pathology labs, coupled with the gradual fading of archival sections. Variations in stain colors and intensities can make tissue evaluation difficult for pathologists, ultimately affecting their ability to describe relevant morphological features. Being able to accurately predict the AMR status based on kidney histology images is crucial for improving patient treatment and care. We propose a novel pipeline to build robust deep neural networks for AMR classification based on StyPath, a histological data augmentation technique that leverages a lightweight style-transfer algorithm as a means to reduce sample-specific bias. Each image was generated in 1.84 ± 0.03 s using a single GTX TITAN V GPU and PyTorch, making it faster than other popular histological data augmentation techniques. We evaluated our model using a Monte Carlo (MC) estimate of Bayesian performance and generated an epistemic measure of uncertainty to compare both the baseline and StyPath augmented models. We also generated Grad-CAM representations of the results, which were assessed by an experienced nephropathologist; we used this qualitative analysis to elucidate the assumptions being made by each model. Our results imply that our style-transfer augmentation technique improves histological classification performance (reducing error from 14.8% to 11.5%) and generalization ability.

Pietro Antonio Cicalese, Aryan Mobiny, Pengyu Yuan, Jan Becker, Chandra Mohan, Hien Van Nguyen
Multimarginal Wasserstein Barycenter for Stain Normalization and Augmentation

Variations in hematoxylin and eosin (H&E) stained images (due to clinical lab protocols, scanners, etc.) directly impact the quality and accuracy of clinical diagnosis, and hence it is important to control for these variations for a reliable diagnosis. In this work, we present a new approach based on the multimarginal Wasserstein barycenter to normalize and augment H&E stained images given one or more references. Specifically, we provide a mathematically robust way of naturally incorporating additional images as intermediate references to drive stain normalization and augmentation simultaneously. The presented approach showed superior results quantitatively and qualitatively as compared to state-of-the-art methods for stain normalization. We further validated our stain normalization and augmentations on the nuclei segmentation task on a publicly available dataset, achieving state-of-the-art results against competing approaches.

Saad Nadeem, Travis Hollmann, Allen Tannenbaum
Corruption-Robust Enhancement of Deep Neural Networks for Classification of Peripheral Blood Smear Images

Deep learning has emerged as a promising technology for automated peripheral blood smear analysis and hematologic diagnosis. A big challenge for deep learning models is the accuracy drop caused by image corruptions arising from different smear preparation and digitalization operations. To serve real applications of peripheral blood analysis, a practical deep learning classifier needs to be robust against a wide range of corruptions. In this paper, we first investigate the comprehensive corruption types for peripheral blood smear images and establish the benchmark dataset Smear-C, simulating the real physical causes. We then propose a novel method, SmearRobust, which fits easily into existing neural networks and improves their robustness. Experimental results show that SmearRobust can significantly enhance robustness on the Smear-C dataset. Furthermore, the proposed corruption simulation algorithms and the robust learning method can potentially be applied to bone marrow smears and general pathology.

Songtao Zhang, Qingwen Ni, Bing Li, Shan Jiang, Wenyu Cai, Hang Chen, Lin Luo
Multi-field of View Aggregation and Context Encoding for Single-Stage Nucleus Recognition

Automated nucleus/cell recognition is a very challenging task, especially for differentiating tumor nuclei from non-tumor nuclei in Ki67 immunohistochemistry (IHC) stained images. Convolutional neural networks and their variants have recently been introduced to identify different types of nuclei and have achieved state-of-the-art performance. However, previous nucleus recognition approaches do not explicitly encode contextual information in the images, which can be very helpful for network representation learning. In this paper, we propose a novel multi-field-of-view context encoding method for single-stage nuclei identification in Ki67 IHC stained images. Specifically, we learn a deep structured regression model that takes multiple fields of view of an image as input and conducts feature aggregation on the fly for representation learning; then, we design a context encoding module to explicitly explore the multi-field-of-view contextual information and enhance the model’s representation power. To further improve nucleus recognition, we also introduce a novel deep regression loss that can emphasize specific channels of the prediction map with category-aware channel suppression. The proposed method can be learned in an end-to-end, pixel-to-pixel manner for single-stage nucleus recognition. We evaluate our method on a large-scale pancreatic neuroendocrine tumor image dataset, and the experiments demonstrate the superior performance of our method in nucleus recognition.

Tian Bai, Jiayu Xu, Fuyong Xing
Self-supervised Nuclei Segmentation in Histopathological Images Using Attention

Segmentation and accurate localization of nuclei in histopathological images is a very challenging problem, with most existing approaches adopting a supervised strategy. These methods usually rely on manual annotations that require a lot of time and effort from medical experts. In this study, we present a self-supervised approach for the segmentation of nuclei in whole slide histopathology images. Our method works on the assumption that the size and texture of nuclei can determine the magnification at which a patch is extracted. We show that the identification of the magnification level of tiles can generate a preliminary self-supervision signal to locate nuclei. We further show that by appropriately constraining our model it is possible to retrieve meaningful segmentation maps as an auxiliary output to the primary magnification identification task. Our experiments show that with standard post-processing, our method can outperform other unsupervised nuclei segmentation approaches and reports similar performance to supervised ones on the publicly available MoNuSeg dataset. Our code and models are available online ( https://github.com/msahasrabudhe/miccai2020_self_sup_nuclei_seg ) to facilitate further research.

Mihir Sahasrabudhe, Stergios Christodoulidis, Roberto Salgado, Stefan Michiels, Sherene Loi, Fabrice André, Nikos Paragios, Maria Vakalopoulou
FocusLiteNN: High Efficiency Focus Quality Assessment for Digital Pathology

Out-of-focus microscope lenses in digital pathology are a critical bottleneck in high-throughput Whole Slide Image (WSI) scanning platforms, for which pixel-level automated Focus Quality Assessment (FQA) methods are highly desirable to help significantly accelerate clinical workflows. Existing FQA methods include both knowledge-driven and data-driven approaches. While data-driven approaches such as Convolutional Neural Network (CNN) based methods have shown great promise, they are difficult to use in practice due to their high computational complexity and lack of transferability. Here, we propose a highly efficient CNN-based model that maintains computations as fast as the knowledge-driven methods without excessive hardware requirements such as GPUs. We create a training dataset using FocusPath, which encompasses diverse tissue slides across nine different stain colors; the stain diversity greatly helps the model learn a diverse color spectrum and tissue structures. In our attempt to reduce the CNN complexity, we find, surprisingly, that even when the CNN is trimmed down to a minimal size it still achieves highly competitive performance. We introduce a novel comprehensive evaluation dataset, the largest of its kind, annotated and compiled from the TCGA repository for model assessment and comparison, on which the proposed method exhibits a superior precision-speed trade-off compared with existing knowledge-driven and data-driven FQA approaches.

Zhongling Wang, Mahdi S. Hosseini, Adyn Miles, Konstantinos N. Plataniotis, Zhou Wang
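
The published FocusLiteNN architecture should be taken from the paper itself; the sketch below only illustrates the spirit of a minimal patch-level focus-quality regressor and is not the paper's model:

```python
import torch
import torch.nn as nn

class TinyFQA(nn.Module):
    """A deliberately minimal CNN regressor for focus quality: one strided
    conv layer, global pooling, and a linear head. Illustrative only."""
    def __init__(self, kernels: int = 8):
        super().__init__()
        self.conv = nn.Conv2d(3, kernels, kernel_size=7, stride=5)
        self.head = nn.Linear(kernels, 1)

    def forward(self, x):                        # x: (B, 3, H, W) image patch
        f = torch.relu(self.conv(x))
        f = f.flatten(2).mean(2)                 # global pooling -> (B, kernels)
        return self.head(f)                      # scalar focus-quality score

score = TinyFQA()(torch.rand(4, 3, 235, 235))    # works on arbitrary patch sizes
```
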

Histopathology Image Analysis

Frontmatter
Pairwise Relation Learning for Semi-supervised Gland Segmentation

Accurate and automated gland segmentation on histology tissue images is an essential but challenging task in the computer-aided diagnosis of adenocarcinoma. Despite their prevalence, deep learning models always require a large number of densely annotated training images, which are difficult to obtain due to the extensive labor and expert costs associated with histology image annotation. In this paper, we propose the pairwise relation-based semi-supervised (PRS^2) model for gland segmentation on histology images. This model consists of a segmentation network (S-Net) and a pairwise relation network (PR-Net). S-Net is trained on labeled data for segmentation, while PR-Net is trained on both labeled and unlabeled data in an unsupervised way to enhance its image representation ability by exploiting the semantic consistency between each pair of images in the feature space. Since both networks share their encoders, the image representation ability learned by PR-Net can be transferred to S-Net to improve its segmentation performance. We also design an object-level Dice loss to address the issues caused by touching glands and combine it with two other loss functions for S-Net. We evaluated our model against five recent methods on the GlaS dataset and three recent methods on the CRAG dataset. Our results not only demonstrate the effectiveness of the proposed PR-Net and object-level Dice loss, but also indicate that our PRS^2 model achieves state-of-the-art gland segmentation performance on both benchmarks.

Yutong Xie, Jianpeng Zhang, Zhibin Liao, Johan Verjans, Chunhua Shen, Yong Xia
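
One plausible reading of an object-level Dice loss: compute Dice per ground-truth gland instance within its bounding box and average (a simplification; the exact matching between predicted and ground-truth objects in PRS^2 may differ):

```python
import torch

def object_level_dice_loss(prob, inst_gt, eps=1e-6):
    """prob:    (H, W) predicted foreground probabilities
    inst_gt: (H, W) integer map, 0 = background, 1..K = gland instances"""
    dices = []
    for k in inst_gt.unique():
        if k.item() == 0:
            continue
        ys, xs = torch.where(inst_gt == k)
        y0, y1 = int(ys.min()), int(ys.max()) + 1    # instance bounding box
        x0, x1 = int(xs.min()), int(xs.max()) + 1
        p = prob[y0:y1, x0:x1]
        m = (inst_gt[y0:y1, x0:x1] == k).float()
        dices.append((2 * (p * m).sum() + eps) / (p.sum() + m.sum() + eps))
    if not dices:
        return prob.new_tensor(0.0)
    return 1.0 - torch.stack(dices).mean()           # loss = 1 - mean object Dice
```
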
Ranking-Based Survival Prediction on Histopathological Whole-Slide Images

Survival prediction for patients based on gigapixel histopathological whole-slide images (WSIs) has attracted increasing attention in recent years. Previous studies mainly predict survival hazard scores directly from one individual WSI per patient. These prediction methods ignore the relative survival differences among patients, i.e., the ranking information, which is important for a regression task. Under such circumstances, we propose a ranking-based survival prediction method on WSIs, RankSurv, which takes the ranking information into consideration during the learning process. First, a hypergraph representation is introduced to conduct hazard prediction on each WSI, which is able to learn the high-order correlation among different patches in the WSI. Then, a ranking-based prediction process is conducted using pairwise survival data. Experiments are conducted on three public carcinoma datasets (i.e., LUSC, GBM, and NLST). Quantitative results show that the proposed method significantly outperforms state-of-the-art methods on all three datasets, which demonstrates the effectiveness of the proposed ranking-based survival prediction framework.

Donglin Di, Shengrui Li, Jun Zhang, Yue Gao
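
A minimal sketch of a pairwise ranking loss over survival data with censoring, in the spirit of the ranking-based prediction step (margin and pair construction are assumptions):

```python
import torch

def pairwise_ranking_loss(risk, time, event, margin=0.1):
    """risk:  (N,) predicted hazard scores (higher = shorter expected survival)
    time:  (N,) observed survival/censoring times
    event: (N,) 1 if death observed, 0 if censored
    A pair (i, j) is comparable when patient i died and t_i < t_j; the model
    is penalized unless risk_i exceeds risk_j by the margin."""
    losses = []
    n = risk.shape[0]
    for i in range(n):
        if event[i] != 1:                 # censored patients anchor no pairs
            continue
        for j in range(n):
            if time[i] < time[j]:
                losses.append(torch.clamp(margin - (risk[i] - risk[j]), min=0))
    if not losses:
        return risk.new_tensor(0.0)
    return torch.stack(losses).mean()
```
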
Renal Cell Carcinoma Detection and Subtyping with Minimal Point-Based Annotation in Whole-Slide Images

Cancerous region detection and subtyping in whole-slide images (WSIs) are fundamental for renal cell carcinoma (RCC) diagnosis. The main challenge in the development of automated RCC diagnostic systems is the lack of large-scale datasets with precise annotations. In this paper, we propose a framework that employs a semi-supervised learning (SSL) method to accurately detect cancerous regions with a novel annotation method called Minimal Point-Based (Min-Point) annotation. The predicted results are efficiently utilized by a hybrid-loss training strategy in a classification model for subtyping. The annotator only needs to mark a few cancerous and non-cancerous points in each WSI. Experiments on three significant subtypes of RCC show that the performance of the cancerous region detector trained on the Min-Point annotated dataset is comparable to that of classifiers trained on data with full cancerous region delineation. In subtyping, the proposed model outperforms a model trained with only whole-slide diagnostic labels by 12% in terms of the test F1 score. We believe that our "detect then classify" schema combined with Min-Point annotation could set a standard for developing intelligent systems facing similar challenges.

Zeyu Gao, Pargorn Puttapirat, Jiangbo Shi, Chen Li
Censoring-Aware Deep Ordinal Regression for Survival Prediction from Pathological Images

Survival prediction is a typical task in computer-aided diagnosis with many clinical applications. Existing approaches to survival prediction are mostly based on the classic Cox model, and mainly focus on learning a hazard or survival function rather than the survival time itself, which largely limits their practical use. In this paper, we present Censoring-Aware Deep Ordinal Regression (CDOR) to directly predict survival time from pathological images. Instead of relying on the Cox model, CDOR formulates survival prediction as an ordinal regression problem and, in particular, introduces a censoring-aware loss function to train the deep network in the presence of censored data. Experimental results on a publicly available dataset demonstrate that the proposed CDOR achieves significantly higher accuracy in predicting survival time.

Lichao Xiao, Jin-Gang Yu, Zhifeng Liu, Jiarong Ou, Shule Deng, Zhenhua Yang, Yuanqing Li
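
One plausible form of a censoring-aware ordinal loss: discretize survival time into bins with cumulative binary targets, and mask out the bins a censored patient's label says nothing about (a sketch, not the paper's exact formulation):

```python
import torch
import torch.nn.functional as F

def censoring_aware_ordinal_loss(logits, time_bin, event):
    """logits:   (N, K) one binary logit per time bin; sigmoid(logits[:, k])
              models P(patient survives past bin k)
    time_bin: (N,) index of the bin containing the observed time
    event:    (N,) 1 = death observed, 0 = censored"""
    n, k = logits.shape
    bins = torch.arange(k, device=logits.device).expand(n, k)
    alive = (bins < time_bin.unsqueeze(1)).float()   # 1 = survived past bin
    # events supervise every bin; censored patients only supervise the bins
    # they are known to have survived (later bins stay unsupervised)
    mask = torch.where(event.unsqueeze(1).bool(), torch.ones_like(alive), alive)
    loss = F.binary_cross_entropy_with_logits(logits, alive, reduction="none")
    return (loss * mask).sum() / mask.sum().clamp(min=1.0)
```
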
Tracing Diagnosis Paths on Histopathology WSIs for Diagnostically Relevant Case Recommendation

Telepathology has enabled remote cancer diagnosis based on digital pathological whole slide images (WSIs). During diagnosis, the behavior of the pathologist can be recorded by the platform and archived with the digital case. The diagnosis path of the pathologist on a WSI is valuable information, since the image content along the path is highly correlated with the pathologist's diagnosis report. In this paper, we propose a novel diagnosis path network (DPathNet). DPathNet utilizes the diagnosis paths of pathologists on WSIs as supervision to learn pathology knowledge from the image content. Based on DPathNet, we develop a novel approach for computer-aided cancer diagnosis named session-based histopathology image recommendation (SHIR). SHIR summarizes the information of a WSI while the pathologist browses it and actively recommends relevant cases with similar image content from the database. The proposed approaches are evaluated on a gastric dataset containing 983 cases across 5 categories of gastric lesions. The experimental results demonstrate the effectiveness of DPathNet for the SHIR task and show that the supervision of the diagnosis path is sufficient to train DPathNet. The MRR and MAP of the proposed SHIR framework are 0.741 and 0.777, respectively, on the gastric dataset.

Yushan Zheng, Zhiguo Jiang, Haopeng Zhang, Fengying Xie, Jun Shi
Weakly Supervised Multiple Instance Learning Histopathological Tumor Segmentation

Histopathological image segmentation is a challenging and important topic in medical imaging, with tremendous potential impact on clinical practice. State-of-the-art methods rely on hand-crafted annotations, which hinders clinical translation since histology suffers from significant variations between cancer phenotypes. In this paper, we propose a weakly supervised framework for whole slide imaging segmentation that relies on standard clinical annotations available in most medical systems. In particular, we exploit a multiple instance learning scheme to train models. The proposed framework has been evaluated on multi-location and multi-centric public data from The Cancer Genome Atlas and the PatchCamelyon dataset. Promising results when compared with experts' annotations demonstrate the potential of the presented approach. The complete framework, including 6481 generated tumor maps and data processing, is available at https://github.com/marvinler/tcga_segmentation .

Marvin Lerousseau, Maria Vakalopoulou, Marion Classe, Julien Adam, Enzo Battistella, Alexandre Carré, Théo Estienne, Théophraste Henry, Eric Deutsch, Nikos Paragios
Divide-and-Rule: Self-Supervised Learning for Survival Analysis in Colorectal Cancer

With the long-term rapid increase in the incidence of colorectal cancer (CRC), there is an urgent clinical need to improve risk stratification. The conventional pathology report is usually limited to a few histopathological features, and most of the tumor microenvironment patterns that describe aggressive tumor behavior are ignored. In this work, we aim to learn histopathological patterns within cancerous tissue regions that can be used to improve prognostic stratification for colorectal cancer. To do so, we propose a self-supervised learning method that jointly learns a representation of tissue regions and a metric of the clustering, in order to obtain their underlying patterns. These histopathological patterns are then used to represent the interaction between complex tissues and to predict clinical outcomes directly. We further show that the proposed approach can benefit from linear predictors to avoid overfitting in patient outcome prediction. To this end, we introduce a new well-characterized clinicopathological dataset comprising a retrospective collection of 374 patients, with their survival time and treatment information. Histomorphological clusters obtained by our method are evaluated by training survival models. The experimental results demonstrate statistically significant patient stratification, and our approach outperforms state-of-the-art deep clustering methods.

Christian Abbet, Inti Zlobec, Behzad Bozorgtabar, Jean-Philippe Thiran
Microscopic Fine-Grained Instance Classification Through Deep Attention

Fine-grained classification of microscopic image data with limited samples is an open problem in computer vision and biomedical imaging. Deep learning based vision systems mostly deal with large numbers of low-resolution images, whereas subtle details in biomedical images require higher resolution. To bridge this gap, we propose a simple yet effective deep network that performs two tasks simultaneously in an end-to-end manner. First, it utilizes a gated attention module that can focus on multiple key instances at high resolution without extra annotations or region proposals. Second, the global structural features and local instance features are fused for final image-level classification. The result is a robust but lightweight end-to-end trainable deep network that yields state-of-the-art results in two separate fine-grained multi-instance biomedical image classification tasks: a benchmark breast cancer histology dataset and our new fungi species mycology dataset. In addition, we demonstrate the interpretability of the proposed model by visualizing the concordance of the learned features with clinically relevant features.

Mengran Fan, Tapabrata Chakraborti, Eric I-Chao Chang, Yan Xu, Jens Rittscher
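
The gated attention module resembles the gated attention pooling of Ilse et al. (2018); a minimal sketch under that assumption:

```python
import torch
import torch.nn as nn

class GatedAttentionPool(nn.Module):
    """Gated attention pooling over instance features: a tanh content branch
    gated by a sigmoid branch produces one attention weight per instance."""
    def __init__(self, in_dim: int, hid_dim: int = 128):
        super().__init__()
        self.V = nn.Linear(in_dim, hid_dim)    # content branch
        self.U = nn.Linear(in_dim, hid_dim)    # gate branch
        self.w = nn.Linear(hid_dim, 1)

    def forward(self, h):                       # h: (N_instances, in_dim)
        a = self.w(torch.tanh(self.V(h)) * torch.sigmoid(self.U(h)))
        a = torch.softmax(a, dim=0)             # (N, 1) attention weights
        return (a * h).sum(dim=0), a            # image embedding + weights
```
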
A Deformable CRF Model for Histopathology Whole-Slide Image Classification

When detecting abnormalities from histopathology images with a patch-based convolutional neural network (CNN), spatial context is an important cue. However, whole-slide images (WSIs) are characterized by high morphological heterogeneity in the shape and scale of tissues, hence simply widening the visual span to a larger context may not capture well the information associated with the central patch or the disease of interest. In this paper, we propose a Deformable Conditional Random Field (DCRF) model that learns the offsets and weights of neighboring patches in a spatially adaptive manner. Additionally, rather than using regularly tessellated or overlapping patches, we localize patches with more powerful feature representations via the adaptively adjusted offsets in a WSI. Both the use of the DCRF for better feature extraction from spatially sampled patches and the use of the auto-generated patches as training input improve performance on the target task. The model is compatible with the widespread annotation strategies for histopathology images, whether a contoured region of interest (ROI) or patch-wise multi-tissue labels. The proposed model is validated on patient cohorts from The Cancer Genome Atlas (TCGA) dataset and the Camelyon16 dataset. The experimental results demonstrate the advantage of the proposed model in the classification task in comparison with baseline models.

Yiqing Shen, Jing Ke
Deep Active Learning for Breast Cancer Segmentation on Immunohistochemistry Images

Immunohistochemistry (IHC) plays an essential role in breast cancer diagnosis and treatment. Reliable and automatic segmentation of breast cancer regions on IHC images would be of considerable value for further analysis. However, the prevalent fully convolutional networks (FCNs) suffer from difficulties in obtaining sufficient annotated training data. Active learning, on the other hand, aims to reduce the cost of annotation by selecting an informative and effective subset for labeling. In this paper, we present a novel deep active learning framework for breast cancer segmentation on IHC images. Three criteria are explicitly designed to select training samples: dissatisfaction, representativeness and diverseness. Dissatisfaction, consisting of both pixel-level and image-level dissatisfaction, focuses on selecting samples that the network does not segment well. Representativeness chooses samples that can best represent all the other unlabeled samples, and diverseness further makes the chosen samples different from those already in the training set. We evaluate the proposed method on a large-scale in-house breast cancer IHC dataset and demonstrate that our method outperforms the state-of-the-art suggestive annotation (SA) [1] and representative annotation (RA) [5] on two test sets, achieving competitive or even superior performance when using 40% of the training data compared to using the full training set.

Haocheng Shen, Kuan Tian, Pei Dong, Jun Zhang, Kezhou Yan, Shannon Che, Jianhua Yao, Pifu Luo, Xiao Han
Multiple Instance Learning with Center Embeddings for Histopathology Classification

Histopathology image analysis plays an important role in the diagnosis and treatment of cancer. However, analyzing whole slide images (WSIs) with deep learning is challenging given that pixel-level annotation is laborious and time-consuming. To address this, recent methods have cast WSI classification as a Multiple Instance Learning (MIL) problem, often with a multi-stage process for learning instance- and slide-level features. Currently, most methods focus on either instance selection or instance prediction aggregation, which often fails to generalize and ignores instance relations. In this work, we propose a MIL-based method that jointly learns both instance- and bag-level embeddings in a single framework. In addition, we propose a center loss that maps embeddings of instances from the same bag to a single centroid and reduces intra-class variation. Consequently, our model can accurately predict instance labels and leverages robust hierarchical pooling of features to obtain bag-level features without sacrificing accuracy. Experimental results on curated colon datasets show the effectiveness of the proposed methods against recent state-of-the-art methods.

Philip Chikontwe, Meejeong Kim, Soo Jeong Nam, Heounjeong Go, Sang Hyun Park
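
A minimal sketch of the center-loss idea: pull instance embeddings from the same bag toward a shared centroid, here taken as the detached bag mean (an assumption; the paper may instead learn the centroids):

```python
import torch

def bag_center_loss(inst_emb, bag_ids):
    """Penalize intra-bag embedding spread.
    inst_emb: (N, D) instance embeddings; bag_ids: (N,) integer bag index."""
    loss = inst_emb.new_tensor(0.0)
    bags = bag_ids.unique()
    for b in bags:
        e = inst_emb[bag_ids == b]
        center = e.mean(dim=0, keepdim=True).detach()   # stop-grad centroid
        loss = loss + ((e - center) ** 2).sum(dim=1).mean()
    return loss / bags.numel()
```
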
Graph Attention Multi-instance Learning for Accurate Colorectal Cancer Staging

Colorectal cancer (CRC) is one of the most common cancers diagnosed in humans. Outcomes vary significantly among patients with different tumor status, so accurate staging of colorectal cancer for personalized treatment is highly desired. Whole slide pathological images (WSIs) serve as the gold standard for Tumour Node Metastasis (TNM) staging. However, TNM staging for colorectal cancer relies on labor-intensive manual labeling of discriminative patches, which is neither suitable nor scalable for large-scale WSI TNM staging. Though various methods have been proposed to select key image patches for staging, they are unable to consider the structure of tissue types in biopsy samples, which is key evidence for determining tumor status. In this paper, we propose a Graph Attention Multi-instance Learning (Graph Attention MIL) model with texture features, which encodes the spatial structure between patches and jointly predicts the TNM stage. We evaluated our proposed method on a large cohort of colorectal cancer data. The proposed framework improves performance over existing state-of-the-art methods, pointing to graph-based learning as a promising direction for future research on TNM staging.

Ashwin Raju, Jiawen Yao, Mohammad MinHazul Haq, Jitendra Jonnagaddala, Junzhou Huang
Deep Interactive Learning: An Efficient Labeling Approach for Deep Learning-Based Osteosarcoma Treatment Response Assessment

Osteosarcoma is the most common malignant primary bone tumor. Standard treatment includes pre-operative chemotherapy followed by surgical resection. The response to treatment, measured as the ratio of necrotic tumor area to overall tumor area, is a known prognostic factor for overall survival. This assessment is currently done manually by pathologists looking at glass slides under the microscope, which may not be reproducible due to its subjective nature. Convolutional neural networks (CNNs) can be used for automated segmentation of viable and necrotic tumor on osteosarcoma whole slide images. One bottleneck of supervised learning is that large amounts of accurate annotations are required for training, which is a time-consuming and expensive process. In this paper, we describe Deep Interactive Learning (DIaL) as an efficient labeling approach for training CNNs. After an initial labeling step, annotators only need to correct mislabeled regions in previous segmentation predictions to improve the CNN model until satisfactory predictions are achieved. Our experiments show that a CNN model trained with only 7 hours of annotation using DIaL can successfully estimate necrosis ratios within the expected inter-observer variation rate for this non-standardized manual surgical pathology task.

David Joon Ho, Narasimhan P. Agaram, Peter J. Schüffler, Chad M. Vanderbilt, Marc-Henri Jean, Meera R. Hameed, Thomas J. Fuchs
Modeling Histological Patterns for Differential Diagnosis of Atypical Breast Lesions

Our goal in this paper is to build parametric models for a dictionary of histological patterns that aid in the differential diagnosis of atypical breast lesions, and to evaluate the inferential power of these hand-crafted features. Diagnosis of high-risk atypical breast lesions is challenging and remains a critical component of breast cancer screening, presenting, even for experienced pathologists, a more difficult classification problem than the binary detection task of cancer vs. not-cancer. Following guidelines in the WHO classification of tumors of the breast (an essential reference for pathologists, clinicians and researchers) and in consultation with our team of breast sub-specialists (N = 3), we assembled a visual dictionary of sixteen histological patterns (e.g., cribriform, picket-fence), a subset that pathologists frequently use in making complex diagnostic decisions on atypical breast lesions. We invoke parametric models for each pattern using a mix of unary, binary and ternary features that account for morphological and architectural tissue properties. We use 1441 ductal regions of interest (ROIs) extracted automatically from 93 whole slide images (WSIs) with a computational pathology pipeline. We collected diagnostic labels for all of the ROIs: normal and columnar cell changes (CCC) as low-risk benign lesions (n = 1124), and flat epithelial atypia (FEA) and atypical ductal hyperplasia (ADH) as high-risk benign lesions (n = 317). We generate likelihood maps for each dictionary pattern across a given ROI and integrate this information to determine a diagnostic label of high or low risk. Our method has classification accuracies comparable to the pool of breast pathology sub-specialists. Our study enables a deeper understanding of the discordance among pathologists in diagnosing atypical breast lesions.

Akash Parvatikar, Om Choudhary, Arvind Ramanathan, Olga Navolotskaia, Gloria Carter, Akif Burak Tosun, Jeffrey L. Fine, S. Chakra Chennubhotla
Foveation for Segmentation of Mega-Pixel Histology Images

Segmenting histology images is challenging because of the sheer size of the images, with millions or even billions of pixels. Typical solutions pre-process each histology image by dividing it into patches of fixed size and/or down-sampling to meet memory constraints. Such operations incur information loss in the field-of-view (FoV) (i.e., spatial coverage) and in image resolution. Their impact on segmentation performance is, however, as yet understudied. In this work, we first show, under typical memory constraints (e.g., 10 GB of GPU memory), that the trade-off between FoV and resolution considerably affects segmentation performance on histology images, and that its influence also varies spatially according to local patterns in different areas (see Fig. 1). Based on this insight, we then introduce a foveation module, a learnable "dataloader" which, for a given histology image, adaptively chooses the appropriate configuration (FoV/resolution trade-off) of the input patch to feed to the downstream segmentation model at each spatial location (Fig. 1). The foveation module is jointly trained with the segmentation network to maximize task performance. We demonstrate, on the Gleason2019 challenge dataset for histopathology segmentation, that the foveation module improves segmentation performance over models trained with patches of a fixed FoV/resolution trade-off. Moreover, our model achieves better segmentation accuracy for the two most clinically important and ambiguous classes (Gleason Grades 3 and 4) than the top performers in the challenge, by 13.1% and 7.5% respectively, and improves on the average performance of 6 human experts by 6.5% and 7.5%.

Chen Jin, Ryutaro Tanno, Moucheng Xu, Thomy Mertzanidou, Daniel C. Alexander
Multimodal Latent Semantic Alignment for Automated Prostate Tissue Classification and Retrieval

This paper presents an information fusion method for the automatic classification and retrieval of prostate histopathology whole-slide images (WSIs). The approach employs a weakly-supervised machine learning model that combines a bag-of-features representation, kernel methods, and deep learning. The primary purpose of the method is to incorporate text information during model training to enrich the representation of the images. It automatically learns an alignment of the visual and textual spaces, since each modality has different statistical properties. This alignment enriches the visual representation with complementary semantic information extracted from the text modality. The method was evaluated on both classification and retrieval tasks over a dataset of 235 prostate WSIs with their pathology reports from the TCGA-PRAD dataset. The results show that the multimodal-enhanced model outperforms unimodal models in both classification and retrieval. It outperforms state-of-the-art baselines with an improvement in WSI cancer detection of 4.74%, achieving 77.01% accuracy, and an improvement of 19.35% in the task of retrieving similar cases, obtaining 64.50% mean average precision.

Juan S. Lara, Victor H. Contreras O., Sebastián Otálora, Henning Müller, Fabio A. González

Ophthalmology

Frontmatter
GREEN: a Graph REsidual rE-ranking Network for Grading Diabetic Retinopathy

The automatic grading of diabetic retinopathy (DR) facilitates medical diagnosis for both patients and physicians. Existing research formulates DR grading as an image classification problem. As the stages/categories of DR correlate with each other, the relationship between different classes cannot be explicitly described via a one-hot label, since grades are empirically estimated by different physicians with different outcomes. This class correlation limits existing networks from achieving effective classification. In this paper, we propose a Graph REsidual rE-ranking Network (GREEN) to introduce a class dependency prior into the original image classification network. The class dependency prior is represented by a graph convolutional network with an adjacency matrix. This prior augments the image classification pipeline by re-ranking classification results in a residual aggregation manner. Experiments on standard benchmarks show that GREEN performs favorably against state-of-the-art approaches.

Shaoteng Liu, Lijun Gong, Kai Ma, Yefeng Zheng
Combining Fundus Images and Fluorescein Angiography for Artery/Vein Classification Using the Hierarchical Vessel Graph Network

We present a new framework for retinal artery/vein classification from fundus images and corresponding fluorescein angiography (FA) images. While FA seems to provide the most relevant information, it is often insufficient depending on the acquisition conditions. As fundus images are often acquired by default, we combine the fundus image and FA within a parallel convolutional neural network to extract the maximum information from the generated features. Furthermore, we use these features as the input to a hierarchical graph neural network to ensure that the connectivity of vessels plays a part in the classification. We provide investigative evidence through ablative and comparative quantitative evaluations to determine the optimal configuration for combining the fundus image and FA in a deep learning framework, and demonstrate the enhancement in performance compared to previous methods.

Kyoung Jin Noh, Sang Jun Park, Soochahn Lee
Adaptive Dictionary Learning Based Multimodal Branch Retinal Vein Occlusion Fusion

Multimodal imaging can provide complementary information for the visualization of branch retinal vein occlusions (BRVO). We propose an adaptive dictionary learning based multimodal BRVO fusion method for color fundus photography (CFP), fluorescein angiography (FA) and optical coherence tomography angiography (OCTA) images. First, the contrast of lesion areas in CFP and FA images is enhanced using a local contrast enhancement algorithm based on standard deviation, while a Frangi filter based algorithm is adopted to enhance vessels in OCTA images. Then, the local energy and multi-scale spatial frequency of image patches are calculated as brightness and gradient features, respectively. The K-singular value decomposition (K-SVD) algorithm is used to train the brightness and gradient sub-dictionaries, which are merged to obtain the final adaptive dictionary. Finally, the orthogonal matching pursuit algorithm is adopted to calculate the sparse representation coefficients, and a maximum-absolute-value fusion strategy is applied to combine the multimodal information. Experimental results demonstrate that the proposed method is more effective for BRVO visualization than any single-modality image, because it combines complementary information from multimodal images. It is useful for the clinical evaluation of BRVO.

Chen Huang, Keren Xie, Yuhan Zhang, Mingchao Li, Zhongmin Wang, Qiang Chen
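
A rough illustration of the sparse-coding fusion pipeline, with scikit-learn's dictionary learner standing in for K-SVD and random arrays standing in for the enhanced image patches (both substitutions are simplifying assumptions):

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning
from sklearn.linear_model import orthogonal_mp

rng = np.random.default_rng(0)
patches_a = rng.random((500, 64))        # 8x8 patches from modality A (stand-in)
patches_b = rng.random((500, 64))        # corresponding patches from modality B

# Learn a shared dictionary over patches from both modalities
dico = MiniBatchDictionaryLearning(n_components=128, alpha=1.0, random_state=0)
D = dico.fit(np.vstack([patches_a, patches_b])).components_    # (128, 64) atoms

# Sparse codes via orthogonal matching pursuit, then max-absolute-value fusion
code_a = orthogonal_mp(D.T, patches_a.T, n_nonzero_coefs=5)    # (128, 500)
code_b = orthogonal_mp(D.T, patches_b.T, n_nonzero_coefs=5)
fused_code = np.where(np.abs(code_a) >= np.abs(code_b), code_a, code_b)
fused_patches = (D.T @ fused_code).T                           # reconstructed patches
```
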
TR-GAN: Topology Ranking GAN with Triplet Loss for Retinal Artery/Vein Classification

Retinal artery/vein (A/V) classification lays the foundation for the quantitative analysis of retinal vessels, which is associated with potential risks of various cardiovascular and cerebral diseases. The topological connection relationship, which has been proven effective in improving A/V classification performance for conventional graph-based methods, has not been exploited by deep learning based methods. In this paper, we propose a Topology Ranking Generative Adversarial Network (TR-GAN) to improve the topological connectivity of the segmented arteries and veins, and thereby boost A/V classification performance. A topology ranking discriminator based on ordinal regression is proposed to rank the topological connectivity levels of the ground truth, the generated A/V mask, and an intentionally shuffled mask. The ranking loss is back-propagated to the generator to produce better-connected A/V masks. In addition, a topology preserving module with triplet loss is proposed to extract high-level topological features and narrow the feature distance between the predicted A/V mask and the ground truth. The proposed framework effectively increases the topological connectivity of the predicted A/V masks and achieves state-of-the-art A/V classification performance on the publicly available AV-DRIVE dataset.

Wenting Chen, Shuang Yu, Junde Wu, Kai Ma, Cheng Bian, Chunyan Chu, Linlin Shen, Yefeng Zheng
DeepGF: Glaucoma Forecast Using the Sequential Fundus Images

Disease forecasting is an effective route to early treatment and prevention for some irreversible diseases, e.g., glaucoma. Different from existing disease detection methods that predict the current status of a patient, disease forecasting aims to predict the patient's future state for early treatment. This paper is a first attempt to address the glaucoma forecast task using the sequential fundus images of a patient. Specifically, we establish a database of sequential fundus images for glaucoma forecasting (SIGF), which includes an average of 9 images per eye, corresponding to 3,671 fundus images in total. Besides, a novel deep learning method for glaucoma forecasting (DeepGF) is proposed based on our SIGF database, consisting of an attention-polar convolutional neural network (AP-CNN) and a variable time interval long short-term memory (VTI-LSTM) network to learn the spatio-temporal transitions at different time intervals across the sequential medical images of a person. In addition, a novel active convergence (AC) training strategy is proposed to solve the imbalanced sample distribution problem of glaucoma forecasting. Finally, the experimental results show the effectiveness of our DeepGF method in glaucoma forecasting.

Liu Li, Xiaofei Wang, Mai Xu, Hanruo Liu, Ximeng Chen
Single-Shot Retinal Image Enhancement Using Deep Image Priors

Retinal images acquired using fundus cameras often contain visual artifacts due to imperfect imaging conditions, refractive medium turbidity, and motion blur. In addition, ocular diseases such as cataract also result in blurred retinal images. The presence of these visual artifacts reduces the effectiveness of the diagnostic process followed by an expert ophthalmologist or a computer-aided detection/diagnosis system. In this paper, we put forward a single-shot deep image prior (DIP)-based approach for retinal image enhancement. Unlike typical deep learning-based approaches, our method does not require any training data. Instead, our DIP-based method can learn the underlying image prior from a single degraded image. We show that the architecture of the convolutional neural network imposes a strong image prior that is sufficient for capturing the retinal image statistics and generating an enhanced image from a degraded version of it. We evaluate our proposed framework on five datasets and show that the enhanced images produced by our method perform significantly better on retinal image enhancement and synthesis tasks than several competitive baselines.

Adnan Qayyum, Waqas Sultani, Fahad Shamshad, Junaid Qadir, Rashid Tufail
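
A bare-bones deep-image-prior loop under simplifying assumptions (a toy network instead of a full encoder-decoder, plain MSE against the degraded image, and a fixed early-stopping step):

```python
import torch
import torch.nn as nn

net = nn.Sequential(                       # stand-in for a DIP encoder-decoder
    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 3, 3, padding=1),
)
z = torch.randn(1, 32, 256, 256)           # fixed random input code
degraded = torch.rand(1, 3, 256, 256)      # the single degraded retinal image
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(600):                    # early stopping is the key choice:
    opt.zero_grad()                        # the CNN prior fits image structure
    loss = ((net(z) - degraded) ** 2).mean()  # before it memorizes artifacts
    loss.backward()
    opt.step()
enhanced = net(z).detach()                 # take the output before overfitting
```
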
Robust Layer Segmentation Against Complex Retinal Abnormalities for en face OCTA Generation

On en face optical coherence tomography angiography (OCTA), the vascular patterns of the superficial vascular complex (SVC) and deep vascular complex (DVC) are distinguishable, while the outer retina is normally avascular. To visualize en face OCTA images of the different vascular patterns, segmenting the three regions is unavoidable. However, automated layer segmentation still faces huge challenges in eyes affected by manifold advanced tissue lesions. In this paper, we first design a region segmentation network based on an augmented 3D U-Net to fuse spectral domain optical coherence tomography (SD-OCT) structural information and the OCTA vascular distribution. Subsequently, an innovative multitask layer-by-layer recoding module breaks the voxel-wise region segmentation probability maps into independent refinement tasks, aiming to further weaken the influence of retinal abnormal regions on layer segmentation. Finally, a simple and effective layer surface encoding module converts the refined region segmentation result of each layer into a continuous surface vector, which eliminates outlier segmentation errors from the region segmentation tasks and guarantees the uniqueness and strict ordering of each retinal layer surface in each column. The model is validated on 262 eyes, including 95 normal eyes and 167 eyes affected by multifarious abnormalities. The experimental results demonstrate that our method achieves higher segmentation accuracy and is more robust against disease compared with state-of-the-art segmentation methods.

Yuhan Zhang, Chen Huang, Mingchao Li, Sha Xie, Keren Xie, Zexuan Ji, Songtao Yuan, Qiang Chen
Anterior Segment Eye Lesion Segmentation with Advanced Fusion Strategies and Auxiliary Tasks

Anterior segment diseases of the eye cover a wide range of pathologies that occur at the front part of the eye, such as conjunctival hemorrhage, corneal neovascularization and cataract. A tool for fast and accurate segmentation of anterior eye lesions can greatly benefit research and clinical care. However, the majority of current studies pay attention only to retina-related diseases such as diabetic retinopathy or macular degeneration and hence overlook this area. To enhance and boost research interest in this field, we construct the first anterior eye lesion segmentation dataset, with pixel-level annotations of four categories of lesions as well as the underlying eye structures. We propose a novel model for simultaneous eye lesion and structure segmentation with two advanced feature fusion strategies and two auxiliary tasks. The fusion strategies enable knowledge gained by the eye structure segmentation branch to be effectively transferred to the lesion segmentation branch, exploiting the intrinsic relationship between lesions and eye structures. The auxiliary tasks, which require no extra manual annotation, encourage the model to learn more general feature representations, thus improving segmentation performance. Experimental results on the newly constructed dataset indicate that our model consistently and significantly outperforms baseline models. The source code and the newly constructed dataset are made public to benefit future research.

Ke Wang, Xiaohong Liu, Kang Zhang, Ting Chen, Guangyu Wang
Cost-Sensitive Regularization for Diabetic Retinopathy Grading from Eye Fundus Images

Assessing the degree of disease severity in biomedical images is a task similar to standard classification but constrained by an underlying structure in the label space. Such a structure reflects the monotonic relationship between different disease grades. In this paper, we propose a straightforward approach to enforce this constraint for the task of predicting Diabetic Retinopathy (DR) severity from eye fundus images, based on the well-known notion of Cost-Sensitive classification. We expand standard classification losses with an extra term that acts as a regularizer, imposing greater penalties on predicted grades when they are farther away from the true grade associated with a particular image. Furthermore, we show how to adapt our method to model label noise in each of the sub-problems associated with DR grading, an approach we refer to as Atomic Sub-Task modeling. This yields models that can implicitly take into account the inherent noise present in DR grade annotations. Our experimental analysis on several public datasets reveals that, when a standard Convolutional Neural Network is trained using this simple strategy, improvements of 3-5% in quadratic-weighted kappa scores can be achieved at negligible computational cost. Code to reproduce our results is released at github.com/agaldran/cost_sensitive_loss_classification .

Adrian Galdran, Jose Dolz, Hadi Chakor, Hervé Lombaert, Ismail Ben Ayed
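
A minimal sketch of the cost-sensitive regularizer described above, adding a quadratic grade-distance penalty to cross-entropy (the cost function and weighting are assumptions; see the authors' repository for the exact loss):

```python
import torch
import torch.nn.functional as F

def cost_sensitive_loss(logits, target, lam=1.0):
    """Cross-entropy plus a penalty on probability mass far from the true
    ordinal grade. logits: (B, K); target: (B,) integer DR grades."""
    ce = F.cross_entropy(logits, target)
    probs = torch.softmax(logits, dim=1)                             # (B, K)
    grades = torch.arange(probs.shape[1], device=logits.device).float()
    cost = (grades.unsqueeze(0) - target.unsqueeze(1).float()) ** 2  # (B, K)
    return ce + lam * (probs * cost).sum(dim=1).mean()
```
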
Disentanglement Network for Unsupervised Speckle Reduction of Optical Coherence Tomography Images

Optical coherence tomography (OCT) has received increasing attention in the diagnosis of ophthalmic diseases due to its non-invasive character. However, the speckle noise associated with this low-coherence interferometric imaging modality has a considerably negative influence on its clinical application. Moreover, the lack of clean OCT images paired with corresponding noisy ones makes it difficult for supervised learning-based approaches to achieve satisfactory denoising results. Therefore, inspired by the ideas of disentangled representation and generative adversarial networks (GANs), we propose an unsupervised OCT image speckle reduction algorithm that first disentangles the noisy image into content and noise spaces via corresponding encoders. A generator is then used to predict the denoised OCT image from the extracted content features alone. In addition, pure noise patches cut from the noisy image are utilized to ensure a purer disentanglement. Extensive experiments have been conducted, and the results suggest that our proposed method achieves competitive performance with respect to other state-of-the-art approaches.

Yongqiang Huang, Wenjun Xia, Zexin Lu, Yan Liu, Jiliu Zhou, Leyuan Fang, Yi Zhang
Positive-Aware Lesion Detection Network with Cross-scale Feature Pyramid for OCT Images

Optical coherence tomography (OCT) is an important imaging technique in ophthalmology, and accurate detection of retinal lesions plays an important role in computer-aided diagnosis. However, the particularities of retinal lesions, such as their complex appearance and large variation in scale, limit the successful application of conventional deep learning-based object detection networks to OCT lesion detection. In this study, we propose a positive-aware lesion detection network with a cross-scale feature pyramid for OCT images. A cross-scale boost module with a non-local network is first applied to enhance the feature representation ability for OCT lesions of varying scales. To avoid lesion omission and misdetection, positive-aware network designs are then added to a two-stage detection network, including global-level positive estimation and local-level positive mining. Finally, we establish a large OCT dataset with multiple retinal lesions and perform extensive comparative experiments on it. The results demonstrate that our proposed network achieves 92.36 mean average precision (mAP) for OCT lesion detection, which is superior to other existing detection approaches.

Dongyi Fan, Chengfen Zhang, Bin Lv, Lilong Wang, Guanzheng Wang, Min Wang, Chuanfeng Lv, Guotong Xie
Retinal Layer Segmentation Reformulated as OCT Language Processing

In the medical field, semantic segmentation has recently been dominated by deep learning based image processing methods. Convolutional Neural Network approaches analyze image patches, draw complex features and latent representations, and use these to label image pixels and voxels. In this paper, we investigate the usefulness of Recurrent Neural Networks (RNNs) for segmentation of OCT images, in which the intensity of each element of an A-mode scan depends on the path the projected light takes through anatomical tissues to reach that point. The idea of this work is to reformulate this sequential voxel labeling/segmentation problem as language processing. Instead of treating images as patches, we regard them as a set of pixel-column sequences and thus tackle the task of image segmentation, in this case pixel-sequence labeling, as a problem akin to natural language processing. Anatomical consistency, i.e., the expected sequence of voxels representing the retinal layers of the eye's anatomy along each OCT ray, serves as a fixed and learnable grammar. We show the effectiveness of this approach on a layer segmentation task for retinal Optical Coherence Tomography (OCT) data. Due to the inherent directionality of the modality, certain properties and artifacts such as varying signal strength and shadowing form a consistent pattern along increasing imaging depth. The retinal layer structure lends itself to our approach due to the fixed order of layers along the imaging direction. We investigate the influence of different model choices, including simple RNNs, LSTMs and GRU structures, on the outcome of this layer segmentation approach. Experimental results show the potential of this idea, which performs on par with state-of-the-art works while being flexible to changes in the data structure.

Arianne Tran, Jakob Weiss, Shadi Albarqouni, Shahrooz Faghi Roohi, Nassir Navab
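
A minimal sketch of the column-as-sequence reformulation: run a recurrent network down each A-scan and emit a label per depth (layer sizes and the use of a bidirectional GRU are assumptions):

```python
import torch
import torch.nn as nn

class ColumnGRUSegmenter(nn.Module):
    """Label each pixel column of a B-scan as a sequence, in the spirit of
    treating OCT columns like sentences."""
    def __init__(self, n_classes: int, hidden: int = 64):
        super().__init__()
        self.gru = nn.GRU(input_size=1, hidden_size=hidden,
                          batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, n_classes)

    def forward(self, bscan):                     # bscan: (B, H, W) intensities
        b, h, w = bscan.shape
        cols = bscan.permute(0, 2, 1).reshape(b * w, h, 1)  # one sequence per column
        out, _ = self.gru(cols)                   # (B*W, H, 2*hidden)
        logits = self.head(out)                   # per-depth class logits
        return logits.reshape(b, w, h, -1).permute(0, 3, 2, 1)  # (B, K, H, W)

logits = ColumnGRUSegmenter(n_classes=10)(torch.rand(2, 496, 512))
```
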
Reconstruction and Quantification of 3D Iris Surface for Angle-Closure Glaucoma Detection in Anterior Segment OCT

Precise characterization and analysis of iris shape from Anterior Segment OCT (AS-OCT) are of great importance in facilitating the diagnosis of angle-closure-related diseases. Existing methods focus solely on analyzing structural properties identified from individual 2D slices, while accurate characterization of the morphological changes of the iris shape in 3D AS-OCT may additionally reveal the risk of disease progression. In this paper, we propose a novel framework for reconstruction and quantification of the 3D iris surface from an AS-OCT volume. We consider this to be the first work to detect angle-closure glaucoma by means of a 3D representation. An iris segmentation network with a wavelet refinement block (WRB) is first proposed to generate the initial iris shape from single AS-OCT slices. The 3D iris surface is then reconstructed using a guided optimization method with Poisson-disk sampling. Finally, a set of surface-based features is extracted and used to detect angle-closure glaucoma. Experimental results demonstrate that our method is highly effective in iris segmentation and surface reconstruction. Moreover, we show that the 3D-based representation achieves better performance in angle-closure glaucoma detection than 2D-based features.

Jinkui Hao, Huazhu Fu, Yanwu Xu, Yan Hu, Fei Li, Xiulan Zhang, Jiang Liu, Yitian Zhao
Open-Appositional-Synechial Anterior Chamber Angle Classification in AS-OCT Sequences

Anterior chamber angle (ACA) classification is a key step in the diagnosis of angle-closure glaucoma in Anterior Segment Optical Coherence Tomography (AS-OCT). Existing automated analysis methods focus on a binary classification system (i.e., open angle or angle-closure) in a 2D AS-OCT slice. However, clinical diagnosis requires a more discriminating three-class ACA system (i.e., open, appositional, or synechial angles) to benefit clinicians who seek to better understand the progression of the spectrum of angle-closure glaucoma types. To address this, we propose a novel sequence multi-scale aggregation deep network (SMA-Net) for open-appositional-synechial ACA classification based on AS-OCT sequences. In our method, a Multi-Scale Discriminative Aggregation (MSDA) block is utilized to learn multi-scale representations at the slice level, while a ConvLSTM is introduced to study the temporal dynamics of these representations at the sequence level. Finally, a multi-level loss function is used to combine the slice-based and sequence-based losses. The proposed method is evaluated on two AS-OCT datasets. The experimental results show that the proposed method outperforms existing state-of-the-art methods in applicability, effectiveness, and accuracy. We believe this work to be the first attempt to classify ACAs into open, appositional, or synechial types using AS-OCT sequences.

Huaying Hao, Huazhu Fu, Yanwu Xu, Jianlong Yang, Fei Li, Xiulan Zhang, Jiang Liu, Yitian Zhao
A Macro-Micro Weakly-Supervised Framework for AS-OCT Tissue Segmentation

Primary angle closure glaucoma (PACG) is the leading cause of irreversible blindness among Asian people. Early detection of PACG is essential in order to provide timely treatment and minimize vision loss. In clinical practice, PACG is diagnosed by analyzing the angle between the cornea and iris with anterior segment optical coherence tomography (AS-OCT). The rapid development of deep learning technologies makes it feasible to build a computer-aided system for fast and accurate segmentation of the cornea and iris tissues. However, the application of deep learning methods in the medical imaging field is still restricted by the lack of fully-annotated samples. In this paper, we propose a novel framework to accurately segment the target tissues in AS-OCT images using a combination of weakly-annotated images (the majority) and fully-annotated images (the minority). The proposed framework consists of two models that provide reliable guidance for each other. In addition, uncertainty-guided strategies are adopted to increase the accuracy and stability of the guidance. Detailed experiments on the publicly available AGE dataset demonstrate that the proposed framework outperforms state-of-the-art semi-/weakly-supervised methods and achieves performance comparable to the fully-supervised method. The proposed method is thus shown to be effective in exploiting the information contained in weakly-annotated images and capable of substantially relieving the annotation workload.

Munan Ning, Cheng Bian, Donghuan Lu, Hong-Yu Zhou, Shuang Yu, Chenglang Yuan, Yang Guo, Yaohua Wang, Kai Ma, Yefeng Zheng
Macular Hole and Cystoid Macular Edema Joint Segmentation by Two-Stage Network and Entropy Minimization

The co-occurrence of macular hole (MH) and cystoid macular edema (CME) indicates serious visual impairment in the ophthalmology clinic. Joint segmentation and quantitative analysis of MH and CME can greatly assist ophthalmologists in clinical diagnosis and treatment. Benefitting from advances in digital image processing, deep learning has shown remarkable performance in assisting doctors to diagnose diseases. In this paper, we propose a two-stage network for the segmentation of MH and CME, consisting of an MH auxiliary network and a joint segmentation network, in which the output of the LinkNet-based auxiliary network is used as the input of the joint segmentation network. The MH auxiliary network is designed to solve the problem that the top boundary of the MH is difficult for the joint segmentation network to discriminate. In the joint segmentation network, we add a mixed downsampling module to retain more fine feature information during downsampling. Furthermore, a new self-entropy loss function is proposed, which pays more attention to hard samples and reduces the uncertainty of network predictions. Experimental results show that our method achieves an average Dice of 89.32% and an average IoU of 81.42% in the segmentation of MH and CME, which are highly competitive results.

Lei Ye, Weifang Zhu, Dengsen Bao, Shuanglang Feng, Xinjian Chen
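
A plausible sketch of a self-entropy term that penalizes uncertain pixel predictions (how it is weighted against the segmentation loss is an assumption):

```python
import torch

def self_entropy_loss(logits):
    """Mean per-pixel entropy of the predicted class distribution; minimizing
    it pushes the network toward confident predictions on hard samples.
    logits: (B, K, H, W)."""
    probs = torch.softmax(logits, dim=1)
    ent = -(probs * torch.log(probs.clamp(min=1e-8))).sum(dim=1)  # (B, H, W)
    return ent.mean()
```
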
Retinal Nerve Fiber Layer Defect Detection with Position Guidance

Retinal nerve fiber layer defects (RNFLD) provide early diagnostic evidence for many irreversible disabling or blinding diseases. This paper aims at automated RNFLD detection based on fundus images. Different from previous works that only consider local contexts, we are the first to propose detecting RNFLD with position guidance, which captures both the physiological position and global dependencies with ease. Our solution consists of a position-consistent data preprocessing step, a Position Guided Network, and a weakly supervised learning strategy. In the position-consistent data preprocessing, the optic disc region is evenly divided into several sectors according to the distribution regularity of the RNFL. To detect RNFLD in sectors, the proposed Position Guided Network highlights the significant region with a position-aware attention module and captures the global dependencies with a bidirectional GRU module. RNFLD data suffer from noisy labels, which we verified in our newly created dataset containing 4,335 fundus images. Thus, a weakly supervised learning strategy, which jointly optimizes network parameters and label distributions, is proposed to reduce the impact of noisy labels. Tested on a clinical dataset of 750 images, our solution achieves outstanding performance, attaining an F1 score of 81.00% and outperforming the baseline by 13.71%.

Fei Ding, Gang Yang, Dayong Ding, Gangwei Cheng
An Elastic Interaction-Based Loss Function for Medical Image Segmentation

Deep learning techniques have shown success in medical image segmentation, since they are easy to manipulate and robust to various types of datasets. The commonly used loss functions for deep segmentation are pixel-wise, which creates a bottleneck for these models in achieving high precision on complicated structures in biomedical images. For example, the predicted small blood vessels in retinal images are often disconnected or even missed under the supervision of pixel-wise losses. This paper addresses this problem by introducing a long-range elastic interaction-based training strategy. In this strategy, the convolutional neural network (CNN) learns the target region under the guidance of the elastic interaction energy between the boundary of the predicted region and that of the actual object. Under the supervision of the proposed loss, the boundary of the predicted region is strongly attracted by the object boundary and tends to stay connected. Experimental results show that our method achieves considerable improvements over commonly used pixel-wise loss functions (cross-entropy and Dice loss) and other recent loss functions on three retinal vessel segmentation datasets: DRIVE, STARE and CHASE_DB1. The implementation is available at https://github.com/charrywhite/elastic_interaction_based_loss .

Yuan Lan, Yang Xiang, Luchan Zhang
Retinal Image Segmentation with a Structure-Texture Demixing Network

Retinal image segmentation plays an important role in automatic disease diagnosis. The task is very challenging because complex structure and texture information are mixed in a retinal image, and distinguishing the two is difficult. Existing methods handle texture and structure jointly, which may bias models toward recognizing textures and thus result in inferior segmentation performance. To address this, we propose a segmentation strategy that seeks to separate the structure and texture components, significantly improving performance. To this end, we design a structure-texture demixing network (STD-Net) that can process structures and textures differently and better. Extensive experiments on two retinal image segmentation tasks (i.e., blood vessel segmentation, and optic disc and cup segmentation) demonstrate the effectiveness of the proposed method.

Shihao Zhang, Huazhu Fu, Yanwu Xu, Yanxia Liu, Mingkui Tan
BEFD: Boundary Enhancement and Feature Denoising for Vessel Segmentation

Blood vessel segmentation is crucial for many diagnostic and research applications. In recent years, CNN-based models have led to breakthroughs in segmentation; however, such methods usually lose high-frequency information like object boundaries and subtle structures, which are vital to vessel segmentation. To tackle this issue, we propose a Boundary Enhancement and Feature Denoising (BEFD) module to improve the network's ability to extract boundary information in semantic segmentation, which can be integrated into arbitrary encoder-decoder architectures in an end-to-end way. By introducing a Sobel edge detector, the network acquires an additional edge prior, thus enhancing boundaries in an unsupervised manner for medical image segmentation. In addition, we utilize a denoising block to reduce the noise hidden in the low-level features. Experimental results on a retinal vessel dataset and an angiocarpy dataset demonstrate the superior performance of the new BEFD module.

Mo Zhang, Fei Yu, Jie Zhao, Li Zhang, Quanzheng Li
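
A minimal sketch of the Sobel edge prior: fixed, non-learnable Sobel kernels produce a gradient-magnitude map that could be concatenated to the input or intermediate features (how BEFD injects it is not reproduced here):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SobelEdgePrior(nn.Module):
    """Fixed Sobel filtering that supplies an unsupervised edge prior."""
    def __init__(self):
        super().__init__()
        gx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]])
        k = torch.stack([gx, gx.t()]).unsqueeze(1)         # (2, 1, 3, 3)
        self.register_buffer("kernel", k)                  # not trainable

    def forward(self, gray):                               # gray: (B, 1, H, W)
        g = F.conv2d(gray, self.kernel, padding=1)         # x- and y-gradients
        return torch.sqrt((g ** 2).sum(dim=1, keepdim=True) + 1e-8)  # magnitude
```
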
Boosting Connectivity in Retinal Vessel Segmentation via a Recursive Semantics-Guided Network

Many deep learning based methods have been proposed for retinal vessel segmentation; however, few of them focus on the connectivity of the segmented vessels, which is quite important for a practical computer-aided diagnosis system on retinal images. In this paper, we propose an efficient network to address this problem. A U-shaped network is enhanced by introducing a semantics-guided module, which integrates enriched semantic information into shallow layers to guide the network to explore more powerful features. Besides, a recursive refinement iteratively applies the same network to the previous segmentation results, progressively boosting performance while adding no extra network parameters. The carefully designed recursive semantics-guided network has been extensively evaluated on several public datasets. Experimental results show the efficiency of the proposed method.

Rui Xu, Tiantian Liu, Xinchen Ye, Lin Lin, Yen-Wei Chen
RVSeg-Net: An Efficient Feature Pyramid Cascade Network for Retinal Vessel Segmentation

Accurate retinal vessel segmentation plays a critical role in the diagnosis of many relevant diseases. However, it remains a challenging task due to (1) the great scale variation of retinal vessels, (2) the existence of a large number of capillaries in the vascular network, and (3) the interactions between retinal vessels and other structures in the images, which easily result in discontinuities in the segmentation results. In addition, limited training data often prevents current deep learning models from being efficiently trained because of overfitting. In this paper, we propose a novel and efficient feature pyramid cascade network for retinal vessel segmentation, called RVSeg-Net, to comprehensively address these challenges. The main component of RVSeg-Net is a feature pyramid cascade (FPC) module, which is capable of capturing multi-scale features to tackle the scale variation of retinal vessels, and of aggregating local and global context information to solve the discontinuity problem. To overcome overfitting, we further employ octave convolution in place of traditional vanilla convolution to greatly reduce the number of parameters by avoiding spatially redundant information. We conducted extensive experiments on two mainstream retinal vessel datasets (DRIVE and CHASE_DB1) to validate the proposed RVSeg-Net. Experimental results demonstrate the effectiveness of the proposed method, which outperforms state-of-the-art approaches with far fewer parameters.

Wei Wang, Jiafu Zhong, Huisi Wu, Zhenkun Wen, Jing Qin
Backmatter
Metadata
Title
Medical Image Computing and Computer Assisted Intervention – MICCAI 2020
Edited by
Prof. Anne L. Martel
Purang Abolmaesumi
Danail Stoyanov
Diana Mateus
Maria A. Zuluaga
S. Kevin Zhou
Daniel Racoceanu
Prof. Leo Joskowicz
Copyright Year
2020
Electronic ISBN
978-3-030-59722-1
Print ISBN
978-3-030-59721-4
DOI
https://doi.org/10.1007/978-3-030-59722-1
