
## About this book

The three-volume set LNCS 10433, 10434, and 10435 constitutes the refereed proceedings of the 20th International Conference on Medical Image Computing and Computer-Assisted Intervention, MICCAI 2017, held in Quebec City, Canada, in September 2017.
The 255 revised full papers presented were carefully reviewed and selected from 800 submissions in a two-phase review process. The papers have been organized in the following topical sections: Part I: atlas and surface-based techniques; shape and patch-based techniques; registration techniques, functional imaging, connectivity, and brain parcellation; diffusion magnetic resonance imaging (dMRI) and tensor/fiber processing; and image segmentation and modelling. Part II: optical imaging; airway and vessel analysis; motion and cardiac analysis; tumor processing; planning and simulation for medical interventions; interventional imaging and navigation; and medical image computing. Part III: feature extraction and classification techniques; and machine learning in medical image computing.

## Table of Contents

### Erratum to: Deep Learning for Sensorless 3D Freehand Ultrasound Imaging

Raphael Prevost, Mehrdad Salehi, Julian Sprung, Alexander Ladikos, Robert Bauer, Wolfgang Wein

### Cell Lineage Tracing in Lens-Free Microscopy Videos

In vitro experiments with cell cultures are essential for studying growth and migration behaviour and thus for gaining a better understanding of cancer progression and its treatment. While recent progress in lens-free microscopy (LFM) has rendered it an inexpensive tool for continuous monitoring of these experiments, there is little work on analysing such time-lapse sequences. We propose (1) a cell detector for LFM images based on residual learning, and (2) a probabilistic model based on moral lineage tracing that explicitly handles multiple detections and temporal successor hypotheses by clustering and tracking simultaneously. (3) We benchmark our method on several hours of LFM time-lapse sequences in terms of detection and tracking scores. Finally, (4) we demonstrate its effectiveness for quantifying cell population dynamics.

Markus Rempfler, Sanjeev Kumar, Valentin Stierle, Philipp Paulitschke, Bjoern Andres, Bjoern H. Menze

### Separation of Transmitted Light and Scattering Components in Transmitted Microscopy

In transmitted light microscopy, a specimen often appears unclear. This happens because the image sensor captures the sum of light rays that have traveled along different paths due to scattering. To cope with this problem, we propose a novel computational photography approach for separating directly transmitted light from scattered light in a transmitted light microscope by using high-frequency lighting. We first investigated the light paths and clarified what types of light overlap in transmitted light microscopy. The scattered light can be simply represented and removed by using the difference in observations between focused and unfocused conditions, where the high-frequency illumination becomes homogeneous. Our method makes a novel spatial multiple-spectral absorption analysis possible, which requires absorption coefficients to be measured in each spectrum at each position. Experiments on real biological tissues demonstrated the effectiveness of our method.

Mihoko Shimano, Ryoma Bise, Yinqiang Zheng, Imari Sato

### Neuron Segmentation Using Deep Complete Bipartite Networks

In this paper, we consider the problem of automatically segmenting neuronal cells in dual-color confocal microscopy images. This problem is a key task in various quantitative analysis applications in neuroscience, such as tracing cell genesis in Danio rerio (zebrafish) brains. Deep learning, especially using fully convolutional networks (FCN), has profoundly changed segmentation research in biomedical imaging. We face two major challenges in this problem. First, neuronal cells may form dense clusters, making it difficult to correctly identify all individual cells (even to human experts). Consequently, segmentation results of the known FCN-type models are not accurate enough. Second, pixel-wise ground truth is difficult to obtain. Only a limited amount of approximate instance-wise annotation can be collected, which makes the training of FCN models quite cumbersome. We propose a new FCN-type deep learning model, called deep complete bipartite networks (CB-Net), and a new scheme for leveraging approximate instance-wise annotation to train our pixel-wise prediction model. Evaluated using seven real datasets, our proposed new CB-Net model outperforms the state-of-the-art FCN models and produces neuron segmentation results of remarkable quality.

Jianxu Chen, Sreya Banerjee, Abhinav Grama, Walter J. Scheirer, Danny Z. Chen

### Cell Encoding for Histopathology Image Classification

Although many image analysis algorithms can achieve good performance with a sufficient number of labeled images, manual labeling of images by pathologists is time-consuming and expensive. Meanwhile, with the development of cell detection and segmentation techniques, it is possible to classify pathology images by using cell-level information, which is crucial to grade different diseases; however, it is still very challenging to efficiently conduct cell analysis on large-scale image databases, since one image often contains a large number of cells. To address these issues, in this paper, we present a novel cell-based framework that requires only a few labeled images to classify large-scale pathology image sets. Specifically, we encode each cell into a set of binary codes to generate an image representation using a semi-supervised hashing model, which can take advantage of both labeled and unlabeled cells. Thereafter, we map all the binary codes in one whole image into a single histogram vector and then learn a support vector machine for image classification. The proposed framework is validated on one large-scale lung cancer image dataset with two types of diseases, and it can achieve 87.88% classification accuracy on 800 test images using only 5 labeled images of each disease.

Xiaoshuang Shi, Fuyong Xing, Yuanpu Xie, Hai Su, Lin Yang
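
The histogram step described in the abstract above can be illustrated with a minimal sketch. It assumes one plausible reading: each cell's binary code is treated as an integer bin index, and the image is represented by the normalized counts over all possible codes. The function name `code_histogram` and the toy codes are hypothetical; the semi-supervised hashing model and the support vector machine from the paper are not shown.

```python
def code_histogram(cell_codes, n_bits):
    """Map the binary codes of all cells in one image to one histogram vector.

    cell_codes: list of tuples of 0/1 values, one tuple per cell (assumed format).
    n_bits: length of each binary code.
    Returns a list of 2**n_bits relative frequencies.
    """
    counts = [0] * (2 ** n_bits)
    for code in cell_codes:
        if len(code) != n_bits:
            raise ValueError("every code must have exactly n_bits entries")
        # Read the bits as a binary number to obtain the bin index.
        index = int("".join(str(bit) for bit in code), 2)
        counts[index] += 1
    total = len(cell_codes)
    if total == 0:
        return [0.0] * len(counts)
    # Normalize so that images with different cell counts are comparable.
    return [count / total for count in counts]


# Toy image with four cells and 2-bit codes (illustrative values only).
toy_codes = [(0, 1), (0, 1), (1, 1), (0, 0)]
toy_histogram = code_histogram(toy_codes, n_bits=2)
```

A vector of this kind could then be passed to any standard classifier. Normalizing by the number of cells is a design choice made here, not something the abstract specifies.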

### Endoscopic Depth Measurement and Super-Spectral-Resolution Imaging

Intra-operative measurements of tissue shape and multi/hyperspectral information have the potential to provide surgical guidance and decision-making support. We report an optical-probe-based system that combines sparse hyperspectral measurements and spectrally-encoded structured lighting (SL) for surface measurements. The system provides informative signals for navigation with a surgical interface. By rapidly switching between SL and white light (WL) modes, SL information is combined with structure-from-motion (SfM) from white light images, based on SURF feature detection and Lucas-Kanade (LK) optical flow, to provide quasi-dense surface shape reconstruction with known scale in real time. Furthermore, “super-spectral-resolution” was realized, whereby the RGB images and sparse hyperspectral data were integrated to recover dense pixel-level hyperspectral stacks by using convolutional neural networks to upscale the wavelength dimension. Validation and demonstration of this system are reported on ex vivo/in vivo animal/human experiments.

Jianyu Lin, Neil T. Clancy, Yang Hu, Ji Qi, Taran Tatla, Danail Stoyanov, Lena Maier-Hein, Daniel S. Elson

### Computational Immunohistochemistry: Recipes for Standardization of Immunostaining

Cancer diagnosis and personalized cancer treatment are heavily based on the visual assessment of immunohistochemically-stained tissue specimens. The precision of this assessment depends critically on the quality of immunostaining, which is governed by a number of parameters used in the staining process. Tuning of the staining-process parameters is mostly based on pathologists’ qualitative assessment, which incurs inter- and intra-observer variability. The lack of standardization in staining across pathology labs leads to poor reproducibility and consequently to uncertainty in diagnosis and treatment selection. In this paper, we propose a methodology to address this issue through a quantitative evaluation of the staining quality by using visual computing and machine learning techniques on immunohistochemically-stained tissue images. This enables a statistical analysis of the sensitivity of the staining quality to the process parameters and thereby provides an optimal operating range for obtaining high-quality immunostains. We evaluate the proposed methodology on HER2-stained breast cancer tissues and demonstrate its use to define guidelines to optimize and standardize immunostaining.

Nuri Murat Arar, Pushpak Pati, Aditya Kashyap, Anna Fomitcheva Khartchenko, Orcun Goksel, Govind V. Kaigala, Maria Gabrani

### Two-Stream Bidirectional Long Short-Term Memory for Mitosis Event Detection and Stage Localization in Phase-Contrast Microscopy Images

In this paper, we propose a Two-Stream Bidirectional Long Short-Term Memory (TS-BLSTM) for the task of mitosis event detection and stage localization in time-lapse phase contrast microscopy image sequences. Our method consists of two steps. First, we extract candidate mitosis image sequences. Then, we solve the problem of mitosis event detection and stage localization jointly by the proposed TS-BLSTM, which utilizes both appearance and motion information from candidate sequences. The proposed method outperforms state-of-the-art methods by achieving 98.4% precision and 97.0% recall for mitosis detection and 0.62 frame error on average for mitosis stage localization in five challenging image sequences.

Yunxiang Mao, Zhaozheng Yin

### Refocusing Phase Contrast Microscopy Images

Phase contrast microscopy is a very popular non-invasive technique for monitoring live cells. However, its images can be blurred if the optics are imperfectly aligned, and the visualization of specimen details can be affected by a noisy background. We propose an effective algorithm to refocus phase contrast microscopy images from two perspectives: optics and specimens. First, given a defocused image caused by misaligned optics, we estimate the blur kernel based on the sparse prior of the dark channel, and non-blindly refocus the image with the hyper-Laplacian prior of image gradients. Then, we further refocus the image contents on specimens by removing the artifacts from the background, which provides a sharp visualization of fine specimen details. The proposed algorithm is both qualitatively and quantitatively evaluated on a dataset of 500 phase contrast microscopy images, showing its superior performance for visualizing specimens and facilitating microscopy image analysis.

Liang Han, Zhaozheng Yin

### Semi-supervised Segmentation of Optic Cup in Retinal Fundus Images Using Variational Autoencoder

Accurate segmentation of the optic cup and disc in retinal fundus images is essential to compute the cup-to-disc ratio, which is important for glaucoma assessment. The ill-defined boundaries of the optic cup make its segmentation much more challenging than that of the optic disc. Existing approaches have mainly used fully supervised learning, which requires many labeled samples to build a robust segmentation framework. In this paper, we propose a novel semi-supervised method to segment the optic cup, which can accurately localize the anatomy using a limited number of labeled samples. The proposed method leverages the inherent feature similarity from a large number of unlabeled images to train the segmentation model from a smaller number of labeled images. It first learns the parameters of a generative model from unlabeled images using a variational autoencoder. The trained generative model provides the feature embedding of the images, which allows the clustering of related observations in the latent feature space. We combine the feature embedding with the segmentation autoencoder, which is trained on the labeled images for pixel-wise segmentation of the cup region. The main novelty of the proposed approach is in the utilization of generative models for semi-supervised segmentation. Experimental results show that the proposed method successfully segments the optic cup with a small number of labeled images, and that the unsupervised feature embedding learned from unlabeled data improves the segmentation accuracy. Given the challenge of access to annotated medical images in every clinical application, the proposed framework is a key contribution and is applicable to the segmentation of different anatomies across various medical imaging modalities.

Suman Sedai, Dwarikanath Mahapatra, Sajini Hewavitharanage, Stefan Maetschke, Rahil Garnavi

### QuaSI: Quantile Sparse Image Prior for Spatio-Temporal Denoising of Retinal OCT Data

Optical coherence tomography (OCT) enables high-resolution and non-invasive 3D imaging of the human retina but is inherently impaired by speckle noise. This paper introduces a spatio-temporal denoising algorithm for OCT data on a B-scan level using a novel quantile sparse image (QuaSI) prior. To remove speckle noise while preserving image structures of diagnostic relevance, we implement our QuaSI prior via median filter regularization coupled with a Huber data fidelity model in a variational approach. For efficient energy minimization, we develop an alternating direction method of multipliers (ADMM) scheme using a linearization of median filtering. Our spatio-temporal method can denoise both single B-scans and temporally consecutive B-scans, yielding volumetric OCT data with an enhanced signal-to-noise ratio. Using only 4 B-scans, our algorithm achieved performance comparable to averaging 13 B-scans and outperformed other current denoising methods.

Franziska Schirrmacher, Thomas Köhler, Lennart Husvogt, James G. Fujimoto, Joachim Hornegger, Andreas K. Maier

### Fast Background Removal Method for 3D Multi-channel Deep Tissue Fluorescence Imaging

Recent advances in tissue clearing and optical imaging have enabled us to obtain three-dimensional high-resolution images of various tissues. However, severe background noise remains a major obstacle, and there is an urgent need for fast background correction methods. In this paper, we present a fast background removal method for 3D multi-channel deep tissue fluorescence images, in which the objectives of different channels are well separated. We first conduct a window-based normalization to distinguish foreground signals from background noise in all channels. Then, we identify the pure background regions by subtracting images in different channels, which allows us to estimate the background noise of the whole images by interpolation. Experiments on real 3D datasets of mouse stomach show that our method has superior performance and efficiency compared with current state-of-the-art background correction methods.

Chenchen Li, Xiaowei Li, Hongji Cao, He Jiang, Xiaotie Deng, Danny Z. Chen, Lin Yang, Zhifeng Shao

### Fast Estimation of Haemoglobin Concentration in Tissue Via Wavelet Decomposition

Tissue oxygenation and perfusion can be an indicator of organ viability during minimally invasive surgery, for example allowing real-time assessment of tissue perfusion and oxygen saturation. Multispectral imaging (MSI) is an optical modality that can inspect tissue perfusion in wide-field images without contact. In this paper, we present a novel, fast method for using RGB images for MSI, which, while limiting the spectral resolution of the modality, allows normal laparoscopic systems to be used. We exploit the discrete Haar decomposition to separate individual video frames into low-pass and directional coefficients, and we utilise a different multispectral estimation technique on each. The increase in speed is achieved by using fast Tikhonov regularisation on the directional coefficients and more accurate Bayesian estimation on the low-pass component. The pipeline is implemented using a graphics processing unit (GPU) architecture and achieves a frame rate of approximately 15 Hz. We validate the method on animal models and on human data captured using a da Vinci stereo laparoscope.

Geoffrey Jones, Neil T. Clancy, Xiaofei Du, Maria Robu, Simon Arridge, Daniel S. Elson, Danail Stoyanov
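
The Haar decomposition step named in the abstract above can be sketched as follows. This is a minimal single-level, single-channel illustration in plain Python, assuming an orthonormal 2D Haar transform on a frame with even height and width. The function names and the sub-band labels are hypothetical, and the Tikhonov and Bayesian estimators that the paper applies to the sub-bands are not shown.

```python
def haar2d_level(frame):
    """One level of an orthonormal 2D Haar transform.

    frame: list of rows (lists of numbers) with even height and width.
    Returns (low, horizontal, vertical, diagonal) sub-bands, each half-size.
    """
    rows, cols = len(frame), len(frame[0])
    if rows % 2 or cols % 2:
        raise ValueError("frame height and width must be even")
    low, hor, ver, dia = [], [], [], []
    for r in range(0, rows, 2):
        low_row, hor_row, ver_row, dia_row = [], [], [], []
        for c in range(0, cols, 2):
            a, b = frame[r][c], frame[r][c + 1]          # top pixel pair
            d, e = frame[r + 1][c], frame[r + 1][c + 1]  # bottom pixel pair
            low_row.append((a + b + d + e) / 2.0)  # low-pass (scaled local sum)
            hor_row.append((a + b - d - e) / 2.0)  # top minus bottom
            ver_row.append((a - b + d - e) / 2.0)  # left minus right
            dia_row.append((a - b - d + e) / 2.0)  # diagonal difference
        low.append(low_row)
        hor.append(hor_row)
        ver.append(ver_row)
        dia.append(dia_row)
    return low, hor, ver, dia


def inverse_haar2d_level(low, hor, ver, dia):
    """Rebuild the frame from the four sub-bands of haar2d_level."""
    frame = []
    for r in range(len(low)):
        top, bottom = [], []
        for c in range(len(low[0])):
            l, h, v, g = low[r][c], hor[r][c], ver[r][c], dia[r][c]
            top += [(l + h + v + g) / 2.0, (l + h - v - g) / 2.0]
            bottom += [(l - h + v - g) / 2.0, (l - h - v + g) / 2.0]
        frame.append(top)
        frame.append(bottom)
    return frame


# Tiny illustrative frame; real video frames would be processed per colour channel.
toy_frame = [[1, 2], [3, 4]]
toy_low, toy_hor, toy_ver, toy_dia = haar2d_level(toy_frame)
```

Because the low-pass band has a quarter of the pixels, a slower but more accurate estimator applied only to that band touches far fewer samples than one applied to the whole frame.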

### Efficient Reconstruction of Holographic Lens-Free Images by Sparse Phase Recovery

Digital holographic lens-free imaging is based on recording the diffraction pattern of light after it passes through a specimen and post-processing the recorded diffraction pattern to reconstruct an image of the specimen. If the full, complex-valued wave-front of the diffraction pattern could be recorded then the image reconstruction process would be straight-forward, but unfortunately image sensors typically only record the amplitude of the diffraction pattern but not the phase. As a result, many conventional reconstruction techniques suffer from substantial artifacts and degraded image quality. This paper presents a computationally efficient technique to reconstruct holographic lens-free images based on sparsity, which improves image quality over existing techniques, allows for the possibility of reconstructing images over a 3D volume of focal-depths simultaneously from a single recorded hologram, provides a robust estimate of the missing phase information in the hologram, and automatically identifies the focal depths of the imaged objects in a robust manner.

Benjamin D. Haeffele, Richard Stahl, Geert Vanmeerbeeck, René Vidal

### Quality Assessment of Retinal Hyperspectral Images Using SURF and Intensity Features

Hyperspectral (HSI) retinal imaging is an emergent modality for disease diagnosis such as diabetic retinopathy. HSI represents the retina as a 3D cube, with two spatial dimensions and one spectral, meaning that spectral signatures associated with a disease may be identified. The quality of this hypercube influences the accuracy of automatic diagnosis. Three main artifacts may limit the hypercube’s quality: parasitic contribution (e.g. blinking or ghost), uneven illumination and blurriness. We present a method for artifact detection and quality assessment using SURF features and intensity-based analysis. Quality evaluation has a rich literature in classic fundus images. However, none of these works have tackled the challenges related to HSI. Hypercubes from volunteers recruited at an eye clinic, in reflectance (48) and fluorescence (32) imaging modes, were captured using a Metabolic Hyperspectral Retinal Camera based on a tuneable light source in the visible and near infrared spectral range (450–900 nm). Compared with the ratings of two observers, our proposed method shows encouraging results in artifact detection and quality assessment.

Faten M’hiri, Claudia Chevrefils, Jean-Philippe Sylvestre

### Isotropic Reconstruction of 3D Fluorescence Microscopy Images Using Convolutional Neural Networks

Fluorescence microscopy images usually show severe anisotropy in axial versus lateral resolution. This hampers downstream processing, i.e. the automatic extraction of quantitative biological data. While deconvolution methods and other techniques to address this problem exist, they are either time consuming to apply or limited in their ability to remove anisotropy. We propose a method to recover isotropic resolution from readily acquired anisotropic data. We achieve this using a convolutional neural network that is trained end-to-end from the same anisotropic body of data we later apply the network to. The network effectively learns to restore the full isotropic resolution by restoring the image under a trained, sample specific image prior. We apply our method to 3 synthetic and 3 real datasets and show that our results improve on results from deconvolution and state-of-the-art super-resolution techniques. Finally, we demonstrate that a standard 3D segmentation pipeline performs on the output of our network with comparable accuracy as on the full isotropic data.

Martin Weigert, Loic Royer, Florian Jug, Gene Myers

### Deep Learning for Isotropic Super-Resolution from Non-isotropic 3D Electron Microscopy

The most sophisticated existing methods to generate 3D isotropic super-resolution (SR) from non-isotropic electron microscopy (EM) are based on learned dictionaries. Unfortunately, none of the existing methods generate practically satisfying results. For 2D natural images, recently developed super-resolution methods that use deep learning have been shown to significantly outperform the previous state of the art. We have adapted one of the most successful architectures (FSRCNN) for 3D super-resolution, and compared its performance to a 3D U-Net architecture that has not been used previously to generate super-resolution. We trained both architectures on artificially downscaled isotropic ground truth from focused ion beam milling scanning EM (FIB-SEM) and tested the performance for various hyperparameter settings. Our results indicate that both architectures can successfully generate 3D isotropic super-resolution from non-isotropic EM, with the U-Net performing consistently better. We propose several promising directions for practical application.

Larissa Heinrich, John A. Bogovic, Stephan Saalfeld

### Histological Detection of High-Risk Benign Breast Lesions from Whole Slide Images

Accurate diagnosis of high-risk benign breast lesions is crucial in patient management since they are associated with an increased risk of invasive breast cancer development. Since it is not yet possible to identify the occult cancer patients without surgery, this limitation leads to retrospectively unnecessary surgeries. In this paper, we present a computational pathology pipeline for histological diagnosis of high-risk benign breast lesions from whole slide images (WSIs). Our pipeline includes WSI stain color normalization, ductal regions of interest (ROIs) segmentation, and cytological and architectural feature extraction to classify ductal ROIs into triaged high-risk benign lesions. We curated 93 WSIs of breast tissues containing high-risk benign lesions based on pathology reports and collected ground truth annotations from three different pathologists for the ductal ROIs segmented by our pipeline. Our method has comparable performance to a pool of expert pathologists.

Akif Burak Tosun, Luong Nguyen, Nathan Ong, Olga Navolotskaia, Gloria Carter, Jeffrey L. Fine, D. Lansing Taylor, S. Chakra Chennubhotla

### Accurate Correspondence of Cone Photoreceptor Neurons in the Human Eye Using Graph Matching Applied to Longitudinal Adaptive Optics Images

Loss of cone photoreceptor neurons is a leading cause of many blinding retinal diseases. Direct visualization of these cells in the living human eye is now feasible using adaptive optics scanning light ophthalmoscopy (AOSLO). However, it remains challenging to monitor the state of specific cells across multiple visits, due to inherent eye-motion-based distortions that arise during data acquisition, artifacts when overlapping images are montaged, as well as substantial variability in the data itself. This paper presents an accurate graph matching framework that integrates (1) robust local intensity order patterns (LIOP) to describe neuron regions with illumination variation from different visits; (2) a sparse-coding based voting process to measure visual similarities of neuron pairs using LIOP descriptors; and (3) a graph matching model that combines both visual similarity and geometrical cone packing information to determine the correspondence of repeated imaging of cone photoreceptor neurons across longitudinal AOSLO datasets. The matching framework was evaluated on imaging data from ten subjects using a validation dataset created by removing 15% of the neurons from 713 neuron correspondences across image pairs. An overall matching accuracy of 98% was achieved. The framework was robust to differences in the amount of overlap between image pairs. Evaluation on a test dataset showed that the matching accuracy remained at 98% on approximately 3400 neuron correspondences, despite image quality degradation, illumination variation, large image deformation, and edge artifacts. These experimental results show that our graph matching approach can accurately identify cone photoreceptor neuron correspondences on longitudinal AOSLO images.

Jianfei Liu, HaeWon Jung, Johnny Tam

### Active Learning and Proofreading for Delineation of Curvilinear Structures

Many state-of-the-art delineation methods rely on supervised machine learning algorithms. As a result, they require manually annotated training data, which is tedious to obtain. Furthermore, even minor classification errors may significantly affect the topology of the final result. In this paper we propose a generic approach to addressing both of these problems by taking into account the influence of a potential misclassification on the resulting delineation. In an Active Learning context, we identify parts of linear structures that should be annotated first in order to train a classifier effectively. In a proofreading context, we similarly find regions of the resulting reconstruction that should be verified in priority to obtain a nearly-perfect result. In both cases, by focusing the attention of the human expert on potential classification mistakes which are the most critical parts of the delineation, we reduce the amount of required supervision. We demonstrate the effectiveness of our approach on microscopy images depicting blood vessels and neurons.

Agata Mosinska, Jakub Tarnawski, Pascal Fua

### Tracking the Aortic Lumen Geometry by Optimizing the 3D Orientation of Its Cross-sections

We propose a fast incremental technique to compute the 3D geometry of the aortic lumen from a seed point located inside it. Our approach is based on the optimization of the 3D orientation of the cross-sections of the aorta. The method uses a robust ellipse estimation algorithm and an energy-based optimization technique to automatically track the centerline and the cross-sections. To perform the optimization, we consider the size and the eccentricity of the ellipse that best fits the contour of the aorta on each cross-sectional plane. The method works directly on the original image and does not require a prior segmentation of the aortic lumen. We present some preliminary results which show the accuracy of the method and its ability to cope with challenging real CT (computed tomography) images of aortic lumens with significant angulations due to severe elongations.

Luis Alvarez, Agustín Trujillo, Carmelo Cuenca, Esther González, Julio Esclarín, Luis Gomez, Luis Mazorra, Miguel Alemán-Flores, Pablo G. Tahoces, José M. Carreira

### BRANCH: Bifurcation Recognition for Airway Navigation based on struCtural cHaracteristics

Bronchoscopic navigation is challenging, especially at the level of peripheral airways due to the complicated bronchial structures and the large respiratory motion. The aim of this paper is to propose a localisation approach tailored for navigation in the distal airway branches. Salient regions are detected on the depth maps of video images and CT virtual projections to extract anatomically meaningful areas that represent airway bifurcations. An airway descriptor based on shape context is introduced which encodes both the structural characteristics of the bifurcations and their spatial distribution. The bronchoscopic camera is localised in the airways by minimising the cost of matching the region features in video images to the pre-computed CT depth maps considering both the shape and temporal information. The method has been validated on phantom and in vivo data and the results verify its robustness to tissue deformation and good performance in distal airways.

Mali Shen, Stamatia Giannarou, Pallav L. Shah, Guang-Zhong Yang

### Anatomy-Driven Modelling of Spatial Correlation for Regularisation of Arterial Spin Labelling Images

Arterial spin labelling (ASL) allows blood flow to be measured in the brain and other organs of the body, which is valuable for both research and clinical use. Unfortunately, ASL suffers from an inherently low signal to noise ratio, necessitating methodological advances in ASL acquisition and processing. Spatial regularisation improves the effective signal to noise ratio, and is a common step in ASL processing. However, the standard spatial regularisation technique requires a manually-specified smoothing kernel of an arbitrary size, and can lead to loss of fine detail. Here, we present a Bayesian model of spatial correlation, which uses anatomical information from structural images to perform principled spatial regularisation, modelling the underlying signal and removing the need to set arbitrary smoothing parameters. Using data from a large cohort (N = 130) of preterm-born adolescents and age-matched controls, we show our method yields significant improvements in test-retest reproducibility, increasing the correlation coefficient by 14% relative to Gaussian smoothing and giving a corresponding improvement in statistical power. This novel technique has the potential to significantly improve single inversion time ASL studies, allowing more reliable detection of perfusion differences with a smaller number of subjects.

David Owen, Andrew Melbourne, Zach Eaton-Rosen, David L. Thomas, Neil Marlow, Jonathan Rohrer, Sebastien Ourselin

### Tracking and Segmentation of the Airways in Chest CT Using a Fully Convolutional Network

Airway segmentation plays an important role in analyzing chest computed tomography (CT) volumes for applications such as lung cancer detection, chronic obstructive pulmonary disease (COPD), and surgical navigation. However, due to the complex tree-like structure of the airways, obtaining segmentation results with high accuracy for a complete 3D airway extraction remains a challenging task. In recent years, deep learning based methods, especially fully convolutional networks (FCN), have improved the state-of-the-art in many segmentation tasks. 3D U-Net is an example that is optimized for 3D biomedical imaging. It consists of a contracting encoder part to analyze the input volume and a successive decoder part to generate integrated 3D segmentation results. While 3D U-Net can be trained for any 3D segmentation task, its direct application to airway segmentation is challenging due to differently sized airway branches. In this work, we combine 3D deep learning with image-based tracking in order to automatically extract the airways. Our method is driven by adaptive cuboidal volume of interest (VOI) analysis using a 3D U-Net model. We track the airways along their centerlines and set VOIs according to the diameter and running direction of each airway. After setting a VOI, the 3D U-Net is utilized to extract the airway region inside the VOI. All extracted candidate airway regions are unified to form an integrated airway tree. We trained on 30 cases and tested our method on an additional 20 cases. Compared with other state-of-the-art airway tracking and segmentation methods, our method can increase the detection rate by 5.6 while decreasing the false positives (FP) by 0.7 percentage points.

Qier Meng, Holger R. Roth, Takayuki Kitasaka, Masahiro Oda, Junji Ueno, Kensaku Mori

### Automatic Labeling of Vascular Structures with Topological Constraints via HMM

Identification of anatomical vessel branches is a prerequisite task for diagnosis, treatment and inter-subject comparison. We propose a novel graph labeling approach to anatomically label vascular structures of interest. Our method first extracts bifurcations of interest from the centerlines of vessels, from which a set of geometric features is also calculated. Then the probability distribution of every bifurcation is learned using an XGBoost classifier. Finally, a Hidden Markov Model with a restricted transition strategy is constructed in order to find the most likely labeling configuration of the whole structure, while also enforcing topological consistency. The proposed approach has been evaluated through leave-one-out cross validation on centerlines from 50 subjects, obtained from MRA images of the Circle of Willis of healthy volunteers. Results demonstrate that our method can achieve higher accuracy and specificity, while obtaining similar precision and recall, compared to the best performing state-of-the-art methods. Our algorithm can handle different topologies, such as circle, chain and tree. By using coordinate-independent geometrical features, it does not require prior global alignment.

Xingce Wang, Yue Liu, Zhongke Wu, Xiao Mou, Mingquan Zhou, Miguel A. González Ballester, Chong Zhang
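The final step described above, finding the most likely labeling under restricted transitions, is commonly solved with the Viterbi algorithm. The following is a minimal sketch under stated assumptions, not the authors' implementation: it handles only a chain of bifurcations, the function name `viterbi` and all probability values are hypothetical, and a transition probability of 0 is used to forbid anatomically impossible label successions.

```python
import math


def viterbi(emission, transition, initial):
    """Most likely label sequence for a chain of nodes.

    emission[t][s]   -- classifier probability of label s at node t
    transition[r][s] -- probability that label s follows label r (0 forbids it)
    initial[s]       -- prior probability of label s at the first node
    """
    def log(p):
        # Forbidden events get -inf so they can never win a max().
        return math.log(p) if p > 0 else float("-inf")

    n_states = len(initial)
    score = [log(initial[s]) + log(emission[0][s]) for s in range(n_states)]
    back = []  # back[t-1][s] = best predecessor of label s at node t
    for t in range(1, len(emission)):
        new_score, pointers = [], []
        for s in range(n_states):
            best_r = max(range(n_states),
                         key=lambda r: score[r] + log(transition[r][s]))
            new_score.append(score[best_r] + log(transition[best_r][s])
                             + log(emission[t][s]))
            pointers.append(best_r)
        score = new_score
        back.append(pointers)

    # Trace the best path backwards from the best final label.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Hypothetical example: three bifurcations, three labels (0, 1, 2).
emission = [[0.6, 0.3, 0.1],
            [0.5, 0.4, 0.1],
            [0.1, 0.2, 0.7]]
# Assumed chain topology: label 0 must be followed by 1, and 1 by 2.
transition = [[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.0, 0.0, 0.0]]
initial = [1 / 3, 1 / 3, 1 / 3]

independent = [max(range(3), key=lambda s: row[s]) for row in emission]
constrained = viterbi(emission, transition, initial)
```

In this toy example, labeling each node independently yields `[0, 0, 2]`, which violates the assumed topology, whereas the constrained decoding returns `[0, 1, 2]`.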

### Biomechanical Patient-Specific Model of the Respiratory System Based on 4D CT Scans and Controlled by Personalized Physiological Compliance

In this paper, we present a dynamic patient-specific model of the respiratory system for a whole respiratory cycle, based on 4D CT scans, personalized physiological compliance (pressure-volume curves), as well as an automatic tuning algorithm to determine lung pressure and diaphragm force parameters. The amplitudes of the lung pressure and diaphragm forces are patient-specific: they differ from one patient to another and depend on the geometrical and physiological characteristics of the patient. To determine these parameters at different respiratory states and for each patient, an inverse finite element (FE) analysis has been implemented to match the experimental data obtained directly from 4D CT images to the FE simulation results, by minimizing the lung volume variations. We have evaluated the model accuracy on five selected patients from the DIR-Lab dataset, with small and large breathing amplitudes, by comparing the FE simulation results on 75 landmarks at the end inspiration (EI) and end expiration (EE) states, and at each intermediate respiratory state. We have also evaluated the tumor motion identified in 4D CT scan images and compared it with the trajectory obtained by FE simulation during one complete breathing cycle. The results demonstrate the good quantitative performance of our physics-based model, and we believe that our model, unlike others, takes into account the challenging problem of respiratory variability.

### CT Image Enhancement for Feature Detection and Localization

In recent years, many pre-processing filters have been developed in order to enhance anatomical structures on chest CT images. These filters are typically based on the analysis of the multiscale second-order local information of the image, which helps identify structures with even (tubes) or odd (surfaces) symmetries. Therefore, they often require specific parameter tuning to enhance the different structures. Moreover, while the filters seem to be able to isolate the structure of interest, they do not provide information about the sub-voxel location of the feature. In this work, we present a novel method for vessel, airway, and fissure strength computation on chest CT images using convolutional neural networks. A scale-space particle segmentation is used to isolate training points for vessels, airways, and fissures, which are then used to train an 8-layer neural network with 3 convolutional layers that define high-order local information of the image. The network returns a probability map of each feature and provides information on the feature offset from the voxel sampling center, allowing for sub-voxel location of the different structures. The proposed method has been evaluated on clinical CT images and compared to other methods for feature enhancement available in the literature. Results show that the proposed method outperforms competing algorithms in terms of enhancement and is also unique in providing sub-voxel information.

Pietro Nardelli, James C. Ross, Raúl San José Estépar

### Information Theoretic Measurement of Blood Flow Complexity in Vessels and Aneurysms: Interlacing Complexity Index

Haemodynamics is believed to be a crucial factor in aneurysm formation, evolution and eventual rupture. The 3D blood flow is typically derived by computational fluid dynamics (CFD) from patient-specific models obtained from angiographic images. Typical quantitative haemodynamic indices are local. Some qualitative classifications of global haemodynamic features have been proposed. However, these classifications are subjective, depending on the operator's visual inspection. In this work we introduce an information theoretic measurement of the blood flow complexity, based on Shannon’s Mutual Information, named Interlacing Complexity Index (ICI). ICI is an objective quantification of the flow complexity from aneurysm inlet to aneurysm outlets. It measures how unpredictable the location of the streamlines at the outlets is, given their location at the inlet, relative to the scale of observation. We selected from the @neurIST database a set of 49 cerebral vasculatures with aneurysms in the middle cerebral artery. Surface models of patient-specific vascular geometries were obtained by geodesic active region segmentation and manual correction, and unsteady flow simulations were performed imposing physiological flow boundary conditions. The obtained ICI has been compared to several qualitative classifications performed by an expert, revealing high correlations.

Jose M. Pozo, Arjan J. Geers, Alejandro F. Frangi
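To illustrate the quantity on which the index is built, the sketch below estimates Shannon's mutual information between the inlet and outlet positions of streamlines after both have been discretized into bins. It is an illustrative assumption, not the ICI definition from the paper: the function name `mutual_information`, the binning, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def mutual_information(pairs):
    """Mutual information (in bits) of paired discrete observations.

    pairs -- list of (inlet_bin, outlet_bin) tuples, one per streamline.
    """
    n = len(pairs)
    joint = Counter(pairs)
    inlet = Counter(x for x, _ in pairs)
    outlet = Counter(y for _, y in pairs)
    mi = 0.0
    for (x, y), count in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts.
        mi += (count / n) * math.log2(count * n / (inlet[x] * outlet[y]))
    return mi


# Ordered flow: the inlet bin fully determines the outlet bin.
ordered = [(0, 0), (0, 0), (1, 1), (1, 1)]
# Interlaced flow: the inlet bin says nothing about the outlet bin.
interlaced = [(0, 0), (0, 1), (1, 0), (1, 1)]

mi_ordered = mutual_information(ordered)
mi_interlaced = mutual_information(interlaced)
```

In this toy setting the ordered flow carries 1 bit of mutual information and the fully interlaced flow carries 0 bits, so lower mutual information corresponds to less predictable outlet locations.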

### Globally-Optimal Anatomical Tree Extraction from 3D Medical Images Using Pictorial Structures and Minimal Paths

Extracting centerlines of anatomical trees (e.g., vasculature and airways) from 3D medical images is a crucial preliminary step for various medical applications. We propose an automatic tree extraction method that leverages prior knowledge of tree topology and geometry and ensures globally-optimal solutions. We define a pictorial structure with a corresponding cost function to detect tree bifurcations in anatomical trees with predefined topology. The tree bifurcations are encoded as nodes in the pictorial structure and are associated with an artificial neural network (ANN) based unary term. The geometrical (direction and length) statistics of tree branches are learned from a training set and encoded as geometrical priors for regularizing the pictorial structure edges. Finally, the detected bifurcations, as well as the ANN tubularity scores, are leveraged to trace globally optimal minimal paths along 3D tree centerlines. Our method outperforms competing state-of-the-art methods when evaluated on 3D synthesized vasculature and lung airways in CT, and our results demonstrate the advantages of incorporating tree statistics and global optimization for this task.

Zahra Mirikharaji, Mengliu Zhao, Ghassan Hamarneh

### Semi-supervised Learning for Network-Based Cardiac MR Image Segmentation

Training a fully convolutional network for pixel-wise (or voxel-wise) image segmentation normally requires a large number of training images with corresponding ground truth label maps. However, it is a challenge to obtain such a large training set in the medical imaging domain, where expert annotations are time-consuming and difficult to obtain. In this paper, we propose a semi-supervised learning approach, in which a segmentation network is trained from both labelled and unlabelled data. The network parameters and the segmentations for the unlabelled data are alternately updated. We evaluate the method for short-axis cardiac MR image segmentation, where it demonstrates high performance, outperforming a baseline supervised method. The mean Dice overlap metric is 0.92 for the left ventricular cavity, 0.85 for the myocardium and 0.89 for the right ventricular cavity. It also outperforms a state-of-the-art multi-atlas segmentation method by a large margin and is substantially faster.

Wenjia Bai, Ozan Oktay, Matthew Sinclair, Hideaki Suzuki, Martin Rajchl, Giacomo Tarroni, Ben Glocker, Andrew King, Paul M. Matthews, Daniel Rueckert
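For readers unfamiliar with the Dice overlap metric reported above, a minimal sketch follows. It computes the standard definition, twice the intersection divided by the sum of the two mask sizes, on flattened binary masks. The function name `dice_coefficient`, the toy masks, and the convention for two empty masks are assumptions made for illustration.

```python
def dice_coefficient(pred, truth):
    """Dice overlap between two binary masks given as flat sequences of 0/1."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of elements")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Hypothetical flattened masks: 3 foreground voxels each, 2 of them shared.
pred = [1, 1, 1, 0, 0, 0]
truth = [0, 1, 1, 1, 0, 0]
score = dice_coefficient(pred, truth)  # 2 * 2 / (3 + 3)
```

A value of 1 means the two masks coincide exactly and 0 means they do not overlap at all.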

### A Localized Statistical Motion Model as a Reproducing Kernel for Non-rigid Image Registration

Thoracic image registration forms the basis for many applications, such as respiratory motion estimation and physiological investigations of the lung. Although clear motion patterns are shared among different subjects, such as the diaphragm moving in the superior and inferior directions, such basic prior knowledge is not considered in current image registration methods. In this paper, we propose a novel approach for integrating a statistical motion model (SMM) into a parametric non-rigid registration framework. We formulate the SMM as a reproducing kernel and integrate it into a kernel machine for image registration. Since empirical samples are rare and statistical models built from small sample sizes are usually over-restrictive, we localize the SMM by damping spatial long-range correlations and reduce the model bias by adding generic transformations to the SMM. We demonstrate our method's applicability on the DIR-Lab 4DCT lung images, where we build leave-one-out models for estimating the respiratory motion.

Christoph Jud, Alina Giger, Robin Sandkühler, Philippe C. Cattin

### Efficient Deformable Motion Correction for 3-D Abdominal MRI Using Manifold Regression

We present a novel framework for efficient retrospective respiratory motion correction of 3-D abdominal MRI using manifold regression. K-space data are continuously acquired under free breathing using the stack-of-stars radial golden-angle trajectory. The stacks-of-profiles (SoPs) from all temporal positions are embedded into a common manifold, in which SoPs that were acquired at similar respiratory states are close together. Next, the SoPs in the manifold are clustered into groups using the k-means algorithm. One 3-D volume (a key-volume) is reconstructed at the central SoP position of each cluster. Motion fields are estimated using deformable image registration between each of these key-volumes and a reference end-exhale volume. Subsequently, the motion field at any other SoP position in the manifold is derived using manifold regression. The regressed motion fields for each of the SoPs are used to determine a final motion-corrected MRI volume. The method was evaluated on realistic synthetic datasets, which were generated from real MRI data, and also tested on an in vivo dataset. The framework enables more accurate motion correction compared to the conventional binning-based approach, with high computational efficiency.

Xin Chen, Daniel R. Balfour, Paul K. Marsden, Andrew J. Reader, Claudia Prieto, Andrew P. King
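The clustering step described above can be sketched as follows, under strong simplifying assumptions: the manifold coordinates are taken to be one-dimensional, the initial cluster centers are given by hand, and the names `kmeans_1d` and `key_positions` as well as the toy coordinates are hypothetical. The sketch only shows how a central position per cluster could be chosen; it does not cover the embedding, reconstruction, or regression steps.

```python
def kmeans_1d(values, centers, iterations=20):
    """Plain k-means on scalar values; returns final centers and clusters.

    clusters[k] holds the indices of the values assigned to center k.
    """
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: each value goes to its nearest center.
        clusters = [[] for _ in centers]
        for i, v in enumerate(values):
            nearest = min(range(len(centers)), key=lambda k: abs(v - centers[k]))
            clusters[nearest].append(i)
        # Update step: each center moves to the mean of its cluster.
        centers = [sum(values[i] for i in c) / len(c) if c else centers[k]
                   for k, c in enumerate(clusters)]
    return centers, clusters


def key_positions(values, centers, clusters):
    """Index of the sample closest to each cluster center."""
    return [min(c, key=lambda i: abs(values[i] - centers[k]))
            for k, c in enumerate(clusters) if c]


# Hypothetical 1-D manifold coordinates of six SoPs (two respiratory states).
coords = [0.10, 0.20, 0.15, 0.90, 1.00, 0.95]
centers, clusters = kmeans_1d(coords, centers=[0.0, 1.0])
keys = key_positions(coords, centers, clusters)
```

Here the samples at indices 2 and 5 lie closest to the two cluster centers, so they would serve as the central positions in this toy example.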

### Flow Network Based Cardiac Motion Tracking Leveraging Learned Feature Matching

We present a novel cardiac motion tracking method where motion is modeled as flow through a network. The motion is subject to physiologically consistent constraints and solved using linear programming. An additional important contribution of our work is the use of a Siamese neural network to generate edge weights that guide the flow through the network. The Siamese network learns to detect and quantify similarity and dissimilarity between pairs of image patches corresponding to the graph nodes. Despite cardiac motion tracking being an inherently spatiotemporal problem, few methods reliably address it as such. Furthermore, many tracking algorithms depend on tedious feature engineering and metric refining. Our approach provides solutions to both of these problems. We benchmark our method against a few other approaches using a synthetic 4D echocardiography dataset and compare the performance of neural network based feature matching with other features. We also present preliminary results on data from 5 canine cases.

Nripesh Parajuli, Allen Lu, John C. Stendahl, Maria Zontak, Nabil Boutagy, Imran Alkhalil, Melissa Eberle, Ben A. Lin, Matthew O’Donnell, Albert J. Sinusas, James S. Duncan

### Automatic 3D Cardiovascular MR Segmentation with Densely-Connected Volumetric ConvNets

Automatic and accurate whole-heart and great vessel segmentation from 3D cardiac magnetic resonance (MR) images plays an important role in the computer-assisted diagnosis and treatment of cardiovascular disease. However, this task is very challenging due to ambiguous cardiac borders and large anatomical variations among different subjects. In this paper, we propose a novel densely-connected volumetric convolutional neural network, referred to as DenseVoxNet, to automatically segment the cardiac and vascular structures from 3D cardiac MR images. The DenseVoxNet adopts the 3D fully convolutional architecture for effective volume-to-volume prediction. From the learning perspective, our DenseVoxNet has three compelling advantages. First, it preserves the maximum information flow between layers by a densely-connected mechanism and hence eases the network training. Second, it avoids learning redundant feature maps by encouraging feature reuse and hence requires fewer parameters to achieve high performance, which is essential for medical applications with limited training data. Third, we add auxiliary side paths to strengthen the gradient propagation and stabilize the learning process. We demonstrate the effectiveness of DenseVoxNet by comparing it with the state-of-the-art approaches from the HVSMR 2016 challenge, held in conjunction with MICCAI, and our network achieves the best Dice coefficient. We also show that our network can achieve better performance than other 3D ConvNets but with fewer parameters.

Lequan Yu, Jie-Zhi Cheng, Qi Dou, Xin Yang, Hao Chen, Jing Qin, Pheng-Ann Heng

### Predicting Slice-to-Volume Transformation in Presence of Arbitrary Subject Motion

This paper aims to solve a fundamental problem in intensity-based 2D/3D registration, which concerns the limited capture range and need for very good initialization of state-of-the-art image registration methods. We propose a regression approach that learns to predict rotations and translations of arbitrary 2D image slices from 3D volumes, with respect to a learned canonical atlas coordinate system. To this end, we utilize Convolutional Neural Networks (CNNs) to learn the highly complex regression function that maps 2D image slices into their correct position and orientation in 3D space. Our approach is attractive in challenging imaging scenarios, where significant subject motion complicates the reconstruction of 3D volumes from 2D slice data. We extensively evaluate the effectiveness of our approach quantitatively on simulated MRI brain data with extreme random motion. We further demonstrate qualitative results on fetal MRI where our method is integrated into a full reconstruction and motion compensation pipeline. With our CNN regression approach we obtain an average prediction error of 7 mm on simulated data, and convincing reconstruction quality of images of very young fetuses where previous methods fail. We further discuss applications to Computed Tomography (CT) and X-Ray projections. Our approach is a general solution to the 2D/3D initialization problem. It is computationally efficient, with prediction times per slice of a few milliseconds, making it suitable for real-time scenarios.

Benjamin Hou, Amir Alansary, Steven McDonagh, Alice Davidson, Mary Rutherford, Jo V. Hajnal, Daniel Rueckert, Ben Glocker, Bernhard Kainz

### Detection and Characterization of the Fetal Heartbeat in Free-hand Ultrasound Sweeps with Weakly-supervised Two-streams Convolutional Networks

Assessment of fetal cardiac activity is essential to confirm pregnancy viability in obstetric ultrasound. However, automated detection and localization of a beating fetal heart in free-hand ultrasound sweeps is a very challenging task, due to high variation in heart appearance, scale and position (because of heart deformation, scanning orientations and artefacts). In this paper, we present a two-stream Convolutional Network (ConvNet), a temporal sequence learning model, that recognizes heart frames and localizes the heart using only weak supervision. Our contribution is three-fold: (i) to the best of our knowledge, this is the first work to use two-stream spatio-temporal ConvNets in the analysis of free-hand fetal ultrasound videos. The model is compact, and can be trained end-to-end with only image level labels, (ii) the model enforces rotation invariance, which does not require additional augmentation in the training data, and (iii) the model is particularly robust for heart detection, which is important in our application where there can be additional distracting textures, such as acoustic shadows. Our results demonstrate that the proposed two-stream ConvNet architecture significantly outperforms single stream spatial ConvNets (90.3% versus 74.9%), in terms of heart identification.

Yuan Gao, J. Alison Noble

### Retrospective Head Motion Estimation in Structural Brain MRI with 3D CNNs

Head motion is one of the most important nuisance variables in neuroimaging, particularly in studies of clinical or special populations, such as children. However, the possibility of estimating motion in structural MRI is limited to a few specialized sites using advanced MRI acquisition techniques. Here we propose a supervised learning method to retrospectively estimate motion from plain MRI. Using sparsely labeled training data, we trained a 3D convolutional neural network to assess whether voxels are corrupted by motion or not. The output of the network is a motion probability map, which we integrate across a region of interest (ROI) to obtain a scalar motion score. Using cross-validation on a dataset of $$n=48$$ healthy children scanned at our center, and the cerebral cortex as ROI, we show that the proposed measure of motion explains away 37% of the variation in cortical thickness. We also show that the motion score is highly correlated with the results from human quality control of the scans. The proposed technique can not only be applied to current studies, but also opens up the possibility of reanalyzing large amounts of legacy data while taking motion into consideration: we applied the classifier trained on data from our center to the ABIDE dataset (autism), and managed to recover group differences that were confounded by motion.

Juan Eugenio Iglesias, Garikoitz Lerma-Usabiaga, Luis C. Garcia-Peraza-Herrera, Sara Martinez, Pedro M. Paz-Alonso

### Learning-Based Spatiotemporal Regularization and Integration of Tracking Methods for Regional 4D Cardiac Deformation Analysis

Dense cardiac motion tracking and deformation analysis from echocardiography is important for detection and localization of myocardial dysfunction. However, tracking methods are often unreliable due to inherent ultrasound imaging properties. In this work, we propose a new data-driven spatiotemporal regularization strategy. We generate 4D Lagrangian displacement patches from different input sources as training data and learn the regularization procedure via a multi-layered perceptron (MLP) network. The learned regularization procedure is applied to initial noisy tracking results. We further propose a framework for integrating tracking methods to produce better overall estimations. We demonstrate the utility of this approach on block-matching, surface tracking, and free-form deformation-based methods. Finally, we quantitatively and qualitatively evaluate our performance on both tracking and strain accuracy using both synthetic and in vivo data.

Allen Lu, Maria Zontak, Nripesh Parajuli, John C. Stendahl, Nabil Boutagy, Melissa Eberle, Imran Alkhalil, Matthew O’Donnell, Albert J. Sinusas, James S. Duncan

### Fully Automated Segmentation-Based Respiratory Motion Correction of Multiplanar Cardiac Magnetic Resonance Images for Large-Scale Datasets

Cardiac magnetic resonance (CMR) can be used for quantitative analysis of heart function. However, CMR imaging typically involves acquiring 2D image planes during separate breath-holds, often resulting in misalignment of the heart between image planes in 3D. Accurate quantitative analysis requires a robust 3D reconstruction of the heart from CMR images, which is adversely affected by such motion artifacts. Therefore, we propose a fully automated method for motion correction of CMR planes using segmentations produced by fully convolutional neural networks (FCNs). FCNs are trained on 100 UK Biobank subjects to produce short-axis and long-axis segmentations, which are subsequently used in an iterative registration algorithm for correcting breath-hold induced motion artifacts. We demonstrate significant improvements in motion-correction over image-based registration, with strong correspondence to results obtained using manual segmentations. We also deploy our automatic method on 9,353 subjects in the UK Biobank database, demonstrating significant improvements in 3D plane alignment.

Matthew Sinclair, Wenjia Bai, Esther Puyol-Antón, Ozan Oktay, Daniel Rueckert, Andrew P. King

### Temporal HeartNet: Towards Human-Level Automatic Analysis of Fetal Cardiac Screening Video

We present an automatic method to describe clinically useful information about scanning, and to guide image interpretation in ultrasound (US) videos of the fetal heart. Our method is able to jointly predict the visibility, viewing plane, location and orientation of the fetal heart at the frame level. The contributions of the paper are three-fold: (i) a convolutional neural network architecture is developed for multi-task prediction, which is computed by sliding a $$3 \times 3$$ window spatially through convolutional maps; (ii) an anchor mechanism and Intersection over Union (IoU) loss are applied to improve localization accuracy; and (iii) a recurrent architecture is designed to recursively compute regional convolutional features temporally over sequential frames, allowing each prediction to be conditioned on the whole video. This results in a spatial-temporal model that precisely describes detailed heart parameters in challenging US videos. We report results on a real-world clinical dataset, where our method achieves performance on par with expert annotations.

Weilin Huang, Christopher P. Bridge, J. Alison Noble, Andrew Zisserman
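As background for the Intersection over Union (IoU) term mentioned above, the sketch below computes the IoU of two axis-aligned boxes. It shows only the overlap measure itself; how the paper turns it into a loss is not stated in the abstract and is not assumed here. The function name `box_iou`, the `(x1, y1, x2, y2)` box format, and the example boxes are hypothetical.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2), x1 < x2, y1 < y2."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - intersection
    return intersection / union if union > 0 else 0.0


# Hypothetical predicted and annotated heart boxes, in pixels.
predicted = (0.0, 0.0, 2.0, 2.0)
annotated = (1.0, 1.0, 3.0, 3.0)
iou = box_iou(predicted, annotated)  # overlap 1, union 4 + 4 - 1 = 7
```

An IoU of 1 indicates identical boxes and 0 indicates no overlap, which is why the measure is a natural target for localization accuracy.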

### Longitudinal Analysis Using Personalised 3D Cardiac Models with Population-Based Priors: Application to Paediatric Cardiomyopathies

Personalised 3D modelling of the heart is of increasing interest in order to better characterise pathologies and predict evolution. The personalisation consists in estimating the parameter values of an electromechanical model in order to reproduce the observed cardiac motion. However, the number of parameters in these models can be high and their estimation may not be unique. This variability can be an obstacle to further analysis of the estimated parameters and to their clinical interpretation. In this paper we present a method to perform consistent estimations of electromechanical parameters with prior probabilities on the estimated values, which we apply to a large database of 84 different heartbeats. We show that the use of priors considerably reduces the variance in the estimated parameters, enabling better conditioning of the parameters for further analysis of the cardiac function. This is demonstrated by the application to longitudinal data of paediatric cardiomyopathies, where the estimated parameters provide additional information on the pathology and its evolution.

Roch Mollero, Hervé Delingette, Manasi Datar, Tobias Heimann, Jakob A. Hauser, Dilveer Panesar, Alexander Jones, Andrew Taylor, Marcus Kelm, Titus Kuehne, Marcello Chinali, Gabriele Rinelli, Nicholas Ayache, Xavier Pennec, Maxime Sermesant

### Temporal Interpolation of Abdominal MRIs Acquired During Free-Breathing

We propose a convolutional neural network (CNN) based solution for temporal image interpolation in navigated 2D multi-slice dynamic MRI acquisitions. Such acquisitions can achieve high contrast time-resolved volumetric images without the need for breath-holding, which makes them essential for quantifying breathing induced motion for MR guided therapies. Reducing the number of navigator slices needed in these acquisitions would allow increasing through-plane resolution and reducing overall acquisition time. The proposed CNN achieves this by interpolating between successive navigator slices. The method is an end-to-end learning based approach and avoids the determination of the motion field between the input images. We evaluate the method on a dataset of abdominal MRI sequences acquired from 14 subjects during free-breathing, which exhibit pseudo-periodic motion and sliding motion interfaces. Compared to an interpolation-by-registration approach, the method achieves higher interpolation accuracy on average, quantified in terms of intensity RMSE and residual motion errors. Further, we analyze the differences between the two methods, showing the CNN’s advantages in peak inhale and exhale positions.

Neerav Karani, Christine Tanner, Sebastian Kozerke, Ender Konukoglu

### Intraoperative Organ Motion Models with an Ensemble of Conditional Generative Adversarial Networks

In this paper, we describe how a patient-specific, ultrasound-probe-induced prostate motion model can be directly generated from a single preoperative MR image. Our motion model allows for sampling from the conditional distribution of dense displacement fields, is encoded by a generative neural network conditioned on a medical image, and accepts random noise as additional input. The generative network is trained by a minimax optimisation with a second discriminative neural network, tasked to distinguish generated samples from training motion data. In this work, we propose that (1) jointly optimising a third conditioning neural network that pre-processes the input image can effectively extract patient-specific features for conditioning; and (2) combining multiple generative models trained separately with heuristically pre-disjointed training data sets can adequately mitigate the problem of mode collapse. Trained with diagnostic T2-weighted MR images from 143 real patients and 73,216 3D dense displacement fields from finite element simulations of intraoperative prostate motion due to transrectal ultrasound probe pressure, the proposed models produced physically-plausible patient-specific motion of prostate glands. The ability to capture biomechanically simulated motion was evaluated using two errors representing generalisability and specificity of the model. The median values, calculated from a 10-fold cross-validation, were 2.8 ± 0.3 mm and 1.7 ± 0.1 mm, respectively. We conclude that the introduced approach demonstrates the feasibility of applying state-of-the-art machine learning algorithms to generate organ motion models from patient images, and shows significant promise for future research.

Yipeng Hu, Eli Gibson, Tom Vercauteren, Hashim U. Ahmed, Mark Emberton, Caroline M. Moore, J. Alison Noble, Dean C. Barratt

### CardiacNET: Segmentation of Left Atrium and Proximal Pulmonary Veins from MRI Using Multi-view CNN

Anatomical and biophysical modeling of the left atrium (LA) and proximal pulmonary veins (PPVs) is important for clinical management of several cardiac diseases. Magnetic resonance imaging (MRI) allows qualitative assessment of LA and PPVs through visualization. However, there is a strong need for an advanced image segmentation method to be applied to cardiac MRI for quantitative analysis of LA and PPVs. In this study, we address this unmet clinical need by exploring a new deep learning-based segmentation strategy for quantification of LA and PPVs with high accuracy and heightened efficiency. Our approach is based on a multi-view convolutional neural network (CNN) with an adaptive fusion strategy and a new loss function that allows fast and more accurate convergence of the backpropagation based optimization. After training our network from scratch using more than 60K 2D MRI images (slices), we evaluated our segmentation strategy on the STACOM 2013 cardiac segmentation challenge benchmark. Qualitative and quantitative evaluations, obtained from the segmentation challenge, indicate that the proposed method achieved state-of-the-art sensitivity (90%), specificity (99%), precision (94%), and efficiency levels (10 s on GPU and 7.5 min on CPU).

Aliasghar Mortazi, Rashed Karim, Kawal Rhode, Jeremy Burt, Ulas Bagci

### Analysis of Periodicity in Video Sequences Through Dynamic Linear Modeling

Periodicity is an important characteristic in many types of video sequences, particularly in medical applications where the cardiac and respiratory cycles are of special significance. Simple spectral analysis or band-pass filtering is often insufficient to extract the periodic signal. Here, we propose modeling the periodic and background components using nested dynamic linear models. These models can approximate the periodic and background time series in a wide range of video sequences. A likelihood ratio test can be used to find regions of the video exhibiting periodicity. Our experiments suggest that this technique is suitable for a variety of applications using different imaging modalities, including ultrasound, MRI and natural video.

A. Jonathan McLeod, Dante P. I. Capaldi, John S. H. Baxter, Grace Parraga, Xiongbiao Luo, Terry M. Peters

### Groupwise Registration of MR Brain Images Containing Tumors via Spatially Constrained Low-Rank Based Image Recovery

We propose a new low-rank based image recovery method and embed it into an existing Groupwise Image Registration (GIR) framework to achieve accurate GIR of Magnetic Resonance (MR) brain images containing tumors. In our method, brain tumor regions in the input images are recovered with population-consistent normal brain appearance to produce low-rank images. The GIR framework is then applied to the tumor-free low-rank images. With no influence from the brain tumor, accurate GIR can be achieved. Unlike conventional low-rank based image recovery methods, a spatial constraint is added to the low-rank framework in our method, by which the quality of the resulting low-rank images can be improved. Particularly, the low-rank images produced by our method contain both effectively recovered brain tumor regions and well-preserved normal brain regions of input images, which are two key factors for accurate GIR. By contrast, in conventional low-rank based image recovery methods, these two factors are mutually exclusive and a good balance is difficult to achieve. Synthetic and real MR brain images are used to evaluate our method. The results show that based on our method, image recovery quality and GIR accuracy are improved in comparison to the state-of-the-art method.

Zhenyu Tang, Yue Cui, Bo Jiang

### Deep Correlational Learning for Survival Prediction from Multi-modality Data

Technological advances have created a great opportunity to provide multi-view data for patients. However, due to the large discrepancy between different heterogeneous views, traditional survival models are unable to efficiently handle multi-modality data or to learn very complex interactions that can affect survival outcomes in various ways. In this paper, we develop a Deep Correlational Survival Model (DeepCorrSurv) for the integration of multi-view data. The proposed network consists of two sub-networks, a view-specific sub-network and a common sub-network. To remove the view discrepancy, the proposed DeepCorrSurv first explicitly maximizes the correlation among the views. Then it transfers feature hierarchies from view commonality and specifically fine-tunes on the survival regression task. Extensive experiments on real lung and brain tumor data sets demonstrated the effectiveness of the proposed DeepCorrSurv model using multi-modality data across different tumor types.

Jiawen Yao, Xinliang Zhu, Feiyun Zhu, Junzhou Huang

### Compressed Sensing on Multi-pinhole Collimator SPECT Camera for Sentinel Lymph Node Biopsy

State-of-the-art imaging devices for sentinel lymph node biopsy are either 1-dimensional gamma probes or, more recently, 2-dimensional gamma cameras that locate the sentinel lymph node. These devices, however, share difficulties when multiple lymph nodes are close-by and do not allow the estimation of the distance to the lymph nodes, as the tracer activation is projected either to a 1- or 2-dimensional image plane. We propose a method that reconstructs the tracer distribution using a single image of the detector resulting from a multi-pinhole collimator. Applying standard image processing tools on the detector’s image leads to a reduced, sparse system. Thus, we propose an efficient and reliable compressed sensing strategy to reconstruct the 3-dimensional tracer distribution using a multi-pinhole collimator and a single detector image. This approach enables better estimation of lymph node positions and improves the differentiation of close-by lymph nodes.

Carlo Seppi, Uri Nahum, Peter A. von Niederhäusern, Simon Pezold, Michael Rissi, Stephan K. Haerle, Philippe C. Cattin
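The abstract above does not name the solver used for its compressed sensing step. As a generic illustration only (not the authors' method), the sketch below recovers a sparse vector from a linear system with the iterative shrinkage-thresholding algorithm (ISTA), which minimizes 0.5·||Ax − b||² + λ·||x||₁. The names `A`, `b`, `lam`, and `n_iter` are assumptions introduced for this example; in a SPECT setting, `A` would stand in for a system matrix mapping tracer activity to detector readings.

```python
import numpy as np


def soft_threshold(v, t):
    """Element-wise soft-thresholding: shrink each entry of v toward zero by t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def ista(A, b, lam, n_iter=500):
    """Generic ISTA for min_x 0.5*||A x - b||^2 + lam*||x||_1.

    A      : (m, n) system matrix (hypothetical stand-in for a detector model)
    b      : (m,) measurement vector
    lam    : non-negative sparsity weight
    n_iter : number of fixed-step iterations
    """
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    # Lipschitz constant of the gradient of the quadratic term:
    # the squared spectral norm of A.
    lipschitz = np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - b)  # gradient of the data-fit term
        x = soft_threshold(x - grad / lipschitz, lam / lipschitz)
    return x
```

With an identity system matrix, the iteration reduces to a single soft-thresholding of `b`, which is a convenient sanity check before trying an under-determined system.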

### Personalized Pancreatic Tumor Growth Prediction via Group Learning

Tumor growth prediction, a highly challenging task, has long been viewed as a mathematical modeling problem, where the tumor growth pattern is personalized based on imaging and clinical data of a target patient. Though mathematical models yield promising results, their prediction accuracy may be limited by the absence of population trend data and personalized clinical characteristics. In this paper, we propose a statistical group learning approach to predict the tumor growth pattern that incorporates both the population trend and personalized data. In order to discover high-level features from multimodal imaging data, a deep convolutional neural network approach is developed to model the voxel-wise spatio-temporal tumor progression. The deep features are combined with the time intervals and the clinical factors to feed a process of feature selection. Our predictive model is pretrained on a group data set and personalized on the target patient data to estimate the future spatio-temporal progression of the patient’s tumor. Multimodal imaging data at multiple time points are used in the learning, personalization and inference stages. Our method achieves a Dice coefficient (DSC) of $$86.8\%\,\pm \,3.6\%$$ and a relative volume difference (RVD) of $$7.9\%\,\pm \,5.4\%$$ on a pancreatic tumor data set, outperforming the DSC of $$84.4\%\,\pm \,4.0\%$$ and RVD of $$13.9\%\,\pm \,9.8\%$$ obtained by a previous state-of-the-art model-based method.

Ling Zhang, Le Lu, Ronald M. Summers, Electron Kebebew, Jianhua Yao
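The two metrics reported in the abstract above can be illustrated on binary masks. This is a minimal sketch under stated assumptions: the function names are hypothetical, RVD is read as relative volume difference, and the unsigned form |V_pred − V_ref| / V_ref is assumed, since the abstract does not spell out the exact definition the authors use.

```python
import numpy as np


def dice_coefficient(pred, ref):
    """Dice overlap 2|P ∩ R| / (|P| + |R|) between two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    denom = int(pred.sum()) + int(ref.sum())
    if denom == 0:
        return 1.0  # both masks empty: treated as perfect agreement (a convention)
    return 2.0 * int(np.logical_and(pred, ref).sum()) / denom


def relative_volume_difference(pred, ref):
    """Unsigned relative volume difference |V_pred - V_ref| / V_ref (assumed form)."""
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    ref_volume = int(ref.sum())
    if ref_volume == 0:
        raise ValueError("reference mask is empty; RVD is undefined")
    return abs(int(pred.sum()) - ref_volume) / ref_volume
```

For example, a prediction of 6 voxels that fully contains a 4-voxel reference gives a Dice coefficient of 2·4/(6+4) = 0.8 and an RVD of 2/4 = 0.5.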

### Boundary-Aware Fully Convolutional Network for Brain Tumor Segmentation

We propose a novel, multi-task, fully convolutional network (FCN) architecture for automatic segmentation of brain tumors. This network extracts multi-level contextual information by concatenating hierarchical feature representations extracted from multimodal MR images along with their symmetric-difference images. It achieves improved segmentation performance by incorporating boundary information directly into the loss function. The proposed method was evaluated on the BRATS13 and BRATS15 datasets and compared with competing methods on the BRATS13 testing set. The segmented tumor boundaries were better than those obtained by a single-task FCN and by an FCN with CRF. The method is among the most accurate available and has relatively low computational cost at test time.

Haocheng Shen, Ruixuan Wang, Jianguo Zhang, Stephen J. McKenna

### Field Effect Induced Organ Distension (FOrge) Features Predicting Biochemical Recurrence from Pre-treatment Prostate MRI

Aggressive cancers are known to induce a field effect that affects large areas of cells at a tissue surface. This means that local deformation induced by the tumor as it grows could cause distensions in regions distant from the tumor, presumably even at the surface of the organ within which the tumor is growing. In this work, we focused on evaluating whether more and less aggressive prostate cancers (i.e. tumors that subsequently resulted in disease recurrence or not) could differentially induce changes and distensions in the surface of the prostate capsule. Specifically, we have developed the concept of a new imaging marker called FOrge features, which attempts to quantify the degree and nature of the deformation induced in the capsule surface on account of tumor growth. We then sought to evaluate whether FOrge is predictive of the risk of biochemical recurrence in prostate cancer patients based on a pre-operative T2w MRI scan. The FOrge features were extracted from a spatially contextual surface of interest (SOI) of the prostate capsule, uniquely determined from statistically significant shape differences between prostate atlases constructed from patients who did (BCR+) and who did not (BCR−) undergo biochemical recurrence. A random forest classifier trained on the FOrge features extracted from atlas images (25 BCR+ and 25 BCR−) yielded an accuracy of 78% and an AUC of 0.72 in an independent validation set of 30 patients.

Soumya Ghose, Rakesh Shiradkar, Mirabela Rusu, Jhimli Mitra, Rajat Thawani, Michael Feldman, Amar Gupta, Andrei Purysko, Lee Ponsky, Anant Madabhushi

### Multi-label Inductive Matrix Completion for Joint MGMT and IDH1 Status Prediction for Glioma Patients

MGMT promoter methylation and IDH1 mutation in high-grade gliomas (HGG) have proven to be two important molecular indicators associated with better prognosis. Traditionally, the statuses of MGMT and IDH1 are obtained via surgical biopsy, which is laborious, invasive and time-consuming. Accurate presurgical prediction of their statuses based on preoperative imaging data is of great clinical value for better treatment planning. In this paper, we propose a novel Multi-label Inductive Matrix Completion (MIMC) model, highlighted by the online inductive learning strategy, to jointly predict both MGMT and IDH1 statuses. Our MIMC model not only uses the training subjects with possibly missing MGMT/IDH1 labels, but also leverages the unlabeled testing subjects as a supplement to the limited training dataset. More importantly, we learn inductive labels, instead of directly using transductive labels, as the prediction results for the testing subjects, to alleviate the overfitting issue in small-sample-size studies. Furthermore, we design an optimization algorithm with guaranteed convergence based on the block coordinate descent method to solve the multivariate non-smooth MIMC model. Finally, by using a precious single-center multi-modality presurgical brain imaging and genetic dataset of primary HGG, we demonstrate that our method can produce accurate prediction results, outperforming previous widely used single- or multi-task machine learning methods. This study shows the promise of utilizing imaging-derived brain connectome phenotypes for prognosis of HGG in a non-invasive manner.

Lei Chen, Han Zhang, Kim-Han Thung, Luyan Liu, Junfeng Lu, Jinsong Wu, Qian Wang, Dinggang Shen

### Radiographic-Deformation and Textural Heterogeneity (r-DepTH): An Integrated Descriptor for Brain Tumor Prognosis

Most aggressive tumors are systemic, implying that their impact is not localized to the tumor itself but extends well beyond the visible tumor borders. Solid tumors (e.g. Glioblastoma) typically exert pressure on the surrounding normal parenchyma due to active proliferation, impacting neighboring structures and worsening survival. Existing approaches have focused on capturing tumor heterogeneity via shape, intensity, and texture radiomic statistics within the visible surgical margins on pre-treatment scans, with the clinical purpose of improving treatment management. However, a poorly understood aspect of heterogeneity is the impact of active proliferation and tumor burden, leading to subtle deformations in the surrounding normal parenchyma distal to the tumor. We introduce radiographic-Deformation and Textural Heterogeneity (r-DepTH), a new descriptor that attempts to capture both intra- and extra-tumoral heterogeneity. r-DepTH combines radiomic measurements of (a) subtle tissue deformation measures throughout the extraneous surrounding normal parenchyma, and (b) the gradient-based textural patterns in tumor and adjacent peri-tumoral regions. We demonstrate that r-DepTH enables improved prediction of disease outcome compared to descriptors extracted from within the visible tumor alone. The efficacy of r-DepTH is demonstrated in the context of distinguishing long-term (LTS) versus short-term (STS) survivors of Glioblastoma, a highly malignant brain tumor. Using a training set (N = 68) of treatment-naive Gadolinium T1w MRI scans, r-DepTH achieved an AUC of 0.83 in distinguishing STS versus LTS. Kaplan-Meier survival analysis on an independent cohort (N = 11) using the r-DepTH descriptor resulted in p = 0.038 (log-rank test), a significant improvement over employing deformation descriptors from normal parenchyma (p = 0.17), or textural descriptors from visible tumor (p = 0.81) alone.

Prateek Prasanna, Jhimli Mitra, Niha Beig, Sasan Partovi, Gagandeep Singh, Marco Pinho, Anant Madabhushi, Pallavi Tiwari

### RADIomic Spatial TexturAl descripTor (RADISTAT): Characterizing Intra-tumoral Heterogeneity for Response and Outcome Prediction

Radiomic analysis in cancer applications enables capturing of disease-specific heterogeneity, through quantification of localized texture feature responses within and around a tumor region. Statistical descriptors of the resulting feature distribution (e.g. skewness, kurtosis) are then input to a predictive model. However, a single statistic may not fully capture the rich spatial diversity of pixel-wise radiomic expression maps. In this work, we present a new RADIomic Spatial TexturAl descripTor (RADISTAT) which attempts to (a) more completely characterize the spatial heterogeneity of a radiomic feature, and (b) capture the overall distribution heterogeneity of a radiomic feature by combining the proportion and arrangement of regions of high and low feature expression. We demonstrate the utility of RADISTAT in the context of (a) discriminating favorable from unfavorable treatment response in a cohort of N = 44 rectal cancer (RCa) patients, and (b) distinguishing short-term from long-term survivors in a cohort of N = 55 glioblastoma multiforme (GBM) patients. For both datasets, RADISTAT resulted in a significantly improved classification performance (AUC = 0.79 in the RCa cohort, AUC = 0.71 in the GBM cohort, based on randomized cross-validation) as compared to using simple statistics (mean, variance, skewness, or kurtosis) to describe radiomic co-occurrence features.

Jacob Antunes, Prateek Prasanna, Anant Madabhushi, Pallavi Tiwari, Satish Viswanath

### Locally Affine Diffeomorphic Surface Registration for Planning of Metopic Craniosynostosis Surgery

The outcome of cranial vault reconstruction for the surgical treatment of craniosynostosis heavily depends on the surgeon’s expertise because of the lack of an objective target shape. We introduce a surface-based diffeomorphic registration framework to create the optimal post-surgical cranial shape during craniosynostosis treatment. Our framework estimates and labels where each bone piece needs to be cut using a reference template. Then, it calculates how much each bone piece needs to be translated and in which direction, using the closest normal shape from a multi-atlas as a reference. With our locally affine approach, the method also allows for bone bending, modeling independently the transformation of each bone piece while ensuring the consistency of the global transformation. We evaluated the optimal plan for 15 patients with metopic craniosynostosis. Our results showed that the automated surgical planning creates cranial shapes with a reduction in cranial malformations of 51.43% and curvature discrepancies of 35.09%, which are the two indices proposed in the literature to quantify cranial deformities objectively. In addition, the cranial shapes created were within healthy ranges.

Antonio R. Porras, Beatriz Paniagua, Andinet Enquobahrie, Scott Ensel, Hina Shah, Robert Keating, Gary F. Rogers, Marius George Linguraru

### A Time Saver: Optimization Approach for the Fully Automatic 3D Planning of Forearm Osteotomies

Three-dimensional (3D) computer-assisted preoperative planning has become the state of the art for surgical treatment of complex forearm bone malunions. Despite the benefits of these approaches, the surgeon time and effort needed to generate a 3D preoperative plan remain too high and limit their clinical application. This motivates the development of computer algorithms able to expedite the process. We propose a staged multi-objective optimization method based on a genetic algorithm with tailored fitness functions, capable of generating a 3D preoperative plan in a fully automatic fashion. A clinical validation was performed on 14 cases of distal radius osteotomy. Solutions generated by our algorithm (OA) were compared to those created by surgeons using dedicated planning software (Gold Standard; GS), demonstrating that in 53% of the tested cases, OA solutions were better than or equal to GS solutions, successfully reducing the surgeon’s interaction time. Additionally, a quantitative evaluation based on 4 different error measurements confirmed the validity of our method.

Fabio Carrillo, Lazaros Vlachopoulos, Andreas Schweizer, Ladislav Nagy, Jess Snedeker, Philipp Fürnstahl

### DARWIN: Deformable Patient Avatar Representation With Deep Image Network

In this paper, we present a technical approach to robustly estimate the detailed patient body surface mesh under clothing cover from a single snapshot of a range sensor. Existing methods either lack detail in the estimated patient body model, fail to estimate the body model robustly under clothing cover, or lack sufficient evaluation on real patient datasets. In this work, we overcome these limitations by training deep convolutional networks on a real clinical dataset with large variation and augmentation. Our approach is validated with experiments conducted on 1063 human subjects from 3 different hospitals, and surface errors are measured against ground truth from CT data.

Vivek Singh, Kai Ma, Birgi Tamersoy, Yao-Jen Chang, Andreas Wimmer, Thomas O’Donnell, Terrence Chen

### Simultaneous Recognition and Pose Estimation of Instruments in Minimally Invasive Surgery

Detection of surgical instruments plays a key role in ensuring patient safety in minimally invasive surgery. In this paper, we present a novel method for 2D vision-based recognition and pose estimation of surgical instruments that generalizes to different surgical applications. At its core, we propose a novel scene model in order to simultaneously recognize multiple instruments as well as their parts. We use a Convolutional Neural Network architecture to embody our model and show that the cross-entropy loss is well suited to optimize its parameters which can be trained in an end-to-end fashion. An additional advantage of our approach is that instrument detection at test time is achieved while avoiding the need for scale-dependent sliding window evaluation. This allows our approach to be relatively parameter free at test time and shows good performance for both instrument detection and tracking. We show that our approach surpasses state-of-the-art results on in-vivo retinal microsurgery image data, as well as ex-vivo laparoscopic sequences.

Thomas Kurmann, Pablo Marquez Neila, Xiaofei Du, Pascal Fua, Danail Stoyanov, Sebastian Wolf, Raphael Sznitman

### A Comparative Study of Breast Surface Reconstruction for Aesthetic Outcome Assessment

Breast cancer is the most prevalent cancer type in women, and while its survival rate is generally high, the aesthetic outcome is an increasingly important factor when evaluating different treatment alternatives. 3D scanning and reconstruction techniques offer a flexible tool for building detailed and accurate 3D breast models that can be used both pre-operatively for surgical planning and post-operatively for aesthetic evaluation. This paper aims at comparing the accuracy of low-cost 3D scanning technologies with the significantly more expensive state-of-the-art 3D commercial scanners in the context of breast 3D reconstruction. We present results from 28 synthetic and clinical RGBD sequences, including 12 unique patients and an anthropomorphic phantom, demonstrating the applicability of low-cost RGBD sensors to real clinical cases. Body deformation and homogeneous skin texture pose challenges to the studied reconstruction systems. Although these should be addressed appropriately if higher model quality is warranted, we observe that low-cost sensors are able to obtain valuable reconstructions comparable to the state of the art within an error margin of 3 mm.

René M. Lacher, Francisco Vasconcelos, David C. Bishop, Norman R. Williams, Mohammed Keshtgar, David J. Hawkes, John H. Hipwell, Danail Stoyanov

### DejaVu: Intra-operative Simulation for Surgical Gesture Rehearsal

Advances in surgical simulation and surgical augmented reality have changed the way surgeons prepare for practice and conduct medical procedures. Despite considerable interest from surgeons, the use of simulation is still predominantly confined to pre-operative training of surgical tasks, and the lack of robustness of surgical augmented reality means that it is seldom used for surgical guidance. In this paper, we present DejaVu, a novel surgical simulation approach for intra-operative surgical gesture rehearsal. With DejaVu we aim at bridging the gap between pre-operative surgical simulation and crucial but not yet robust intra-operative surgical augmented reality. By exploiting intra-operative images, we produce a simulation that faithfully matches the actual procedure without visual discrepancies. With an underlying physical model that performs real-time deformation of organs and surrounding tissues, surgeons can interact with the targeted organs through grasping, pulling or cutting to immediately rehearse their next gesture. We present results on different in vivo surgical procedures and demonstrate the feasibility of practical use of our system.

Nazim Haouchine, Danail Stoyanov, Frederick Roy, Stephane Cotin

### An Optimal Control Approach for High Intensity Focused Ultrasound Self-Scanning Treatment Planning

In noninvasive abdominal tumor treatment, research has focused on canceling organ motion either by gating, breath holding or tracking of the target. This paper is based on the novel self-scanning method, which combines the advantages of the gated and the tracking methods. This approach leverages the respiratory organ motion by holding the focal spot of the high intensity focused ultrasound (HIFU) device static for a given time, while it passively scans the tumor due to respiratory motion. This enables the use of a lower-cost HIFU device. We present a planning method for such a system that is based on optimal control theory and optimizes the scanning path and the sonication intensities simultaneously. The method minimizes treatment time and ensures complete tumor ablation according to the thermal dose under free breathing. To verify our method, we simulated a tumor in two dimensions. The achieved treatment time is on par with that of the gold-standard tracking method. Moreover, we measured the temperature profile of the HIFU device in a tissue-mimicking phantom to verify our temperature model.

Nadia Möri, Laura Gui, Christoph Jud, Orane Lorton, Rares Salomir, Philippe C. Cattin

### A Mixed-Reality Approach to Radiation-Free Training of C-arm Based Surgery

The discrepancy between continuously decreasing clinical training opportunities and the increasing complexity of interventions in surgery has led to the development of different training options such as anatomical models, computer-based simulators or cadaver training. However, trainees who follow this training and ultimately perform patient treatment still face a steep learning curve. To address this problem for C-arm based surgery, we introduce a realistic radiation-free simulation system that combines patient-based 3D printed anatomy and simulated X-ray imaging using a physical C-arm. This mixed reality simulation system facilitates a transition to C-arm based surgery and has the potential to complement or even replace large parts of cadaver training and to reduce the risk of errors when proceeding to patient treatment. In a technical evaluation, we show that our system simulates X-ray images accurately, with an RMSE of 1.85 mm compared to real X-ray imaging. To explore the fidelity and usefulness of the proposed mixed reality system for training and assessment, we conducted a user study. Six surgical experts performed a facet joint injection on the simulator and rated aspects of the system on a 5-point Likert scale. They expressed agreement with the overall realism of the simulation and strong agreement with the usefulness of such a mixed reality system for training of novices and experts.

Philipp Stefan, Séverine Habert, Alexander Winkler, Marc Lazarovici, Julian Fürmetz, Ulrich Eck, Nassir Navab

### Image-Driven Stochastic Identification of Boundary Conditions for Predictive Simulation

In computer-aided interventions, biomechanical models reconstructed from pre-operative data are used via augmented reality to facilitate intra-operative navigation. The predictive power of such models depends heavily on knowledge of the boundary conditions. However, in the context of patient-specific modeling, neither the pre-operative nor the intra-operative modalities provide reliable information about the location and mechanical properties of the organ attachments. We present a novel image-driven method for fast identification of boundary conditions, which are modelled as stochastic parameters. The method employs the reduced-order unscented Kalman filter to transform the probability distributions of the parameters in real time, given observations extracted from intra-operative images. The method is evaluated using synthetic, phantom and real data acquired in vivo on a porcine liver. A quantitative assessment is presented, and it is shown that the method significantly increases the predictive power of the biomechanical model.

Igor Peterlik, Nazim Haouchine, Lukáš Ručka, Stéphane Cotin

### Toward Computer-Assisted Planning for Interstitial Laser Ablation of Malignant Brain Tumors Using a Tubular Continuum Robot

This paper presents a computer-assisted planning workflow for robotic laser-induced thermal therapy (LITT) in the brain. A tubular continuum robot is used to position a laser probe for thermoablation, as conventional multiple straight trajectories are insufficient to treat polycyclic intracranial lesions with stereotactically placed probes. A multiobjective variable-length particle swarm optimization algorithm is utilized to determine an optimal number and size of ablation objects placed within the tumor volume while simultaneously optimizing the configuration and design parameters of the tubular continuum robot and the insertion path. The algorithm searches for Pareto-optimal solutions by considering multiple objectives. To verify the proposed optimization workflow, 15 patient trials and the expertise of two neurosurgeons are considered.

Josephine Granna, Arya Nabavi, Jessica Burgner-Kahrs

### Providing Effective Real-Time Feedback in Simulation-Based Surgical Training

Virtual reality simulation is becoming popular as a training platform in surgical education. However, one important aspect of simulation-based surgical training that has not received much attention is the provision of automated real-time performance feedback to support the learning process. Performance feedback is actionable advice that improves novice behaviour. In simulation, automated feedback is typically extracted from prediction models trained using data mining techniques. Existing techniques suffer from either low effectiveness or low efficiency, which prevents their use in real time. In this paper, we propose a random forest based method that finds a balance between effectiveness and efficiency. Experimental results in a temporal bone surgery simulation show that the proposed method is able to extract highly effective feedback at a high level of efficiency.

Xingjun Ma, Sudanthi Wijewickrema, Yun Zhou, Shuo Zhou, Stephen O’Leary, James Bailey

### Fully Automatic and Real-Time Catheter Segmentation in X-Ray Fluoroscopy

Augmenting X-ray imaging with a 3D roadmap to improve guidance is a common strategy. Such approaches benefit from automated analysis of the X-ray images, such as the automatic detection and tracking of instruments. In this paper, we propose a real-time method to segment the catheter and guidewire in 2D X-ray fluoroscopic sequences. The method is based on deep convolutional neural networks. The network takes as input the current image and the three previous ones, and segments the catheter and guidewire in the current image. Subsequently, a centerline model of the catheter is constructed from the segmented image. A small set of annotated data combined with data augmentation is used to train the network. We trained the method on images from 182 X-ray sequences from 23 different interventions. On a testing set with images of 55 X-ray sequences from 5 other interventions, a median centerline distance error of 0.2 mm and a median tip distance error of 0.9 mm were obtained. The segmentation of the instruments in 2D X-ray sequences is performed in real time and fully automatically.

Pierre Ambrosini, Daniel Ruijters, Wiro J. Niessen, Adriaan Moelker, Theo van Walsum

### Deep Neural Networks Predict Remaining Surgery Duration from Cholecystectomy Videos

For every hospital, it is desirable to fully utilize its operating room (OR) capacity. Inaccurate planning of OR occupancy impacts patient comfort, safety and financial turnover of the hospital. A source of suboptimal scheduling often lies in the incorrect estimation of the surgery duration, which may vary significantly due to the diversity of patient conditions, surgeon skills and intraoperative situations. We propose automatic methods to estimate the remaining surgery duration in real-time by using only the image feed from the endoscopic camera and no other sensor. These approaches are based on neural networks designed to learn the workflow of an endoscopic procedure. We train and evaluate our models on a large dataset of 120 endoscopic cholecystectomies. Results show the strong benefits of these approaches when surgeries last longer than usual and promise practical improvements in OR management.

Ivan Aksamentov, Andru Putra Twinanda, Didier Mutter, Jacques Marescaux, Nicolas Padoy

### UV Exposed Optical Fibers with Frequency Domain Reflectometry for Device Tracking in Intra-arterial Procedures

Shape tracking of medical devices using strain sensing properties in optical fibers has seen increased attention in recent years. In this paper, we propose a novel guidance system for intra-arterial procedures using a distributed strain sensing device based on optical frequency domain reflectometry (OFDR) to track the shape of a catheter. Tracking enhancement is provided by exposing a fiber triplet to a focused ultraviolet beam, producing high scattering properties. In contrast to typical quasi-distributed strain sensors, we propose a truly distributed strain sensing approach, which allows a fiber triplet to be reconstructed in real time. A 3D roadmap of the hepatic anatomy integrated with a 4D MR imaging sequence allows navigation of the catheter within the pre-interventional anatomy and mapping of the blood flow velocities in the arterial tree. We employed Riemannian anisotropic heat kernels to map the sensed data to the pre-interventional model. Experiments in synthetic phantoms and an in vivo model are presented. Results show that the tracking accuracy is suitable for interventional tracking applications, with a mean 3D shape reconstruction error of $$1.6\,\pm \,0.3$$ mm. This study demonstrates the promising potential of MR-compatible UV-exposed OFDR optical fibers for non-ionizing device guidance in intra-arterial procedures.

Francois Parent, Maxime Gerard, Raman Kashyap, Samuel Kadoury

### Real-Time 3D Ultrasound Reconstruction and Visualization in the Context of Laparoscopy

In the context of laparoscopic interventions involving intracorporeal ultrasound, we present a method to visualize hidden targets in 3D. As the surgeon scans the organ surface, we stitch tracked 2D ultrasound images into a 3D volume in real-time. This volume, registered in space with the surface view provided by the laparoscope, is visualized through a transparent window in the surface image. The efficacy of the proposed method is demonstrated by conducting a psychophysical study with phantoms, involving experienced ultrasound users and laparoscopic surgeons. The results reveal that the proposed method demands significantly less cognitive and physical effort compared to the 2D ultrasound visualization method conventionally used in the operating room.

Uditha L. Jayarathne, John Moore, Elvis C. S. Chen, Stephen E. Pautler, Terry M. Peters

### Improving Needle Detection in 3D Ultrasound Using Orthogonal-Plane Convolutional Networks

Successful automated detection of short needles during an intervention is necessary to allow the physician to identify and correct any misalignment of the needle and the target at early stages, which reduces needle passes and improves health outcomes. In this paper, we present a novel approach to detect needle voxels in a 3D ultrasound volume with high precision using convolutional neural networks. Each voxel is classified from locally extracted raw data of three orthogonal planes centered on it. We propose a bootstrap re-sampling approach to enhance the training on our highly imbalanced data. The proposed method successfully detects 17G and 22G needles with a single trained network, showing a robust generalized approach. Extensive ex vivo evaluations on 3D ultrasound datasets of chicken breast show a 25% increase in F1-score over the state-of-the-art feature-based method. Furthermore, very short needles inserted for only 5 mm in the volume are detected with tip localization errors of $$<0.5$$ mm, indicating that the tip is always visible in the detected plane.

Arash Pourtaherian, Farhad Ghazvinian Zanjani, Svitlana Zinger, Nenad Mihajlovic, Gary Ng, Hendrikus Korsten, Peter de With
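The per-voxel input described in the abstract above (raw data from three orthogonal planes centered on a voxel) can be sketched with plain NumPy slicing. This is an illustrative assumption about the data layout, not the authors' implementation: the function name, the `(z, y, x)` axis order, the square patch size, and the zero padding at the volume border are all hypothetical choices.

```python
import numpy as np


def orthogonal_patches(volume, center, half_size):
    """Return three orthogonal 2D patches centered on one voxel.

    volume    : 3D array indexed as (z, y, x)  (assumed axis order)
    center    : (z, y, x) index of the voxel to classify
    half_size : patches have side length 2 * half_size + 1
    Output shape: (3, side, side), ordered as the yx, zx, and zy planes.
    """
    h = half_size
    # Zero-pad so that patches near the border keep a fixed size.
    padded = np.pad(np.asarray(volume), h, mode="constant")
    z, y, x = (c + h for c in center)  # shift indices into the padded volume
    plane_yx = padded[z, y - h:y + h + 1, x - h:x + h + 1]
    plane_zx = padded[z - h:z + h + 1, y, x - h:x + h + 1]
    plane_zy = padded[z - h:z + h + 1, y - h:y + h + 1, x]
    return np.stack([plane_yx, plane_zx, plane_zy])
```

The stacked result can be treated as a 3-channel image for a 2D convolutional classifier; the central pixel of every channel is the voxel being classified.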

### Motion-Compensated Autonomous Scanning for Tumour Localisation Using Intraoperative Ultrasound

Intraoperative ultrasound facilitates localisation of tumour boundaries during minimally invasive procedures. Autonomous ultrasound scanning systems have been recently proposed to improve scanning accuracy and reduce surgeons’ cognitive load. However, current methods mainly consider static scanning environments typically with the probe pressing against the tissue surface. In this work, a motion-compensated autonomous ultrasound scanning system using the da Vinci® Research Kit (dVRK) is proposed. An optimal scanning trajectory is generated considering both the tissue surface shape and the ultrasound transducer dimensions. An effective vision-based approach is proposed to learn the underlying tissue motion characteristics. The learned motion model is then incorporated into the visual servoing framework. The proposed system has been validated with both phantom and ex vivo experiments.

Lin Zhang, Menglong Ye, Stamatia Giannarou, Philip Pratt, Guang-Zhong Yang

### Deep Learning for Sensorless 3D Freehand Ultrasound Imaging

3D freehand ultrasound imaging is a very promising imaging modality, but its acquisition is often neither portable nor practical because of the required external tracking hardware. Building a sensorless solution that is fully based on image analysis would thus have many potential applications. However, previously proposed approaches rely on physical models whose assumptions only hold on synthetic or phantom datasets, failing to translate to actual clinical acquisitions with sufficient accuracy. In this paper, we investigate the alternative approach of using statistical learning to circumvent this problem. To that end, we leverage the unique modeling capabilities of convolutional neural networks to build an end-to-end system that directly predicts the ultrasound probe motion from the images themselves. Based on thorough experiments using both phantom acquisitions and a set of 100 long in vivo ultrasound sweeps for vein mapping, we show that our novel approach significantly outperforms the standard method and has direct clinical applicability, with an average drift error of merely 7% over the whole length of each ultrasound clip.

Raphael Prevost, Mehrdad Salehi, Julian Sprung, Alexander Ladikos, Robert Bauer, Wolfgang Wein

### Ultrasonic Needle Tracking with a Fibre-Optic Ultrasound Transmitter for Guidance of Minimally Invasive Fetal Surgery

Ultrasound imaging is widely used for guiding minimally invasive procedures, including fetal surgery. Visualisation of medical devices such as medical needles is critically important and it remains challenging in many clinical contexts. During in-plane insertions, a needle can have poor visibility at steep insertion angles and at large insertion depths. During out-of-plane insertions, the needle tip can have a similar ultrasonic appearance to the needle shaft when it intersects with the ultrasound imaging plane. When the needle tip is not accurately identified, it can damage critical structures, with potentially severe consequences, including loss of pregnancy. In this paper, we present a tracking system to directly visualise the needle tip with an ultrasonic beacon. The waves transmitted by the beacon were received by an external ultrasound imaging probe. Pairs of co-registered images were acquired in rapid succession with this probe: a photoacoustic image obtained with the system in receive-only mode, and a conventional B-mode ultrasound image. The beacon comprised a custom elastomeric nanocomposite coating at the distal end of an optical fibre, which was positioned within the lumen of a commercial 22 gauge needle. Delivery of pulsed light to the coating resulted in the photoacoustic generation of ultrasonic waves. The measured tracking accuracies in water in the axial and lateral dimensions were $$0.39 \pm 0.19$$ mm and $$1.85 \pm 0.29$$ mm, respectively. To obtain a preliminary indication of the clinical potential of this ultrasonic needle tracking system, needle insertions were performed in an in vivo fetal sheep model. The results demonstrate that ultrasonic needle tracking with a fibre-optic transmitter is feasible in a clinically realistic fetal surgery environment, and that it could be useful to guide minimally invasive procedures by providing accurate visualisation of the medical device tip.

Wenfeng Xia, Sacha Noimark, Sebastien Ourselin, Simeon J. West, Malcolm C. Finlay, Anna L. David, Adrien E. Desjardins

### Learning CNNs with Pairwise Domain Adaption for Real-Time 6DoF Ultrasound Transducer Detection and Tracking from X-Ray Images

In this paper, we present a fully automatic and real-time CNN-based system that achieves highly accurate and robust 6DoF pose estimation and tracking of a Transesophageal Echocardiography (TEE) transducer from 2D X-ray images, a key enabler for integrating ultrasound and fluoroscopic image guidance in hybrid operating rooms for catheter-based procedures. Lightweight hierarchical CNNs are first pre-trained purely on a large number of synthetically-generated X-ray images with known ground truth poses. The pre-trained CNNs are then refined for generalization using only a small number of real X-ray images with annotated poses via our proposed pairwise domain adaptation scheme. To resolve the pose ambiguity caused by the self-symmetry of the TEE transducer and the translucent nature of X-ray imaging, a CNN classifier is trained to distinguish a correct pose from its flipped counterpart by seeing a large number of synthetically-generated pairs. The proposed system is validated on 1,663 fluoroscopic images from clinical studies, and achieves an error rate of 6.53% under a clinically relevant criterion (i.e., Projected Target Registration Error larger than 2.5 mm) and a frame rate of 83.3 frames per second in tracking mode, outperforming the state-of-the-art methods in terms of both accuracy and speed.

Jiannan Zheng, Shun Miao, Rui Liao

### An Approach for Needle Based Optical Coherence Elastography Measurements

While navigation and interventional guidance are typically based on image data, the images do not necessarily reflect mechanical tissue properties. Optical coherence elastography (OCE) presents a modality with high sensitivity and very high spatial and temporal resolution. However, OCE has a limited field of view of only 2–5 mm depth. We present a side-facing needle probe to image externally induced shear waves from within soft tissue. A first method of quantitative needle-based OCE is provided. Using a time of flight setup, we establish the shear wave velocity and estimate the tissue elasticity. For comparison, an external scan head is used for imaging. Results for four different phantoms indicate a good agreement between the shear wave velocities estimated from the needle probe at different depths and the scan head. The velocities ranging from 0.9–3.4 m/s agree with the expected values, illustrating that tissue elasticity estimates from within needle probes are feasible.

Sarah Latus, Christoph Otte, Matthias Schlüter, Josua Rehra, Kevin Bizon, Hinnerk Schulz-Hildebrandt, Thore Saathoff, Gereon Hüttmann, Alexander Schlaefer
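
As a rough illustration of the time-of-flight idea in the abstract above, the following Python sketch converts a shear wave travel time into a velocity and then into an elasticity estimate. It assumes an incompressible, isotropic, linear elastic medium (so that Young's modulus is approximately 3ρc²) and a density of 1000 kg/m³. The function names, the density value, and the example distances and times are illustrative assumptions, not details taken from the paper.

```python
def shear_wave_velocity(distance_m, arrival_a_s, arrival_b_s):
    """Velocity from the travel time of a shear wave between two points."""
    travel_time = arrival_b_s - arrival_a_s
    if travel_time <= 0:
        raise ValueError("the second arrival must be later than the first")
    return distance_m / travel_time


def youngs_modulus_pa(velocity_m_s, density_kg_m3=1000.0):
    """E ~ 3 * rho * c^2 (assumes incompressible, isotropic, linear elastic tissue)."""
    return 3.0 * density_kg_m3 * velocity_m_s ** 2


# Hypothetical measurement: the wave front needs 1 ms to travel 2 mm.
c = shear_wave_velocity(0.002, 0.0, 0.001)
print(f"velocity: {c:.2f} m/s, modulus: {youngs_modulus_pa(c) / 1000:.1f} kPa")

# Velocity range reported in the abstract, mapped to the assumed model.
for velocity in (0.9, 3.4):
    print(f"{velocity} m/s -> {youngs_modulus_pa(velocity) / 1000:.2f} kPa")
```

Under these assumptions the reported range of 0.9 to 3.4 m/s corresponds to moduli of roughly 2.4 to 34.7 kPa; a different density or tissue model would change these numbers.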

### Concurrent Segmentation and Localization for Tracking of Surgical Instruments

Real-time instrument tracking is a crucial requirement for various computer-assisted interventions. To overcome problems such as specular reflection and motion blur, we propose a novel method that takes advantage of the interdependency between localization and segmentation of the surgical tool. In particular, we reformulate the 2D pose estimation as a heatmap regression and thereby enable a robust, concurrent regression of both tasks via deep learning. Through experiments, we demonstrate that this modeling leads to significantly better performance than directly regressing the tool position and that our method outperforms the state-of-the-art on a Retinal Microsurgery benchmark and the MICCAI EndoVis Challenge 2015.

Iro Laina, Nicola Rieke, Christian Rupprecht, Josué Page Vizcaíno, Abouzar Eslami, Federico Tombari, Nassir Navab
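
To make the phrase "heatmap regression" in the abstract above concrete, the sketch below shows one common way to set up such a target: a landmark position is encoded as a 2D Gaussian map, and a position is recovered from a map by taking its maximum. This is a generic illustration in plain Python; the function names, the Gaussian width `sigma`, and the image size are assumptions, and the network that would regress the map is omitted entirely.

```python
import math


def gaussian_heatmap(height, width, center_xy, sigma=2.0):
    """Regression target: a Gaussian bump centred on the landmark (x, y)."""
    cx, cy = center_xy
    return [
        [math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * sigma ** 2))
         for x in range(width)]
        for y in range(height)
    ]


def decode_heatmap(heatmap):
    """Recover the landmark as the (x, y) location of the largest value."""
    best_value, best_xy = -1.0, (0, 0)
    for y, row in enumerate(heatmap):
        for x, value in enumerate(row):
            if value > best_value:
                best_value, best_xy = value, (x, y)
    return best_xy


# Hypothetical tool-tip position in a 24 x 16 pixel image.
target = gaussian_heatmap(height=16, width=24, center_xy=(12, 5))
print(decode_heatmap(target))
```

Compared with regressing two coordinates directly, a map-shaped target has the same spatial layout as a segmentation mask, which is one reason the two tasks can share a network.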

### Surgical Soundtracks: Towards Automatic Musical Augmentation of Surgical Procedures

Advances in sensing and digitalization enable us to acquire and present various heterogeneous datasets to enhance clinical decisions. Visual feedback is the dominant way of conveying such information. However, environments rich with many sources of information, all presented through the same channel, pose the risk of overstimulation and of missing crucial information. The augmentation of the cognitive field by additional perceptual modalities such as sound is a workaround to this problem. A major challenge in auditory augmentation is the automatic generation of pleasant and ergonomic audio in complex routines, as opposed to overly simplistic feedback, to avoid fatigue. In this work, without loss of generality to other procedures, we propose a method for aural augmentation of ophthalmic procedures via automatic modification of musical pieces. Evaluations of this first proof of concept regarding recognizability of the conveyed information along with qualitative aesthetics show the potential of our method.

Sasan Matinfar, M. Ali Nasseri, Ulrich Eck, Hessam Roodaki, Navid Navab, Chris P. Lohmann, Mathias Maier, Nassir Navab

### Precise Ultrasound Bone Registration with Learning-Based Segmentation and Speed of Sound Calibration

Ultrasound imaging is increasingly used in navigated surgery and registration-based applications. However, spatial information quality in ultrasound is relatively inferior to other modalities. Main limiting factors for an accurate registration between ultrasound and other modalities are tissue deformation and speed of sound variation throughout the body. The bone surface in ultrasound is a landmark which is less affected by such geometric distortions. In this paper, we present a workflow to accurately register intra-operative ultrasound images to a reference pre-operative CT volume based on an automatic and real-time image processing pipeline. We show that a convolutional neural network is able to produce robust, accurate and fast bone segmentation of such ultrasound images. We also develop a dedicated method to perform online speed of sound calibration by focusing on the bone area and optimizing the appearance of steered compounded images. We provide extensive validation on both phantom and real cadaver data obtaining overall errors under one millimeter.

Mehrdad Salehi, Raphael Prevost, José-Luis Moctezuma, Nassir Navab, Wolfgang Wein

### Convolutional Neural Network and In-Painting Techniques for the Automatic Assessment of Scoliotic Spine Surgery from Biplanar Radiographs

Assessing the effectiveness of scoliosis surgery requires the quantification of 3D spinal deformities from pre- and post-operative radiographs. This can be achieved from 3D reconstructed models of the spine, but a fast, automatic method to recover this model from pre- and post-operative radiographs remains a challenge. For example, the vertebrae’s visibility varies considerably and large metallic objects occlude important landmarks in postoperative radiographs. This paper presents a method for automatic 3D spine reconstruction from pre- and post-operative calibrated biplanar radiographs. We fitted a statistical shape model of the spine to images by using a 3D/2D registration based on convolutional neural networks. The metallic structures in postoperative radiographs were detected and removed using an image in-painting method to improve the performance of vertebrae registration. We applied the method to a set of 38 operated patients, and clinical parameters (such as the Cobb and kyphosis/lordosis angles, and vertebral axial rotations) were computed from the pre- and post-operative 3D reconstructions. Compared to manual annotations, the proposed automatic method provided values with a mean absolute error <5.6° and <6.8° for clinical angles; <1.5 mm and <2.3 mm for vertebra locations; and <4.5° and <3.7° for vertebra orientations, for pre- and post-operative times respectively. The fast, automatic 3D reconstruction from pre-operative and in-painted post-operative images provided a relevant set of parameters to assess the spine surgery without any human intervention.

B. Aubert, P. A. Vidal, S. Parent, T. Cresson, C. Vazquez, J. De Guise

### A Novel Automatic Segmentation Method to Quantify the Effects of Spinal Cord Injury on Human Thigh Muscles and Adipose Tissue

In this study, a novel automatic method for segmenting muscle groups and adipose tissue in thigh MRI volumes is developed to quantify the negative effects of spinal cord injury (SCI) on fat and muscle distribution in individuals with severe SCI. The thigh volumes were segmented into subcutaneous fat, inter-muscular fat and muscle tissue using the Linear Combination of Discrete Gaussians algorithm. Furthermore, the three main compartments of the muscle tissue (knee extensor, knee flexor and hip adductor muscles) were segmented utilizing the Joint Markov Gibbs Random Field (MGRF) model, which integrates a first-order appearance model of the muscles, spatial information, and a shape model to localize the muscle groups. The method was tested on 10 SCI and 10 non-disabled (ND) subjects, and the results showed high accuracy, with Dice similarity percentages of 96.86 ± 3.48 for fat segmentation and 94.76 ± 1.70 for muscle group segmentation. Next, we calculated 3 ratios for all subjects, based on the volumes of subcutaneous fat to muscle tissue, inter-muscular fat to muscle, and extensor to flexor. A Mann-Whitney statistical test showed that the inter-muscular fat to muscle ratio was significantly greater in the SCI group than in the ND group (p = 0.001).

Samineh Mesbah, Ahmed Shalaby, Sean Stills, Ahmed Soliman, Andrea Willhite, Susan Harkema, Enrico Rejc, Ayman El-baz
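
The accuracy figures in the abstract above are Dice similarity percentages. For readers unfamiliar with the measure, here is a minimal Python sketch of its standard definition, 2|A ∩ B| / (|A| + |B|), applied to flattened binary masks. The function name and the toy masks are illustrative assumptions; returning 100 when both masks are empty is a convention chosen here, not something the abstract specifies.

```python
def dice_percent(predicted, reference):
    """Dice similarity (in percent) between two flattened binary masks."""
    if len(predicted) != len(reference):
        raise ValueError("masks must have the same number of voxels")
    overlap = sum(1 for p, r in zip(predicted, reference) if p and r)
    size_sum = sum(1 for p in predicted if p) + sum(1 for r in reference if r)
    if size_sum == 0:
        return 100.0  # convention used here: two empty masks agree perfectly
    return 100.0 * 2.0 * overlap / size_sum


# Toy example: 2 voxels predicted, 1 voxel in the reference, 1 shared.
predicted = [1, 1, 0, 0]
reference = [1, 0, 0, 0]
print(f"{dice_percent(predicted, reference):.2f}")
```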

### OptiC: Robust and Automatic Spinal Cord Localization on a Large Variety of MRI Data Using a Distance Transform Based Global Optimization

Localizing the center of the spinal cord on MR images is a critical step toward fully automated and robust quantitative analysis, which is essential to achieve clinical utilization. While automatic localization of the spinal cord might appear to be a simple task that has already been addressed extensively, it is much more challenging to achieve this across the large variation in MRI contrasts, fields of view, resolutions and pathologies. In this study, we introduce a novel method, called “OptiC”, to automatically and robustly localize the spinal cord on a large variety of MRI data. Starting from a localization map computed by a linear Support Vector Machine trained with Histogram of Oriented Gradient features, the center of the spinal cord is localized by solving an optimization problem that introduces a trade-off between the localization map and the cord continuity along the superior-inferior axis. The OptiC algorithm features an efficient search (with a linear complexity in the number of voxels) and ensures the global minimum is reached. OptiC was compared to a recently-published method based on the Hough transform using a broad range of MRI data, involving 13 different centers, 3 contrasts ($$T_2$$-weighted n=278, $$T_1$$-weighted n=112 and $$T_2^*$$-weighted n=263), with a total of 441 subjects, including 133 patients with traumatic and neurodegenerative diseases. Overall, OptiC was able to find 98.5% of the gold-standard centerline coverage, with a mean square error of 1.21 mm, suggesting that OptiC could reliably be used for subsequent analysis tasks, such as cord segmentation, opening the door to more robust analysis in patient populations.

Charley Gros, Benjamin De Leener, Sara M. Dupont, Allan R. Martin, Michael G. Fehlings, Rohit Bakshi, Subhash Tummala, Vincent Auclair, Donald G. McLaren, Virginie Callot, Michaël Sdika, Julien Cohen-Adad

### Joint Craniomaxillofacial Bone Segmentation and Landmark Digitization by Context-Guided Fully Convolutional Networks

Generating accurate 3D models from cone-beam computed tomography (CBCT) images is an important step in developing treatment plans for patients with craniomaxillofacial (CMF) deformities. This process often involves bone segmentation and landmark digitization. Since anatomical landmarks generally lie on the boundaries of segmented bone regions, the tasks of bone segmentation and landmark digitization could be highly correlated. However, most existing methods simply treat them as two standalone tasks, without considering their inherent association. In addition, these methods usually ignore the spatial context information (i.e., displacements from voxels to landmarks) in CBCT images. To this end, we propose a context-guided fully convolutional network (FCN) for joint bone segmentation and landmark digitization. Specifically, we first train an FCN to learn the displacement maps to capture the spatial context information in CBCT images. Using the learned displacement maps as guidance information, we further develop a multi-task FCN to jointly perform bone segmentation and landmark digitization. Our method has been evaluated on 107 subjects from two centers, and the experimental results show that our method is superior to the state-of-the-art methods in both bone segmentation and landmark digitization.

Jun Zhang, Mingxia Liu, Li Wang, Si Chen, Peng Yuan, Jianfu Li, Steve Guo-Fang Shen, Zhen Tang, Ken-Chung Chen, James J. Xia, Dinggang Shen

### Online Statistical Inference for Large-Scale Binary Images

We present a unified online statistical framework for quantifying a collection of binary images. Since medical image segmentation is often done semi-automatically, the resulting binary images may be available in a sequential manner. Further, modern medical imaging datasets are too large to fit into a computer’s memory. Thus, there is a need to develop an iterative analysis framework where the final statistical maps are updated sequentially each time a new image is added to the analysis. We propose a new algorithm for online statistical inference and apply it to characterize mandible growth during the first two decades of life.

Moo K. Chung, Ying Ji Chuang, Houri K. Vorperian
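
The sequential-update idea in the abstract above can be illustrated with the well-known running mean and variance recursion (Welford's update), applied independently to each pixel. This is a generic sketch of online statistics, not the algorithm proposed in the paper; the class name and the toy two-pixel images are assumptions made for the example.

```python
class OnlinePixelStats:
    """Per-pixel running mean and variance; each image is seen once, then discarded."""

    def __init__(self, num_pixels):
        self.count = 0
        self.mean = [0.0] * num_pixels
        self.m2 = [0.0] * num_pixels  # running sum of squared deviations

    def update(self, image):
        if len(image) != len(self.mean):
            raise ValueError("image size does not match")
        self.count += 1
        for i, value in enumerate(image):
            delta = value - self.mean[i]
            self.mean[i] += delta / self.count
            self.m2[i] += delta * (value - self.mean[i])

    def variance(self):
        """Unbiased sample variance per pixel (zeros until two images are seen)."""
        if self.count < 2:
            return [0.0] * len(self.mean)
        return [m / (self.count - 1) for m in self.m2]


# Three flattened binary images arriving one after another.
stats = OnlinePixelStats(num_pixels=2)
for image in ([1, 0], [1, 1], [0, 1]):
    stats.update(image)
print(stats.mean, stats.variance())
```

For binary images the running mean at each pixel is the fraction of images in which that pixel is set, so it can be read as an occupancy probability map that never requires holding the full dataset in memory.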

### Short Acquisition Time PET Quantification Using MRI-Based Pharmacokinetic Parameter Synthesis

Positron Emission Tomography (PET) with pharmacokinetic (PK) modelling is a quantitative molecular imaging technique; however, the long data acquisition time is prohibitive in clinical practice. An approach has been proposed to incorporate blood flow information from Arterial Spin Labelling (ASL) Magnetic Resonance Imaging (MRI) into PET PK modelling to reduce the acquisition time. This requires the conversion of cerebral blood flow (CBF) maps, measured by ASL, into the relative tracer delivery parameter ($$R_1$$) used in the PET PK model. This was performed regionally using linear regression between population $$R_1$$ and ASL values. In this paper we propose a novel technique to synthesise $$R_1$$ maps from ASL data using a database with both $$R_1$$ and CBF maps. The local similarity between the candidate ASL image and those in the database is used to weight the propagation of $$R_1$$ values to obtain the optimal patient-specific $$R_1$$ map. Structural MRI data is also included to provide information within common regions of artefact in ASL data. This methodology is compared to the linear regression technique using leave-one-out analysis on 32 subjects. The proposed method significantly improves regional $$R_1$$ estimation ($$p<0.001$$), reducing the error in the pharmacokinetic modelling. Furthermore, it allows this technique to be extended to a voxel level, increasing the clinical utility of the images.

Catherine J. Scott, Jieqing Jiao, M. Jorge Cardoso, Andrew Melbourne, Enrico De Vita, David L. Thomas, Ninon Burgos, Pawel Markiewicz, Jonathan M. Schott, Brian F. Hutton, Sébastien Ourselin

### A Variational Approach to Sparse Model Error Estimation in Cardiac Electrophysiological Imaging

Noninvasive reconstruction of cardiac electrical activity from surface electrocardiograms (ECG) involves solving an ill-posed inverse problem. Cardiac electrophysiological (EP) models have been used as important a priori knowledge to constrain this inverse problem. However, the reconstruction suffers from inaccuracy and uncertainty of the prior model itself, which could be mitigated by estimating the a priori model error. Unfortunately, due to the need to handle an additional large number of unknowns in a problem that already suffers from ill-posedness, model error estimation remains an unresolved challenge. In this paper, we address this issue by modeling and estimating the a priori model error in a low dimensional space using a novel sparse prior based on the variational approximation of the L0 norm. This prior is used in a posterior regularized Bayesian formulation to quantify the error in the a priori EP model during the reconstruction of transmural action potential from ECG data. Through synthetic and real-data experiments, we demonstrate the ability of the presented method to capture a priori model error in a timely manner and thus to improve reconstruction accuracy compared to approaches without model error correction.

Sandesh Ghimire, John L. Sapp, Milan Horacek, Linwei Wang

### Data-Driven Rank Aggregation with Application to Grand Challenges

The increased number of challenges for comparative evaluation of biomedical image analysis procedures clearly reflects a need for unbiased assessment of the state-of-the-art methodological advances. Moreover, the ultimate translation of novel image analysis procedures to the clinic requires rigorous validation and evaluation of alternative schemes, a task that is best outsourced to the international research community. We commonly see an increase in the number of metrics used in parallel, reflecting alternative ways to measure similarity. Since different measures come with different scales and distributions, these are often normalized or converted into an individual rank ordering, leaving the problem of combining the set of multiple rankings into a final score. Proposed solutions are averaging or accumulation of rankings, raising the question of whether different metrics are to be treated the same or whether all metrics are needed to assess closeness to truth. We address this issue with a data-driven method for automatic estimation of weights for a set of metrics based on unsupervised rank aggregation. Our method requires no normalization procedures and makes no assumptions about metric distributions. We explore the sensitivity of metrics to small changes in input data with an iterative perturbation scheme, to prioritize the contribution of the most robust metrics in the overall ranking. We show on real anatomical data that our weighting scheme can dramatically change the ranking.

James Fishbaugh, Marcel Prastawa, Bo Wang, Patrick Reynolds, Stephen Aylward, Guido Gerig
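
The abstract above observes that the choice of metric weights can change a final ranking. The Python sketch below shows only that combination step: per-metric ranks are merged into a weighted mean rank and the methods are re-ordered. The weights are supplied by hand here; estimating them from data, which is the paper's contribution, is not reproduced. The method names, metric names, ranks, and weights are all hypothetical.

```python
def aggregate_ranks(ranks_by_metric, weights):
    """Weighted mean rank per method, plus the resulting order (best first)."""
    total_weight = sum(weights[metric] for metric in ranks_by_metric)
    methods = sorted(next(iter(ranks_by_metric.values())))
    scores = {
        method: sum(weights[metric] * ranks[method]
                    for metric, ranks in ranks_by_metric.items()) / total_weight
        for method in methods
    }
    # Ties are broken alphabetically so the output is deterministic.
    order = sorted(methods, key=lambda method: (scores[method], method))
    return order, scores


# Hypothetical ranks (1 = best) of three methods under two metrics.
ranks = {
    "dice": {"A": 1, "B": 2, "C": 3},
    "hausdorff": {"A": 3, "B": 1, "C": 2},
}

equal_order, _ = aggregate_ranks(ranks, {"dice": 0.5, "hausdorff": 0.5})
skewed_order, _ = aggregate_ranks(ranks, {"dice": 0.9, "hausdorff": 0.1})
print(equal_order)   # order under plain rank averaging
print(skewed_order)  # order when one metric dominates
```

In this toy case, equal weights place method B first, while weighting the first metric heavily places method A first.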

### Benchmarking Quantitative Imaging Biomarker Measurement Methods Without a Gold Standard

Validation of quantitative imaging biomarker (QIB) measurement methods is generally based on the concept of a reference method, also called a gold standard (GS). Poor quality of the GS, for example due to inter- and intra-rater variabilities in segmentation, may lead to biased error estimates and thus adversely impact the validation. Herein we propose a novel framework for benchmarking multiple measurement methods without a GS. The framework consists of (i) an error model accounting for correlated random error between measurements extracted by the methods, (ii) a novel objective based on a joint posterior probability of the error model parameters, and (iii) Markov chain Monte Carlo to sample the posterior. Analysis of the posterior makes it possible not only to estimate the error model parameters (systematic and random error), and thereby benchmark the methods, but also to estimate the unknown true values of the QIB. Validation of the proposed framework on multiple sclerosis total lesion load measurements by four automated segmentation methods applied to a clinical brain MRI dataset showed a very good agreement of the error model and true value estimates with corresponding least squares estimates based on a known GS.