
2025 | Book

Ophthalmic Medical Image Analysis

11th International Workshop, OMIA 2024, Held in Conjunction with MICCAI 2024, Marrakesh, Morocco, October 10, 2024, Proceedings


About this book

This book constitutes the refereed proceedings of the 11th International Workshop on Ophthalmic Medical Image Analysis, OMIA 2024, held in conjunction with the 27th International Conference on Medical Image Computing and Computer Assisted Intervention, MICCAI 2024, in Marrakesh, Morocco, in October 2024.

The 16 papers presented in this book were carefully reviewed and selected from 31 submissions. The papers cover various topics such as computer-aided detection and diagnosis of disease; image analysis of novel ophthalmic imaging modalities; multimodal ophthalmic image analysis; ophthalmic image atlases; ophthalmic image analysis in animals; registration of ophthalmic images, including multimodal registration; segmentation of structures (e.g., vasculature, lesions, landmarks); combined analysis of images of the eye and other organs; validation; and crowdsourcing.

Table of Contents

Frontmatter
Selective Functional Connectivity between Ocular Dominance Columns in the Primary Visual Cortex
Abstract
The primary visual cortex (V1) in humans and many animals is composed of fine-scale neuronal ensembles that respond preferentially to the stimulation of one eye over the other, known as ocular dominance columns (ODCs). Despite their importance in shaping our perception, to date, the nature of the functional interactions between ODCs has remained poorly understood. In this work, we aimed to improve our understanding of the interaction mechanisms between fine-scale neuronal structures distributed within V1. To that end, we applied high-resolution functional MRI to study mechanisms of functional connectivity between ODCs. Using this technique, we quantified the level of functional connectivity between ODCs as a function of their ocular preference, showing that alike ODCs are more strongly connected than unalike ones. Through these experiments, we aspired to contribute to filling the gap in our knowledge of the functional connectivity of ODCs in humans as compared to animals.
Iman Aganj, Shahin Nasr
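As a rough illustration of the connectivity contrast described above, the following sketch (not the authors' code; the `timeseries` and `preference` inputs are hypothetical) compares mean pairwise correlations between ODCs of like versus unlike ocular preference.
```python
# Hypothetical sketch: compare functional connectivity between ocular dominance
# columns (ODCs) of like vs. unlike ocular preference, as the abstract describes.
import numpy as np

def odc_connectivity_contrast(timeseries, preference):
    """timeseries: (n_odcs, n_timepoints) BOLD signals, one row per ODC.
    preference: (n_odcs,) array of 0 (left-eye) or 1 (right-eye) labels."""
    corr = np.corrcoef(timeseries)               # pairwise Pearson correlations
    same = preference[:, None] == preference[None, :]
    off_diag = ~np.eye(len(preference), dtype=bool)
    like_fc = corr[same & off_diag].mean()       # connectivity between alike ODCs
    unlike_fc = corr[~same].mean()               # connectivity between unalike ODCs
    return like_fc, unlike_fc

# Example with synthetic data: alike ODCs share a common signal component,
# so like_fc should exceed unlike_fc.
rng = np.random.default_rng(0)
pref = rng.integers(0, 2, size=40)
shared = rng.standard_normal((2, 300))
ts = shared[pref] + rng.standard_normal((40, 300))
print(odc_connectivity_contrast(ts, pref))
```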
ETSCL: An Evidence Theory-Based Supervised Contrastive Learning Framework for Multi-modal Glaucoma Grading
Abstract
Glaucoma is one of the leading causes of vision impairment. Digital imaging techniques, such as color fundus photography (CFP) and optical coherence tomography (OCT), provide quantitative and noninvasive methods for glaucoma diagnosis. Recently, in the field of computer-aided glaucoma diagnosis, multi-modality methods that integrate the CFP and OCT modalities have achieved greater diagnostic accuracy compared to single-modality methods. However, it remains challenging to extract reliable features due to the high similarity of medical images and the unbalanced multi-modal data distribution. Moreover, existing methods overlook the uncertainty estimation of different modalities, leading to unreliable predictions. To address these challenges, we propose a novel framework, namely ETSCL, which consists of a contrastive feature extraction stage and a decision-level fusion stage. Specifically, the supervised contrastive loss is employed to enhance the discriminative power in the feature extraction process, resulting in more effective features. In addition, we utilize the Frangi vesselness algorithm as a preprocessing step to incorporate vessel information to assist in the prediction. In the decision-level fusion stage, an evidence theory-based multi-modality classifier is employed to combine multi-source information with uncertainty estimation. Extensive experiments demonstrate that our method achieves state-of-the-art performance. The code is available at https://github.com/master-Shix/ETSCL.
Zhiyuan Yang, Bo Zhang, Yufei Shi, Ningze Zhong, Johnathan Loh, Huihui Fang, Yanwu Xu, Si Yong Yeo
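Since the abstract names the Frangi vesselness algorithm as the preprocessing step, here is a minimal sketch of how such a vessel map could be computed with scikit-image; the channel choice, CLAHE step, and sigma range are assumptions rather than the authors' settings.
```python
# A minimal sketch (not the authors' code) of Frangi vesselness preprocessing
# for a color fundus photograph, as used by ETSCL to inject vessel information.
from skimage import io, filters, exposure

def vessel_map(cfp_path):
    rgb = io.imread(cfp_path)                        # assumed RGB fundus photo
    green = rgb[..., 1].astype(float) / 255.0        # vessels contrast best in green
    green = exposure.equalize_adapthist(green)       # CLAHE to normalize illumination
    # Frangi filter enhances tubular structures; vessels appear as dark ridges.
    vesselness = filters.frangi(green, sigmas=range(1, 8, 2), black_ridges=True)
    return (vesselness - vesselness.min()) / (vesselness.max() - vesselness.min() + 1e-8)

# The resulting map could, for example, be stacked with the CFP as an extra channel.
```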
VNR-AV: Structural Post-processing for Retinal Arteries and Veins Segmentation
Abstract
The retinal vasculature reveals numerous health conditions, making the quantitative assessment of changes in retinal arteries and veins crucial for disease prevention and management. Quantifying changes in the retinal vasculature requires segmentation to delineate it. Deep-learning techniques demonstrate impressive results for retinal vasculature segmentation in color fundus images. However, even if the generated segmentations are good at the pixel level, they are not coherent at the structural level (i.e., not anatomically coherent compared to a real retinal vasculature). The vasculature of the retina is composed of two completely connected trees, arteries and veins, whereas segmentations produce several disconnected components. In this article, we propose VNR-AV: a Vasculature Network Retrieval method specifically designed for retinal artery and vein segmentation. The proposed post-processing method achieves two main objectives: it leverages vessel segmentation to enhance the segmentation of arteries and veins by performing reconnection, removal, and detail gathering; and it removes or reconnects segmentation components based on a set of rules developed through an understanding of deep-learning-generated segmentations. VNR-AV retrieves a fully connected, and thus more anatomically coherent, structure of the retinal artery and vein networks while slightly improving the overlap quality at the pixel level. VNR-AV enables a more coherent assessment of changes in retinal arteries and veins and paves the way for further research in the prevention and management of eye-related diseases.
Idris Dulau, Benoit Recur, Catherine Helmer, Cecile Delcourt, Marie Beurton-Aimar
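As a much-simplified stand-in for the rule-based post-processing described above (not VNR-AV itself), the sketch below keeps the dominant connected component of a thresholded artery/vein probability map and drops small disconnected fragments.
```python
# A much-simplified stand-in for structural post-processing of an artery/vein
# probability map: keep the dominant connected component and drop small
# disconnected fragments that are anatomically implausible.
import numpy as np
from skimage import measure, morphology

def clean_vessel_mask(prob_map, threshold=0.5, min_size=64):
    mask = prob_map > threshold
    mask = morphology.remove_small_objects(mask, min_size=min_size)
    labels = measure.label(mask, connectivity=2)      # 8-connectivity in 2D
    if labels.max() == 0:
        return mask
    sizes = np.bincount(labels.ravel())[1:]           # component sizes, skip background
    keep = np.argmax(sizes) + 1                       # largest component = main tree
    return labels == keep

# VNR-AV goes further: it reconnects fragments to the main tree using the vessel
# segmentation and a set of anatomical rules rather than simply discarding them.
```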
Wavelet Deep Learning Network for Objective Retinal Functional Estimation from Multimodal Retinal Imaging
Abstract
In this study, we proposed a wavelet-based deep learning network to estimate retinal function from retinal structure in patients with Retinitis Pigmentosa. We used macular integrity assessment microperimetry to measure retinal sensitivities (functional information) and spectral domain optical coherence tomography to assess retinal layer thicknesses (structural information). Outer, inner, and total retinal thicknesses were extracted. We found a strong correlation between outer retinal thickness and retinal sensitivity. Leveraging this correlation, we employed machine learning models for functional estimation from retinal layer thicknesses and vice versa. For functional estimation, we incorporated discrete wavelet transform and max-pooling features in a ResNet18-based architecture, significantly improving the accuracy to an \(R^2\) score of 0.79. Our results demonstrate that machine learning models can effectively predict retinal function from retinal structure, and vice versa. Furthermore, the integration of discrete wavelet transform features in the convolutional neural network improved the performance of functional estimation from retinal structure.
An D. Le, Shaden H. Yassin, William R. Freeman, Anna Heinke, Dirk-Uwe G. Bartsch, Shyamanga Borooah, Shiwei Jin, Truong Nguyen, Cheolhong An
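To illustrate the wavelet idea, a hedged sketch follows that decomposes a retinal thickness map with a 2D discrete wavelet transform and applies max-pooling before feeding the sub-bands to a CNN; the exact fusion with the ResNet18 backbone used in the paper is not shown, and the function names are assumptions.
```python
# A hedged sketch of the wavelet feature idea: decompose a retinal-thickness map
# with a 2D discrete wavelet transform and stack the sub-bands as extra input
# channels for a CNN backbone.
import numpy as np
import pywt
import torch
import torch.nn.functional as F

def dwt_channels(thickness_map):
    """thickness_map: (H, W) float array, e.g. an outer retinal thickness grid."""
    cA, (cH, cV, cD) = pywt.dwt2(thickness_map, 'haar')   # approximation + details
    bands = np.stack([cA, cH, cV, cD])                     # (4, H/2, W/2)
    x = torch.from_numpy(bands).float().unsqueeze(0)       # (1, 4, H/2, W/2)
    pooled = F.max_pool2d(x, kernel_size=2)                # max-pooling as in the paper
    return pooled                                           # fed to the CNN backbone
```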
Inter-frame Sclera Vessel Rotation Tracking for Toric Intraocular Lens Implantation Navigation
Abstract
In refractive cataract surgery, precise alignment of the toric intraocular lens with the intended axis of astigmatism is critical, but challenges arise due to intraoperative eye rotation. While existing computer-assisted navigation methods primarily address first-frame rotation in the microscopic video, they neglect significant intraoperative rotation among video frames, termed inter-frame rotation. In this study, we propose a novel inter-frame rotation computation pipeline leveraging sclera vessel features, which involves extracting and normalizing the sclera regions of a reference and an intraoperative microscopic video frame into polar representations, recognizing prominent feature points to initialize a multi-region-of-interest (ROI) correlation rotation tracker, and computing the rotation degree to update the alignment axis in toric intraocular lens (IOL) implantation navigation. Ablation experiments validate the effectiveness of each component in the proposed pipeline, and comparative experiments highlight the pipeline’s superior performance over related methods. Crucially, we seamlessly integrate this rotation pipeline into an augmented reality navigation system for cataract surgery, with preliminary validation demonstrating its effectiveness in a real clinical setting.
Puxun Tu, Meng Xie, Jiao Lv, Peiquan Zhao, Ce Zheng, Xiaojun Chen
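The core geometric step, unwrapping the sclera into a polar representation so that eye rotation becomes a shift along the angular axis, can be sketched as follows; the multi-ROI correlation tracker and feature-point initialization from the paper are omitted, and the parameters are assumptions.
```python
# A hedged sketch of inter-frame rotation estimation: unwrap the scleral annulus of a
# reference and a current frame into polar coordinates, then find the row shift
# (i.e. rotation angle) that maximises their correlation.
import cv2
import numpy as np

def estimate_rotation_deg(ref_gray, cur_gray, center, max_radius, n_angles=360):
    size = (256, n_angles)                                  # (radial bins, angular bins)
    polar_ref = cv2.warpPolar(ref_gray, size, center, max_radius, cv2.WARP_POLAR_LINEAR)
    polar_cur = cv2.warpPolar(cur_gray, size, center, max_radius, cv2.WARP_POLAR_LINEAR)
    scores = []
    for shift in range(n_angles):                           # brute-force circular search
        rolled = np.roll(polar_cur, shift, axis=0)          # rotation = shift along angle axis
        scores.append(np.corrcoef(polar_ref.ravel(), rolled.ravel())[0, 1])
    best = int(np.argmax(scores))
    deg = best * 360.0 / n_angles
    return deg if deg <= 180 else deg - 360                 # signed rotation in degrees
```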
Data Heterogeneity-Aware Personalized Federated Learning for Diagnosis
Abstract
Personalized federated learning is an extension of federated learning that aims to improve prediction accuracy for diverse clients by tailoring models to their individual data. However, because client data remain mutually agnostic, it is difficult to be aware of each client’s data characteristics, which limits the effectiveness of personalization. To overcome this challenge, we propose a data heterogeneity-aware algorithm for personalization in federated learning, which assesses the heterogeneity across client data using uncertainty. Specifically, a heterogeneity weight is determined based on the predictive uncertainty of the global model on client-specific data. Subsequently, an adaptive fusion of the global model and the previous client model, weighted by the heterogeneity weight, personalizes the initialization of the client model training in each iteration. Experiments conducted on diagnosis in two imaging modalities, particularly under non-independent and identically distributed (non-IID) scenarios, demonstrate the superior performance of our proposed algorithm compared to state-of-the-art approaches.
Huiyan Lin, Heng Li, Haojin Li, Xiangyang Yu, Kuai Yu, Chenhao Liang, Huazhu Fu, Jiang Liu
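A hedged sketch of the personalization step described above: the global model's predictive uncertainty on client data yields a heterogeneity weight, which then blends the global model with the client's previous local model to initialize local training. The mapping from entropy to weight here is an assumption, not the paper's exact formulation.
```python
# Hedged sketch of heterogeneity-aware personalized initialization in federated learning.
import torch
import torch.nn.functional as F

@torch.no_grad()
def heterogeneity_weight(global_model, client_loader, device="cpu"):
    global_model.eval().to(device)
    entropies = []
    for x, _ in client_loader:
        probs = F.softmax(global_model(x.to(device)), dim=1)
        entropies.append(-(probs * probs.clamp_min(1e-8).log()).sum(dim=1))
    h = torch.cat(entropies).mean()              # mean predictive entropy on client data
    return torch.sigmoid(h).item()               # squashed to (0, 1) -- an assumption

def personalized_init(global_state, prev_client_state, alpha):
    # alpha close to 1 (high heterogeneity): trust the previous client model more.
    return {k: alpha * prev_client_state[k] + (1 - alpha) * global_state[k]
            for k in global_state}
```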
MM-UNet: A Mixed MLP Architecture for Improved Ophthalmic Image Segmentation
Abstract
Ophthalmic image segmentation serves as a critical foundation for ocular disease diagnosis. Although fully convolutional neural networks (CNNs) are commonly employed for segmentation, they are constrained by inductive biases and face challenges in establishing long-range dependencies. Transformer-based models address these limitations but introduce substantial computational overhead. Recently, a simple yet efficient Multilayer Perceptron (MLP) architecture was proposed for image classification, achieving competitive performance relative to advanced transformers. However, its effectiveness for ophthalmic image segmentation remains unexplored. In this paper, we introduce MM-UNet, an efficient Mixed MLP model tailored for ophthalmic image segmentation. Within MM-UNet, we propose a multi-scale MLP (MMLP) module that facilitates the interaction of features at various depths through a grouping strategy, enabling simultaneous capture of global and local information. We conducted extensive experiments on both a private anterior segment optical coherence tomography (AS-OCT) image dataset and a public fundus image dataset. The results demonstrated the superiority of our MM-UNet model in comparison to state-of-the-art deep segmentation networks.
Zunjie Xiao, Xiaoqing Zhang, Risa Higashita, Jiang Liu
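As an illustration of the general flavour of grouped MLP token mixing that the MMLP module builds on (the actual MM-UNet design is not reproduced here), a minimal PyTorch block might look as follows; the group count and hidden sizes are assumptions.
```python
# An illustrative grouped token-mixing MLP block, not the authors' MMLP module.
import torch
import torch.nn as nn

class GroupedTokenMLP(nn.Module):
    def __init__(self, num_tokens, channels, groups=4, hidden=256):
        super().__init__()
        assert channels % groups == 0
        self.groups = groups
        # One token-mixing MLP per channel group, so groups can specialise.
        self.mixers = nn.ModuleList(
            nn.Sequential(nn.Linear(num_tokens, hidden), nn.GELU(),
                          nn.Linear(hidden, num_tokens))
            for _ in range(groups))
        self.norm = nn.LayerNorm(channels)

    def forward(self, x):                      # x: (B, N, C) flattened feature tokens
        y = self.norm(x)
        chunks = torch.chunk(y, self.groups, dim=2)
        mixed = [m(c.transpose(1, 2)).transpose(1, 2)   # mix across the token axis
                 for m, c in zip(self.mixers, chunks)]
        return x + torch.cat(mixed, dim=2)     # residual connection
```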
Coral-CVDs: A Consistent Ordinal Regression Model for Cardiovascular Diseases Grading
Abstract
Cardiovascular diseases (CVDs) are the leading cause of death worldwide, emphasizing the critical need for early detection to improve treatment outcomes and prevent severe complications. Changes in the retina can be used to predict CVDs. Traditional methods often use color fundus photography (CFP) to classify CVD risk levels into discrete categories. However, these methods typically treat the problem as a classification task, potentially overlooking the ordinal relationships among different risk levels. We propose Coral-CVDs to address this limitation by integrating these ordinal relationships into the classification model. This enhancement allows the model to better distinguish the boundaries between adjacent risk levels. Additionally, we resolve the inconsistency issues present in traditional ordinal regression models and provide mathematical proofs to support our approach. We conducted a series of experiments on UK Biobank data to validate our hypotheses, and the results demonstrate the effectiveness of our approach.
Zhuangzhi Gao, He Zhao, Zhongli Wu, Yuankai Wang, Gregory Yoke Hong Lip, Alena Shantsila, Eduard Shantsila, Yalin Zheng
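For readers unfamiliar with consistent ordinal regression, the sketch below shows a CORAL-style head on which such grading models are commonly built: a shared weight vector with K-1 biases yields rank-consistent cumulative logits trained with binary cross-entropy. Coral-CVDs' own formulation may differ in detail.
```python
# A hedged sketch of a CORAL-style consistent ordinal regression head.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CoralHead(nn.Module):
    def __init__(self, in_features, num_classes):
        super().__init__()
        self.fc = nn.Linear(in_features, 1, bias=False)    # shared weights
        self.biases = nn.Parameter(torch.zeros(num_classes - 1))

    def forward(self, features):
        return self.fc(features) + self.biases             # (B, K-1) cumulative logits P(y > k)

def label_to_levels(labels, num_classes):
    # levels[i, k] = 1 if label_i > k else 0
    ranks = torch.arange(num_classes - 1)
    return (labels.unsqueeze(1) > ranks.unsqueeze(0)).float()

def coral_loss(logits, levels):
    return F.binary_cross_entropy_with_logits(logits, levels)
```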
Affordable Deep Learning for Diagnosing Inherited and Common Retinal Diseases via Color Fundus Photography
Abstract
Retinal diseases are a significant global health concern, often leading to severe vision impairment or blindness if not diagnosed and treated promptly. Classifying inherited retinal diseases (IRDs) is particularly challenging, typically requiring genetic analysis and expert ophthalmologists. Developing automatic deep learning models for their classification is crucial, especially for regions with low-resource settings. This work focuses on diagnosing IRDs and other common retinal conditions using color fundus photographs (CFPs), leveraging transformative advances in deep learning. Our approach utilizes a ResNet18 model trained on CFPs from seven diverse, multi-institute sources, aiming to classify 21 retinal diseases, including 13 IRDs. This is the highest number of IRD classes addressed by a single model using color fundus photographs to date. The model achieved an impressive \(F_1\) score of 0.86 for Retinitis Pigmentosa (RP), demonstrating its capability for broad diagnostic use. The results highlight the feasibility of employing deep learning for IRD detection, a task that traditionally relies on expensive and time-consuming genetic testing. The inclusion of a diverse dataset ensures robust performance and generalizability across various demographics. Activation map analysis confirms the model’s accuracy in identifying disease patterns. These promising results mark a significant step towards more accessible and efficient retinal disease diagnosis through deep learning technology.
Elyes Farjallah, Said El Shamieh, Razieh Rezaei, Philipp Herrmann, Sandrine H. Künzel, Frank G. Holz, Shadi Albarqouni
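A minimal sketch of the kind of ResNet18 setup the abstract describes for 21-class fundus disease classification; the use of ImageNet weights and the head replacement are assumptions about implementation details.
```python
# Minimal sketch: ResNet18 adapted to 21 retinal disease classes (assumed setup).
import torch.nn as nn
from torchvision import models

def build_fundus_classifier(num_classes=21, pretrained=True):
    weights = models.ResNet18_Weights.DEFAULT if pretrained else None
    model = models.resnet18(weights=weights)
    model.fc = nn.Linear(model.fc.in_features, num_classes)  # replace ImageNet head
    return model
```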
Comparative Analysis of Data Augmentation for Retinal OCT Biomarker Segmentation
Abstract
Data augmentation plays a crucial role in addressing the challenge of limited expert-annotated datasets in deep learning applications for retinal Optical Coherence Tomography (OCT) scans. This work exhaustively investigates the impact of various data augmentation techniques on retinal layer boundary and fluid segmentation. Our results reveal that their effectiveness varies significantly based on the dataset’s characteristics and the amount of available labeled data. The benefits of augmentation are not uniform: they are more pronounced in scenarios with scarce data, particularly for transformation-based methods. The findings emphasize the need for a nuanced, strategic approach to data augmentation that considers factors such as dataset characteristics, the amount of labelled data, and the choice of model architecture.
Markus Unterdechler, Botond Fazekas, Guilherme Aresta, Hrvoje Bogunović
Advanced Diabetic Retinopathy Classification: Integrating Pathological Indicators Segmentation and Morphological Feature Analysis
Abstract
Diabetic retinopathy (DR) is a common and serious complication of diabetes mellitus, often leading to blindness. Identifying DR stages accurately is essential for timely and effective treatment. This study introduces an innovative method to improve DR classification by detecting retinal lesions in fundus images (such as Hard Exudates, Soft Exudates, Microaneurysms, and Hemorrhages) and extracting visual and morphological features from these lesions. The proposed approach generates synthetic retinal lesions and creates artificial masks to highlight DR pathology regions. A segmentation model is first trained to produce these masks and is then fine-tuned with real fundus images and their corresponding annotations. The model then derives morphological features (such as the number of lesions of each type and their maximum and minimum sizes, among others) from the generated masks and integrates them with latent features extracted from the segmentation model to enhance classification accuracy. The model can also provide a visual explanation to aid doctors in verifying and trusting the AI’s decisions, ultimately enhancing clinical decision-making and patient care. Experimental results on the DDR, E-Ophtha, and IDRiD datasets demonstrate the method’s effectiveness in improving lesion segmentation and DR classification. The DR classification achieved an accuracy of 88.04% and a Quadratic Weighted Kappa (QWK) score of 93.71%, surpassing state-of-the-art methods. The use of composite masks improves the model’s ability to identify subtle DR progression indicators, enabling more precise diagnostics and explainable AI-based interventions in clinical practice. The code is publicly available at https://github.com/saifalkhaldiurv/Advanced-Diabetic-retinopathy-Classification.git.
Saif Khalid Musluh, Ammar M. Okran, Saddam Abdulwahab, Domenec Puig, Hatem A. Rashwan
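The morphological features mentioned above (lesion counts, maximum and minimum sizes) can be extracted from a multi-class lesion mask as sketched below; the label coding is hypothetical and chosen only for illustration.
```python
# Hedged sketch of morphological feature extraction from a multi-class lesion mask.
# Assumed label coding: 1=hard exudate, 2=soft exudate, 3=microaneurysm, 4=hemorrhage.
import numpy as np
from skimage import measure

def lesion_morphology(mask, lesion_ids=(1, 2, 3, 4)):
    feats = []
    for lid in lesion_ids:
        comps = measure.label(mask == lid, connectivity=2)
        areas = [r.area for r in measure.regionprops(comps)]
        feats.extend([len(areas),                      # number of lesions of this type
                      max(areas) if areas else 0,      # largest lesion area (pixels)
                      min(areas) if areas else 0])     # smallest lesion area (pixels)
    return np.asarray(feats, dtype=float)              # concatenated with latent features
```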
Masked Image Modelling for Retinal OCT Understanding
Abstract
This work explores the effectiveness of masked image modelling for learning representations of retinal OCT images. To this end, we leverage Masked Autoencoders (MAE), a simple and scalable method for self-supervised learning, to obtain a powerful and general representation for OCT images by training on 700K OCT images from 41K patients collected under real-world clinical settings. We also provide the first extensive evaluation of an OCT model on a challenging battery of 6 downstream tasks. Our model achieves strong performance when fully finetuned but can also serve as a versatile frozen feature extractor for many tasks using lightweight adapters. Furthermore, we propose an extension of the MAE pretraining to fuse OCT with an auxiliary modality, namely IR fundus images, and learn a joint model for both. We demonstrate that our approach improves performance on a multimodal downstream application. Our experiments utilize most publicly available OCT datasets, thus enabling future comparisons. Our code and model weights are publicly available at https://github.com/TheoPis/MIM_OCT.
Theodoros Pissas, Pablo Márquez-Neila, Sebastian Wolf, Martin Zinkernagel, Raphael Sznitman
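The random patch masking at the heart of MAE pretraining can be sketched as follows, mirroring the public MAE recipe rather than the paper's exact settings.
```python
# Hedged sketch of MAE-style random masking: keep a small visible subset of patch
# tokens of an OCT B-scan and let the decoder reconstruct the rest.
import torch

def random_masking(tokens, mask_ratio=0.75):
    """tokens: (B, N, D) patch embeddings of an OCT B-scan."""
    B, N, D = tokens.shape
    n_keep = int(N * (1 - mask_ratio))
    noise = torch.rand(B, N)                            # random score per patch
    ids_shuffle = noise.argsort(dim=1)                  # low scores are kept
    ids_keep = ids_shuffle[:, :n_keep]
    visible = torch.gather(tokens, 1, ids_keep.unsqueeze(-1).expand(-1, -1, D))
    mask = torch.ones(B, N)
    mask.scatter_(1, ids_keep, 0.0)                     # 0 = visible, 1 = masked
    return visible, mask, ids_shuffle                   # the encoder sees only `visible`
```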
A Dual-Stream Network for Langerhans’ Cells Segmentation in CCM Images
Abstract
Confocal Corneal Microscopy (CCM) is an important imaging modality in ophthalmological corneal-related diagnosis. CCM provides high-resolution microscopy that enables accurate monitoring and diagnosis of various corneal diseases, including keratitis, corneal ulcers, and dry eye syndrome. Current corneal nerve fiber image segmentation methods are not suitable for segmenting small Langerhans’ cells. In this paper, we introduce a dual-stream network for Langerhans’ cell detection and segmentation in CCM images, aiming to assist in the diagnosis of dry eye syndrome. The proposed method comprises both segmentation and detection branches, and the fusion of their mutual information is achieved through the Targetized Convolution Module (TCM) and the Semantic Mask Guide Refinement Module (SMGR). Compared to existing methods, our proposed framework improves performance for both the segmentation and detection tasks simultaneously.
Jun Wu, Jinshu Gao, Jingjie Lin, Zeyu Huang, Yang Liu, Zhengyu Chen, Qin Long, Jianchun Zhao, Dayong Ding
Formula-Driven Data Augmentation and Partial Retinal Layer Copying for Retinal Layer Segmentation
Abstract
Most retinal layer segmentation methods for OCT images assume that the retina has been flattened in advance, and thus cannot always deal with retinas whose structure is altered by ophthalmopathy and/or curved due to myopia. To eliminate the need for flattening in retinal layer segmentation and make such methods more practical, we propose novel data augmentation methods for OCT images. Formula-driven data augmentation (FDDA) emulates a variety of retinal structures by vertically shifting each column of the OCT images according to a given mathematical formula. We also propose partial retinal layer copying (PRLC), which copies a part of the retinal layers and pastes it into a region outside the retinal layers. Through experiments using the OCT MS and Healthy Control dataset and the Duke Cyst DME dataset, we demonstrate that FDDA and PRLC make it possible to detect the boundaries of retinal layers without flattening, even for retinal layer segmentation methods that assume a flattened retina.
Tsubasa Konno, Takahiro Ninomiya, Kanta Miura, Koichi Ito, Noriko Himori, Parmanand Sharma, Toru Nakazawa, Takafumi Aoki
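Because FDDA is defined as shifting each column of an OCT B-scan vertically according to a mathematical formula, a minimal sketch with a parabolic curvature formula is shown below; the paper's actual formulas and parameters are not assumed.
```python
# Hedged sketch of formula-driven data augmentation (FDDA): shift each column of an
# OCT B-scan vertically, here following a simple parabola emulating myopic curvature.
import numpy as np

def fdda_parabolic_shift(bscan, max_shift=40):
    """bscan: (H, W) grayscale OCT B-scan; returns a curvature-augmented copy."""
    H, W = bscan.shape
    cols = np.linspace(-1.0, 1.0, W)
    shifts = (max_shift * cols ** 2).astype(int)        # parabolic column-wise shift
    out = np.zeros_like(bscan)
    for x, s in enumerate(shifts):
        out[:, x] = np.roll(bscan[:, x], s)             # shift this column downward
        out[:s, x] = 0                                   # clear wrapped-around pixels
    return out
```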
Enhancing Community Vision Screening: AI-Driven Retinal Photography for Early Disease Detection and Patient Trust
Abstract
Community vision screening plays a crucial role in identifying individuals with vision loss and preventing avoidable blindness, particularly in rural communities where access to eye care services is limited. Currently, there is a pressing need for a simple and efficient process to screen and refer individuals with significant eye disease-related vision loss to tertiary eye care centers for further care. An ideal solution should seamlessly and readily integrate with existing workflows, providing comprehensive initial screening results to service providers, thereby enabling precise patient referrals for timely treatment. This paper introduces the Enhancing Community Vision Screening (ECVS) solution, which addresses the aforementioned concerns with a novel and feasible solution based on simple, non-invasive retinal photography for the detection of pathology-based visual impairment. Our study employs four distinct deep learning models: RETinal photo Quality Assessment (RETQA), Pathology Visual Impairment detection (PVI), Eye Disease Diagnosis (EDD) and Visualization of Lesion Regions of the eye (VLR). We conducted experiments on over 10 datasets, totaling more than 80,000 fundus photos collected from various sources. The models integrated into ECVS achieved impressive AUC scores of 0.98 for RETQA, 0.95 for PVI, and 0.90 for EDD, along with a DICE coefficient of 0.48 for VLR. These results underscore the promising capabilities of ECVS as a straightforward and scalable method for community-based vision screening.
Xiaofeng Lei, Yih-Chung Tham, Jocelyn Hui Lin Goh, Yangqin Feng, Yang Bai, Zhi Da Soh, Rick Siow Mong Goh, Xinxing Xu, Yong Liu, Ching-Yu Cheng
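A hedged sketch of how the four ECVS models might be chained in a screening workflow; the function interfaces (retqa, pvi, edd, vlr) are hypothetical placeholders, not the system's actual API.
```python
# Hypothetical orchestration of the four ECVS models described in the abstract.
def screen_fundus_photo(image, retqa, pvi, edd, vlr, quality_threshold=0.5):
    if retqa(image) < quality_threshold:          # RETQA: reject ungradable photos
        return {"status": "retake", "reason": "insufficient image quality"}
    return {"status": "graded",
            "visual_impairment": pvi(image),      # PVI: pathology-based VI probability
            "diagnoses": edd(image),              # EDD: eye disease diagnosis
            "lesion_map": vlr(image)}             # VLR: lesion region visualisation
```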
Enhancing Large Foundation Models to Identify Fundus Diseases Based on Contrastive Enhanced Low-Rank Adaptation Prompt
Abstract
Large foundation models (LFMs) have recently gained attention in the computer vision field, and several LFMs focused on ophthalmic imaging have been pre-trained and achieved promising performance on different downstream tasks. However, few works have explored how to leverage a pre-trained ophthalmic LFM to improve the accuracy of fundus disease identification. Currently, the fine-tuning and transfer learning methods commonly used to train a pre-trained LFM to recognize fundus diseases are susceptible to the influence of the dataset feature distribution, resulting in limited performance when identifying fundus diseases with complex or ambiguous pathological features. Therefore, exploring how to effectively utilize a pre-trained ophthalmic LFM to improve the identification of fundus diseases with different feature distributions is crucial and remains an open challenge. Focusing on these challenges, we propose a novel contrastive enhanced low-rank learnable adaptation prompt (CE-LORA) to enhance the performance of foundation models in fundus disease identification tasks. Our proposed method combines low-rank representation reparameterization with category-aware contrastive learning on augmented data to achieve parameter-efficient transfer learning, guiding the model to focus on category-related features while enhancing its capacity to represent the discriminative features contained in fundus images. We analyzed the performance of different methods on a dataset collected from multiple ophthalmic clinics with varying feature distributions. Our proposed CE-LORA increased the average F1 score of the LFM by 36.74% and 14.94% compared to commonly used fine-tuning and transfer learning approaches, respectively, reaching 94.29%.
Meng Wang, Tian Lin, Ting Xu, Ke Zou, Haoyu Chen, Huazhu Fu, Ching-Yu Cheng
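The low-rank adaptation idea underlying CE-LORA can be sketched as a frozen pre-trained linear layer augmented with a trainable low-rank update; the contrastive component is not shown, and the rank and scaling values are assumptions.
```python
# Hedged sketch of low-rank adaptation (LoRA-style): the frozen pre-trained linear
# layer is augmented with a trainable low-rank update B @ A.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, rank=8, alpha=16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                       # keep the LFM weights frozen
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)
```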
Backmatter
Metadata
Title
Ophthalmic Medical Image Analysis
Editors
Bhavna Antony
Hao Chen
Huihui Fang
Huazhu Fu
Cecilia S. Lee
Copyright Year
2025
Electronic ISBN
978-3-031-73119-8
Print ISBN
978-3-031-73118-1
DOI
https://doi.org/10.1007/978-3-031-73119-8
