
2024 | Book

Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications

26th Iberoamerican Congress, CIARP 2023, Coimbra, Portugal, November 27–30, 2023, Proceedings, Part II


About this book

This 2-volume set, LNCS 14469 and 14470, constitutes the proceedings of the 26th Iberoamerican Congress on Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications, CIARP 2023, which took place in Coimbra, Portugal, in November 2023.
The 61 papers presented were carefully reviewed and selected from 106 submissions. They present research in the fields of pattern recognition, artificial intelligence, and related areas.

Table of Contents

Assessing the Generalizability of Deep Neural Networks-Based Models for Black Skin Lesions
Melanoma is the most severe type of skin cancer due to its ability to cause metastasis. It is more common in black people, often affecting acral regions: palms, soles, and nails. Deep neural networks have shown tremendous potential for improving clinical care and skin cancer diagnosis. Nevertheless, prevailing studies predominantly rely on datasets of white skin tones, neglecting to report diagnostic outcomes for diverse patient skin tones. In this work, we evaluate supervised and self-supervised models in skin lesion images extracted from acral regions commonly observed in black individuals. Also, we carefully curate a dataset containing skin lesions in acral regions and assess the datasets concerning the Fitzpatrick scale to verify performance on black skin. Our results expose the poor generalizability of these models, revealing their favorable performance for lesions on white skin. Neglecting to create diverse datasets, which necessitates the development of specialized models, is unacceptable. Deep neural networks have great potential to improve diagnosis, particularly for populations with limited access to dermatology. However, including black skin lesions is necessary to ensure these populations can access the benefits of inclusive technology.
Luana Barros, Levy Chaves, Sandra Avila
Breast MRI Multi-tumor Segmentation Using 3D Region Growing
Breast tumor is one of the most prominent indicators for diagnosis of breast cancer. Magnetic Resonance Imaging (MRI) is a relevant imaging modality tool for breast cancer screening. Moreover, an accurate 3D segmentation of breast tumors from MRI scans plays a key role in the analysis of the disease. This paper presents a pipeline to automatically segment multiple tumors in breast MRI scans, following the methodology proposed by one previous study, addressing its limitations in detecting multiple tumors and automatically selecting seed points using a 3D region growing algorithm. The pre-processing includes bias field correction, data normalization, and image filtering. The segmentation process involved several steps, including identifying high-intensity points, followed by identifying high-intensity regions using k-means clustering. Then, the centers of the regions were used as seeds for the 3D region growing algorithm, resulting in a mask with 3D structures. These masks were then analyzed in terms of their volume, compactness, and circularity. Despite the need for further adjustments in the model parameters, the successful segmentation of four tumors proved that our solution is a promising approach for automatic multi-tumor segmentation with the potential to be combined with a classification model relying on the characteristics of the segmented structures.
Teresa M. C. Pereira, Ana Catarina Pelicano, Daniela M. Godinho, Maria C. T. Gonçalves, Tiago Castela, Maria Lurdes Orvalho, Vitor Sencadas, Raquel Sebastião, Raquel C. Conceição
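The 3D region growing step named in the abstract above can be illustrated with a minimal sketch. The abstract does not state the acceptance criterion or the connectivity, so the fixed intensity tolerance around the seed value and the 6-connected neighbourhood used here are assumptions, and `region_grow_3d` is a hypothetical name rather than the authors' code.

```python
from collections import deque

# Offsets of the six face-adjacent neighbours of a voxel (assumed connectivity).
NEIGHBOURS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def region_grow_3d(volume, seed, tolerance):
    """Return the set of voxels reachable from `seed` whose intensity
    differs from the seed intensity by at most `tolerance` (assumed rule).

    `volume` is a nested list indexed as volume[z][y][x].
    """
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    z0, y0, x0 = seed
    seed_value = volume[z0][y0][x0]

    region = {seed}
    queue = deque([seed])
    while queue:
        z, y, x = queue.popleft()
        for dz, dy, dx in NEIGHBOURS:
            nz, ny, nx = z + dz, y + dy, x + dx
            # Skip voxels outside the volume or already accepted.
            if not (0 <= nz < depth and 0 <= ny < height and 0 <= nx < width):
                continue
            if (nz, ny, nx) in region:
                continue
            if abs(volume[nz][ny][nx] - seed_value) <= tolerance:
                region.add((nz, ny, nx))
                queue.append((nz, ny, nx))
    return region


# Toy volume: one two-voxel bright blob and one isolated bright voxel.
toy = [[[0 for _ in range(3)] for _ in range(3)] for _ in range(3)]
toy[1][1][1] = 100
toy[1][1][2] = 95
toy[0][0][0] = 100  # same intensity, but not connected to the seed
blob = region_grow_3d(toy, seed=(1, 1, 1), tolerance=10)
```

In the pipeline the abstract describes, the seeds would come from the centers of the k-means regions, and one such call per seed would yield the candidate 3D structures.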
Mortality Prediction via Logistic Regression in Intensive Care Unit Patients with Pneumonia
This work focuses on the problem of mortality prediction in patients with pneumonia after admission into an intensive care unit, by addressing it via logistic regression. This approach can model the relationship between clinical correlates and the probability of the binary outcome, with obvious advantages such as simplicity and interpretability of the predictive models. This work further inspects the potential of localized models, an approach based on different (parallel) predictive models, each one constructed in clusters automatically identified in the training set. The predicted outcome is then obtained via membership separation (M, which corresponds to the outcome of the closest localized model) or weights (W, outcome as the weighted average of localized outcomes via inverse distance). The results point out a similar balanced accuracy of 0.73 for the global model M24-48PS (without oversampling) and the W M24-48PSC model (weighted average of localized models without oversampling), which is partially explained by the small separability between the identified clusters. Therefore, a proof of concept was performed to support the usefulness of localized models in more separable data. This study considered a small amount of data for training and testing (chosen as that closest to the centroids of the identified clusters) and the results suggest that the localized approach can outperform the global one in more separable data.
Nuno Pedrosa, Sónia Gouveia
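The two ways of combining localized models described in the abstract above (M, closest model; W, inverse-distance weighted average) can be sketched as follows. This is only an illustration under assumptions: Euclidean distance to cluster centroids, one logistic model per cluster given as a `(weights, bias)` pair, and the function names are all hypothetical and not taken from the paper.

```python
import math


def logistic(weights, bias, x):
    """Probability of the positive outcome under one logistic model."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def predict_membership(x, centroids, models):
    """M variant: use only the model of the closest cluster centroid."""
    distances = [math.dist(x, c) for c in centroids]
    nearest = distances.index(min(distances))
    weights, bias = models[nearest]
    return logistic(weights, bias, x)


def predict_weighted(x, centroids, models, eps=1e-9):
    """W variant: average all localized outputs with inverse-distance weights.

    `eps` avoids division by zero when x coincides with a centroid.
    """
    inverse = [1.0 / (math.dist(x, c) + eps) for c in centroids]
    total = sum(inverse)
    return sum(
        (inv / total) * logistic(weights, bias, x)
        for inv, (weights, bias) in zip(inverse, models)
    )


# Toy setup: two clusters whose models output constant 0.5 and 0.75.
centroids = [(0.0, 0.0), (4.0, 0.0)]
models = [((0.0, 0.0), 0.0), ((0.0, 0.0), math.log(3.0))]
patient = (1.0, 0.0)  # distance 1 to the first centroid, 3 to the second

p_m = predict_membership(patient, centroids, models)
p_w = predict_weighted(patient, centroids, models)
```

With well-separated clusters the two variants differ more, since W blends in distant models only weakly while M ignores them entirely.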
ECG Feature-Based Classification of Induced Pain Levels
Appropriate pain treatment relies on an accurate assessment of pain. Limitations regarding subjective reporting of pain or observational bias, when pain is assessed by a healthcare professional, can lead to inadequate pain treatment. Therefore, pain assessment using physiological signals has been studied in past years due to the importance of objective measurement. The aim of this work is to use features extracted from Electrocardiogram (ECG) signals to classify pain induced by a Cold Pressor Task (CPT). Specifically, the goal is to determine the optimal hyperparameters of the classification algorithms and the optimal features for accurately distinguishing between higher and lower levels of pain. A model combining 15 ECG-features related to the P, R, S, and T waves and the Random Forest algorithm provided the best performance for predicting induced pain levels. This model achieved an accuracy of 95.3%, an F1-score of 94.0%, a precision of 97.9%, and a recall of 90.4%. These results show the feasibility of identifying pain through the physiological characteristics of the ECG.
Daniela Pais, Raquel Sebastião
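As a quick consistency check on the figures reported in the abstract above, the F1-score is the harmonic mean of precision and recall, so it can be recomputed from the two reported values. The helper name `f1_score` below is just illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Reported precision 97.9% and recall 90.4%.
f1_percent = round(100 * f1_score(0.979, 0.904), 1)
print(f1_percent)  # 94.0
```

The recomputed value matches the reported F1-score of 94.0%.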
Leveraging Model Fusion for Improved License Plate Recognition
License Plate Recognition (LPR) plays a critical role in various applications, such as toll collection, parking management, and traffic law enforcement. Although LPR has witnessed significant advancements through the development of deep learning, there has been a noticeable lack of studies exploring the potential improvements in results by fusing the outputs from multiple recognition models. This research aims to fill this gap by investigating the combination of up to 12 different models using straightforward approaches, such as selecting the most confident prediction or employing majority vote-based strategies. Our experiments encompass a wide range of datasets, revealing substantial benefits of fusion approaches in both intra- and cross-dataset setups. Essentially, fusing multiple models reduces considerably the likelihood of obtaining subpar performance on a particular dataset/scenario. We also found that combining models based on their speed is an appealing approach. Specifically, for applications where the recognition task can tolerate some additional time, though not excessively, an effective strategy is to combine 4–6 models. These models may not be the most accurate individually, but their fusion strikes an optimal balance between accuracy and speed.
Rayson Laroca, Luiz A. Zanlorensi, Valter Estevam, Rodrigo Minetto, David Menotti
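The two fusion strategies mentioned in the abstract above, selecting the most confident prediction and majority voting, can be sketched in a few lines. The input format (a list of `(plate_text, confidence)` pairs, one per model) and the tie-breaking rule are assumptions made for this illustration; the paper may handle both differently.

```python
from collections import Counter


def fuse_most_confident(predictions):
    """Return the plate text of the single most confident model."""
    return max(predictions, key=lambda p: p[1])[0]


def fuse_majority_vote(predictions):
    """Return the plate text predicted by the most models.

    Ties are broken by the highest confidence among the tied texts
    (an assumed rule, chosen only to make the sketch deterministic).
    """
    counts = Counter(text for text, _ in predictions)
    best_count = max(counts.values())
    tied = {text for text, count in counts.items() if count == best_count}
    candidates = [p for p in predictions if p[0] in tied]
    return max(candidates, key=lambda p: p[1])[0]


# Three hypothetical model outputs for the same plate image.
outputs = [("ABC1234", 0.90), ("ABC1234", 0.80), ("A8C1234", 0.95)]
by_confidence = fuse_most_confident(outputs)
by_vote = fuse_majority_vote(outputs)
```

In this toy case the two strategies disagree: the single most confident model reads `A8C1234`, while the majority reads `ABC1234`.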
Enhancing Object Detection in Maritime Environments Using Metadata
Over the years, many solutions have been suggested in order to improve object detection in maritime environments. However, none of these approaches uses flight information, such as altitude, camera angle, time of the day, and atmospheric conditions, to improve detection accuracy and network robustness, even though this information is often available and captured by the UAV. This work aims to develop a network unaffected by image-capturing conditions, such as altitude and angle. To achieve this, metadata was integrated into the neural network, and an adversarial learning training approach was employed. This was built on top of YOLOv7, which is a state-of-the-art real-time object detector. To evaluate the effectiveness of this methodology, comprehensive experiments and analyses were conducted. Findings reveal that the improvements achieved by this approach are minimal when trying to create networks that generalize more across these specific domains. The YOLOv7 mosaic augmentation was identified as one possible cause of this minimal impact because it also enhances the model’s ability to become invariant to these image-capturing conditions. Another potential cause is the fact that the domains considered (altitude and angle) are not orthogonal with respect to their impact on captured images. Further experiments should be conducted using datasets that offer more diverse metadata, such as adverse weather and sea conditions, which may be more representative of real maritime surveillance conditions. The source code of this work is publicly available at https://github.com/ipleiria-robotics/maritime-metadata-adaptation.
Diogo Samuel Fernandes, João Bispo, Luís Conde Bento, Mónica Figueiredo
Streaming Graph-Based Supervoxel Computation Based on Dynamic Iterative Spanning Forest
Streaming video segmentation decreases processing time by creating supervoxels from small parts of the video instead of the whole video content. Building on the good performance of the Iterative Spanning Forest to compute Supervoxels (ISF2SVX) framework, which is based on the Dynamic Iterative Spanning Forest (DISF), we propose StreamISF, a new graph-based streaming video segmentation method for supervoxel generation. It is based on a pipeline composed of six stages: (1) formation of the graph for each block of the video; (2) seed oversampling; (3) IFT-based supervoxel design; (4) reduction in the number of supervoxels; (5) spread of trees; and (6) creation of the segmented video. The difference in our proposed method is that it does not need the whole video in memory, and the only previous information necessary to segment a block is the intersection frame between the blocks. Moreover, experimental results show that StreamISF creates supervoxels that maintain temporal coherence, producing very competitive measures compared to the state-of-the-art. Our code is publicly available at https://github.com/IMScience-PPGINF-PucMinas/StreamISF.
Danielle Vieira, Isabela Borlido Barcelos, Felipe Belém, Zenilton K. G. Patrocínio Jr., Alexandre X. Falcão, Silvio Jamil F. Guimarães
Improving Pest Detection via Transfer Learning
Pest monitoring models play a vital role in enabling informed decisions for pest control and effective management strategies. In the context of smart farming, various approaches have been developed, surpassing traditional techniques in both efficiency and accuracy. However, the application of Few-Shot Learning (FSL) methods in this domain remains limited. In this study, we aim to bridge this gap by leveraging Transfer Learning (TL). Our findings highlight the considerable efficacy of TL techniques in this context, showcasing a significant 24% improvement in mAP performance and a 10% reduction in training time, thereby enhancing the efficiency of the model training process.
Dinis Costa, Catarina Silva, Joana Costa, Bernardete Ribeiro
Do Emotional States Influence Physiological Pain Responses?
Pain is a highly subjective and complex phenomenon. Current methods used to measure pain mostly rely on the patient’s description, which may not always be possible. For this reason, pain recognition systems based on body language and physiological signals have emerged. As the emotional state of a person can also influence the way pain is perceived, in this work, a protocol for pain induction with prior emotional elicitation was conducted. Electrocardiogram (ECG), Electrodermal Activity (EDA) and Electromyogram (EMG) signals were collected during the protocol. Besides the physiological responses, perception was also assessed through reported scores (using a numeric scale) and times for pain tolerance. In this protocol, 3 different emotional elicitation sessions, negative, positive and neutral, were performed through video excerpts from horror, comedy and documentary movies, respectively, and pain was induced using the Cold Pressor Task (CPT). A total of 56 participants took part in the study (with 54 completing all three sessions). The results showed that during the negative emotional state, reported pain scores were higher and pain threshold and tolerance times were shorter than during the positive state. As expected, the physiological response to pain remained similar regardless of the emotional elicitation.
Bruna Alves, Catarina Silva, Raquel Sebastião
Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications
Verónica Vasconcelos
Inês Domingues
Simão Paredes