
2025 | Book

Cloud Computing, Big Data and Emerging Topics

12th Conference, JCC-BD&ET 2024, La Plata, Argentina, June 25–27, 2024, Revised Selected Papers

Edited by: Marcelo Naiouf, Laura De Giusti, Franco Chichizola, Leandro Libutti

Publisher: Springer Nature Switzerland

Book series: Communications in Computer and Information Science


About this book

This volume, CCIS 2189, constitutes the refereed proceedings of the 12th Conference on Cloud Computing, Big Data and Emerging Topics, JCC-BD&ET 2024, held in La Plata, Argentina, during June 25–27, 2024.

The 12 full papers presented were carefully reviewed and selected from 37 submissions. They were categorized under the topical sections as follows: Parallel and Distributed Computing, Machine and Deep Learning, Smart Cities and E-Government, Visualization, Emerging Topics, Innovation in Computer Science Education, Computer Security.

Table of Contents

Frontmatter

Parallel and Distributed Computing

Frontmatter
Fast Genomic Data Compression on Multicore Machines
Abstract
Genomics has gained relevance because it enables preventing, diagnosing, and treating diseases in a personalized way. The reduction in sequencing time and cost has increased demand and, thus, the amount of genomic data that must be stored or transferred. Consequently, it becomes necessary to develop genome compression algorithms that reduce storage usage without consuming too much time, which is now possible thanks to modern multicore machines. This paper improves MtHRCM, a multi-threaded compression algorithm for large collections of genomes, by reducing its sequential component in order to enhance performance and scalability. Experimental results show that our optimized version is faster than MtHRCM and achieves the same compression ratio. They also reveal that the new version scales well as the number of threads/cores increases for smaller test collections, while the high number of simultaneous I/O requests to disk limits scalability for larger test collections.
Victoria Sanz, Adrián Pousa, Marcelo Naiouf, Armando De Giusti
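
The chunk-parallel compression idea behind such multi-threaded compressors can be illustrated with a minimal sketch (this is generic zlib chunking, not the MtHRCM algorithm itself; chunk size and worker count are illustrative):

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

def compress_chunk(chunk: bytes) -> bytes:
    # Each chunk is compressed independently; zlib releases the GIL,
    # so threads give real parallelism for this workload.
    return zlib.compress(chunk, 6)

def parallel_compress(data: bytes, n_workers: int = 4, chunk_size: int = 1 << 16):
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # map() preserves chunk order, so the archive can be rebuilt sequentially.
        return list(pool.map(compress_chunk, chunks))

def parallel_decompress(blocks) -> bytes:
    return b"".join(zlib.decompress(b) for b in blocks)
```

Per-chunk compression trades a little compression ratio for near-linear scaling, which mirrors the parallelism/ratio trade-off discussed in the abstract.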

Machine and Deep Learning

Frontmatter
Deep Learning-Based Instance Segmentation of Neural Progenitor Cell Nuclei in Fluorescence Microscopy Images
Abstract
In this work, a Deep Learning-based machine vision model was developed for the detection, segmentation, and counting of Neural Progenitor Cell nuclei in fluorescence microscopy images. The cells were obtained from adult mice and cultivated in vitro, with cellular nuclei labeled using DAPI dye. Convolutional neural networks for instance segmentation, specifically the Mask R-CNN model with ResNet-50 and ResNet-101 backbones, were trained to recognize the nuclei, and their results were evaluated. Nuclei labeling was implemented semi-automatically, applying a Superpixel technique and then refining the segmentations through a manual process, also using a pre-trained model, which made it possible to assemble a dataset of 66 images with 6392 labels in total. The results obtained with the ResNet-50 backbone show an agreement of 98.6% between the specialist count and the model-predicted count, in addition to an mAP50 of 98.0%. This approach has the potential to significantly reduce the time and effort required to analyze large image sets, which is especially useful in studies that require repetitive and detailed cellular analysis.
Gabriel Pérez, Claudia Cecilia Russo, Maria Laura Palumbo, Alejandro David Moroni
Object Recognition Models for Indoor Users’ Location
Abstract
Despite technological advances, precise positioning within buildings remains a considerable challenge. In this context, the present paper explores user location in indoor spaces using object recognition models executed directly on mobile devices. Our proposal is based on designing a generic solution architecture adaptable to any physical environment, enabling the definition and use of relevant generic objects within the environment to determine a user's current location. The proposal uses Computer Vision, employing object recognition models for positioning. This kind of indoor positioning benefits from the growth of smartphones' functionalities and capabilities, thus avoiding the need to install additional infrastructure in physical spaces. A specific implementation of this architecture for React Native is presented, using the TensorFlow platform to support object recognition. This implementation demonstrates how the positioning works through concrete use cases. In addition, some lessons learned are discussed, which we hope will contribute to this topic.
Franco M. Borrelli, Cecilia Challiol
CB-RISE: Improving the RISE Interpretability Method Through Convergence Detection and Blurred Perturbations
Abstract
This paper presents significant advancements in the RISE (Randomized Input Sampling for Explanation) algorithm, a popular black-box interpretability method for image data. RISE's main weakness lies in the large number of model evaluations required to produce the importance heatmap. Furthermore, RISE's strategy of occluding image regions with black patches is not advisable, as it may lead to unexpected predictions. Therefore, we introduce two new versions of the algorithm, C-RISE and CB-RISE, each incorporating novel features to address the two major challenges of the original implementation. C-RISE introduces convergence detection based on Welford's algorithm, which reduces the computational burden by ceasing computations once the importance map stabilizes. CB-RISE additionally introduces blurred masks as perturbations, equivalent to applying Gaussian noise, instead of black patches. This allows for a more nuanced representation of the model's decision-making process. Our experimental results demonstrate the effectiveness of these improvements: the generated heatmaps qualitatively improve in quality, and the method achieves a speedup of approximately 3×.
Oscar Stanchi, Franco Ronchetti, Pedro Dal Bianco, Gastón Rios, Santiago Ponte Ahon, Waldo Hasperué, Facundo Quiroga
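
Welford's online algorithm, on which C-RISE's convergence detection builds, maintains a running mean and variance in a single pass. A minimal sketch (the stopping rule based on the standard error, and its thresholds, are illustrative, not the paper's exact criterion):

```python
class WelfordAccumulator:
    """Online mean/variance via Welford's algorithm, one value at a time."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def variance(self) -> float:
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0

def has_converged(acc: WelfordAccumulator, tol: float = 1e-3,
                  min_samples: int = 30) -> bool:
    # Stop sampling once the standard error of the mean drops below tol.
    if acc.n < min_samples:
        return False
    return (acc.variance() / acc.n) ** 0.5 < tol
```

In a RISE-style loop, one accumulator per heatmap cell would be updated after each random mask, and sampling would stop once the cells stabilize.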
Wavelength Calibration of Historical Spectrographic Plates with Dynamic Time Warping
Abstract
The Facultad de Ciencias Astronómicas y Geofísicas of the Universidad Nacional de La Plata holds 15,000 spectroscopic records on glass plates containing valuable and unique astronomical data.
Currently, processing these plates requires a complex manual process that involves several stages and takes several hours per plate. In particular, wavelength calibration requires determining the wavelength range in which the data were observed. This is achieved by matching the spectrum of the comparison lamp on the plate to a reference spectrum. Since often neither the metadata of the lamps nor the physical lamps are available, automating the task requires a semi-blind approach that uses simulated data as a reference. However, the simulated data differ significantly from the physical lamps: many peaks predicted by theoretical calculations are rarely observed in practice, and conversely, the physical lamps and spectrograph carry imperfections that cause unexpected peaks.
In this work, we propose a wavelength calibration pipeline that enables automated matching of the wavelengths of the comparison lamps via Dynamic Time Warping (DTW) between the samples and simulated data. Our best model achieves a 93% average Intersection-over-Union (IoU) over a set of 32 manually calibrated plates.
Santiago Andres Ponte Ahón, Juan Martín Seery, Facundo Quiroga, Franco Ronchetti, Oscar Stanchi, Pedro Dal Bianco, Waldo Hasperué, Yael Aidelman, Roberto Gamen
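
Dynamic Time Warping, the matching technique the pipeline relies on, can be sketched in a few lines. This is the textbook O(n·m) dynamic-programming formulation over 1-D sequences, not the authors' full calibration pipeline:

```python
def dtw_distance(a, b):
    """Classic DTW: minimal cumulative |a_i - b_j| cost over monotone alignments."""
    n, m = len(a), len(b)
    INF = float("inf")
    # cost[i][j] = best alignment cost of a[:i] against b[:j]
    cost = [[INF] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # step in a only
                                 cost[i][j - 1],      # step in b only
                                 cost[i - 1][j - 1])  # match both
    return cost[n][m]
```

Because DTW allows one sample to align with several reference points, it tolerates the stretching and missing/extra peaks that separate physical lamp spectra from simulated ones.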
An Empirical Method for Processing I/O Traces to Analyze the Performance of DL Applications
Abstract
The exponential growth of data handled by Deep Learning (DL) applications has led to an unprecedented demand for computational resources, necessitating their execution on High Performance Computing (HPC) systems. However, understanding and optimizing the Input/Output (I/O) of DL applications can be challenging due to the complexity and scale of DL workloads and the heterogeneous nature of I/O operations. This paper addresses this issue by proposing an I/O trace processing method that simplifies the generation of reports on global I/O patterns and performance to aid I/O performance analysis. Our approach focuses on understanding the temporal and spatial distributions of I/O operations and relating them to behavior at the I/O system level. The proposed method enables us to synthesize and extract key information from the reports generated by tools such as Darshan and the seff command. These reports offer a detailed view of I/O performance, providing a set of metrics that deepen our understanding of the I/O behavior of DL applications.
Edixon Parraga, Betzabeth Leon, Sandra Mendez, Dolores Rexachs, Remo Suppi, Emilio Luque
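
The kind of temporal aggregation such a trace-processing method performs can be illustrated with a small sketch. The record format here, (operation, start second, bytes), is hypothetical, standing in for fields extracted from Darshan logs:

```python
from collections import defaultdict

def aggregate_io(records):
    """Group (op, second, nbytes) trace records into per-second totals per op.

    Returns {op: {second: total_bytes}} -- a simple temporal distribution
    from which throughput-over-time reports can be derived.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for op, second, nbytes in records:
        totals[op][second] += nbytes
    return {op: dict(per_sec) for op, per_sec in totals.items()}

# Hypothetical trace records for illustration only.
trace = [
    ("read", 0, 4096), ("read", 0, 4096),
    ("write", 1, 8192), ("read", 1, 1024),
]
```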

Smart Cities and E-Government

Frontmatter
Industry 5.0. Digital Twins in the Process Industry. A Bibliometric Analysis
Abstract
In the context of industrial digitalization, the Industry 5.0 model incorporates digital twins as an innovative tool. This study aims to delve deeper into the concept of digital twins, their integration with the Industrial Internet of Things (IIoT), and how these solutions contribute to bringing intelligence into industrial environments.
Digitalization in industry enables connected products and processes to enhance the productivity and efficiency of people, plants, and equipment. The outcomes of these improvements should have widespread impacts on both the economy and the environment. As connected products and processes generate data, this data is increasingly viewed as a fundamental source of competitive advantage, posing new challenges in industrial environments.
The article examines digital twin technology, its integration with IIoT, and the intelligence of these devices in the Industry 5.0 or smart manufacturing framework. The focus lies on discussing the contribution of digital twins to optimizing industrial processes.
The paper reviews relevant articles and conducts a bibliometric analysis of key topics surrounding digital twins as a value-added solution for process optimization within the Industry 5.0 paradigm. The primary findings highlight the growing significance of this subject since 2018, as evidenced by the number of published articles in the Scopus database. Additionally, the study underscores the complexity of applying digital twins to this issue within the industrial environment.
Federico Walas Mateo, Armando De Giusti

Visualization

Frontmatter
An ABMS COVID-19 Propagation Model for Hospital Emergency Departments
Abstract
The spread of COVID-19 between different agents in a hospital emergency department can be simulated by modeling the interactions between the agents and the environment. In this research, we use Agent Based Modeling and Simulation techniques to build a model of COVID-19 propagation based on an Emergency Department simulator that has been tested and validated previously. The benefits of ABM include its ability to simulate complex systems, its flexibility, and its ability to model the interactions between different agents in the system. The resulting model will allow us to build a propagation simulator for creating virtual environments, with the aim of analyzing how the interactions between agents influence the rate of virus transmission. The model can be used to study the effectiveness of different interventions, such as social distancing, wearing masks, and vaccination, in reducing the spread of COVID-19.
Morteza Ansari Dogaheh, Manel Taboada, Francisco Epelde, Emilio Luque, Dolores Rexachs, Alvaro Wong
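
The core mechanism of such an agent-based propagation model, stochastic transmission during agent contacts, can be sketched minimally. The agent states, contact list, and transmission probability below are illustrative, not the paper's validated simulator:

```python
import random

def step(states, contacts, p_transmit, rng):
    """One simulation tick: each (i, j) contact may infect a susceptible agent.

    states: list of 'S' (susceptible) or 'I' (infected), one entry per agent.
    contacts: list of (i, j) index pairs that interact during this tick.
    """
    new_states = list(states)
    for i, j in contacts:
        # Transmission only happens in mixed S/I contacts, with probability p.
        if {states[i], states[j]} == {"S", "I"} and rng.random() < p_transmit:
            new_states[i] = new_states[j] = "I"
    return new_states
```

Interventions such as masks or distancing would map onto lowering `p_transmit` or thinning the contact list, which is exactly the kind of what-if analysis the abstract describes.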

Emerging Topics

Frontmatter
QuantumUnit: A Proposal for Classic Multi-qubit Assertion Development
Abstract
In this work, we present a systematic approach to implementing basic assertions in quantum circuits in order to verify classical states across any number of qubits. Our methodology utilizes fundamental quantum gates available on all gate-based quantum computing platforms. Through strategic combinations of these gates, we construct the ‘Equal to’, ‘Different than’, ‘Greater than’, and ‘Lower than’ comparators. By introducing single-qubit comparators and demonstrating how they can be expanded to handle multiple qubits, we provide a scalable method for verifying classical states in quantum circuits. This work provides the groundwork for robust testing procedures in quantum computing, a critical factor in the continued growth and reliability of this rapidly advancing field.
Ignacio García-Rodríguez de Guzmán, Antonio García de la Barrera Amo, Manuel Ángel Serrano, Macario Polo, Mario Piattini
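
For classical (non-superposed) states, the 'Equal to' comparator can be traced with plain bits: CNOT each a_i onto b_i (so b_i becomes a_i XOR b_i, which is 0 exactly where the bits match), apply X to each b_i, and a multi-controlled X then flips a |0⟩ ancilla only when every b_i is 1, i.e. when the registers were equal. A classical-bit sketch of this standard gate construction (not the authors' exact circuits):

```python
def equal_comparator(a_bits, b_bits):
    """Classical trace of a quantum 'Equal to' assertion circuit.

    Returns the ancilla bit: 1 iff a_bits == b_bits.
    """
    b = list(b_bits)
    for i, a in enumerate(a_bits):
        if a:                       # CNOT(a_i -> b_i): b_i := a_i XOR b_i
            b[i] ^= 1
    b = [bit ^ 1 for bit in b]      # X on every b_i: now b_i == 1 iff a_i == b_i
    return 1 if all(b) else 0       # multi-controlled X onto a |0> ancilla
```

Because every step is built from CNOT, X, and a multi-controlled X, the same construction runs unchanged on any gate-based quantum platform, which is the portability point the abstract emphasizes.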
Tool for Quantum-Classical Software Lifecycle
Abstract
With the growing diffusion of quantum computing technology and the increasingly promising applications derived from it, the relevance of developing specific software for these systems is gaining significant momentum. This surge is due to the need to design and produce quantum software that not only meets performance and functional requirements but also follows the well-known good practices and rigorous methodologies inherent in quantum software engineering. In this context, one of the main challenges facing the development of hybrid (quantum-classical) systems is the effective management of the lifecycle of this new type of software, whose nature differs from traditional systems. The proposed research attempts to comprehensively address the lifecycle management of hybrid software through the design and development of a specific support tool. To achieve this goal, the ICSM (Integrated Software Cycle Management) model, a consolidated framework for the lifecycle management of traditional software, will be taken as a starting point. This model will be carefully adapted to meet the unique needs and challenges inherent in hybrid software, thus ensuring that the development, maintenance, and updating practices for this type of software are as robust and efficient as those applied in the realm of conventional software. Through this adaptation, the aim is not only to improve the quality of the developed hybrid software but also to make it easier for developers to adopt the innovative and complex quantum software paradigm.
Jesús Párraga Aranda, Ricardo Pérez del Castillo, Mario Piattini

Innovation in Computer Science Education

Frontmatter
Strategies to Predict Students’ Exam Attendance
Abstract
This article presents a study on predicting student attendance at exams in a university setting. The study focused on the Concept of Algorithms, Data, and Programs course, a foundational course in the systems bachelor's program. Two models were constructed, linear regression and polynomial regression of degree 3, aimed at predicting the total number of attendees and the number of students who would pass the exam. We built a dataset that included information on student enrollment, previous exam attendance, grades, and other relevant factors. Students were classified into three groups: reduced exam, complete exam with prior attendance, and complete exam without prior attendance. The results showed that the models' predictions were accurate enough to be used to ensure appropriate classroom occupancy without overcrowding or empty rooms. The models guided the allocation of students, optimizing space utilization while providing available seats for attending students. The study identified opportunities for improvement. One limitation was the assignment of attendance probabilities to achieve the overall predicted attendance. Future work could involve predicting attendance rates for each group individually. Additionally, implementing a classification model to categorise students into pass, fail, insufficient, and non-attendance groups would provide a more comprehensive understanding of student outcomes.
Gonzalo L. Villarreal, Verónica Artola
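
The regression approach can be illustrated with a minimal sketch: a closed-form simple linear regression predicting attendees from enrollment. The numbers below are invented for illustration; the paper's models use richer features and also a degree-3 polynomial variant:

```python
def fit_linear(xs, ys):
    """Ordinary least squares for y = a*x + b, in closed form."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sxy / sxx                 # slope
    b = mean_y - a * mean_x       # intercept
    return a, b

# Hypothetical data: enrolled students vs. exam attendees per term.
enrolled = [120, 150, 180, 200]
attended = [80, 100, 121, 135]
slope, intercept = fit_linear(enrolled, attended)
predicted = slope * 170 + intercept   # expected attendance for 170 enrolled
```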

Computer Security

Frontmatter
Prediction of TCP Firewall Action Using Different Machine Learning Models
Abstract
In today’s world, the issues associated with network security have increased by a tremendous amount. For instance, cyber-attacks during different types of network transmissions are one of the major problems associated with security. A firewall can be used to provide protection against unauthorized traffic during these transmissions. Our proposed solution uses several different machine learning techniques to predict the TCP firewall action on the basis of the transmission characteristics of the TCP model. The main idea is to study different features of a TCP transmission, such as Source Port, Destination Port, Elapsed Time (in seconds), NAT Source Port, and NAT Destination Port. Based on the analysis performed on these features, the TCP firewall action is classified into one of four categories: Allow, Deny, Drop, and Reset-Both. In this project, nine different machine learning models have been trained using an available dataset. The dataset has over 65,000 rows, where each row represents a TCP transmission described by around 11 different features. Examples of the machine learning models used include the Decision Tree Classifier, Random Forest Classifier, XGBoost Model, and Gradient Boosting Model.
Amit Kumar Bairwa, Akshit Kamboj, Sandeep Joshi, Pljonkin Anton Pavlovich, Saroj Hiranwal
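
The shape of this classification task can be illustrated with a minimal nearest-neighbor sketch. The feature rows and labels below are invented for illustration and are far simpler than the 65,000-row dataset and the nine models the paper evaluates:

```python
def predict_action(train_rows, train_labels, query):
    """1-nearest-neighbor: return the label of the closest training row
    (squared Euclidean distance over the numeric features)."""
    def dist2(row):
        return sum((a - b) ** 2 for a, b in zip(row, query))
    best = min(range(len(train_rows)), key=lambda i: dist2(train_rows[i]))
    return train_labels[best]

# Hypothetical (source_port, dest_port, elapsed_s) rows with firewall actions.
rows = [(443, 55000, 30), (22, 60000, 5), (3389, 51000, 0), (443, 52000, 45)]
labels = ["allow", "allow", "deny", "allow"]
```

A real pipeline would scale the features first (port numbers dominate raw Euclidean distance) and use stronger models such as the tree ensembles listed in the abstract.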
Backmatter
Metadata
Title
Cloud Computing, Big Data and Emerging Topics
Edited by
Marcelo Naiouf
Laura De Giusti
Franco Chichizola
Leandro Libutti
Copyright year
2025
Electronic ISBN
978-3-031-70807-7
Print ISBN
978-3-031-70806-0
DOI
https://doi.org/10.1007/978-3-031-70807-7