
Advances in Computational Collective Intelligence

17th International Conference, ICCCI 2025, Ho Chi Minh City, Vietnam, November 12–15, 2025, Proceedings, Part II

  • 2026
  • Book

About this book

This two-volume set CCIS 2747-2748 constitutes the refereed proceedings of the 17th International Conference on Computational Collective Intelligence, ICCCI 2025, held in Ho Chi Minh City, Vietnam, during November 12–15, 2025.

The 67 full papers included in this book were carefully reviewed and selected from 290 submissions. The papers are organized in the following topical sections:

Part I: Collective Intelligence and Collective Decision-Making; Cooperative Strategies for Decision Making and Optimization; Computational Intelligence for Digital Content Understanding; Data Fusion and Application for Industry 4.0; and Natural Language Processing.

Part II: Deep Learning Techniques; Social Networks and Intelligent Systems; Computational Intelligence in Medical Applications; Data Mining and Machine Learning; and Cybersecurity, Blockchain Technology and Internet of Things.

Table of Contents

  1. Frontmatter

  2. Deep Learning Techniques

    1. Frontmatter

    2. AERNN4RP: An Attention-Enriched Recurrent Neural Network for Rainfall Prediction

      Vu Nguyen, Tham Vo
      Abstract
      Over the past several decades, researchers have explored various predictive methods to accurately forecast annual and monthly rainfall for specific regions, recognizing its critical role in water resource management, agriculture, and manufacturing. Reliable precipitation predictions also yield valuable insights into climatological factors and potential future climate change effects. Recent advances in deep learning, particularly recurrent neural network (RNN) architectures, have improved long-range temporal representation learning for rainfall forecasting. However, existing RNN-based models often struggle with delayed or noisy input sequences and fail to adequately capture the relative importance of different input sequences across multiple time steps. To address these limitations, we propose AERNN4RP, a novel RNN framework incorporating multiple levels of attention. AERNN4RP incorporates self-adaptive evolutionary and self-supervising attention mechanisms alongside a sequential auto-encoding strategy to extract complex temporal features while mitigating noise and data uncertainty, thereby enhancing forecasting accuracy. Extensive experiments on real-world rainfall datasets demonstrate the effectiveness and practical benefits of this approach.
    3. Drone-Based Autonomous Navigation and Victim Detection Using Deep Learning Technique for Non-GPS In-Building Environments

      Trong Tuan Do, Ba Dong Nguyen, Dinh Duy Nguyen, Duc Minh Nguyen, Thanh Trung Nguyen, Quang Khiem Nguyen
      Abstract
      In today’s rapidly evolving society, the swift construction of high-rise buildings has been accompanied by an increased risk of incidents that jeopardize human safety. Emergencies—such as fires or structural collapses—in these tall structures pose significant challenges, as compromised communication systems and obstructed GPS signals often hinder external access. Consequently, employing devices that operate independently of local infrastructures is essential. However, enabling these devices to navigate and detect victims in post-incident environments autonomously—characterized by altered spatial configurations and numerous obstacles—remains a formidable challenge. The necessity for efficient and timely search and rescue operations in indoor or non-GPS environments, such as factories, warehouses, and high-rise buildings, is growing worldwide. Traditional rescue methods in these constrained and complex settings are time-consuming and hazardous, underscoring the demand for more autonomous, technology-driven solutions. Unmanned Aerial Vehicles (UAVs), or drones, have emerged as a promising technology in this regard, offering both a strategic aerial perspective and agile maneuverability. This study presents an integrated UAV system powered by deep learning to enhance search and rescue operations in GPS-denied in-building environments. The primary objectives are to enable autonomous navigation and real-time victim detection within complex, obstacle-ridden settings. We developed a ResNet-8-based model for efficient obstacle avoidance to meet these goals and employed several YOLO-based architectures to localize victims accurately. Extensive experiments in a real in-building environment demonstrated that the navigation model consistently achieved over 90% accuracy with processing times under 17 ms, while the detection models operated at frame rates up to 45 FPS. 
      The results confirm that the proposed system enhances the safety and efficiency of rescue missions and offers a robust solution for emergency operations in environments where traditional GPS-based navigation is not feasible.
    4. Defensive Strategy for Explainability in Deep Neural Networks Under Adversarial Attacks

      Tuan Trung Mac, Tan Loc Nguyen, Bac Le
      Abstract
      Deep Neural Networks (DNNs) perform excellently on most problems, but many complex models are vulnerable to adversarial attacks, which can mislead the model’s explanation. Recent work indicates that explanation mechanisms can be compromised by an attacker who alters the explanation while the output remains accurate. This lowers the reliability and robustness of the explanation. Our work considers the threat of such an attack to Explainable Artificial Intelligence (xAI) and introduces a mechanism for defending the explanation against such attacks. We propose NODA (Normalization Defense Against Adversaries), based on a Hessian regularizer and data normalization, to enhance the reliability of the explanation. Our investigation validates that NODA is effective in defending against the attack while not damaging the model’s performance.
    5. Balancing Water Quality Using Efficient Deep Learning Configuration for Aquaponics Application

      Quoc Phong Tran, Minh Tai Pham Nguyen, Nguyen Phuc Cam Tu, Minh Khue Phan Tran, Trong Nhan Le
      Abstract
      Monitoring water quality is an important topic in aquaponics, as imbalances in pH level, dissolved oxygen (DO), and nutrient concentrations can negatively affect plant growth and other living organisms. Traditional approaches rely on manual intervention, which makes the process labor-intensive and prone to human error. Therefore, this study proposes a unique architecture called CNN-LSTM-RF, which combines a Convolutional Neural Network, Long Short-Term Memory, and Random Forest. The CNN filters each window to retain its most prominent features, which are transformed and fed to the LSTM to learn the temporal relationships among features, while the Random Forest serves as the final classifier that predicts the pump state. As a result, the proposed model can outperform most of the other deep learning configurations in water quality prediction, reaching up to 62.6% accuracy. A system powered by this deep learning model adapts dynamically to environmental changes, optimizing water conditions with minimal human input.
    6. Comparative Analysis of CNN Architectures for Boxing Punch Detection

      Piotr Stefański
      Abstract
      This paper presents a comparative analysis of convolutional neural network (CNN) architectures for detecting punches in boxing videos, to improve sports performance analysis through computer vision. The study evaluates the effectiveness of four CNN models, custom CNN, ResNet50, Inception v3, and VGG16, in identifying punches, considering factors such as classification precision, computational efficiency, and robustness to real-world challenges such as class imbalance. The custom CNN architecture, developed in previous work, was shown to provide a good balance between accuracy and computational demands. In contrast, ResNet50 performed well in complex scenarios, demonstrating strong feature extraction capabilities, while Inception v3 demonstrated superior efficiency in handling varying input sizes. VGG16, although effective, proved computationally expensive for real-time applications. The models were evaluated using metrics such as balanced accuracy and F1 score, addressing the issue of class imbalance where punches are less frequent (approximately 3%) than non-punch frames. The experiments were performed using a publicly available boxing punch classification dataset and source code, both published by the authors to facilitate reproducibility and further research. The results indicate that CNNs offer promising solutions for automated punch detection, with implications for broader applications in sports analytics, including training, injury prevention, and tactical analysis. Future research directions include exploring more advanced CNN architectures, incorporating temporal information, and improving real-time processing capabilities. This work contributes to the development of efficient, scalable and reliable computer vision systems for the evaluation of sports performance.
    7. ResNet-Based Pandemic Keyword Spotting in Continuous Multilingual Speech: A Study in UNESCO’s Audio Messages for Rapid Health Response

      Samawel Jaballi, Manar Joundy Hazar, Salah Zrigui, Henri Nicolas, Mounir Zrigui
      Abstract
      In response to the urgent need for accelerated decision-making during health crises, this research underscores the pivotal role of keyword spotting (KWS) within continuous multilingual speech frameworks. Our methodology encompasses four critical phases: Dataset Selection, Feature Extraction, Model Training, and Posterior Handling. We assembled a diverse dataset of 42 recordings in English, French, and Arabic, totaling approximately 35 minutes. MFCCs were employed for feature extraction due to their alignment with human auditory perception. For model training, we evaluated two Convolutional Neural Network (CNN) architectures, ResNet-18 and ResNet-152, comparing their performance in recognizing keywords across multilingual contexts. The dataset was preprocessed to include MFCCs and contextual embeddings for predefined keywords using Multilingual BERT, creating an integrated representation for model input. Experimental results focused on accuracy, loss, and F1 score demonstrate that ResNet-18 achieved superior performance with 90.26% accuracy and an F1 score of 95.75%, outperforming ResNet-152, which attained 88.78% accuracy and an F1 score of 88.79%. These results highlight ResNet-18’s effectiveness in multilingual KWS tasks, making it a valuable tool for rapid and accurate decision-making during health crises.
    8. SignBart - New Approach with the Skeleton Sequence for Isolated Sign Language Recognition

      Tinh Nguyen, Minh Khue Phan Tran
      Abstract
      Sign language recognition is crucial for individuals with hearing impairments to break communication barriers. However, previous approaches have had to choose between efficiency and accuracy. Models such as RNNs, LSTMs, and GCNs had problems with vanishing gradients and high computational costs. Despite improving performance, transformer-based methods were not commonly used. This study presents a novel SLR approach that overcomes the challenge of independently extracting meaningful information from the x and y coordinates of skeleton sequences, which traditional models often treat as inseparable. By utilizing the encoder-decoder of the BART architecture, the model independently encodes the x and y coordinates, while Cross-Attention ensures their interrelation is maintained. With only 749,888 parameters, the model achieves 96.04% accuracy on the LSA-64 dataset, significantly outperforming previous models with over one million parameters. The model also demonstrates excellent performance and generalization across WLASL and ASL-Citizen datasets. Ablation studies underscore the importance of coordinate projection, normalization, and using multiple skeleton components for boosting model efficacy. This study offers a reliable and effective approach for sign language recognition, with strong potential for enhancing accessibility tools for the deaf and hard of hearing.
    9. Semantic Communication with Transformer and Knowledge Distillation for Language Translation

      Huy Thanh Nguyen, Thuy An Nguyen, Tung Kieu, Toan Van Nguyen, Phuong Luu Vo
      Abstract
      Semantic communication has emerged as a promising paradigm for bandwidth-constrained or error-prone environments by focusing on the transmission of meaningful representations rather than raw data. In this paper, we propose a novel framework for enabling language translation in semantic communication, allowing end-users to communicate seamlessly in different languages. Specifically, we design a Transformer-based model to facilitate translation between transmitters and receivers. To enhance efficiency without compromising accuracy, we integrate a knowledge distillation technique using a teacher-student architecture to optimize model performance. To evaluate the effectiveness of our framework under realistic conditions, we conduct experiments on an English-Vietnamese dataset through simulated communication channels. Experimental results demonstrate that our approach outperforms conventional communication methods across various simulated environments, highlighting its potential for practical deployment in multilingual semantic communication systems.
  3. Social Networks and Intelligent Systems

    1. Frontmatter

    2. Will They Spread It or Unmask It? Designing a Method for Questionnaire Specific Data Characterisation - Study Case on Users’ Reactance to Fake News

      Mihaela Colhon, Irina-Valentina Tudor, Cristina-Mihaela Tudorache, Constantin-Cristian Dinu, Cristina Popîrlan, Gabriel Stoian
      Abstract
      Designing a questionnaire that can provide valuable insights in order to characterize its respondents usually involves a systematic approach to gather, process, analyze and interpret the resulting data. For this study, we created a scenario in which young adult respondents are given the role of expert social media users and are asked to record their feedback in a simulated fake news spreading situation. The situation was designed to target the most sensitive problems of the moment, such as pandemics, wars or easy money rewards on social platforms, in order to trigger their reactance. Using inferential statistics, we identified three groups within the respondent pool and, based on their attributes, defined three user profiles: the two extreme profiles show significant differences in terms of social media behavior, while the middle one shares characteristics of both extremes. These results can be further used in a decision-making process in order to identify the users’ reactance to fake news based on their social media behavior profile.
    3. Temporal Dynamic Networks in Infectious Disease Modeling: A Review of Methods and Applications

      Botambu Collins, Jia Yang, Dinh Tuyen Hoang, Jin-Taek Seong
      Abstract
      The increasing complexity of infectious disease transmission calls for advanced modeling frameworks that move beyond static assumptions. Temporal Dynamic Networks (TDNs) have emerged as a useful tool for capturing the dynamics of contact structures that drive epidemic spread. This survey critically examines the state-of-the-art applications of TDNs in modeling infectious diseases, focusing on computational advances such as machine learning, agent-based models, graph neural networks, and hybrid models. We shed light on the theoretical insights that can enhance the understanding and application of these frameworks. This study discusses real-world applications of these models in epidemiology and contends that more concerted effort is required by stakeholders to combat and mitigate the risk of disease resurgence.
    4. GeeSanBhava: Sentiment Tagged Sinhala Music Video Comment Data Set

      Yomal De Mel, Nisansa de Silva
      Abstract
      This study introduces GeeSanBhava, a high-quality data set of Sinhala song comments extracted from YouTube and manually tagged using Russell’s Valence-Arousal model by three independent human annotators. The human annotators achieve a substantial inter-annotator agreement (Fleiss’ kappa = 84.96%). The analysis revealed distinct emotional profiles for different songs, highlighting the importance of comment-based emotion mapping. The study also addressed the challenges of comparing comment-based and song-based emotions, mitigating biases inherent in user-generated content. A number of machine learning and deep learning models were pre-trained on a related large data set of Sinhala News comments in order to report the zero-shot result on our Sinhala YouTube comment data set. An optimized Multi-Layer Perceptron model, after extensive hyperparameter tuning, achieved a ROC-AUC score of 0.887. The model is a three-layer MLP with a configuration of 256, 128, and 64 neurons. This research contributes a valuable annotated data set and provides insights for future work in Sinhala Natural Language Processing and music emotion recognition. The complete data set is publicly available at: https://bit.ly/SinhalaYoutubeComments.
    5. Natural Language Processing with Disaster Tweets: Predicting Crisis Events with Social Media Using Machine Learning

      Saddam Hossain, Doina Logofătu
      Abstract
      In recent years, social media, especially Twitter, has become a key platform for disaster response, enabling the rapid spread of critical information. The large volume of tweets during crises necessitates an effective classification system to distinguish pertinent disaster-related content from irrelevant tweets. This study examines various machine learning and deep learning techniques for classifying disaster-related tweets, ultimately introducing a hybrid RoBERTa-BiLSTM model that outperforms conventional methods.
      Our research assesses multiple models, including Logistic Regression, Naive Bayes, Support Vector Machines (SVM), Random Forest, Convolutional Neural Networks (CNN), and Long Short-Term Memory Networks (LSTM). We also fine-tuned transformer-based models like DistilBERT to determine their effectiveness in capturing contextual nuances. The dataset, obtained from Kaggle, contains labeled disaster and non-disaster tweets, with rigorous preprocessing applied to enhance text representation.
      Experimental results demonstrate that the proposed RoBERTa-BiLSTM model achieves the highest classification accuracy of 84%, surpassing other methods in performance. This outcome shows the advantage of combining transformer-based contextual understanding with sequential learning capabilities. Our findings highlight the potential of hybrid deep learning architectures in disaster management, offering a reliable solution for real-time crisis monitoring and information extraction from social media. Future research may explore real-time deployment, multilingual support, and multi-modal integration to further enhance practical applicability.
    6. Large Language Model-Driven Approach for Automated Competency Knowledge Graph Construction in IT Domain

      Duy Dinh, Han Duong, Thu Nguyen
      Abstract
      Knowledge Graphs (KG) are vital for organizing information from large datasets, enabling efficient knowledge extraction. However, manual KG construction is time-consuming and challenging due to data diversity and complexity, especially in domain-specific applications. Massive Open Online Courses (MOOCs) platforms provide abundant, high-quality Information Technology (IT) education resources, yet these remain unstructured for effective knowledge extraction and use. This study proposes an automated method for constructing KG with high accuracy, scalability, and optimal resource utilization to address this problem. The proposed approach integrates multiple components into a comprehensive, automated KG construction pipeline, including contextual database creation, entity and relation extraction from diverse data formats, and graph completion via hidden link discovery. This study’s key contributions include: (1) a fully automated process to extract and structure IT MOOC data for KG construction; (2) an approach to improve Large Language Models (LLMs) performance in Named Entity Recognition (NER) tasks; (3) comprehensive empirical evaluations exploring LLM capabilities in NER and Knowledge Graph Completion. The study yields a comprehensive knowledge graph for the IT MOOC domain, comprising 16 entity types, 1,923 unique entities, and 32 relation types with more than 3,590 triples. Experimental results indicate that the majority of LLMs achieve F1-score enhancements between 2% and approximately 10% in NER tasks while proficiently identifying relationships among entities. The results underscore the promise of automated knowledge graph construction for IT MOOCs, enhancing useful and semantically enriched knowledge resources for question-answering systems, learning advisory platforms, and IT educational applications.
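
Several abstracts above report balanced accuracy and F1 score rather than plain accuracy; the boxing punch detection chapter does so because punches make up only about 3% of frames. The sketch below illustrates why that choice matters. It is not taken from any chapter: the toy labels, the two hypothetical detectors, and all function names are assumptions made for illustration only.

```python
# Illustrative sketch: plain accuracy vs. balanced accuracy vs. F1
# on a toy data set with roughly 3% positive frames (hypothetical data).

def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, where 1 = punch frame."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def accuracy(y_true, y_pred):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)

def balanced_accuracy(y_true, y_pred):
    """Mean of the recall on the positive class and on the negative class."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    recall_pos = tp / (tp + fn) if (tp + fn) else 0.0
    recall_neg = tn / (tn + fp) if (tn + fp) else 0.0
    return (recall_pos + recall_neg) / 2

def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Hypothetical toy data: 100 frames, 3 of them punches (3%).
y_true = [1] * 3 + [0] * 97

# Detector A never predicts a punch.
always_negative = [0] * 100

# Detector B finds 2 of the 3 punches and raises 5 false alarms.
imperfect = [1, 1, 0] + [1] * 5 + [0] * 92

for name, y_pred in [("always negative", always_negative),
                     ("imperfect detector", imperfect)]:
    print(name,
          round(accuracy(y_true, y_pred), 3),
          round(balanced_accuracy(y_true, y_pred), 3),
          round(f1_score(y_true, y_pred), 3))
```

In this toy setting the detector that never predicts a punch has the higher plain accuracy, while balanced accuracy and F1 both rank the detector that actually finds punches higher, which is the behaviour an imbalance-aware evaluation is meant to capture.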
Title
Advances in Computational Collective Intelligence
Editors
Ngoc Thanh Nguyen
Vu Dinh Duc Anh
Adrianna Kozierkiewicz
Sinh Nguyen Van
Manuel Nunez
Jan Treur
Gottfried Vossen
Copyright Year
2026
Electronic ISBN
978-3-032-10209-6
Print ISBN
978-3-032-10208-9
DOI
https://doi.org/10.1007/978-3-032-10209-6

PDF files of this book have been created in accordance with the PDF/UA-1 standard to enhance accessibility, including screen reader support, described non-text content (images, graphs), bookmarks for easy navigation, keyboard-friendly links and forms and searchable, selectable text. We recognize the importance of accessibility, and we welcome queries about accessibility for any of our products. If you have a question or an access need, please get in touch with us at accessibilitysupport@springernature.com.
