
Knowledge Science, Engineering and Management

18th International Conference, KSEM 2025, Macao, China, August 4–7, 2025, Proceedings, Part III

  • 2026
  • Book

About this book

The six-volume proceedings set LNAI 15919, 15920, 15921, 15922, 15923 and 15924 constitutes the refereed proceedings of the 18th International Conference on Knowledge Science, Engineering and Management, KSEM 2025, held in Macao, China during August 4–7, 2025.

The 106 full papers and 66 short papers included in these proceedings were carefully reviewed and selected from 354 submissions. They cover all aspects of research in artificial intelligence, data science, knowledge engineering, AI safety, large language models, and related frontier areas.

Table of Contents

Frontmatter
ACL: Adaptive Chunking of Large Language Models for Efficient Inference on Automotive Edge Devices

Large language models (LLMs) increasingly drive intelligent services within automotive edge computing. However, deploying these models efficiently remains challenging due to diverse hardware setups and limited computational resources typical of automotive edge environments. Existing deployment strategies often disregard hardware diversity, resulting in suboptimal resource use and compromised performance, particularly under peak inference workloads. Consequently, computing elements like CPUs and integrated GPUs are frequently idle, with tasks excessively dependent on discrete GPUs. To address this, we propose a dynamic inference partitioning strategy named Hardware-Aware Dynamic Scheduling (ACL), tailored specifically for automotive edge computing. Our approach leverages the inherent distinction between initial token-processing phases (prefill) and subsequent token generation phases (decode) within LLM inference. By adaptively distributing these phases across heterogeneous hardware units, ACL maximizes resource utilization and balances workloads effectively. Empirical evaluations indicate that ACL significantly enhances inference performance. Furthermore, our framework demonstrates robust efficiency improvements consistently across various LLM architectures, highlighting its adaptability and effectiveness in heterogeneous automotive computing scenarios.

Yufei Lin, Tianxiang Xu, Chengwei Ye, Huanzhen Zhang, Kangsheng Wang
The Evaluation of Retrieval-Based Unlearning Mechanisms on Large Language Models

Machine unlearning is essential for large language models (LLMs) to guarantee data privacy, model flexibility, and adherence to ethical standards. It allows the elimination of certain knowledge, addressing privacy issues and alleviating biases or misinformation without necessitating complete retraining. Retrieval-based techniques enhance LLMs by integrating external knowledge during inference, improving model accuracy, reducing hallucinations, and enabling real-time access to updated information without retraining. Recently, retrieval-based techniques have also demonstrated their capability to achieve machine unlearning without adjusting model parameters. However, the application of these parameter-agnostic unlearning algorithms remains inadequately investigated. In this paper, we examine the performance of retrieval-based unlearning methods on different LLMs. Specifically, we establish evaluation metrics to explore the effectiveness of unlearning, the cost of unlearning, and related factors. We also identify the influential aspects that impact unlearning efficacy across various unlearning tasks. Our study provides insight into the application of LLM unlearning approaches in real-world scenarios.

Zihan Xie, Lefeng Zhang, Minfeng Qi
Node Centrality Approximation in Complex Networks via Inductive Graph Neural Networks

In the realm of network science, Closeness Centrality (CC) and Betweenness Centrality (BC) serve as pivotal metrics for deciphering structural significance and information flow dynamics within networks. These metrics are indispensable for applications such as community delineation and network resilience analysis; however, their calculation in extensive graphs presents substantial computational burdens. Although recent developments in approximation methodologies have alleviated some of these challenges, issues pertaining to processing duration and responsiveness to network alterations persist. In this study, we introduce the CNCA-IGE model, an inductive graph neural network-based encoder-decoder framework. The framework utilizes the degree centrality (DC) of nodes as input features and is specifically designed to approximate the CC and BC of nodes in complex networks. Across diverse synthetic and real-world networks, the CNCA-IGE model outperforms state-of-the-art baselines in both efficiency and accuracy. This advancement holds potential for enhancing applications such as social network analysis and the optimization of communication networks.
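
As a concrete illustration of the metrics the abstract names, here is a minimal sketch (toy graph and all names are hypothetical, not the authors' code) that computes exact closeness centrality by breadth-first search; such exact values are the regression targets a model like CNCA-IGE would learn to approximate from degree centrality inputs.

```python
from collections import deque

# Toy undirected graph as adjacency lists (edges 0-1, 0-2, 1-2, 2-3).
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}

def closeness(adj, v):
    """CC(v) = (n - 1) / sum of shortest-path distances from v (BFS, unweighted)."""
    dist, queue = {v: 0}, deque([v])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return (len(adj) - 1) / sum(dist[u] for u in dist if u != v)

dc = {v: len(adj[v]) / (len(adj) - 1) for v in adj}  # degree centrality: cheap input feature
cc = {v: closeness(adj, v) for v in adj}             # exact closeness: regression target
```

On this toy graph, node 2 maximizes both DC and CC, which hints at why degree is a usable input signal for learning to approximate the costlier metric.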

Yiwei Zou, Ting Li, Tao Zhang, Zong-fu Luo
Carbon Market Price Prediction Method Based on Multi-feature Fusion and Deep Learning

To address the multi-feature-driven nature of carbon price fluctuations, this study proposes an innovative carbon price prediction model that integrates multi-source features with deep learning techniques. The model combines an LSTM network with an attention mechanism to effectively capture key temporal patterns, while quantifying news text through keyword frequency analysis to construct a news influence indicator. By incorporating an incremental learning strategy with experience replay, the model gains strong adaptability to dynamic market changes. Experimental results show that, compared with traditional models, the proposed model reduces the Mean Absolute Error (MAE) by 46.4% and the Root Mean Square Error (RMSE) by 17.1%. These findings validate the effectiveness of integrating news influence and multi-level mechanisms into carbon price prediction, significantly enhancing prediction accuracy.
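
For reference, the two error metrics the abstract reports can be computed as below; the prices and forecasts are made-up toy values, not the paper's data or model.

```python
import numpy as np

y_true = np.array([52.1, 53.4, 51.8, 54.0])  # hypothetical carbon prices
y_pred = np.array([51.8, 53.9, 52.2, 53.5])  # hypothetical forecasts

mae = np.mean(np.abs(y_true - y_pred))           # Mean Absolute Error
rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))  # Root Mean Square Error
```

RMSE is never smaller than MAE and penalizes large errors more heavily, which is why papers typically report both.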

Yiyi He, Shouyi Chen, Chung-Lun Wei, Chiawei Chu
MMtuning: An Advanced Multi-adapter Framework for Efficient Multimodal Large Language Models Fine-Tuning

Modular skill acquisition has emerged as a promising paradigm in multi-task parameter-efficient fine-tuning (PEFT), enhancing knowledge organization and task transfer. Building on this concept, we propose MMtuning, an advanced PEFT framework for multimodal large language models (MLLMs), enabling direct fine-tuning from pre-trained unimodal models. In MMtuning, we formulate a multimodal skill allocation matrix that learns concurrently with the multi-adapter skill inventory, enabling optimal skill allocation for input samples during task fine-tuning. Experiments on ScienceQA and Visual7W demonstrate that MMtuning achieves superior sample efficiency compared to existing PEFT methods at equivalent parameter volumes.

Li Qiao, Haowen Wang, Kazunori Sugiura, Keren Liu, Jinglu Hu
Multi-sensor Fusion Framework for HAR: Integrating Time-Frequency Features and Self-supervised Learning

The increasing ubiquity of the Internet of Things (IoT) and smart devices equipped with embedded human body sensors has intensified the focus on Human Activity Recognition (HAR). However, HAR, which relies on sensor data, faces challenges related to feature extraction and data correlation. To address these issues, our paper proposes a Multi-Sensor Fusion Network (MSFNet). This model leverages accelerometer and gyroscope time-domain data, transforms it into the frequency domain, and extracts features from both domains to enhance feature interaction through Self-Supervised Learning (SSL). MSFNet fuses time-frequency data and employs four encoders for feature extraction, along with two SSL tasks to improve classifier accuracy. Our experimental results demonstrate the model’s effectiveness and its competitive performance compared to existing benchmarks. The code can be accessed via GitHub: https://github.com/bx12138/MSFNet.
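
The time-to-frequency transform the abstract describes can be sketched with a plain FFT; the 50 Hz sampling rate and synthetic 2 Hz signal below are assumptions for illustration, not MSFNet's actual pipeline.

```python
import numpy as np

fs = 50                      # assumed accelerometer sampling rate in Hz
t = np.arange(0, 2, 1 / fs)  # 2-second window -> 100 samples

# Synthetic accelerometer channel: a 2 Hz motion plus small noise.
accel = np.sin(2 * np.pi * 2 * t) + 0.1 * np.random.default_rng(0).normal(size=t.size)

spectrum = np.abs(np.fft.rfft(accel))            # frequency-domain magnitudes
freqs = np.fft.rfftfreq(accel.size, d=1 / fs)    # corresponding frequency bins

dominant_hz = freqs[np.argmax(spectrum[1:]) + 1]  # skip the DC bin
```

Features from `spectrum` (peak frequency, band energies) would then be fed to the frequency-domain encoders alongside the raw time-domain window.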

Xiang Wu, Wei Zhang
DRL-SA: Deep Reinforcement Learning-Based Client Selection and Secure Aggregation for Federated Learning

Federated Learning (FL) offers a privacy-preserving approach to distributed machine learning by enabling collaborative training across multiple clients. Existing research mainly focuses on client selection to improve global model convergence, addressing client resource heterogeneity. However, these methods often separate efficiency from security, neglecting the protection of local model parameter interactions and risking client privacy. To solve this, we propose a framework that balances optimal client selection with secure parameter interactions, protecting client models during training and ensuring correct aggregation without privacy leaks. To prevent man-in-the-middle attacks, a BLS signature mechanism ensures data integrity. Additionally, a Deep Reinforcement Learning (DRL)-based self-weighting method mitigates masking effects on server aggregation, enabling accurate weighted aggregation without masks. Experiments show our framework significantly improves model accuracy over traditional FL aggregation methods.

Qiuhao Xu, Chen Wang, Jian Shen
DynamicFedPEFT: Efficient Fine-Tuning of Dynamic Federated Parameters for Large Language Models

The rapid evolution of Large Language Models (LLMs) has revolutionized natural language processing; however, their deployment in heterogeneous environments remains challenging. While Parameter-Efficient Fine-Tuning (PEFT) has emerged as a cost-effective adaptation strategy, existing federated learning approaches fail to effectively address the combined challenges of data heterogeneity, computational diversity, and privacy preservation. This paper introduces DynamicFedPEFT, a novel federated learning framework that dynamically optimizes LLM adaptation through three key innovations: (1) a multi-dimensional evaluation framework that quantifies client data quality using semantic coherence, lexical diversity, and contextual richness; (2) an adaptive LoRA configuration mechanism that automatically adjusts rank and scaling parameters based on local data characteristics; and (3) a quality-weighted aggregation protocol that prioritizes contributions from high-value clients. Furthermore, the framework incorporates a resource-aware training architecture that enables full participation across heterogeneous devices through progressive parameter freezing. Comprehensive evaluations on six NLP benchmarks demonstrate state-of-the-art performance, with a 5.2% accuracy improvement over conventional federated PEFT approaches and a 28% reduction in communication costs. The proposed solution establishes a new paradigm for collaborative LLM optimization, balancing model performance, resource efficiency, and data privacy.
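
The quality-weighted aggregation step can be sketched as a normalized weighted average; the quality scores and toy updates below are placeholders, since the paper derives its weights from a multi-dimensional data-quality evaluation.

```python
import numpy as np

def aggregate(updates, quality):
    """Weighted average of client parameter updates by normalized quality scores."""
    w = np.asarray(quality, dtype=float)
    w = w / w.sum()  # normalize so weights sum to 1
    return sum(wi * u for wi, u in zip(w, updates))

# Two hypothetical client LoRA updates; client 0 is judged higher quality.
client_updates = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
agg = aggregate(client_updates, quality=[3.0, 1.0])
```

With quality scores 3:1, the first client contributes 75% of the aggregate, illustrating how high-value clients are prioritized.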

Xiaorui Luo, Chi Jiang, Shuai Wang, Yin Zhang
Dynamic, Multi-scale, and Noise-Aware Modeling for Skeleton Action Prediction

This paper proposes a novel framework for skeleton action prediction that integrates dynamic modeling, multi-scale feature learning, and noise modeling to address the challenges of predicting actions from partially observed and noisy data. Traditional methods struggle with limited observation data and noise, while skeleton-based action prediction has gained attention for its robustness to environmental changes and compact representation of human movements. To this end, we design a temporal Diffusion model that handles the uncertainty in partial observation data through an iterative denoising process and introduce a spatio-temporal adaptive attention Transformer to capture complex spatio-temporal relationships in skeleton sequences. Additionally, we propose mechanisms for dynamically adjusting time steps and non-uniform noise scheduling, enabling the model to adaptively learn noise characteristics across different temporal scales. To further enhance the model’s generalization and prediction accuracy, we design a multi-scale loss function to optimize the model’s performance across multiple temporal scales. Experimental results demonstrate that our model achieves significantly lower prediction errors compared to state-of-the-art methods on the NTU RGB+D and Human3.6M datasets, validating its superior performance in skeleton action prediction tasks. This study offers new technical insights for the field of skeleton action prediction and holds great potential for practical applications in intelligent surveillance, human-computer interaction, and healthcare monitoring.

Cui Ran, Zhu Aichun, Liu Yang
Adaptive Retrieval Enhancement for Open-Domain Question Answering

Large language models (LLMs) excel in text generation; however, they encounter several challenges, including inaccurate facts, hallucinations, and outdated knowledge. Retrieval-augmented methods address these issues by grounding generation in external corpora. Nonetheless, a critical trade-off exists: sparse retrievers, such as BM25, prioritize lexical exactness but overlook semantic variations, while dense retrievers, like DPR, capture semantic relevance but neglect precise term matching. To address this challenge, we propose Adaptive Retrieval Enhancement (ARE), a novel framework that synergistically integrates sparse and dense retrieval through three key innovations: (1) LLM-driven query expansion, which generates diverse and semantically equivalent questions to broaden the retrieval scope; (2) hybrid fusion, which combines BM25 and DPR scores via a trainable dual BERT ranker; and (3) efficiency optimization, incorporating FAISS indexing and adaptive context truncation. Experimental results demonstrate that our method achieves significant performance improvements on datasets such as TriviaQA, Natural Questions, and WebQuestions. This work not only highlights the potential of integrating multiple retrieval technologies but also offers valuable insights for the design of future question-answering systems.
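
The hybrid fusion idea, combining normalized sparse and dense scores, can be sketched as below; the paper trains a dual BERT ranker for the combination, whereas here a fixed weight `alpha` stands in for it, and all scores are hypothetical.

```python
import numpy as np

def fuse_scores(sparse, dense, alpha=0.5):
    """Min-max normalize each score list, then combine linearly."""
    def norm(x):
        x = np.asarray(x, dtype=float)
        span = x.max() - x.min()
        return (x - x.min()) / span if span > 0 else np.zeros_like(x)
    return alpha * norm(sparse) + (1 - alpha) * norm(dense)

bm25 = [12.3, 4.1, 9.8]   # hypothetical BM25 scores for three passages
dpr = [0.82, 0.91, 0.40]  # hypothetical DPR inner-product scores

fused = fuse_scores(bm25, dpr)
best = int(np.argmax(fused))  # index of the top-ranked passage
```

Normalization matters because BM25 and dense inner products live on different scales; without it, one retriever would dominate the fusion.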

Lulu Lin, Xiao Zhu
Privacy-Preserving Shortest Path Queries on Encrypted Attributed IIoT Graphs

Cryptographic technologies are increasingly utilized to secure private data in outsourcing scenarios. In particular, enabling queries on encrypted attributed graphs with rich information and broad practical applications has garnered wide attention. However, most existing studies primarily address keyword queries within simple graph structures, such as neighbor relationships, severely limiting graph utility. Notably, no prior work supports shortest path queries, an essential graph algorithm, with attribute constraints on encrypted graphs. In this paper, we introduce SAGES (Static Attributed Graph Searchable Encryption), the first scheme designed to facilitate shortest path queries under specific attribute requirements. SAGES employs symmetric searchable encryption (SSE) to speed up encryption and decryption, and constructs an encrypted structure to enable rapid query execution through efficient index retrieval. In addition, we implement a compression algorithm to minimize server storage overhead. We also formalize leakage functions and provide a rigorous security proof under reasonable leakage assumptions, ensuring that the shortest path structure remains protected against the latest query recovery attacks. Simulated experiments using eight real-world graph datasets demonstrate the effectiveness of our graph compression and the computational efficiency of both setup and query processes. Notably, we achieve an average compression ratio of 79.69%, and query times across all test datasets remain below 700 µs.
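
In plaintext form, the query SAGES answers is an attribute-constrained shortest path. A minimal Dijkstra sketch is shown below; the toy graph and attribute labels are invented, and the actual scheme operates over encrypted indexes rather than plaintext adjacency lists.

```python
import heapq

def shortest_path(adj, src, dst, allowed):
    """adj: {u: [(v, weight, attr), ...]}; only edges whose attr is in `allowed` count."""
    dist = {src: 0}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            return d
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w, attr in adj.get(u, []):
            if attr not in allowed:
                continue  # the attribute constraint prunes this edge
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return None  # dst unreachable under the constraint

graph = {
    "A": [("B", 1, "secure"), ("C", 5, "open")],
    "B": [("C", 1, "secure")],
}
d = shortest_path(graph, "A", "C", allowed={"secure"})
```

Restricting to "secure" edges yields distance 2 via B, whereas allowing only "open" edges forces the direct cost-5 edge; the constraint changes the answer, which is what makes the encrypted variant non-trivial.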

Weixiao Wang, Qing Fan, Yajie Wang, Chuan Zhang, Liehuang Zhu
Text Attributed Graph Node Classification Using Sheaf Neural Networks and Large Language Models

Text-Attributed Graphs (TAGs) seamlessly integrate textual data with graph structures, presenting unique challenges and opportunities for jointly modeling text and graph information. Recent advancements in Large Language Models (LLMs) have significantly enhanced the generative and predictive capabilities of text modeling. However, existing graph models often fall short in capturing intricate node relationships, as their edge representations are typically limited to scalar values. In this paper, we introduce SheaFormer, a novel method that encodes rich and complex relational information between nodes as edge vectors. During the message-passing phase, SheaFormer aggregates both neighbor node representations and edge vectors to update the central node’s representation, eliminating the need to fine-tune the LLMs on the text-attributed graph. Specifically, for a given TAG, SheaFormer is trained to minimize the prediction errors of the LLM in forecasting the next word in node text sequences. Furthermore, we enhance SheaFormer’s performance by incorporating prompt-based fine-tuning techniques. Once trained, SheaFormer can be seamlessly adapted to various downstream tasks. Extensive node classification experiments across multiple domains demonstrate that SheaFormer consistently achieves state-of-the-art performance, validating its effectiveness in capturing complex relationships within TAGs. Additionally, we conduct ablation studies and scalability analyses to ensure the robustness and applicability of our approach.

Haoyang Yu, Zhongyu Li, Geng Zhao, Jiayu Li, Ruofei Jiang
TIEBN: An Eigenvalue-Driven Blockchain Network for Anomaly Detection

In this paper, we introduce the Trust Improvement Eigenvalue Blockchain Network (TIEBN), an innovative framework that leverages eigenvalue theory to address critical challenges in blockchain security, privacy, and scalability. The framework employs eigenvalue decomposition to optimize transaction validation, ensuring scalability while preserving data integrity and confidentiality. This approach enables TIEBN to rapidly identify fraudulent activities and network attacks, significantly enhancing the security of decentralized systems. Furthermore, TIEBN’s spectral analysis refines consensus mechanisms, reducing confirmation times and improving overall network performance. By integrating real-time anomaly detection and privacy-preserving techniques, TIEBN ensures that sensitive transaction data remains secure and confidential, even in high-frequency applications. This paper explores the foundational principles of TIEBN, demonstrating its potential to revolutionize blockchain technology by addressing key security, privacy, and scalability challenges. Through extensive simulations and evaluations, we show that TIEBN outperforms traditional blockchain architectures in terms of transaction throughput, latency, and security. The proposed framework not only enhances the efficiency of blockchain networks but also strengthens trust and reliability in decentralized systems. By combining eigenvalue theory with advanced anomaly detection and privacy-preserving mechanisms, TIEBN paves the way for secure, scalable, and privacy-conscious blockchain ecosystems capable of supporting a wide range of real-world applications.

Grace Mupoyi Ntuala, Jianbin Gao, Patrick Mukala, Qi Xia, Ansu Badjie, Godfred Doe, Hu Xia
Enhanced Knowledge Tracing via Imputing Knowledge States

The advancement of online learning platforms has intensified the demand for personalized learning. Knowledge tracing (KT) technology, which models students’ knowledge states to predict their problem-solving performance, serves as the foundation for personalization. Current KT models primarily construct learning sequences using platform-recorded exercise data, yet students’ learning behaviors outside the platform can also affect their knowledge states. However, comprehensive data collection of heterogeneous learning behaviors requires substantial resources, while modeling such behavioral diversity presents technical challenges, making it impractical to resolve these limitations through exhaustive behavioral logging and direct KT integration. To address these challenges, this paper proposes Imputing Knowledge States (IKT). Specifically, we first model fine-grained knowledge states at the concept level by analyzing direct and indirect relationships between knowledge concepts through students’ learning sequences. Subsequently, we reconstruct these knowledge states via a Variational Autoencoder (VAE) and predict students’ extra-platform learning activities by contrasting reconstructed and original knowledge states and analyzing inter-interaction time intervals. These predictions then guide knowledge state imputation to derive more plausible representations. Finally, extensive experiments on four real-world datasets demonstrate that IKT outperforms current state-of-the-art models.

Songtao Cai, Li Li
Resisting Catastrophic Recall: Persistent Unlearning via Knowledge Distillation with Feature Suppression

Machine learning, as a key supporting technology for AI, has greatly contributed to the rapid development of AI and provided a strong impetus for improving productivity. Meanwhile, machine unlearning has emerged as an important area as data privacy and security concerns become more prominent. It helps to remove certain data knowledge from a trained model without retraining it from scratch. Existing research on machine unlearning methods has primarily focused on improving the efficiency of unlearning algorithms and the effectiveness of data removal. However, it largely ignores whether real-world models can maintain unlearned performance during incremental learning. Motivated by this, we incorporate a knowledge distillation-based feature suppression mechanism to prevent the model from re-learning the removed class representations when the unlearned model undergoes successive incremental learning. By leveraging knowledge distillation, we effectively constrain the feature space, ensuring that the unlearned knowledge does not resurface in subsequent learning stages. Furthermore, we design class-specific unlearning methods to validate the proposed approach and provide a new perspective on class unlearning.

Zonghao Ji, Youyang Qu, Longxiang Gao, Taihao Zhang
Enhancing Legal Judgment Prediction in LLMs via Legal Norms Integration

Legal judgment prediction (LJP) is a crucial task in intelligent judiciary systems. We observe that existing LLMs perform suboptimally in this task. The main challenge lies in the inherent conflict between the abstract labels and the lengthy textual facts, making it difficult for LLMs to reason accurately. To enable LLMs to adapt effectively to the unfamiliar LJP task, we propose a novel framework for Chinese LJP, termed N2RPT, which draws inspiration from the reasoning processes of real-world judges and leverages a sophisticated integration of legal norms to enhance decision-making precision. N2RPT employs a pre-trained language model (PLM) that collaborates with an LLM through an iterative, relevance-driven retrieval process that refines information from coarse to fine granularity. Subsequently, strict label-consistent legal norms are employed as candidates and demonstrations within prompt engineering, ensuring that the LLM adheres to established legal standards during the reasoning process. To further mitigate the risk of hallucinations in LLM outputs, GPT-4 is leveraged to synthesize reasoning trajectories, which are then used to fine-tune the LLM and enhance its capability. Extensive experiments conducted on real-world datasets demonstrate the effectiveness and superior performance of the proposed framework in enabling LLMs for the LJP task.

Han Dai, Wenwen Zhao, Li Li
Geo-DETR: Geographical Map Detection Based on Multi-stage Gradient Feature Fusion

Digital geographical maps of China are widely used across various fields, but many of these maps contain common errors, such as inaccurate borders and missing islands, which can severely impact national security and sovereignty. Therefore, we propose Geo-DETR, a novel approach for map accuracy assessment. Initially, to address the challenge of extracting intricate and subtle boundary information, we present the Pristine Gradient Extraction Module (PGEM), which enhances boundary detection through gradient-based features. Subsequently, the Gradual Attention Fusion Module (GAFM) and Dual Layer Attention (DLA) mechanism adopt a multi-scale, multi-path strategy to optimize the fusion of boundary and semantic features, reducing information loss. Additionally, we design a Cross-Scale Fusion Encoder (CSFE) that enhances the model’s ability to capture both high-level semantic representations and fine-grained details. Experimental results show that Geo-DETR significantly outperforms existing methods in map detection tasks, efficiently and accurately identifying map errors, even in resource-constrained environments.

Yan Xu, Chuantao Li, Zhenqiang Zhang, Liting Geng, Yue Liu, Chunxiao Wang, Zhigang Zhao, Jialiang Lv
FATFI: A Framework to Generate Adversarial Traffic with Feature Interpretability

Existing adversarial attacks often lack explainability, making it challenging to understand how these attacks bypass detection. Moreover, these attacks focus on bypassing Network Intrusion Detection Systems (NIDS) while neglecting the characteristics of different attack types and employing a unified perturbation method, which may compromise the attacks’ functionality in real-world scenarios. To address these challenges, we propose FATFI, a framework designed to generate adversarial traffic with feature interpretability, thereby enhancing attack effectiveness. FATFI employs a multi-level hybrid explanation method, analyzing both global and local feature importance and evaluating feature stability using the Coefficient of Variation (CV) to rank features. By perturbing packets and observing feature changes, FATFI generates feature combinations and determines the optimal perturbation strategy through a scoring mechanism. FATFI then applies these perturbations to traffic using deep learning (DL) models trained on benign characteristics. This ensures the modified traffic evades NIDS detection. We evaluate FATFI on intrusion attacks using public network datasets and seven NIDS. The experimental results demonstrate that FATFI outperforms baseline feature selection techniques in feature evasion, achieving an average improvement of up to 24.56% compared to prior studies. In terms of traffic evasion, FATFI achieves an Evasion Increase Rate (EIR) of 99.47%, while also validating the effectiveness of adversarial traffic in real-world scenarios.
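
The Coefficient-of-Variation stability ranking mentioned in the abstract can be sketched directly; the feature names and per-flow observations below are hypothetical.

```python
import numpy as np

def rank_by_cv(features):
    """features: dict name -> array of observed values; returns names, most stable first."""
    cv = {name: np.std(vals) / np.mean(vals) for name, vals in features.items()}
    return sorted(cv, key=cv.get)  # lower CV = std/mean means a more stable feature

obs = {  # hypothetical observations across benign flows
    "pkt_len": np.array([100.0, 102.0, 98.0]),  # tightly clustered -> stable
    "iat_ms": np.array([5.0, 50.0, 1.0]),       # widely spread -> unstable
}
ranking = rank_by_cv(obs)
```

CV is unitless, so it lets features measured in bytes and milliseconds be ranked on a common scale.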

Yikang Wang, Weina Niu, Dujuan Gu, Qingjun Yuan, Jiacheng Gong, Shuangqi Gan, Xin Lin, Xiaosong Zhang
Enhancing Multi-source Localization via Tailored Feature Representation Framework

Fast and accurate source localization can minimize the harm caused by rumors. However, due to the diversity and complexity of the dissemination of information, identifying the source of rumors on social networks remains a crucial and unresolved task. Meanwhile, the low-dimensional label features of existing methods limit the expressiveness of node representations. In the paper, we propose an Enhancing Multi-Source Localization via Tailored Feature Representation Framework (SL-TFRF) to address this limitation. Specifically, we design a feature representation module that utilizes embedding layers and contrastive learning to expand the dimensionality of node features. Furthermore, we introduce a novel attention fusion method inspired by the sliding windows to account for the varying information transmission efficiencies of different nodes. In addition, we develop a class balancing mechanism to alleviate the label imbalance inherent in source localization. Extensive experiments validate the effectiveness of SL-TFRF and demonstrate its superiority over state-of-the-art methods.

Wenchao Song, Guowei Chen, Yanchao Liu, Chi Zhang, Junpeng Gong, Pengzhou Zhang
FedMP: A Multi-prototype Heterogeneous Federated Learning Framework

Federated learning (FL) has become a popular paradigm for privacy-preserving collaborative knowledge discovery. However, data and model heterogeneity among participants presents a range of challenges, such as class imbalance and model security. This paper focuses on class imbalance heterogeneity and proposes an FL framework based on multi-prototype learning (FedMP). Unlike traditional gradient-based FL, FedMP utilizes client prototypes to abstract the knowledge of each local dataset and represents the global model through the aggregation of prototypes on the server side. In this scenario, the client and the server only need to exchange prototypes, allowing independent local model training without considering heterogeneity and improving security. We employ clustering algorithms to achieve prototype aggregation and use multiple global prototypes to represent the knowledge of each class, effectively addressing the class imbalance issue. Additionally, we propose a contribution evaluation method based on the cumulative contributions of clients to the global prototypes. This method is used in model inference and effectively improves prediction accuracy. Experimental results show that FedMP outperforms the baseline methods in accuracy across several datasets.
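
Multi-prototype inference, assigning a sample to the class of its nearest global prototype, can be sketched as follows; the prototypes below are synthetic stand-ins for the clustered aggregates the server would actually produce.

```python
import numpy as np

# Each class keeps several global prototypes (rows), not just one centroid.
global_prototypes = {
    "benign": np.array([[0.0, 0.0], [0.2, 0.1]]),
    "attack": np.array([[1.0, 1.0], [0.9, 1.2]]),
}

def predict(x):
    """Classify x by the class of its nearest prototype (Euclidean distance)."""
    x = np.asarray(x, dtype=float)
    best_cls, best_d = None, float("inf")
    for cls, protos in global_prototypes.items():
        d = np.linalg.norm(protos - x, axis=1).min()  # nearest prototype of this class
        if d < best_d:
            best_cls, best_d = cls, d
    return best_cls

label = predict([0.95, 1.05])
```

Keeping multiple prototypes per class lets a minority class with several distinct modes still be represented fairly, which is the lever against class imbalance.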

Huanhuan Chi, Yu Peng, Zhenni Liu, Ping Xiong
CGM: Intrusion Detection Based on a Multi-head Attention Optimization Model

With the increasing sophistication of network attacks, traditional intrusion detection systems face the dual challenges of high-dimensional traffic feature extraction difficulties and category imbalance. We introduce a novel intrusion detection technique called CGM in this paper, which integrates CNN and Bidirectional Gated Recurrent Units (BiGRU), and is optimized by a multi-head attention mechanism to improve detection performance. Our proposed method differs from existing approaches in the following aspects. Firstly, the CVAE-GAN-NCR algorithm is used to resample the dataset to efficiently generate minority class samples. Secondly, we incorporate Recursive Feature Elimination (RFE) combined with Random Forest (RF) to optimize feature selection. Finally, we use a multi-head attention mechanism to optimize the CNN-BiGRU model to improve the model’s predictive accuracy and feature representation. To validate our approach, we conduct experiments on the CSE-CICIDS2018 dataset, achieving a multi-class classification accuracy of 98.04%. The method not only optimizes the data processing flow, but also improves the accuracy and robustness of malicious traffic detection by fusing deep learning and feature selection techniques.

Siyao Li, Yong Wang, Zhen Wang
FusionMIA: Enhancing Membership Inference Attacks with Spy Clients and Shadow Models in Federated Learning

Federated learning is a paradigm that enables multiple clients to collaboratively train a global model without sharing raw data. To enhance privacy, secure aggregation has been widely adopted to conceal individual model updates. However, secure aggregation does not provide sufficient protection against membership inference attacks. In this paper, we introduce FusionMIA, a novel membership inference attack that leverages spy clients and shadow models to systematically infer the membership of target training samples from aggregated updates. Contrary to the assumptions made by previous studies that secure aggregation offers sufficient protection against inference attacks, our research demonstrates that FusionMIA can effectively compromise the security of federated learning systems even in settings protected by secure aggregation. FusionMIA successfully reconstructs membership information by leveraging the differential impact of individual client updates on the aggregated model. Extensive experiments on MNIST and CIFAR-10 demonstrate the effectiveness of this approach, achieving >90% AUC in most cases and exposing a significant vulnerability in federated learning systems. These findings underscore the pressing need for more robust privacy-preserving mechanisms in federated learning that extend beyond conventional aggregation-based defenses.

Xiang Lan, Jiayin Li, Zuobin Ying, Xingshuo Han, Shengmin Xu
DynaKiteQuery: Top-K Closest-Vertex Queries on Dynamic Attributed Knowledge Graphs for IIoT Applications

Top-k closest-vertex queries on weighted knowledge graphs retrieve the k vertices closest to a given query vertex in terms of shortest-path distance. This operation is particularly valuable in the Industrial Internet of Things (IIoT), where it supports data-security and traceability tasks such as risk identification, asset association analysis, and anomaly tracing across the entire data lifecycle. Although extensive research has been conducted on ranking and querying knowledge graphs, the specific problem of top-k closest-vertex queries on dynamic attributed knowledge graphs remains largely unexplored. To bridge this gap, we propose an attribute-based indexing mechanism, along with an associated scalable storage structure, to enable efficient top-k search and dynamic graph updates. We evaluate our approach in terms of update efficiency when new edges are added and query performance as k varies. Experimental results demonstrate that the update time scales linearly with the number of added edges, while the search time remains independent of k and is influenced only by the overall size of the knowledge graph.
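As a point of reference for the query semantics (independent of the paper's index, whose search time does not grow with k), a top-k closest-vertex query can be answered by Dijkstra's algorithm with early termination once k vertices are settled:

```python
# Baseline top-k closest-vertex query: Dijkstra from the query vertex,
# stopping as soon as k vertices (excluding the source) are settled.
import heapq

def top_k_closest(graph, source, k):
    """graph: {u: [(v, w), ...]} with non-negative edge weights w."""
    dist, heap, out = {source: 0.0}, [(0.0, source)], []
    settled = set()
    while heap and len(out) < k:
        d, u = heapq.heappop(heap)
        if u in settled:
            continue
        settled.add(u)
        if u != source:
            out.append((u, d))
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return out

g = {"a": [("b", 1), ("c", 4)], "b": [("c", 1), ("d", 7)], "c": [("d", 2)]}
print(top_k_closest(g, "a", 2))  # [('b', 1.0), ('c', 2.0)]
```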

Qing Fan, Weixiao Wang, Yajie Wang, Hui Xie, Yudi Zhang, Liehuang Zhu
Evaluating LLMs for Multi-label Text Classification

As machine learning models grow in size, the demand for well-annotated data increases. However, human annotation is expensive, and the human-labeling process faces issues such as delayed response and ethical concerns. The recently launched ChatGPT provides an alternative solution to generate labels instead of using human annotators. This paper explores ChatGPT’s potential to replace human efforts in text classification tasks through a comprehensive investigation. Our findings reveal that ChatGPT can perform well in text classification tasks, though fairness issues require attention. These results demonstrate the potential of ChatGPT in replacing human annotators, especially in ethically challenging, content-sensitive tasks where human involvement could be limited.

Mengqi Wang, Ming Liu
Generating Feedback for School Students Essay with Large Language Models

This paper explores using large language models (LLMs) such as T5, BART, and GPT-4 Turbo to automatically generate feedback on primary and secondary school students' essays. We constructed a dataset of over 740 student essays with tutor feedback across different year levels, on which we evaluated the performance of prevalent LLMs in feedback generation. After aligning automated evaluation metrics with educational standards for helpfulness, readability, acceptance, relevance, and specificity, we conducted further user studies to assess GPT-4's effectiveness in personalised feedback generation. Our findings show that GPT-4 Turbo, especially when using well-designed prompts and reasoning strategies, outperforms models like T5 and BART in providing more readable feedback. Human evaluation also supports the readability and relevance of GPT-4 Turbo's feedback, although it falls short of real tutor feedback in helpfulness and specificity.

Dan Zhang, Thuong Hoang, Ye Zhu, Rui Wang, Paula Crouch
Dynamic Asymmetric Contrastive Learning with Adaptive Hard Negative Mining for Resume-Job Matching

In the online recruitment domain, efficient and accurate resume-job matching is essential for optimizing talent selection and work allocation. However, existing deep contrastive learning methods still face two major challenges: the symmetry assumption restricts the differentiation of resume and job representations, reducing semantic distinction, and random negative sampling introduces low-value samples, weakening the model's ability to distinguish similar matches. To address these issues, we propose Dynamic Asymmetric Contrastive Learning (DACL), which improves representation learning and negative-sampling quality through asymmetric contrastive learning and adaptive hard negative mining. Our approach introduces a bidirectional temperature regulation mechanism to independently optimize resume-to-job and job-to-resume matching, mitigating gradient conflicts and improving adaptability. Additionally, we introduce a dynamic hard-negative selection mechanism that leverages both semantic and structural features to identify high-confusion negatives, improving model robustness. Experimental results on real-world datasets demonstrate that DACL significantly improves matching accuracy and retrieval efficiency, providing a generalizable and scalable optimization framework for resume-job matching.
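The bidirectional-temperature idea can be sketched with a plain InfoNCE loss in which the two matching directions use independent temperatures. This illustrates the asymmetry only, not the paper's DACL implementation; embeddings and temperature values are placeholders:

```python
# InfoNCE with per-direction temperatures: resume->job and job->resume
# terms are computed from the same similarity matrix but scaled separately.
import numpy as np

def info_nce(sim, temp):
    """Row-wise cross-entropy on sim/temp; positives on the diagonal."""
    logits = sim / temp
    logits = logits - logits.max(axis=1, keepdims=True)  # stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))

rng = np.random.default_rng(0)
r = rng.normal(size=(4, 8))           # resume embeddings
j = rng.normal(size=(4, 8))           # job embeddings
r /= np.linalg.norm(r, axis=1, keepdims=True)
j /= np.linalg.norm(j, axis=1, keepdims=True)
sim = r @ j.T
loss = info_nce(sim, temp=0.1) + info_nce(sim.T, temp=0.2)  # asymmetric
print(float(loss))
```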

Suhuan Duan, Xingjian Xu, Fanjun Meng
TRIAD: A Tool-Responsive Instruction-Aligned Framework for Domain-Specific Problem Solving

While general-purpose large language models (LLMs) demonstrate robust problem-solving capabilities through universal tools, their effectiveness in specialized domains remains constrained by insufficient professional tool usage and prohibitive customization costs for end-users. This paper presents the Tool-Responsive Instruction-Aligned Development (TRIAD) framework, enabling resource-efficient enhancement of domain-specific capabilities in small language models through synergistic tool-data optimization. The framework comprises three synergistic components: (1) Tool-Semantic Anchored Dataset Construction filters non-geometric problems from MATH [11] and converts them to Wolfram Language code (3,366 samples); (2) Autonomous Prompt Optimization employs DeepSeek-R1 [7] guided iterative refinement to develop tool-adapted templates, achieving significant code-structure similarity improvements; (3) Tool-Sensitive Instruction Tuning integrates domain knowledge via LoRA-based parameter-efficient adaptation [13]. Experiments reveal TRIAD's substantial performance gains across 7B-parameter models: Qwen2-7B-instruct [27] shows a 42.6% TUPS improvement through APO optimization, while Gemma-7B [19] and Mistral-7B-instruct [14] achieve TUPS boosts from 15.0% to 60.5% and from 32.6% to 58.4%, respectively, via full TRIAD implementation. This work proposes a cost-effective framework to enhance small language models' domain capabilities, with experimental results supporting its effectiveness.
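The LoRA adaptation cited in component (3) reduces to a trainable low-rank residual on a frozen weight. A minimal numpy illustration (shapes and values are arbitrary, not from the paper):

```python
# LoRA-style layer: frozen W plus low-rank update B @ A scaled by alpha/r,
# so only r*(d_in + d_out) parameters are trained instead of d_in*d_out.
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, r, alpha = 16, 16, 4, 8
W = rng.normal(size=(d_out, d_in))           # frozen pretrained weight
A = rng.normal(scale=0.01, size=(r, d_in))   # trainable down-projection
B = np.zeros((d_out, r))                     # trainable up-projection, zero init

def lora_forward(x):
    return x @ W.T + (alpha / r) * (x @ A.T @ B.T)

x = rng.normal(size=(2, d_in))
# With B = 0 at init, the adapted layer exactly matches the frozen one.
print(np.allclose(lora_forward(x), x @ W.T))  # True
```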

Duo Zhang, Yuxia Cheng
Verifiable Fine-Grained Federated Unlearning

With data security laws granting users the right to be forgotten, it has become essential to tackle the challenge of unlearning specific training data from the global model in federated learning (FL). Most existing federated unlearning studies employ model-retraining methods to forget clients' data, which incurs high computational costs and low training efficiency. Furthermore, the issue of fine-grained deletion and forgetting of part of the data within clients has yet to be addressed. To achieve efficient unlearning of part of a client's data in FL, this paper proposes a novel approximate federated unlearning scheme based on gradient ascent. Specifically, the scheme first adopts a constrained gradient-ascent method for local unlearning of the target client's deleted data, using a dynamic penalty mechanism to reduce catastrophic forgetting in the local model. Secondly, the scheme optimizes the local unlearning model through projected gradient ascent, improving the accuracy of the global unlearning model on normal data. Additionally, extensive experiments have been conducted to verify the performance of the federated unlearning scheme and to compare it with model retraining. The experimental results demonstrate the effectiveness and efficiency of the proposed scheme.
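The constrained gradient-ascent mechanism can be sketched on a toy model. This is not the paper's scheme, only the core idea: ascend on the forget-set loss while projecting the weights back into a norm ball around the original model to limit damage on retained data:

```python
# Gradient ascent on the forget set for a linear regression model, with
# projection onto a norm ball centred at the originally trained weights.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
y = X @ rng.normal(size=5) + 0.1 * rng.normal(size=50)  # noisy targets
w = np.linalg.lstsq(X, y, rcond=None)[0]                # "trained" weights
w0, radius, lr = w.copy(), 0.5, 0.05
Xf, yf = X[:10], y[:10]                                 # samples to forget
err0 = np.mean((Xf @ w - yf) ** 2)

for _ in range(100):
    grad = 2 * Xf.T @ (Xf @ w - yf) / len(yf)  # forget-set loss gradient
    w += lr * grad                             # ascent: raise forget loss
    delta = w - w0
    n = np.linalg.norm(delta)                  # projection keeps w inside
    if n > radius:                             # the ball around old weights
        w = w0 + delta * (radius / n)

forget_err = np.mean((Xf @ w - yf) ** 2)
print(forget_err > err0, np.linalg.norm(w - w0) <= radius + 1e-9)
```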

Yong Wang, Guangyu Peng, Xueli Nie, Bruce Gu
MCC: Multi-level Feature and Context-Aware Attention Mechanism with Consistent Distributions for Recipe Retrieval

With people’s increasing emphasis on healthy eating, food computing has become a significant research area, in which recipe retrieval is an essential part. In this paper, we are interested in retrieving food recipes from food images and vice versa. We present Multi-level Feature and Context-aware Attention Mechanism with Consistent Distributions for Recipe Retrieval (MCC). To reduce the distance between the image and text modalities, we introduce Maximum Mean Discrepancy and propose a novel triplet loss (TL-MMD), which outperforms the traditional triplet loss by effectively aligning cross-modal distributions and enhancing convergence, thus achieving superior retrieval accuracy. Considering that a dish comprises multiple ingredients, with specific regions roughly corresponding to individual ingredients, we propose an encoder with multi-level features that innovatively integrates an advanced attention mechanism. This approach surpasses traditional CNN-based encoders by dynamically focusing on key image regions and fusing multi-resolution features, achieving richer and more detailed representations. Furthermore, we construct a Contextual Attention Module (CAM), targeting distinct regions in the image and individual words in the recipe simultaneously, to discover full latent alignments and infer region-word similarity with greater precision and interpretability than prior methods. Our model achieves state-of-the-art performance on Recipe1M, with an improvement of 2–4% over competing methods. Through ablation experiments, we verify that each of our components contributes significantly to the overall performance, collectively establishing MCC as a superior solution for cross-modal recipe retrieval.
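The MMD term in TL-MMD can be illustrated with a standard RBF-kernel MMD² estimate between two embedding batches (a simplified sketch; the paper's exact loss may differ, and all data below is synthetic):

```python
# Biased RBF-kernel MMD^2 estimate: large when two embedding batches come
# from different distributions, near zero when they are aligned.
import numpy as np

def mmd2_rbf(x, y, gamma=1.0):
    def k(a, b):
        d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

rng = np.random.default_rng(0)
img = rng.normal(0.0, 1.0, size=(64, 8))       # image embeddings
txt_far = rng.normal(2.0, 1.0, size=(64, 8))   # misaligned text embeddings
txt_near = rng.normal(0.0, 1.0, size=(64, 8))  # aligned text embeddings
print(mmd2_rbf(img, txt_far) > mmd2_rbf(img, txt_near))  # True
```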

Kaihao Wang, Hangrui Xu, Jian Liu
Heuristic Ant Colony Enabled Federated UAV Circuit Inspection Planning Algorithm Considering Adaptive Weather

In this paper, we propose a UAV path planning algorithm that takes real-time weather conditions into account. In recent years, numerous researchers have focused on optimisation algorithms for path planning. Federated learning architectures allow distributed UAV nodes to share critical path features and local optimisation experience for collaborative knowledge accumulation without compromising private data. The ant colony algorithm, with its bionic optimality-seeking mechanism, emulates ant behaviour in terms of pheromone release and path optimisation, thereby initially delineating feasible routes for drones. However, existing algorithms fail to incorporate real-time weather conditions into the path-planning process, a shortcoming that significantly limits their practical application. To address this shortcoming, this paper proposes a weather-based adaptive heuristic ant colony optimisation (ACO) UAV circuit inspection planning algorithm (WACA). The algorithm builds on the original ACO algorithm and incorporates real-time weather conditions in the inspection area. Experimental results show that the proposed method improves the practical feasibility and versatility of route planning while minimising the time cost.
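The weather-adaptive idea can be sketched as a standard ACO roulette-wheel transition whose visibility term is scaled down by a per-edge weather penalty, so routes through bad-weather regions become less attractive. This is a toy illustration, not WACA itself; graph, penalties, and parameters are made up:

```python
# One ACO transition step: desirability = pheromone^alpha * visibility^beta,
# where visibility divides by distance scaled by a weather penalty (>= 1).
import random

def next_node(current, unvisited, pher, dist, weather, alpha=1.0, beta=2.0):
    weights = []
    for v in unvisited:
        tau = pher[current][v] ** alpha
        eta = (1.0 / (dist[current][v] * weather[current][v])) ** beta
        weights.append(tau * eta)
    return random.choices(list(unvisited), weights=weights)[0]

dist = {0: {1: 1.0, 2: 1.0}}
pher = {0: {1: 1.0, 2: 1.0}}
weather = {0: {1: 1.0, 2: 5.0}}  # storm penalty on edge 0 -> 2
random.seed(0)
picks = [next_node(0, [1, 2], pher, dist, weather) for _ in range(1000)]
print(picks.count(1) > picks.count(2))  # the clear-weather edge dominates
```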

Wei Ding, Luyao Wang, Yan Zhou, Lingzhi Kong, Myung Jin Lee, Keun Ho Ryu, Kwang Woo Nam, Qinyao Hou
Context-Aware Spatiotemporal Graph Attention Network for Next POI Recommendation

Point-of-interest (POI) recommendations leverage the vast amounts of GPS data collected from location-based social networks to identify frequent patterns and current interests from users’ historical check-in trajectories, enabling accurate predictions of the next POI a user will visit. Graph neural network-based models have made significant breakthroughs in this field by effectively integrating global information. However, current mainstream models tend to focus primarily on POI check-in sequences, neglecting the rich spatiotemporal dynamics inherent in trajectory data and failing to dynamically model the heterogeneous importance of spatiotemporal features, which varies across users, locations, and temporal contexts. To address these limitations, we propose a Context-aware Spatiotemporal Graph Attention Network for next-POI recommendations. Our model introduces a novel graph attention mechanism that dynamically adjusts the importance of different spatiotemporal features based on the specific context of user behaviors. This context-awareness enables the model to effectively capture both temporal and spatial homogeneity or heterogeneity in user movement patterns, adapting feature weights according to individual preferences and situational factors. By modeling the varying importance of spatiotemporal features across different contexts, our model achieves more personalized and accurate POI recommendations. Experimental results on real-world datasets demonstrate the effectiveness of our proposed approach in improving the performance of POI recommendation tasks.
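The context-awareness can be sketched schematically: attention over candidate POIs in which the weights on the time-gap and distance features are themselves produced from a context vector, so spatiotemporal importance varies by user and situation. All tensors below are random placeholders, not the paper's model:

```python
# Context-dependent attention: feature weights (w_time, w_dist) come from
# the context vector, then modulate candidate scores before the softmax.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(0)
n, d = 5, 8
h = rng.normal(size=(n, d))          # candidate POI embeddings
q = rng.normal(size=d)               # current-trajectory query
time_gap = rng.uniform(0, 24, n)     # hours since related check-in
distance = rng.uniform(0, 10, n)     # km to each candidate
context = rng.normal(size=4)         # user/situation context vector
Wc = rng.normal(scale=0.1, size=(2, 4))
w_time, w_dist = Wc @ context        # context-dependent feature weights
scores = h @ q - w_time * time_gap - w_dist * distance
attn = softmax(scores)               # attention over the n candidates
print(attn.shape, float(attn.sum()))
```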

Qiuhan Han, Qian Wang, Atsushi Yoshikawa, Masayuki Yamamura
From Thinking to Output: Chain-of-Thought and Text Generation Characteristics in Reasoning Language Models

Recently, there have been notable advancements in large language models (LLMs), demonstrating their growing abilities in complex reasoning. However, existing research largely overlooks a thorough and systematic comparison of these models’ reasoning processes and outputs, particularly regarding their self-reflection pattern (also termed “Aha moment” [3]) and the interconnections across diverse domains. This paper proposes a novel framework for analyzing the reasoning characteristics of four cutting-edge large reasoning models (GPT-o1 [10], DeepSeek-R1 [3], Kimi-k1.5 [13], and Grok-3) using keyword statistics and the LLM-as-a-judge paradigm. Our approach connects their internal thinking processes with their final outputs. A diverse dataset of real-world scenario-based questions covering logical deduction, causal inference, and multi-step problem-solving is constructed. Additionally, a set of metrics is put forward to assess both the coherence of reasoning and the accuracy of the outputs. The results uncover distinct patterns in how these models balance exploration and exploitation, approach problems, and reach conclusions during the reasoning process. Through quantitative and qualitative comparisons, disparities among these models are identified in aspects such as the depth of reasoning, the reliance on intermediate steps, and the degree of similarity between their thinking processes and output patterns and those of GPT-o1. This work offers valuable insights into the trade-off between computational efficiency and reasoning robustness and provides practical recommendations for enhancing model design and evaluation in practical applications. We publicly release our project at: https://github.com/ChangWenhan/FromThinking2Output
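The keyword-statistics side of such a framework can be sketched as counting self-reflection markers in a chain-of-thought trace; the marker list below is illustrative, not the paper's:

```python
# Count occurrences of self-reflection ("Aha moment") markers in a trace.
from collections import Counter
import re

MARKERS = ["wait", "let me re-check", "actually", "on second thought"]

def reflection_counts(trace):
    text = trace.lower()
    return Counter({m: len(re.findall(re.escape(m), text)) for m in MARKERS})

trace = ("First I compute 17*3 = 51. Wait, actually let me re-check: "
         "17*3 is 51, so that holds. Actually the final answer is 51.")
counts = reflection_counts(trace)
print(counts["actually"], counts["wait"])  # 2 1
```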

Junhao Liu, Zhenhao Xu, Yuxin Fang, Yichuan Chen, Zuobin Ying, Wenhan Chang
Backmatter
Title
Knowledge Science, Engineering and Management
Editors
Tianqing Zhu
Wanlei Zhou
Congcong Zhu
Copyright Year
2026
Publisher
Springer Nature Singapore
Electronic ISBN
978-981-95-3055-7
Print ISBN
978-981-95-3054-0
DOI
https://doi.org/10.1007/978-981-95-3055-7

PDF files of this book have been created in accordance with the PDF/UA-1 standard to enhance accessibility, including screen reader support, described non-text content (images, graphs), bookmarks for easy navigation, keyboard-friendly links and forms and searchable, selectable text. We recognize the importance of accessibility, and we welcome queries about accessibility for any of our products. If you have a question or an access need, please get in touch with us at accessibilitysupport@springernature.com.
