2025 | Book

Proceedings of the 4th International Conference on Frontiers of Electronics, Information and Computation Technologies (ICFEICT 2024)

Volume II

About this book

This book contains papers that have been carefully compiled from the fourth International Conference on Frontiers of Electronics, Information and Computation Technologies (ICFEICT), which was held in Beijing from June 22 to June 24, 2024. These papers have undergone rigorous review processes and adhere to strict standards. The primary goal of the conference is to promote research and development efforts in these areas while fostering the exchange of scientific information.

The intended audience for the papers presented at ICFEICT 2024 will primarily be leading academic scientists, researchers, scholars, educators, developers, engineers, students, and practitioners working globally in the areas of electronics engineering, communications, and computing.

Table of Contents

Frontmatter

Computation

Frontmatter
Extraction of Main Skeleton from Point Cloud Based on Maximum Hollow Projection Method

The skeletonization of three-dimensional point cloud models is an important research area in computer graphics used to describe the topological structure of objects. In the automated measurement and machining of complex cavities, a well-constructed skeleton plays a crucial role in generating motion trajectories for automated equipment. However, when sampling the point cloud of complex-shaped cavities, limitations such as the sensor’s range and object reflectivity often result in unevenly distributed or missing data in the extracted point cloud model. This greatly impacts the centrality of the generated skeleton and the extent of three-dimensional feature restoration. To address these issues, this paper proposes an algorithm for extracting the main skeleton of complex cavity point clouds. The algorithm first takes the sensor’s initial parameters as input, based on practical measurement needs, and performs segmented scanning and sampling of the point cloud from the starting point. Local normal distribution trends of the sampled segment are then analyzed to ascertain axial features, followed by planar projections using multiple orthogonal bases and calculation of the hollow area. Finally, combining the calculated axial features and the maximum hollow area, the growth direction and step length of the main skeleton are estimated. The process is repeated until a complete main skeleton is generated. Experimental results show that the proposed algorithm can still accurately restore model features for point clouds with complex shapes, uneven densities, and locally missing data.

Weiqing Li, Zhuhua Bai, Fantong Meng, Renke Kang, Zhigang Dong, Guolin Yang
Research on Vehicle Digital Twins Based on Simulink and CANoe

Digital twin technology plays a crucial role in advancing intelligent manufacturing and industrial Internet applications, particularly in automotive research, development, manufacturing and services. Creating a digital twin of a vehicle serves as a foundation for implementing this technology in the automotive industry. In this paper, we explore the architecture of a vehicle digital twin system and discuss the construction of the physical and virtual entities within it and the connections between them. We then propose a vehicle digital twin design scheme for anomaly detection on the Controller Area Network (CAN). Specifically, we first create a virtual vehicle model and a vehicle powertrain CAN bus model, and then deploy a vehicle digital twin system. During the construction process, Simulink modeling and Simulink/CANoe co-simulation modeling are utilized. Finally, we design an experiment to detect and address various CAN bus attacks to verify the effectiveness of the digital twin system.

Yunlai Zhang, Guihe Qin, Yanhua Liang, Jiaru Song
Research on Three-Dimensional Bin Packing Problem Based on Improved Neural Network Algorithm

With the development of e-commerce, the three-dimensional bin packing problem is crucial for improving logistics efficiency and reducing costs. In this study, an improved deep neural network algorithm (IDNNA) is proposed. This algorithm utilizes the size information of ordered items to design neural network inputs, enabling a better understanding of irregular items. The neural network outputs the box type with the maximum loading rate, and training efficiency is improved by designing masking information. The algorithm’s performance is optimized through multiple experiments. IDNNA demonstrates excellent performance on public datasets, significantly improving efficiency when handling large-scale instances, successfully reducing the average number of attempts by up to 81.81%. This research provides an efficient and accurate solution for the e-commerce warehousing industry, significantly improving packing efficiency and reducing costs.

Haofang Zhao, Xiangyu Yin
Active Learning Combining Neighborhood Density and Uncertainty

With the rapid development of machine learning, acquiring high-performance classification models with minimal annotation costs has become an urgent issue. This study aims to address this challenge through selective sampling using active learning. However, traditional optimization experimental design algorithms have failed to fully utilize sample label information. Therefore, a novel active learning algorithm combining neighborhood density and uncertainty is proposed to improve the direct experimental design by selecting representative and uncertain samples. Experimental results demonstrate that compared to traditional methods, this algorithm exhibits significantly superior performance on four standard image datasets.

Mingkai Yang, Jingjing Huang, Ao Chen, Guan Wang, Shun Chen
Multi-scale Confidence-Aware Feature Fusion Network for Crowd Counting

Estimating crowd size is a significant area of study within computer vision. Its purpose is to estimate the number of people in a crowd from video or image data. Deep learning and neural networks have become widely used in this field, offering robust methodologies for analysis and prediction. However, factors such as occlusion, crowd distribution, and chaotic scenes pose obstacles to crowd counting research. To tackle these challenges, this paper proposes a multi-scale confidence-aware feature fusion network to achieve efficient and accurate crowd counting. In this structure, we learn the scale variation of multi-scale features. Because informative features are not fully attended to at every scale, we use the original features to learn a confidence score for each scale. Finally, we use cross attention to fuse the learned weighted feature map with the original feature map, recovering features that may be lost across scales or details that go unnoticed in the initial feature map, resulting in a better feature map and more accurate results.

Zhengpeng Zhao, Gengshen Wu
No-Reference Image Quality Assessment Method Based on Mutual Information of Regions

In response to the problem of the lack of generalization ability of the existing image quality evaluation methods for different types of image distortion, a new no-reference image quality evaluation method is proposed. This method is based on the image region mutual information feature, combined with the non-subsampled shearlet transform, and designs a multi-scale frequency domain feature and its extraction method to train the image distortion classification model and regression model. Two kinds of models are used to calculate the distortion score and the image quality score, and the two scores are fused as the evaluation score of the image quality to achieve objective assessment of image quality. Performance tests on real image data show that the method proposed in this paper has the advantages of high subjective consistency, good generalization, and no need for reference images. It can be applied to various imaging devices and image processing application systems.

Zhan Yuan, Peiyuan Wang, Li Jia, Yan Li, Mingzhuo Xia
Research on Patrol Path Optimization Based on Improved Ant Colony Algorithm

Patrol path optimization is of great significance in various real applications. At present, there are many ways to optimize the patrol path from various angles, and there are also many studies on traditional ant colony algorithms. After studying the application of the existing traditional ant colony algorithm in patrol path optimization, we analyze the existing problems and difficulties and propose an improved ant colony algorithm. Based on a model of the patrol path problem, the paper then applies the improved ant colony algorithm to that problem. The experimental results show that the patrol path optimization method based on the improved ant colony algorithm has certain advantages over the ordinary patrol path optimization method.

Haihui Zhou
Cross-Modal Translation for Medical Images Augmentation Based on Diagnosis Reports

As one of the promising approaches to improving efficiency in clinical medicine, deep learning models are still faced with dataset shortages since data collection inevitably incurs extra effort and is time-consuming. Moreover, conventional data augmentation for medical images, such as rotation and cropping, fails to maintain sufficient diversity of pathological samples. In this paper, we propose Med-VQ-VAE, a pre-training model for image reconstruction inspired by VQ-VAE and VQ-GAN that utilises its decoder component to generate robust image varieties. Based on our proposed pre-training model, we introduce Med-VAE-GAN, an image-generation model that utilises diagnosis reports to create image-report pairs. Our models can obtain a promising result.

Xingren Wang, Sixing Yin, Wenyu Yin, Yining Wang, Jiayue Li, Shufang Li
A High-Utilization Reconfigurable Digital Computing-in-Memory SRAM Using Dynamic Logic for Edge Neural Network Applications

This paper presents a digital computing-in-memory (CIM) SRAM that can effectively execute small- to medium-sized quantized neural networks of various bit precisions. The design uses dynamic logic to compute the bitwise multiplication, reducing area overhead and achieving high performance. It can be reconfigured to support multiply-and-accumulate (MAC) operations of various bit precisions in a bit-serial manner. The bit cell array consists of 32 computing sections, and each section has an independent bit sum and accumulate unit so that utilization stays high when executing small MAC operations. By adopting the foundry-provided push-rule 8T SRAM bit cell, the design achieves a compact area and high stability. The design is implemented in 40 nm CMOS technology. Simulation results show that it achieves a maximum area efficiency of 9985 GOPS/mm² and a maximum energy efficiency of 122 TOPS/W. Furthermore, taking MobileNet as an example, this design demonstrates much higher utilization compared with other state-of-the-art SRAM CIM macros.

Xiangguang Su, Yangzhan Mai
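
The bit-serial MAC scheme named in the abstract above can be illustrated in software. The following is only a minimal sketch, assuming unsigned operands and a plain shift-and-add accumulation; the function name `bit_serial_mac` and its parameters are hypothetical and do not come from the paper, and the actual circuit is not described in enough detail here to reproduce.

```python
def bit_serial_mac(activations, weights, act_bits=4, weight_bits=4):
    """Dot product of unsigned integers computed one bit-plane pair at a time.

    Assumption: operands are unsigned and fit in act_bits / weight_bits.
    """
    if len(activations) != len(weights):
        raise ValueError("activations and weights must have the same length")
    if any(not 0 <= a < (1 << act_bits) for a in activations):
        raise ValueError("activation out of range for act_bits")
    if any(not 0 <= w < (1 << weight_bits) for w in weights):
        raise ValueError("weight out of range for weight_bits")

    acc = 0
    for b in range(act_bits):          # activation bits arrive serially
        for j in range(weight_bits):   # one weight bit position per pass
            # Bitwise multiplication is a 1-bit AND; the bit sum counts
            # how many AND results are 1 across the vector.
            bit_sum = sum(((a >> b) & 1) & ((w >> j) & 1)
                          for a, w in zip(activations, weights))
            # Accumulate with the binary weight of this bit-plane pair.
            acc += bit_sum << (b + j)
    return acc
```

Changing `act_bits` and `weight_bits` mimics reconfiguring the precision: lower precision needs fewer passes for the same vector length.
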
Application of a Soft Sensor Model Based on TCN-LSTM to Chemical Processes

For the soft sensor modeling of chemical processes exhibiting pronounced nonlinearity and intricacy, this study proposes a soft sensor model -- TCN-LSTM, which combines Temporal Convolutional Networks (TCN) and Long Short-term Memory Networks (LSTM). TCN-LSTM can extract the spatio-temporal features and dynamic response relationships of the input samples, addressing their time-varying delay issue by discerning dynamic response relationships. To confirm TCN-LSTM's efficacy, we apply it to modeling a soft sensor instance of a debutanizer column. The experiment outcomes demonstrate that TCN-LSTM exhibits superior measurement accuracy over Backpropagation Neural Network (BP), Radial Basis Function Neural Network (RBF), Convolutional Neural Networks (CNN), LSTM, and TCN approaches.

Yang Hao, Jun Li
FlexiLearn-Accessible Learning: Digital Text and Audiobooks for Blind and General Students

The application provides audiobooks and textbooks for both general and blind students, which can be accessed through a talkback and ASR system. Blind users can use the TTS system to listen to the textbooks, and the application provides additional commands such as read, pause, repeat, and bookmark to facilitate their learning experience. The application's primary goal is to address the challenges faced by blind individuals when accessing learning materials, particularly when it comes to traditional print-based resources. The initiative also seeks to increase the accessibility of digital books by working with technology companies, libraries, and publishers. By giving blind people easier access to information, the suggested platform and collaborative efforts can improve their education and personal growth. Moreover, the application can also benefit general students who may not have free access to educational books or may prefer audio over text-based learning materials. Overall, the project aims to create an accessible and inclusive learning environment that caters to the needs of all students, including those with visual impairments.

Areeba Abdul Haq, Abdullah Ayub Khan, Asif Ali Laghari, Waseem Bakhsh, Shafique Ahmed Awan, Muhammad Asad Abbasi
A Recognition Algorithm for Distracted Driving Behavior Based on CBAM-EfficientNetB0

This paper proposes a framework named CBAM-EfficientNetB0, aimed at addressing the low accuracy of distracted driver behavior recognition under low-parameter conditions. CBAM-EfficientNetB0 integrates the CBAM attention mechanism, which comprises a Channel Attention Module and a Spatial Attention Module. This enables the network to focus more on important feature information and suppress irrelevant or redundant features, thereby enhancing feature distinctiveness and expressiveness while also reducing complexity and improving model training efficiency. The cross-entropy loss function is adopted for the multi-class classification problem. After switching the model's optimizer to SGD, tuned learning rate and momentum parameters were obtained through experiments, which not only improved recognition accuracy but also enhanced model convergence. CBAM-EfficientNetB0 achieves 97.6% accuracy with 5.33 million parameters on the State Farm Distracted Driver Detection dataset. The results demonstrate that, compared to four other frameworks in the same category, it achieves better accuracy and performs well under low-parameter conditions. In the future, this technology will be able to assist drivers in safe driving.

Xin Shi, Fen Li, Guangqiang Lu, Yanjing Xie, Fangyan Dong, Kewei Chen
D-mamba: A Mamba Based Model for DGA Domain Classification

The Domain Generation Algorithm (DGA) is an important part of C&C attacks. Attackers use DGA domains to establish communication between infected machines and C&C servers, and then manipulate user hosts to carry out malicious attacks. DGA domain name detection models based on Long Short-Term Memory (LSTM) networks and Transformers suffer from low accuracy and poor adaptability. To solve these problems, this paper proposes a D-mamba model based on the latest Mamba structure, which improves the detection ability of the model through a four-channel feature extraction algorithm. The experimental results show that, on the OSINT public dataset, the D-mamba model achieves a 98.97% F1 score for the binary classification task and a 68.75% F1 score for the multi-class classification task, both higher than those of other models.

Lingshan Kong, He Wang, Hongfeng Jia, Runsi Ma
Developing Deep Learning Algorithms for Adaptive Control in Prosthetics

This study focuses on enhancing prosthesis control via the integration of deep learning models and adaptive control methods. Our work combines VGG16, CNN, and RNN architectures to increase gesture detection from EMG data, permitting precise control of prosthetic limbs. We created and trained these models using a dataset of EMG measurements related to diverse hand gestures, obtaining great accuracy and resilience in gesture categorization. The adaptive control system, including real-time feedback and a PD controller, significantly increased prosthesis responsiveness, displaying considerable gains in operational accuracy. The findings underline the potential for these sophisticated models to change prosthesis control, enabling greater user experience and usefulness. This study is innovative in its coupling of deep learning with adaptive control for prostheses, emphasizing considerable gains over prior approaches. The ramifications of this study extend to enhanced prosthetic devices with more intuitive and effective control methods.

Zaman Mahdi Abbas Alabbas, Basim Ghalib Mejbel, Adel A. Abbas, Sazan Kamal Sulaiman, Ahmed Dheyaa Radhi, Saadaldeen Rashid Ahmed, Abdulghafor Mohammed Hashim, Talib Abidzaid Al-Sharify, Lal Hussain
Chinese Text-to-SQL Parsing Based on Relation-Aware Mechanism

When converting natural language questions into SQL queries with semantic parsing models, existing methods struggle to parse various unknown database structures. Encoding the relations within the database and aligning the columns in the database with the keywords in given natural language questions are key challenges for existing text-to-SQL methods to achieve generalization. Furthermore, the majority of research related to text-to-SQL tasks is currently based on English datasets, with very few methods tailored to Chinese questions. To address these challenges, we propose a text-to-SQL method based on a relation-aware self-attention mechanism. It utilizes multilingual BERT for initial word embedding and integrates both local and non-local relations distinctively with a line graph. Experiments conducted on the Chinese dataset CSpider demonstrate the effectiveness of the proposed method. It achieves an exact-match accuracy of 51.9%, an absolute improvement of at least 5% over other existing schema encoding and linking text-to-SQL models.

Qixiang He, Hongyu Guo
Constant-Time Discrete Gaussian Sampling for Edge Computing Based on DPWGAN

Edge computing reduces latency and optimizes resource allocation by moving data processing to the network’s edge, making it well suited to tasks like post-quantum encryption. Lattice cryptography, particularly discrete Gaussian sampling, is crucial in this field. The mainstream approach for discrete Gaussian sampling involves generation trees, which require storing table information, leading to unnecessary memory usage. Balancing computational and memory loads while ensuring time consistency is crucial for post-quantum blockchain applications. Additionally, ensuring consistent sampling times for binary trees of different depths is challenging. Accordingly, we propose a generative model that builds upon Generative Adversarial Networks (GAN) by incorporating rigorous privacy guarantees and parameter obfuscation, fully utilizing the fixed structure of generative models so that sampling remains constant-time. In extensive experiments, including a comparison with Falcon, a NIST Round-3 finalist for the post-quantum digital signature standard, simulation results indicate that the proposed method can achieve enhanced security in the signature process while keeping computational overhead within manageable limits.

Jingbin Shi, Ning Li, Feixiang Li, Mingzhe Liu, Xige Zhang
EleKAN: Temporal Kolmogorov-Arnold Networks for Price and Demand Forecasting Framework in Smart Cities

In smart cities, the demand forecasting problem is inherently dynamic and difficult to predict. A novel forecasting framework, EleKAN, is proposed that utilizes Temporal Kolmogorov-Arnold Networks (TKANs), which are designed to deliver enhanced efficiency, accuracy, and resilience in multistep time series predictions. TKANs integrate the interpretability and performance strengths of Kolmogorov-Arnold Networks (KANs) with the temporal dependency management capabilities of recurrent architectures, effectively addressing the limitations of existing models. The proposed framework is applied to a benchmark Australian New South Wales (NSW) electricity market dataset, which includes features such as total demand, demand delay, power consumption, and temporal factors such as weekdays, holidays, and time intervals. Experimental results demonstrate that EleKAN achieves a Root Mean Square Error (RMSE) of 0.0030 and an R-squared ($R^2$) score of 0.9757, significantly improving the precision and dependability of power demand estimates. A percentage improvement of 10.9%, 6.1%, and 9.7% is obtained in the $R^2$ score over LSTM and GRU respectively. This makes it particularly valuable for long-term energy management and planning in smart cities.

Pronaya Bhattacharya, Tamoghna Mukherjee
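
For readers unfamiliar with the two metrics quoted in the EleKAN abstract above, their standard textbook definitions are given here. The symbols are notation assumed for illustration and are not taken from the paper: $y_i$ is the observed value, $\hat{y}_i$ the prediction, $\bar{y}$ the mean of the observed values, and $n$ the number of samples.

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$

$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}$$

A lower RMSE and an $R^2$ closer to 1 both indicate a better fit. RMSE depends on the scale of the target, so a value such as 0.0030 is only meaningful relative to how the data were normalized.
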
A Deep Reinforcement Learning-Based Topology Optimisation Method for Distributed Trial Networks

In a distributed node network, the network topology affects important performance metrics, including link utilisation, throughput and latency. In large-scale distributed simulation experiments across geographical areas, the network transmission conditions between different distributed nodes vary greatly. To improve the performance of the whole test system, a better test node deployment scheme is needed. In practice, manual adjustment is usually used for optimisation, which cannot guarantee finding the optimal solution. In this paper, we introduce a deep reinforcement learning (DRL) approach aimed at addressing the node deployment challenge in cross-domain collaborative experiments. We employ the Advantage Actor-Critic algorithm (A2C) to optimize the node deployment strategy and identify the most effective solution. The A2C comprises three components: a Validator responsible for validating the accuracy of the generated network topology, a Graph Neural Network (GNN) designed to efficiently approximate topology ratings, and a DRL actor layer. We tested the method in simulation based on a real experimental scenario, and the experimental results demonstrate the feasibility of using the A2C algorithm to solve such problems.

Zhuo Wang, Mingzhe Liu, Feixiang Li, Honglei Yin
Control System for Rehabilitation Bionic Hand Based on Precise Control Algorithms

In recent years, the disease burden of stroke has shown an explosive growth, and hand function rehabilitation training equipment has shown a broad application prospect. The current research of rehabilitation gloves focuses on the realization of preset rehabilitation actions, but the existing rehabilitation bionic hand control system often does not consider the motor drive of finger motion trajectory and the response speed of the device, resulting in the problem of insufficient accuracy and large delay. To improve the rehabilitation effect, this paper developed a rehabilitation bionic hand control system based on the PID algorithm and asynchronous concurrent processing, including a special drive circuit board, Hall-type reducer motor, asynchronous concurrent program and PID algorithm. This system can control the bionic rehabilitation glove to complete the preset action and provide support for rehabilitation treatment. Experimental results indicate that the system has high accuracy and low delay, which meets the requirements of rehabilitation treatment and shows considerable practical application prospects.

Rongben Zhai, Yunhan Gao, Guangdi Li, Qi Ding, Yang Zhang, Wenli Zhang
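
The PID algorithm at the core of the bionic hand control system above can be sketched as a discrete-time loop. This is a minimal illustration under stated assumptions, not the authors' implementation: the class `PID`, the helper `simulate`, the gains, and the toy plant (a joint angle that changes at a rate equal to the control output) are all hypothetical, and the asynchronous concurrent processing mentioned in the abstract is not modeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PID:
    """Textbook discrete PID controller (hypothetical gains, no anti-windup)."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: Optional[float] = None

    def update(self, setpoint: float, measured: float, dt: float) -> float:
        if dt <= 0:
            raise ValueError("dt must be positive")
        error = setpoint - measured
        self.integral += error * dt
        # No derivative term on the first sample, since there is no history yet.
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate(pid: PID, setpoint: float, steps: int, dt: float) -> float:
    """Drive a toy plant whose angle changes at a rate equal to the control output."""
    angle = 0.0
    for _ in range(steps):
        angle += pid.update(setpoint, angle, dt) * dt
    return angle


if __name__ == "__main__":
    # Proportional-only control of the toy plant toward a target angle of 1.0.
    print(simulate(PID(kp=2.0, ki=0.0, kd=0.0), setpoint=1.0, steps=1000, dt=0.01))
```

In a real glove, `measured` would come from a position sensor on each finger motor and the output would set the motor drive level; how the paper maps these quantities is not stated in the abstract.
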
Research on Spatiotemporal Dynamic EIT Optimization Algorithm Based on Tensor Nuclear Norm

Spatiotemporal dynamic Electrical Impedance Tomography (EIT) is an emerging non-invasive imaging technique that reconstructs internal structures by measuring tissue conductivity distributions. This study utilizes the tensor nuclear norm (TNN) method based on tensor singular value decomposition (t-SVD) to leverage low-rank and sparse priors for capturing spatiotemporal characteristics, overcoming the limitations of traditional EIT algorithms in dynamic imaging. The alternating direction method of multipliers (ADMM) is employed to solve the optimization problem, emphasizing spatiotemporal continuity and achieving precise reconstruction of dynamic conductivity changes. Simulations and experiments, particularly in lung respiration imaging, demonstrate the superior performance of the TNN method compared to the conjugate gradient least squares (CGLS) method. Results show that TNN excels in noise reduction and image continuity, providing a novel approach for dynamic EIT monitoring applications.

Tao Zhang, Qi Wang, Zichen Wang
Advancing Speech Emotion Recognition for Urdu: Methodological Developments in Low-Resource Contexts

In the realm of Speech Emotion Recognition (SER), addressing the unique challenges presented by linguistically diverse and low-resource environments is critical yet often overlooked. This study advances SER for Urdu, a notably underrepresented language in computational linguistics, marking a significant stride toward linguistic inclusivity in intelligent systems. We introduce a pioneering combination of Mel-Frequency Cepstral Coefficients (MFCCs) with Convolutional Neural Networks (CNNs) and Gated Recurrent Unit (GRU) technologies to capture the emotional subtleties within Urdu speech. Our research employs the SEMOUR+ dataset for comprehensive analysis and incorporates cross-validation with an additional Urdu dataset using both GRU and Random Forest models to evaluate robustness. The results demonstrate a significant enhancement in SER accuracy, achieving up to 84.92%. Notably, the proficiency in identifying happiness as a test case highlights the model’s real-world applicability. This work not only furthers the development of SER frameworks for Urdu but also establishes a foundation for similar advancements in other low-resource languages, underscoring the crucial role of artificial intelligence in overcoming linguistic boundaries in emotion detection.

Muhammad Adeel, Zhiyong Tao
AIChainDNS: A Framework for Optimizing DNS Security Through Blockchain and Machine Learning

As a critical element of the internet’s framework, the Domain Name System (DNS) consistently faces security vulnerabilities. This paper proposes the AIChainDNS framework, which integrates the decentralization and consensus mechanisms of blockchain with the advanced pattern recognition capabilities of deep learning to enhance DNS security. The framework’s efficacy is validated through extensive tests, in which AIChainDNS exhibits superior performance: on the NSL-KDD dataset, the model demonstrates an anomaly detection precision of 98.26%, and on the Bytedance DNS Dataset (BDD), it attains a 94.56% accuracy rate. These results demonstrate AIChainDNS’s potential to better satisfy the security demands of elite internet content providers in real-world scenarios.

Lingshan Kong, Runsi Ma, Hongfeng Jia, He Wang
A Heuristic Algorithm for Solving the MTSP Problem Based on DBSCAN Clustering Algorithm and CNGA Algorithm

The Multiple Traveling Salesman Problem (MTSP) is an extension of the Traveling Salesman Problem (TSP), aimed at finding the optimal routes for multiple salesmen to minimize the total journey. Genetic algorithms, as a heuristic optimization method, have shown good performance in solving MTSP. The heuristic algorithm presented here, called DCNGA (DBSCAN clustering algorithm & C-N-GA), is based on C-N-GA (an improved genetic algorithm) combined with the DBSCAN algorithm. Cities are first grouped into multiple clusters by the DBSCAN algorithm, and the TSP is then solved for each cluster individually to obtain local optimal solutions, which are combined to form the global solution. The five experimental results provided in this article indicate that the algorithm can effectively find the optimal salesman route solution, improve search efficiency by about 2% to 4%, and reduce calculation time by about 10% to 20%. It is suitable for solving mid-term strategic planning problems in practical applications.

Guo Chen, Yiwen Cai, Guangqiang Lu, Yanjing Xie, Fangyan Dong, Kewei Chen
Research on the Application of Deep Neural Network in the Classification of EEG Data of Major Depressive Disorder and Bipolar Disorder and Optimization Strategy

With the increasing impact of affective disorders on global health, accurate identification of major depressive disorder (MDD) and bipolar disorder (BD) is crucial. The symptomatic similarity between these disorders complicates treatment decisions, affecting patient outcomes and long-term management. To overcome the limitations of traditional symptom-based diagnostics, this study integrates multi-source datasets, analyzing clinical and high-precision EEG data from 200 MDD and 200 BD patients. We extracted 360 metrics using feature engineering techniques and applied various machine learning algorithms, including random forest, support vector machine (SVM), and neural networks. Specifically, we addressed the temporal characteristics of Electroencephalogram (EEG) data by introducing Recurrent Neural Network (RNN), Long Short-Term Memory Network (LSTM), and Transformer models. The Fully Connected Neural Network (FCNN) was selected as the optimal model due to its superior performance in accuracy, specificity, and sensitivity. We also emphasized the impact of model generalization on clinical applications, exploring variable importance and sub-band contributions in EEG. To prevent overfitting from large EEG data inputs, we proposed corresponding strategies and improvements. Our findings improve the classification accuracy of MDD and BD and highlight the value of EEG features in mental disorder identification, providing a foundation for developing efficient automated diagnostic tools and advancing precision medicine in affective disorder treatment.

Zhuozheng Wang, Bingxu Chen, Xixi Zhao, Xinyu Liu, Xiaoyun Liu
Intelligent Warehouse Based on Radio Frequency Identification and Recurrent Neural Network

This article explores the application and importance of intelligent warehouse management systems in modern logistics management. With the rapid development of global trade and the continuous growth of logistics demand, traditional warehouse management methods are no longer able to adapt to the growing market demand. Therefore, this article delves into how to use Recurrent Neural Network (RNN) and Radio Frequency Identification (RFID) technology to build an intelligent warehouse management system. Through the application of RFID technology, the material in and out process is automated, greatly improving the efficiency and accuracy of warehouse operations. Meanwhile, as a powerful sequence model, RNN can accurately predict the future demand for materials based on time series data, providing decision support for warehouse managers and helping them flexibly adjust inventory strategies to cope with market fluctuations. By combining RFID and RNN technologies, this article constructs an intelligent warehouse management system that can monitor logistics in real-time, accurately predict demand, and make intelligent decisions. This system not only helps to reduce logistics costs and improve inventory turnover, but also enhances customer service levels to meet the urgent market demand for efficient logistics. In addition, this article also explores the advantages of RNN in the field of intelligent warehouse management.

Fei Dai, Feng Li, Jianhua Ma, Qiuli Chen
FedPA: Unbiased Prototype Alignment in Federated Learning to Mitigate Feature Skew

Although traditional federated learning (FL) methods excel in addressing label skew issues through local model updates and personalized learning strategies, they fall short in handling feature skew, which limits the model’s generalization capability across different domains. We propose FedPA, an innovative framework designed to significantly enhance model generalization performance in feature-skewed environments by utilizing prototypes to extract domain-invariant features. FedPA employs advanced prototype aggregation strategies and a feature adapter network trained end-to-end on the server side, ensuring the extraction of consistent, unbiased prototypes. On the client side, it enhances model accuracy and stability through alignment with these unbiased prototypes. Experimental evaluations confirm that FedPA significantly surpasses existing FL methods in terms of accuracy and robustness across multiple datasets. Notably, on the Digits-5 dataset, FedPA achieves a 5% improvement in accuracy compared to the standard FedAvg method and other advanced techniques. Furthermore, on the Office-Caltech-10 dataset, the global model’s accuracy is enhanced by 11.9%. These findings underscore FedPA’s effectiveness as a solution for cross-domain applications in FL.

Qiuli Chen, Xin Chen, Feng Li, Xiangli Yang
A Novel Method for Calculating Depression Level Based on Hybrid Neural Networks and Subjective Scales

With the increase of social pressure, depression has become a common psychological disorder. Traditional self-assessment scales for diagnosis are highly subjective, whereas electroencephalography (EEG) is a commonly used objective assessment tool for depression. To increase the precision of the widely used algorithms for the diagnosis of depression, this paper uses a combination of one-dimensional convolutional neural networks (1D-CNN) and gated recurrent units (GRUs) to extract local and temporal features of the EEG signal. Experiments show that the 1D-CNN-GRU model performs better than either single-network algorithm: its accuracy on the public dataset is 98%, which is 4% and 3% higher than the 1D-CNN and GRU models, respectively. To obtain a more accurate index of depression level, this paper uses different methods to integrate the subjective PHQ-9 self-assessment scale results with the objective depression level scores obtained from 1D-CNN-GRU. The final score is fine-tuned on the basis of the integrated score. This work can achieve more accurate detection of depression and assist doctors in diagnosis.

Zhuozheng Wang, Keyuan Li, Xixi Zhao
Multi-user Semi-quantum Private Information Retrieval Using Bell States

As a practical kind of quantum cryptography protocol, quantum private information retrieval (QPIR) can protect the privacy of both the user and the database while a user retrieves an entry from the database. Most existing QPIR protocols support retrieval by only a single user, so when multiple users expect to retrieve the same entry, the protocol must be repeated between each user and the database. To address the specific scenario in which multiple users in a power grid system need to securely obtain the same entry through distributed terminals, this paper proposes a new multi-user semi-quantum private information retrieval protocol. The protocol is based on the Bell state, which is simpler than a multi-particle entangled state. It also does not require the distributed terminals to have full quantum capabilities, which reduces the difficulty of protocol implementation. In addition, we analyze the security of the protocol to show that it can effectively prevent the data associated with sensitive information from being leaked, and that both user privacy and database privacy are guaranteed.

Baoping Xu, Bo Zhang, Zilong Han, Huairen Yang, Ding Xing, Zhao Dou
Edge Computing Scheduling Algorithms Based on Reinforcement Learning

Computational resource scheduling, through the rational allocation of resources, can significantly enhance the quality of service and user experience and prevent resource overload and wastage, and it is crucial for improving the efficiency of edge computing resource management. This paper proposes an edge computing power scheduling algorithm based on reinforcement learning, which, combined with business demand forecasting, allows the scheduling strategy to anticipate and adapt to dynamic changes in business demands, achieving more precise computing power scheduling. A reward function is designed that takes into account four key factors: latency, edge hit rate, resource utilization balance, and business priority. Simulation experiments demonstrate that the proposed algorithm exhibits superior performance in reducing latency, improving edge hit rate, and enhancing the balance of resource utilization on edge servers.
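One way such a four-factor reward could be combined is a weighted sum of normalized terms, sketched below. The weights, the latency cap, the normalization of each term, and all names are assumptions for illustration; the abstract does not give the actual form of the reward function.

```python
def reward(latency_ms, edge_hit, utilizations, priority,
           weights=(0.4, 0.3, 0.2, 0.1), max_latency_ms=100.0):
    """Hypothetical scalar reward in [0, 1] built from four factors.

    latency_ms   -- observed task latency (lower is better)
    edge_hit     -- True if the task was served at the edge
    utilizations -- per-server utilization values in [0, 1]
    priority     -- business priority already scaled to [0, 1]
    """
    w_lat, w_hit, w_bal, w_pri = weights

    # Latency: 1.0 for zero latency, 0.0 at or above the cap.
    latency_term = 1.0 - min(latency_ms, max_latency_ms) / max_latency_ms

    hit_term = 1.0 if edge_hit else 0.0

    # Balance: population standard deviation of values in [0, 1]
    # is at most 0.5, so dividing by 0.5 maps it to [0, 1].
    mean_u = sum(utilizations) / len(utilizations)
    variance = sum((u - mean_u) ** 2 for u in utilizations) / len(utilizations)
    balance_term = 1.0 - min(1.0, variance ** 0.5 / 0.5)

    return (w_lat * latency_term + w_hit * hit_term
            + w_bal * balance_term + w_pri * priority)


best = reward(0.0, True, [0.5, 0.5], 1.0)
worst = reward(100.0, False, [0.0, 1.0], 0.0)
```

A weighted sum keeps the trade-off between the four factors explicit and tunable, at the cost of having to choose the weights by hand.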

Lingshan Kong, Zihao Wang, Qian Wu, Zhi Li
SDFMD: Spatial Domain Filtering for Accelerating Video Analysis

The exponential growth of video data necessitates more efficient video analysis systems. Many prior studies have focused on accelerating video analysis, and one representative method among them is frame filtering. However, traditional frame filtering methods focus on the temporal domain, neglecting the spatial domain. We introduce SDFMD, a lightweight system that filters frames spatially, enhancing processing speed over models like DETR. SDFMD operates in three stages: spatial frame filtering, feature extraction via a Transformer encoder, and analysis using a Transformer decoder to output categories and detection boxes. Experiments indicate that SDFMD doubles the encoding and decoding speed of videos and increases video analysis throughput by 9.43%, at the cost of a less than 20% reduction in video analysis accuracy.

Hebin Sun, Liu Wei, Meizhao Liu, Yingcheng Gu, Fei Xia, Kai Liu, Yu Song, Huanyu Cheng, Lei Tang, Sheng Zhang
A Bi-hierarchical Graph Based Approach for Insider Threat Detection

Existing graph-based insider threat detection models extract complex relationships between different behavioral entities through graph embedding or graph neural network models. However, each of these methods has its own advantages and disadvantages, resulting in unsatisfactory performance. Furthermore, the issue of class imbalance has not been adequately addressed in these approaches. To address these problems, this paper proposes a novel day-level insider threat detection method, called BHGITD (bi-hierarchical graph-based insider threat detection). In this model, we first construct a heterogeneous graph at the level of behavioral operations for each user based on system logs and propose a random walk algorithm applicable to this heterogeneous graph to extract the embedded representations between different behavioral operations. Then a homogeneous graph at the user-day level is constructed based on the organizational relationships among users, and a graph neural network model is used to mine the spatial representations of user-day behaviors. In addition, to handle data imbalance, we use a random under-sampling technique and a weighted cross-entropy loss function to enhance the model’s detection ability for malicious samples. The experimental results on the CERT r4.2 dataset show that the recall and F1 score of BHGITD reach up to 99.32% and 97.66% respectively, which significantly outperform previous works.
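The weighted cross-entropy loss mentioned above can be sketched in a few lines. This is a generic form of the loss, normalized by the total weight of the batch; the class weights and the toy probabilities are hypothetical, and the abstract does not state which weights BHGITD uses.

```python
import math


def weighted_cross_entropy(probs, labels, class_weights):
    """Mean weighted cross-entropy over a batch.

    probs         -- list of per-class probability lists, one per sample
    labels        -- list of true class indices
    class_weights -- {class index: weight}; a larger weight on the rare
                     (malicious) class makes its errors cost more
    """
    total, weight_sum = 0.0, 0.0
    for p, y in zip(probs, labels):
        w = class_weights[y]
        total += -w * math.log(max(p[y], 1e-12))  # clamp to avoid log(0)
        weight_sum += w
    return total / weight_sum


# Hypothetical weights: class 1 (malicious) counts ten times as much.
weights = {0: 1.0, 1: 10.0}
labels = [0, 1]
missed_malicious = weighted_cross_entropy([[0.9, 0.1], [0.9, 0.1]], labels, weights)
missed_benign = weighted_cross_entropy([[0.1, 0.9], [0.1, 0.9]], labels, weights)
```

With these weights, misclassifying the malicious sample yields a much larger loss than misclassifying the benign one, which is the intended effect under class imbalance.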

Rui Hou, Xiaolong Deng
Research on Contact Pressure-Based Fire Cupping Mouth Damage Detection Method

Fire cups are widely used in Traditional Chinese Medicine, and a cup in good condition is a basic requirement. However, there has been little research on detecting defects in the mouth of a fire cup. Most bottle inspection systems focus on uniform-sized bottles, and their methods are almost all based on camera images. The fire cup shows a complex image structure in its neck and mouth, which makes the recognition process difficult. This paper presents a new kind of inspection device that adopts a manual checking method with pressure sensation. A peak detection method using an artificial intelligence network is also presented. Experiments show satisfactory results.

Xiaomin Liu, Xinyu Wu, Xiaoman Wang, Yuxuan Peng, Chenxi Pan, Jingyao Xu, Simeng Li, Chuanglong Xu
A DQN-Based Method for Shortwave Source Localization

Shortwave source localization is a significant and pervasive aspect of research and development in many fields. Traditionally, staff identify the direction of arrival of a signal according to their own experience in shortwave signal direction-finding and localization. Recently, artificial intelligence technology such as deep learning has been applied to this work. However, deep learning requires a large number of signal samples and manual labeling. In our research, we propose a shortwave intelligent direction-finding and localization method based on reinforcement learning. It can reduce manual annotation and achieve autonomous evolution by learning while working. Our work therefore applies DQN to this task. Through training, the feasibility and effectiveness of our approach were demonstrated.

Qiyue Feng, Tao Tang, Zhidong Wu, Xiaojun Zhu, Ding Wang
Research on Autonomous Driving Applications of Multiple Aircraft Based on Artificial Intelligence

This study concentrates on the specific applications of artificial intelligence in civil aviation, with a particular focus on the autonomous scheduling of aircraft in the air. By using DBSCAN clustering algorithm to identify the terminal area trajectory pattern and combining it with the raster method to model the airspace in real time, the trajectory planning space is effectively reduced. Then, using the SARSA algorithm, autonomous aircraft heading adjustment in the approach control area is realized to effectively avoid conflicts. The research includes autonomous aircraft scheduling, terminal area definition, conflict resolution and trajectory prediction. By simulating the air traffic control environment, this paper not only verifies the advantages of the SARSA algorithm in improving the efficiency and safety of autonomous dispatch of aircraft, but also demonstrates the great potential of artificial intelligence technology in solving the problem of autonomous and safe dispatch of multiple aircraft types, especially in the Markov decision-making process.
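For reference, the standard tabular SARSA update is Q(s, a) ← Q(s, a) + α [r + γ Q(s′, a′) − Q(s, a)], where a′ is the action actually chosen in the next state. The sketch below applies that rule to a dictionary-backed Q-table; the state and action names, rewards, and hyperparameters are hypothetical and are not taken from the paper.

```python
def sarsa_update(q, state, action, reward, next_state, next_action,
                 alpha=0.1, gamma=0.9):
    """Apply one on-policy SARSA update in place and return the new value."""
    current = q.get((state, action), 0.0)
    # The target uses the action actually taken next (on-policy),
    # not the greedy maximum as in Q-learning.
    target = reward + gamma * q.get((next_state, next_action), 0.0)
    q[(state, action)] = current + alpha * (target - current)
    return q[(state, action)]


# Hypothetical example: a heading change that led to a conflict (reward -1).
q_table = {}
first = sarsa_update(q_table, "cell_3_4", "turn_left", -1.0,
                     "cell_3_5", "hold_heading")
```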

WenFei Dai, HongMing Han, ZhaoRui Zhang
Deep Learning in Medical Imaging for Early Disease Detection

Early illness diagnosis is critical for improving healthcare outcomes, and deep learning algorithms have shown promise in raising the accuracy of medical imaging. This research tackles the difficulty of accurate early diagnosis by the deployment of sophisticated deep learning models, notably ResNet-50 and Convolutional Neural Networks (CNNs). The study follows a systematic process starting with issue identification and literature assessment, continuing to the creation and deployment of these models using medical imaging datasets. The ResNet-50 model, recognized for its deep residual learning architecture, obtained an astounding accuracy of 92.5% in diagnosing early-stage disorders, proving its ability in discriminating between distinct ailments. This research also uses CNNs to further boost detection performance. The findings underline the potential of deep learning to transform early illness identification, giving considerable gains over previous techniques. The results add to the field by offering a rigorous framework for using deep learning in medical imaging, with implications for more accurate and quicker diagnosis in clinical practice.

Saadaldeen Rashid Ahmed, Talib Abidzaid Al-Sharify, Taif S. Hasan, Rawshan Nuree Othman, Bourair Al-Attar, Noor Haydar Shaker, Abdulghafor Mohammed Hashim, Abu Saleh Musa Miah, Lal Hussain
Computing Resource Allocation Pursuing Maximum Revenue for Computing Network

The Computing Resource Allocation Pursuing Maximum Revenue (CRA-PMR) algorithm is a novel approach to optimizing resource allocation in cloud computing environments. It intelligently identifies overlapping computing demands, enabling resource reuse and enhancing service reliability through dynamic load thresholding. By integrating energy consumption costs into revenue calculations, CRA-PMR aligns with sustainability goals, aiming for carbon neutrality. The algorithm’s performance is evaluated through simulations, demonstrating its effectiveness in maximizing revenue while maintaining service quality. This research contributes to the field by offering a comprehensive solution that considers resource efficiency, cost management, and environmental impact.

Shoufeng Wang, Huamin Chen, Ye Ouyang, Fan Li, Xuan Chen, Jianchao Guo, Zhidong Ren, Rongxing He
Lightweight FOMO-Based Fault Detection Algorithm

Aiming at the problem of low-efficiency fault detection in industrial production, this paper proposes a fault detection algorithm based on “Faster Objects, More Objects” (FOMO). This method is suitable for Tiny Machine Learning (TinyML) equipment. It can efficiently extract and classify the defect features in industrial products through Convolutional Neural Networks (CNN). In terms of algorithm application, this paper compares the performance of You Only Look Once version 5 (YOLOv5), Single Shot Multibox Detector (SSD) and FOMO in 3D printing defect detection. In the experimental environment built in this paper, the FOMO detection algorithm has significant advantages over the other detection algorithms in model size, detection speed, recognition accuracy and other aspects, and its detection accuracy and stability meet the requirements of practical application.

Zhikai Wang, Zhidu Li, Tingyin Zhao, Yongan Zheng
Video Moment Retrieval Based on Multimodal Information Fusion

Video data has become indispensable to people’s daily lives. Retrieving video clips relevant to user queries from massive datasets has become a significant research focus. However, current methods still have two major limitations: (1) independent moment encoding, which fails to consider the contextual information fully, (2) lack of utilization of fine-grained information, ignoring boundary information. This paper proposes a video clip retrieval method based on multimodal information fusion to address the above issues and improve retrieval accuracy, namely VRMF. This method derives video encodings that simultaneously integrate contextual, fine-grained, and user query information by exploring the relationships between candidate clips, fusing fine-grained cross-modal information of videos and user queries, and incorporating boundary information of different video clips. Finally, results are obtained through a scoring and ranking algorithm. We conduct experiments on three public datasets, showing that VRMF outperforms baseline models.

Lu Zhang, Xunyuan Liu, Yingxuan Guan, Ying Xing, Yaru Zhao
Theoretical Foundations and Research Directions of Storage-Calculation Fusion

Storage-calculation fusion technology aims to integrate storage and computing functions into the same generation of hardware units, in order to reduce data transmission between storage and computing units, decrease latency and power consumption, and enhance processing efficiency. In the application of power grids, this technology is particularly crucial. This paper outlines the three major application models of storage-calculation fusion technology: in-memory computing, in-storage computing, and storage-calculation integration, and analyzes the core algorithms and advantages. Additionally, the paper matches these algorithms with the characteristics and requirements of power application scenarios. Furthermore, this paper proposes that future breakthroughs should be achieved in both basic algorithmic research and the practical implementation of algorithms in application fields.

Yue Wang, Min Zheng, Chunpeng Wu, Qinghe Ye, Weiwei Liu, Fei Zhou
Carbon Trading Behavior-Based Strategy for Query Load Balance in Cloud Database

A cloud database uses thousands of nodes to exchange data in order to process query requests from different applications. Inevitably, some nodes that store hot records face highly frequent query requests while others are rarely visited or even idle. Therefore, how data are dynamically allocated and migrated at runtime has a crucial impact on the query load distribution and the system performance. In this paper, we simulate the management of carbon assets in the global carbon trading market, and regard the cloud database as the global carbon trading market, where data nodes are regarded as countries around the world and the query load as carbon quota. The global carbon trading market achieves balance through the trading of query load between countries, ultimately achieving carbon peak, which corresponds to global equilibrium in the cloud database. Therefore, we propose a Carbon Trading Behavior-based Strategy for query load balance in cloud databases, taking into account the computing capacity, disk volume, bandwidth, etc. We further apply this strategy to MongoDB and conduct experiments on a synthetic dataset. Experimental results show that the proposed strategy significantly enhances the query efficiency of MongoDB, with an efficiency improvement of over 65%.

Yingxuan Guan, Shubo Zhang, Qiuyue Cui, Minghao Yi, Binyang Li, Lin Deng
RNA-Protein Binding Site Prediction Based on Multi-scale CNN Convolution with Global Relationship

In light of the intricate, resource-intensive, and time-consuming characteristics associated with high-throughput experiments, the computational prediction of RNA-binding Protein (RBP) binding sites presents an efficacious strategy. Numerous methodologies have surfaced to forecast RNA-protein binding sites, typically necessitating the amalgamation of diverse features derived from annotated knowledge, including original RNA (Ribonucleic Acid) sequences and secondary structures. This integration serves to augment overall predictive performance. In this context, we propose an efficient and simple hybrid model, i.e., DeepCBA, which is only related to the original RNA sequences compared to other methods, thus simplifying the computational process. The model extracts local information and global context features with the help of multi-scale CNN convolution and BiLSTM (Bidirectional Long Short-Term Memory). Meanwhile, we make it possible to focus more on important features by introducing an attention mechanism in BiLSTM, which enables us to predict RNA-protein binding sites more efficiently. Empirical findings demonstrate that DeepCBA exhibits superior performance compared to contemporary methods in the realm of binding site prediction. This finding not only highlights the validity and performance advantages of the model, but also provides a powerful tool and method for future RNA-binding protein research. This means that DeepCBA is expected to be an important contribution to the field of RNA-binding protein research and to promote the further development of related fields.

Hui Yang, Jiawei Wang
A Bidirectional Entity Relation Joint Extraction Framework Based on Multi-layer Perceptron

Entity and relation extraction from text without a fixed format is a vital step in knowledge graph construction. Pipeline methods overlook the interaction between the two sub-tasks and are susceptible to error propagation. To tackle this issue, we propose a bidirectional entity relation joint extraction framework based on multi-layer perceptron (MLPBRE). This framework employs bidirectional extraction methods (S2T and O2T) to extract relations starting from both head and tail entities, and it enhances feature representation with a multi-layer perceptron. Experimental results show that MLPBRE achieves F1 scores of 92.7% and 92.8% on the NYT* and NYT datasets, and 93.2% and 89.7% on the WebNLG* and WebNLG datasets, respectively, surpassing current methods. The method particularly excels in handling sentences with overlapping triplets and those containing multiple triplets. The proposed approach significantly reduces error propagation between the two sub-tasks and enhances the overall performance of entity relation joint extraction.

Jiawei Wang, Hui Yang, Jingtao Li, Jiangkai Yan, Yukun Yan
Research on Deep Applications Based on Visual Recognition Technology

This paper conducts an in-depth study on the application of YOLO-v4, Faster R-CNN, and DenseBox neural networks for training deep learning models, using smart restaurants and dish recognition as examples. Unmanned settlement systems in smart restaurants can effectively reduce labor costs and improve operational efficiency. The core of an unmanned settlement system is the accurate identification of dishes. To address this issue, deep learning methods were used to recognize images from Chinese dish databases. YOLO-v4, Faster R-CNN, and DenseBox neural networks were selected for training deep learning models, and their recognition results were compared and analyzed. Experimental results show that the Faster R-CNN network model performs the best, with the ability to automatically extract image features, achieving a recognition rate of 95.4% for dishes, a recall rate of 82%, and an F1 value of 88.2%. This study provides a reliable foundation for the intelligent recognition of dishes and the application of smart catering.
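The reported F1 value can be checked against the other two figures, assuming that the "recognition rate" of 95.4% denotes precision (the abstract does not say so explicitly). F1 is the harmonic mean of precision and recall:

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Figures from the abstract; treating 95.4% as precision is an assumption.
f1 = f1_score(0.954, 0.82)
print(f"F1 = {f1:.3f}")  # prints "F1 = 0.882"
```

Under that assumption the three reported numbers are mutually consistent, since 2 × 0.954 × 0.82 / (0.954 + 0.82) ≈ 0.882.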

Yi Yu, Archival Sebial, Angie Ceniza Canillo
A Yolo-v7 Based Approach for Sperm and Noise Detection in Microscopic Videos

Accurate detection of sperms and impurities is a highly challenging task, primarily due to the small size of the targets, uncertain morphologies, and the presence of numerous impurities with varying sizes and shapes. Currently, the detection of sperms and impurities still relies heavily on traditional image processing and detection techniques, which have limited performance and often require manual intervention during the detection process, thus increasing time costs and injecting subjective bias into the analysis. Drawing inspiration from the numerous successful applications of deep learning techniques in various object detection challenges, we propose here an improved Yolo-v7 network model for sperm and impurity detection. This model employs Switchable Atrous Convolution (SAC) and the Wise-IoU loss function. Experimental results demonstrate that the highest AP50 for sperm and impurity detection are 95.1% and 62.4%, respectively. This significantly improves the detection accuracy of sperms and impurities, surpassing competitors and establishing state-of-the-art results on this issue.

Qingtao Meng, Chen Li, Ning Xu, Marcin Grzegorzek, Xinyu Huang, Hongzan Sun, Junxin Chen
Texture Features Based Microbiological Image Retrieval

Environmental problems related to microorganisms are receiving increasing attention. Compared with manual judgement and instrumental detection, processing microbial microscopic images for microbial retrieval using computer image processing techniques is a more effective method. In this paper, GLCM, GGCM, HOG, LBP and Gabor features of microbial microscopic images are extracted respectively. A microbial microscopic image retrieval method based on texture features is proposed using machine learning methods, and the retrieval effect of the fusion of two texture features is particularly analysed. By comparing the retrieval effect of multiple texture features, the HOG-based retrieval method has superior performance.

Han Yu, Bolin Lu, Xinyu Ouyang, Yuhang Yang, Yue Zhang, Haobo Meng, Marcin Grzegorzek, Xin Zhao, Chen Li, Hongwei Lei
A Novel Digital Twin System of Elderly-Care Robot for Housework Execution

With the increasing problem of population aging, more and more elderly people are inclined to choose home care. In order to improve the safety and independence of elderly home care, home care robots are viewed as a feasible solution, which can undertake some cumbersome and complex housework tasks. Therefore, this article proposes a novel digital twin system of elderly-care robot (ECRDTS) for housework execution for the first time. Firstly, the architecture of the digital twin system is proposed, which can accurately reflect the characteristics of the physical robot and enable the robot to achieve motion simulation, 3D state monitoring, and virtual-real synchronization. Secondly, ECRDTS was constructed in three dimensions of geometry, data, and robot task execution process, which realized high-fidelity mapping from physical space to digital space and laid the foundation for subsequent robots to autonomously execute housework tasks in a digital twin environment. Finally, the proposed method is applied to the task planning of robot-assisted cleaning of vegetables and fruits in the family kitchen scene, and the effectiveness of the method in practical application scenarios is verified. This research achievement not only meets the needs of elderly home care but also ensures the safety and independence of elderly living alone. Ultimately, this technology is expected to be widely used in nursing homes, hospitals, and home care tasks.

Xiufang Liu, Donghui Zhao, Jiahui Ding, Junyou Yang, Shuoyu Wang, Bowen Liu
Transformer for Brain Tumor MRI Decision Making in Smart City Health Services

Medical image segmentation is important for medical diagnosis, and convolutional neural networks (CNNs) have made significant progress in this area. However, CNNs have the disadvantage of focusing mainly on localized features. In contrast, the Transformer architecture is able to consider the entire input sequence and therefore captures the global contextual information of medical images more efficiently. In this study, we propose an approach that enhances image detail by first pre-processing with Contrast Limited Adaptive Histogram Equalization (CLAHE), and then combining U-Net with the ViT Transformer framework to further process MRI brain tumor medical images. The proposed method is an improved ViT-based U-Net. It achieved particularly outstanding results on the recognized BraTS2020 dataset, where the evaluation metrics of the proposed algorithm exceed 99%. The results on the MSD dataset, although slightly inferior to those on BraTS2020, still compare favorably with other algorithms.

Uzair Aslam Bhatti, Xinxin Sun, Mengxing Huang, Yu Zhang, Yuanyuan Wu, Huizhou Liu, Xu Bo, Siling Feng
Exploring the Path of Human-Robot Collaboration on Project Team Resilience by Digital Technology

In a volatile, uncertain, complex and ambiguous context, it is crucial to enhance team resilience through human-robot collaboration supported by digital technology. While increasing the resilience of the project team, artificial intelligence-based human-robot collaboration also introduces uncertainty into collaboration behavior, which weakens the ability of the human-robot team to cope with adversity in a complex and random operating environment. Based on resource-based theory and team process theory, and starting from the innovation that human-robot collaboration brings to traditional work scenarios and organizational models, a regression equation model is used to test the action path and influencing mechanism of human-robot collaboration on the resilience of project teams. An analysis of data from 105 project teams shows that human-robot collaboration has a significant positive impact on team resilience, that resource acquisition plays a significant mediating role in the relationship between human-robot collaboration and team resilience, and that human-robot trust plays a significant moderating role in the relationship between human-robot collaboration and project team resilience.

Shiying Shi, Xinxin Peng, Jingjing Zhang, Fangfang Zhao
Semi-supervised Remote Sensing Image Classification for Edge Computing via Contrastive Learning

Remote sensing images have great application prospects in multiple fields, and classification is the key to utilizing these images. Meanwhile, the recent growth in edge computing eases the high hardware requirements of machine learning methods. To address the unsatisfactory accuracy and high data dependency of existing remote sensing image classification, this manuscript establishes a semi-supervised contrastive learning image classification neural network model based on the SimCLR algorithm and suited to edge computing. The semi-supervised network architecture is conducive to reaching an optimum with unlabeled data, which is important given the shortage of labeled remote sensing data. By incorporating an interpolation mix module to optimize the feature space representation, the proposed model improves sample utilization and speeds up the convergence of the network. In addition, a local feature contrastive learning module extracts secondary features to enhance classification accuracy. The experimental results demonstrate that our model performs comparatively well on remote sensing image datasets and handles data insufficiency better.

Pengquan Liao, Ning Li, Mingzhe Liu, Kai Qu, Feixiang Li, Jinyi Chen
FaceWave: SVM and Wavelet Transformation-Based Face Recognition Framework for Women Cricketers

Advancements in machine learning (ML) have significantly enhanced image processing algorithms, leading to notable improvements in face recognition technology. The paper introduces a framework, FaceWave, which uses a support vector machine (SVM) model combined with wavelet transformation to extract vital facial features from female cricketers. The integration addresses the existing challenges of identifying facial features in context of lighting, posture, and angles. In the framework, raw images are first collected using the Google Chrome Fatkun addon. These images undergo a thorough preprocessing stage to enhance their quality and remove any noise or artifacts. Subsequently, wavelet transformation is applied to derive significant features from the facial images. The processed feature vectors are then utilized to train our SVM model. Our model demonstrates impressive accuracy, achieving 76% during training and 84% during testing. These results highlight the robustness of FaceWave in accurately identifying female cricketers, showcasing the potential of our customized approach in this specialized application.

Sachin Kumar, Vivek Kumar Prasad, Pronaya Bhattacharya, Pushan Kumar Dutta, Subrata Tikadar, Bharat Bhushan
IOT and AI Integration for Smart Prosthetic Limb Systems

This project studies the combination of the Internet of Things (IoT) with artificial intelligence (AI) to increase the functionality and flexibility of smart prosthetic limbs. The work takes a mixed-method approach, combining simulation-based modeling and real-world experimental testing, to design a prosthetic system that leverages advanced AI algorithms and IoT technology. Key techniques incorporate the use of machine learning for adaptive control and real-time data processing using IoT sensors. The results reveal considerable improvements in prosthetic limb performance, including increased responsiveness, user experience, and system dependability. Findings reveal that the combination of IoT and AI not only enhances prosthesis control but also provides a more customized and intuitive user interface. The study concluded that this new technique has the potential to transform prosthetic technology, presenting a roadmap for future research into further refinement and broader implementation of smart prostheses. Future work will focus on improving the AI algorithms, increasing real-world testing, and finding potential applications in customized healthcare.

Zainab Imad Altamimi, Ali Fenjan, Noor Talib Al-Sharify, Mohammad K. Abdul-Hussein, Ahmed Dheyaa Radhi, Saadaldeen Rashid Ahmed, Ola Farooq Jelwy, Abdulghafor Mohammed Hashim, Abu Saleh Musa Miah
Gesture Recognition Method Based on Hybrid Classifier Under Non-ideal Conditions

In the field of gesture recognition based on surface electromyographic (sEMG) signals, many deep learning methods are able to achieve high recognition accuracy on multiple gestures. However, in practical applications, sEMG gesture recognition is susceptible to interference from irrelevant gestures, leading to a decrease in recognition accuracy and affecting system stability. Therefore, identifying the category of target gestures while excluding interference from irrelevant gestures has become a hotspot issue in this field. This paper introduces a gesture recognition algorithm based on a hybrid classifier under non-ideal conditions. Firstly, we introduce and improve the multi-class classifier LST-EMG-Net to enhance the algorithm’s accuracy on target gestures. Secondly, we introduce EMG-FRNet as a single-class classifier to improve the algorithm’s accuracy on irrelevant gestures. Then, a feature correlation screening module is designed to intercept target gesture samples outside the single-class classifier, avoiding misjudgment by the single-class classifier as irrelevant gestures, thus enhancing the overall accuracy of the algorithm on target and irrelevant gestures.

Yufei Wang, Gongpeng Pang, Bo Liu, Yifan Li, Wenli Zhang
A New Modeling Method of Dynamic RCS of Cruise Missile

Cruise missiles are an important threat in modern air defense. Research on the scattering characteristics of cruise missile targets plays an important role in radar detection efficiency. In this paper, a new dynamic radar cross section (RCS) modeling method for cruise missile targets is proposed to solve the problem that static RCS data do not fit the radar detection scenario. Based on the scenario of an airborne early warning radar detecting a cruise missile flying at low altitude, the target model is simulated with CST software, and static omnidirectional RCS simulation data of the target under S-band operating conditions are obtained. By combining the simulated target trajectory with the radar trajectory, the radar line-of-sight angle is calculated by 3D coordinate transformation, and the dynamic RCS data of the cruise missile are obtained by searching the static RCS database with the line-of-sight angle. The simulation results can provide data support for capability evaluation, parameter optimization, and target identification in radar detection of cruise missiles.

Wei Han, Zehao Ye, Hao Tu, Juntao Liang, Yawei Song, Kai Yan
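
The lookup step described in the dynamic RCS abstract above (compute the line-of-sight angle, then search the static RCS database) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the yaw-only (level flight) attitude model, the 1-degree table grid, and the nearest-neighbour lookup are all assumptions made for illustration, and the table values are placeholders rather than simulation data.

```python
import math


def line_of_sight_angles(radar_pos, target_pos, target_yaw_deg):
    """Return (azimuth, elevation) in degrees of the radar seen from the target body frame.

    Assumption: the target flies level, so only yaw is needed for the rotation.
    """
    dx = radar_pos[0] - target_pos[0]
    dy = radar_pos[1] - target_pos[1]
    dz = radar_pos[2] - target_pos[2]
    yaw = math.radians(target_yaw_deg)
    # Rotate the world-frame vector into the body frame (rotation by -yaw about z).
    bx = math.cos(yaw) * dx + math.sin(yaw) * dy
    by = -math.sin(yaw) * dx + math.cos(yaw) * dy
    azimuth = math.degrees(math.atan2(by, bx)) % 360.0
    elevation = math.degrees(math.atan2(dz, math.hypot(bx, by)))
    return azimuth, elevation


def lookup_rcs(table, azimuth, elevation, step_deg=1.0):
    """Nearest-neighbour lookup in a static RCS table keyed by (az_index, el_index)."""
    az_bins = int(round(360.0 / step_deg))
    az_idx = int(round(azimuth / step_deg)) % az_bins
    el_idx = int(round(elevation / step_deg))
    return table[(az_idx, el_idx)]


# Hypothetical static table with two entries (values are placeholders).
static_table = {(0, 0): 0.05, (270, 0): 1.2}
az, el = line_of_sight_angles((1000.0, 0.0, 0.0), (0.0, 0.0, 0.0), 90.0)
dynamic_rcs = lookup_rcs(static_table, az, el)
```

Repeating the lookup for every time step of the two trajectories yields a dynamic RCS sequence. A real table would cover the full angle grid, and interpolation between grid points may be preferable to nearest-neighbour rounding.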
A Transformer-GRU-Based Edge Computing Method for Vessel Trajectory Prediction

Edge computing, an essential aspect of modern computing, is particularly well-suited for vessel trajectory prediction due to its ability to process vast amounts of real-time data, thereby overcoming the limitations of traditional centralized methods. Despite the existence of various prediction models, there is limited research focused on long-distance trajectory prediction. This study proposes a Transformer-GRU-based method to enhance the accuracy of long-distance vessel trajectory predictions. Historical AIS data for multiple vessels were collected, preprocessed using cubic spline interpolation, and validated through a combination of Transformer-GRU prediction and residual compensation. The proposed model outperforms seq2seq and GRU models, achieving mean squared error (MSE) values for longitude and latitude prediction errors below $$6\times {10}^{-5}$$ for long-distance predictions, thereby providing a more accurate approach for long-distance vessel trajectory prediction. Comparatively, the Transformer-GRU model improves performance by 36.51% over the GRU-only model, albeit with a time cost increase of 41.69%.

Yuhao Su, Mingzhe Liu, Feixiang Li, Honglei Yin, Chao Fang
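
The vessel trajectory abstract above mentions cubic spline interpolation as the preprocessing step for AIS data. The sketch below shows one common variant, a natural cubic spline, used to resample irregularly timed position reports. The abstract does not state which boundary conditions or resampling interval the authors use, so the natural boundary condition, the function name, and the sample values are assumptions.

```python
from bisect import bisect_right


def natural_cubic_spline(xs, ys):
    """Build a natural cubic spline through (xs, ys); xs must be strictly increasing."""
    n = len(xs) - 1
    h = [xs[i + 1] - xs[i] for i in range(n)]
    m = [0.0] * (n + 1)  # second derivatives at the knots; both end values stay zero
    cp = [0.0] * (n + 1)
    dp = [0.0] * (n + 1)
    # Forward sweep of the Thomas algorithm over the interior knots.
    for i in range(1, n):
        a, b, c = h[i - 1], 2.0 * (h[i - 1] + h[i]), h[i]
        d = 6.0 * ((ys[i + 1] - ys[i]) / h[i] - (ys[i] - ys[i - 1]) / h[i - 1])
        denom = b - a * cp[i - 1]
        cp[i] = c / denom
        dp[i] = (d - a * dp[i - 1]) / denom
    # Back substitution.
    for i in range(n - 1, 0, -1):
        m[i] = dp[i] - cp[i] * m[i + 1]

    def evaluate(x):
        # Locate the interval [xs[i], xs[i + 1]] that contains x (clamped at the ends).
        i = min(max(bisect_right(xs, x) - 1, 0), n - 1)
        left, right = xs[i + 1] - x, x - xs[i]
        return (m[i] * left ** 3 + m[i + 1] * right ** 3) / (6.0 * h[i]) \
            + (ys[i] / h[i] - m[i] * h[i] / 6.0) * left \
            + (ys[i + 1] / h[i] - m[i + 1] * h[i] / 6.0) * right

    return evaluate


# Hypothetical AIS samples: irregular timestamps (minutes) and longitudes (degrees).
times = [0.0, 1.0, 3.0, 4.0]
longitudes = [121.00, 121.02, 121.05, 121.09]
spline = natural_cubic_spline(times, longitudes)
resampled = [spline(t) for t in (0.0, 1.0, 2.0, 3.0, 4.0)]  # uniform 1-minute grid
```

Latitude would be interpolated the same way with its own spline, giving a uniformly sampled track that a sequence model can consume.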
Multimodal Semantic Fusion Network for Fake News Detection

Mitigating the dissemination of misinformation requires effective fake news detection. Existing models rely on simple global features and shallow language models for visual and textual feature extraction, making it difficult to capture deep semantic relationships. Furthermore, traditional models lack detailed fusion mechanisms, leading to inefficient filtering of redundant information and affecting overall performance. This paper proposes a multimodal fake news detection model, MSFND-Net, which combines advanced feature extraction and semantic relationship modeling techniques. MSFND-Net includes four components: feature extraction, internal semantic relationship modeling, cross-modal semantic relationship modeling, and classification. Visual features are obtained through Faster-RCNN combined with a pre-trained ResNet-101 model, whereas textual features are derived from the BERT model. The internal semantic relationship modeling module constructs semantic relationship graphs for visual regions and text words, using a graph attention network to capture these relationships. The cross-modal semantic relationship modeling module uses a two-layer graph attention network to process image and text features, forming a combined feature graph, which is further processed by a cross-modal graph neural network to capture complex cross-modal semantic relationships. The feature fusion attention module calculates the cosine similarity between textual and visual features, normalizes it to obtain cross-modal similarity scores, and adaptively reweights these features. Experimental results show that MSFND-Net performs excellently on several well-known fake news detection datasets, with overall accuracy improvements of 4.1%, 1.1%, and 3.8% on the Weibo, Politifact, and Gossipcop datasets, respectively.

Jiaqian Liu, Xiaolong Deng
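
The feature fusion attention step in the MSFND-Net abstract above (cosine similarity, normalization, reweighting) can be sketched in a few lines. The abstract does not give the exact normalization or reweighting rule, so the linear mapping of the similarity from [-1, 1] to [0, 1], the scaling of both modalities by the same score, and all function names are assumptions for illustration only.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_score(text_vec, visual_vec):
    """Assumed normalization: map the cosine similarity from [-1, 1] to [0, 1]."""
    return (cosine_similarity(text_vec, visual_vec) + 1.0) / 2.0


def reweight_and_fuse(text_vec, visual_vec):
    """Scale both modalities by the similarity score, then concatenate them."""
    score = similarity_score(text_vec, visual_vec)
    weighted_text = [score * t for t in text_vec]
    weighted_visual = [score * v for v in visual_vec]
    return weighted_text + weighted_visual


fused = reweight_and_fuse([1.0, 0.0], [0.0, 1.0])
```

In this toy case the two feature vectors are orthogonal, so the score is 0.5 and every feature is halved before concatenation. Aligned text and image features would pass through at full weight.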
YOLO-GCN Fusion: An Efficient Algorithms Framework for Abnormal Behaviors Detection

To address the difficulty that existing abnormal behavior detection methods have in balancing efficiency and effectiveness, we propose an efficient hybrid algorithm framework combining YOLO (You Only Look Once) and GCN (Graph Convolutional Network). The framework consists of two stages: detection and recognition. In the detection stage, an object detection algorithm and a multi-object tracking (MOT) algorithm are employed to initially detect abnormal actions in frames captured by video sensors. In the recognition stage, a pose estimation algorithm is used to extract skeleton data from suspicious video clips as the input to a graph convolutional network, which then determines whether the video clips contain abnormal behaviors. Experimental results show that the proposed framework achieves an average recognition accuracy of 95.87% and a recognition speed of 41 fps on a self-built dataset consisting of five types of behaviors, demonstrating its effectiveness in real-time detection and classification tasks for security.

Rongfa Li
A Study on an Intelligent Elevator Control Interaction System Integrating Palmprint Recognition and Gesture Recognition

The imperative pursuit of computer vision technology research lies in developing a more accurate, intelligent, swift, and secure gesture imaging recognition methodology. In this context, we have devised an advanced intelligent elevator control interaction system that seamlessly integrates gesture and palmprint recognition. The system employs the Multi-Feature Robust Alignment Technique (MFRAT) to extract distinctive palmprint features, utilizing point-pair regions for palmprint matching. Gesture recognition is achieved through a sophisticated hybrid modeling approach, incorporating MediaPipe and ResNet34. Subsequently, a dynamic frequency localization method is employed to translate and analyze specific elevator gesture commands, effectively executing user instructions and ensuring precise floor navigation. Experimental results underscore the system’s proficiency, with a recognition rate of up to 95.5% for single-frame gesture images and 92.84% for combined gesture images. By harnessing the diversity of gestures and the unique characteristics of palmprints, the system employs non-contact gesture imaging to facilitate seamless, efficient, and secure human-machine interaction between users and elevators, showcasing promising applications across various domains.

Jiayu Liu, Wei Jia, Jing Zhang
Differential Privacy Federated Graph Based Fraud Detection

Fake reviews are widespread across online platforms and are harmful to users, businesses, and the platforms themselves. In recent years, graph neural networks (GNNs) have been widely used in fraud detection. A graph neural network aggregates the neighborhood information of nodes through different relations to reveal the suspiciousness of nodes. However, previous work has trained models on a single client, without integrating data information from multiple clients through federated training, resulting in poor overall performance. Therefore, we propose a new framework called FGFD (Federated Graph Fraud Detection) that learns the characteristics of fake reviews on different platforms while protecting data privacy through federated learning, thereby improving overall performance. Specifically, we first design a GNN model, called GFD, and deploy it on all clients in the system. FGFD then trains GFD across many clients (such as different e-commerce platforms) under the coordination of a central service provider's server. GFD training is carried out only on the local client, and model parameters, not data, are exchanged between the clients and the central server, thereby protecting client data privacy. Through this exchange, GFD learns the characteristics of fake reviews on different clients, which improves overall performance. We verified the performance of the model on the Amazon and YELP datasets, where FGFD outperformed other graph neural network models.

Xiaolong Deng, Yunyun Dai, Tianxu Zhang
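
The parameter exchange described in the FGFD abstract above, in which clients train locally and a central server combines model parameters without seeing data, is often realized with federated averaging (FedAvg). The abstract does not name the aggregation rule, so FedAvg weighted by local sample count is an assumption here, as are the function name and the flat-list parameter layout. The differential privacy mechanism mentioned in the title is not modeled.

```python
def federated_average(client_params, client_sizes):
    """Sample-size-weighted average of per-client parameter vectors (FedAvg-style)."""
    total = sum(client_sizes)
    averaged = [0.0] * len(client_params[0])
    for params, size in zip(client_params, client_sizes):
        for i, value in enumerate(params):
            averaged[i] += value * size / total
    return averaged


# Two hypothetical clients (for example, two e-commerce platforms).
# Only parameters and sample counts reach the server; the raw reviews stay local.
global_params = federated_average(
    client_params=[[1.0, 2.0], [3.0, 4.0]],
    client_sizes=[1, 3],
)
```

The server would send `global_params` back to every client for the next round of local training.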
A Heterogeneous Multi-container Scheduling Mechanism Based on Deep Reinforcement Learning

Existing cloud computing container scheduling methods have significant deficiencies in resource utilization and performance. To address these issues and improve resource efficiency and cluster performance, we propose a heterogeneous multi-container scheduling mechanism based on Deep Reinforcement Learning (DRL). This technology employs an innovative scheduling strategy that utilizes the Proximal Policy Optimization (PPO) algorithm within DRL to achieve optimized resource allocation and enhanced utilization. It dynamically adjusts container deployment, reduces resource waste, and simultaneously enhances cluster performance and response speed.

Junjie Li, Wangdong Wu, Ling Wang, Lei Wang
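
The container scheduling abstract above names Proximal Policy Optimization (PPO) but gives no details of the state, action, or reward design. The sketch below therefore shows only the standard PPO clipped surrogate objective for a batch of samples. The function names and the clipping value of 0.2 are illustrative choices, not taken from the paper.

```python
def ppo_clipped_term(ratio, advantage, epsilon=0.2):
    """Clipped surrogate term for one sample (a quantity to be maximized).

    ratio is new_policy_prob / old_policy_prob for the action that was taken.
    """
    clipped_ratio = max(1.0 - epsilon, min(ratio, 1.0 + epsilon))
    return min(ratio * advantage, clipped_ratio * advantage)


def ppo_clipped_objective(ratios, advantages, epsilon=0.2):
    """Mean clipped surrogate objective over a batch of samples."""
    terms = [ppo_clipped_term(r, a, epsilon) for r, a in zip(ratios, advantages)]
    return sum(terms) / len(terms)


objective = ppo_clipped_objective([1.5, 0.5], [1.0, -1.0])
```

Clipping removes the incentive to move the probability ratio far from 1 in a single update, which is the usual reason PPO is considered a stable choice for this kind of policy training.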
Improved Constrained Weighted Least Squares TDOA/FDOA Passive Localization Method Based on Lagrange Multipliers

To address the positioning root mean square error (RMSE) and the poor adaptability of the positioning bias to measurement noise in current time difference of arrival/frequency difference of arrival (TDOA/FDOA) passive localization algorithms, an improved constrained weighted least squares method based on Lagrange multipliers is proposed. The positioning performance of the proposed method and of two-stage weighted least squares (TSWLS) is compared by computer simulation under near-field and far-field conditions, respectively. The simulation experiments show that the proposed algorithm positions both near-field and far-field targets well, and can meet the requirements of high-accuracy, real-time positioning under certain noise interference conditions.

Zheng Wang, Jianyu Yu
Parameter Estimation and Sorting Identification of FH Signals Based on Improved Connected Region Labeling

To address the difficulty of identifying small unmanned aerial vehicles in complex electromagnetic environments, a frequency hopping (FH) signal estimation and sorting method based on improved connected region labeling is proposed. An energy threshold statistical method based on local windows is adopted for denoising in the time-frequency domain. The conventional connected region labeling method is improved, and a new method for connected region fragment association and interference suppression is designed to solve the problems of connected region breakage and interference mixing. For mixed multi-FH signals, connected regions are reconstructed using time-frequency amplitude differences. Parameter extraction and signal sorting and identification are then performed on the improved connected region labeling map. Simulations show that in environments with noise, interference, and aliasing of multiple FH signals, the accuracy of signal parameter estimation and the probability of correctly identifying the target are significantly higher than with conventional connected region labeling methods.

Pei Zhu, Wei Han, ChengWei He, YingXin Xu, BuQiu Tian, LiangFa Hua
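
The FH signal abstract above builds on connected region labeling of a thresholded time-frequency map. As a point of reference, the sketch below implements only the conventional baseline, a 4-connected labeling of a binary mask by breadth-first search. It does not include the fragment association or interference suppression that the paper adds, and the function name and the toy mask are assumptions.

```python
from collections import deque


def label_connected_regions(mask):
    """Label 4-connected regions of truthy cells; returns (label_map, region_count)."""
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    labels = [[0] * cols for _ in range(rows)]
    current = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and labels[r][c] == 0:
                current += 1  # start a new region at this unlabeled cell
                labels[r][c] = current
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and labels[ny][nx] == 0):
                            labels[ny][nx] = current
                            queue.append((ny, nx))
    return labels, current


# Toy mask: rows are frequency bins, columns are time frames, 1 = energy above threshold.
tf_mask = [
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
]
label_map, region_count = label_connected_regions(tf_mask)
```

Each labeled region is a candidate hop, and the extent of a region along the time and frequency axes gives rough estimates of hop duration and occupied frequency, which is the kind of parameter extraction the abstract refers to.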
The Contribution of Sequence Features to the Intelligent Prediction of Essential Genes in the Plasmodium Falciparum 3D7 Genome

Theoretical prediction of essential genes in the Plasmodium falciparum 3D7 genome is of significant importance in the fight against malaria. Sequence features, which are general across organisms and easily accessible, can be used to predict essential genes. In this study, we used an ensemble model with two base estimators, an SVM model and an LGBM model, to measure the predictive ability of sequence features. The results show that the sequence features extracted by the w-Nucleotide Z Curve are able to predict essential genes. The prediction accuracy increased with the number of variables until w reached 3 or 4. As the number of variables increased further, redundant variables appeared. When the redundant variables were removed by the Recursive Feature Elimination method, the performance of the model could be improved.

Ronghao Li, Wenxin Zheng
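
The essential gene abstract above extracts sequence features with the w-Nucleotide Z Curve. The sketch below shows only the simplest case, the three mononucleotide Z-curve components computed from base frequencies. It does not implement the higher-order w-nucleotide variables used in the paper, and the function name is an assumption.

```python
def z_curve_features(sequence):
    """Return the frequency-based Z-curve components (x, y, z) of a DNA sequence.

    x: purine (A, G) versus pyrimidine (C, T)
    y: amino (A, C) versus keto (G, T)
    z: weak hydrogen bond (A, T) versus strong hydrogen bond (G, C)
    """
    seq = sequence.upper()
    counts = {base: seq.count(base) for base in "ACGT"}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or T")
    a, c, g, t = (counts[base] / total for base in "ACGT")
    x = (a + g) - (c + t)
    y = (a + c) - (g + t)
    z = (a + t) - (g + c)
    return x, y, z


features = z_curve_features("ATATTTAA")  # an AT-rich toy sequence
```

For an AT-rich sequence the third component is strongly positive. The toy sequence here is all A and T, so z equals 1.0.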
Backmatter
Metadata
Title
Proceedings of the 4th International Conference on Frontiers of Electronics, Information and Computation Technologies (ICFEICT 2024)
Editors
Weijian Liu
Qi Wang
Jinchao Feng
Wenli Zhang
Copyright Year
2025
Publisher
Springer Nature Singapore
Electronic ISBN
978-981-9653-18-8
Print ISBN
978-981-9653-17-1
DOI
https://doi.org/10.1007/978-981-96-5318-8