
Artificial Intelligence and Soft Computing

24th International Conference, ICAISC 2025, Zakopane, Poland, June 22–26, 2025, Proceedings, Part II

  • 2026
  • Book

About this book

This volume constitutes the proceedings of the 24th International Conference on Artificial Intelligence and Soft Computing, ICAISC 2025, held in Zakopane, Poland, during June 22–26, 2025. The 83 full papers presented in these proceedings were carefully reviewed and selected from 163 submissions. They are organized in the following topical sections: Part I: neural networks and their applications; fuzzy systems and their applications; evolutionary algorithms and their applications. Part II: computer vision, image and speech analysis; data mining; pattern classification; artificial intelligence in modeling and simulation. Part III: various problems of artificial intelligence; agent systems, robotics and control; bioinformatics, biometrics and medical applications; concurrent and parallel processing.

Table of Contents

Frontmatter

Computer Vision, Image and Speech Analysis

Frontmatter
An Autonomous Weeding Robot with Novel Unsupervised Domain Adaptation
Abstract
Weeding is a critical yet labor-intensive task in agriculture, where manual methods are inefficient and unsustainable for large-scale farming. To address this challenge, we propose two complementary components aimed at enhancing precision agriculture: (1) a custom-designed autonomous robotic platform for efficient field navigation and monitoring, and (2) a novel unsupervised domain adaptation (UDA) framework for robust crop and weed segmentation. The proposed UDA framework integrates two key modules, the contrastive learning module (CLM) for improved feature alignment and the enhanced fast Fourier transform module (EFFT) for stylistic adaptation, designed to mitigate domain gaps between source and target datasets. Extensive experiments on diverse agricultural datasets, including UAV-Bonn, UAV-Zurich, Sunflower, and Sugarbeet, demonstrate that the proposed method achieves significant improvements in mean Intersection over Union (mIoU), outperforming the MIC baseline by 15.52% and 13.22% in the Sunflower-to-Sugarbeet and Sugarbeet-to-Sunflower tasks, respectively, and delivering competitive results in UAV-Bonn-to-UAV-Zurich and superior performance in UAV-Zurich-to-UAV-Bonn. These results highlight the system’s potential for advancing sustainable and autonomous weeding in precision agriculture, with future integration of the UDA framework into the robotic platform enabling fully automated weed management.
Khubaib Ahmad, Dewa Made Sri Arsa, Michal Strzelecki, Jonghoon Lee, Hyongsuk Kim
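The EFFT module above is described only at a high level; the following is a minimal, generic sketch of Fourier-based stylistic adaptation in the same spirit, swapping the low-frequency amplitude of the source spectrum with that of the target while keeping the phase. The block-size parameter beta is an illustrative assumption, not the authors' design.

```python
# Minimal sketch of Fourier-based style adaptation between a source and a
# target image (generic, not the paper's EFFT implementation).
import numpy as np

def fourier_style_transfer(source: np.ndarray, target: np.ndarray, beta: float = 0.05) -> np.ndarray:
    """source, target: float images of identical shape (H, W, C), values in [0, 1]."""
    src_fft = np.fft.fft2(source, axes=(0, 1))
    tgt_fft = np.fft.fft2(target, axes=(0, 1))
    src_amp, src_phase = np.abs(src_fft), np.angle(src_fft)
    tgt_amp = np.abs(tgt_fft)

    # Swap a small centred low-frequency block of the amplitude spectrum.
    src_amp = np.fft.fftshift(src_amp, axes=(0, 1))
    tgt_amp = np.fft.fftshift(tgt_amp, axes=(0, 1))
    h, w = source.shape[:2]
    bh, bw = int(h * beta), int(w * beta)
    ch, cw = h // 2, w // 2
    src_amp[ch - bh:ch + bh, cw - bw:cw + bw] = tgt_amp[ch - bh:ch + bh, cw - bw:cw + bw]
    src_amp = np.fft.ifftshift(src_amp, axes=(0, 1))

    # Recombine the swapped amplitude with the original phase.
    stylised = np.fft.ifft2(src_amp * np.exp(1j * src_phase), axes=(0, 1)).real
    return np.clip(stylised, 0.0, 1.0)
```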
Swin-Editor: Enhancing Consistency in Text-Driven Video Editing
Abstract
Large visual models have recently made considerable progress in Text-to-Video generation thanks to the development of foundation models and multi-modal alignment techniques, making video generation more and more realistic. Current approaches predominantly rely on adapting image-based diffusion models via spatiotemporal attention, but this generally leads to temporal inconsistency and increased model complexity. This inconsistency stems mainly from the fact that those approaches are built on models originally designed for image generation, which therefore do not inherently account for the spatiotemporal nature of video. In this paper, we introduce Swin-Editor, an efficient approach to text-instructed video editing that expands a diffusion-based Text-to-Image model into Text-to-Video. Specifically, our focus lies in enhancing the visual quality of the generated videos by incorporating a spatiotemporally factorized video prediction mechanism into the diffusion model. Additionally, to reduce computational complexity and memory requirements, the proposed model includes a Vector Quantized Variational Autoencoder module intended to quantize and compress the spatiotemporal latent features. The proposed architecture achieves a good compromise across multiple evaluation metrics against state-of-the-art models in various scenarios. Project page: Swin-Editor.
Abdelilah Aitrouga, Youssef Hmamouche, Amal El Fallah Seghrouchni
A Study on Highly Efficient Compact Transformer Features for Histopathological Image Recognition
Abstract
Histopathological image recognition requires efficient feature extraction and dimensionality reduction to manage the complexity of the scans. For this, we propose hybrid models that combine TinyViT with the self-supervised learning capabilities of DINOv1 and dimensionality reduction techniques, achieving higher accuracy with improved computational efficiency and outperforming DINOv1 architectures. To enhance performance, feature reduction techniques such as PCA and NCA are employed to reduce feature sizes while minimizing accuracy loss and retaining critical information. Although DINOv1 exhibits state-of-the-art accuracy in general computer vision tasks, its performance on medical images at the studied magnification is limited. In contrast, TinyViT-based models offer a balanced solution for efficiently processing large histopathology scans with improved accuracy and reduced computational requirements, as our results show.
Didih Rizki Chandranegara, Przemysław Niedziela, Bogusław Cyganek
Accessibility of Graphical User Interfaces for Autistic People
Abstract
People on the autism spectrum have special needs when it comes to using computers. Designers of graphical interfaces dedicated to such users should therefore bear these limitations in mind. In this paper, a system that supports designers by automatically evaluating the accessibility of graphical interfaces is proposed. Image processing methods are used to extract features on which the introduced fit scale, allowing the evaluation of how well an interface is adapted to special preferences, is based. The presented approach facilitates the creation of interfaces intelligible to autistic people and raises awareness of their needs. The fit scale values were calculated for several game interfaces, and the obtained results were compared with a visual analysis performed by an expert working with autistic people.
Adam Górski, Aleksandra Świerczek, Grażyna Ślusarczyk
Is It Large Enough? A Prompt Learning for Large Multi-modal Models
Abstract
Large vision-language models (LVLMs), such as CLIP and Stable Diffusion, exhibit remarkable versatility in diverse applications but remain sensitive to poorly formed or ambiguous user queries. This sensitivity often results in suboptimal outputs, limiting their effectiveness in real-world scenarios. To address this challenge, we propose a novel approach that refines user queries to generate query-specific images. This work explores the integration of CLIP (Contrastive Language-Image Pretraining) and Stable Diffusion for generating high-quality, task-specific images from both generic and task-specific text prompts. We propose a novel framework where task-specific prompts (generated using a large language model) and generic prompts are encoded through CLIP’s text encoder, and task-specific images are encoded through CLIP’s vision encoder. To align these diverse embeddings, we employ contrastive learning, which optimizes the proximity of text embeddings to ensure that task-specific textual descriptions are coherently aligned with generic prompts. After the contrastive alignment, a text decoder converts the optimized embeddings into natural language descriptions, which are then fed into Stable Diffusion for image generation. This end-to-end pipeline facilitates the creation of diverse and contextually accurate images, enabling conditional image generation based on both generic and task-specific textual inputs. Our approach demonstrates significant promise in enhancing image generation tasks where text and image modalities are tightly coupled, offering an efficient solution for high-quality, context-aware image synthesis. We release code at https://github.com/s4nyam/isitlargeenough.
Sanyam Jain, Vijeta Sharma
Neural Network-Based Verification of Document Authenticity via Paper Structure and Scan Analysis
Abstract
This paper presents a method for document authenticity verification based on the analysis of scanned images and embedding representations generated by a deep learning model. The proposed approach leverages features of the paper microstructure and characteristics of the scanned document to classify its originality. The study employed an embedding model provided by the Cohere platform, while the classifier was built using a lightweight fully connected neural network. Experiments showed that even with a minimal number of training samples (one per class), the model could effectively distinguish original documents from their copies. The training process was fast and stable, completing in less than two seconds, which highlights the potential for integration into real-time systems. Moreover, it was demonstrated that the document profile can be permanently stored as an embedding vector, enabling later verification without the need to retain the original image. The proposed solution offers high accuracy, low computational demands, and strong resistance to forgery even when the original document is unavailable.
Mateusz Janik, Jakub Nowak, Paweł Drozda, Slava Voloshynovskiy
Fuzzy Attention Module for CNNs in the Application of Space Analysis
Abstract
Classifying gravitational wave signals is an essential task in analyzing data collected from space by advanced instruments such as interferometers. In this paper, we present a new convolutional neural network architecture that classifies gravitational wave spectrograms into a selected class. For this purpose, a novel attention module based on the fuzzy controller architecture is proposed. The mechanism is based on the generation of the query Q, key K, and value V matrices, where the first two are fuzzified by a Gaussian function and subjected to fuzzy inference. The inference results are then defuzzified (sharpened) and multiplied by the values in V. This solution allows the idea of a fuzzy controller to be used for analyzing features in neural networks. The model was tested and analyzed in terms of different evaluation metrics, which show that it can achieve better results than the state of the art.
Antoni Jaszcz, Adam Zielonka, Michał Wieczorek, Dawid Połap
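The abstract describes the fuzzy attention mechanism only verbally; the sketch below is a minimal PyTorch interpretation of it, with a zero-centred Gaussian membership, product-based rule inference, and normalisation standing in for defuzzification. The centres, widths, and inference rule are assumptions, not the authors' exact module.

```python
# Minimal sketch of a fuzzy-controller-style attention block (illustrative).
import torch
import torch.nn as nn

class FuzzyAttention(nn.Module):
    def __init__(self, dim: int, sigma: float = 1.0):
        super().__init__()
        self.q_proj = nn.Linear(dim, dim)
        self.k_proj = nn.Linear(dim, dim)
        self.v_proj = nn.Linear(dim, dim)
        self.sigma = sigma

    def gaussian_membership(self, x: torch.Tensor) -> torch.Tensor:
        # Fuzzification: Gaussian membership centred at zero (assumed form).
        return torch.exp(-(x ** 2) / (2 * self.sigma ** 2))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, tokens, dim)
        q = self.gaussian_membership(self.q_proj(x))
        k = self.gaussian_membership(self.k_proj(x))
        v = self.v_proj(x)
        # Product inference between fuzzified queries and keys, normalised so
        # the rule strengths sum to one (a simple defuzzification stand-in).
        rules = torch.bmm(q, k.transpose(1, 2))                    # (batch, tokens, tokens)
        weights = rules / rules.sum(dim=-1, keepdim=True).clamp_min(1e-8)
        return torch.bmm(weights, v)
```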
Dataset Augmentation for Detecting Small Objects in Fisheye Road Images
Abstract
In this work, we focus on detecting small road objects in fisheye cameras using the FishEye8K dataset. The images within the FishEye8K dataset pose significant challenges due to heavy distortion and blurring. We first thoroughly analyze the dataset and then design a data augmentation pipeline tailored specifically to simulate the characteristics of FishEye8K images in other datasets. Our approach involves using fisheye distortion alongside several pixel-level transformations, which we apply to other traffic-oriented datasets like VisDrone, UAVDT, and WoodScape. Additionally, we employ GAN-based data augmentation techniques to transform the original dataset, simulating multiple weather and lighting conditions. Finally, we conduct a comprehensive analysis to assess the suitability of typical small object detection methods for this particular problem domain. Our method was developed for AiCityChallenge2024, and we achieved an F1 score of 58.2% on the FishEye8K test-challenge subset with the Co-DETR model trained with our data augmentation pipeline. The code is available at https://github.com/deepdrivepl/aicity24-DDPL.
Aleksandra Kos, Karol Majek, Dominik Belter
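As an illustration of the distortion part of such a pipeline, the sketch below warps a rectilinear traffic image with a simple radial model so that it loosely resembles a fisheye view, leaving the characteristic dark corners. The mapping and the strength parameter are assumptions for demonstration; this is not the augmentation implementation used in the paper.

```python
# Illustrative radial distortion approximating a fisheye look.
import cv2
import numpy as np

def apply_fisheye_distortion(image: np.ndarray, strength: float = 0.4) -> np.ndarray:
    h, w = image.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float32)
    xn = (xs - w / 2) / (w / 2)                    # normalised coordinates in [-1, 1]
    yn = (ys - h / 2) / (h / 2)
    r2 = xn ** 2 + yn ** 2
    factor = 1.0 + strength * r2                   # stronger displacement towards the edges
    map_x = (xn * factor * (w / 2) + w / 2).astype(np.float32)
    map_y = (yn * factor * (h / 2) + h / 2).astype(np.float32)
    # Out-of-range samples become black, mimicking the dark border of fisheye frames.
    return cv2.remap(image, map_x, map_y, interpolation=cv2.INTER_LINEAR,
                     borderMode=cv2.BORDER_CONSTANT)
```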
Attention U-Net Image Deblurring via Hybrid Loss Optimization
Abstract
In this paper, we propose a novel loss-based optimization strategy for image deblurring, leveraging the Attention U-Net architecture and a hybrid training objective that enhances both perceptual and structural reconstruction quality. Our approach is specifically tailored to the HIDE dataset and achieves improved performance in two key metrics: Structural Similarity Index and Peak Signal-to-Noise Ratio. These metrics are essential for evaluating the quality of image restoration, particularly in machine learning and computer vision applications. Experimental results on the HIDE dataset show that our method achieves competitive results compared to previously published models, providing a meaningful advancement in the field of image deblurring.
Adam Kwaśnik, Tomasz M. Lehmann, Przemysław Rokita
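A hybrid deblurring objective of the kind described above can be sketched as a weighted sum of a pixel-wise L1 term and a structural (SSIM-based) term. The uniform-window SSIM approximation and the 0.84 weighting below are illustrative assumptions, not the exact loss used in the paper.

```python
# Minimal sketch of a hybrid L1 + SSIM training objective (illustrative).
import torch
import torch.nn.functional as F

def ssim(x: torch.Tensor, y: torch.Tensor, window: int = 7,
         c1: float = 0.01 ** 2, c2: float = 0.03 ** 2) -> torch.Tensor:
    # Local statistics via average pooling: a simplified, uniform-window SSIM.
    mu_x = F.avg_pool2d(x, window, stride=1, padding=window // 2)
    mu_y = F.avg_pool2d(y, window, stride=1, padding=window // 2)
    var_x = F.avg_pool2d(x * x, window, stride=1, padding=window // 2) - mu_x ** 2
    var_y = F.avg_pool2d(y * y, window, stride=1, padding=window // 2) - mu_y ** 2
    cov = F.avg_pool2d(x * y, window, stride=1, padding=window // 2) - mu_x * mu_y
    num = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return (num / den).mean()

def hybrid_loss(pred: torch.Tensor, target: torch.Tensor, alpha: float = 0.84) -> torch.Tensor:
    # alpha balances the structural term against plain pixel fidelity.
    return alpha * (1.0 - ssim(pred, target)) + (1.0 - alpha) * F.l1_loss(pred, target)
```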
A Predictive Model for Mean Opinion Score in Text-to-Image Quality Assessment
Abstract
In recent years, text-to-image generation models have gained immense popularity and widespread use. The increasing availability of computational resources has accelerated the development of more sophisticated methods, but this rapid growth has introduced a challenge: comparing models to determine which is best suited for a given task. Currently, this problem is often addressed through manual evaluations, where humans assess and rank model outputs. However, this approach is inefficient and time-consuming, requiring extensive human input and subjective judgment. With the exponential growth of text-to-image models, manually assessing each model’s output quality has become a Sisyphean task. Automated image quality assessment (IQA) models offer a promising alternative, enabling us to reduce reliance on subjective human evaluations and instead use predicted values as metrics that indicate how well a generated image may appeal to users.
In this paper, we extend our previous research on predicting the Mean Opinion Score (MOS) for image quality and propose a novel, efficient method for evaluating the quality of text-to-image generation models. Our approach uses a ConvNeXt-based architecture, representing an upgrade to previous solutions, and provides robust and innovative metrics applicable to a wide range of text-to-image generation tasks. This model improves the speed and reliability of quality assessments, offering a scalable solution to meet the growing demand for automated evaluation in the text-to-image generation space.
Tomasz M. Lehmann, Przemysław Rokita
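A minimal sketch of a ConvNeXt-based MOS regressor along these lines: a torchvision backbone is reused and its classification head is replaced by a single regression output. The backbone size, input resolution, and loss below are assumptions, not the authors' configuration.

```python
# Minimal sketch of a ConvNeXt regression head for MOS prediction (illustrative).
import torch
import torch.nn as nn
from torchvision import models

def build_mos_regressor(pretrained: bool = True) -> nn.Module:
    weights = models.ConvNeXt_Tiny_Weights.DEFAULT if pretrained else None
    model = models.convnext_tiny(weights=weights)
    # The last element of the classifier is the final linear layer.
    in_features = model.classifier[2].in_features
    model.classifier[2] = nn.Linear(in_features, 1)     # single predicted MOS value
    return model

model = build_mos_regressor()
scores = model(torch.randn(4, 3, 224, 224))             # (4, 1) predicted MOS
loss = nn.functional.mse_loss(scores.squeeze(1), torch.rand(4) * 4 + 1)  # targets in [1, 5]
```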
Vision Transformer Representations for Efficient Content-Based Image Retrieval
Abstract
Content-based image retrieval (CBIR) is one of the basic tasks of computer vision. Numerous studies have been conducted, leading to many groundbreaking methods based on deep neural networks and, more recently, on vision transformers (ViT). In this article, we propose a new CBIR method based on semantic features obtained with the original self-distillation with no labels (DINO) approach using a ViT, which are then additionally compressed using principal and neighbourhood component analysis. We show highly accurate results on non-trivial datasets such as Caltech-256, as well as on histopathological scans such as Kather and BreaKHis. Our method compares favorably with the best CBIR approaches while having very compact image representations.
Stanisław Łażewski, Bogusław Cyganek
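The retrieval pipeline can be sketched as follows: DINO ViT-S/16 features obtained via torch.hub are compressed with PCA and searched with a nearest-neighbour index. Image loading and normalisation are omitted, and the 32-dimensional target size and cosine metric are assumptions for illustration, not the paper's settings.

```python
# Minimal sketch of DINO-feature CBIR with PCA compression (illustrative).
import torch
from sklearn.decomposition import PCA
from sklearn.neighbors import NearestNeighbors

backbone = torch.hub.load("facebookresearch/dino:main", "dino_vits16")
backbone.eval()

@torch.no_grad()
def extract_features(batch: torch.Tensor) -> torch.Tensor:
    # batch: (N, 3, 224, 224), already normalised with ImageNet statistics.
    return backbone(batch)                          # (N, 384) CLS-token embeddings

gallery = extract_features(torch.randn(64, 3, 224, 224)).numpy()
query = extract_features(torch.randn(1, 3, 224, 224)).numpy()

pca = PCA(n_components=32).fit(gallery)             # compact image representation
index = NearestNeighbors(n_neighbors=5, metric="cosine").fit(pca.transform(gallery))
distances, neighbours = index.kneighbors(pca.transform(query))
```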
Intensification of Dermatological Asymmetry Measure of Skin Lesions Using PCA
Abstract
In this article, the authors address classification as a process that can be automated. The primary focus is the analysis of dermatological skin changes by leveraging expert knowledge and factor analysis based on structural geometric patterns associated with skin diseases. Furthermore, the study explores the feasibility of remote classification using a hierarchical model designed to optimize computational efficiency. Additionally, the classification process incorporates rejected principal components to identify relevant factors accurately.
Łukasz Wąs, Sławomir Wiak
The Role of Text and Vision Modalities in Solving Bongard Problems with VLMs
Abstract
Bongard Problems (BPs) pose a fundamental challenge in the abstract visual reasoning (AVR) domain. To solve a BP, one has to identify an abstract pattern that differentiates images from the left and right sides of the matrix, which requires strong perception and reasoning capabilities. In this paper, we investigate whether vision language models (VLMs) can solve BPs by considering three binary classification problem settings involving both vision and text modalities. Our experiments primarily involve 4 open-weight models: InternVL2, LLaVa-1.6, Phi 3.5 V, and Pixtral. We employ 4 BP datasets involving synthetic and real-world images: classic BPs, Bongard HOI, Bongard-OpenWorld, and Bongard-RWR. The conducted experiments show that certain models perform better in a decoupled setting, where panel descriptions are provided beforehand, suggesting their limitations in processing the vision modality. In contrast, other models achieve strong results in a uniform setup, which does not involve an explicit panel description step, indicating their strength in bi-modal reasoning. We conclude by summarizing emerging research directions in the field to further advance the abstract reasoning capabilities of VLMs.
Mikołaj Małkiński, Szymon Pawlonka, Jacek Mańdziuk
Leveraging Vision in Transformers Model for Point Cloud Pattern Matching
Abstract
In this article, we propose a method for comparing arbitrary sets of point clouds based on pattern descriptors. The method is specifically designed to process large outdoor 3D laser scans characterized by highly variable point density, which changes with distance from the scan center. To generate pattern descriptors, we utilize a modified encoder block from a Vision in Transformer model, trained with a differential loss function. The model is trained on scanner-specific data, enabling it to generalize to any scans without requiring retraining.
Patryk Najgebauer, Rafał Grycuk, Rafał Scherer
One Shot, Few Perspectives: Ensemble Learning for Image Segmentation
Abstract
The conventional approach to semantic segmentation necessitates training models on extensive datasets, a process that is often resource-intensive and time-consuming. Few-shot learning methods, by contrast, employ previously trained models to rapidly adapt to novel, unseen classes. These methods utilize a limited set of k samples to establish prototypes representing the novel class, guiding the model’s predictions and facilitating iterative weight adjustments in alignment with this foundational structure. In this study, we build on these strengths, augmenting the proposed system with multiple neural architectures incorporating attention modules. Specifically, we employ a 1-shot learning strategy across k different models (with k=3 in our experiments), whose aggregated results enable a comprehensive representation of the novel class’s features with minimal data support. The conducted experiments have shown that a properly selected consensus method can have a positive impact on the obtained segmentation results.
Katarzyna Prokop, Jakub Siłka, Dawid Połap, Katarzyna Wiltos
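A minimal sketch of prototype-based 1-shot segmentation with an ensemble of k feature extractors, roughly following the description above: each model builds a foreground prototype by masked average pooling of its support features, scores query pixels by cosine similarity, and the per-model masks are fused by majority vote. The backbones, the similarity threshold, and the voting consensus are placeholders, not the paper's exact design.

```python
# Illustrative 1-shot prototype segmentation with ensemble voting.
import torch
import torch.nn.functional as F

def predict_mask(features_support, mask_support, features_query, threshold=0.7):
    # features_*: (C, H, W); mask_support: (H, W) binary mask at feature resolution.
    fg = mask_support.unsqueeze(0)                                          # (1, H, W)
    prototype = (features_support * fg).sum(dim=(1, 2)) / fg.sum().clamp_min(1.0)  # (C,)
    similarity = F.cosine_similarity(features_query,
                                     prototype.view(-1, 1, 1), dim=0)       # (H, W)
    return (similarity > threshold).float()

def ensemble_segment(models, support_img, support_mask, query_img):
    # models: k different feature extractors returning (B, C, H, W) maps.
    votes = []
    for model in models:
        fs = model(support_img.unsqueeze(0)).squeeze(0)
        fq = model(query_img.unsqueeze(0)).squeeze(0)
        votes.append(predict_mask(fs, support_mask, fq))
    # Majority vote across the k per-model predictions.
    return (torch.stack(votes).mean(dim=0) > 0.5).float()
```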
Comparison of the Effectiveness of Optical Flow Methods in Determining the Velocity of a UAV
Abstract
The study focuses on comparing the effectiveness of optical flow methods in determining the velocity of an Unmanned Aerial Vehicle (UAV). The research aims to develop a method that enables precise monitoring of UAV movement without using external sensors, such as GPS. The study demonstrates how image analysis can be an alternative to traditional methods, particularly when GPS signal access is limited. The implementation and testing conducted on a developed demonstrator confirmed the method’s effectiveness, opening new perspectives for UAV applications in various fields.
Jakub Walczak, Janusz Furtak
Fine-Tuned Data Augmentation Techniques for Digit Recognition in AIoT
Abstract
In the era of smart cities and AIoT infrastructure, deploying efficient machine learning models on resource-constrained edge devices has become critical for urban utility management. This paper evaluates the effectiveness of various data augmentation techniques in enhancing the performance of machine learning models on these devices despite their limited computational resources. Our study utilizes three datasets of digit images: one from Kaggle, one from SCUT, and a proprietary dataset. We tested ranges of parameters for data augmentation, including noise, brightness, contrast, and geometric transformations, to assess their impact on model accuracy. The findings indicate that while augmentation generally improves model performance, an optimal range exists beyond which accuracy may decline due to overfitting. This paper describes a standardized approach to parameter testing that contributes to the development of more efficient and accurate edge-based machine learning applications.
Marcelo Luis Walter, Juliano de Paulo Ribeiro, Alexandre Nodari, Gabriel Henrique Couto da Costa, Leonardo Nunes, Marcelo E. Pellenz, Edson Emilio Scalabrin
Segmented Medical Image Classification with Deep Convolutional Neural Networks Architectures for WBC Detection
Abstract
Automated medical image classification is essential to improve diagnostic precision, reduce the burden on clinicians, and accelerate disease detection. This study evaluates the performance of various convolutional neural network (CNN) architectures on small-scale segmented medical image datasets. Models were trained from scratch without pre-trained weights, using deterministic augmentation pipelines to ensure reproducibility. Xception achieved the highest accuracy of 96.73%, showcasing the strength of advanced architectures in medical imaging. ResNet-50 and VGG16 also performed well, with accuracies of 95% and 94.52%, respectively, demonstrating their effective balance of depth and feature extraction for this task. This paper systematically evaluates the performance of various deep learning models, offering a comparative analysis of their strengths and limitations when applied to a segmented medical image dataset.
Katarzyna Wiltos, Agnieszka Polowczyk, Alicja Polowczyk, Marcin Wozniak, Michał Wieczorek

Data Mining

Frontmatter
Deterministic and Nondeterministic Decision Trees for Decision Rule Systems from Closed Classes
Abstract
The study of the relationships between decision rule systems and decision trees is of considerable interest in computer science. In this paper, we consider classes of decision rule systems that are closed under the operation of attribute removal. For an arbitrary closed class, we study functions that characterize the dependence in the worst case of the minimum depth of deterministic and nondeterministic decision trees that solve the problem of finding all realizable rules in a decision rule system on the number of different attributes in this system. We prove that these functions are either bounded from above by a constant or grow linearly.
Kerven Durdymyradov, Mikhail Moshkov
Some Regularized Tools for Dimensionality Reduction
Abstract
Dimensionality reduction has become a commonly used part of the analysis of complex economic data. The aim of this work is to study the potential of regularized tools for dimensionality reduction and possibly to propose some novel regularized tools. Regularization in the form of shrinkage improves the numerical stability of such tools for high-dimensional data and reduces the variability of parameter estimates at the cost of introducing bias. Firstly, a robust regularized version of the coefficient of multiple correlation is proposed, which may be exploited within Minimum Redundancy Maximum Relevance supervised variable selection. Secondly, it is discussed that ridge regularization does not bring any modification to principal component analysis; this is also true for robust versions of principal component analysis.
Jan Kalina
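As an illustration of shrinkage applied to the coefficient of multiple correlation, the sketch below computes a ridge-regularised version R(lambda) = sqrt(r' (R_xx + lambda I)^(-1) r) from the sample correlation matrix. This is a generic, non-robust variant under an assumed form of regularisation; the robust estimator proposed in the paper is not reproduced here.

```python
# Minimal numpy sketch of a shrinkage-regularised multiple correlation coefficient.
import numpy as np

def regularized_multiple_correlation(X: np.ndarray, y: np.ndarray, lam: float = 0.1) -> float:
    # Correlation matrix of [y, X]; the first row/column relates y to the predictors.
    C = np.corrcoef(np.column_stack([y, X]), rowvar=False)
    r_xy = C[1:, 0]                               # correlations between y and each predictor
    R_xx = C[1:, 1:]                              # predictor correlation matrix
    reg = R_xx + lam * np.eye(R_xx.shape[0])      # ridge shrinkage of the correlation matrix
    r2 = float(r_xy @ np.linalg.solve(reg, r_xy))
    return float(np.sqrt(np.clip(r2, 0.0, 1.0)))
```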
New Approach to Instance Selection Leveraging Clustering and Neighboring Strategies
Abstract
The article presents a novel approach to instance selection, employing a two-step algorithm. Initially, the data is clustered using k-means or k-medoids to determine the optimal number of centers. Subsequently, the n nearest neighbors of these centers are identified and used as data instances. The method aims to investigate the effect of changing the parameter n on the reduced set size and on the performance of machine learning (ML) models in classification tasks. The Calinski-Harabasz, Davies-Bouldin, and Silhouette measures are utilized to find the optimal number of centers. Evaluation is carried out on 26 UCI repository datasets by comparing the ML model classification accuracy on the reduced data with that on the original dataset. The level of data reduction, influenced by the clustering methods, center determination approaches, and changes of the parameter n, is assessed. The results indicate a significant reduction in the volume of the dataset and an improved classification accuracy of the ML model compared to the full datasets and the literature benchmarks.
Maciej Kusy, Roman Zajdel
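The two-step selection procedure can be sketched directly with scikit-learn: cluster the training data with k-means and keep only the n nearest neighbours of each cluster centre as the reduced set. The number of clusters and n below are illustrative; the paper selects the number of centers using the Calinski-Harabasz, Davies-Bouldin, and Silhouette measures.

```python
# Minimal sketch of cluster-centre-based instance selection (illustrative parameters).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neighbors import NearestNeighbors

def select_instances(X: np.ndarray, y: np.ndarray, n_clusters: int = 10, n: int = 5):
    centers = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(X).cluster_centers_
    nn = NearestNeighbors(n_neighbors=n).fit(X)
    _, idx = nn.kneighbors(centers)               # (n_clusters, n) indices into X
    keep = np.unique(idx.ravel())                 # de-duplicate neighbours shared by centres
    return X[keep], y[keep]
```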
Bottleneck Analysis in Software Development: A Case Study Using Process Mining
Abstract
Bottleneck analysis in software development highlights inefficiencies that hinder team productivity. This study uses Process Mining techniques to identify and address bottlenecks in workflows by analyzing event logs from IDEs and task management systems. The approach involves process discovery, conformance checking, and predictive modeling to detect inefficiencies and deviations. A predictive model was developed to anticipate future bottlenecks, allowing for targeted interventions. Major bottlenecks included delays in communication, redundant task cycles, and misallocated resources. Workflow adjustments, such as process redesign and task redistribution, led to an overall efficiency improvement of 38%, for example through reduced queue times. The study highlights the role of Process Mining in optimizing development workflows, with a focus on data quality and system integration for accurate analysis.
Rodrigo Almeida de Oliveira, Juliano de Paulo Ribeiro, Edson Emílio Scalabrin

Pattern Classification

Frontmatter
Channel Attention for Fusarium Head Blight Detection in Hyperspectral Images
Abstract
This research enhances Fusarium Head Blight (FHB) detection in hyperspectral images using a Deep Convolutional Neural Network (DCNN) with Principal Component Analysis (PCA) and a Spectral Attention Module (SAM). By reducing spectral dimensionality with PCA and applying channel attention, the model improves feature representation. Tested on the AI for Agriculture 2024 dataset, it achieved 97.06% accuracy, outperforming the baseline’s 81.76%. Ablation studies confirmed PCA’s key role, while SAM had limited impact. These results highlight the benefits of spectral feature selection for more precise and scalable agricultural disease monitoring.
Lily Akpanke, Dustin van der Haar, Hima Vadapalli
Invariant Aircraft Recognition with Fourier and Dual-Tree Complex Wavelet Features
Abstract
Aircraft recognition is an extremely important task in air defense, yet the existing literature contains only a few works on this topic, so there is an urgent need to tackle this problem. In this paper we propose a novel method for aircraft recognition that transforms the aircraft image into polar coordinates, extracts Fourier features along the angular direction, and computes dual-tree complex wavelet transform (DTCWT) features along the radial direction. We normalize the aircraft so that the representation is translation and scale invariant; rotation invariance is achieved by taking the spectrum of the Fourier coefficients. We use all but the finest scale of the DTCWT coefficients and the low-frequency Fourier coefficients to classify the unknown aircraft because they are more robust to noise. Experimental results demonstrate that our new method outperforms the Fourier-wavelet descriptor in all test cases, both for combinations of scaling factors and rotation angles and for combinations of noise levels and rotation angles.
Guang Yi Chen, Adam Krzyżak, Ventzeslav Valev
Combining Classifiers for Texture Classification
Abstract
Texture classification is a critical task with applications spanning various domains, from facial recognition to cancer detection in medical images. In traditional approaches, the application’s success heavily depends on the feature extraction and classification stages. Over the years, numerous feature extraction methods have been proposed, with non-handcrafted approaches consistently outperforming handcrafted ones. This paper investigates static and dynamic selection techniques for classifiers trained on features extracted from non-handcrafted architectures. The experiments were conducted on two challenging benchmarks widely used for texture classification evaluation: the FMD dataset and the Describable Texture Dataset (DTD). We first evaluated individual features and found that Vision Transformers (ViT) performed exceptionally well compared to other architectures. However, the significantly higher Oracle accuracy of the ensemble of classifiers suggests room for improvement in classifier combination and selection techniques. We observed enhanced performance when combining the top-performing individual classifiers with static combination methods such as the sum, product, and max rules, whereas dynamic classifier selection techniques did not yield improvements. The best performance was achieved using a static combination of classifiers through the sum and product rules, with an F1-score of 93.2% on the FMD dataset and 81.74% on DTD. The results achieved are among the top three state-of-the-art performances on both datasets.
Michel Gomes de Souza, Yandre M. G. Costa, Alceu de Souza Britto, Juliano H. Foleis, Diego Bertolini
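The static combination rules mentioned above (sum, product, and max) reduce to simple operations over the classifiers' soft outputs; a minimal sketch, assuming an array of per-classifier class probabilities, follows.

```python
# Minimal sketch of static classifier combination rules.
import numpy as np

def combine(probs: np.ndarray, rule: str = "sum") -> np.ndarray:
    # probs: (n_classifiers, n_samples, n_classes) soft outputs.
    if rule == "sum":
        fused = probs.sum(axis=0)
    elif rule == "product":
        fused = probs.prod(axis=0)
    elif rule == "max":
        fused = probs.max(axis=0)
    else:
        raise ValueError(f"unknown rule: {rule}")
    return fused.argmax(axis=1)                   # predicted class per sample
```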
Elliptic Separator: A Geometric Approach to Linear Classification
Abstract
Linear classifiers are widely used in machine learning due to their simplicity and computational efficiency. However, existing methods face limitations including sensitivity to outliers, restrictive assumptions, and computational instability. To address these challenges, we propose the Elliptic Separator (ES), a novel linear classifier based on geometric principles. The method employs Principal Component Analysis, affine transformations, and ellipsoid fitting to transform the feature space, enabling the determination of a maximum-margin boundary between clusters. The classification task is reduced to minimizing the distance between the origin and an ellipse, which simplifies computation by solving polynomial equations. Numerical experiments on 2D binary datasets demonstrate that the Elliptic Separator outperforms traditional linear classifiers, offering a stable and efficient alternative. Future extensions aim to generalize the approach to higher dimensions and evaluate its performance on real-world data.
Firuz Kamalov
Invariant Pattern Recognition with Selectively Denoising and Ridgelet-Fourier Features
Abstract
In this paper, we propose a novel method for rotation-invariant pattern recognition. We perform adaptive denoising on the input pattern images: if the noise level is above a threshold, we apply block matching and 3D filtering (BM3D) to reduce noise in the noisy pattern images; otherwise, no denoising is performed. We extract ridgelet-Fourier features from the denoised pattern and classify the unknown pattern into one of the known classes with the nearest neighbor classifier. Experiments demonstrate that our new method achieves a perfect classification rate (100%) on two datasets across different noise levels and rotation angles, and that it outperforms several existing methods for invariant pattern recognition.
Guang Yi Chen, Adam Krzyżak
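The adaptive denoising step can be illustrated as follows: estimate the noise level and denoise only when it exceeds a threshold. scikit-image's wavelet denoiser is used here purely as a stand-in for BM3D, and the threshold value is an assumption.

```python
# Illustrative adaptive (conditional) denoising; wavelet denoising stands in for BM3D.
import numpy as np
from skimage.restoration import estimate_sigma, denoise_wavelet

def adaptive_denoise(image: np.ndarray, threshold: float = 0.05) -> np.ndarray:
    # image: grayscale float image with values in [0, 1].
    sigma = estimate_sigma(image)
    if sigma > threshold:
        return denoise_wavelet(image, sigma=sigma, rescale_sigma=True)
    return image                                  # low estimated noise: leave untouched
```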

Artificial Intelligence in Modeling and Simulation

Frontmatter
Entropic Metrics in Feature Selection for Neural Network-Based Data Error Detection in CMDB
Abstract
The article presents an algorithm supporting the detection of errors in databases. Unlike typical approaches, where correctness rules are defined a priori, in the proposed approach, they are constructed adaptively based on the statistical analysis of historical data. A huge number of automatically generated potential features are first statistically analyzed using the kernel method, and then, similarity indices between features are determined using the Rajski/Jaccard entropy-based metric. The algorithm ensures a balance between the individual usefulness of the features and their independence from others. A low-dimensional feature vector is obtained, which is the input to a classifier based on a neural network. The results of comparative experimental studies are presented, and the usefulness of the algorithm in practical application is discussed.
Grzegorz Mzyk, Szymon Niewiadomski
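The Rajski similarity used for comparing features is one minus the Rajski distance, i.e. I(X;Y)/H(X,Y); a minimal numpy sketch over integer-coded feature columns follows. This is the generic formulation, not the authors' exact implementation.

```python
# Minimal sketch of the entropy-based (Rajski) similarity between two discrete features.
import numpy as np

def entropy(p: np.ndarray) -> float:
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def rajski_similarity(x: np.ndarray, y: np.ndarray) -> float:
    # x, y: non-negative integer-coded feature columns of equal length.
    joint = np.zeros((x.max() + 1, y.max() + 1))
    np.add.at(joint, (x, y), 1.0)
    joint /= joint.sum()
    h_xy = entropy(joint.ravel())                 # joint entropy H(X, Y)
    h_x = entropy(joint.sum(axis=1))
    h_y = entropy(joint.sum(axis=0))
    mutual_info = h_x + h_y - h_xy                # I(X; Y)
    return mutual_info / h_xy if h_xy > 0 else 1.0
```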
UAV Forensics: Transformer-Based Identification of Mobile Devices for Drone Piloting
Abstract
Unmanned aerial vehicles (UAVs), also known as drones, have become a popular research topic in digital forensics in recent years. Many papers deal with various approaches connected with UAVs, including identification by flight logs, detecting the mobile device used to pilot a UAV, or different methods for managing their transmission. In this paper, we deal with the problem of linking a mobile device with the drone it was used to pilot. More precisely, we consider whether it is possible to identify which mobile device was used to pilot a particular drone, based on flight logs that are stored in the mobile device’s internal memory after each flight. We propose a transformer-based method that predicts the mobile device that was used to pilot a drone. Experimental evaluation shows that it is possible to determine that a drone was piloted with a particular mobile device based on the log file. The evaluation is performed on real data acquired from a popular UAV and several modern mobile devices used to pilot it.
Katarzyna Nieszporek, Jarosław Bernacki, Kelton A. P. Costa, Qiao Ke, Wei Wei
Predictive Monitoring of Workforce Dynamics via Neural Networks
Abstract
Effective human resource management requires continuous monitoring of workforce dynamics, including role transitions, promotions, and structural changes within an organization. This paper presents a solution based on recurrent neural networks (RNN), utilizing LSTM (Long Short-Term Memory) and GRU (Gated Recurrent Unit) architectures to analyze sequential data derived from employee interactions in a large organizational environment. The research was conducted using a text-based dataset of approximately 184 GB, encompassing various communication formats from emails and meeting transcripts to team discussions while incorporating organizational hierarchy context. The proposed model detects significant personnel events, such as changes in supervisors, promotions, or positional shifts. The analysis considers 16 features describing relationships between employees and their organizational surroundings. The use of LSTM and GRU architectures enabled the capture of complex temporal dependencies and accurate classification of career-related behavioral patterns. Designed for near real-time operation, the system supports the rapid identification of potential anomalies and assists managerial decision-making. This approach may be applied in both private and public sector institutions, wherever workforce management and information security are of strategic importance.
Jakub Nowak, Marcin Korytkowski, Rafał Scherer, Błażej Żak, Zorza Tymorek, Anita Zbieg
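A minimal sketch of a GRU classifier over sequences of the 16 relational features mentioned above, predicting a personnel-event class per sequence. The layer sizes and the number of event classes are illustrative assumptions; an LSTM variant is obtained by swapping nn.GRU for nn.LSTM and taking the hidden state from the returned tuple.

```python
# Illustrative recurrent classifier for personnel-event detection.
import torch
import torch.nn as nn

class EventClassifier(nn.Module):
    def __init__(self, n_features: int = 16, hidden: int = 64, n_classes: int = 4):
        super().__init__()
        self.gru = nn.GRU(n_features, hidden, num_layers=2, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, n_features); the last hidden state summarises the sequence.
        _, h = self.gru(x)
        return self.head(h[-1])

model = EventClassifier()
logits = model(torch.randn(8, 30, 16))            # 8 employees, 30 time steps each
```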
Backmatter
Title
Artificial Intelligence and Soft Computing
Edited by
Leszek Rutkowski
Rafał Scherer
Marcin Korytkowski
Witold Pedrycz
Ryszard Tadeusiewicz
Jacek M. Zurada
Copyright Year
2026
Electronic ISBN
978-3-032-03708-4
Print ISBN
978-3-032-03707-7
DOI
https://doi.org/10.1007/978-3-032-03708-4

The PDF files of this book were created in accordance with the PDF/UA-1 standard to improve accessibility. This includes screen reader compatibility, described non-textual content (images, graphs), bookmarks for easy navigation, keyboard-friendly links and forms, and searchable and selectable text. We recognize the importance of accessibility and welcome inquiries regarding the accessibility of our products. For questions or accessibility needs, please contact us at accessibilitysupport@springernature.com.
