
Data Science and Network Engineering

Proceedings of ICDSNE 2025

  • 2026
  • Book

About this book

This book contains research papers presented at the International Conference on Data Science and Network Engineering (ICDSNE 2025), organized by the Department of Computer Science and Engineering, National Institute of Technology Agartala, Tripura, India, during July 18-19, 2025. It includes contributions from researchers, academics, business leaders, and industry professionals on solving real-world problems by harnessing advances and applications in data science and network engineering. The book covers many advanced topics, such as artificial intelligence (AI), machine learning (ML), deep learning (DL), computer networks, blockchain, security and privacy, the Internet of Things (IoT), cloud computing, big data, supply chain management, and many more. The various sections of this book are of great value to researchers working in the fields of data science and network engineering.

Table of Contents

Frontmatter

Computational Intelligence

Frontmatter
Ensemble Classifier for Real-Time Breast Cancer Classification on Histopathology Images
Abstract
One of the most prevalent cancers among women worldwide is breast cancer, and early detection significantly boosts treatment and survival rates. Addressing this challenge, deep learning (DL) models have emerged as a powerful solution. This paper presents a deep learning ensemble-based approach for the classification of breast cancer using histopathological images. The pre-trained VGG16 and MobileNetV2 are combined in the ensemble model to perform binary classification. The sequential depth of VGG16 and the depthwise separable convolutions of MobileNetV2 improved generalization and minimized overfitting. The proposed ensemble achieves a maximum accuracy of 99.14%. The results demonstrate the effectiveness of the deep learning ensemble in accurately classifying cancerous and non-cancerous breast cancer images. To assist medical professionals in the diagnosis of breast cancer from histopathology images, a web server-based application is designed. The system has a user-friendly frontend built with ReactJS and Bootstrap for uploading images, and a FastAPI backend that processes images promptly. The platform allows for the seamless upload of biopsy images, leveraging a backend TensorFlow-based convolutional neural network ensemble to predict whether a biopsy image is cancerous or non-cancerous, along with associated confidence scores.
Jacinta Potsangbam, Harsh Kumar, Salam Shuleenda Devi
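The ensemble idea in the abstract above, combining two backbones for binary classification, can be sketched as soft voting over each model's class probabilities. The arrays below are hypothetical stand-ins for VGG16 and MobileNetV2 softmax outputs, not the authors' trained models:

```python
import numpy as np

def soft_vote(prob_a: np.ndarray, prob_b: np.ndarray) -> np.ndarray:
    """Average per-class probabilities from two backbones and
    return the predicted class index for each sample."""
    avg = (prob_a + prob_b) / 2.0  # element-wise mean of softmax outputs
    return avg.argmax(axis=1)      # assumed: 0 = non-cancerous, 1 = cancerous

# Hypothetical softmax outputs for three biopsy images.
vgg16_probs = np.array([[0.9, 0.1], [0.4, 0.6], [0.2, 0.8]])
mobilenet_probs = np.array([[0.8, 0.2], [0.7, 0.3], [0.1, 0.9]])

preds = soft_vote(vgg16_probs, mobilenet_probs)
print(preds.tolist())  # → [0, 0, 1]
```

Note how the second sample flips to class 0 after averaging: VGG16's stronger confidence outweighs MobileNetV2's weaker vote, which is exactly the smoothing effect an ensemble relies on.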
A Smart Surveillance Framework for Real-Time Suspicious Activity Detection and Automated Alert Generation Using YOLOv8
Abstract
As security threats evolve, conventional surveillance systems often fail to keep pace with suspicious activities. The present study describes the development of an AI-driven smart surveillance system for monitoring and analyzing unusual behavior in real time using the high-performing YOLOv8 deep learning algorithm. In contrast to traditional methods, which largely depend on manual monitoring or rule-based techniques, this system learns through observation to recognize potentially threatening event patterns with an outstanding degree of accuracy and speed. Additionally, the application of the latest object detection and deep learning techniques heightens security surveillance by minimizing response time and enhancing situational awareness. Evaluation results demonstrate the effectiveness of the proposed system, achieving a detection accuracy of 96%, with significant improvements in processing speed and false-alarm reduction compared to traditional approaches. This research presents a scalable and responsive solution for smart surveillance, offering enhanced situational awareness and proactive security monitoring through deep learning advancements.
Anurag De, Venkata Naga Durga Sowmya Kollipara, Gautam Pal, Meghana Bellamkonda, Sai Surya Nikhil Vissapragada
Enhancing E-Commerce Trust: An Integrated Product Recommendation and Fake Review Detection System
Abstract
To improve user experience and influence purchase decisions, e-commerce systems rely heavily on user ratings and tailored suggestions. However, the growing number of fraudulent reviews damages consumer confidence and compromises the accuracy of recommendations. To increase the dependability of e-commerce, this research proposes an integrated system that combines a sentiment-driven, feature-based product recommendation model with a fake review detection method. GloVe embeddings, KMeans clustering, and sentiment analysis models such as CatBoost and LightGBM are used to assess customer reviews. The Isolation Forest method, which is based on anomaly detection, is simultaneously used to identify and filter fraudulent reviews. The system's overall accuracy improved significantly after applying fake review detection, illustrating the influence of genuine feedback on recommendation quality. Experimental results confirm that removing fraudulent reviews improves model performance while giving consumers more accurate, tailored product recommendations. For contemporary e-commerce platforms, the proposed system provides a scalable way to boost consumer confidence.
Poranki Anusha, Buse Sudeep Sahas, T. Lakshmi Surekha
Hardware-Efficient Neural Network for Voice Disorder Classification from Multi-Source Datasets
Abstract
This paper presents a hardware-efficient neural network-based approach for the classification of healthy voices and pathological conditions including functional dysphonia, hyperkinetic dysphonia, hypokinetic dysphonia, and reflux laryngitis. Using two widely used datasets, the Saarbruecken Voice Database (SVD) and the VOICED database, we extracted clinically relevant acoustic features such as jitter, shimmer, harmonics-to-noise ratio (HNR), fundamental frequency (f₀), and formant frequencies to train a compact convolutional neural network (CNN) optimized for deployment on Field Programmable Gate Array (FPGA) platforms. The proposed model achieved an overall classification accuracy of 91.4%, with consistently high sensitivity and specificity across all five categories. Post-training quantization was applied to convert the model into an 8-bit fixed-point representation, significantly reducing memory usage and computational overhead. The CNN was synthesized using Vivado High-Level Synthesis (HLS) and implemented on a Zynq-7000 FPGA, where it achieved real-time inference with an average latency of 80 ms per sample. These results confirm the feasibility of integrating deep learning-based voice diagnostics into portable, low-power clinical devices. The system's robustness, efficiency, and accuracy make it highly suitable for early detection and continuous monitoring of voice disorders in both clinical and telemedicine environments.
Jyoti Mishra, R. K. Sharma
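Post-training quantization of the kind described above maps floating-point weights to 8-bit integers plus a scale factor. A minimal symmetric-quantization sketch (illustrative only, not the Vivado HLS flow):

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric post-training quantization to 8-bit integers.
    Returns the integer weights and the scale needed to dequantize."""
    scale = np.abs(w).max() / 127.0  # map the largest magnitude to 127
    q = np.clip(np.round(w / scale), -128, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.array([0.5, -1.27, 0.003, 1.27], dtype=np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
print(q.tolist())  # → [50, -127, 0, 127]
```

Each int8 weight occupies a quarter of a float32, which is where the memory reduction comes from; the worst-case rounding error per weight is bounded by half the scale.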
Predictive Maintenance on C-MAPSS Using LSTM Variants and Attention
Abstract
Accurate prediction of Remaining Useful Life (RUL) is critical for effective predictive maintenance in industrial systems. This study improves RUL forecasting performance through signal processing techniques, avoiding the complexity of hybrid deep learning models. Using the NASA Commercial Modular Aero-Propulsion System Simulation FD001 dataset, which provides run-to-failure sensor data from turbofan engines, we applied Kalman Filtering and multi-level Discrete Wavelet Transform for noise reduction and feature enhancement. Low-variance features were discarded, and Min-Max normalization was used for data scaling. Input sequences were generated using a sliding window of length 50. Four Long Short-Term Memory (LSTM)-based models were developed for comparison: a baseline LSTM, a Bi-directional LSTM, and two variants incorporating Multi-Head Attention. While attention mechanisms improved average performance in terms of Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE) by enhancing temporal focus, they also introduced higher variance and occasional large errors. In contrast, signal processing significantly improved input quality, model stability, and convergence, enabling simpler architectures to achieve competitive results. All models were trained using MSE loss and evaluated on MAE and RMSE. The results highlight that well-designed signal processing pipelines can enhance RUL prediction accuracy while reducing reliance on complex neural network architectures.
Anshuman Sinha, Gaurav Singh Rajput, Utkarsh Raj, Dilip Kumar Choubey
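The sliding-window step named in the abstract above (sequences of length 50 cut from run-to-failure sensor series) can be sketched as follows; the synthetic 100-cycle engine and the "RUL = cycles remaining after the window" labeling scheme are assumptions for illustration:

```python
import numpy as np

def make_sequences(signal: np.ndarray, window: int):
    """Slice a (time, features) array into overlapping windows and
    pair each window with the RUL label at its last time step."""
    X, y = [], []
    n = len(signal)
    for start in range(n - window + 1):
        X.append(signal[start:start + window])
        # Assumed labeling: cycles remaining after the window's final step.
        y.append(n - (start + window))
    return np.stack(X), np.array(y)

# Synthetic engine: 100 cycles, 3 sensor channels; window of 50 as in the study.
data = np.random.default_rng(1).normal(size=(100, 3))
X, y = make_sequences(data, window=50)
print(X.shape, y.shape)  # → (51, 50, 3) (51,)
```

Each `(50, 3)` window is one LSTM input sequence, and the label shrinks from 50 down to 0 as the window slides toward failure.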
Unveiling Ebola–Human Protein Links Through Network Embedding and Unsupervised Machine Learning
Abstract
Understanding the interaction mechanism between host proteins and the Ebola virus is essential for targeted drug therapeutics. Although a spectrum of studies on host protein-Ebola virus interactions is available, the protein-protein interaction (PPI) databases are sparse and thus biased toward well-studied proteins. Furthermore, the majority of these studies take a supervised learning approach, whereas a sufficiently large labeled dataset is not available. In this study, we present an unsupervised computational framework to predict the associations between non-interacting Ebola-human protein pairs. We begin by constructing a bipartite graph from the known Ebola-human PPI dataset and then employ a low-dimensional embedding technique, Node2Vec, to capture the structural and relational characteristics of the proteins. A clustering algorithm is then applied to the human protein embeddings to unveil significant modules, which are validated through functional enrichment analysis. From these clusters, bipartite graphs are reconstructed, including known Ebola-human proteins, and further analyzed using Node2Vec and cosine similarity. Our framework successfully predicts new PPIs, many of which are indirectly supported by literature evidence. The suggested method can be applied to various host-pathogen interaction investigations and provides a data-driven strategy for systematic PPI identification.
Sujoy Chatterjee, Koyel Mandal
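Once Node2Vec produces embeddings, candidate interactions are scored by cosine similarity, as the abstract notes. A minimal sketch with hypothetical 4-dimensional embeddings (real Node2Vec vectors are much higher-dimensional, and the protein names below are illustrative):

```python
import numpy as np

def cosine_sim(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical embeddings for one viral protein and two host proteins.
ebola_vp35 = np.array([1.0, 0.5, 0.0, 0.2])
host_a = np.array([0.9, 0.6, 0.1, 0.1])   # close in embedding space
host_b = np.array([-0.5, 0.1, 1.0, 0.0])  # distant

scores = {name: cosine_sim(ebola_vp35, v)
          for name, v in [("host_a", host_a), ("host_b", host_b)]}
# Rank host proteins by similarity to the viral protein.
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)  # host_a ranks above host_b
```

The top-ranked, previously unlinked pairs are the predicted novel PPIs that the study then checks against literature evidence.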
Discount Optimisation in Food Delivery Using Machine Learning
Abstract
The rapid growth of online food delivery services has generated vast amounts of data that offer valuable insights. This data provides significant information about consumer behaviour, operational efficiency, and market trends. This study presents a comprehensive machine learning (ML) pipeline to analyse food delivery data, integrating demographic attributes, order details, and food preferences. The research includes data preprocessing, exploratory data analysis (EDA), regression modelling, hyperparameter tuning, and feature importance evaluation. Additionally, we applied advanced customer segmentation techniques to identify key consumer groups along with their distinct characteristics. The findings reveal crucial factors influencing delivery time, order value, and customer satisfaction. By leveraging predictive modelling and clustering algorithms, this study offers actionable intelligence for stakeholders to optimise service quality, manage inventory, personalise offerings, provide tailored menus, optimise discounts, and enhance operational strategies in the food delivery ecosystem.
Vaishnav Dineshkumar Prajveen, S. Dilipkumar, Akriti Saigal
A Scalable Real-Time Stock Market Prediction Framework Using LSTM Network and XGBoost Model
Abstract
This paper introduces a real-time prediction system for stock prices that unifies machine learning (ML), deep learning, and data engineering methods to improve forecasting precision. The system integrates historical stock information and news sentiment analysis to make accurate predictions. Historical data are accessed through Yahoo Finance, whereas news articles relevant to the stocks are retrieved using the News API to incorporate external market effects. The framework makes use of Long Short-Term Memory (LSTM) networks and XGBoost models that are combined using a stacked ensemble method to obtain higher accuracy and resilience. A FastAPI backend provides API endpoints for prediction and history, supported by Redis for real-time data streaming and storage. A dedicated Redis processor listens to streaming stock data, processes it, and creates predictions from the trained models. In addition, a Streamlit dashboard provides an interactive, user-friendly view of predictions, historical patterns, model comparisons, and performance measures. This paper demonstrates a scalable, real-time stock market prediction solution that can aid investors and analysts in making data-driven financial choices.
Amit Kumar Roy, Munsifa Firdaus Khan Barbhuyan, Satyabrata Nath
Smart Agriculture: A Deep Learning Framework for Early Plant Disease Identification
Abstract
Agricultural productivity faces a substantial risk because crop diseases lead to major losses in yield quantity and quality, despite agriculture being essential for food security. Traditional plant disease diagnostic approaches are resource-intensive and slow, often requiring expert skills, making them inaccessible to many farmers. This paper presents a system that demonstrates automated plant disease identification using Convolutional Neural Networks (CNNs), which excel at image classification. The CNN learns to categorize several diseases through training on labeled leaf images from both healthy and diseased plants. The model's user-friendly Gradio interface allows users to upload images and receive quick diagnoses. It enables early disease detection, providing farmers with fast and reliable results to support timely treatment decisions. The experimental findings demonstrate the system's robustness and its ability to operate effectively in agricultural environments lacking sufficient resources.
Joyeeta Das, Sanjib Debnath, Swapan Debbarma
Real-Time Intelligence Surveillance Using Object Detection and Facial Recognition on Edge Devices
Abstract
This research presents a lightweight AI system for real-time object detection and facial recognition on edge computing platforms like the Raspberry Pi. By integrating YOLOv8 for object detection and DeepFace for facial analysis with OpenCV, the system performs efficient, offline inference on resource-constrained devices without relying on cloud infrastructure. A custom domain-specific dataset enhances detection precision and recognition accuracy. Designed for forensic, surveillance, and law enforcement applications, the unified architecture enables low-latency, privacy-preserving analysis directly on-site. The modular design supports diverse scenarios such as intelligent surveillance, suspect identification, and demographic analysis. DeepFace extends functionality by enabling real-time face detection, emotion recognition, and demographic estimation (age, gender) from live camera feeds. The compact deployment demonstrates how artificial intelligence can operate effectively on edge devices for time-sensitive, real-world applications. This work emphasizes the role of modern AI models, privacy-aware offline systems, and tailored datasets in improving accuracy. It establishes a scalable foundation for broader applications in intelligent visual monitoring and forensic investigations, showing how edge AI can support secure and efficient decision-making in evolving security landscapes.
Charanarur Panem, Debasmita Karmakar, Himanshu Yadav, Anshul Rajkumar, Yuvraj Mishra, Tanmayee Anasingaraju, Harish Ogare, Albert Gautam, Laishram Hemanta Singh, Naveen Kumar Chaudhary, Suman Deb
Sentiment Analysis in Kokborok: Building Resources and Models for a Low-Resource Language
Abstract
This paper presents a novel and comprehensive study on sentiment analysis for Kokborok, a low-resource Tibeto-Burman language primarily spoken in the Indian state of Tripura. The lack of annotated corpora and linguistic tools has significantly impeded the development of Natural Language Processing (NLP) applications for Kokborok. Addressing this challenge, we developed a manually verified and annotated dataset consisting of 7,521 Kokborok sentences, each labeled as expressing a positive, negative, or neutral sentiment. To analyze sentiment across these three classes, we implemented and evaluated four traditional machine learning models: Support Vector Machine (SVM), Random Forest, Logistic Regression, and Naive Bayes. Given the class imbalance in the dataset, we employed feature extraction techniques such as Bag-of-Words (BoW) and Term Frequency-Inverse Document Frequency (TF-IDF) in separate pipelines, both with and without over-sampling. Comparative analysis found that combining oversampling with BoW features and the Logistic Regression model yielded the highest accuracy of 100% in training and 90% in testing. Similarly, with TF-IDF features, the Random Forest and SVM models achieved the highest accuracy of 100% in training and 89% in testing. Without oversampling, however, both extraction techniques yielded very poor accuracy and other evaluation metrics. Hence, the pipeline combining BoW features, oversampling, and Logistic Regression is an appropriate choice for this study.
Tijeli Debbarma, Abhijit Sinha, Sagarika Sengupta, Jay Krishna Das, Himanish Shekhar Das
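The best-performing pipeline named above (BoW features, oversampling, Logistic Regression) can be sketched with scikit-learn. The toy English sentences below stand in for the Kokborok corpus, and the naive duplicate-the-minority-class oversampler is one simple possibility, not necessarily the method the authors used:

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Toy stand-in corpus (the actual study uses 7,521 Kokborok sentences).
texts = ["good film", "great service", "bad food", "terrible day",
         "fine weather", "awful noise", "nice place", "lovely view"]
labels = np.array([1, 1, 0, 0, 1, 0, 1, 1])  # imbalanced: 5 positive, 3 negative

X = CountVectorizer().fit_transform(texts)   # Bag-of-Words features

def oversample(X, y, seed=0):
    """Duplicate random minority-class rows until every class matches the majority."""
    rng = np.random.default_rng(seed)
    base = np.arange(len(y))
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    idx = base.copy()
    for c, n in zip(classes, counts):
        extra = rng.choice(base[y == c], size=target - n)
        idx = np.concatenate([idx, extra])
    return X[idx], y[idx]

X_bal, y_bal = oversample(X, labels)
clf = LogisticRegression().fit(X_bal, y_bal)
print(clf.score(X_bal, y_bal))  # training accuracy on the balanced toy data
```

After oversampling, both classes contribute five rows, so the classifier's decision boundary is no longer pulled toward the majority class.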
An Evolutionary Framework for Robust Abrupt Transition Detection in Video Sequences
Abstract
Accurate detection of abrupt transitions, or hard cuts, plays a vital role in video analysis tasks such as indexing, summarization, and scene segmentation. However, traditional fixed-threshold methods often struggle in dynamic conditions due to their sensitivity to noise, motion, and lighting variations. To address this, we propose a novel approach that employs Generalized Normal Distribution Optimization (GNDO) to adaptively determine an optimal threshold for identifying abrupt scene changes. The method begins by extracting grayscale frames and calculating pixel-wise differences between consecutive frames, forming the basis for transition analysis. GNDO then evolves a population of candidate thresholds, balancing exploration and exploitation to maximize detection accuracy while minimizing false positives. A margin-based post-processing step further refines the output by filtering out weak or insignificant variations. Experimental evaluations on diverse video datasets demonstrate that the proposed GNDO-driven strategy significantly enhances abrupt transition detection performance compared to conventional techniques. The adaptive thresholding mechanism and evolutionary learning not only improve robustness but also eliminate the need for manual parameter tuning, achieving average recall, precision, and F1 scores of 98.1%, 96.3%, and 97.1%, respectively, and making the method highly effective for real-world video processing applications.
Gautam Pal, Tushar Banik, Saptarshi Chakraborty, Abhijit Biswas, Anurag De
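The frame-differencing basis of the method above can be sketched as the mean absolute pixel-wise difference between consecutive grayscale frames, thresholded to flag hard cuts. The threshold is fixed here for illustration; in the paper it is the quantity GNDO evolves:

```python
import numpy as np

def frame_differences(frames: np.ndarray) -> np.ndarray:
    """Mean absolute pixel-wise difference between consecutive grayscale frames.
    Frames are cast to int16 so uint8 subtraction cannot wrap around."""
    diffs = np.abs(frames[1:].astype(np.int16) - frames[:-1].astype(np.int16))
    return diffs.mean(axis=(1, 2))

# Synthetic clip: 10 gray 32x32 frames with a hard cut at frame 5 (dark -> bright).
frames = np.full((10, 32, 32), 40, dtype=np.uint8)
frames[5:] = 200

d = frame_differences(frames)
threshold = 80.0  # fixed here; the paper evolves this value with GNDO
cuts = np.where(d > threshold)[0] + 1  # index of the first frame after each cut
print(cuts.tolist())  # → [5]
```

Gradual motion or lighting drift produces small difference values that stay under the threshold, which is why only genuinely abrupt transitions are flagged.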
Measuring and Evaluation of Power Density Emitted by Communication Towers with Location and Time for Selected Locations in Karbala, Iraq
Abstract
This study aims to measure and evaluate the energy density of electromagnetic radiation emitted by cell phone towers in three locations in Karbala, Iraq. The first location is the area between the Holy Shrine of Imam Hussain and Abbas that represents a religious area, while the second area is Al-Hussein District Street, which represents a commercial location. The third place is Al-Ghadeer District (a residential area). Field measurements were performed by using an HF 59B Analyser. This device is distinguished by its accuracy in measuring the energy density levels of electromagnetic radiation in the bands used in modern communication systems. The results showed that the highest radiation power density was recorded at 6:00 PM, with the average radiation intensity in some evening periods exceeding 2,000 µW/m². This is attributed to increased human activity during the evening hours and increased use of mobile phones during that time. In the Al-Ghadeer District, peak radiation was observed at 1:00 PM. Radiation levels reached approximately 69-91 µW/m² during peak times. The Al-Hussein District Street, which is predominantly commercial, recorded relatively lower radiation levels compared to other areas, with readings ranging between 20-55 µW/m² throughout most of the day.
Rusul Amer Mahdi, Fadel Abdul Zahra Mourad
Grey Wolf Optimization and PCA-Based Hybrid Method for Dimensionality Reduction
Abstract
High-dimensional datasets across various domains, including healthcare, industry, and social media, present challenges such as overfitting, computational expense, and diminished interpretability. Identifying the most relevant variables through feature selection is crucial for addressing these issues. In the proposed method, the Grey Wolf Optimizer (GWO) algorithm first selects the best features; the selected features are then reduced using the PCA algorithm. This investigation assessed GWO and PCA across ten high-dimensional datasets, employing the selected features to train K-Nearest Neighbors (KNN) and Support Vector Machine (SVM) classifiers. The proposed method significantly decreased feature counts by factors ranging from 10 to 100 while preserving or enhancing accuracy. The performance of KNN was frequently enhanced when utilizing features selected through the proposed method. In comparison to PCA, ReliefF, mRMR, Chi-squared, SIFS, ATFS, EmPo, and FSM, the proposed GWO-based method demonstrated enhanced performance; it independently identified optimal subset sizes, increasing its applicability for high-dimensional scenarios in practical settings.
Amit Kumar Saxena, Damodar Patel, Gayatri Sahu, Abhishek Dubey, Shreya Chinde, Umesh Kumar Shriwas
Ensemble-Based Hostile Post Detection in Hindi Using Multilingual Pretrained Models
Abstract
Detecting hostile posts in Hindi on social media is challenging due to linguistic variability, informal usage, and code-switching in the Devanagari script. While prior efforts have addressed hostility detection, few have focused on fine-grained, multi-label classification using ensemble strategies suited for Hindi. In this work, we propose a novel ensemble-based framework for hostility classification across four categories: Defamation, Fake, Hate, and Offensive. The system integrates Bagging, Boosting, Simple Majority Voting, and Weighted Majority Voting over contextual embeddings derived from MuRIL, IndicBERT, and HindiBERT models. Principal Component Analysis (PCA) reduces dimensionality and computational complexity. Evaluation on the CONSTRAINT-2021 dataset demonstrates that our model achieves F1-scores of 0.8874 (Defamation), 0.9532 (Fake), 0.8653 (Hate), and 0.8790 (Offensive), outperforming prior work and recent benchmarks. The proposed model shows relative improvements of 45%, 13%, 28%, and 25% across the respective hostility classes. This demonstrates the effectiveness of combining multilingual transformer embeddings with ensemble strategies for hostile content classification. The approach offers a scalable, language-sensitive solution for detecting hostility in Hindi social media, supporting more respectful and safer digital interactions.
Santosh Rajak, Ujwala Baruah, Souvik Chowdhury
Extractive Text Summarization Using Feature Extraction for Single Document in Tamil Language
Abstract
With a tremendous amount of data available online, reading an article page by page is a tedious process. A sneak peek into the big picture helps to analyze the content and understand the essential message of a document. Text summarization provides a way to condense a given document in a meaningful way: the summarized content gives a gist of the original document, conveying its important information to the user. In this paper, automatic text summarization has been performed on single documents using Fuzzy Logic inference rules. The summarization is based on features extracted from the document. For feature extraction, the Frequency Based Feature Extraction Technique (FBFET) is applied. This implementation uses a NOUN-VERB IDENTIFIER to recognize the prominent category and select features based on their frequency in the entire document. Sentences are scored using these important characteristics, which define the rules for the Fuzzy Logic Inference Engine. High-priority sentences are extracted to generate the summary of the document. The performance is measured and analyzed using the ROUGE (Recall-Oriented Understudy for Gisting Evaluation) method. The proposed automatic text summarizer is implemented with a feature extraction module that enhances the summarization process.
Shyamala, Mercy Evangeline
Continuous Class Conqueror: Class Incremental Continual Learning on Video Violence Data
Abstract
Continual learning in deep neural networks, in the context of video violence detection, faces the significant challenge of catastrophic forgetting, where a model loses previously learned knowledge upon encountering new classes. In this study, we address class-incremental learning for video violence detection and propose a framework for this unexplored domain. Initially, a 3D convolutional neural network model is trained to classify Normal and Violence classes, achieving high accuracy that reaches 99%. However, upon introducing a new class (Weaponized), the model demonstrates substantial performance degradation on the original classes, highlighting the impact of catastrophic forgetting. To mitigate this, a hybrid continual learning strategy, Continuous Class Conqueror (CCC), is proposed, which combines Learning without Forgetting (LwF) and a replay technique. Experimental results show that the CCC approach effectively preserves the model's performance on previously learned classes, maintaining accuracy of up to 79% while incrementally learning new classes with a high accuracy of 92% on video violence data, validating the importance of hybrid continual learning strategies.
Gaganrajdeep Singh, Manish Kumar, Kanu Goel
Comparative Evaluation of Machine Learning Models in Forecasting Crop Yields Amid Climate Change
Abstract
Climate change increasingly threatens global agriculture through rising carbon dioxide (CO₂) emissions, temperature anomalies, and irregular rainfall. Accurate crop yield prediction is therefore essential for ensuring food security and effective adaptation planning. This study systematically compares three machine learning models—Multiple Linear Regression (MLR), Random Forest (RF), and Extreme Gradient Boosting (XGBoost)—for predicting crop yields using an extensive, multi-country dataset with climate and soil variables. We introduce a robust preprocessing pipeline that includes Gaussian noise-based augmentation, anomaly-based feature engineering, and dual normalization strategies to improve model generalisability under climate stress. Performance is assessed across different training sizes (70/30 and 80/20 train-test splits) and hyperparameter configurations. XGBoost consistently outperforms the other models, achieving the lowest MSE (0.3841) and the highest R² (0.6186) thanks to its ability to model nonlinear climate-yield interactions effectively. Key insights include (1) aridity index and temperature anomalies as dominant predictors, (2) water management and crop rotation as effective adaptation strategies, and (3) preprocessing as crucial for model robustness. This work presents a scalable and interpretable framework for applying machine learning to climate-resilient agriculture.
Sally Aboulhosn, Mariam Akkawi, Seifedine Kadry
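The Gaussian noise-based augmentation step mentioned above can be sketched by appending jittered copies of each training row. The noise scale, the number of copies, and the per-feature scaling below are hypothetical choices, not the study's exact settings:

```python
import numpy as np

def augment_gaussian(X: np.ndarray, copies: int = 2, sigma: float = 0.05,
                     seed: int = 0) -> np.ndarray:
    """Append `copies` jittered versions of each row; noise is scaled
    per feature by sigma times that feature's standard deviation."""
    rng = np.random.default_rng(seed)
    feature_std = X.std(axis=0, keepdims=True)
    noisy = [X + rng.normal(0.0, sigma, size=X.shape) * feature_std
             for _ in range(copies)]
    return np.vstack([X] + noisy)

# Hypothetical climate/soil feature matrix: 100 samples, 5 features.
X = np.random.default_rng(42).normal(size=(100, 5))
X_aug = augment_gaussian(X)
print(X.shape, "->", X_aug.shape)  # → (100, 5) -> (300, 5)
```

Scaling the jitter by each feature's own standard deviation keeps the perturbation proportionate whether a column holds rainfall in millimetres or an aridity index.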
Hierarchical Verification of Speculative Beams for Accelerating LLM Inference
Abstract
Large language models (LLMs) have achieved impressive success across natural language processing tasks but face persistent challenges in inference efficiency due to their autoregressive nature. While speculative decoding and beam sampling offer improvements, traditional methods verify draft sequences sequentially and uniformly, causing unnecessary overhead. This work proposes the Hierarchical Verification Tree (HVT), a framework that restructures speculative beam decoding by prioritizing high-likelihood drafts and pruning suboptimal candidates early. A formal verification-pruning algorithm ensures correctness and efficiency. HVT integrates with standard LLM pipelines without retraining. Experiments demonstrate that HVT outperforms existing methods, achieving significant reductions in inference time and energy consumption while maintaining output quality. These results highlight the promise of hierarchical verification for accelerating LLM inference.
Jaydip Sen, Harshitha Puvvala, Subhasis Dasgupta
Artificial Intelligence for Autism Detection Using EEG Signals and Lightweight Neural Networks
Abstract
This paper describes a sophisticated framework based on deep learning for diagnosing Autism Spectrum Disorder (ASD) through electroencephalogram (EEG) signals with an emphasis on real-time and mHealth solutions. The two heterogeneous EEG datasets BCIAUT-P300 (with ASD subjects) and SPIS Resting State (control subjects) were preprocessed and standardized to a common shape. For every sample, a compact and informative representation was constructed by extracting the mean and variance of eight EEG channels over 350 epochs. The classification of ASD cases versus control cases was performed using a lightweight CNN-LSTM hybrid architecture, which yielded a test accuracy of 94.74% and ROC AUC score of 0.90. Subsequently, post-training pruning was applied, resulting in over 40% reduction in model size, which allowed deployment via TFLite without any performance loss. The model was eventually embedded in an Android app, which allowed offline, on-device inference with average prediction times of 4–5 seconds per sample. This method achieves high classification accuracy while prioritizing temporal modeling and user-centered design. By combining the extraction of EEG statistical features with deep sequential models for mobile implementation, the system becomes a scalable, easy-to-interpret, and low-latency solution for autism spectrum disorder (ASD) screening that can be utilized in clinical as well as resource-limited environments.
Afifa Shaikh, Aditya Koli, Pradeep Awubaigol, Soham Mali, Rajashri Khanai, Prema Akkasaligar
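The per-sample feature extraction described above, reducing each standardized EEG recording to channel-wise means and variances, can be sketched as follows with synthetic data in the study's stated shape (350 epochs by 8 channels):

```python
import numpy as np

def eeg_features(sample: np.ndarray) -> np.ndarray:
    """Reduce one EEG sample of shape (epochs, channels) to a compact
    vector: per-channel means followed by per-channel variances."""
    return np.concatenate([sample.mean(axis=0), sample.var(axis=0)])

# Synthetic sample shaped like the study's standardized input:
# 350 epochs x 8 channels.
sample = np.random.default_rng(7).normal(size=(350, 8))
features = eeg_features(sample)
print(features.shape)  # → (16,)
```

Collapsing 350 x 8 readings into a 16-value vector is what keeps the downstream CNN-LSTM small enough for on-device TFLite inference.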
Assessment of Flood Potential Through Rainfall Pattern Analysis
Abstract
Floods pose a major threat to lives, infrastructure, and ecosystems, particularly in areas where extreme rainfall events are becoming increasingly frequent. With the patterns of climate change and urbanization advancing at a fast pace, most areas are now more exposed to floods than ever before. In this research, we examine over a century of rainfall data, from 1901 to 2024, for all Indian states to better understand flood-prone conditions. A Bidirectional Long Short-Term Memory (Bi-LSTM) model is used to learn the temporal dependencies of rainfall sequences and forecast flood potential from past trends. The model is trained and tested on performance metrics such as accuracy, precision, recall, and F1-score. The results demonstrate that the Bi-LSTM method captures rainfall’s complex spatio-temporal patterns with an accuracy of 96.2% and gives robust predictive performance in detecting high-risk areas for flooding.
Aditya Gupta, Vibha Jain
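Feeding a long rainfall record to a (Bi-)LSTM requires windowing it into supervised sequences. A minimal sketch of that preprocessing step is below; the window length, the threshold-based flood label, and the synthetic series are hypothetical choices for illustration, not the paper's exact setup.

```python
import numpy as np

def make_sequences(rainfall, window=12, flood_threshold=400.0):
    """Turn a monthly rainfall series into (sequence, label) pairs.

    Each sample is `window` consecutive values; the label marks whether
    the following value exceeds `flood_threshold` (a hypothetical proxy
    for flood potential). Output shapes suit an LSTM: (n, window, 1).
    """
    X, y = [], []
    for i in range(len(rainfall) - window):
        X.append(rainfall[i:i + window])
        y.append(1 if rainfall[i + window] > flood_threshold else 0)
    X = np.asarray(X, dtype=np.float32)[..., np.newaxis]
    return X, np.asarray(y, dtype=np.int64)

# Example: 124 months of synthetic rainfall
rng = np.random.default_rng(1)
series = rng.gamma(shape=2.0, scale=120.0, size=124)
X, y = make_sequences(series)
print(X.shape, y.shape)  # (112, 12, 1) (112,)
```

The resulting tensors can be passed directly to a bidirectional recurrent layer in any deep learning framework.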
Leveraging EfficientNet Architectures for Noise Robust Speech Recognition: An Empirical Study with AudioMNIST Dataset
Abstract
Speech recognition has become ubiquitous in modern AI applications. It converts spoken words into text using learning algorithms to make human-machine interaction more convenient. Conventional methods that use mel-frequency cepstral coefficient (MFCC) features for speech recognition often lack efficiency in the presence of environmental noise. Advanced deep learning algorithms have the potential to address this challenge. This article presents an empirical study offering a holistic comparison of EfficientNet-based speech recognition models. EfficientNet architectures handle the trade-off between performance and computational efficiency through compound scaling, mobile inverted bottleneck convolutions, and squeeze-and-excitation blocks. The presented work trains and tests EfficientNet-B0 through B7 on mel-spectrogram images extracted from the benchmark AudioMNIST dataset. The original acoustic signals in the dataset are superimposed with babble and Gaussian white noise to produce noisy data of varying intensities. The results show that all EfficientNet architectures outperform traditional methods, achieving above 90% accuracy even in noisy environments without the need for any noise removal method.
Arpita Choudhury, Amisha Singh, Pinki Roy, Sivaji Bandyopadhyay
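The noise-superposition step mentioned in the abstract (degrading clean audio to a target intensity) can be sketched as follows. The SNR-based mixing formula is a common convention assumed here, not necessarily the authors' exact procedure.

```python
import numpy as np

def add_noise(signal, snr_db):
    """Superimpose Gaussian white noise onto a clean signal at a
    target signal-to-noise ratio (in dB). Babble noise would be mixed
    the same way, using a recorded babble track instead of white noise.
    """
    rng = np.random.default_rng(42)
    signal_power = np.mean(signal ** 2)
    # Scale the noise so that signal_power / noise_power hits the target SNR
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(scale=np.sqrt(noise_power), size=signal.shape)
    return signal + noise

# Example: one second of an 8 kHz sampled 440 Hz tone degraded to 10 dB SNR
t = np.linspace(0, 1, 8000, endpoint=False)
clean = np.sin(2 * np.pi * 440 * t)
noisy = add_noise(clean, snr_db=10.0)
print(noisy.shape)
```

Sweeping `snr_db` produces the "varying intensities" of noisy data on which the EfficientNet variants are then evaluated.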

Computer Networks and Cybersecurity

Frontmatter
Raspberry Pi NAS as a Self-hosted Private Cloud: Design, Implementation, and Performance Evaluation
Abstract
In the modern world, significant data generation and consumption create a need for flexible, scalable, and secure storage solutions. In this context, self-hosted cloud services are an attractive alternative to commercial clouds: building a private cloud can help overcome issues with privacy, security, vendor lock-in, and the recurring costs of commercial services. The purpose of this research is to design, implement, and evaluate a personal private cloud based on Network Attached Storage powered by a Raspberry Pi. It investigates the possibility of using such privacy-centered cloud storage as an alternative to the mainstream services available. This paper presents a system architecture for configuring core private cloud functions, along with performance benchmarking and security assessment strategies. Simulated experimental results demonstrate the system's capabilities in securing sensitive data from common threats. The findings pinpoint critical areas for further enhancement and development while demonstrating the effectiveness of this low-cost approach for personal and small-group private cloud applications.
Shipra Swati, Jatin Yadav, Vamsi E. Manoj
Performance Evaluation of WFS Service Consumption with Python and Cython
Abstract
OGC spatial data services, particularly the Web Feature Service (WFS), are widely used and gaining popularity due to factors like open data policies and the implementation of spatial data infrastructures by governments and institutions. This study presents a detailed performance evaluation of Web Feature Service (WFS) consumption through various processes such as connection, data capture, storage, and retrieval using Python and the OWSLib library. Additionally, it conducts a comparative performance analysis between CPython and Cython to determine efficiency differences in real execution scenarios. The experiments involved three international WFS services and implemented key algorithms in three modes: CPython, pure Cython, and Cython with static typing. The findings reinforce previous research highlighting Cython’s ability to optimize computational performance, particularly when static variable typing is applied. Significant execution time improvements were observed in specific tasks involving spatial data handling. These results have practical implications for geospatial application developers and can serve as a reference for improving high-performance spatial data workflows.
Javier Felipe Moncada Sánchez, Yenny Espinosa Gómez, Carlos Enrique Montenegro Marín, Rubén González Crespo
A Novel Hybrid Scheduling Approach for Enhancing Cloud System Performance
Abstract
Several scheduling challenges arise during task computations, especially when algorithms lack intelligent mechanisms. Modern research confirms that Machine Learning (ML) algorithms enhance system performance through their intelligent mechanisms. This research combined ML's K-Means Clustering (KMC) algorithm with the Shortest Job First (SJF) scheduling algorithm to design and implement a novel SJF-KMC hybrid scheduling approach that improves cloud system performance. The experiments used real-time Google Cluster tasks in ten scenarios, with the number of deployed Virtual Machines varying from ten to one hundred, and implemented the SJF-KMC approach across five cluster settings: SJF-K1, SJF-K3, SJF-K5, SJF-K7, and SJF-K9. The performance of these approaches was compared with each other and with the existing SJF algorithm on Average Network Latency (Avg_NL), Average Energy (Avg_E), Average Memory (Avg_M), and Average Throughput (Avg_T). Results show that the task clustering provided by the SJF-KMC algorithms improves cloud performance. Specifically, minimizing task clusters during their transit from the user environment to the cloud environment (and vice versa) improves Avg_NL; minimal/maximum task clusters during computations enhance Avg_E and Avg_M; and maintaining a minimal number of clusters improves Avg_T. The intelligent mechanism of the SJF-KMC approach enhances scheduling, thereby improving overall cloud system performance.
Prathamesh Vijay Lahande
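The core idea of the hybrid approach (cluster tasks by length, then apply Shortest Job First within and across clusters) can be sketched in a toy form. The 1-D k-means, the dispatch order, and the sample task lengths are illustrative assumptions; the paper's implementation works on Google Cluster traces and richer metrics.

```python
def sjf_kmc(task_lengths, k=3, iters=20):
    """Hybrid SJF + K-Means scheduling sketch: cluster tasks by length
    (1-D k-means), then dispatch clusters in order of mean length, with
    Shortest-Job-First ordering inside each cluster.
    """
    lengths = [float(t) for t in task_lengths]
    lo, hi = min(lengths), max(lengths)
    # Initialize centroids spread evenly over the observed range
    centroids = [lo + i * (hi - lo) / (k - 1) for i in range(k)]
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: abs(t - centroids[j]))
                  for t in lengths]
        for j in range(k):
            members = [t for t, lab in zip(lengths, labels) if lab == j]
            if members:
                centroids[j] = sum(members) / len(members)
    # Shortest cluster first, SJF inside each cluster
    schedule = []
    for j in sorted(range(k), key=lambda j: centroids[j]):
        schedule.extend(sorted(t for t, lab in zip(lengths, labels) if lab == j))
    return schedule

print(sjf_kmc([40, 3, 95, 7, 60, 2, 88], k=3))
# [2.0, 3.0, 7.0, 40.0, 60.0, 88.0, 95.0]
```

Grouping tasks before dispatch is what lets the scheduler move whole clusters between the user and cloud environments instead of individual tasks.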
IPDM: An Intelligent Phishing Detection Model for E-Commerce Websites
Abstract
Nowadays, with the exponential growth of digital connectivity and online services, phishing has emerged as one of the most common cyber threats, targeting individuals and organizations alike. Phishing attacks aim to deceive users into revealing sensitive information such as usernames, passwords, and financial credentials by mimicking legitimate websites. To help mitigate these risks, there is an urgent need for intelligent systems capable of detecting and preventing phishing attacks in real-time.
In this framework, various machine learning classification techniques are used: an Artificial Neural Network (ANN) achieved an accuracy of 98.90%, a Recurrent Neural Network (RNN) achieved 95.06%, and K-Nearest Neighbors (KNN) achieved 97.30%, while the Convolutional Neural Network (CNN) classifier achieved a training accuracy of 99.80% and a test accuracy of 99.13%, with a training loss of 0.005, a validation loss of 0.03, and a false positive rate of 0.0011. The CNN with stratified k-fold cross-validation achieved an accuracy of 98.35%, with an inference time of 0.000150 s per instance. For model interpretability, Shapley Additive Explanations (SHAP) were used, and the false positive rate was analyzed. The best model is the CNN classifier on the URL-based phishing dataset, with the highest accuracy of 99.80%.
Vipin Kumar, Kakali Chatterjee
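URL-based phishing classifiers of the kind evaluated above typically start from simple lexical features of the URL string. The feature set below is a hypothetical illustration, not the paper's exact one.

```python
import re

def url_features(url):
    """Extract simple lexical features from a URL, of the kind commonly
    fed to phishing classifiers (hypothetical feature set).
    """
    return {
        "length": len(url),                       # long URLs are suspicious
        "num_digits": sum(c.isdigit() for c in url),
        "num_dots": url.count("."),               # many subdomains
        "has_at": "@" in url,                     # '@' hides the real host
        "has_https": url.startswith("https://"),
        "has_ip": bool(re.search(r"\d{1,3}(\.\d{1,3}){3}", url)),
    }

feats = url_features("http://192.168.0.1/login@secure-paypa1.com")
print(feats["has_ip"], feats["has_at"])  # True True
```

Vectors like this (or the raw character sequence, for a CNN) are then fed to the classifier along with a phishing/legitimate label.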
Unsupervised Pattern Discovery in Cyber Incidents Using Principal Component Analysis, K-Means, DBSCAN, and Isolation Forest
Abstract
With exponentially increasing cybersecurity threats, the global digital infrastructure must respond to unprecedented challenges, and more advanced analysis approaches are required to identify threat patterns and anomalous behavior. This work proposes a comprehensive unsupervised learning framework for clustering and anomaly detection on worldwide cyber-incident data from 2015 to 2024. Our methodology uses three complementary approaches: K-Means to cluster incident patterns, DBSCAN for density-based cluster analysis, and Isolation Forest to detect anomalies. In this study, a comprehensive dataset of cyber incidents is analyzed with their (non-mutually exclusive) features, such as targeted industry, type of attack, impact, affected users, and geography. Principal Component Analysis (PCA) was used for dimensionality reduction and for visualizing the clustering results in two-dimensional space. The K-Means analysis groups the incidents into four (4) clusters, identifying four (4) core incident patterns, which DBSCAN in contrast flags as noise. The Isolation Forest algorithm effectively identified high-impact anomalous incidents with anomaly scores between 0.46 and 0.62. Our performance evaluation shows that the Isolation Forest results (89% silhouette score, 87% anomaly precision) outperform those of traditional clustering methods. By offering automated tools for threat categorization, pattern recognition, and early warning of anomalous cyber activities suggesting potential sophisticated or emerging attack vectors, the research contributes both to cybersecurity intelligence and to various applications.
Ananjan Maiti, Rupak Chakraborty, Dipankar Basu, Indranil Sarkar, Arpita Dutta
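The PCA step used for two-dimensional visualization can be written in a few lines via SVD. The data shape here is synthetic; the abstract does not specify the encoded feature dimensionality.

```python
import numpy as np

def pca_2d(X):
    """Project feature vectors onto their first two principal
    components via SVD, as used to visualize clusters in 2-D.
    """
    Xc = X - X.mean(axis=0)          # center each feature
    # Rows of Vt are principal directions, ordered by singular value
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:2].T

rng = np.random.default_rng(7)
X = rng.normal(size=(200, 6))        # e.g. 200 encoded incidents, 6 features
Z = pca_2d(X)
print(Z.shape)  # (200, 2)
```

The 2-D projection `Z` is what gets scatter-plotted with K-Means labels, DBSCAN noise flags, or Isolation Forest anomaly scores as colors.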
CyberTracker: A Web Portal for Cyber Fraud Monitoring and Interactive Querying Using Lightweight Language Models
Abstract
In the modern era, cyber threats are escalating rapidly in complexity, frequency, and impact. Traditional reactive cyber security mechanisms are no longer sufficient to combat sophisticated attacks such as ransomware, phishing, and zero-day exploits. Addressing this urgent need, we present CyberTracker, an integrated real-time system for cyber crime data acquisition, analysis, visualization, and intelligent exploration. The system employs a multi-phase methodology, beginning with automated web scraping to collect cyber incident data from diverse sources, including cyber security intelligence websites, social media platforms, and news outlets. CyberTracker enhances real-time threat visibility, supports proactive decision-making, and contributes to strengthening cyber security posture against emerging threats. It transforms structured data into a graph-based representation for visual exploration of entity relationships and incident patterns. By automating threat intelligence collection, enabling intelligent interaction, and offering visual analytics, the platform provides a comprehensive, scalable, and robust solution for modern cyber crime monitoring.
A. Vikasni, B. Jagathsri, A. M. Abirami
Backmatter
Title
Data Science and Network Engineering
Edited by
Suyel Namasudra
Nirmalya Kar
Sarat Kumar Patra
Byung-Gyu Kim
Copyright Year
2026
Electronic ISBN
978-3-032-07735-6
Print ISBN
978-3-032-07734-9
DOI
https://doi.org/10.1007/978-3-032-07735-6

The PDF files of this book were created in accordance with the PDF/UA-1 standard to improve accessibility. This includes support for screen readers, described non-textual content (images, graphs), bookmarks for easy navigation, keyboard-friendly links and forms, and searchable, selectable text. We recognize the importance of accessibility and welcome enquiries about the accessibility of our products. For questions or accessibility needs, please contact us at accessibilitysupport@springernature.com.
