Data Science and Network Engineering
Proceedings of ICDSNE 2025
- 2026
- Book
- Editors
- Suyel Namasudra
- Nirmalya Kar
- Sarat Kumar Patra
- Byung-Gyu Kim
- Book Series
- Lecture Notes in Networks and Systems
- Publisher
- Springer Nature Switzerland
About this book
This book includes research papers presented at the International Conference on Data Science and Network Engineering (ICDSNE 2025), organized by the Department of Computer Science and Engineering, National Institute of Technology Agartala, Tripura, India, during July 18–19, 2025. It includes work from researchers, academicians, business executives, and industry professionals on solving real-life problems using advancements and applications of data science and network engineering. The book covers many advanced topics, such as artificial intelligence (AI), machine learning (ML), deep learning (DL), computer networks, blockchain, security and privacy, the Internet of Things (IoT), cloud computing, big data, and supply chain management. Its sections are intended to benefit researchers working in the field of data science and network engineering.
Table of Contents
Frontmatter
Computational Intelligence
Frontmatter
Ensemble Classifier for Real-Time Breast Cancer Classification on Histopathology Images
Jacinta Potsangbam, Harsh Kumar, Salam Shuleenda Devi
Abstract
Breast cancer is one of the most prevalent cancers among women worldwide, and early detection significantly improves treatment outcomes and survival rates. Deep learning (DL) models have emerged as a powerful solution to this challenge. This paper presents a deep learning ensemble-based approach for the classification of breast cancer using histopathological images. The pre-trained VGG16 and MobileNetV2 are combined in the ensemble model to perform binary classification. The sequential depth of VGG16 and the depthwise separable convolutions of MobileNetV2 improved generalization and minimized overfitting. The proposed ensemble achieves a maximum accuracy of 99.14%. The results demonstrate the effectiveness of the deep learning ensemble in accurately classifying breast histopathology images as cancerous or non-cancerous. To assist medical professionals in diagnosing breast cancer from histopathology images, a web server-based application is designed. Its frontend, built with ReactJS and Bootstrap, makes it easy for users to upload biopsy images, and its FastAPI backend processes them promptly, using a TensorFlow-based convolutional neural network ensemble to predict whether each image is cancerous or non-cancerous, along with an associated confidence score.
A Smart Surveillance Framework for Real-Time Suspicious Activity Detection and Automated Alert Generation Using YOLOv8
Anurag De, Venkata Naga Durga Sowmya Kollipara, Gautam Pal, Meghana Bellamkonda, Sai Surya Nikhil Vissapragada
Abstract
As security threats evolve, conventional surveillance systems often fail to detect suspicious activities. This study describes an AI-driven Smart Surveillance System that monitors and analyzes unusual behavior in real time using YOLOv8, a high-performing deep learning algorithm. In contrast to traditional methods, which largely depend on manual monitoring or rule-based techniques, this system learns through observation to recognize potentially threatening patterns with high accuracy and speed. Additionally, the application of the latest object detection and deep learning techniques strengthens security surveillance by minimizing response time and enhancing situational awareness. Evaluation results demonstrate the effectiveness of the proposed system, which achieves a detection accuracy of 96%, with significant improvements in processing speed and false alarm reduction compared to traditional approaches. This research presents a scalable and responsive solution for smart surveillance, offering proactive security monitoring through deep learning advancements.
Enhancing E-Commerce Trust: An Integrated Product Recommendation and Fake Review Detection System
Poranki Anusha, Buse Sudeep Sahas, T. Lakshmi Surekha
Abstract
To improve user experience and influence purchase decisions, e-commerce systems rely heavily on user ratings and tailored suggestions. However, the growing number of fraudulent reviews damages consumer confidence and compromises the accuracy of recommendations. To increase the dependability of e-commerce, this research proposes an integrated system that combines a sentiment-driven, feature-based product recommendation model with a fake review detection method. GloVe embeddings, KMeans clustering, and sentiment analysis models such as CatBoost and LightGBM are used to assess customer reviews. In parallel, the Isolation Forest method, which is based on anomaly detection, is used to identify and filter fraudulent reviews. The system's overall accuracy improved significantly once fake review detection was applied, illustrating the influence of genuine input on recommendation quality. Experimental results confirm that removing fraudulent reviews improves model performance while giving consumers more accurate, tailored product recommendations. For contemporary e-commerce platforms, the proposed system provides a scalable way to boost consumer confidence.
Hardware-Efficient Neural Network for Voice Disorder Classification from Multi-Source Datasets
Jyoti Mishra, R. K. Sharma
Abstract
This paper presents a hardware-efficient neural network-based approach for classifying healthy voices and pathological conditions including functional dysphonia, hyperkinetic dysphonia, hypokinetic dysphonia, and reflux laryngitis. Using two widely used datasets, the Saarbruecken Voice Database (SVD) and the Voiced database, we extracted clinically relevant acoustic features such as jitter, shimmer, harmonics-to-noise ratio (HNR), fundamental frequency (f₀), and formant frequencies to train a compact convolutional neural network (CNN) optimized for deployment on Field Programmable Gate Array (FPGA) platforms. The proposed model achieved an overall classification accuracy of 91.4%, with consistently high sensitivity and specificity across all five categories. Post-training quantization was applied to convert the model into an 8-bit fixed-point representation, significantly reducing memory usage and computational overhead. The CNN was synthesized using Vivado High-Level Synthesis (HLS) and implemented on a Zynq-7000 FPGA, where it achieved real-time inference with an average latency of 80 ms per sample. These results confirm the feasibility of integrating deep learning-based voice diagnostics into portable, low-power clinical devices. The system's robustness, efficiency, and accuracy make it highly suitable for early detection and continuous monitoring of voice disorders in both clinical and telemedicine environments.
Predictive Maintenance on C-MAPSS Using LSTM Variants and Attention
Anshuman Sinha, Gaurav Singh Rajput, Utkarsh Raj, Dilip Kumar Choubey
Abstract
Accurate prediction of Remaining Useful Life (RUL) is critical for effective predictive maintenance in industrial systems. This study improves RUL forecasting performance through signal processing techniques, avoiding the complexity of hybrid deep learning models. Using the NASA Commercial Modular Aero-Propulsion System Simulation FD001 dataset, which provides run-to-failure sensor data from turbofan engines, we applied Kalman Filtering and multi-level Discrete Wavelet Transform for noise reduction and feature enhancement. Low-variance features were discarded, and Min-Max normalization was used for data scaling. Input sequences were generated using a sliding window of length 50. Four Long Short-Term Memory (LSTM)-based models were developed for comparison: a baseline LSTM, a Bi-directional LSTM, and two variants incorporating Multi-Head Attention. While attention mechanisms improved average performance in terms of Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE) by enhancing temporal focus, they also introduced higher variance and occasional large errors. In contrast, signal processing significantly improved input quality, model stability, and convergence, enabling simpler architectures to achieve competitive results. All models were trained using MSE loss and evaluated on MAE and RMSE. The results highlight that well-designed signal processing pipelines can enhance RUL prediction accuracy while reducing reliance on complex neural network architectures.
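The Min-Max normalization and the sliding window of length 50 mentioned in this abstract can be illustrated with a minimal sketch. The function names and the toy sensor series are assumptions made for illustration, not the authors' code or C-MAPSS data.

```python
def min_max_scale(column):
    """Scale a list of numbers to [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


def sliding_windows(series, window=50):
    """Return every contiguous run of `window` time steps (empty if too short)."""
    return [series[i:i + window] for i in range(len(series) - window + 1)]


# Hypothetical single-sensor readings for one engine (not real C-MAPSS data).
readings = [float(t) for t in range(120)]
scaled = min_max_scale(readings)
windows = sliding_windows(scaled, window=50)
print(len(windows), len(windows[0]))
```

A series of 120 time steps yields 71 overlapping windows of 50 steps each; each window would then serve as one input sequence for a sequence model.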
Unveiling Ebola–Human Protein Links Through Network Embedding and Unsupervised Machine Learning
Sujoy Chatterjee, Koyel Mandal
Abstract
Understanding the interaction mechanism between host proteins and the Ebola virus is essential for targeted drug therapeutics. Although a range of studies on host proteins and the Ebola virus is available, protein-protein interaction (PPI) databases are sparse and thus biased toward well-studied proteins. Furthermore, the majority of these studies take a supervised learning approach, even though a sufficiently large labeled dataset is not available. In this study, we present an unsupervised computational framework to predict associations between non-interacting Ebola-human protein pairs. We begin by constructing a bipartite graph from the known Ebola-human PPI dataset and then employ a low-dimensional embedding technique, Node2Vec, to capture the structural and relational characteristics of the proteins. A clustering algorithm is then applied to the human protein embeddings to unveil significant modules, which are validated through functional enrichment analysis. From these clusters, bipartite graphs are reconstructed, including known Ebola-human proteins, and further analyzed using Node2Vec and cosine similarity. Our framework successfully predicts new PPIs, many of which are indirectly supported by literature evidence. The suggested method can be used for various host-pathogen interaction investigations and provides a data-driven strategy for systematic PPI identification.
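The cosine-similarity step mentioned in this abstract can be sketched as follows. The embedding vectors and protein names are hypothetical placeholders; in the described framework the vectors would come from Node2Vec, which is not reproduced here.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_candidates(virus_vec, human_vecs):
    """Sort human proteins by similarity to one viral protein, highest first."""
    scored = [(name, cosine_similarity(virus_vec, vec))
              for name, vec in human_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical 2-D embeddings; real embeddings would have many more dimensions.
virus_embedding = [1.0, 0.0]
human_embeddings = {
    "human_A": [2.0, 0.0],
    "human_B": [0.0, 1.0],
    "human_C": [1.0, 1.0],
}
ranked = rank_candidates(virus_embedding, human_embeddings)
print([name for name, _ in ranked])
```

Pairs near the top of such a ranking would be candidate interactions to examine further; any cut-off for calling a pair a prediction is a separate design choice not specified in the abstract.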
Discount Optimisation in Food Delivery Using Machine Learning
Vaishnav Dineshkumar Prajveen, S. Dilipkumar, Akriti Saigal
Abstract
The rapid growth of online food delivery services has generated vast amounts of data that offer valuable insights. This data provides significant information about consumer behaviour, operational efficiency, and market trends. This study presents a comprehensive machine learning (ML) pipeline to analyse food delivery data, integrating demographic attributes, order details, and food preferences. The research includes data preprocessing, exploratory data analysis (EDA), regression modelling, hyperparameter tuning, and feature importance evaluation. Additionally, we applied advanced customer segmentation techniques to identify key consumer groups along with their distinct characteristics. The findings reveal crucial factors influencing delivery time, order value, and customer satisfaction. By leveraging predictive modelling and clustering algorithms, this study offers actionable intelligence for stakeholders to optimise service quality, manage inventory, personalize offerings, provide tailored menus, optimise discounts, and enhance operational strategies in the food delivery ecosystem.
A Scalable Real-Time Stock Market Prediction Framework Using LSTM Network and XGBoost Model
Amit Kumar Roy, Munsifa Firdaus Khan Barbhuyan, Satyabrata Nath
Abstract
This paper introduces a real-time stock price prediction system that unifies machine learning (ML), deep learning, and data engineering methods to improve forecasting precision. The system integrates historical stock information and news sentiment analysis to make accurate predictions. Historical data are accessed through Yahoo Finance, whereas news articles relevant to the stocks are accessed using the News API to incorporate the effects of the external market. The framework uses Long Short-Term Memory (LSTM) networks and XGBoost models, combined through a stacked ensemble method to obtain higher accuracy and resilience. A FastAPI backend exposes API endpoints for prediction and history and is backed by Redis for real-time data streaming and storage. A dedicated Redis processor listens to streaming stock data, processes it, and generates predictions from the trained models. In addition, a Streamlit dashboard provides an interactive, user-friendly view of the predictions, historical patterns, model comparison, and performance measures. This paper provides an example of a scalable, real-time stock market prediction solution that can aid investors and analysts in making data-driven financial choices.
Smart Agriculture: A Deep Learning Framework for Early Plant Disease Identification
Joyeeta Das, Sanjib Debnath, Swapan Debbarma
Abstract
Agriculture is essential for food security, yet crop diseases pose a substantial risk to agricultural productivity by causing major losses in both yield quantity and crop quality. Traditional plant disease diagnostic approaches are resource-intensive and slow, often requiring expert skills, making them inaccessible to many farmers. This paper presents a system for automated plant disease identification using Convolutional Neural Networks (CNNs), which excel at image classification. The CNN learns to categorize several diseases through training on labeled leaf images from both healthy and diseased plants. A user-friendly Gradio interface allows users to upload images and receive quick diagnoses. The system enables early disease detection, providing farmers with fast and reliable results to support timely treatment decisions. The experimental findings demonstrate the system's robustness and its ability to operate effectively in agricultural environments lacking sufficient resources.
Real-Time Intelligence Surveillance Using Object Detection and Facial Recognition on Edge Devices
Charanarur Panem, Debasmita Karmakar, Himanshu Yadav, Anshul Rajkumar, Yuvraj Mishra, Tanmayee Anasingaraju, Harish Ogare, Albert Gautam, Laishram Hemanta Singh, Naveen Kumar Chaudhary, Suman Deb
Abstract
This research presents a lightweight AI system for real-time object detection and facial recognition on edge computing platforms like the Raspberry Pi. By integrating YOLOv8 for object detection and DeepFace for facial analysis with OpenCV, the system performs efficient, offline inference on resource-constrained devices without relying on cloud infrastructure. A custom domain-specific dataset enhances detection precision and recognition accuracy. Designed for forensic, surveillance, and law enforcement applications, the unified architecture enables low-latency, privacy-preserving analysis directly on-site. The modular design supports diverse scenarios such as intelligent surveillance, suspect identification, and demographic analysis. DeepFace extends functionality by enabling real-time face detection, emotion recognition, and demographic estimation (age, gender) from live camera feeds. The compact deployment demonstrates how artificial intelligence can operate effectively on edge devices for time-sensitive, real-world applications. This work emphasizes the role of modern AI models, privacy-aware offline systems, and tailored datasets in improving accuracy. It establishes a scalable foundation for broader applications in intelligent visual monitoring and forensic investigations, showing how edge AI can support secure and efficient decision-making in evolving security landscapes.
Sentiment Analysis in Kokborok: Building Resources and Models for a Low-Resource Language
Tijeli Debbarma, Abhijit Sinha, Sagarika Sengupta, Jay Krishna Das, Himanish Shekhar Das
Abstract
This paper presents a novel and comprehensive study of sentiment analysis for Kokborok, a low-resource Tibeto-Burman language primarily spoken in the Indian state of Tripura. The lack of annotated corpora and linguistic tools has significantly impeded the development of Natural Language Processing (NLP) applications for Kokborok. Addressing this challenge, we developed a manually verified and annotated dataset consisting of 7,521 Kokborok sentences, each labeled as expressing a positive, negative, or neutral sentiment. To classify sentiment into these three classes, we implemented and evaluated four traditional machine learning models: Support Vector Machine (SVM), Random Forest, Logistic Regression, and Naive Bayes. We employed two feature extraction techniques, Bag-of-Words (BoW) and Term Frequency–Inverse Document Frequency (TF-IDF), in separate pipelines and, given the class imbalance in the dataset, ran each both with and without oversampling. The comparative analysis showed that oversampling combined with BoW features and the Logistic Regression model yielded the highest accuracy, 100% in training and 90% in testing. With TF-IDF features, the highest accuracy was 100% in training and 89% in testing, obtained with the Random Forest and SVM models. Without oversampling, however, accuracy and the other evaluation metrics were very poor for both feature extraction techniques. Hence, the pipeline combining Logistic Regression, BoW, and oversampling is an appropriate choice for this study.
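As a rough illustration of the two preprocessing ideas named in this abstract, the sketch below builds Bag-of-Words count vectors and balances classes by random duplication. The abstract does not say which oversampling method was used, so random oversampling is an assumption here, and the English tokens are placeholders rather than Kokborok data.

```python
import random
from collections import Counter


def bag_of_words(sentence, vocabulary):
    """Count how often each vocabulary word occurs in a whitespace-tokenized sentence."""
    counts = Counter(sentence.lower().split())
    return [counts[word] for word in vocabulary]


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until all classes match the largest."""
    rng = random.Random(seed)
    by_label = {}
    for sample, label in zip(samples, labels):
        by_label.setdefault(label, []).append(sample)
    target = max(len(group) for group in by_label.values())
    out_samples, out_labels = [], []
    for label, group in by_label.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for sample in group + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Placeholder sentences and labels, for illustration only.
sentences = ["good good film", "good story", "fine film", "bad film"]
labels = ["positive", "positive", "positive", "negative"]
vocabulary = ["good", "bad", "fine", "film", "story"]

balanced_sentences, balanced_labels = random_oversample(sentences, labels)
features = [bag_of_words(s, vocabulary) for s in balanced_sentences]
print(Counter(balanced_labels))
```

Oversampling is normally applied to the training split only, so that duplicated sentences do not leak into the test set.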
An Evolutionary Framework for Robust Abrupt Transition Detection in Video Sequences
Gautam Pal, Tushar Banik, Saptarshi Chakraborty, Abhijit Biswas, Anurag De
Abstract
Accurate detection of abrupt transitions, or hard cuts, plays a vital role in video analysis tasks such as indexing, summarization, and scene segmentation. However, traditional fixed-threshold methods often struggle in dynamic conditions due to their sensitivity to noise, motion, and lighting variations. To address this, we propose a novel approach that employs Generalized Normal Distribution Optimization (GNDO) to adaptively determine an optimal threshold for identifying abrupt scene changes. The method begins by extracting grayscale frames and calculating pixel-wise differences between consecutive frames, forming the basis for transition analysis. GNDO then evolves a population of candidate thresholds using a balance of exploration and exploitation to maximize detection accuracy while minimizing false positives. A margin-based post-processing step further refines the output by filtering out weak or insignificant variations. Experimental evaluations on diverse video datasets demonstrate that the proposed GNDO-driven strategy significantly enhances abrupt transition detection performance compared to conventional techniques. The adaptive thresholding mechanism and evolutionary learning not only improve robustness but also eliminate the need for manual parameter tuning. The method achieves average recall, precision, and F1 score of 98.1%, 96.3%, and 97.1%, respectively, making it well suited to real-world video processing applications.
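The frame-differencing step described in this abstract can be sketched in a few lines. This example uses a fixed, hand-picked threshold purely for illustration; it does not implement GNDO, which in the described method would choose the threshold adaptively. The tiny 2×2 "frames" and function names are hypothetical.

```python
def mean_abs_diff(frame_a, frame_b):
    """Mean absolute pixel-wise difference between two same-sized grayscale frames."""
    total = 0
    count = 0
    for row_a, row_b in zip(frame_a, frame_b):
        for pixel_a, pixel_b in zip(row_a, row_b):
            total += abs(pixel_a - pixel_b)
            count += 1
    return total / count


def detect_cuts(frames, threshold):
    """Return each index i where the change from frame i-1 to frame i exceeds the threshold."""
    return [i for i in range(1, len(frames))
            if mean_abs_diff(frames[i - 1], frames[i]) > threshold]


# Hypothetical grayscale frames: two dark frames followed by two bright ones.
dark_1 = [[10, 12], [11, 10]]
dark_2 = [[11, 12], [10, 10]]
bright = [[200, 210], [205, 198]]
frames = [dark_1, dark_2, bright, bright]
cuts = detect_cuts(frames, threshold=50)
print(cuts)
```

Small frame-to-frame noise (here a mean difference of 0.5) stays below the threshold, while the jump from dark to bright content is flagged as a hard cut at index 2.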
Measuring and Evaluation of Power Density Emitted by Communication Towers with Location and Time for Selected Locations in Karbala, Iraq
Rusul Amer Mahdi, Fadel Abdul Zahra Mourad
Abstract
This study aims to measure and evaluate the power density of electromagnetic radiation emitted by cell phone towers at three locations in Karbala, Iraq. The first location is the area between the Holy Shrines of Imam Hussain and Abbas, which represents a religious area; the second is Al-Hussein District Street, which represents a commercial area; and the third is Al-Ghadeer District, a residential area. Field measurements were performed using an HF 59B Analyser. This device is distinguished by its accuracy in measuring the power density of electromagnetic radiation in the bands used in modern communication systems. The results showed that the highest radiation power density was recorded at 6:00 PM, with the average radiation intensity in some evening periods exceeding 2,000 µW/m². This is attributed to increased human activity during the evening hours and increased use of mobile phones during that time. In the Al-Ghadeer District, peak radiation was observed at 1:00 PM, with levels of approximately 69–91 µW/m² during peak times. Al-Hussein District Street, which is predominantly commercial, recorded relatively lower radiation levels than the other areas, with readings ranging from 20 to 55 µW/m² throughout most of the day.
Grey Wolf Optimization and PCA-Based Hybrid Method for Dimensionality Reduction
Amit Kumar Saxena, Damodar Patel, Gayatri Sahu, Abhishek Dubey, Shreya Chinde, Umesh Kumar Shriwas
Abstract
High-dimensional datasets across various domains, including healthcare, industry, and social media, present challenges such as overfitting, computational expense, and diminished interpretability. Identifying the most relevant variables through feature selection is crucial for addressing these issues. In the proposed method, the Grey Wolf Optimizer (GWO) algorithm is used in the first step to select the best features; in the second step, the selected features are reduced using the PCA algorithm. This investigation assessed GWO and PCA across ten high-dimensional datasets, employing the selected features to train K-Nearest Neighbors (KNN) and Support Vector Machine (SVM) classifiers. The proposed method decreased feature counts by factors ranging from 10 to 100 while preserving or enhancing accuracy. The performance of KNN was frequently enhanced when using features selected through the proposed method. In comparison to PCA, ReliefF, mRMR, Chi-squared, SIFS, ATFS, EmPo, and FSM, the proposed GWO-based method demonstrated enhanced performance; it independently identified optimal subset sizes, thereby increasing its applicability to high-dimensional scenarios in practical settings.
Ensemble-Based Hostile Post Detection in Hindi Using Multilingual Pretrained Models
Santosh Rajak, Ujwala Baruah, Souvik Chowdhury
Abstract
Detecting hostile posts in Hindi on social media is challenging due to linguistic variability, informal usage, and code-switching in the Devanagari script. While prior efforts have addressed hostility detection, few have focused on fine-grained, multi-label classification using ensemble strategies suited for Hindi. In this work, we propose a novel ensemble-based framework for hostility classification across four categories: Defamation, Fake, Hate, and Offensive. The system integrates Bagging, Boosting, Simple Majority Voting, and Weighted Majority Voting over contextual embeddings derived from MuRIL, IndicBERT, and HindiBERT models. Principal Component Analysis (PCA) reduces dimensionality and computational complexity. Evaluation on the CONSTRAINT-2021 dataset demonstrates that our model achieves F1-scores of 0.8874 (Defamation), 0.9532 (Fake), 0.8653 (Hate), and 0.8790 (Offensive), outperforming prior work and recent benchmarks. The proposed model shows relative improvements of 45%, 13%, 28%, and 25% across the respective hostility classes. This demonstrates the effectiveness of combining multilingual transformer embeddings with ensemble strategies for hostile content classification. The approach offers a scalable, language-sensitive solution for detecting hostility in Hindi social media, supporting more respectful and safer digital interactions.
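Simple and weighted majority voting, two of the ensemble strategies named in this abstract, differ only in the weight each classifier's vote carries. The sketch below is a generic illustration; the labels and weights are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def weighted_majority_vote(predictions, weights):
    """Return the label whose supporting classifiers have the largest total weight."""
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)


# Hypothetical votes from three classifiers on one post for one hostility category.
votes = ["hostile", "not_hostile", "not_hostile"]

simple = weighted_majority_vote(votes, [1.0, 1.0, 1.0])      # equal weights
weighted = weighted_majority_vote(votes, [0.6, 0.25, 0.25])  # first model trusted most
print(simple, weighted)
```

With equal weights the two agreeing classifiers win; when the first classifier is weighted heavily enough (for example by its validation score, an assumed choice), its single vote can override the other two. For a multi-label task such a vote would be taken separately for each category.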
Extractive Text Summarization Using Feature Extraction for Single Document in Tamil Language
Shyamala, Mercy Evangeline
Abstract
With the tremendous amount of data available online, reading an article and running through its pages is a tedious process. A sneak peek at the big picture helps readers analyze the content and understand the essential message of a document. Text summarization provides a way to condense a given document in a meaningful way. The summarized content gives the gist of the original document, retaining the important information to be conveyed to the user. In this paper, automatic text summarization is performed on single documents using Fuzzy Logic inference rules. The summarization is based on features extracted from the document. For feature extraction, the Frequency Based Feature Extraction Technique (FBFET) is applied. The implementation uses a NOUN-VERB IDENTIFIER to recognize the prominent category and select features based on their frequency in the entire document. Sentences are scored using the important characteristics that define the rules for the Fuzzy Logic Inference Engine. High-priority sentences are extracted to generate the summary of the document. The performance is measured and analyzed using the ROUGE (Recall-Oriented Understudy for Gisting Evaluation) method. The proposed automatic text summarizer is implemented with a feature extraction module that enhances the summarization process.
Continuous Class Conqueror: Class Incremental Continual Learning on Video Violence Data
Gaganrajdeep Singh, Manish Kumar, Kanu Goel
Abstract
Continual learning in deep neural networks, in the context of video violence detection, faces the significant challenge of catastrophic forgetting, where a model loses previously learned knowledge upon encountering new classes. In this study, we address class-incremental learning for video violence detection and propose a framework for this unexplored domain. Initially, a 3D convolutional neural network model is trained to classify Normal and Violence classes, achieving accuracy of up to 99%. However, upon introducing a new class (Weaponized), the model demonstrates substantial performance degradation on the original classes, highlighting the impact of catastrophic forgetting. To mitigate this, a hybrid continual learning strategy, Continuous Class Conqueror (CCC), is proposed, which combines Learning without Forgetting (LwF) with a Replay technique. Experimental results show that the Continuous Class Conqueror approach effectively preserves the model's performance on previously learned classes, retaining accuracy of up to 79% on them while learning new classes incrementally with an accuracy of 92% on video violence data. These results validate the importance of hybrid continual learning strategies.
Comparative Evaluation of Machine Learning Models in Forecasting Crop Yields Amid Climate Change
Sally Aboulhosn, Mariam Akkawi, Seifedine Kadry
Abstract
Climate change increasingly threatens global agriculture through rising carbon dioxide (CO₂) emissions, temperature anomalies, and irregular rainfall. Accurate crop yield prediction is therefore essential for ensuring food security and effective adaptation planning. This study systematically compares three machine learning models, Multiple Linear Regression (MLR), Random Forest (RF), and Extreme Gradient Boosting (XGBoost), for predicting crop yields using an extensive, multi-country dataset with climate and soil variables. We introduce a robust preprocessing pipeline that includes Gaussian noise-based augmentation, anomaly-based feature engineering, and dual normalization strategies to improve model generalisability under climate stress. Performance is assessed across different training sizes (70/30 and 80/20 train-test splits) and hyperparameter configurations. XGBoost consistently outperforms the other models, achieving the lowest MSE (0.3841) and the highest R² (0.6186) thanks to its ability to model nonlinear climate-yield interactions effectively. Key insights include (1) aridity index and temperature anomalies as dominant predictors, (2) water management and crop rotation as effective adaptation strategies, and (3) preprocessing as crucial for model robustness. This work presents a scalable and interpretable framework for applying machine learning to climate-resilient agriculture.
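The two metrics reported in this abstract, MSE and R², follow from their standard definitions, shown here as a small sketch. The toy values are hypothetical and unrelated to the paper's dataset.

```python
def mse(y_true, y_pred):
    """Mean squared error: the average of squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are identical")
    return 1 - ss_res / ss_tot


# Hypothetical yields and predictions, for illustration only.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 5.0]
print(mse(actual, predicted), r2_score(actual, predicted))
```

Here the single error of 1.0 gives an MSE of 0.25, and since the total sum of squares is 5.0, R² is 1 - 1/5 = 0.8. An R² of 0.6186 therefore means the model explains roughly 62% of the variance in the target.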
Hierarchical Verification of Speculative Beams for Accelerating LLM Inference
Jaydip Sen, Harshitha Puvvala, Subhasis Dasgupta
Abstract
Large language models (LLMs) have achieved impressive success across natural language processing tasks but face persistent challenges in inference efficiency due to their autoregressive nature. While speculative decoding and beam sampling offer improvements, traditional methods verify draft sequences sequentially and uniformly, causing unnecessary overhead. This work proposes the Hierarchical Verification Tree (HVT), a framework that restructures speculative beam decoding by prioritizing high-likelihood drafts and pruning suboptimal candidates early. A formal verification-pruning algorithm ensures correctness and efficiency. HVT integrates with standard LLM pipelines without retraining. Experiments demonstrate that HVT outperforms existing methods, achieving significant reductions in inference time and energy consumption while maintaining output quality. These results highlight the promise of hierarchical verification for accelerating LLM inference.
Artificial Intelligence for Autism Detection Using EEG Signals and Lightweight Neural Networks
Afifa Shaikh, Aditya Koli, Pradeep Awubaigol, Soham Mali, Rajashri Khanai, Prema Akkasaligar
Abstract
This paper describes a deep learning framework for diagnosing Autism Spectrum Disorder (ASD) from electroencephalogram (EEG) signals, with an emphasis on real-time and mHealth solutions. Two heterogeneous EEG datasets, BCIAUT-P300 (with ASD subjects) and SPIS Resting State (control subjects), were preprocessed and standardized to a common shape. For every sample, a compact and informative representation was constructed by extracting the mean and variance of eight EEG channels over 350 epochs. ASD cases were classified against control cases using a lightweight CNN-LSTM hybrid architecture, which yielded a test accuracy of 94.74% and an ROC AUC score of 0.90. Subsequently, post-training pruning was applied, reducing the model size by over 40% and allowing deployment via TFLite without any performance loss. The model was then embedded in an Android app, enabling offline, on-device inference with average prediction times of 4–5 seconds per sample. This method achieves high classification accuracy while prioritizing temporal modeling and user-centered design. By combining the extraction of EEG statistical features with deep sequential models for mobile implementation, the system becomes a scalable, easy-to-interpret, and low-latency solution for ASD screening that can be used in clinical as well as resource-limited environments.
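The statistical feature extraction mentioned in this abstract (mean and variance per EEG channel) can be sketched as below. The layout of a sample as 8 channels with 350 values each is one reading of the abstract and is an assumption, as are the function name, the use of population variance, and the synthetic signal.

```python
from statistics import fmean, pvariance


def channel_features(sample):
    """Return per-channel means followed by per-channel population variances."""
    means = [fmean(channel) for channel in sample]
    variances = [pvariance(channel) for channel in sample]
    return means + variances


# Synthetic sample (not EEG data): channel c alternates between c and c + 1.
sample = [[float(c + t % 2) for t in range(350)] for c in range(8)]
features = channel_features(sample)
print(len(features))
```

Eight channels produce a 16-value feature vector (8 means, then 8 variances), which is far smaller than the raw 8 × 350 signal and is the kind of compact representation a lightweight model can consume.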
Assessment of Flood Potential Through Rainfall Pattern Analysis
Aditya Gupta, Vibha Jain
Abstract
Floods pose a major threat to lives, infrastructure, and ecosystems, particularly in areas where extreme rainfall events are becoming increasingly frequent. With climate change and urbanization advancing at a fast pace, most areas are now more exposed to floods than ever before. In this research, we examine over a century of rainfall data, from 1901 to 2024, for all Indian states to better understand flood-prone conditions. A Bidirectional Long Short-Term Memory (Bi-LSTM) model is used to learn the temporal dependencies of rainfall sequences and forecast flood potential from past trends. The model is evaluated using performance metrics such as accuracy, precision, recall, and F1-score. The results demonstrate that the Bi-LSTM method captures rainfall's complex spatio-temporal patterns with an accuracy of 96.2% and gives robust predictive performance in detecting high-risk areas for flooding.
Leveraging EfficientNet Architectures for Noise Robust Speech Recognition: An Empirical Study with AudioMNIST Dataset
Arpita Choudhury, Amisha Singh, Pinki Roy, Sivaji Bandyopadhyay
Abstract
Speech recognition has become ubiquitous in modern AI applications. It converts spoken words into text using learning algorithms to make human-machine interaction more convenient. Conventional methods that use mel-frequency cepstral coefficients (MFCC) features for speech recognition often lack efficiency in the presence of environmental noise. Advanced deep learning algorithms have the potential to address this challenge. This article presents an empirical study for a holistic comparison among EfficientNet-based speech recognition models. EfficientNet architectures deal with the trade-off between performance and computational efficiency through compound scaling, mobile inverted bottleneck convolution, and squeeze and excitation blocks. The presented work trains and tests EfficientNetB0-B7 with mel-spectrogram images extracted from the benchmark AudioMNIST dataset. The original acoustic signals in the dataset are superimposed with babble and Gaussian white noise to produce noisy data of varying intensities. The results show that all EfficientNet architectures have outperformed traditional methods, achieving above 90% accuracy even in noisy environments without the need for any noise removal method.
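
The noise superimposition mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not say how noise intensity was controlled, so the sketch assumes intensity is set as a target signal-to-noise ratio (SNR) in decibels, covers only Gaussian white noise (not babble), and uses a synthetic sine tone as a hypothetical stand-in for an AudioMNIST recording. The function names are illustrative.

```python
import math
import random


def signal_power(samples):
    """Mean squared amplitude of a sequence of samples."""
    return sum(x * x for x in samples) / len(samples)


def add_white_noise(clean, snr_db, seed=0):
    """Return (noisy, noise): `clean` plus Gaussian white noise at `snr_db`.

    SNR(dB) = 10 * log10(P_signal / P_noise), so the required noise power
    is P_signal / 10 ** (snr_db / 10).
    """
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in clean]
    target_noise_power = signal_power(clean) / (10 ** (snr_db / 10))
    # Rescale the raw noise so its measured power matches the target.
    scale = math.sqrt(target_noise_power / signal_power(noise))
    noise = [scale * n for n in noise]
    noisy = [c + n for c, n in zip(clean, noise)]
    return noisy, noise


# Hypothetical stand-in for a spoken-digit recording: a 440 Hz tone at 8 kHz.
clean = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(8000)]
noisy, noise = add_white_noise(clean, snr_db=10)
measured_snr = 10 * math.log10(signal_power(clean) / signal_power(noise))
```

Lowering `snr_db` produces a noisier signal, which is one way to obtain data of varying noise intensity before computing mel-spectrograms.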
-
-
Computer Networks and Cybersecurity
-
Frontmatter
-
Raspberry Pi NAS as a Self-hosted Private Cloud: Design, Implementation, and Performance Evaluation
Shipra Swati, Jatin Yadav, Vamsi E. Manoj
Abstract
In the modern world, significant data generation and consumption come with a need for flexible, scalable, and secure storage solutions. Self-hosted cloud services are an attractive alternative to commercial clouds. Building a private cloud may help to overcome issues with privacy, security, vendor lock-in, and the recurring costs that commercial services impose. The purpose of this research is to design, implement, and evaluate personal private clouds based on Network Attached Storage powered by Raspberry Pi. It investigates the possibilities of utilizing such privacy-centered cloud storage as an alternative to the mainstream services available. This paper presents a system architecture for configuring core private cloud functions with performance benchmarking and security assessment strategies. Simulated experimental results demonstrate the system's capabilities, including protecting sensitive data from common threats. The findings pinpoint critical areas for further enhancement and development while demonstrating the effectiveness of the low-cost method for personal and small-group private cloud applications.
-
Performance Evaluation of WFS Service Consumption with Python and Cython
Javier Felipe Moncada Sánchez, Yenny Espinosa Gómez, Carlos Enrique Montenegro Marín, Rubén González Crespo
Abstract
OGC spatial data services, particularly the Web Feature Service (WFS), are widely used and gaining popularity due to factors like open data policies and the implementation of spatial data infrastructures by governments and institutions. This study presents a detailed performance evaluation of WFS consumption through various processes such as connection, data capture, storage, and retrieval using Python and the OWSLib library. Additionally, it conducts a comparative performance analysis between CPython and Cython to determine efficiency differences in real execution scenarios. The experiments involved three international WFS services and implemented key algorithms in three modes: CPython, pure Cython, and Cython with static typing. The findings reinforce previous research highlighting Cython’s ability to optimize computational performance, particularly when static variable typing is applied. Significant execution time improvements were observed in specific tasks involving spatial data handling. These results have practical implications for geospatial application developers and can serve as a reference for improving high-performance spatial data workflows.
-
A Novel Hybrid Scheduling Approach for Enhancing Cloud System Performance
Prathamesh Vijay Lahande
Abstract
Several scheduling challenges exist during task computations, especially when algorithms lack intelligent mechanisms. Modern research confirms that Machine Learning (ML) algorithms enhance system performance through their intelligent mechanisms. This research used ML’s K-Means Clustering (KMC) algorithm along with the Shortest Job First (SJF) scheduling algorithm to design and implement a novel SJF-KMC hybrid scheduling approach to improve cloud system performance. This research experimented with Google Cluster real-time tasks in ten scenarios, with the number of deployed Virtual Machines varying from ten to one hundred, and implemented the SJF-KMC approach across five clustering configurations: SJF-K1, SJF-K3, SJF-K5, SJF-K7, and SJF-K9. The performance of these approaches was compared with each other and with the existing SJF algorithm across Average Network Latency (Avg_NL), Average Energy (Avg_E), Average Memory (Avg_M), and Average Throughput (Avg_T). Results show that the task clustering provided by the SJF-KMC algorithms improves cloud performance. Additionally, results indicate that minimizing task clusters during their transit phase from the user environment to the cloud environment (and vice versa) improves Avg_NL, that using minimal/maximum task clusters during computations enhances the Avg_E and Avg_M parameters, and that maintaining a minimal number of clusters during throughput calculations improves Avg_T. The intelligent mechanism of the SJF-KMC approach enhances scheduling, thereby improving overall cloud system performance.
-
IPDM: An Intelligent Phishing Detection Model for E-Commerce Websites
Vipin Kumar, Kakali Chatterjee
Abstract
Nowadays, with the exponential growth of digital connectivity and online services, phishing has emerged as one of the most common cyber threats, targeting individuals and organizations alike. Phishing attacks aim to deceive users into revealing sensitive information such as usernames, passwords, and financial credentials by mimicking legitimate websites. To help mitigate these risks, there is an urgent need for intelligent systems capable of detecting and preventing phishing attacks in real time. In this framework, various machine learning classification techniques are used: the Artificial Neural Network (ANN) achieved an accuracy of 98.90%, the Recurrent Neural Network (RNN) achieved an accuracy of 95.06%, and K-Nearest Neighbors (KNN) achieved an accuracy of 97.30%, while the Convolutional Neural Network (CNN) classifier reached a training accuracy of 99.80% and a test accuracy of 99.13%, with an accuracy loss of 0.005%, a validation loss of 0.03%, and a false positive rate of 0.0011 among the evaluated models. CNN with stratified k-fold cross-validation achieved an accuracy of 98.35%, and an inference time of 0.000150 s per instance was recorded. For model interpretability, Shapley additive explanations (SHAP) were used, and the false positive rate was analyzed. The best model here is the CNN classifier on the URL-based phishing dataset, with the highest accuracy of 99.80%.
-
Unsupervised Pattern Discovery in Cyber Incidents Using Principal Component Analysis K-Means DBSCAN and Isolation Forest
Ananjan Maiti, Rupak Chakraborty, Dipankar Basu, Indranil Sarkar, Arpita Dutta
Abstract
With cybersecurity threats increasing exponentially, the global digital infrastructure faces unprecedented challenges, and more advanced analysis approaches are required to identify threat patterns and anomalous behavior. This work proposes a comprehensive unsupervised learning framework that performs clustering and anomaly detection on cyber-incident data from around the world between 2015 and 2024. Our methodology uses three complementary approaches: K-Means to cluster incident patterns, DBSCAN for density-based cluster analysis, and Isolation Forest to detect anomalies. In this study, a comprehensive dataset of cyber incidents is analyzed with their (non-mutually exclusive) features, such as targeted industry, type of attack, impact, affected users, and geography. Principal Component Analysis (PCA) was used to reduce dimensionality and to visualize the clustering results in two-dimensional space. The K-Means analysis groups the incidents into four clusters and identifies four core incident patterns, which are nevertheless flagged as noise by DBSCAN. The Isolation Forest algorithm effectively identified high-impact anomalous incidents with anomaly scores between 0.46 and 0.62. Our performance evaluation shows that the results for the Isolation Forest (89% silhouette score, 87% anomaly precision) outperform those of traditional clustering methods. By offering automated tools for threat categorization, pattern recognition, and early warning of anomalous cyber activities that suggest potentially sophisticated or emerging attack vectors, the research contributes both to the field of cybersecurity intelligence and to various applications.
-
CyberTracker: A Web Portal for Cyber Fraud Monitoring and Interactive Querying Using Lightweight Language Models
A. Vikasni, B. Jagathsri, A. M. Abirami
Abstract
In the modern era, cyber threats are escalating rapidly in complexity, frequency, and impact. Traditional reactive cyber security mechanisms are no longer sufficient to combat sophisticated attacks such as ransomware, phishing, and zero-day exploits. Addressing this urgent need, we present CyberTracker, an integrated real-time system for cyber crime data acquisition, analysis, visualization, and intelligent exploration. The system employs a multi-phase methodology, beginning with automated web scraping to collect cyber incident data from diverse sources, including cyber security intelligence websites, social media platforms, and news outlets. CyberTracker enhances real-time threat visibility, supports proactive decision-making, and contributes to strengthening cyber security posture against emerging threats. It transforms structured data into a graph-based representation for visual exploration of entity relationships and incident patterns. By automating threat intelligence collection, enabling intelligent interaction, and offering visual analytics, the platform provides a comprehensive, scalable, and robust solution for modern cyber crime monitoring.
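
As a rough illustration of the graph-based representation mentioned in the abstract above, the sketch below turns a few structured incident records into an undirected graph and queries it for related incidents. It is not CyberTracker's implementation: the record fields (`id`, `attack_type`, `target`), the sample data, and the function names are all hypothetical, and the graph is a plain adjacency map rather than any particular graph database or library.

```python
from collections import defaultdict

# Hypothetical incident records; the field names are illustrative only.
incidents = [
    {"id": "inc-1", "attack_type": "phishing", "target": "bank-a"},
    {"id": "inc-2", "attack_type": "ransomware", "target": "hospital-b"},
    {"id": "inc-3", "attack_type": "phishing", "target": "hospital-b"},
]


def build_graph(records):
    """Link each incident node to the entity nodes it mentions."""
    graph = defaultdict(set)
    for rec in records:
        incident = ("incident", rec["id"])
        for field in ("attack_type", "target"):
            entity = (field, rec[field])
            # Undirected edge between the incident and the entity.
            graph[incident].add(entity)
            graph[entity].add(incident)
    return graph


def related_incidents(graph, incident_id):
    """Incidents that share at least one entity with `incident_id`."""
    start = ("incident", incident_id)
    related = set()
    for entity in graph[start]:
        for node in graph[entity]:
            if node != start:
                related.add(node[1])
    return sorted(related)


graph = build_graph(incidents)
```

With this sample data, `related_incidents(graph, "inc-3")` returns `["inc-1", "inc-2"]`, since `inc-3` shares an attack type with `inc-1` and a target with `inc-2`.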
-
-
Backmatter
- Title
- Data Science and Network Engineering
- Editors
- Suyel Namasudra
- Nirmalya Kar
- Sarat Kumar Patra
- Byung-Gyu Kim
- Copyright Year
- 2026
- Publisher
- Springer Nature Switzerland
- Electronic ISBN
- 978-3-032-07735-6
- Print ISBN
- 978-3-032-07734-9
- DOI
- https://doi.org/10.1007/978-3-032-07735-6
PDF files of this book have been created in accordance with the PDF/UA-1 standard to enhance accessibility, including screen reader support, described non-text content (images, graphs), bookmarks for easy navigation, keyboard-friendly links and forms and searchable, selectable text. We recognize the importance of accessibility, and we welcome queries about accessibility for any of our products. If you have a question or an access need, please get in touch with us at accessibilitysupport@springernature.com.