2020 | Book

# Data Management, Analytics and Innovation

## Proceedings of ICDMAI 2019, Volume 1

Editors: Prof. Neha Sharma, Dr. Amlan Chakrabarti, Prof. Valentina Emilia Balas

Publisher: Springer Singapore

Book Series: Advances in Intelligent Systems and Computing

This book presents the latest findings in the areas of data management and smart computing, big data management, artificial intelligence and data analytics, along with advances in network technologies. It addresses state-of-the-art topics and discusses challenges and solutions for future development. Gathering original, unpublished contributions by scientists from around the globe, the book is mainly intended for a professional audience of researchers and practitioners in academia and industry.

#### Data Management and Smart Informatics

##### Empirical Study of Soft Clustering Technique for Determining Click Through Rate in Online Advertising

Akshi Kumar, Anand Nayyar, Shubhangi Upasani, Arushi Arora
##### ASK Approach: A Pre-migration Approach for Legacy Application Migration to Cloud

Legacy application migration is a mammoth task if the migration approach is not well thought out at the very start, i.e. at pre-migration, and supported by robust planning, especially in the pre-migration process area. This paper proposes a mathematical pre-migration approach that helps an enterprise analyse an existing/legacy application based on the application’s available information and the parameters the enterprise would like to consider. The proposed pre-migration assessment helps in understanding the legacy application’s current state and in unearthing information about the candidate application. It supports a well-informed decision on whether or not to migrate the legacy application. Application migration is often described as a journey that, once kick-started, must reach its destination or risk ending in disaster; hence pre-migration is one of the important areas of the migration journey.

Sanjeev Kumar Yadav, Akhil Khare, Choudhary Kavita
##### A Fuzzy Logic Based Cardiovascular Disease Risk Level Prediction System in Correlation to Diabetes and Smoking

Cardiovascular disease (CVD) is one of the major causes of death among people who have diabetes in addition to smoking habits. It creates problems for every organ of the human body. Smoking has become fashionable among the young from an early age, which results in premature death. The intention of this paper is to explain the impact of diabetes and smoking, along with high BP, high pulse rate, angina effect, and family history, on the CVD risk level. The concept used is based on the knowledge-based system. We propose a fuzzy-logic-based prediction system to evaluate the CVD risk among people who have diabetes and smoking habits. The aim is to help experts provide medication and counsel smokers well in advance. This will not merely save the individual but also be an immense relief to those concerned. The data set is taken from the UCI Machine Learning Repository. Most researchers have worked on the impact of diabetes or smoking on CVD separately, but the proposed system demonstrates how drastically the two together affect one’s health condition.

Kanak Saxena, Umesh Banodha
##### An Integrated Fault Classification Approach for Microgrid System

In this paper, an integrated fault classification algorithm based on a moving window approach is proposed for microgrid systems. In a microgrid system, the nonlinear operation of control devices connected to distributed generation (DG) makes it difficult to identify the exact fault class. To mitigate this issue, an integrated moving window averaging technique (IMWAT) is proposed. The method uses the current signal at the line end. In this technique, the decision of the fault detection unit (FDU) is analyzed first, and the fault class is then detected based on that decision. The FDU uses the conventional moving window averaging technique. Different logics are framed to identify symmetrical and unsymmetrical faults. The method is tested on a standard microgrid network, and the results obtained for different fault cases prove the efficacy of the proposed method.
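
The abstract does not give the IMWAT logic or its thresholds, so the following is only a minimal sketch of the conventional moving window averaging idea that a detection unit could build on. The function name, the window length, and the threshold are all hypothetical, and the classification logic for symmetrical and unsymmetrical faults is not attempted.

```python
from collections import deque
from typing import Iterable, Optional


def detect_fault_index(samples: Iterable[float], window: int,
                       threshold: float) -> Optional[int]:
    """Return the index of the first sample at which the moving-window
    average of |current| exceeds `threshold`, or None if it never does.

    `window` and `threshold` are illustrative parameters, not values
    taken from the paper.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    buf = deque(maxlen=window)   # most recent `window` magnitudes
    running_sum = 0.0
    for i, x in enumerate(samples):
        mag = abs(x)
        if len(buf) == window:
            running_sum -= buf[0]  # this value is evicted by the append below
        buf.append(mag)
        running_sum += mag
        # Only decide once a full window of samples is available.
        if len(buf) == window and running_sum / window > threshold:
            return i
    return None


if __name__ == "__main__":
    healthy = [1.0, -1.0, 1.0, -1.0] * 3
    faulted = healthy + [5.0, -5.0, 5.0, -5.0]
    print(detect_fault_index(healthy, window=4, threshold=2.5))  # None
    print(detect_fault_index(faulted, window=4, threshold=2.5))  # 13
```

Averaging over a window delays detection by a few samples but suppresses single-sample spikes, which is the usual trade-off when choosing the window length.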

Ruchita Nale, Ruchi Chandrakar, Monalisa Biswal

Garima Makkar
##### Normal Pressure Hydrocephalus Detection Using Active Contour Coupled Ensemble Based Classifier

The brain plays an imperative role in human life, as it manages the communication between sensory organs and muscles. Consequently, any disease related to the brain should be detected at an early stage. Abundant accumulation of cerebrospinal fluid in the ventricle results in a brain disorder termed normal pressure hydrocephalus (NPH). The current study aims to segment the ventricular part from CT brain scans and then perform classification to differentiate between a normal brain and an affected brain with NPH. In the proposed method, a few preprocessing steps are first carried out to enhance the quality of the input CT brain image, and the ventricle region is cropped out. An active contour model is then employed to segment the ventricle. Features are extracted from the segmented region, and an ensemble classifier is used to classify each CT brain scan into one of two classes, namely normal and NPH. More than hundreds of CT brain scans were analyzed during this study; the area of the ventricle was used as the extracted feature. Experimental results showed a significant improvement in performance for the ensemble classifier in comparison to a Support Vector Machine.

Pallavi Saha, Sankhadeep Chatterjee, Santanu Roy, Soumya Sen
##### Question–Answer System on Episodic Data Using Recurrent Neural Networks (RNN)

Data comprehension is one of the key applications of question-answer systems. This involves a closed-domain answering system, in which the system answers questions based on the given data. Previously, methods such as part-of-speech tagging and named entity recognition were used for such problems, but those methods have struggled to produce accurate results because they have no information retention mechanism. Deep learning, and specifically recurrent neural network based methods such as long short-term memory, has been shown to be successful in creating accurate answering systems. This paper focuses on episodic memory, where certain facts are aggregated in the form of a story, a question is asked about a certain object in the story, and a single fact present in the story is given as the answer. The paper compares the performance of these algorithms on a benchmark dataset and provides guidelines on parameter tuning to obtain maximum accuracy. High accuracy (80% and above) was achieved on three tasks out of four.

##### Convoluted Cosmos: Classifying Galaxy Images Using Deep Learning

In this paper, a deep learning-based approach has been developed to classify the images of galaxies into three major categories, namely, elliptical, spiral, and irregular. The classifier successfully classified the images with an accuracy of 97.3958%, which outperformed conventional classifiers like Support Vector Machine and Naive Bayes. The convolutional neural network architecture involves one input convolution layer having 16 filters, followed by 4 hidden layers, 1 penultimate dense layer, and an output Softmax layer. The model was trained on 4614 images for 200 epochs using NVIDIA-DGX-1 Tesla-V100 Supercomputer machine and was subsequently tested on new images to evaluate its robustness and accuracy.

Diganta Misra, Sachi Nandan Mohanty, Mohit Agarwal, Suneet K. Gupta

##### Energy-Based Improved MPR Selection in OLSR Routing Protocol

Wireless ad hoc networks consist of wireless nodes that communicate over a wireless medium without any centralized controller, fixed infrastructure, base station, or access point. Such networks must be established in a distributed and decentralized way. The performance of a mobile ad hoc network depends on the routing scheme chosen. Extensive research in recent years has produced many proactive and reactive protocols and attempted to make them energy efficient. This work attempts to make a table-driven routing protocol, the optimized link-state routing protocol (OLSR), more energy efficient, which also helps in prolonging the network lifetime. OLSR is a proactive routing protocol in Mobile Ad hoc Networks (MANETs) which is driven by hop-by-hop routing. The conventional OLSR is hybrid multipath routing, in which link-state information is forwarded only by Multi-Point Relays (MPRs) selected among the one-hop and two-hop neighbor sets of the host. In this work, a novel mechanism is introduced to select MPRs from a node's neighbor set in a more energy-efficient way by considering the willingness of each node. The proposed energy-aware MPR selection in MDOLSR is compared with conventional OLSR. Extensive simulations were performed using the NS-2 simulator, and the results show improved network parameters such as higher throughput, higher Packet Delivery Ratio (PDR), and lower end-to-end delay as simulation time progresses.
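
The abstract does not spell out how willingness enters the MPR choice, so the sketch below is a simplified greedy selection, not the paper's MDOLSR mechanism. It assumes each one-hop neighbor advertises an integer willingness (higher means more willing, 0 means never relay) that has somehow been derived from residual energy; that mapping, the function name, and the tie-breaking rule are hypothetical.

```python
from typing import Dict, Set


def select_mprs(two_hop_via: Dict[str, Set[str]],
                willingness: Dict[str, int]) -> Set[str]:
    """Greedily pick one-hop neighbours as MPRs until every two-hop node
    that can be covered is covered.

    two_hop_via maps each one-hop neighbour to the two-hop nodes it reaches.
    Neighbours with willingness 0 (or no entry) are never selected.
    """
    uncovered: Set[str] = set()
    for reachable in two_hop_via.values():
        uncovered |= reachable
    mprs: Set[str] = set()
    while uncovered:
        # Neighbours that are willing and still add new coverage.
        candidates = [
            n for n in two_hop_via
            if n not in mprs
            and willingness.get(n, 0) > 0
            and two_hop_via[n] & uncovered
        ]
        if not candidates:
            break  # remaining two-hop nodes are reachable only via unwilling nodes
        # Prefer higher willingness, then larger new coverage, then name
        # (the name is only a deterministic tie-breaker for this sketch).
        best = max(
            candidates,
            key=lambda n: (willingness[n], len(two_hop_via[n] & uncovered), n),
        )
        mprs.add(best)
        uncovered -= two_hop_via[best]
    return mprs


if __name__ == "__main__":
    links = {"A": {"x", "y"}, "B": {"y", "z"}, "C": {"x", "y", "z"}}
    # C covers everything but is low on energy, so A and B are chosen instead.
    print(sorted(select_mprs(links, {"A": 7, "B": 7, "C": 1})))  # ['A', 'B']
    print(sorted(select_mprs(links, {"A": 3, "B": 3, "C": 3})))  # ['C']
```

The example shows the trade-off an energy-aware scheme accepts: a slightly larger MPR set in exchange for sparing a node whose battery is nearly depleted.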

Rachna Jain, Indu Kashyap
##### A Novel Approach for Better QoS in Cognitive Radio Ad Hoc Networks Using Cat Optimization

Cognitive radio is a wireless communication methodology that allows users to communicate without having a fixed, preassigned radio spectrum. Cognitive Radio Networks (CRNs) face a routing problem that is one of their serious constraints. Ad hoc networks are non-centralized wireless networks that can be set up without any preexisting infrastructure; every node can act as a router. CRNs are gaining a great deal of attention, with the principal focus on the dynamic assignment of channels to wireless devices, whereas almost all networks today rely on fixed allocation in a licensed or unlicensed frequency band. This paper presents a literature review related to CRNs and discusses an optimization algorithm to enhance the overall performance of TE under CRN. A swarm intelligence technique is used; swarm approaches combine decentralized behavior to reach the best feasible solutions, with inspiration that usually comes from nature, most often from biological systems. One effective approach, Cat Swarm Optimization (CSO), is used to achieve a high rate of accuracy and much lower error rates, which improves the lifespan of the network. The results are obtained with the CSO algorithm, and parameters such as energy consumption, congestion, overhead consumption, and number of routing rules are used to analyze the overall performance of the algorithm.

Lolita Singh, Nitul Dutta
##### (T-ToCODE): A Framework for Trendy Topic Detection and Community Detection for Information Diffusion in Social Network

The increased use of social networks generates a huge amount of data, and extracting useful information from it is a pressing need. Studying and analyzing this data provides insight into the behavior of customers or users and can therefore help to increase product sales or to understand customers better. To this end, we propose a novel framework that extracts trendy topics, identifies communities related to these trendy topics, and identifies influential or seed nodes in those communities. The framework first finds the list of topics that are popular, second finds trend-driven communities, and third finds, within these trend-driven communities, the nodes that act as seed nodes and thus dominate the spread of information in the community. Real-world data is analyzed, and the results are compared with baseline approaches.

Reena Pagare, Akhil Khare, Shankar Chaudhary
##### ns-3 Implementation of Network Mobility Basic Support (NEMO-BS) Protocol for Intelligent Transportation Systems

In an Intelligent Transportation System for a Smart City, seamless connectivity during mobility is essential for each user for efficient data communication. For a group of mobile users in a vehicle (bus/train/flight), high mobility makes it challenging to implement a protocol that manages handoffs smoothly. The Network Mobility Basic Support (NEMO-BS) protocol was proposed to meet this requirement. It is an extended version of Mobile IPv6 (MIPv6). However, the MIPv6 implementation in ns-3, which is the most widely used open-source simulator, has not yet been extended to support network mobility. In this work, we have implemented the functionality of the NEMO-BS protocol in ns-3.25 by modifying the existing MIPv6 module to enrich the ns-3 library.

Prasanta Mandal, Manoj Kumar Rana, Punyasha Chatterjee, Arpita Debnath
##### Modified DFA Minimization with Artificial Bee Colony Optimization in Vehicular Routing Problem with Time Windows

Vehicular routing is an NP-hard combinatorial optimization problem. The vehicular routing problem with time windows adds specified start and end times to vehicular routing. A number “n” of vehicles start from the depot to cater to the needs of “m” customers. In this paper, the Gehring and Homberger benchmark problems are considered, with the number of customers taken to be 1000. The Artificial Bee Colony Optimization algorithm is executed on these 60 datasets, and the number of vehicles along with the total distance covered is recorded. A modified version of Deterministic Finite Automata minimization is then applied along with Artificial Bee Colony Optimization, and the results show 25.55% more efficient routes and 15.42% more efficient distance compared to the simple Artificial Bee Colony Optimization algorithm.

G. Niranjani, K. Umamaheswari
##### Coverage-Aware Recharge Scheduling Scheme for Wireless Charging Vehicles in the Wireless Rechargeable Sensor Networks

Recent advances in wireless power transfer technology have motivated the development of wireless rechargeable sensor networks (WRSNs). In WRSNs, forming an optimal recharging schedule for each wireless charger vehicle is a well-known NP-complete problem. To determine the optimal recharging schedule for each wireless charger vehicle, this paper presents a coverage-aware recharge scheduling scheme (CRS) in which an ACO-based metaheuristic algorithm is employed. To provide fast recharging in a WRSN, the proposed scheme employs multiple wireless charger vehicles to perform the charging task. Performance analysis of the proposed scheme confirms its superiority in terms of charging latency.

Govind P. Gupta, Vrajesh Kumar Chawra
##### A Transition Model from Web of Things to Speech of Intelligent Things in a Smart Education System

Several terms have been used to describe the Internet of Things. Web of Things (WoT) is a term that can be used interchangeably with it and refers to the capability of devices to connect to the World Wide Web and share information and data with one another. WoT has been mentioned in the literature as a way to improve interconnection between devices at all times. Two modes of communication are generally mentioned for WoT in previous studies: person-to-thing (or thing-to-person) and thing-to-thing. This paper presents an architecture for the transition from WoT to a speech-enabled WoT known as Speech of Intelligent Things (SoIT). The system employs a combination of technologies, such as system design, server-side scripting, speech-based system tools, and data management, in developing the SoIT prototype system as a third mode of communication. The paper illustrates a scenario in which remote monitoring and control of WoT devices within a university campus might be difficult to manage using only the modes discussed in the literature. An evolution from WoT to SoIT was realized using speech technology to provide a prototype system. Technically, a telephone is used to dial an object telephone number (OTN), connect to WoT objects, and establish a control mechanism. The main limitation of the research is the cost of dialing an OTN. The contribution of this paper is to encourage the use of speech technology to make communication between WoT devices within the school campus more convenient.

Ambrose A. Azeta, Victor I. Azeta, Sanjay Misra, M. Ananya
##### Intrusion Detection and Prevention Systems: An Updated Review

The evolution of Information Technology (IT), cutting across several divides in our daily endeavors, allows us to interact with all forms of data at different OSI model layers, from application to physical. These data are susceptible to intrusion aimed at compromising their integrity; thus, the need to protect them and maintain their integrity, confidentiality, and availability cannot be overemphasized. An Intrusion Detection and Prevention System (IDPS) is a device or software application designed to monitor a network or system. It detects vulnerabilities, reports malicious activities, and enacts preventive measures, using several response techniques, to keep up with the advancement of computer-related crimes. This paper presents an updated review of IDPSs, given that the most recent review found on the subject was done in 2016. It also discusses the use of IDPSs to identify vulnerabilities in the various channels through which data is accessed on a network or system, and the prevention mechanisms applied to mitigate intrusion.

##### Simulation-Based Performance Analysis of Location-Based Opportunistic Routing Protocols in Underwater Sensor Networks Having Communication Voids

Recently, Underwater Wireless Sensor Networks (UWSNs) have emerged as a prominent research area in the networking domain due to their wide range of applications in submarine tracking, disaster detection, oceanographic data collection, pollution detection, and underwater surveillance. Owing to their unique characteristics, such as continuous movement of sensor nodes, limited bandwidth, and high energy consumption, efficient routing and data transfer in UWSNs have remained a challenging task for researchers. Almost all the protocols proposed for terrestrial sensor networks are inefficient and do not perform well in an underwater environment. Location-Based Opportunistic Routing Protocols have recently been observed to perform well in UWSN environments, but they also suffer from performance degradation in UWSNs with communication voids. The objective of this research paper is to discuss the working of major Location-Based Opportunistic Routing Protocols in UWSNs with communication voids and to highlight their issues and drawbacks. We analyzed the Quality of Service parameters, namely packet delivery ratio, end-to-end delay, throughput, and energy efficiency, of two major Location-Based Opportunistic Routing Protocols, i.e., Vector-Based Forwarding (VBF) and Hop-by-Hop VBF (HH-VBF), in UWSNs with communication voids using the NS-2 simulator with the Aqua-Sim extension. Simulation results show that both VBF and HH-VBF suffer from performance degradation in UWSNs with communication voids. In addition, the paper highlights open issues for UWSNs to assist researchers in designing efficient routing protocols for UWSNs having multiple communication voids.
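
To make the forwarding rule concrete: in VBF, a node is a forwarding candidate only if it lies within a virtual "routing pipe" around the vector from source to sink. The sketch below checks only that geometric condition; the function names and the pipe radius are hypothetical, and the timers and desirableness factor that the real protocol uses to suppress redundant forwarders are omitted.

```python
import math
from typing import Tuple

Point = Tuple[float, float, float]


def distance_to_routing_vector(node: Point, source: Point, sink: Point) -> float:
    """Perpendicular distance from `node` to the line through source and sink."""
    v = tuple(b - a for a, b in zip(source, sink))   # routing vector
    w = tuple(b - a for a, b in zip(source, node))   # source -> node
    # |v x w| / |v| is the distance from the node to the line along v.
    cross = (
        v[1] * w[2] - v[2] * w[1],
        v[2] * w[0] - v[0] * w[2],
        v[0] * w[1] - v[1] * w[0],
    )
    v_len = math.sqrt(sum(c * c for c in v))
    if v_len == 0.0:
        raise ValueError("source and sink must be distinct points")
    return math.sqrt(sum(c * c for c in cross)) / v_len


def in_routing_pipe(node: Point, source: Point, sink: Point, radius: float) -> bool:
    """True if the node lies within `radius` of the routing vector."""
    return distance_to_routing_vector(node, source, sink) <= radius


if __name__ == "__main__":
    source, sink = (0.0, 0.0, 0.0), (0.0, 0.0, 100.0)
    node = (3.0, 4.0, 50.0)
    print(distance_to_routing_vector(node, source, sink))  # 5.0
    print(in_routing_pipe(node, source, sink, radius=10.0))  # True
    print(in_routing_pipe(node, source, sink, radius=4.0))   # False
```

A communication void corresponds to the case where no neighbor passes this check, so the packet cannot make progress inside the pipe. HH-VBF eases this by recomputing the vector at each hop from the current forwarder to the sink, which in this sketch amounts to passing the forwarder's position as `source`.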

Sonali John, Varun G. Menon, Anand Nayyar
##### A Hybrid Optimization Algorithm for Pathfinding in Grid Environment

Grid computing has been highly effective in the areas of life sciences, financial analysis, research collaboration, and engineering. This paper is a study of existing Swarm Intelligence (SI) algorithms, namely Ant Colony Optimization (ACO), Particle Swarm Optimization (PSO), Artificial Bee Colony (ABC–PSO), and Parallel Particle Swarm Optimization (PPSO), for choosing the optimal path in a grid computing environment. These algorithms were used to solve the complex optimization problem of finding an effective path from a source node to a destination node. Nature-inspired computing techniques based on the study of the collective behavior of ants, particle swarms, and bees are used to find the optimal path and to improve the optimization methods and scalability in a set of representative problems. The hybridization of a grid computing environment with nature-inspired computing algorithms such as ACO, PSO, ABC–PSO, and PPSO has resulted in a class of solutions that differ in structure and design from peer-to-peer network algorithms, and the evaluated results showed their effectiveness on the pathfinding problem. ACO is implemented on a dynamic grid computing environment to demonstrate scalability and a solution for pathfinding. A class of four algorithms is used to find an optimal path, improve the optimization methods, and shorten the computational time in a grid computing environment.

B. Booba, A. Prema, R. Renugadevi
##### Dynamic Hashtag Interactions and Recommendations: An Implementation Using Apache Spark Streaming and GraphX

The hashtag, which started with Twitter, is a keyword prefixed with “#” and is now used in almost all communication on social media. It has been identified as very powerful and effective in organizing communications according to topic and trend. Hashtags can further help in various analyses, as they link users with their topics of interest, and they aid in building communities of similar interests. With hashtags, we can follow the current trends and interests on Twitter, which can help us analyze multiple factors, e.g., the sensitivity of an ongoing trend, its spread, the people affected, its effect on business, and so on. Traditional approaches help us analyze batch data and find interests and trends in it. Advances in technology now allow us to analyze a large amount of online data within seconds. In this paper, we explore dynamic hashtag interactions to find correlations among them and propose a methodology that can successfully find relevant hashtags based on the interest in focus. We propose our methodology for analyzing and exploring tweets in real time, with the aim of converting the information we get from Twitter into knowledge.
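
As an illustration of what "hashtag interactions" can mean in practice, the sketch below counts how often pairs of hashtags co-occur in the same tweet and ranks the hashtags most related to a given one. It is plain single-machine Python rather than the Spark Streaming and GraphX pipeline named in the title, and the function names, the regular expression, and the scoring by raw co-occurrence count are assumptions made for the example.

```python
import re
from collections import Counter
from itertools import combinations
from typing import Iterable, List, Tuple

HASHTAG = re.compile(r"#(\w+)")


def cooccurrence_counts(tweets: Iterable[str]) -> Counter:
    """Count, for every unordered pair of hashtags, the tweets containing both."""
    counts: Counter = Counter()
    for text in tweets:
        # Lower-case and de-duplicate so "#AI #ai" counts as one tag.
        tags = sorted({t.lower() for t in HASHTAG.findall(text)})
        for a, b in combinations(tags, 2):
            counts[(a, b)] += 1
    return counts


def related_hashtags(counts: Counter, tag: str, top_n: int = 3) -> List[Tuple[str, int]]:
    """Return up to `top_n` hashtags that co-occur most often with `tag`."""
    tag = tag.lower()
    scores: Counter = Counter()
    for (a, b), c in counts.items():
        if a == tag:
            scores[b] += c
        elif b == tag:
            scores[a] += c
    # Sort by descending count, then alphabetically for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


if __name__ == "__main__":
    tweets = ["#AI and #BigData", "#ai meets #Spark", "#AI #bigdata #spark"]
    counts = cooccurrence_counts(tweets)
    print(related_hashtags(counts, "spark"))  # [('ai', 2), ('bigdata', 1)]
```

The pair counts form the weighted edges of a hashtag graph, which is the kind of structure a graph-processing library would then operate on; in a streaming setting the counts would be updated per batch instead of computed once.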

Sonam Sharma
