
2019 | Book

Data Management, Analytics and Innovation

Proceedings of ICDMAI 2018, Volume 1

Edited by: Prof. Valentina Emilia Balas, Prof. Neha Sharma, Dr. Amlan Chakrabarti

Publisher: Springer Singapore

Book series: Advances in Intelligent Systems and Computing


About this book

The book presents the latest high-quality technical contributions and research findings in the areas of data management and smart computing, big data management, artificial intelligence and data analytics, along with advances in network technologies. It discusses state-of-the-art topics as well as the challenges and solutions for future development. It includes original and previously unpublished international research work highlighting research domains from different perspectives. The book is mainly intended for researchers and practitioners in academia and industry.

Table of Contents

Frontmatter

Data Management and Smart Informatics

Frontmatter
Improved Rotated Local Binary Pattern

Content-based image retrieval (CBIR) is the application of computer vision techniques to the image retrieval problem, that is, the problem of searching for images captured by high-end cameras in large image datasets. It aims to find images of interest in an extensive image database using the visual content of the images. "Content-based" means that the search analyzes the actual content of the image rather than metadata such as tags, descriptions, and keywords linked to the image. The term "content" in this context may refer to shapes, colors, textures, or any other information that can be derived from the image itself. Graphics processing units (GPUs) are helpful in most image processing applications because of multithreaded execution of algorithms, programmability, and low cost. The sheer volume of images has made it challenging for computers to store and manage data adequately and efficiently. Features can be extracted from images in parallel on a GPU using different procedures, and GPU-based feature extraction algorithms can perform feature extraction in a fast and efficient way. NVIDIA CUDA is a fundamentally new computing architecture that enables the GPU to solve difficult, time-consuming, complex problems. This paper aims to find the similarity between images, that is, between a query image and the images present in a dataset. It examines the performance of content-based image retrieval algorithms, including the local binary pattern and the rotated local binary pattern. An improvement has been made to the rotated local binary pattern in which a mapping function is incorporated to give better results; with the help of this mapping we are able to sort the higher- and lower-priority pixel values of the query image and of the images present in our database.
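The chapter's mapping-function improvement and CUDA pipeline are its own; as a minimal CPU-side illustration of the underlying descriptor only, here is a plain 3x3 local binary pattern sketch in Python (the neighborhood layout and NumPy usage are assumptions, not the authors' implementation):

```python
import numpy as np

def lbp_image(gray):
    """Basic 3x3 local binary pattern: each neighbor >= center
    contributes one bit to an 8-bit code per interior pixel."""
    g = gray.astype(np.int32)
    c = g[1:-1, 1:-1]  # center pixels
    # 8 neighbors in clockwise order starting at the top-left
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros_like(c)
    for bit, (dy, dx) in enumerate(offsets):
        nb = g[1 + dy:g.shape[0] - 1 + dy, 1 + dx:g.shape[1] - 1 + dx]
        codes |= (nb >= c).astype(np.int32) << bit
    return codes

def lbp_histogram(gray, bins=256):
    """Normalized histogram of LBP codes, so images of different
    sizes can be compared (e.g., by chi-squared distance)."""
    h, _ = np.histogram(lbp_image(gray), bins=bins, range=(0, bins))
    return h / max(h.sum(), 1)
```

Rotation-handling variants such as the rotated LBP reorder the bits relative to a dominant neighbor direction before histogramming; the chapter's mapping function refines that step.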

Divya Khare, D. R. Gangodkar, Saurabh Dwivedi
An Introduction to Quantum Search Algorithm and Its Implementation

Quantum computing is a new era of computing, used in the modern world to solve complex problems that would otherwise require a supercomputer, and a quantum computer can solve such problems more efficiently. Quantum computing uses the properties of superposition and entanglement to tackle complex problems such as NP-hard ones, and by exploiting them a quantum computer may find solutions faster than classical computers. Grover's search was introduced for searching an unstructured database in order to locate a particular item. It provides a speedup of $$\sqrt N$$ over classical search. This article describes Grover's search with an example, its applications, and its limitations. It also explores the functionality of the quantum circuit and the oracle circuit particular to Grover's algorithm. The article concludes with Grover's search advantage over classical search.
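As a worked illustration of the $$\sqrt N$$ speedup (a NumPy state-vector simulation, not the authors' circuit; the database size and marked index are arbitrary assumptions):

```python
import numpy as np

N = 64        # unstructured database size (assumption)
marked = 42   # index the oracle recognizes (assumption)

amps = np.full(N, 1 / np.sqrt(N))              # uniform superposition
iters = int(np.pi / 4 * np.sqrt(N))            # ~ (pi/4) * sqrt(N) queries

for _ in range(iters):
    amps[marked] *= -1                         # oracle: flip marked amplitude
    amps = 2 * amps.mean() - amps              # diffusion: invert about mean

print(iters, amps[marked] ** 2)                # 6 iterations, success prob ~0.997
```

A classical exhaustive search would need on the order of N/2 = 32 queries here, versus 6 Grover iterations.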

Jose P. Dumas, Kapil Soni, Akhtar Rasool
Smart Waste Management for Segregating Different Types of Wastes

The adoption of IoT is increasing day by day, turning everything related to daily life into a smart thing. Making things smart also reduces the need for manpower; in addition, the work is done properly because human intervention decreases. Garbage management is a major problem, as uncollected garbage causes many diseases. Smart bins are used to collect all the garbage from a particular area until the bins are filled up to a threshold level. When a bin is 80% full, its location is displayed to the garbage collector in a mobile application using the GPS module attached to the bin. Building on this, we propose a prototype in which, when the bin is filled up to the threshold level, i.e., when 80% of the bin is filled, the GPS module attached to the bin alerts the garbage collection truck of that area that the bin in its vicinity needs to be emptied. The shortest path to empty all the filled bins, or the bins at the threshold level, is also displayed in the mobile application, which saves fuel and time and allows more work to be done. When the bin is full or senses hazardous gas using a gas sensor, it closes its lid until the garbage truck arrives to empty it. Once the lid is closed, only the garbage collection truck assigned to that area can open it. The garbage is then dumped into the garbage collection truck, which in turn contains a unit comprising individual machines that can segregate different types of waste using a robotic arm by detecting the properties of the waste.

Rashi Kansara, Pritee Bhojani, Jigar Chauhan
Performance Evaluation and Analysis of Feature Selection Algorithms

Exorbitant amounts of high-dimensional data are generated by the wide application of technologies nowadays. The intent of using this data for decision-making is greatly affected by the curse of dimensionality, as selecting all features leads to overfitting while ignoring the relevant ones leads to information loss. Feature selection algorithms help to overcome this problem by identifying a subset of the original features, retaining relevant features and removing redundant ones. This paper aims to evaluate and analyze some of the most popular feature selection algorithms using different benchmark datasets. K-means clustering, Relief, Relief-F, and Random Forest (RF) algorithms are evaluated and analyzed as combinations of different rankers and classifiers. It is observed empirically that the accuracy of the ranker and classifier varies from dataset to dataset. The novel concept of applying Multivariate Correlation Analysis (MCA) to feature selection is introduced, and results show improved performance over legacy feature selection algorithms.
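As a hedged sketch of the ranker-plus-classifier pattern the chapter evaluates (scikit-learn stand-ins and a bundled dataset are assumptions, not the paper's exact setup):

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

# Ranker: Random Forest impurity importances (one of the rankers above).
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
top = np.argsort(rf.feature_importances_)[::-1][:10]  # keep 10 best features

# Classifier: compare accuracy with all features vs. the ranked subset.
clf = LogisticRegression(max_iter=5000)
print(cross_val_score(clf, X, y, cv=5).mean())           # all features
print(cross_val_score(clf, X[:, top], y, cv=5).mean())   # ranked subset
```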

Tanuja Pattanshetti, Vahida Attar
Vertex Importance Extension of Betweenness Centrality Algorithm

A variety of real-life structures can be simplified as a graph. Such simplification emphasizes the structure represented by vertices connected via edges. A common method for analyzing the importance of vertices in a network is betweenness centrality. The centrality is computed using information about the shortest paths that exist in a graph. This approach places the importance on the edges that connect the vertices. However, not all vertices are equal; some may be more important than others or have a more significant influence on the behavior of the network. Therefore, we introduce a modification of the betweenness centrality algorithm that takes vertex importance into account. This approach allows further refinement of the betweenness centrality score to better fulfill the needs of the network. We demonstrate this idea on a real traffic network. We test the performance of the algorithm on traffic network data from the city of Bratislava, Slovakia to show that the inclusion of the modification does not hinder the original algorithm much. We also provide a visualization of the traffic network of the city of Ostrava, the Czech Republic to show the effect of the vertex importance adjustment. The algorithm was parallelized with MPI ( http://www.mpi-forum.org/ ) and was tested on the supercomputer Salomon ( https://docs.it4i.cz/ ) at IT4Innovations National Supercomputing Center, the Czech Republic.
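The authors' MPI-parallelized modification is their own; as a naive, unoptimized illustration of weighting shortest-path contributions by endpoint importance (the product weighting and NetworkX usage are assumptions):

```python
from collections import defaultdict
from itertools import permutations

import networkx as nx

def importance_betweenness(G, importance):
    """Betweenness where each s->t pair contributes proportionally to
    importance[s] * importance[t]; O(n^2) pairs, illustration only."""
    score = defaultdict(float)
    for s, t in permutations(G.nodes, 2):
        try:
            paths = list(nx.all_shortest_paths(G, s, t))
        except nx.NetworkXNoPath:
            continue
        w = importance[s] * importance[t] / len(paths)
        for path in paths:
            for v in path[1:-1]:          # interior vertices only
                score[v] += w
    return dict(score)

G = nx.path_graph(5)                      # toy graph: 0-1-2-3-4
imp = {v: 1.0 for v in G}
imp[4] = 3.0                              # vertex 4 is more important
print(importance_betweenness(G, imp))     # scores skew toward vertex 4's side
```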

Jiří Hanzelka, Michal Běloch, Jan Martinovič, Kateřina Slaninová
Privacy Protection Data Analytics in Smart Home Environments with Secure Computation

In today's smart home environments there is an absence of mechanisms that empower occupants to see and manage the data created by the smart gadgets in their living places. With the expanding adoption of physical devices such as wireless systems, intelligent gadgets, and sensors, homes have been turning into smart home environments. Intelligent gadgets can acquire an enormous amount of sensitive personal data. Nevertheless, the protection of smart home data during analysis has received little attention. The gathering and processing of this data raises privacy concerns about how the people living in a smart home environment can guarantee that the data is shared only for their own good, as opposed to being shared, collected, used, or maliciously disclosed in ways that damage their independence and security. Hence, the handling of such data ought to be exclusive to the specific clients directly concerned. This study proposes a framework designed to maintain safety and preserve privacy when analyzing data from smart homes, without compromising data utility. It deals with the implementation of a privacy-preserving method in which cryptography as well as randomization is utilized to protect sensitive personal data. Randomization is a strategy that adjusts the original data by adding noise drawn randomly, independently of other records, while cryptography is utilized to secure sensitive attributes. Prior to randomization, the data is partitioned vertically and horizontally. Finally, access to the shared data is protected against third parties, and useful data is imparted to authorized users for security consulting and investigative purposes.
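The framework itself is described only at a high level; as a minimal sketch of the two named ingredients, additive-noise randomization and symmetric encryption of a sensitive attribute, using the `cryptography` package's Fernet recipe as an assumed stand-in:

```python
import numpy as np
from cryptography.fernet import Fernet

# Randomization: perturb a numeric column with independent noise so that
# aggregate statistics survive but individual records are masked.
readings = np.array([21.5, 22.1, 19.8, 23.4])  # e.g., home sensor values
noisy = readings + np.random.normal(0, 1.0, readings.shape)

# Cryptography: encrypt a sensitive attribute before it leaves the home.
key = Fernet.generate_key()
f = Fernet(key)
token = f.encrypt(b"occupant-id:1234")
assert f.decrypt(token) == b"occupant-id:1234"
print(noisy, token[:16], b"...")
```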

N. Naveen, K. Thippeswamy

Big Data Management

Frontmatter
A Review of Software Defect Prediction Models

This paper analyzes the performance of various software defect prediction techniques. Different datasets have been analyzed for finding defects in various studies. The main aim of this paper is to study the many techniques used for predicting defects in software.

Harshita Tanwar, Misha Kakkar
A Study on Privacy-Preserving Approaches in Online Social Network for Data Publishing

Online Social Networks (OSNs) have become a major platform for social interaction, sharing personal experiences, and providing other services. OSN providers offer significant services to their users free of cost. Various privacy control mechanisms have been provided by OSNs so that users can decide who may view their personal information. However, a user's sensitive information can be leaked even when privacy rules are properly set by the service providers. Users' data are aggregated for different analysis purposes, and many threats to user data arise in OSNs. This paper discusses the various types of threats that arise to user data and the techniques that counter attacks made on user data.

S. Sathiya Devi, R. Indhumathi
Parallel Clustering for Data Mining in CRM

In modern business conditions, characterized by an intensifying process of globalization, uncertainty, risk, and competition, companies struggle every day to maintain market share and achieve better business results. To achieve this, a company must always be a step ahead of the competition. This means it must anticipate the needs of its clients and approach each client individually. This work addresses that goal. Because the amount of data involved is large, manual data analysis is simply impossible. Analysis is left to specially developed programs, a kind of technology whose goal is precisely the solution of the problems faced in Business Intelligence. Business Intelligence (BI) refers to a broad set of applications and technologies for data collection, data access, and expert analysis of data, aimed at providing adequate support to the decision-making process. BI represents a family of products that includes data mining algorithms and data mining products for creating reports. Improving efficiency in this process is discussed in this work. The M-Clustering algorithm conceived in this work provides a solution for data mining using clusters that is twofold: setting boundary limits during filtering, and historical data processing. A set of training data is defined by filtering various attributes and fields from the given classification set. The data processing activity is then performed using these training datasets to obtain the expected result, which is evaluated on the actual dataset or used for further provisional trained dataset preparation. This work covers a high-level view of the proposed system along with the processing steps used in it. It also covers an experimental evaluation carried out with a customized algorithm implementation in the WEKA tool, comparing the processing efficiency on experimental data against a k-means evaluation.

E. Manigandan, V. Shanthi, Magesh Kasthuri
Digital Data Preservation—A Viable Solution

Almost all the artifacts we have in digital-only format are susceptible to loss through deterioration of the media on which they are stored. Even if one argues that migrating digital data to newer media at regular intervals solves this issue, there is an even more important problem: digital data becoming inaccessible or unreadable. This happens when the software interpreting the data becomes obsolete; in that case the data is lost, as a bit stream is meaningless unless we can interpret it. Digital data preservation is a long-standing and still open research area. Various solutions have been proposed and implemented to date, and all of them can be broadly classified into two categories, migration and emulation, based on the strategy used to ensure longevity. We studied the various approaches and strategies pursued to date for digital data preservation and propose a new framework, a combination of migration and emulation, for digital preservation with fewer dependencies on future technology.

Krishnan Arunkumar, Alagarsamy Devendran
Big Data Security Threats and Prevention Measures in Cloud and Hadoop

Big Data, the collection of huge data sets, is a widely used concept in the present world. Although stored and analyzed by cloud services, it poses the great challenge of security threats arising from the exposure of enormous amounts of data. In this paper we explain recent security risks, threats, and vulnerabilities with respect to cloud services, Big Data (with extra focus on the EnCoRe system), and Hadoop (with extra focus on HDFS), and we throw light on issues in big data analytics. The paper prominently makes use of the recently developed approach called sticky policies together with the existing security framework to improve security. It provides a literature review on security threats and privacy issues of big data and Hadoop concurrent processing. It also draws on Verizon and Twilio, which are familiar for their trustworthy implementations of Hadoop using Amazon Simple Storage Service (S3).

Manoranjan Behera, Akhtar Rasool
Optimized Capacity Scheduler for MapReduce Applications in Cloud Environments

Most current-day applications are data centric and involve a lot of data processing. Technologies like Hadoop enable data processing with automatic parallelism. Current-day applications, which are more data intensive and compute intensive, can take advantage of this automatic parallelism and of the methodology of moving computation to the data. In addition, cloud computing technology enables users to instantly establish clusters with the required number of nodes. Cloud computing has made it easy for users to execute large data applications without having to establish or maintain the infrastructure. As the cloud provides readily installed infrastructure, using Hadoop on the cloud has become common. The existing schedulers are very effective in static cluster environments but lack performance in virtual environments. The purpose of this work is to design an effective capacity scheduler for MapReduce applications in virtualized environments such as public clouds by making scheduling decisions more intelligent using the characteristics of jobs and virtual machines.

Adepu Sree Lakshmi, N. Subhash Chandra, M. BalRaju
Big Data Analytics Provides Actionable Insights into Ship Emissions

Atmospheric emissions such as NOx from ship engines have a drastic impact on the environment, and controlling them is crucial for maintaining sustainable growth for any logistics company. The Port of Rotterdam (The Netherlands) is using big data analytics to gain actionable insights into these emissions. Our case study deals with the emission calculations and reporting implemented in Hadoop. In the analytical setup we introduce the method for estimating emissions based on recorded ship position data and information about the ship's engines. We present a flexible approach that stores intermediate results, allowing different levels of aggregation. These results are visualized in a Geographical Information System (GIS). We present selected results followed by conclusions and recommendations.

Frank Cremer, Muktha Muralee

Artificial Intelligence and Data Analysis

Frontmatter
Optimizing Deep Convolutional Neural Network for Facial Expression Recognitions

Facial expression recognition (FER) systems have attracted much research interest in the area of machine learning. We designed a large, deep convolutional neural network to classify the 40,000 images in the dataset into one of seven categories (disgust, fear, happy, angry, sad, neutral, surprise). In this project, we designed a deep-learning convolutional neural network (CNN) for facial expression recognition and developed models in Theano and Caffe for the training process. The proposed architecture achieves 61% accuracy. This work presents the results of an accelerated implementation of the CNN on graphics processing units (GPUs). The deep CNN is optimized to reduce the system's training time.
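The chapter's exact Theano/Caffe architecture is its own; as a minimal PyTorch sketch of a seven-class expression classifier (the layer sizes and the 48x48 grayscale input, common in FER datasets, are assumptions):

```python
import torch
import torch.nn as nn

class FERNet(nn.Module):
    """Small CNN mapping 48x48 grayscale faces to 7 expression classes."""
    def __init__(self, n_classes=7):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 24x24
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 12x12
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2), # 6x6
        )
        self.classifier = nn.Sequential(
            nn.Flatten(), nn.Linear(128 * 6 * 6, 256), nn.ReLU(),
            nn.Linear(256, n_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))

device = "cuda" if torch.cuda.is_available() else "cpu"  # GPU acceleration
model = FERNet().to(device)
x = torch.randn(8, 1, 48, 48, device=device)             # dummy batch
print(model(x).shape)                                    # torch.Size([8, 7])
```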

Umesh Chavan, Dinesh Kulkarni
Hybrid Kmeans with Improved Bagging for Semantic Analysis of Tweets on Social Causes

Analysis of public information from social media can yield fascinating insights into the universe of general opinion about any product, service, or personality. Social network data is one of the most effective and accurate indicators of public sentiment, and the public mood on a particular social issue can be judged by several established methods. This paper analyzes the mood of society towards particular news items as expressed in Twitter posts. The key objective of this research is to increase the accuracy and effectiveness of classification through Natural Language Processing (NLP) techniques, focusing on semantics and word sense disambiguation. The classification process combines the effects of various independent classifiers on one particular classification problem. The data available in the form of tweets on Twitter readily frames insight into public attitude towards a particular tweet. The proposed work designs and implements a hybrid method that includes hybrid K-means/modified K-means (MKmeans) clustering and bagging for sentiment analysis. With this proposed idea one can easily understand the behavior of the public towards a post, which can further assist future policy-making. Finally, the results are compared with the existing model to validate the findings.
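The MKmeans variant is the authors' own; as a generic sketch of one way to pair clustering-derived features with a bagged classifier (scikit-learn names, the tiny corpus, and the cluster-id-as-feature scheme are all assumptions):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import BaggingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.tree import DecisionTreeClassifier

tweets = ["great initiative for clean water", "this policy is a disaster",
          "so proud of the volunteers", "terrible handling of the issue"]
labels = [1, 0, 1, 0]                      # 1 = positive, 0 = negative

X = TfidfVectorizer().fit_transform(tweets)

# Hybrid step: append each tweet's cluster id as an extra feature.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
X_hybrid = np.hstack([X.toarray(), km.labels_.reshape(-1, 1)])

# Bagging step: an ensemble of trees votes on the sentiment.
clf = BaggingClassifier(DecisionTreeClassifier(), n_estimators=25,
                        random_state=0).fit(X_hybrid, labels)
print(clf.predict(X_hybrid))
```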

Mani Madhukar, Seema Verma
Subspace Clustering—A Survey

High-dimensional data clustering has been gaining attention in recent years due to its widespread applications in many domains such as social networking and biology. As a result of advances in data gathering and data storage technologies, a single data object is often represented by many attributes. Although more data may provide new insights, it may also hinder the knowledge discovery process by cluttering the interesting relations with redundant information. The traditional definition of similarity becomes meaningless in high-dimensional data; hence, clustering methods based on similarity between objects fail to cope with increased dimensionality. A dataset with large dimensionality can be better described in its subspaces than as a whole. Subspace clustering algorithms identify clusters existing in multiple, overlapping subspaces. Subspace clustering methods are further classified into top-down and bottom-up algorithms depending on the strategy applied to identify subspaces. Initial clustering in top-down algorithms is based on the full set of dimensions; the algorithm then iterates to identify the subset of dimensions that better represents the subspaces by removing irrelevant dimensions. Bottom-up algorithms start with low-dimensional spaces and merge dense regions using Apriori-based hierarchical clustering methods. It has been observed that the performance and result quality of a subspace clustering algorithm depend highly on the parameter values supplied to it. This paper gives an overview of work done in the field of subspace clustering.

Bhagyashri A. Kelkar, Sunil F. Rodd
Revisiting Software Reliability

Reliability is an important criterion for deciding the quality of software. Reliability prediction is a statistical procedure that aims to predict future reliability values based on information known during the development process, and it is considered a basic function of software development. A review-based study has been carried out in this work to evaluate previously established methodologies for reliability prediction. In this paper, the authors give a critical review of successful research on reliability prediction. The paper also sets out the many challenges and keys of reliability estimation during the software development process. Further, it gives a critical discussion of previous work and identifies factors which are important for software reliability but are still ignored. This work helps developers predict the reliability of software with minimal risk.

Kavita Sahu, R. K. Srivastava
Application of Classification Techniques for Prediction of Water Quality of 17 Selected Indian Rivers

Objective: In this study, classification techniques are used to predict the water quality of 17 selected rivers in the year 2011 from their water quality in 2008, to interpret whether the water quality has improved or deteriorated. Methods/Analysis: For this prediction, we applied data mining classification techniques through the Waikato Environment for Knowledge Analysis (WEKA) API to the dataset of the 17 selected Indian rivers. The data used for prediction was created from the ambient water quality of aquatic resources in India in 2008 and 2011. The data was obtained from the data portal published under the National Data Sharing and Accessibility Policy (NDSAP), with the Ministry of Environment and Forests, Central Pollution Control Board (CPCB), as contributor. Findings: Of the four techniques used, prediction of the classes, i.e., excellent, good, average, and fair, is done best by Naive Bayes, followed by the J48, SMO, and REPTree techniques.

Harlieen Bindra, Rachna Jain, Gurvinder Singh, Bindu Garg
Trends in Document Analysis

Document analysis is one of the emerging areas of research in the field of Information Retrieval. Many attempts have been made to retrieve information from documents using various machine learning algorithms. The concept of a context vector is frequently used in information retrieval from documents: a context vector is a vector used for feature selection from documents, automatic classification of text documents, subject-verb agreement, and so on. This paper discusses the attempts made in the field of Information Retrieval (IR) from documents using context vectors, along with the pros and cons of each attempt. The paper proposes a system that derives the "context vector" of a document set using Latent Semantic Analysis, currently the most popular method in document analysis. The system is tested on the BBC news dataset and proves to be successful.

Vaibhav Khatavkar, Parag Kulkarni
Comparison of Support Vector Machines With and Without Latent Semantic Analysis for Document Classification

Document classification is a key technique in Information Retrieval. Various techniques have been developed for document classification, each aiming for higher accuracy and greater speed. Performance depends on parameters such as the algorithm and the size and type of dataset used. The Support Vector Machine (SVM) is a prominent technique for classifying large datasets. This paper attempts to study the effect of Latent Semantic Analysis (LSA) on SVM. LSA is used for dimensionality reduction, and the performance of SVM is studied on the reduced dataset generated by LSA.
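A hedged sketch of the comparison the chapter studies, using scikit-learn pipelines (TruncatedSVD over TF-IDF is the standard LSA construction; the corpus, categories, and 100-dimensional reduction are assumptions):

```python
from sklearn.datasets import fetch_20newsgroups
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

data = fetch_20newsgroups(subset="train",
                          categories=["sci.space", "rec.autos"])

svm_plain = make_pipeline(TfidfVectorizer(), LinearSVC())
svm_lsa = make_pipeline(TfidfVectorizer(),
                        TruncatedSVD(n_components=100),  # the LSA step
                        LinearSVC())

for name, pipe in [("SVM", svm_plain), ("SVM+LSA", svm_lsa)]:
    print(name, cross_val_score(pipe, data.data, data.target, cv=3).mean())
```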

Vaibhav Khatavkar, Parag Kulkarni
Facial Recognition, Expression Recognition, and Gender Identification

Face recognition has many important applications in areas such as public surveillance and security, identity verification in the digital world, and modeling techniques in multimedia data management. Facial expression recognition is also important for targeted marketing, medical analysis, and human–robot interaction. In this paper, we survey a few techniques for facial analysis. We compare the cloud platform AWS Rekognition, convolutional neural networks, transfer learning from pre-trained neural nets, and traditional feature extraction using facial landmarks for this analysis. Although not comprehensive, this survey covers a lot of ground in the state-of-the-art solutions for facial analysis. We show that to get high accuracy, good-quality data and processing power must be provided in large quantities. We present the results of our experiments which have been conducted over six different public as well as proprietary image data sets.
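Of the surveyed approaches, the AWS Rekognition call is the most directly API-shaped; a minimal boto3 sketch (the file name and region are assumptions; credentials are taken from the environment):

```python
import boto3

client = boto3.client("rekognition", region_name="us-east-1")

with open("face.jpg", "rb") as f:  # hypothetical input image
    resp = client.detect_faces(Image={"Bytes": f.read()},
                               Attributes=["ALL"])

for face in resp["FaceDetails"]:
    # Each detected face carries a gender estimate and ranked emotions.
    top_emotion = max(face["Emotions"], key=lambda e: e["Confidence"])
    print(face["Gender"]["Value"], top_emotion["Type"])
```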

Shraddha Mane, Gauri Shah
Survey on Hybrid Data Mining Algorithms for Intrusion Detection System

Security is one of the most major concern issue arises in computer and internet technology. To conquer this problem, Intrusion Detection System (IDS) is the challenging solution in network systems. Such system is used to detect the known or unknown attacks made by intruders. Data mining methodologies like, clustering, classification, etc., plays a very important role in design and development of such IDS. They makes such system more effective and efficient. This paper describes some recent hybrid data mining based approaches used in development of IDS. We also describe the hybrid classification approaches used in IDS. Such Hybrid classifiers are any mixture of basic classifiers such as, SVM, Bayesian classifier, Neural network classifier, etc.

Harshal N. Datir, Pradip M. Jawandhiya
An Efficient Recognition Method for Handwritten Arabic Numerals Using CNN with Data Augmentation and Dropout

Handwritten character recognition has been a center of research and a benchmark problem in the sector of pattern recognition and artificial intelligence, and it continues to be a challenging research topic. Owing to its enormous range of applications, much work has been done in this field focusing on different languages. Arabic, being a diversified language, offers huge scope for research with potential challenges. A convolutional neural network (CNN) model for recognizing handwritten numerals in the Arabic language is proposed in this paper, where the dataset is subjected to various augmentations in order to add the robustness needed for a deep learning approach. The proposed method is empowered by dropout regularization to do away with the problem of data overfitting. Moreover, a suitable change is introduced in the activation function to overcome the problem of vanishing gradients. With these modifications, the proposed system achieves an accuracy of 99.4%, which outperforms every previous work on this dataset.
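The chapter's exact architecture and activation change are its own; as a sketch of just the two named regularizers, augmentation and dropout, in PyTorch (the transform parameters and layer sizes are illustrative assumptions):

```python
import torch.nn as nn
from torchvision import transforms

# Data augmentation: random shifts/rotations mimic handwriting variation,
# adding the robustness the deep model needs (parameters are illustrative).
augment = transforms.Compose([
    transforms.RandomAffine(degrees=10, translate=(0.1, 0.1)),
    transforms.ToTensor(),
])

# Dropout between dense layers combats overfitting, as the abstract notes.
head = nn.Sequential(
    nn.Flatten(),
    nn.Linear(32 * 7 * 7, 128),
    nn.ReLU(),
    nn.Dropout(p=0.5),   # randomly zero half the activations during training
    nn.Linear(128, 10),  # ten Arabic numeral classes
)
```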

Akm Ashiquzzaman, Abdul Kawsar Tushar, Ashiqur Rahman, Farzana Mohsin
Secured Human Authentication Using Finger-Vein Patterns

In any organization, providing a secure authentication system is a challenge. Here, we propose a secure authentication process using finger-vein patterns. The finger vein is a reliable biometric trait because of its distinctiveness and permanence. The proposed algorithm initially captures the finger-vein image, which is preprocessed using Gaussian blur and morphological operations. Then features such as the number of corner points and the locations of these corner points are extracted. The features fetched for an individual from the database are compared against the extracted features; if the comparison satisfies a predefined threshold value, the authentication is successful. Simulation results of the proposed algorithm show a FAR of 2.78%, an FRR of 0.09%, and an overall performance of 99.96%.
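A hedged OpenCV sketch of the described pipeline, Gaussian blur, morphological opening, then corner extraction and threshold matching (kernel sizes, the Shi-Tomasi detector, and the matching rule are assumptions standing in for the paper's exact operators):

```python
import cv2
import numpy as np

img = cv2.imread("finger_vein.png", cv2.IMREAD_GRAYSCALE)  # hypothetical scan

# Preprocessing: smooth sensor noise, then morphological opening.
blur = cv2.GaussianBlur(img, (5, 5), 0)
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
clean = cv2.morphologyEx(blur, cv2.MORPH_OPEN, kernel)

# Feature extraction: corner count + locations form the template.
corners = cv2.goodFeaturesToTrack(clean, maxCorners=50,
                                  qualityLevel=0.01, minDistance=10)
template = corners.reshape(-1, 2)

def matches(stored, probe, tol=5.0, threshold=0.8):
    """Accept if enough probe corners lie within tol pixels of stored ones."""
    hits = sum(np.min(np.linalg.norm(stored - p, axis=1)) < tol for p in probe)
    return hits / len(probe) >= threshold
```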

M. V. Madhusudhan, R. Basavaraju, Chetana Hegde
Fuzzy-Based Machine Learning Algorithm for Intelligent Systems

It has become essential to develop machine learning techniques owing to the automation of various tasks. At present, several tasks need manual intervention for better system reliability. In this work, a fuzzy-based approach is proposed in which systems are trained on initial data sets. In several data sets the data is only partially available or unavailable, and when such data sets are used in real-time systems, the non-availability of data may lead to catastrophe. In this approach, a fuzzy rule set is formulated, and the rule strength is used to determine effectiveness; rules with similar strengths are clustered together. Learning is carried out by determining a threshold for the formulated rule set. Based on the computed threshold, a modified rule set is formulated containing rules with strengths greater than the threshold. A semi-supervised learning approach that uses an activation function is employed. The fuzzy learning approach proposed in this work reduces the error by 20% compared to conventional approaches.
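The abstract describes rule strengths filtered by a computed threshold; a minimal generic sketch of that filtering step (the rule representation, min-based strength, and mean-strength cutoff are all assumptions):

```python
# Each rule: (antecedent memberships, consequent). Rule strength is taken
# as the minimum antecedent membership (a common fuzzy AND), an assumption.
rules = [
    ({"temp_high": 0.9, "humidity_low": 0.7}, "fan_fast"),
    ({"temp_high": 0.4, "humidity_low": 0.3}, "fan_slow"),
    ({"temp_high": 0.8, "humidity_low": 0.6}, "fan_fast"),
]

strengths = [min(m.values()) for m, _ in rules]
threshold = sum(strengths) / len(strengths)   # computed cutoff (illustrative)

# Modified rule set: keep only rules stronger than the threshold.
strong_rules = [r for r, s in zip(rules, strengths) if s > threshold]
print(threshold, [consequent for _, consequent in strong_rules])
```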

K Pradheep Kumar
Exponential Spline Approximation for the Solution of Third-Order Boundary Value Problems

General third-order boundary value problems (BVPs) are considered here with the goal of finding approximate solutions. An exponential amalgamation of cubic spline functions is used to form a novel numerical approach, and the finite difference method supports the developed system in solving the problems smoothly. Our method is convergent and second-order accurate. Numerical examples show that the method converges to the exact solutions with sufficient accuracy.
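For concreteness, one standard statement of such a problem (the exact boundary conditions treated in the chapter may differ) is

$$ y'''(x) = f\big(x, y(x)\big), \qquad a \le x \le b, $$

$$ y(a) = \alpha_1, \quad y'(a) = \alpha_2, \quad y(b) = \beta. $$

Second-order accuracy then means the approximation error behaves as $$O(h^2)$$ in the mesh width $$h$$ of the spline grid.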

Anju Chaurasia, Prakash Chandra Srivastava, Yogesh Gupta
Customer Lifetime Value: An Ensemble Model Approach

Customer lifetime value (CLV) allows banks and financial institutions to examine the worth of customers to the business, which provides important inputs for informed marketing and retention decisions and better Customer Relationship Management. Traditional CLV approaches are primarily isolated at account-level worthiness. Some customer-level CLV approaches do take a 360° view of the customer relationship, but they are more heuristic in nature or predict CLV from historical CLV data using a single-model approach. In this paper, we explore the existing solutions available to calculate CLV and explain the rationale for not using them, along with their respective limitations. The focus of the study is the retail banking sector; the proposal is to use the whole gamut of existing marketing and risk predictive models to calculate the predicted CLV, without taking the time value of money into consideration. The paper also discusses comparisons between the present and future CLV of the customer and how to check the overall health of the bank's business using the calculated CLV.

Harminder Singh Channa

Advances in Network Technologies

Frontmatter
A Cryptographic Algorithm Using Location-Based Service and Biometrics

The modern advancement of different technologies has led to increasing computational power in every individual component of digital computers, which threatens to crack many secure classical algorithms, as they are based on mathematical assumptions. Thus, like authenticated users, hackers and intruders are also able to break security systems. Researchers and scientists are therefore moving in new directions, including merging different types of algorithms. Different GPS-enabled devices such as smartphones and PDAs are easily accessible and also support many applications that extract biometric patterns such as iris and fingerprint. Biometric features can be used along with the location of the intended receiver to develop a cryptographic algorithm: smartphone apps can provide the location and extract the biometric features from which a new key can be formed. The focus of this paper is to show that merging the two approaches is advantageous, as it provides more security for data.
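A hedged sketch of the core idea, deriving a symmetric key from a biometric template combined with the intended receiver's location (PBKDF2 via `hashlib`; the coordinate quantization, salt, and iteration count are assumptions):

```python
import hashlib

def location_biometric_key(biometric_bytes, lat, lon, salt=b"demo-salt"):
    """Derive a 256-bit key from a biometric template plus a coarse GPS
    cell, so the same key is only reproducible at the intended location."""
    cell = f"{round(lat, 3)}:{round(lon, 3)}".encode()  # ~100 m grid cell
    return hashlib.pbkdf2_hmac("sha256", biometric_bytes + cell,
                               salt, 100_000)

key = location_biometric_key(b"fingerprint-minutiae-template",
                             22.5726, 88.3639)  # hypothetical inputs
print(key.hex())
```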

Ridam Pal, Onam Bhartia, Mainak Sen
Secure and Energy Aware Shortest Path Routing Framework for WSN

A Wireless Sensor Network is a network in which the sensor nodes operate in a distributed and self-organizing fashion. Due to this nature, the sensor nodes are vulnerable to various malicious attacks. The sensor nodes are also resource constrained, i.e., they have limited bandwidth, energy, and memory. Along with security and energy consumption, delay also needs to be considered in routing protocol design. This paper proposes a Trust and Energy Aware Routing Framework that provides resilience to various malicious attacks. The proposed framework helps find trusted nodes with maximum residual energy and routes the data through shorter paths. Experimental analysis with varying network parameters, such as the number of malicious nodes and the packet size, reveals the robustness of the proposed approach. Total throughput, energy consumption, and end-to-end delay are considered in the experimental evaluation.
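A minimal sketch of route selection under a composite trust/energy/distance cost using NetworkX (the toy topology and the penalty weights are illustrative assumptions, not the framework's exact metric):

```python
import networkx as nx

G = nx.Graph()
# Edges carry hop distance; per-node trust and residual energy in [0, 1].
G.add_weighted_edges_from([("S", "A", 1), ("A", "D", 1),
                           ("S", "B", 1), ("B", "D", 1)])
trust = {"S": 1.0, "A": 0.4, "B": 0.9, "D": 1.0}
energy = {"S": 1.0, "A": 0.8, "B": 0.7, "D": 1.0}

def cost(u, v, d):
    # Penalize hops into low-trust / low-energy nodes (illustrative weights).
    return d["weight"] + 2.0 * (1 - trust[v]) + 1.0 * (1 - energy[v])

print(nx.shortest_path(G, "S", "D", weight=cost))  # avoids low-trust node A
```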

Nikitha Kukunuru
Air Pollution Monitoring System Using Wireless Sensor Network (WSN)

Rapid industrialization and urbanization cause a continuous decline in environmental quality parameters. Today, the world faces challenges like global warming, which occurs when carbon dioxide (CO2) and other greenhouse gases accumulate in the atmosphere and absorb sunlight and solar radiation that has bounced off the earth's surface. Its impacts include sea level rise, changes in seasonal patterns, rising temperatures, more frequent droughts, and extreme rainfall. Air pollution's serious impact on human health and the environment requires worldwide awareness and understanding. Existing systems give real-time air pollution data to pollution monitoring authorities, but they rely on fixed infrastructure and suffer from maintenance, reconfiguration, and reduced-sensing issues, while conventional measurement methods are costly and spatially restricted. Air pollution monitoring therefore becomes a challenging task. Here, we propose a Wireless Sensor Network (WSN) based system with low-cost sensors, which collects air pollutant information in real time from different locations. Sensor data is transferred to the cloud (ThingSpeak, an open-source API) for later analysis and for calculating the Air Quality Index (AQI). An Android application can be used to visualize the real-time air quality of a location.
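A minimal sketch of the sensor-to-cloud leg using ThingSpeak's public REST update endpoint (the API key and field mapping are placeholders, and the sensor read is simulated):

```python
import random

import requests

API_KEY = "YOUR_WRITE_API_KEY"           # placeholder ThingSpeak channel key

def read_co2_ppm():
    """Stand-in for a real gas-sensor read."""
    return 400 + random.random() * 50

resp = requests.get("https://api.thingspeak.com/update",
                    params={"api_key": API_KEY, "field1": read_co2_ppm()})
print(resp.text)  # ThingSpeak returns the new entry id (or 0 on failure)
```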

Deepika Patil, T. C. Thanuja, B. C. Melinamath
Dynamic Range Implementation in Wi-Fi Access Point Through Range Adaptation Algorithm

A normal Wi-Fi access point (AP) works with a particular modulation and coding scheme (MCS), and each MCS value corresponds to a particular throughput and range. Because of this, a Wi-Fi access point supports only a particular range at any given time. When a client moves away from the access point, it loses connectivity, and if the AP is working with the highest MCS index value, the range it supports becomes very small. Hence, by dynamically changing the MCS value as the client moves away from or towards the AP, the range supported by the access point can be made to change dynamically. This is achieved through the implementation of a range adaptation algorithm. The algorithm works on the premise that the position of a client can be deduced from the SNR of its received signal; once the position is known, the highest VHT-MCS value whose range can support the client's position is chosen. In this paper, the relation between SNR and MCS is studied and an algorithm based on it is developed for changing the range dynamically.
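A toy sketch of the adaptation step, choosing the highest MCS whose SNR requirement the client still meets (the threshold table is purely illustrative, not the paper's measured values):

```python
# (MCS index, minimum SNR in dB) -- illustrative thresholds only.
MCS_TABLE = [(0, 2), (1, 5), (2, 9), (3, 11), (4, 15),
             (5, 18), (6, 20), (7, 25), (8, 29), (9, 31)]

def pick_mcs(snr_db):
    """Highest VHT-MCS the link can sustain; lower MCS implies longer range."""
    usable = [m for m, need in MCS_TABLE if snr_db >= need]
    return max(usable) if usable else 0

for snr in (3, 12, 27):
    print(snr, "dB ->", pick_mcs(snr))  # MCS drops as the client moves away
```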

K. B. Jagan, S. Jayaganesh, R. Neelaveni
Air Quality Parameter Measurements System Using MQTT Protocol for IoT Communication Over GSM/GPRS Technology

Increasing populations and industrial activity are threatening our environment, and air pollutants are emitted at a very high rate. With technological innovation and advancement, smart cities have come into existence, where information technology is used for better optimization and resource planning, thereby promoting sustainable growth and development. The Internet of Things and data analytics together can lead an organization to better cost management, fewer equipment failures, and improved business operations. We propose an Air Quality Measurement System which can effectively keep track of air quality in the atmosphere. The Internet is used for end-to-end connectivity: the parameters sensed by the system are sent to the server through a cellular network system, and communication between the device and the server is deployed over the lightweight Message Queuing Telemetry Transport (MQTT) protocol.
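A minimal paho-mqtt publishing sketch for the device-to-server leg (paho-mqtt 1.x-style constructor; the broker host, topic, and payload format are assumptions, and the GSM/GPRS link is transparent to the MQTT client):

```python
import json

import paho.mqtt.client as mqtt

client = mqtt.Client()                      # lightweight MQTT 3.1.1 client
client.connect("broker.example.com", 1883)  # hypothetical broker over GPRS

payload = json.dumps({"pm25": 38.2, "co_ppm": 0.9, "station": "node-07"})
client.publish("city/air-quality/node-07", payload, qos=1)  # at-least-once
client.disconnect()
```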

Anil Thosar, Rohan Nathi
Comprehensive Analysis of Routing Protocols Surrounding Underwater Sensor Networks (UWSNs)

In recent times, Underwater Sensor Networks (UWSNs) have become highly important for performing all sorts of underwater operations. It is cumbersome to apply terrestrial sensor network routing protocols in UWSNs due to high propagation delay, packet delay, and energy-efficiency constraints, so many research efforts are dedicated to proposing efficient routing protocols for UWSNs. As UWSNs have specific characteristics in terms of rapid dynamic topology change, limited bandwidth, high energy consumption, high latency, packet delay issues, and security problems, designing a routing protocol that overcomes all the aforementioned issues is quite a daunting task. In this paper, we primarily survey the various routing protocols available to date for data routing in UWSNs. In addition, the protocols are compared on the basis of various characteristics, such as routing technique, packet delivery ratio, energy efficiency, packet delay, and localization, to give a clear picture of the benefits and shortcomings of each enlisted protocol for UWSNs.

Anand Nayyar, Vikram Puri, Dac-Nhuong Le
Network Intrusion Detection in an Enterprise: Unsupervised Analytical Methodology

Be it an individual, an organization, or a government institution, cyber-attack knows no boundaries. Cyber-attacks in the form of malware, phishing, and intrusion into enterprise networks have become more prevalent these days. With advancements in technology, the number of connected devices has increased vastly, leading to the storage of very sensitive data belonging to different entities. Cybercriminals attempt to access this data, as monetizing this information is very lucrative for them. Due to the sophistication of the technology used by cybercriminals, these attacks have become more difficult to detect and handle, making it a major challenge for governments and enterprises to protect their sensitive data. Traditional detection methods such as antivirus software and firewalls are limited to known attacks, i.e., attacks which have occurred in the past, while the growing advancement of technology has led to unique and different types of attacks for which traditional detection methods fail. In this paper, we propose a methodology for intrusion detection that can handle such threats in near real time.
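The abstract does not name the specific unsupervised technique; as one common stand-in, an isolation-forest anomaly detector over simple flow features (the feature choice and synthetic data are assumptions):

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
# Toy flow features: [bytes sent, packet count, distinct destination ports].
normal = rng.normal([5e4, 40, 3], [1e4, 10, 1], size=(500, 3))
scan = np.array([[2e3, 900, 250]])  # a port-scan-like outlier

model = IsolationForest(contamination=0.01, random_state=0).fit(normal)
print(model.predict(scan))          # [-1] flags the flow as anomalous
```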

Garima Makkar, Malini Jayaraman, Sonam Sharma
Multi-stage Greenfield and Brownfield Network Optimization with Improved Meta Heuristics

Facility location decisions are part of a company's strategy and are important because they involve significant investment and are usually irreversible. Facility locations matter because (a) they require large investments that cannot be recovered, (b) the decisions affect the competitiveness of the company, (c) the decisions affect not only costs but the company's income, and (d) they determine customer satisfaction and the trade-offs of decisions based on service level. In supply chain network design, facility locations (for example, warehouses, distributors, manufacturing sites, cross-docks, or retailer locations) play an important role in driving efficient distribution planning and satisfying customer service levels. The problem is to identify the optimal set of facilities that can serve all customer demand at a given service level at minimal cost. This problem becomes complex when the possible locations are not known (the Greenfield problem) and cost is the major driver for the selection of facilities. This paper addresses the problem in two parts: (a) first, improving the existing methodology (clustering) to achieve an optimal clustering solution; with an improved clustering outcome, we get a better set of facility locations to be installed in the Greenfield scenario; (b) second, extending the solution towards mathematical optimization based on cost, demand, service level, priority of facilities, and business constraints (the Brownfield problem). This solution helps in selecting the best set of available facility locations while respecting business constraints, so that customer demand is met at minimal cost. The clustering algorithm, improved metaheuristic, and GLPK-based (an open-source package for solving LP/MILP problems) mathematical optimization have been developed in Java. We are currently pursuing further research in this direction to improve network and facility decisions.
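For the Greenfield step, a common sketch is demand-weighted k-means over customer coordinates, with cluster centers as candidate facility sites (scikit-learn's `sample_weight` is used as a stand-in for the chapter's improved clustering; the data is invented):

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
customers = rng.uniform(0, 100, size=(200, 2))  # customer x/y coordinates
demand = rng.integers(1, 50, size=200)          # demand per customer

# Greenfield step: centers of demand-weighted clusters become the
# candidate facility locations fed into the downstream cost-based MILP.
km = KMeans(n_clusters=4, n_init=10, random_state=0)
km.fit(customers, sample_weight=demand)
print(km.cluster_centers_)
```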

Avneet Saxena, Dharmender Yadav
Implementing Signature Recognition System as SaaS on Microsoft Azure Cloud

The use of information technology in varied applications is growing exponentially, which makes the security of data a vital concern. Authentication plays an imperative role in the field of information security. In this study, biometrics is used for authentication, and the combined power of biometrics and cloud computing technologies, which exhibit outstanding flexibility, scalability, and reduced overhead costs, is used to lower the cost of the biometric system's requirements. The massive computational power and unlimited storage provided by cloud vendors make the system fast. The purpose of this research is to design a biometric-based cloud architecture for online signature recognition on a Windows tablet PC, which makes the signature recognition system (SRS) more scalable, pluggable, and faster, thereby placing it in the "Bring Your Own Device" category. To extract the features of the signature that uniquely identify the user, the Weber local descriptor (WLD) is used. The real-time implementation of this feature extraction process, as well as the execution of the classifier for the verification process, is deployed on the Microsoft Azure public cloud. For performance evaluation, the total acceptance ratio (TAR) and total rejection ratio (TRR) are used. The proposed online signature system gives a 78.10% PI (performance index) and a 0.16 SPI (security performance index).

Joel Philip, Dhvani Shah
Target-Controlled Packet Forecast and Communication in Wireless Multimedia Sensor Networks

Two factors that are vital in present multimedia applications are target-controlled packet forecasting and communication. Quality of Service (QoS) degrades when packets miss their targets, become useless, and are dropped. As the consumption of real-time hypermedia applications and the Internet of Things (IoT) has grown, multimedia data communication has become a key factor in upholding the QoS delivered to users. To fulfill the QoS requirements in Wireless Multimedia Sensor Networks (WMSNs), a mixture of multiple communication methods is employed for packet sending, including Conventional Network Coding (CNC), Analog Network Coding (ANC), Plain Routing (PR), and Direct Broadcast (i.e., No-Relaying, NR). The combination and integration of communication methods lowers the packet-dropping probability, but it complicates the packet transfer and forecasting process. Hence, an exhaustive search scheme is introduced to obtain the optimal forecast sequence and corresponding communication method for target-constrained multimedia broadcasts in WMSNs. To improve the computing efficiency of the formulated problem, two heuristic methods based on Markov chain approximation and dynamic graphs are proposed.

S. Ambareesh, A. Neela Madheswari
Backmatter
Metadata
Title
Data Management, Analytics and Innovation
Edited by
Prof. Valentina Emilia Balas
Prof. Neha Sharma
Dr. Amlan Chakrabarti
Copyright year
2019
Publisher
Springer Singapore
Electronic ISBN
978-981-13-1402-5
Print ISBN
978-981-13-1401-8
DOI
https://doi.org/10.1007/978-981-13-1402-5
