2020 | Book

EAI International Conference on Big Data Innovation for Sustainable Cognitive Computing

BDCC 2018

Editors: Prof. Dr. Anandakumar Haldorai, Arulmurugan Ramu, Dr. Sudha Mohanram, Dr. Chow Chee Onn

Publisher: Springer International Publishing

Book Series: EAI/Springer Innovations in Communication and Computing

About this book

This proceedings volume features papers discussing big data innovation for sustainable cognitive computing. The papers detail cognitive computing and its self-learning systems, which use data mining, pattern recognition, and natural language processing (NLP) to mirror the way the human brain works. This international conference focuses on cognitive computing technologies, from knowledge representation techniques and natural language processing algorithms to dynamic learning approaches. Topics covered include Data Science for Cognitive Analysis, Real-Time Ubiquitous Data Science, Platform for Privacy Preserving Data Science, and Internet-Based Cognitive Platform. The EAI International Conference on Big Data Innovation for Sustainable Cognitive Computing (BDCC 2018) took place on 13–15 December 2018 in Coimbatore, India.

Table of Contents

Frontmatter

Main Track

Frontmatter
Chapter 1. Data Security in the Cloud via Artificial Intelligence with Vector Quantization for Image Compression

In this chapter, two cloud compression techniques are applied to an image, namely, Vector Quantization (VQ) and a Feed Forward Neural Network (FFNN). VQ is used along with K-means clustering to initialize the centroids and form the codebook. The FFNN in this algorithm has 64 nodes in the input and output layers, along with 16 hidden layers of 16 nodes each. VQ is applied first to the input image to achieve some compression, and the VQ-compressed image is then fed as input to the FFNN for additional compression. A set of compression observations is recorded for different values of K (the number of centroids) with a tile size of 8. Results are obtained for K values of 50, 100, 150, 200, 250, 500, and 1000. The proposed algorithm gives a compression ratio of about 2 and an acceptable PSNR of about 20 dB for the standard test image Lena.

Srinivasa Kiran Gottapu, Pranav Vallabhaneni
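
The Chapter 1 abstract above reports its results as a compression ratio and a PSNR in dB. As a rough illustration of how those two figures are conventionally computed, here is a minimal sketch. It is not the authors' code: the function names are hypothetical, images are assumed to be lists of rows of 8-bit grayscale values, and PSNR is taken as 10·log10(MAX² / MSE).

```python
import math


def psnr(original, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized 2-D images.

    Assumes both images are lists of rows of numeric pixel values.
    """
    squared_error = 0.0
    count = 0
    for row_a, row_b in zip(original, reconstructed):
        for a, b in zip(row_a, row_b):
            squared_error += (a - b) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical images: no distortion
    return 10.0 * math.log10(max_value ** 2 / mse)


def compression_ratio(original_bits, compressed_bits):
    """Ratio of the uncompressed size to the compressed size."""
    return original_bits / compressed_bits
```

With these definitions, a PSNR of 20 dB on 8-bit data corresponds to a mean squared error of about 650, that is, a typical per-pixel error of roughly 25 gray levels.
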
Chapter 2. A Hybrid Ant–Fuzzy Approach for Data Clustering in a Distributed Environment

Mining the relevant documents from distributed databases is a challenging task in the era of ‘big data’. This paper presents a hybrid approach using two different metaheuristics algorithms inspired by nature, namely, the enhanced ant clustering algorithm (EACA) and the fuzzy clustering algorithm, which deal with the uncertainty in data mining. The first algorithm uses the ant colony metaphor, which is one of the most recent nature-inspired metaheuristics, and the second employs the fuzzy clustering approach. The proposed work aims to introduce the fuzzy approach to the ant-based algorithm for both local and global levels, that is, interclustering and interzone clustering, in mining distributed databases, and a new hybrid (ant–fuzzy) algorithm is developed for data clustering under a distributed environment. The proposed cluster-based framework presents the important concepts of web mining and its various real-time applications. The real-time synthetic data and, also, the training datasets from the UCI Machine Learning Repository were employed to evaluate the performance of the algorithms. The performance of the proposed work is compared with the EACA, probabilistic ant-based clustering (PACE), and K-means algorithms in terms of accuracy with the F-measure and the error rate with entropy measure.

K. Sumangala, S. Sathappan
Chapter 3. S-Transform-Based Efficient Copy-Move Forgery Detection Technique in Digital Images

Copy-move forgery (CMF), which copies a part of a picture and pastes it into another location, is one of the common strategies for digital image tampering. The advent of high-performance hardware and the widespread use of image processing software make it very easy to create image forgeries that are undetectable by the naked eye. For CMF detection, we suggest an efficient and robust method that can handle numerous transformations including rotation, scaling, and blurring. In the proposed CMF detection system, we use the Stockwell transform (S-transform), which combines the advantages of both the scale invariant feature transform (SIFT) and the wavelet transform (WT), to extract the key points and their descriptors from the overlapped image blocks. Furthermore, the Euclidean distance (ED) between the overlapped blocks is measured to detect similarities and to identify the tampered or forged region in the image. In addition, a novel fuzzy min-max neural network-based decision tree (FMMNN-DT) classifier is used to recognize the duplicated regions in the forged image. The proposed system is tested and validated using the MICC-F220 dataset, and we present a comparison between the proposed results and some existing ones, which confirms the significance of the proposed method.

Rajeev Rajkumar, Sudipta Roy, Kh. Manglem Singh
Chapter 4. Neuro-Fuzzy Ant Bee Colony Based Feature Selection for Cancer Classification

A multi-objective neuro-fuzzy expert system called NF-ABC is proposed, which hybridizes Ant Bee Colony (ABC) with the Adaptive Neuro-Fuzzy Inference System (ANFIS). It improves classification accuracy and reduces the complexity caused by dimensionality, redundancy, and irrelevant data. In the proposed work, SVM and kNN algorithms are used to classify the given microarray data. The results reveal that the proposed model is more successful than the previous model.

S. Gilbert Nancy, K. Saranya, S. Rajasekar
Chapter 5. Entity Resolution for Maintaining Electronic Medical Record Using OYSTER

With the advancement of technology, the world has witnessed the digitization of various fields including education, healthcare, agriculture, and manufacturing. Healthcare is a very important field that has generated huge amounts of data in the past few decades due to a steep rise in population, and hence an ever-increasing use of online databases for storing every minute detail. Doctors no longer rely on written prescriptions or documents for examining the health conditions of a patient. Electronic medical records have enabled doctors to monitor each patient’s medical history with ease. However, the rate at which healthcare data is increasing has led to the search for new and better alternatives that enhance the feasibility and scalability of existing digital storage systems. This chapter provides insight into how Entity Resolution can be put to use in healthcare for maintaining electronic medical records using the open-source software OYSTER. The chapter also sheds light on how performing Entity Resolution using OYSTER has an edge over the systems currently used for storing personal medical information in hospitals.

Tanya Gupta, Varad Deshpande
Chapter 6. Lifetime Improvement of Wireless Sensor Networks Using Tree-Based Routing Protocol

Wireless sensor networks (WSNs) are used for handling a large amount of data despite their serious limitations such as delay and energy consumption. The constraints become a serious issue when we deploy many nodes for the purpose of data handling. WSNs can be made energy efficient by means of Energy-Efficient Low Duty Cycle (ELDC) protocol which is an artificial neural network-based energy-efficient and robust routing scheme used in wireless sensor networks. ELDC is an extension of energy-efficient unequal clustering and energy-efficient multiple distance-aware clustering protocols. It is a dynamic group-based routing protocol which makes multi-hop strategy for communication. In ELDC there is a large amount of packet loss and load balance is not provided which reduces the lifetime of the network. Therefore, we suggest the use of General Self-Organization Tree-Based Energy Balance (GSTEB) protocol to provide load balance. It is a dynamic tree-based routing protocol which minimizes the energy consumption and has a minimum or no packet loss and data compression is also provided to improve the performance. It also prevents the entry of unauthorized node containing malicious data and the packet delivery ratio is also high. The proposed system greatly improves the lifetime of the network. The applications of our paper include environmental monitoring, air traffic control, surveillance, etc.

Sushaptha Rajagopal, R. Vani, J. C. Kavitha, R. Saravanan
Chapter 7. An Energy-Efficient Distributed Unequal Clustering Approach for Lifetime Maximization in Wireless Sensor Network

In Wireless Sensor Networks, high energy consumption is caused by the gathering and transmission of large amounts of sensor data. In clustering, each sensor node forwards its sensed information to the Cluster Head, which further transmits the processed information to the sink. Such cluster heads are therefore more likely to die early because of their higher workload, which rapidly decreases the lifetime of the sensor network and eventually affects network performance. This research paper introduces a clustering algorithm named Energy-Efficient Distributed Unequal Clustering Approach, which balances energy depletion among the cluster heads to eliminate the hot spot problem and thus achieve lifetime maximization in Wireless Sensor Networks. It applies an unequal clustering technique to the sensor nodes, where cluster head election is based on a fuzzy inference system and the sensor nodes found to have the higher chance are finalized as cluster heads. Based on the input fuzzy parameters, the cluster size is optimally adjusted to achieve load balancing among the clusters. Simulations are executed to compare the performance of the proposed approach with the existing LEACH, CHEF, and DUCF energy-efficient clustering approaches in various network scenarios.

S. Manikandan, M. Jeyakarthic
Chapter 8. An Effective Big Data and Blockchain (BD-BC) Based Decision Support Model for Sustainable Agriculture System

In spite of the available advanced technologies and their convergence with every field of society, agriculture is perhaps the only field lagging behind in the effective use of advanced technology to tackle farm problems. As a result, agribusiness faces many problems such as demand-supply synchronization, food wastage, lack of food security, unnecessary creation of demand by holding the supply erratically, and intervention of mediators. Consequently, farmers often do not get the expected income and consumers are not happy with the varied prices. The incorporation, implementation, and usage of Big Data, Cloud, and Blockchain technologies have brought a paradigm shift in sustainable agriculture research and practices, with a sustainable commercial model to comprehend potential benefits and sustain them. Hence there is a need for a technology-based efficient system tailored to the needs of farmers so that they remain competitive and derive better price realization. This work presents a novel technology-based idea to address the problems of agriculture effectively. Implementation results have shown that this model improves the quality of the agricultural system by minimizing the gap between the demand for and supply of food crops required by society from the farmer’s end, thus avoiding losses for farmers and catering to the needs of consumers. This leads to gainful crop business for farmers and satisfactory fulfillment of societal needs.

M. Dakshayini, B. V. Balaji Prabhu
Chapter 9. An SDN-Based Strategy for Reliable Data Transmission in Mobile Wireless Sensor Networks

Wireless sensor networks have many applications in today’s world and will have many more in the near future. Wireless sensor networks are less reliable than traditional networks due to their low power capabilities, limited energy, and other limited resources. Several applications of wireless sensor networks, specifically in hazardous scenarios where the wireless channel is in a deprived state, carry very critical data that must be delivered to the designated destination at the right time and with minimal delay, and therefore demand a highly reliable network. To achieve better reliability, wireless mobile networks could be employed. This paper proposes a novel fidelity model that utilizes mobile nodes in a hazardous environment where the wireless channel is in a deprived condition and early energy exhaustion is a threat. The mobility of the nodes reduces the number of hops and thus increases the likelihood of effective transmissions. The simulations show better utilization of energy, and the proposed methodology outperforms other traditional methods in achieving reliability with minimal delay and minimal energy utilization.

V. Shubha Rao, M. Dakshayini
Chapter 10. Different Aspects of 5G Wireless Network: An Overview

The pursuit of new feasible technology has always been the main intention of every telecom company. In line with current trends, user preferences, and requirements, technology standards have evolved from 1G, 2G, 3G, and 4G to 5G. The new standards provide an opportunity for the technology community to address certain limitations that have been observed in the 4G network. Critical issues such as battery power, bandwidth restrictions, and latency, among other things, have been drastically improved. Thus, this chapter discusses the various performance parameters, necessities, evolutionary enhancements, and challenges in deploying a 5G mobile network.

Akash R. Kathavate, Bhanu Priya, Rajeshwari Hegde, Sharath Kumar
Chapter 11. Intelligent Systems for Volumetric Feature Recognition from CAD Mesh Models

This chapter presents an intelligent technique to recognize volumetric features from CAD mesh models based on hybrid mesh segmentation. The hybrid approach is an intelligent blending of facet-based, vertex-based, rule-based, and machine learning-based techniques. Compared with existing state-of-the-art approaches, the proposed approach does not depend on attributes like curvature, minimum feature dimension, number of clusters, number of cutting planes, orientation of the model, and thickness of the slice to extract volumetric features. The intelligent threshold prediction makes hybrid mesh segmentation automatic. The proposed technique automatically extracts volumetric features like blends and intersecting holes along with their geometric parameters. The proposed approach has been extensively tested on various benchmark test cases. It compares favorably with the existing techniques and is found to be robust and consistent, with coverage of more than 95% in addressing volumetric features.

Vaibhav Hase, Yogesh Bhalerao, Saurabh Verma, G. J. Vikhe
Chapter 12. Factors Affecting a Mobile Learning System: A Case Study

The emergence of m-learning, i.e., mobile learning, has given further strength to traditional teaching methods, as it has a great advantage in social interaction, teaching/learning methods, and knowledge transformation. M-learning cannot be an alternative to traditional teaching methods, but it is a technical facet of new teaching and learning. It is an alternative to e-learning, but with many major changes. A learner plays a very critical role in any learning environment. The learner, as a factor of m-learning, also has a certain impact on the m-learning environment. Learning objectives are affected by different attributes of the student, which influence the way in which learning happens. The characteristics of the learner play a vital role in determining whether or not the learning experience is meaningful. Understanding these learner characteristics is a dynamic and complex process. This chapter identifies the factors that have an impact on an m-learning system, and then analyzes the roles of a learner and their contribution to m-learning.

Sudhindra B. Deshpande, Shrinivas R. Mangalwede, Padma Dandannavar
Chapter 13. Document Similarity Approach Using Grammatical Linkages with Graph Databases

Document similarity has become essential in many applications such as document retrieval, recommendation systems, and plagiarism checking. Many similarity evaluation approaches rely on word-based document representation because it is very fast. But these approaches are not accurate when documents with different language and vocabulary are used. When graph representation is used for documents, the approaches use some relational knowledge, which is not feasible in many applications because of expensive graph operations. In this work, a novel approach for document similarity computation that utilizes verbal intent has been developed. This improves the similarity computation, and graph databases are also used for faster performance. The performance of the system is evaluated using various datasets, and the verbal intent-based approach has registered promising results.

V. Priya, K. Umamaheswari
Chapter 14. Missing Data Handling by Mean Imputation Method and Statistical Analysis of Classification Algorithm

The aim of data mining is to extract meaningful information from a large database. Because of human errors, high dimensionality, noisy data, and missing values, processing the dataset may degrade performance. Therefore, handling such data properly is important for improving performance. There are many missing data handling methods available. Mean imputation is one of the methods for handling missing data in a dataset. It is a preprocessing operation performed before applying any machine learning algorithm. After applying mean imputation to a dataset, a decision is made as to whether the imputed mean value is good or bad. The rpart decision tree algorithm is applied to a retailer dataset to handle a larger number of classes. From the experimental results, there is no significant difference among variables. The results of various GLM models with different were compared and analyzed to provide better performance.

K. Maheswari, P. Packia Amutha Priya, S. Ramkumar, M. Arun
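
The Chapter 14 abstract above uses mean imputation as a preprocessing step. A minimal sketch of that step for a single numeric column follows. It is illustrative only: the function name is hypothetical, missing entries are assumed to be represented by `None`, and the chapter's own workflow (rpart, GLM models) is not reproduced here.

```python
def mean_impute(values):
    """Return a copy of `values` with each None replaced by the mean
    of the observed (non-missing) entries."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]
```

For example, `mean_impute([1.0, None, 3.0])` fills the gap with 2.0. Note that this kind of imputation keeps the column mean unchanged but shrinks its variance, which is one reason the imputed values need to be assessed afterwards.
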
Chapter 15. Task Identification System for Elderly Paralyzed Patients Using Electrooculography and Neural Networks

In earlier days, people with disabilities faced great difficulty in communication due to neuromuscular attack. They are unable to share ideas and thoughts with others, so they need some assistance to overcome this condition. To that end, this paper discusses the feasibility of designing an electrooculogram (EOG)-based human computer interface (HCI), tested with ten subjects, using power spectral density techniques and neural networks. In this study we compare right-hander performance with left-hander performance. The study concluded that left-hander performance was marginally better than right-hander performance in terms of classification accuracy, with an average accuracy of 93.38% for all left-handed subjects and 91.38% for all right-handed subjects using a probabilistic neural network (PNN). We also observed that during training the left-handers participated with interest and were able to perform the 11 tasks more easily than the right-handers. From this study we concluded that creating an HCI was feasible with left-handers, and the study also shows that right-handers need some more training to achieve this. Finally, the experiment outperforms our previous study in terms of performance by changing the subjects from right-handers to left-handers.

S. Ramkumar, G. Emayavaramban, K. Sathesh Kumar, J. Macklin Abraham Navamani, K. Maheswari, P. Packia Amutha Priya
Chapter 16. A Software-Defined Networking (SDN) Architecture for Smart Trash Can Using IoT

A major challenge in Information and Communication Technology (ICT) is to acquire and transform the obtained data in an efficient way. ICT encompasses most of the day-to-day activities of human life, including environment management, education, health care, water-resource management, electricity resource management, agriculture, and traffic management. In the current circumstances, ICT needs to send more and more data efficiently and effectively. This is where IoT (Internet of Things) and SDN (Software-Defined Networking) come into the ICT domain and take it forward. SDN is a promising network standard that makes the operation of network devices programmable, whereas IoT lets every object in the world function elegantly by giving it internet connectivity. Combining these two areas leads to further advantages in the ICT domain, such as efficient network management, dynamic programming of network devices, and easy accessibility of device data. In this paper we propose a novel SDN architecture for an IoT-based smart trash can. The smart trash can is an IoT application that we developed for managing garbage collection in a city efficiently. It also finds an efficient path for garbage collection and helps in saving time and fuel. The IoT-based smart trash can is implemented using a Raspberry Pi board with an HC SR04 ultrasonic sensor for measuring the trash level. Amazon Web Services (AWS) is used for storing data and sending notifications to the people involved in collecting garbage.

T. Vairam, S. Sarathambekai, D. Vigneshwaran
Chapter 17. Modified K-Nearest Neighbor Fuzzy Classifier Using Group Prototypes and Its Application to Skin Segmentation

This paper describes proposed modifications to the K-NN classifier, i.e., Modified Fuzzy KNN (MFKNN), to address some complexity drawbacks of KNN. MFKNN calculates group prototypes from several patterns belonging to the same class and uses these prototypes for the recognition of patterns. The number of prototypes created by the MFKNN classifier depends on the distance factor d. More prototypes are created for smaller values of d and vice versa. A fuzzy logic layer is also added to increase the prediction accuracy of the classifier. We have compared the performance of the original KNN and MFKNN using the skin segmentation dataset. From the experimentation, one can conclude that the performance of MFKNN is better than that of the original KNN in terms of percentage recognition rate, recall time per pattern, classification, and classification time. MFKNN thus extends the scope of the original KNN to large data sets, which was not possible previously.

Priyadarshan Dhabe, Mukesh P. Chugwani, Vaibhav B. Kahalekar
Chapter 18. Enhancing Cooperative Spectrum Sensing in Flying Cell Towers for Disaster Management Using Convolutional Neural Networks

Natural calamities are increasing every year and communication plays a major role in post disaster measures to save human lives. This work utilizes the adaptation of the emerging dynamic radio technology called cognitive radio networks over Unmanned Aerial vehicles (UAV). Enhancing emergency communication over disaster affected zones where the mobile network base stations are completely destroyed is enabled by mounting drones with an omni antenna base station. This chapter analyses the cooperative spectrum sensing (CSS) technique of the intelligent radio to study incoming primary user (PU) when the available spectrum consists of multiple secondary users (SUs). A deep learning based technique called SpecCNN (Spectrum sensing Convolutional Neural Network) is proposed for performing intelligent spectrum sensing by analysing hidden cyclostationary features from drone data (image) of disastrous areas.

M. Suriya, M. G. Sumithra
Chapter 19. Emoticons and Their Effects on Sentiment Analysis of Twitter Data

Several social media sites have become a growing source of data and information. Recent years have witnessed a massive and unprecedented growth of the World Wide Web. Usage of other social networking sites, Web forums, and blogs has also seen an ever-increasing trend. Twitter and Facebook, in particular, have become well-known means of communication over the Internet, where millions of users share their opinions, reviews, experiences, thoughts, feelings, and preferences on different aspects of products or services on a daily basis. This large volume of user-generated content contains sentiment-based sentences, which are expressions that describe the mood of the writer or his/her opinion towards a particular person/entity. This has resulted in tweets, SMS messages, and other short informal texts being used in Sentiment Analysis (SA). In addition to text, emoticons are increasingly being used by people (especially youngsters) to express feelings/sentiments that otherwise cannot be adequately communicated in words. The majority of present SA systems do not consider emoticons as a part of the analysis. However, emoticons are strong indicators of sentiments and can be considered as cues for analyzing sentiments. The main objective of this work is to determine whether emoticons can be used as reliable cues in SA, based on a comparison between SA conducted on tweets with emoticons and without emoticons.

P. S. Dandannavar, S. R. Mangalwede, S. B. Deshpande
Chapter 20. Prediction of Customer Churn Using Machine Learning

With increasing competition in the customer service sector, it has become paramount to invest time in understanding customer behavior and to predict customer churn accurately. Customer churn occurs when customers decide to discontinue their relationship with the company. Many traditional algorithms have been used to predict churn and thus to devise various techniques for customer retention, but with the advent of deep learning paradigms, we have witnessed algorithms that offer a new perspective on this very task. Deep learning permits multilayered models to represent data at multiple levels of abstraction. It also greatly reduces the work of feature engineering, as it automatically comes up with good features. This chapter comprises an experimental comparison of various traditional classification algorithms, namely K-nearest neighbors, naive Bayes, random forest, decision tree, and logistic regression, with an artificial neural network (ANN) to predict customer churn on IBM’s Telco Customer Churn dataset. We have compared these models based on their accuracy in predicting customer churn. Our ANN model achieves an accuracy score of 82.83% on validation data, better than the 79.86% achieved by the traditional approach of using K-nearest neighbors. The results suggest that the multilayered ANN model with self-learning ability and tokenized data input outperforms traditional classification algorithms.

Saifil Momin, Tanuj Bohra, Purva Raut
Chapter 21. Prediction of Crop Yield Using Fuzzy-Neural System

Sustaining the burgeoning population is one of the major concerns of the twenty-first century. In one of its reports, the FAO has clearly stated that as more developing countries enter the developed phase, the purchasing power of the people will increase and there will be a constant increase in food demand. To meet these growing needs, it is necessary to keep up with the demand. To address this situation, a lot of research has been conducted in the past towards developing a robust time series forecasting algorithm. In our research we observed that, due to the precarious nature of crop yield, fuzzy time series has been particularly successful in predicting crop production. In this chapter we propose a method to predict crop yield using fuzzy logic and an artificial neural network, and we establish the results by implementing it on a rice yield dataset.

Bindu Garg, Tanya Sah
Chapter 22. Speed Estimation and Detection of Moving Vehicles Based on Probabilistic Principal Component Analysis and New Digital Image Processing Approach

In the twenty-first century, smart city surveillance management is one of the advancements of Information and Communication Technology. Intelligent Transport System (ITS) is an essential component of the smart city. Moving vehicle detection and speed estimation are major tasks of traffic management. Vehicle tracking and speed measurement methods failed to achieve good accuracy rate due to unsuccessful detection of moving vehicles. In the existing system, the conventional de-noising filters reduce the noise in smooth regions. The edges of object boundaries are not sharply identified. In this chapter, the Probabilistic Principal Component Analysis (PPCA) method is proposed to detect multiple outliers in objects. It is computationally fast and robust in identifying outliers which helps to reduce the dimension of video by finding an alternate set of coordinates. The proposed approach consists of three stages. First, Spatio-temporal Varying Filter (STVF) is applied to preprocess extracted frames. Contour finding algorithm is used to detect the vehicle. The frame count scheme is applied to estimate the vehicle speed. This approach provides high detection accuracy with high precision and recall rate in BrnoCompSpeed dataset.

T. V. Mini, V. Vijayakumar
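
The Chapter 22 abstract above mentions a frame count scheme for estimating vehicle speed. The abstract does not give the formula, so the following is only a sketch of the usual idea: a vehicle covers a known distance between two reference lines, and the elapsed time is the number of frames divided by the frame rate. The function name, the parameters, and the calibrated distance are all assumptions for illustration.

```python
def estimate_speed_kmh(distance_m, frame_count, fps):
    """Estimate speed in km/h from the number of video frames a vehicle
    needs to cover a known distance.

    distance_m  : real-world distance between two reference lines (metres)
    frame_count : frames elapsed while the vehicle travels that distance
    fps         : video frame rate (frames per second)
    """
    if frame_count <= 0 or fps <= 0:
        raise ValueError("frame_count and fps must be positive")
    elapsed_s = frame_count / fps          # time taken in seconds
    return distance_m / elapsed_s * 3.6    # convert m/s to km/h
```

Under these assumptions, a vehicle that takes 25 frames at 25 frames per second to cover 25 m is travelling at about 90 km/h. The resolution of such an estimate is limited by the frame rate, since the elapsed time is only known to within one frame.
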
Chapter 23. A Posture Recognition System for Assisted Self-Learning of Yoga by Cognitive Impaired Older People for the Prevention of Falls

According to the United Nations' World Population Ageing report, every country is facing the issue of an ageing population. It is estimated that 75% of fall injuries occur in low- and middle-income countries. Reasons for falls include slipping due to loss of footing or traction, problems in balancing, reduced muscle strength, poor vision, mobility/gait, and cognitive impairment. Fall risk in older people, especially those with cognitive impairment, is much higher than in others. Yoga uses a series of physical postures called asanas, breathing control, and meditation. Since yoga concentrates on both body and mind, it is more therapeutic than exercise. Some older people, especially women, are uncomfortable doing yoga in public view and prefer to do yoga at home during their free time. Assisted self-learning can address the needs of such people. Most yoga asanas depend on either sitting or standing postures. The identification of postures is based on the joint angles. A sensor like the Orbbec Astra observes the scene and provides skeletal information of a human being. We use this information to calculate the joint angles and identify the postures required as part of an Assisted Self-Learning System.

K. Ponmozhi, P. Deepalakshmi
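
The Chapter 23 abstract above identifies postures from joint angles computed on skeletal data. A minimal sketch of that geometric step is shown below, assuming each joint is available as a 3-D coordinate tuple. The function name and the example joints are hypothetical, and no sensor SDK calls are used or implied.

```python
import math


def joint_angle_deg(a, b, c):
    """Angle in degrees at joint `b`, formed by the segments b->a and b->c.

    Each argument is an (x, y, z) coordinate, e.g. shoulder, elbow, wrist.
    """
    ba = [p - q for p, q in zip(a, b)]
    bc = [p - q for p, q in zip(c, b)]
    dot = sum(x * y for x, y in zip(ba, bc))
    norm = math.sqrt(sum(x * x for x in ba)) * math.sqrt(sum(x * x for x in bc))
    if norm == 0:
        raise ValueError("joints must not coincide")
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cosine = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cosine))
```

A fully extended arm gives an elbow angle close to 180 degrees, while a right-angle bend gives about 90 degrees. A posture could then be described by comparing a handful of such angles against target ranges, which is one plausible reading of the approach the abstract outlines.
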
Chapter 24. Improved UFHLSNN (IUFHLSNN) for Generalized Representation of Knowledge and Its CPU Parallel Implementation Using OpenMP

Fuzzy Hyper Line Segment Neural Network (FHLSNN) (Kulkarni et al., International Joint conference on 4:2918–2933, 2001) is a hybrid system that combines fuzzy logic (Zadeh, IEEE Trans. Fuzzy Syst. 4:103, 1996) and neural networks (Zurada, Fundamental Concepts and Models of Artificial Neural Systems, 1992, pp. 30–36). It is used extensively for real-world pattern classification (Zurada, Fundamental Concepts and Models of Artificial Neural Systems, 1992, pp. 30–36). It learns patterns in terms of n-dimensional Hyper Line Segments (HLSs). Modified Fuzzy Hyper Line Segment Neural Network (MFHLSNN) (Patil et al., The 12 IEEE International Conference, vol. 2, 2003) is a modified version of FHLSNN (Kulkarni et al., International Joint conference on 4:2918–2933, 2001) that improves the quality of reasoning and the recall time per pattern using a modified fuzzy membership function. Updated Fuzzy Hyper Line Segment Neural Network (UFHLSNN) (Dhabe, 2016 International Conference on Computing, Analytics and Security Trends, 2016) was proposed for larger pattern datasets and computes membership with minimal computational effort. In this chapter, we propose an improved version of UFHLSNN (Dhabe, 2016 International Conference on Computing, Analytics and Security Trends, 2016), called IUFHLSNN, for generalized representation of knowledge and better recognition. IUFHLSNN uses the midpoints of the HLSs in the recall phase and is thus expected to provide better recognition, as suggested by the Occam’s razor principle (Blumer et al., Inform. Process. Lett. 24:377–380, 1987). We compared serial and parallel implementations on Intel’s Xeon E5-2620 and obtained average speedups of 16.96× and 77.22× for classification and recognition, respectively, across all the datasets used (Poker Dataset, https://archive.ics.uci.edu/ml/datasets/Poker+Hand; QtyT40I10D100K DataSet, https://archive.ics.uci.edu/ml/datasets/QtyT40I10D100K; Skin Segmentation Dataset, https://archive.ics.uci.edu/ml/datasets/skin+segmentation). In the same experiment, the percentage gains in time were 92% and 94% for classification and recognition, respectively. Thus, we strongly recommend IUFHLSNN and its OpenMP parallel execution. We also compared parallel executions on two different CPU architectures, viz. IBM’s POWER8 and Intel’s Xeon E5-2620. We found that IBM’s POWER8 is two times faster than Intel’s Xeon E5-2620. Thus, we strongly recommend generalized representation of HLS knowledge and OpenMP parallelization.
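To make the idea of recalling from HLS midpoints concrete, here is a deliberately simplified sketch. It represents each hyper line segment by its two endpoints and assigns a pattern to the class of the nearest midpoint using plain Euclidean distance. This nearest-midpoint rule is an assumption made for illustration; it is not the fuzzy membership function of IUFHLSNN, which the abstract does not give, and all names and data are hypothetical.

```python
import math

def midpoint(e1, e2):
    """Midpoint of a hyper line segment given its two n-dimensional endpoints."""
    return tuple((x + y) / 2.0 for x, y in zip(e1, e2))

def recall(pattern, segments):
    """Return the class label of the segment whose midpoint is nearest.

    segments is a list of (endpoint1, endpoint2, class_label) tuples.
    Euclidean distance stands in for a fuzzy membership value here.
    """
    best_label, best_dist = None, math.inf
    for e1, e2, label in segments:
        d = math.dist(pattern, midpoint(e1, e2))
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label

# Two hypothetical 2-D segments, one per class.
segments = [
    ((0.0, 0.0), (0.2, 0.2), "A"),
    ((0.8, 0.8), (1.0, 1.0), "B"),
]
print(recall((0.15, 0.05), segments))
```

Because each pattern is recalled independently of the others, a loop over test patterns is the kind of work that parallelizes naturally, which is consistent with the OpenMP speedups reported above.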

Priyadarshan S. Dhabe, Sanman D. Sabane
Chapter 25. Performance Evaluation of Multihop Multibranch DF Relaying Cooperative Wireless Network

In this chapter, we investigate the symbol error performance of a multihop multilink cooperative relayed wireless network by deriving the symbol error probability (SEP) over a Rayleigh flat fading channel. We study two different systems using the decode-and-forward (DF) relaying model: (1) single-branch multihop relaying without symbol error and (2) multihop multibranch relaying with symbol error in decoding. Expressions for the symbol-to-noise ratio at the destination are derived in terms of the cumulative distribution function (CDF) and probability density function (PDF). The theoretical SEP results are obtained for MPSK modulation. The relay links are independent and identically distributed (IID). Monte Carlo simulation is used to verify the analytical results.
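The general pattern of checking an analytical SEP against Monte Carlo simulation can be sketched for the simplest case: a single link, BPSK (MPSK with M = 2), coherent detection, and Rayleigh flat fading. The closed form used below is the standard textbook result for that case. The multihop multibranch DF expressions derived in the chapter are not reproduced here, and the function names, seed, and trial count are assumptions.

```python
import math
import random

def bpsk_rayleigh_sep_theory(avg_snr):
    """Textbook average error probability of coherent BPSK over Rayleigh fading."""
    return 0.5 * (1.0 - math.sqrt(avg_snr / (1.0 + avg_snr)))

def bpsk_rayleigh_sep_sim(avg_snr, trials, seed=1):
    """Monte Carlo estimate for the same single-link setting.

    avg_snr is the average SNR per symbol as a linear ratio (not dB).
    """
    rng = random.Random(seed)
    sigma = math.sqrt(1.0 / (2.0 * avg_snr))  # noise std for unit symbol energy
    errors = 0
    for _ in range(trials):
        # |h|^2 is exponential with mean 1 for a Rayleigh-faded link.
        gain = math.sqrt(rng.expovariate(1.0))
        # Symbol +1 is sent; after coherent detection the decision variable
        # is the fading amplitude plus real Gaussian noise.
        if gain + rng.gauss(0.0, sigma) < 0.0:
            errors += 1
    return errors / trials

avg_snr = 10.0  # 10 dB expressed as a linear ratio
print(bpsk_rayleigh_sep_theory(avg_snr), bpsk_rayleigh_sep_sim(avg_snr, 200_000))
```

The two printed values should agree closely; a persistent gap between simulation and formula would point to an error in one of them.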

M. Dayanidhy, V. Jawahar Senthil Kumar
Chapter 26. Predicting Property Prices: A Universal Model

The aim of this chapter is to develop a universal model for predicting property prices. The model can be used to estimate the worth of a property that is not trending. The model can also identify which factors affect property prices most, and it helps reveal the factors that influence property prices in a particular region. Property prices are an important reflection of the economy. Price ranges are of great interest to both buyers and sellers. In this chapter, property prices are predicted based on explanatory factors that cover many aspects of residential properties. The property prices are predicted using an Improved Linear Regression model for the specific selected region. To obtain a generalized universal model, clustering is performed on the predicted regional property prices. Because the model is intended to apply to any region, accuracy may initially be compromised for some areas. Applying suitable machine learning algorithms guarantees improved accuracy with every prediction.
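For readers unfamiliar with the regression step, the sketch below fits ordinary least squares with a single explanatory factor. It is plain linear regression, not the authors' Improved Linear Regression model, whose details the abstract does not give; the feature (floor area) and the numbers are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0.0:
        raise ValueError("all x values are identical: slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(model, x):
    slope, intercept = model
    return slope * x + intercept

# Hypothetical data: floor area (square metres) versus price (arbitrary units).
areas = [50.0, 80.0, 100.0, 120.0]
prices = [101.0, 161.0, 201.0, 241.0]
model = fit_line(areas, prices)
print(predict(model, 90.0))
```

With several explanatory factors, the same idea extends to multiple regression, and the fitted coefficients indicate which factors move the predicted price most.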

E. Poovammal, Mayank Kumar Nagda, K. Annapoorani
Chapter 27. Facial Based Human Age Estimation Using Deep Belief Network

Facial-based human age estimation has attracted a lot of attention recently. Age estimation is a challenging task because of variation in lighting conditions, poses, and facial expressions. Despite much research in facial-based human age estimation, there is still room to improve performance. To improve accuracy, we present age estimation using a deep belief network. Deep belief networks have shown superior performance compared to other classification models. The success of the deep belief network lies in the contrastive divergence algorithm. Facial images pass through the Viola-Jones face detection algorithm; once a face is detected, facial features are extracted using active appearance and scattering transform feature methods. These feature extraction models extract not only geometric features but also texture features. Subsequently, a deep belief network classification model is built on the partitioned training images and evaluated on the testing images. We performed experiments on the image dataset, and results are obtained by varying the training percentage. Compared to other age estimation models, we achieved a low mean absolute error of 4.95 with 70% of the dataset used for training. This study shows that including a deep belief network improves performance.
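The mean absolute error quoted above is the average absolute difference between estimated and true ages, in years. A minimal sketch with made-up ages (not the chapter's data):

```python
def mean_absolute_error(true_ages, predicted_ages):
    """Average of |true - predicted| over all test images."""
    if len(true_ages) != len(predicted_ages) or not true_ages:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(t - p) for t, p in zip(true_ages, predicted_ages))
    return total / len(true_ages)

# Hypothetical ground-truth and estimated ages for three test images.
print(mean_absolute_error([20, 30, 40], [22, 27, 45]))
```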

Anjali A. Shejul, Kishor S. Kinage, B. Eswara Reddy
Chapter 28. Randomized Agent-Based Model for Mobile Customer Retention Behaviour Prediction

Due to the development of technology, mobile phones play a crucial role in human life. Phones with multiple SIM cards and individuals using multiple mobile phones are common nowadays. Telecommunication is a major area where big data technologies are needed. Competition among telecommunication companies is high due to customer churn, and customer retention is one of their major problems. In this paper, we propose a Randomized Method (RM) using Map and Reduce big data functions to avoid data duplication in the customer call data of a telecommunication application. We use an agent-based model (ABM) to predict complex customer behaviour regarding retention with a particular telecommunication service. The agent-based model increases prediction accuracy because of the dynamic nature of its agents. The ABM suggests rules based on variable features of mobile users using multiple agents. This paper shows the effectiveness of RM with MapReduce, together with the agent-based model, in predicting customer retention behaviour. The benefit of the proposed system is a simple, cost-effective and flexible prediction model with high business value.
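One common way to remove duplicate records with map and reduce functions is sketched below in plain Python: the map step keys each call record, a shuffle step groups records by key, and the reduce step keeps one record per key. This is a generic deduplication pattern, not the authors' Randomized Method, and the record fields (`caller`, `callee`, `start`) are hypothetical.

```python
from collections import defaultdict

def map_phase(records):
    """Emit (key, record) pairs; records with equal keys count as duplicates."""
    for rec in records:
        yield (rec["caller"], rec["callee"], rec["start"]), rec

def shuffle(pairs):
    """Group values by key, as a MapReduce framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Keep a single record for every key."""
    return [values[0] for values in groups.values()]

calls = [
    {"caller": "A", "callee": "B", "start": "2018-12-13T10:00", "secs": 60},
    {"caller": "A", "callee": "B", "start": "2018-12-13T10:00", "secs": 60},
    {"caller": "C", "callee": "A", "start": "2018-12-13T11:30", "secs": 15},
]
unique_calls = reduce_phase(shuffle(map_phase(calls)))
print(len(unique_calls))
```

In a real cluster the shuffle is performed by the framework and the reduce step runs in parallel per key; the single-process version here only shows the data flow.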

N. Sandhya, Philip Samuel, Mariamma Chacko
Chapter 29. Keyword-Based Approach for Detecting Civil Unrest Events from Social Media

In recent years, online social media platforms such as Twitter, Facebook, and Google+ have become very popular, and protesters actively use them during civil unrest to express their opinions on the protest, communicate their plans, and organize future events. This yields a large amount of data that researchers have used to predict protest activity in the near future. Effective detection of such potentially dangerous misinformation can help ensure public safety with minimum disruption. We identified the correlation between tweets promoting protest and imminent protest activity. We therefore propose a keyword-based approach for analyzing the behavior of a civil unrest event and also build a probabilistic model for classifying civil unrest events. Extensive experimental evaluations were carried out on Twitter datasets from the #Jallikattu, #BusFareHike and #SaveFisherMen civil unrest events to demonstrate the effectiveness and efficiency of our proposed approach.

J. Joslin Iyda, P. Geetha
Chapter 30. Socioeconomic Status Classification of Geographic Regions in Sri Lanka Through Anonymized Call Detail Records

Identifying the socioeconomic status of a given geographical region is important for better governance and policy making in a country. This is significant when both government and private sector organizations are implementing various schemes for the well-being of people. For example, telecommunication providers, the main stakeholders of this project, need to identify the behavioural patterns of their user base to accurately target the relevant users when deploying advertising campaigns and other promotional activities. We propose a prediction model to classify each geographical region in Sri Lanka into a particular socioeconomic status using call detail record (CDR) data. This will ease the classification of the socioeconomic status of geographical regions, unlike traditional methods such as censuses and household surveys, which require a lot of time and money along with many other resources.

W. O. K. I. S. Wijesinghe, C. U. Kumarasinghe, J. Mannapperuma, K. L. D. U. Liyanage

Workshop on the Analysis of Big Data

Frontmatter
Chapter 31. Hand Gesture Based Human-Computer Interaction Using Arduino

In this era of evolving technologies, most human interactions with electronic devices are becoming smart. For elderly and blind people, it may be difficult to use a mouse and keyboard for every operation, especially when watching videos on a computer (increasing or decreasing the volume, play and pause) and when using web browsers (scrolling up or down and switching between tabs). This paper attempts to address this problem by reducing the use of the mouse and keyboard through IoT technologies. Gesture recognition is one of the essential techniques for building a user-friendly interface. This paper deals with a Human-Computer Interaction (HCI) system that consists of hardware components, namely an Arduino UNO and ultrasonic sensors, and software components, namely the Arduino software and Python IDLE, for exchanging information between the user and the machine.
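On the Python side, an ultrasonic reading is typically turned into a hand distance from the echo round-trip time, and the distance is then mapped to an action. The sketch below shows that arithmetic; the distance thresholds and action names are hypothetical and are not taken from the chapter.

```python
SPEED_OF_SOUND_CM_PER_US = 0.0343  # roughly 343 m/s in air at about 20 °C

def echo_to_distance_cm(echo_us):
    """Convert a round-trip echo time in microseconds to a one-way distance in cm."""
    return echo_us * SPEED_OF_SOUND_CM_PER_US / 2.0

def gesture_for(distance_cm):
    """Map a hand distance to an action (illustrative thresholds only)."""
    if distance_cm < 10.0:
        return "play_pause"
    if distance_cm < 25.0:
        return "volume_up"
    if distance_cm < 40.0:
        return "volume_down"
    return "none"

distance = echo_to_distance_cm(1000.0)
print(distance, gesture_for(distance))
```

The division by two accounts for the pulse travelling to the hand and back.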

S. Shreevidya, N. Namratha, V. M. Nisha, M. Dakshayini
Chapter 32. An Automatic Diabetes Risk Assessment System Using IoT Cloud Platform

Diabetes mellitus is a disease that impairs the body’s ability to process blood sugar due to insufficient production of the hormone insulin, the body’s resistance to insulin, or both. There are three types of diabetes: type 1, type 2, and gestational diabetes. Among these, type 2 is common, and it is associated with lifestyle risk factors such as inadequate physical activity, poor diet and increased body mass index, as well as hereditary factors. If it is not managed carefully, diabetes can lead to an accumulation of blood sugar, which can increase the risk of stroke and of heart and kidney diseases. Therefore, the need of the hour is a personalized advisory system that monitors the health condition of the user through sensors, acquires his/her diet and day-to-day activity information through interactive platforms, stores them in a common cloud platform, processes them through machine learning techniques, and provides valid personalized health advice to help manage the condition (American Diabetes Association, Diabetes Care 29:s4–s42, 2006). The proposed system uses ThingSpeak, an IoT cloud platform operated by MathWorks, to which sensor data can be sent for storage, analysis, and visualization with MATLAB or other software, and on which custom applications can be developed. A web application has been developed and made available to users to help them manage diabetes or prevent diabetes and its dangerous complications.
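Sending a reading to ThingSpeak is usually done through its REST update endpoint, which takes a channel write API key and numbered fields as query parameters. The sketch below only builds such a request URL and does not send it. Which quantity goes into which field is an assumption, since the abstract does not describe the channel layout, and `DEMO_KEY` is a placeholder.

```python
from urllib.parse import urlencode

THINGSPEAK_UPDATE_URL = "https://api.thingspeak.com/update"

def build_update_url(write_api_key, sensor_value, activity_minutes):
    """Build a ThingSpeak channel update URL.

    field1 and field2 are hypothetical assignments for this example.
    """
    params = {
        "api_key": write_api_key,
        "field1": sensor_value,
        "field2": activity_minutes,
    }
    return THINGSPEAK_UPDATE_URL + "?" + urlencode(params)

print(build_update_url("DEMO_KEY", 110, 45))
```

Issuing an HTTP GET on the resulting URL, with a real key, would add one entry to the channel.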

M. Sujaritha, R. Sujatha, R. Anitha Nithya, A. Sunitha Nandhini, N. Harsha
Chapter 33. Message and Image Encryption Embedding Data to GF(2^m) Elliptic Curve Point for Nodes in Wireless Sensor Networks

This work proposes encryption of text data and images by embedding the data as elliptic curve points. Elliptic curve cryptography (ECC) is selected to satisfy the constraints of wireless sensor networks because it offers high security with a smaller key bit length. Finite field arithmetic is utilized efficiently in this reconfigurable cryptosystem. MATLAB is employed for pre-computations that convert the text data and image inputs. The architecture is tailored specifically for cryptographic applications and is implemented in Verilog on a cost-efficient Xilinx Spartan-xc3s100e-4-fg320 FPGA. Time and area constraints are analyzed and compared with previous implementations considering both the approaches. The total encryption and decryption time is around 10.09021 μs for 100 × 100 images and 0.029 μs for a message. Computational and combinational path delay is not observed in any module design implementation. The dynamic mapping of input data and a cipher image with high randomness indicate good security, i.e., lower vulnerability to attacks. To evaluate the strength of the proposed method, an entropy statistical analysis is performed on plain and encrypted images. To the authors’ knowledge, mapping of both data and images onto the elliptic curve is considered here for the first time for use in wireless sensor nodes.
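The entropy analysis mentioned above is commonly done with the Shannon entropy of the pixel-value histogram: for 8-bit images, a well-encrypted image should approach 8 bits per pixel. A minimal sketch, assuming pixels are given as a flat list of integer grey levels (the chapter's exact procedure is not stated):

```python
import math
from collections import Counter

def shannon_entropy(pixels):
    """Shannon entropy, in bits per pixel, of a flat sequence of pixel values."""
    total = len(pixels)
    if total == 0:
        raise ValueError("empty image")
    counts = Counter(pixels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

flat_image = [128] * 1024          # a constant image carries no information
uniform_image = list(range(256))   # every grey level appears equally often
print(shannon_entropy(flat_image), shannon_entropy(uniform_image))
```

Comparing the value for a plain image with that of its cipher image gives a quick indication of how uniformly the encryption spreads pixel values.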

G. Leelavathi, K. Shaila, K. R. Venugopal
Chapter 34. Crack Detection in Welded Images: A Comprehensive Survey

This chapter presents a review of different crack detection techniques. Welding crack detection plays a vital role in engineering applications, including engineering machinery, ships, and civil infrastructure. The complexity of the weld structure, the disparity of the welded materials, the variation in surface thermal radiation and the angle between the crack and the weld are found to be the major factors in crack detection. The intention of this survey is to identify which image processing techniques improve performance metrics such as accuracy, sensitivity and specificity.
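The three metrics named in the survey follow directly from a binary confusion matrix (crack versus no crack). The sketch below uses the standard definitions; the counts are invented for illustration.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity (recall) and specificity from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)   # fraction of real cracks that were detected
    specificity = tn / (tn + fp)   # fraction of sound welds correctly passed
    return accuracy, sensitivity, specificity

# Hypothetical counts: 40 cracks found, 10 missed, 45 sound welds passed, 5 false alarms.
print(classification_metrics(tp=40, fp=5, tn=45, fn=10))
```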

L. Mohanasundari, P. Sivakumar
Chapter 35. An Effective Hybridized Classifier Integrated with Homomorphic Encryption to Enhance Big Data Security

Wireless sensor networks and big data have gained a lot of importance in recent years. Linear regression, linear classifiers and neural networks have been examined to secure confidential data and enhance privacy protection. The data produced by millions of wireless sensor networks constitute big data, which is usually gathered and analysed within those networks. The major threats prevailing in wireless sensor networks must therefore be resolved. Hence, we propose an effective hybridized classifier integrated with homomorphic encryption, which shows better performance in evaluation. The evaluation shows that the proposed system achieves a higher accuracy rate.
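To show what homomorphic encryption makes possible, the sketch below uses a toy Paillier cryptosystem, in which multiplying two ciphertexts yields an encryption of the sum of the plaintexts. Paillier is chosen here only as a well-known additively homomorphic scheme; the abstract does not say which scheme the authors use. The primes are tiny and completely insecure, and all names are illustrative.

```python
import math
import random

def keygen(p, q):
    """Toy Paillier key generation from two small primes, with generator g = n + 1."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse; this simple form holds for g = n + 1
    return n, (lam, mu)

def encrypt(n, m, rng):
    """Encrypt an integer m with 0 <= m < n using fresh randomness r."""
    n_sq = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n_sq) * pow(r, n, n_sq)) % n_sq

def decrypt(n, private_key, c):
    lam, mu = private_key
    x = pow(c, lam, n * n)
    return ((x - 1) // n) * mu % n

def add_encrypted(n, c1, c2):
    """Homomorphic addition: the product of ciphertexts encrypts the sum."""
    return (c1 * c2) % (n * n)

n, private_key = keygen(61, 53)
rng = random.Random(0)
c_sum = add_encrypted(n, encrypt(n, 12, rng), encrypt(n, 30, rng))
print(decrypt(n, private_key, c_sum))
```

A sensor node could thus send encrypted readings that an untrusted aggregator sums without ever seeing the individual values.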

R. Udendhran, M. Balamurgan
Chapter 36. AI Powered Analytics App for Visualizing Accident-Prone Areas

Increasing urbanization over the years has resulted in an exponential rise in traffic and, consequently, in the risk to drivers and passengers. Appropriate action is vital for safe driving under various road conditions. With the advent of intelligent vehicular systems, identification and notification of accident-prone zones while driving has become a hot topic. Data analytics could improve driving safety in such regions and thereby save invaluable human lives. Warnings about accident-prone areas are usually given using signboards, which may be overlooked and have no significant impact on drivers passing through these zones. Analytics performed on accident data could provide invaluable insights and thereby enable drivers to be more cautious while approaching or crossing these areas, facilitating a safe journey. In this paper, an app built on the Einstein Artificial Intelligence (AI) platform performs analytics on data collected from sensors for major and minor accidents, stored in Salesforce Cloud, to visualize and share details of zones prone to major accidents via a mobile app. Because actual accident statistics are not available or not appropriately documented, a prototype to detect and collect spatial details of accidents has been developed using a vibration sensor and Global Positioning System (GPS) on Arduino. This simulation further helps drivers identify major crash spots, powered by Einstein AI, on their mobile app.

Preethi Harris, Rajesh Nambiar, Anand Rajasekharan, Bhavesh Gupta

Workshop on Big Data and Society

Frontmatter
Chapter 37. IOT Based Autonomous Inventory Management for Warehouses

In today’s highly competitive world, nearly everything is being developed with technologies such as IoT, digital, cloud, and sensors. Warehouses and inventories currently face many problems, such as heavy human involvement in the work, manual (human) errors, and the large number of workers required to control or manage the process. Humans can sometimes make errors, whereas machines do not. We therefore present a new idea, a movable bar code scanner using IoT (industry automation and smart glasses), to reduce the problems in warehouses and inventories. It can make the industry more advanced, quicker, more efficient, and better digitalized.

A. Madhu Vamsi, P. Deepalakshmi, P. Nagaraj, Akash Awasthi, Anup Raj
Chapter 38. Internal Repeats of Human Organs

In cloud computing, bioinformatics plays a major role, and maintaining the large amount of available protein data with limited resources is a challenge in the life sciences. These biological data contain many protein and DNA sequences. In this paper, protein data for human organs are used to find all the internal repeats within the sequences. A sequence represents the function of a protein that resides in human organs. Finding motifs in sequences is a difficult task with large data. Many websites calculate repeats using various algorithms, but in such cases each sequence must be submitted separately. The major problem is therefore that the repeats of a single FASTA file can be found, but the proteins of the organs of the human body cannot be processed as a whole. To avoid this problem, a database called IRHO (Internal Repeats of Human Organs) is developed and provided as a service for human organ sequences with their internal repeats. Data are produced from the database based on the client query. The repeats stored in the database are identical and similar repeats of human organs. Data are categorized into group enriched, tissue enhanced, and tissue enlarged. In total, protein sequences for 3747 human genes are available. Repeat analysis is carried out on the proteins of the respective tissue or organ. Delivered as a website, the new system can be accessed from anywhere in the world as a cloud computing service.
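A simple way to find identical internal repeats in one protein sequence is to index every substring of a fixed length k and keep those that occur more than once. The sketch below does this for a made-up sequence; it is not the algorithm behind the IRHO database, which the abstract does not describe, and it does not handle similar (inexact) repeats.

```python
from collections import defaultdict

def identical_repeats(sequence, k):
    """Return {substring: [start positions]} for length-k substrings seen at least twice.

    Positions are 0-based; overlapping occurrences are included.
    """
    positions = defaultdict(list)
    for i in range(len(sequence) - k + 1):
        positions[sequence[i:i + k]].append(i)
    return {kmer: pos for kmer, pos in positions.items() if len(pos) > 1}

# Hypothetical amino-acid sequence in one-letter code.
print(identical_repeats("MKTAYMKTAG", 3))
```

Running the function over every sequence of an organ's protein set, rather than one sequence at a time, corresponds to the batch use case the abstract describes.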

B. Ramya, E. S. Samundeeswari
Chapter 39. Bitcoin Prediction and Time Series Analysis

Data analysis is the mechanism that supports the decision-making process and the discovery of useful information by illustrating and evaluating data with statistical and/or logical processes and providing insight into relevant conclusions. Predictive analysis is used to predict trends and behaviour patterns. The predictive model is used to understand how similar units collected from different samples perform in a particular pattern. Cryptocurrency is a digital currency for which unit generation and fund transfers are decentralized and regulated by encryption methodologies. Bitcoin is the first decentralized digital cryptocurrency and has shown significant market capitalization growth in the last few years. It is important to understand what drives the fluctuations of the bitcoin exchange price and to what extent they are predictable. This research work explores how the bitcoin market price is associated with a set of relevant external and internal factors.

Krishna Chakravarty, Manjusha Pandey, Siddharth Routaray
Chapter 40. Smart Active Helmet

As living space becomes wider, new and more sophisticated gadgets are required for human well-being. New technologies and innovations help humankind live comfortably. Prevention is always better than cure, and life should be lived in health and comfort. This research paper describes a specially designed gadget, the smart active helmet, which performs smart, intelligent actions that help extend the human life span. Birth is the most precious gift of God, and saving life is both challenging and essential for people’s prosperity. The new gadget, named “Smart Active Helmet”, helps prevent accidents and saves lives. The smart active helmet performs alcohol testing using a fuel cell sensor. The sensor detects a rider who has consumed alcohol and forwards the details to a central hub in the cloud. After analysing the rider’s blood alcohol content, the hub either allows the rider to continue riding the motorcycle or triggers the ignition interlock device (IID) in the motorcycle to lock it and prevent further operation. This research work gives detailed insight into the helmet’s requirements, roles, functions and salient features, along with its challenges and limitations. Using the smart active helmet can thus help prevent road accidents and thereby provide a safe environment.

W. Gracy Theresa, A. Gayathri
Backmatter
Metadata
Title
EAI International Conference on Big Data Innovation for Sustainable Cognitive Computing
Editors
Prof. Dr. Anandakumar Haldorai
Arulmurugan Ramu
Dr. Sudha Mohanram
Dr. Chow Chee Onn
Copyright Year
2020
Electronic ISBN
978-3-030-19562-5
Print ISBN
978-3-030-19561-8
DOI
https://doi.org/10.1007/978-3-030-19562-5