2019 | Book

Data Science and Big Data Analytics

ACM-WIR 2018

Edited by: Dr. Durgesh Kumar Mishra, Prof. Xin-She Yang, Dr. Aynur Unal

Publisher: Springer Singapore

Book series: Lecture Notes on Data Engineering and Communications Technologies

About this book

This book presents conjectural advances in big data analysis, machine learning and computational intelligence, as well as their potential applications in scientific computing. It discusses major issues pertaining to big data analysis using computational intelligence techniques, and the conjectural elements are supported by simulation and modelling applications to help address real-world problems. An extensive bibliography is provided at the end of each chapter. Further, the main content is supplemented by a wealth of figures, graphs, and tables, offering a valuable guide for researchers in the field of big data analytics and computational intelligence.

Table of Contents

Frontmatter
A Study of the Correlation Between Internet Addiction and Aggressive Behaviour Among the Namibian University Students

The explosion of online Social Networking Sites over time has its benefits as well as its risks. One potential risk is that many individuals have become victims of aggression and cyber-bullying via online Social Networking Sites. This study analyses the correlation between Internet addiction and aggressive behaviour among Namibian university students. Based on statistical analysis, the paper concludes that there is a worthwhile correlation between Internet addiction and aggressive behaviour, and that a sizable majority of the students who participated in the study suffer from moderate addiction problems due to their Internet usage. The results also indicate that the two most prevalent forms of aggression among the majority of the students are hostility and physical aggression.

Poonam Dhaka, Cynthia Naris
Genetic Algorithm Approach for Optimization of Biomass Estimation at LiDAR

Biomass at the ICESat/GLAS footprint level was estimated by incorporating information from various sensors, viz. spaceborne LiDAR (ICESat/GLAS). The biomass estimation accuracy of a Genetic Algorithm was studied by optimizing the waveform parameters. A multiple linear regression equation was generated using the most important variables found by the Genetic Algorithm. The results of the study were very encouraging. Optimum results were obtained using the top 18 parameters derived from the GLAS waveform. The biomass was predicted over a small area by the Genetic Algorithm with an R² of 63% and an RMSE of 18.94 t/ha using the best six variables, viz. wdistance, wpdistance, R50, Ecanopy, Home., wcentroid, out of the 18 independent variables. The same methodology can be used for biomass estimation over a large area. The estimation was done for the Tripura area. The study established that the genetic approach produced the better result for predicting AGB. The main outcome of the study was the formulation of an approach that could result in higher biomass estimation accuracies.

Sonika, Aditi Jain
E-alive: An Integrated Platform Based on Machine Learning Techniques to Aware and Educate Common People with the Current Statistics of Maternal and Child Health Care

Data science finds a variety of applications in day-to-day life. Its practical uses can help improve the lifestyle and health standards of individuals in society. This paper proposes an intelligent tool, called E-alive, built to sensitize people to maternal and child health care. The tool serves as an integrated platform for rural and urban people, government officials and policy makers to actively participate in and analyse the current statistics of various parameters, such as infant mortality rate, life expectancy ratios for females and males individually, female and male sterilization rates and maternal mortality rates, for the subsequent years. This can help them take quality decisions in order to improve upon the predicted values. Further, the tool can assist in classifying the educational status of an individual, community and state on the basis of total fertility rates. This implies that the awareness factor among the people of the respective community or state, and the total fertility rate, can be predicted by this tool for future years. The current analysis examines two government schemes in detail: the Swadhar Scheme and Janani Suraksha Yojana. Other analysis factors include Life Expectancy Ratio, Education Details, Maternal Mortality Rate and the Contraceptive Methods used by people in major cities.

Garima Malik, Sonakshi Vij, Devendra Tayal, Amita Jain
An Effective TCP’s Congestion Control Approach for Cross-Layer Design in MANET

The rapid growth of communication technology has given rise to strong research interest in wireless networks. Recently, many researchers have focused on designing routing schemes that work efficiently in the real-time environment of wireless networks such as MANETs. All the nodes in a mobile ad hoc network (MANET) cooperatively maintain network connectivity. Exploiting the dependencies and interactions between layers has been shown to increase performance in certain wireless networking scenarios. Although layered architectures have served well for wired networks, they are not suitable for wireless networks. The standard TCP congestion control mechanism cannot handle the special properties of a shared wireless channel. TCP congestion control works very well on the Internet. However, mobile ad hoc networks exhibit properties that greatly affect the design of appropriate protocols and protocol stacks in general, and of the congestion control mechanism in particular. This paper proposes a design approach that departs from the traditional network design and improves the cross-layer interaction among different layers, namely physical, MAC and network. The cross-layer design approach for congestion control uses the TCP-Friendly Rate Control (TFRC) mechanism to maintain congestion control between the nodes and to find an effective route between the source and the destination. The approach was tested by simulation (NS2 simulator), and its performance over AODV was found to be better.

Pratik Gite, Saurav Singh Dhakad, Aditya Dubey
Baron-Cohen Model Based Personality Classification Using Ensemble Learning

These days, intelligence is not the only factor for judging the personality of a human being. Rather, the emotional quotient (EQ) and the systemizing quotient (SQ) play a major role in classifying a person's personality in many areas. Using these quotients, we can foresee one's personality in society. The broad classification of personality on the basis of EQ and SQ scores has been well researched using machine learning techniques with varying degrees of accuracy. In the present research work, the performance of different classification techniques has been enhanced using ensemble learning, in which various combinations of classification models with different permutations were tried. The ensemble learning technique increases the accuracy in most of the cases in the present work.

Ashima Sood, Rekha Bhatia
Investigation of MANET Routing Protocols via Quantitative Metrics

This paper analyses MANET routing protocols of different categories, using various parameters over various scenarios. The routing protocols are verified for different sets of data, and on this basis the protocol best suited for data transmission among the existing protocols is identified. To study the performance of many routing protocols during data exchange, we generated assumed progress over many MANETs consisting of different pairs of source and destination nodes. The simulation was carried out using NS-3, an open-source simulator. We generated and analyzed scenarios in which the effects of rapidly increasing network mobility on data communication are evaluated and network data traffic is analyzed. The work is useful for researchers working on MANET problems such as attacks, Quality-of-Service and the effects of an increasing number of nodes on various parameters, helping them learn which routing protocol best suits their work.

Sunil Kumar Jangir, Naveen Hemrajani
Review on Internet Traffic Sharing Using Markov Chain Model in Computer Network

Internet traffic sharing is one of the major concerns in the dynamic field of Information and Communications Technology (ICT). In this scenario the concept of Big Data arises, which reflects the unstructured nature of data, so there is a strong need for efficient techniques to tackle this heterogeneous environment. Many things depend on the Internet today, and a person has a lot of work to do with its help. As a result, problems arise such as congestion, disconnectivity, non-connectivity, call drop, and cyber crime. This review study analyses all these types of problems. Various methods are discussed according to how the problem of Internet access is formulated, and their respective solutions are derived with the help of the Markov chain model. This model is used to study how the quality of service is obtained and how the traffic share is distributed among the operators on the basis of state probability, share loss analysis, call-by-call attempt, two-call attempt, two market, disconnectivity, index, iso-share curve, elasticity, cyber crime, re-attempt, least square curve fitting, bounded area, area estimation and computation, rest state, and multi-operator environment.
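As a minimal sketch of the state-probability idea mentioned in the abstract, the following Python snippet propagates a user's operator choice through a two-state Markov chain. The two-operator setup and the transition probabilities are illustrative assumptions, not values from the reviewed papers, which also model additional states such as a rest state.

```python
def step(dist, P):
    """One Markov step: new_dist[j] = sum over i of dist[i] * P[i][j]."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


def traffic_share(P, start, attempts):
    """State probabilities after a given number of call attempts."""
    dist = list(start)
    for _ in range(attempts):
        dist = step(dist, P)
    return dist


# Hypothetical transition matrix: row i gives the probability that a user
# currently on operator i stays or switches at the next attempt.
P = [
    [0.9, 0.1],  # operator 0: stay with 0.9, switch with 0.1
    [0.2, 0.8],  # operator 1: switch with 0.2, stay with 0.8
]

share = traffic_share(P, start=[1.0, 0.0], attempts=100)
print([round(s, 4) for s in share])
```

With these assumed probabilities the long-run share approaches two-thirds for operator 0 and one-third for operator 1, regardless of the starting operator.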

Sarla More, Diwakar Shukla
Anomaly Detection Using Dynamic Sliding Window in Wireless Body Area Networks

Anomaly detection is one of the critical challenges in Wireless Body Area Networks (WBANs). Faulty measurements in applications like health care lead to high false alarm rates in the system, which may sometimes even endanger human life. The main motivation of this paper is to decrease false alarms, thereby increasing the reliability of the system. We propose a method for detecting anomalous measurements that uses a dynamic sliding window instead of a static sliding window, with a Weighted Moving Average (WMA) for prediction. The proposed method compares the difference between the predicted value and the actual sensor value with a varying threshold. If the average of the number of parameters exceeds the threshold, a true alarm is raised. Finally, we evaluate the performance of the proposed model using a publicly available dataset and compare it with existing approaches. The accuracy of the proposed system is evaluated with statistical metrics.
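The prediction-and-compare step described above can be sketched as follows. This is a simplified illustration, not the authors' method: it assumes a fixed window size, a fixed threshold, linearly increasing weights, and it replaces a flagged reading with its prediction so that one outlier does not distort later predictions. The paper's dynamic window and varying threshold are not reproduced here.

```python
def wma_predict(window):
    """Weighted moving average; newer readings get larger weights (1, 2, ..., n)."""
    weights = range(1, len(window) + 1)
    return sum(w * x for w, x in zip(weights, window)) / sum(weights)


def flag_anomalies(readings, window_size=3, threshold=10.0):
    """Return indices whose reading deviates from the WMA prediction by more than threshold.

    window_size and threshold are assumed, fixed values chosen for illustration.
    """
    history = list(readings[:window_size])  # first readings are trusted as-is
    flagged = []
    for i in range(window_size, len(readings)):
        predicted = wma_predict(history[-window_size:])
        if abs(readings[i] - predicted) > threshold:
            flagged.append(i)
            history.append(predicted)  # assumption: substitute the prediction
        else:
            history.append(readings[i])
    return flagged


heart_rate = [72, 73, 74, 120, 75, 74]  # made-up sensor values
print(flag_anomalies(heart_rate))  # → [3]
```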

G. S. Smrithy, Ramadoss Balakrishnan, Nikita Sivakumar
Effective Healthcare Services by IoT-Based Model of Voluntary Doctors

Developing countries like India face a dearth of skilled doctors, and with various health challenges and high population growth, many patients need hospital treatment. In this research, we propose a model for a system that connects doctors to hospitals that need their expertise for treating patients. The proposed system captures patient healthcare data through various sensors attached to the patient's body, stores it in a database and transmits it to the cloud. Based on this healthcare data and the treatment requirement, voluntary doctors suggest appropriate treatment and medicine doses. This may save the life of the patient, and the platform may also help doctors share opinions by analyzing the changing patient healthcare data as it is captured.

Bharat B. Prajapati, Satyen M. Parikh, Jignesh M. Patel
Emotional State Recognition with EEG Signals Using Subject Independent Approach

EEG signals vary from human to human, and hence it is very difficult to create a subject-independent emotion recognition system. Even though subject-dependent methodologies can achieve good emotion recognition accuracy, subject-independent approaches are still in their infancy. EEG is more reliable than facial expressions or speech signals for recognizing emotions, since it cannot be faked. In this paper, a Multilayer Perceptron neural network based subject-independent emotion recognition system is proposed. Performance evaluation of the proposed system on the benchmark DEAP dataset shows good accuracy compared to state-of-the-art subject-independent methods.

Pallavi Pandey, K. R. Seeja
Development of Early Prediction Model for Epileptic Seizures

Epilepsy is a neurological disorder of the brain's electrical system that causes seizures, during which the brain and body behave abnormally (Yadollahpour, Jalilifar, Biomed Pharmacol J 7(1):153–162, 2014) [1]. Epilepsy is the result of recurrent seizures: a person who has a single seizure in their whole life is not considered to have epilepsy, whereas a person who has more than two seizures is. About 0.8–1% of the world's population is affected by epilepsy. It cannot be cured, but it can be controlled with anti-epileptic medicine or resective surgery; even so, in 25% of epileptic patients no present therapy controls the condition. Because epilepsy is unpredictable, it increases the risk of dangerous accidents during activities such as operating heavy machinery, driving a car, cooking or swimming. Patients also live in constant fear of the next seizure, which affects their daily lives. To minimize the risk and improve the quality of life of such patients, it is necessary to predict a seizure before its onset. The present study uses an EEG database of 21 patients containing 80 seizures to learn 336 predictive models with four different classifiers: ANN, KNN, MC-SVM using the 1-against-1 approach and MC-SVM using the 1-against-all approach. The models make it possible to predict a seizure 25 min before onset with a maximum average accuracy of 98.19% and a sensitivity of 98.97%, and 30 min before onset with a maximum average accuracy of 98.04% and a sensitivity of 98.85%.

Anjum Shaikh, Mukta Dhopeshwarkar
Research Issue in Data Anonymization in Electronic Health Service: A Survey

Today, rapid technological change is transforming the day-to-day activities of human beings. Healthcare data and practice also make use of these technologies, which change the way data is handled. The Electronic Health Service (EHS) increasingly collects large amounts of sensitive patient data that is used by patients, doctors and other data analysts. When using EHS we should be concerned with the security and privacy of medical data, because medical data is highly sensitive due to its personal nature. Privacy is especially critical when sensitive data is released for medical data analysis or medical research, so the data should first be sanitized or anonymized before it is released. Data anonymization is the removal or hiding of personal identifiers such as name, ID, and SSN from health datasets so that individuals cannot be identified by the recipient of the data. Different models and techniques are used to anonymize data. This paper is a survey on data anonymization in the Electronic Health Service (EHS).

Amanuel Gebrehiwot, Ambika V. Pawar
Prediction of Cervical Cancer Based on the Life Style, Habits, and Diseases Using Regression Analysis Framework

Cervical cancer is the most common disease in women nowadays. Even though it is a frightening disease, it can be controlled and prevented by detecting the symptoms of the growing cancer. The disease forms in the genital area of the woman and later spreads to all parts of the body, causing the organs to stop functioning and the system to collapse. Condylomatosis, wrong sexual practices and hormonal contraceptives are among the major primary factors that make it easy to develop cervical cancer via the Human Papilloma Virus. The secondary factors causing cervical cancer are smoking and alcohol use. Along with these factors, molluscum contagiosum, HIV and Hepatitis B also make people more susceptible to cervical cancer. All these factors are to be considered when analysing whether a patient has been affected by cervical cancer. A regression analysis model framework is used for comparing the various factors to determine the disease.

K. Vinoth Raja, M. Thangamani Murugesan
Novel Outlier Detection by Integration of Clustering and Classification

A unique method of outlier detection that integrates clustering and classification is proposed here. The algorithm is divided into two phases: the first phase applies the classical DBSCAN algorithm to the data set, and the second phase applies a decision tree classification algorithm. Analysis of the algorithm indicates that the accuracy of unwanted data detection is high in the proposed method.

Sarita Tripathy, Laxman Sahoo
A Review of Wireless Charging Nodes in Wireless Sensor Networks

The sensing and computation capabilities of sensors have created a new class of devices, and such devices are increasingly deployed in wireless sensor networks (WSNs). Wireless sensor networks use customized sensor nodes that are placed in open areas as well as public places in large numbers, which creates problems for scholars and designers of the wireless network. These problems include security, data routing and the processing of bulk amounts of data, as well as the lifetime of sensor nodes: battery power is limited, and charging or replacing batteries is sometimes not possible. This paper describes the concepts of energy-efficient wireless charging of sensor nodes in WSNs. We explore these concepts with a detailed literature review and a comparison of well-known works. This helps new scholars choose among existing techniques and explore energy transfer to sensor nodes in wireless sensor networks further.

Shiva Prakash, Vikas Saroj
Leeway of Lean Concept to Optimize Big Data in Manufacturing Industry: An Exploratory Review

Implementation of the lean concept is the most recent trend in manufacturing industries. Enterprise resource planning (ERP) solutions such as SAP, Oracle, and BAAN IV are used to carry out day-to-day activities for operational convenience. All these activities are recorded and generate very large amounts of data, and managers or strategic decision-makers rely heavily on this data for decision-making. The challenging task is to manage such big data of a manufacturing company using the lean concept. The present study comprises two parts: the lean principle and the data optimization concept for manufacturing activities. We focus on integrating lean principles for the optimization of big data for efficient and effective decision-making, and also attempt to summarize the experience gained from the study. Big data generally stands for datasets that may be recorded and analyzed computationally to reveal trends and patterns. Storing these quantities of data requires large space and carries a considerable cost in the present era of development. For a manufacturing unit, big data can help improve product quality and bring transparency to work practices, which can untangle uncertainties such as the inconsistent availability and performance of machines and the assembly shop as a system. The desired transparency and predictive manufacturing require large amounts of data and advanced prediction tools to turn this big data into useful information. Lean is applicable to minimizing the waste generated by big data collection. Applying lean principles to manage the big data of a manufacturing process is a way of minimizing GIGO (garbage in, garbage out), reducing the data cost and the data processing time for managers' decision-making. The present study offers a new approach to managing big data accurately using lean principles.

Hardik Majiwala, Dilay Parmar, Pankaj Gandhi
Monitoring Public Participation in Multilateral Initiatives Using Social Media Intelligence

Governments, multilateral agencies like the World Bank, United Nations, and Development Banks, as well as other nonprofits, are involved in a variety of developmental activities across the world. A lot of resources are spent to ensure proper consultations and post-implementation verification of results, but this does not fully ensure that the objectives are achieved. New web technologies have provided methodologies and tools that allow users to pool resources on projects over the Internet. Social media allows citizens to give real-time feedback while monitoring the developmental initiatives of governments and multilateral agencies. Technology ensures that the consultations and ongoing feedback can be captured, analyzed, and used in understanding the stakeholder reactions to the project and its implementation. This helps in making necessary course corrections and avoiding costly mistakes and overruns. In this paper, we model a tool to monitor, study, and analyze popular feedback, using forums, social media, surveys, and other crowdsourcing techniques. The feedback is gathered and analyzed using both quantitative and qualitative methods to understand what the crowd is saying. The summation and visualization of patterns are automated using text mining and sentiment analysis tools, including text analysis and tagging/annotation. These patterns provide insight into popular feedback and sentiment more effectively and accurately than the conventional method. The model is created by integrating such feedback channels. Data is collected and analyzed, and the results are presented using tools developed on an open-source platform.

Ulanat Mini, Vijay Nair, K. Poulose Jacob
An Efficient Context-Aware Music Recommendation Based on Emotion and Time Context

With the enormous growth of Internet facilities, users find it difficult to choose music that suits their current mindset. Context-aware recommendation has become a well-established technique that recommends music based on the mindset of the user in various contexts. To enhance the potential of music recommendation, emotion and time interval are considered the most important contexts. The emotion context has not been explored because of the difficulty of acquiring emotions from users' microblogs about a particular piece of music. This paper proposes an algorithm to extract the emotions of a user from microblogs during different time intervals and represent them at different granularity levels. Each music piece crawled from the online YouTube repository is represented in a triplet format: (User_id, Emotion_vector, Music_id). These triplet associations are used to develop several emotion-aware techniques that provide music recommendations. Several trials of experimentation demonstrate that the proposed method with user emotional context enhances the recommendation performance in terms of hit rate, precision, recall, and F1-measure.
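The evaluation metrics named in the abstract can be illustrated with a short Python sketch. The recommendation list and the set of relevant tracks are made-up examples, and treating "hit" as "at least one relevant track was recommended" is one common convention assumed here, not necessarily the paper's definition.

```python
def evaluate(recommended, relevant):
    """Per-user hit flag, precision, recall and F1 for one recommendation list."""
    hits = len(set(recommended) & set(relevant))
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"hit": hits > 0, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical music IDs for a single user.
recommended = ["m1", "m2", "m3", "m4"]
relevant = ["m2", "m4", "m7"]

scores = evaluate(recommended, relevant)
print(scores)
```

Here two of the four recommended tracks are relevant, so precision is 1/2, recall is 2/3 and F1 is 4/7. A hit rate over many users would be the fraction of users whose `hit` flag is true.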

C. Selvi, E. Sivasankar
Implementation of Improved Energy-Efficient FIR Filter Using Reversible Logic

The demand for high-speed processing has been increasing as a result of expanding computer and signal processing applications. Nowadays, reducing time delay and power consumption is a main concern in circuit design. One of the main advantages of reversible logic gates is that they reduce heat dissipation and improve circuit performance. Reversible logic gates are used for building complex circuits such as multipliers, adders, FIR filters, and many more, while reducing heat dissipation. The FIR (finite impulse response) filter is used in a wide range of digital signal processing applications. This paper describes a reversible Vedic FIR filter and compares it with an irreversible Vedic FIR filter.

Lavisha Sahu, Umesh Kumar, Lajwanti Singh
A Study on Benefits of Big Data for Healthcare Sector of India

Big data has taken the world by storm. Due to the tremendous amount of data being generated in each and every field, the use of big data has dramatically increased. Health is the heart of a nation, and thus healthcare is one of the unavoidable and best examples when discussing applications of big data in today's era. Just as western countries like the US leverage big data for everything from the simplest to the most complicated tasks, India can also utilize its potential. In the present paper, we start with an overview of the healthcare sector of India in urban as well as rural areas, follow with the general merits of big data in healthcare as well as domain-specific uses, and end with a broad framework depicting big data in the context of the healthcare sector of India.

Komal Sindhi, Dilay Parmar, Pankaj Gandhi
Handling Uncertainty in Linguistics Using Probability Theory

Uncertainty is the lack of knowledge, or insufficient information. In this paper, we mainly discuss uncertainty occurring in natural language. Numerous natural language processing techniques can be applied to minimise linguistic ambiguities. We discuss one of the most widely used techniques: probability theory. An attempt is then made to resolve linguistic uncertainty using the theory.

Annapurna P. Patil, Aditya Barsainya, Sairam Anusha, Devadas Keerthana, Manoj J. Shet
Review of Quality of Service Based Techniques in Cloud Computing

Nowadays, cloud computing is an emerging technology for enhancing our daily life. Cloud computing provides the facility to pay per use. Important cloud service providers such as IBM, Google, Amazon, and Microsoft offer different services to users. Cloud providers offer services based on the expected quality requirements, so it is a big challenge for the cloud to provide enhanced Quality of Service (QoS) to its users. In this paper, we present the main existing research on different QoS management techniques in cloud computing and compare their strengths and weaknesses. This review will help new researchers in this field gain exposure to QoS-based management techniques.

Geeta, Shiva Prakash
Skyline Computation for Big Data

From a multidimensional dataset, a skyline query extracts the data that satisfy the multiple preferences given by the user. The real challenge in skyline computation is to retrieve such data in the optimum time. When the datasets are huge, the challenge becomes critical. In this paper, we address exactly this issue, focusing on big data. To do so, we utilize the correlations observed in the user queries. These correlations and the results of historical skyline queries executed on the same dataset are very helpful in optimizing the response time of further skyline computations. For the same purpose, we earlier proposed a novel structure, the Query Profiler (QP). In this paper, we present a technique called SkyQP to assert the effectiveness of this concept on big data. We also present the time and space analysis of the proposed technique. The experimental results obtained assert the efficacy of the SkyQP technique.
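To make the notion of a skyline query concrete, here is a minimal Python sketch of the naive pairwise-comparison approach. It is not the SkyQP technique or the Query Profiler described above; the hotel data and the "smaller is better in every dimension" preference are illustrative assumptions.

```python
def dominates(a, b):
    """a dominates b if a is no worse in every dimension and strictly better in at least one.

    Assumes smaller values are preferred in all dimensions.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline(points):
    """Return the points that no other point dominates (naive O(n^2) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical hotels as (price, distance to beach in km).
hotels = [(50, 8), (60, 2), (80, 1), (70, 5), (90, 9)]
print(skyline(hotels))  # → [(50, 8), (60, 2), (80, 1)]
```

The quadratic scan is what becomes costly on huge datasets, which is the motivation the abstract gives for reusing the results of earlier queries.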

R. D. Kulkarni, B. F. Momin
Human Face Detection Enabled Smart Stick for Visually Impaired People

The present work enhances the capabilities of a newly developed smart stick (Sharma et al Multiple distance sensors based smart stick for visually impaired persons, Las Vegas, pp 1–5, 2017 [1]) by detecting human faces using the Pi camera on a Raspberry Pi board. Visually impaired people can use this stick developed by us (Sharma et al Multiple distance sensors based smart stick for visually impaired persons, Las Vegas, pp 1–5, 2017 [1]) to locate static and dynamic obstacles using multiple distance sensors, and can now also detect the presence of a human in front of the user. The problem of human face detection with simple and complex backgrounds is addressed in this paper using a Haar-cascade classifier. The Haar classifier was chosen because it does not require high computational cost while maintaining accuracy in detecting single as well as multiple faces. Experiments were performed with the smart stick in indoor and outdoor unstructured environments. The stick successfully detects human faces and generates alerts in the form of vibration in the stick as well as audio in a headphone. OpenCV-python is used to implement the Haar-cascade classifier, and an accuracy of ≈98% is achieved with this setup.

Shivam Anand, Amit Kumar, Meenakshi Tripathi, Manoj Singh Gaur
Web-Based Service Recommendation System by Considering User Requirements

In this age of the Internet and service delivery, almost all kinds of services and products are available online for selection and use. In addition, for a single kind of product or service, a number of different vendors and service providers exist, and all of them claim to provide the most valuable services. In this context, a service recommendation system is required to compare services and find the one appropriate for the end client. The aim of this recommendation system design is to understand the client's current requirements and explore the database to retrieve the most suitable services. To demonstrate the issues and solution in this domain, a real-world problem, namely a hotel booking service, is used. The recommendation problem is treated as a search over a structured data source. To find suitable outcomes from the proposed working model, a quantum genetic technique is used. The technique first accepts the dataset information and the user requirements, after which the information is encoded in binary values. The query sequence is treated as a binary string of all 1s. Finally, the genetic algorithm is applied to find the fit solution among all the available binary sequences. The seeds generated by the genetic algorithm are treated as the final recommendations of the search system, and the fitness values are used to rank the solutions. The implementation and result evaluation are performed using Java technology. The performance is then reported in terms of time and space complexity. Both performance parameters demonstrate the acceptability of the work.
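The binary encoding and fitness ranking described above can be sketched in Python as follows. This shows only the encoding and ranking idea under assumed data: the hotel records, the three requirements and the "one bit per satisfied requirement" scheme are hypothetical, and the genetic search itself (selection, crossover, mutation, quantum operators) is omitted.

```python
def encode(hotel, requirements):
    """One bit per user requirement: 1 if the hotel satisfies it, else 0."""
    return [1 if check(hotel) else 0 for check in requirements]


def fitness(bits):
    """Agreement with the all-ones query string, i.e. the number of satisfied requirements."""
    return sum(bits)


# Hypothetical user requirements expressed as predicates.
requirements = [
    lambda h: h["price"] <= 100,
    lambda h: h["wifi"],
    lambda h: h["rating"] >= 4.0,
]

# Hypothetical hotel records.
hotels = [
    {"name": "A", "price": 90, "wifi": True, "rating": 4.5},
    {"name": "B", "price": 120, "wifi": True, "rating": 4.2},
    {"name": "C", "price": 80, "wifi": False, "rating": 3.5},
]

# Rank candidates by fitness, best first.
ranked = sorted(hotels, key=lambda h: fitness(encode(h, requirements)), reverse=True)
print([h["name"] for h in ranked])  # → ['A', 'B', 'C']
```

Because the query is all 1s, a candidate's agreement with the query reduces to counting its 1 bits, which is why the fitness function is a simple sum in this sketch.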

Neha Malviya, Sarika Jain
Unsupervised Machine Learning for Clustering the Infected Leaves Based on the Leaf-Colors

In data mining, clustering is one of the important processes for categorizing elements into groups whose members are similar in their features. In this paper, plant leaves are grouped based on the colors in the leaves. Three categories are specified in total: leaves with more green, leaves with yellowish shades and leaves with reddish shades. The task is performed using image processing. The leaf images are processed in sequence: image preprocessing, segmentation, feature extraction, and clustering. Preprocessing denoises and enhances the image and fixes the background color to improve the result. Then, color-based segmentation is applied to the preprocessed image to generate sub-images by clustering the pixels based on their colors. Next, basic features such as entropy, mean, and standard deviation are extracted from each sub-image. The extracted features are used for clustering the images based on the colors. The image clustering is done by a neural network architecture, the self-organizing map (SOM), and by the K-Means algorithm. They are evaluated with various distance measuring functions. Finally, the city-block distance in both methods produced clusters of the same size. This cluster set can be used as a training set for leaf classification in the future.

K. Ashok Kumar, B. Muthu Kumar, A. Veeramuthu, V. S. Mynavathi
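
The clustering step in the abstract above can be sketched in Python with the city-block (Manhattan) distance. This is a minimal sketch under stated assumptions: the feature vectors, the initial centroids, and the component-wise median update (a common choice when clustering with city-block distance) are illustrative and are not details taken from the paper.

```python
from statistics import median


def city_block(a, b):
    """City-block (Manhattan) distance between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)),
                key=lambda j: city_block(p, centroids[j]))
            for p in points]


def cluster(points, centroids, max_iter=20):
    """Alternate assignment and median update until the labels stop changing."""
    centroids = [tuple(c) for c in centroids]
    labels = assign(points, centroids)
    for _ in range(max_iter):
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(median(col) for col in zip(*members))
        new_labels = assign(points, centroids)
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centroids


# Hypothetical (mean, standard deviation) features for six leaf sub-images.
features = [(0.20, 0.05), (0.22, 0.06),   # mostly green
            (0.50, 0.10), (0.52, 0.11),   # yellowish
            (0.80, 0.20), (0.82, 0.19)]   # reddish
initial = [features[0], features[2], features[4]]

labels, centroids = cluster(features, initial)
print(labels)
```
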
Real-Time Big Data Analysis Architecture and Application

Real-time big data analysis systems are systems that process big data within a given deadline or time limit. Such systems analyze big data drawn from a real-world environment in order to predict solutions to real-world problems. In this paper, we deal with the architecture of this type of system, that is, its basic structure, and with its applications in different areas. We also divide these systems into two main categories: real-time systems and near real-time systems.

Nandani Sharma, Manisha Agarwal
Missing Value Imputation in Medical Records for Remote Health Care

In remote areas where the scarcity of doctors is evident, health kiosks are deployed to collect primary health records of patients, such as blood pressure and pulse rate. However, the symptoms in the records are often imprecise due to measurement error and contain missing values for various reasons. Moreover, the medical records contain multivariate symptoms with different data types, and a particular symptom may be the cause of more than one disease. The records collected in health kiosks are not adequate, so imputing missing values by analyzing such a dataset is a challenging task. In the paper, the imprecise medical datasets are fuzzified, and the fuzzy c-means clustering algorithm is applied to group the symptoms into different disease classes. Missing symptom values are imputed using linear regression models corresponding to each disease, using fuzzified input from 1000 patients’ health-related data obtained from the kiosk. With the imputed symptom values, new patients are diagnosed into appropriate disease classes with 97% accuracy. The results are verified against ground truth provided by the experts.

Sayan Das, Jaya Sil
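
The regression-based imputation mentioned in the abstract above can be illustrated with a small Python sketch. It assumes that the records already belong to a single disease class and uses one predictor symptom. The fuzzification and fuzzy c-means steps are omitted, and the field names (`pulse`, `systolic`) and the values are hypothetical rather than data from the paper.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def impute(records, predictor, target):
    """Fill missing target values (None) from a line fitted on complete records."""
    complete = [r for r in records if r[target] is not None]
    slope, intercept = fit_line([r[predictor] for r in complete],
                                [r[target] for r in complete])
    filled = []
    for r in records:
        if r[target] is None:
            r = dict(r, **{target: slope * r[predictor] + intercept})
        filled.append(r)
    return filled


# Hypothetical kiosk records for one disease class; one reading is missing.
records = [
    {"pulse": 60, "systolic": 100},
    {"pulse": 70, "systolic": 120},
    {"pulse": 80, "systolic": 140},
    {"pulse": 75, "systolic": None},
]

imputed = impute(records, "pulse", "systolic")
print(imputed[3]["systolic"])
```

In a per-disease setting, one such model would be fitted for each disease class, and a record would be imputed with the model of the class it is assigned to.
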
Recommendation Framework for Diet and Exercise Based on Clinical Data: A Systematic Review

Nowadays, diet and exercise recommender frameworks are gaining increasing attention because of their importance for living a healthy lifestyle. Due to the increased use of the web, people obtain relevant health information regarding their medical problems and available medications. Since diseases have a strong relationship with food and exercise, it is especially important for patients to focus on adopting good food habits and a regular exercise routine. Most existing diet systems concentrate on recommending suitable food items by considering users' food choices or medical issues. These frameworks provide functionality to monitor nutritional requirements and additionally suggest, in an interactive way, that clients change their eating behavior. In this paper, we present a review of diet and physical activity recommendation frameworks for people suffering from specific diseases. We show the progress made towards recommendation frameworks that help clients find customized, complex medical facilities or make preventive health care measures available to them. We identify a few challenges for diet and exercise recommendation frameworks that need to be addressed in sensitive areas like health care.

Vaishali S. Vairale, Samiksha Shukla
Security Assessment of SAODV Protocols in Mobile Ad hoc Networks

The basic requirement in a mobile ad hoc network (MANET) is to achieve secure routing. The dynamic characteristics of a MANET pose many challenges to achieving security parameters such as availability, integrity, confidentiality, authentication, and non-repudiation. Malicious nodes exploit vulnerable routing protocols to hinder normal routing operation. The real challenge in achieving secure routing is that the secure versions of the routing protocols are also vulnerable to routing attacks. In this paper, we assess and compare the security of the Ad hoc On-demand Distance Vector (AODV) routing protocol and the Secure Ad hoc On-demand Distance Vector (SAODV) routing protocol under different types of routing attack, such as blackhole and replay attacks.

Megha Soni, Brijendra Kumar Joshi
Secure Sum Computation Using Homomorphic Encryption

Secure sum allows cooperating parties to compute the sum of their private data without revealing their individual data to one another. Many secure sum protocols exist in the literature. Most of them assume the network to be secure. In this paper, we drop that assumption and provide a protocol that is applicable to insecure networks as well. We use an additive homomorphic encryption technique for secure sum computation.

Rashid Sheikh, Durgesh Kumar Mishra
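
To illustrate what additive homomorphic encryption offers for secure sum, the following Python sketch uses a toy version of the Paillier cryptosystem. The abstract above does not name the scheme or the protocol steps, so the choice of Paillier, the tiny primes, the party values, and all function names are assumptions made for illustration only. The key size is far too small for real use, and a real implementation would draw randomness from a cryptographically secure source.

```python
import math
import random


def keygen(p, q):
    """Toy Paillier key pair with generator g = n + 1."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse (Python 3.8+)
    return n, (lam, mu)


def encrypt(n, m, rng):
    """Encrypt m (0 <= m < n) under public key n."""
    n2 = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def add(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return (c1 * c2) % (n * n)


def decrypt(n, private_key, c):
    lam, mu = private_key
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


rng = random.Random(0)             # fixed seed, for a repeatable demo only
n, private_key = keygen(1009, 1013)

private_values = [120, 45, 300]    # hypothetical inputs, one per party
ciphertexts = [encrypt(n, v, rng) for v in private_values]

encrypted_sum = ciphertexts[0]
for c in ciphertexts[1:]:
    encrypted_sum = add(n, encrypted_sum, c)

total = decrypt(n, private_key, encrypted_sum)
print(total)
```

Only ciphertexts travel between parties in this sketch, so an eavesdropper on the network sees no individual value. Only the holder of the private key can decrypt, and it needs to decrypt only the combined ciphertext.
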
Automated Workload Management Using Machine Learning

Mainframe system processing includes a “Batch Cycle” that spans approximately 8 pm to 8 am every week, from Monday night to Saturday morning. The core part of the cycle completes around 2 am, with key client deliverables associated with the end times of certain jobs, tracked by Service Delivery. There are single- and multi-client batch streams, a QA stream that includes all clients, and about 200,000 batch jobs that execute per day. Despite sophisticated job scheduling software and automated system workload management, operator intervention is required, or believed to be required, to reprioritize which jobs get available system resources and when. Our work is to characterize, analyse and visualize the reasons for a manual change in the schedule. The work requires extensive data preprocessing and building machine learning models of the causal relationship between various system variables and the time of manual changes.

K. Deivanai, V. Vijayakumar, Priyanka
Multi-user Detection in Wireless Networks Using Decision Feedback Signal Cancellation

Wireless networks are becoming increasingly ubiquitous in computer networks due to their lower cost and maintenance overhead. While some wireless networks may operate in regulated spectrum, the majority operate in the unregulated (ISM) band. It is highly challenging for a base station or control station to successfully detect signals from multiple users in the same frequency range, a situation that may occur due to a comparatively small frequency reuse distance. This paper proposes a technique based on decision feedback equalization (DFE) (Tu et al Proceedings of 44th Asilomar conference signals [1]) and strongest signal cancellation for multi-user detection (MUD) in wireless networks. It can be seen that by employing the proposed system, the Bit Error Rate (BER) for strong (Stojanovic M Proceedings 137 of MTS/IEEE OCEANS conference, Boston, MA, 2006 [2]), average and weak users converge, indicating that all the signals are detected with equal accuracy (Li et al IEEE J Ocean Eng 33(2):198–209, 2008 [3]).

Monika Sharma, Balwant Prajapat
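
The idea of cancelling the strongest signal first, mentioned in the abstract above, can be illustrated with a heavily simplified Python sketch. It assumes a noiseless scalar channel, BPSK symbols (+1 or -1), and user amplitudes that are known and well separated. None of these assumptions, nor the function name `detect`, comes from the paper, and the equalizer itself is not reproduced here.

```python
from itertools import product


def detect(received, amplitudes):
    """Detect one BPSK symbol per user by successive cancellation.

    Users are processed from strongest to weakest: decide the symbol from
    the sign of the residual, then feed the decision back by subtracting
    that user's contribution before detecting the next user.
    """
    order = sorted(range(len(amplitudes)), key=lambda k: -amplitudes[k])
    symbols = [0] * len(amplitudes)
    residual = received
    for k in order:
        symbols[k] = 1 if residual >= 0 else -1
        residual -= amplitudes[k] * symbols[k]
    return symbols


# Hypothetical amplitudes for a strong, an average, and a weak user.
amplitudes = [4.0, 2.0, 1.0]

# Every combination of transmitted symbols is recovered in this ideal setting.
for sent in product([1, -1], repeat=3):
    received = sum(a * s for a, s in zip(amplitudes, sent))
    assert detect(received, amplitudes) == list(sent)

print(detect(3.0, amplitudes))
```

The sketch works because each amplitude exceeds the sum of the weaker ones, so the sign of the residual is always set by the strongest remaining user. With noise or closer amplitudes, early decision errors would propagate to the weaker users.
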
ANN-Based Predictive State Modeling of Finite State Machines

Finite state machines have many applications in day-to-day life. The design of finite state machines extends from simple systems to complex systems. As artificial intelligence dominates the technology world through its highly effective applications, finite state machines can also make significant use of it in the process of next-state prediction. The predictive analysis of artificial intelligence helps to speed up the operation of finite state machines. This paper explores the design of anticipative state machines with the help of artificial neural networks. To obtain higher performance, less training time, and low prediction error, the backpropagation algorithm is used in the ANN, which helps to analyze the critical parameters in real-time applications. Our proposed technique provides better results than the previously used technique and also provides lower prediction error and training time with an increasing number of inputs.

Nishat Anjum, Balwant Prajapat
Deep Dive Exploration of Mixed Reality in the World of Big Data

With the exponential growth of data volumes in the current scenario, it is becoming more difficult to meet the increasing need for data storage and analysis with the existing available systems. Methods for dealing with Big Data and analyzing it come into play here. In the real world, the revenues generated from processing big data are higher than the costs involved in processing it, which attracts all the big organizations in the world. Visualization techniques and methods are improving regularly to cope with the increasing complexity of Big Data. A new perspective can be seen here, which involves the use of Virtual Reality, Augmented Reality or Mixed Reality to draw on human perception and cognition for more effective and useful ways to utilize the information gathered from Big Data.

Prajal Mishra
Backmatter
Metadata
Title
Data Science and Big Data Analytics
Edited by
Dr. Durgesh Kumar Mishra
Prof. Xin-She Yang
Dr. Aynur Unal
Copyright year
2019
Publisher
Springer Singapore
Electronic ISBN
978-981-10-7641-1
Print ISBN
978-981-10-7640-4
DOI
https://doi.org/10.1007/978-981-10-7641-1