
2017 | Book

Computational Intelligence in Data Mining

Proceedings of the International Conference on CIDM, 10-11 December 2016

Edited by: Himansu Sekhar Behera, Durga Prasad Mohapatra

Publisher: Springer Singapore

Book series: Advances in Intelligent Systems and Computing


About this book

The book presents high-quality papers from the International Conference on Computational Intelligence in Data Mining (ICCIDM 2016), organized by the School of Computer Engineering, Kalinga Institute of Industrial Technology (KIIT), Bhubaneswar, Odisha, India, during December 10–11, 2016. The book disseminates knowledge about innovative, active research directions in the fields of data mining and machine and computational intelligence, along with current issues and applications of related topics. The volume aims to explicate and address the difficulties and challenges of seamlessly integrating these two core disciplines of computer science.

Table of Contents

Frontmatter
Safety and Crime Assistance System for a Fast Track Response on Mobile Devices in Bhubaneswar

We have developed an Android application that locates the user via GPS and recommends the nearest emergency point, along with its route directions and contact details. We aim to address various issues in the area of emergency prevention and mitigation; the system is highly effective for local residents and a boon for tourists. A service providing highway emergency numbers, which differ between highway sections, is in the pipeline. Because information about emergency numbers is unavailable in many areas, or its credibility is questionable, we are working to provide this facility in Bhubaneswar, updating the information through crowd computing, i.e., through the people of the area concerned.

Debabrata Singh, Abhijeet Das, Abhijit Mishra, Binod Kumar Pattanayak
Major Global Energy (Biomass)

This research gives a brief introduction to the types of renewable energy, mainly biomass energy, its methods of generation, and its pros and cons.

Sukhleen Kaur, Vandana Mukhija, Kamal Kant Sharma, Inderpreet Kaur
Detecting Targeted Malicious E-Mail Using Linear Regression Algorithm with Data Mining Techniques

E-mail is the most fundamental means of communication, and it is a focus of attack for terrorists, e-mail spammers, imposters, business fraudsters, and hackers. To combat this, different data mining classifiers are used to identify spam mails. This paper introduces a system that imports data from e-mail accounts and applies preprocessing techniques: file conversions appropriate for conducting the experiments, word-frequency search using the Knuth–Morris–Pratt (KMP) string searching algorithm, and feature selection using principal component analysis (PCA). Next, linear regression classification is used to predict the spam mails, and association rule mining is performed. The mean absolute error and root mean squared error for the training and test data are computed; these errors are negligible, which indicates the classifier is well trained. Finally, the results are displayed using visualization techniques.
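The word-frequency step described above can be illustrated with a standard KMP matcher. This is a minimal sketch of the classical algorithm, not the authors' implementation, and the function names are our own:

```python
def kmp_table(pattern):
    # Failure function: table[i] is the length of the longest proper
    # prefix of pattern[:i+1] that is also a suffix of it.
    table = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = table[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        table[i] = k
    return table

def kmp_count(text, pattern):
    # Count (possibly overlapping) occurrences of pattern in text
    # in O(len(text) + len(pattern)) time.
    if not pattern:
        return 0
    table = kmp_table(pattern)
    count = k = 0
    for ch in text:
        while k > 0 and ch != pattern[k]:
            k = table[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            count += 1
            k = table[k - 1]
    return count
```

Counting each vocabulary word's frequency over a mail corpus this way avoids the quadratic worst case of naive substring search.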

A. Sesha Rao, P. S. Avadhani, Nandita Bhanja Chaudhuri
Classical and Evolutionary Image Contrast Enhancement Techniques: Comparison by Case Studies

Histogram equalization (HE) and histogram stretching (HS) are two commonly used classical approaches for improving the appearance of a poor image. Such approaches may end up developing artefacts, rendering the image unusable, and both involve algorithmically complex tasks. Evolutionary soft computing methods, on the other hand, claim to offer hassle-free and effective contrast enhancement. In the present work, we report the development of algorithms for two evolutionary approaches, viz. the genetic algorithm (GA) and artificial bee colony (ABC), and evaluate the contrast enhancement capability of these algorithms on a set of test images. We then compare the output images obtained using the two evolutionary approaches with those obtained using HE and HS. The evolutionary methods yield better contrast enhancement than the classical methods in all our test cases, and the ABC approach outperforms the GA approach under quantitative comparison of the output images.
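The classical HE baseline referred to above maps each grey level through the image's cumulative distribution function. A minimal sketch on a flat list of pixel intensities (not the authors' code):

```python
def equalize(pixels, levels=256):
    # Classical histogram equalization: map grey level g to
    # round((levels - 1) * CDF(g)), spreading a narrow histogram
    # across the full intensity range.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    n = len(pixels)
    cdf, running = [0.0] * levels, 0
    for g in range(levels):
        running += hist[g]
        cdf[g] = running / n
    return [round((levels - 1) * cdf[p]) for p in pixels]
```

A low-contrast image whose pixels all sit near one grey level is stretched so that its occupied levels span the whole 0–255 range, which is exactly where over-enhancement artefacts can appear.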

Manmohan Sahoo
Cost Effectiveness Analysis of a Vertical Midimew-Connected Mesh Network (VMMN)

A previous study of the vertical midimew-connected mesh network (VMMN) evaluated some of its static network performance parameters, including diameter, network degree, average distance, wiring complexity, cost, and arc connectivity. VMMN showed good results compared to some conventional and hierarchical interconnection networks. However, some important static parameters were not included in that study, such as packing density, message traffic density, cost effectiveness factor, and time-cost effectiveness factor. In this paper, we focus on evaluating VMMN on these parameters and then compare the results with those of the mesh, torus, TESH, and MMN networks.

M. M. Hafizur Rahman, Faiz Al Faisal, Rizal Mohd Nor, T. M. T. Sembok, Dhiren Kumar Behera, Yasushi Inoguchi
Cluster Analysis Using Firefly-Based K-means Algorithm: A Combined Approach

Nature-inspired algorithms have evolved into a hot topic of research interest around the globe. Over the last decade, K-means clustering has become an attractive area for researchers solving many real-world clustering problems, but unfortunately K-means does not work well for non-globular clusters. The firefly algorithm is a recently developed metaheuristic that simulates the flashing characteristics of fireflies. It uses its global search capacity to resolve the limitations of the K-means technique and helps in escaping local optima. In this work, a novel firefly-based K-means algorithm (FA-K-means) is proposed for efficient cluster analysis, and the results of the proposed approach are compared with some benchmark approaches. Simulation results show that the proposed approach can be used efficiently for solving clustering problems, as it avoids trapping in local optima and aids faster convergence.
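The global-search behaviour mentioned above comes from the standard firefly position update, in which a dimmer firefly moves toward a brighter one with distance-decaying attractiveness plus a random walk. In an FA-K-means hybrid one would typically encode candidate cluster centres as firefly positions; that encoding is our assumption here, and this sketch shows only the generic move:

```python
import math
import random

def firefly_move(xi, xj, beta0=1.0, gamma=1.0, alpha=0.2, rng=random):
    # Move firefly xi toward a brighter firefly xj.
    # Attractiveness beta decays with squared distance r^2;
    # alpha scales a small uniform random perturbation.
    r2 = sum((a - b) ** 2 for a, b in zip(xi, xj))
    beta = beta0 * math.exp(-gamma * r2)
    return [a + beta * (b - a) + alpha * (rng.random() - 0.5)
            for a, b in zip(xi, xj)]
```

With alpha = 0 the move is a pure pull toward the brighter solution; the random term is what lets the swarm escape local optima before K-means refines the final centres.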

Janmenjoy Nayak, Bighnaraj Naik, H. S. Behera
A Study of Dimensionality Reduction Techniques with Machine Learning Methods for Credit Risk Prediction

With the rapid advancement of financial institutions, credit risk prediction plays a critical part in deciding whether to grant a loan to a customer and helps an institution minimize its losses. Although various statistical and artificial intelligence methods are available, there is no single best strategy for credit risk prediction. In our work, we use feature selection and feature extraction methods as preprocessing techniques before building a classifier model. To validate the feasibility and effectiveness of our models, three credit data sets are used, namely the Australian, German, and Japanese sets. Experimental results demonstrate that the SVM classifier with the LDA feature extraction technique performs best among the classifiers studied, i.e., NB, LogR, DT, and KNN. The results indicate that feature extraction preprocessing combined with base classifiers is best suited for credit risk prediction.

E. Sivasankar, C. Selvi, C. Mala
A Fuzzy Knowledge Based Mechanism for Secure Data Aggregation in Wireless Sensor Networks

Wireless Sensor Networks (WSNs) consist of a number of limited-energy sensor nodes deployed randomly over an area. This paper proposes a fuzzy knowledge-based secure tree-based data aggregation mechanism for WSNs, called the Fuzzy knowledge-based Data Aggregation Scheme (FDAS). In the proposed FDAS mechanism, a combination of fuzzy rules is applied to predict the status of each node in the aggregation tree, and the faulty nodes are then isolated from the process of data aggregation. The FDAS mechanism also ensures the security of the network by applying the privacy homomorphic cryptography technique. Results obtained from simulation validate that it gives better performance characteristics than other existing algorithms.

Sasmita Acharya, C. R. Tripathy
Estimation of Noise in Digital Image

Denoising plays an important role in improving the performance of algorithms for classification, recognition, enhancement, segmentation, etc. To eliminate noise from a noisy image, one should know the noise type, noise level, noise distribution, and so on. Typically, noise level information is identified from the noise standard deviation. Estimating the image noise from the noisy image is a major concern for several reasons, so an efficient and effective noise estimation technique is required to suppress the noise. This paper presents noise estimation based on the rough fuzzy c-means clustering technique, together with experimental results and a performance analysis of the system.

K. G. Karibasappa, K. Karibasappa
Image Compression Using Shannon Entropy-Based Image Thresholding

In this paper, we propose multilevel image thresholding for image compression using Shannon entropy, maximized by the nature-inspired Bacterial Foraging Optimization Algorithm (BFOA). Ordinary thresholding methods are computationally expensive when extended to multilevel image thresholding, so optimization techniques are needed to reduce the computational time. Particle swarm optimization suffers instability when particle velocity is maximal. We therefore propose BFOA-based multilevel image thresholding that maximizes Shannon entropy; the results are compared with differential evolution and particle swarm optimization and prove better in peak signal-to-noise ratio (PSNR), compression ratio, and reconstructed image quality.
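The entropy criterion being maximized can be shown for the single-threshold case with a brute-force search (Kapur-style summed class entropies); BFOA replaces this exhaustive search when several thresholds make it too expensive. A hedged sketch over a grey-level histogram, not the authors' code:

```python
import math

def shannon_entropy_threshold(hist):
    # Exhaustively search for the threshold t that maximizes the sum
    # of the Shannon entropies of the background (< t) and object
    # (>= t) classes, each normalized by its class probability.
    total = sum(hist)
    best_t, best_h = 0, float("-inf")
    for t in range(1, len(hist)):
        w0 = sum(hist[:t]) / total   # background class probability
        w1 = 1.0 - w0                # object class probability
        if w0 == 0 or w1 == 0:
            continue
        h = 0.0
        for g, c in enumerate(hist):
            if c == 0:
                continue
            p = c / total
            cls = w0 if g < t else w1
            h -= (p / cls) * math.log(p / cls)
        if h > best_h:
            best_t, best_h = t, h
    return best_t
```

For k thresholds the search space grows combinatorially, which is the motivation for handing the maximization to an optimizer such as BFOA.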

Karri Chiranjeevi, Uma Ranjan Jena, Asha Harika
Solving Sparsity Problem in Rating-Based Movie Recommendation System

Recommendation is a very important part of our digital lives; without it, one can get lost in the web of data. Movies are also a very important form of entertainment, and we watch most movies because someone recommended them. Since each person likes specific types of movies, a movie recommendation system can increase the sales of a movie rental/sales shop, and websites like Netflix use one. But one problem can cause a recommendation system to fail: the sparsity problem. In this paper, we use a new approach that can solve the sparsity problem to a great extent.

Nitin Mishra, Saumya Chaturvedi, Vimal Mishra, Rahul Srivastava, Pratibha Bargah
Combining Apriori Approach with Support-Based Count Technique to Cluster the Web Documents

The dynamic Web, where thousands of pages are updated every second, is growing at lightning speed. Hence, retrieving the required Web documents in a fraction of time is becoming a challenging task for present search engines. Clustering, an important technique of data mining, can shed light on this problem, and the association technique of data mining plays a vital role in clustering Web documents. This paper is an effort in that direction, proposing the following techniques: (1) a new feature selection technique named term-term correlation, which reduces the size of the corpus by eliminating noise and redundant features; (2) a novel technique named Support Based Count (SBC), which is combined with the traditional Apriori approach for clustering Web documents. Empirical results on two benchmark datasets show that the proposed approach is more promising than traditional clustering approaches.

Rajendra Kumar Roul, Sanjay Kumar Sahay
Genetic Algorithm Based Correlation Enhanced Prediction of Online News Popularity

An online news article is meant to spread awareness of a topic or subject published on the Internet and is available to a large section of users gathering information. For complete knowledge proliferation, we need to know the right way and time to publish. To achieve this goal, we have come up with a model that predicts the popularity of an article on the basis of multiple factors, such as the article type (structure and design) and publishing time. In this paper, we use correlation techniques to measure how an article's popularity depends on its attributes, and then use a genetic algorithm to obtain the optimum set of attributes that should be considered while formatting the article. Data were procured from the UCI Machine Learning Repository: 39644 articles with sixty condition attributes and one decision attribute. We implemented twelve different learning algorithms on this data set, including correlation analysis and neural networks, and compare the performance of the various algorithms in the Results section.

Swati Choudhary, Angkirat Singh Sandhu, Tribikram Pradhan
CSIP—Cuckoo Search Inspired Protocol for Routing in Cognitive Radio Ad Hoc Networks

Cognitive radio (CR) is viewed as the enabling technology of the dynamic spectrum access (DSA) model, which is envisioned to address the present spectrum scarcity issue by encouraging the deployment of new wireless services. Cognitive devices share the capabilities of CR, and the ad hoc network they form dynamically is called a cognitive radio ad hoc network (CRAHN). Due to the diversity of channels, routing becomes a critical job in CRAHNs, and minimizing end-to-end delay is one of the major difficulties, since packet transmission passes over every hop of the routing path. In this paper, a new reactive multicast routing protocol, the cuckoo search inspired protocol (CSIP), is proposed to reduce the overall end-to-end delay; it progressively reduces the congestion level on the routing paths by considering the spectrum accessibility and the service rate of each hop. Simulations using the NS2 tool show that the proposed CSIP significantly outperforms other baseline schemes in minimizing end-to-end delay in CRAHNs.

J. Ramkumar, R. Vadivel
Performance Analysis of Airplane Health Surveillance System

The Airplane Health Surveillance System is an information system developed to help operators and maintenance crews make business decisions. The system detects defects, along with their causes, that could lead to delays and airplane crashes, which have a high impact on society. It is capable of detecting and diagnosing defects that may be initiated during a flight, and it triggers alerts to safeguard the airplane from possible odds by analyzing the effects of each detected defect; business decisions are made based on these alerts and causes. The system obtains data from a simulator designed to emulate the data received from the supplier ground system, which in turn receives ACARS messages from the flying fleet in real time. This user-friendly application has a powerful impact on the aerospace division by eliminating uncertain economic loss.

N. B. Rachana, S. Seema
A Parallel Forecasting Approach Using Incremental K-means Clustering Technique

A parallel forecasting approach is applied to weather prediction, an important aspect of modern society, especially with the realization of modern smart cities. A new approach is considered here that provides excellent results for large semi-structured data. It is very important to analyze weather data keeping in mind the enormity of the available data sizes, so we present a methodology for weather data analysis that accounts for the big-data nature of weather data; date-wise atmospheric conditions collected over a decade are used. Traditional k-means clustering forms clusters that represent associations between related dates of the current year's and previous years' weather data, and such associations predict one year's atmospheric conditions on the basis of previous data. An incremental k-means clustering algorithm then processes the current year's weather parameters as new data and shows that the calculated weather condition falls into one of the existing clusters representing similar atmospheric conditions. The work is divided into two parts: first, storing NCDC semi-structured data on a Hadoop cluster, and second, fitting a clustering methodology for predicting weather conditions.

Swagatika Sahoo
Key Author Analysis in 1 and 1.5 Degree Egocentric Collaboration Network

The scientific performance of an individual author in the research community is usually evaluated from the total number of articles published, citation count, average citation count, h-index, etc. But research is generally done by groups of researchers, and the performance of an individual may depend on neighboring researchers, so the scientific impact of an individual should be evaluated based on their immediate neighbors. In this paper, we form 1 degree and 1.5 degree egocentric networks of every node and use degree centrality, closeness centrality, and density to estimate the scientific efficacy of an individual. Finally, we compare the results across these types of networks and find that 1.5 degree egocentric networks are useful for appraising the performance of researchers.
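Two of the measures named above have simple closed forms on an ego network stored as an adjacency dictionary. A minimal sketch (our own helper names, not the authors' code); a 1 degree ego network contains the ego and its coauthors, and the 1.5 degree variant additionally keeps the links among those coauthors:

```python
def degree_centrality(adj, node):
    # Fraction of the other nodes that `node` is directly linked to.
    n = len(adj)
    return len(adj[node]) / (n - 1)

def density(adj):
    # Ratio of existing edges to all possible edges in the
    # (undirected) network: 2E / (n * (n - 1)).
    n = len(adj)
    edges = sum(len(neigh) for neigh in adj.values()) / 2
    return 2 * edges / (n * (n - 1))
```

On a 1 degree ego network the ego's degree centrality is always 1, so it is the density and the centralities of the neighbours in the 1.5 degree network that discriminate between authors.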

Anand Bihari, Sudhakar Tripathi
Customer Segmentation by Various Clustering Approaches and Building an Effective Hybrid Learning System on Churn Prediction Dataset

The success of every organization or firm depends on Customer Preservation (CP) and Customer Correlation Management (CCM), the two parameters determining the rate at which customers decide to stay subscribed with the same organization; higher service quality thus reduces the chance of customer churn. Churn prediction involves analyzing various attributes in industries like telecommunications, banking, and financial institutions, and it helps an organization retain valuable customers and avoid failure in a competitive market. A single classifier does not yield high churn forecast accuracy; nowadays, unsupervised and supervised techniques are combined for better classification accuracy, with unsupervised classification playing a major role in hybrid learning techniques. Hence, this work comparatively studies unsupervised learning techniques using algorithms like Fuzzy C-Means (FCM), Possibilistic Fuzzy C-Means (PFCM), and K-Means clustering, where similar customers are grouped within a cluster and better customer segmentation is predicted. The clusters are divided into training and testing sets by the holdout method, training is carried out by a decision tree, and testing is done with the generated model. Experiments on the churn prediction data set show that the K-Means clustering algorithm along with the decision tree helps improve the result of the churn prediction problem in the telecommunications industry.

E. Sivasankar, J. Vijaya
An Artificial Neural Network Model for a Diesel Engine Fuelled with Mahua Biodiesel

In this paper, an Artificial Neural Network (ANN) model is used to predict different parameters of a diesel engine fuelled with mixtures of diesel and mahua biodiesel in different proportions. The data were obtained from experiments carried out on a twin-cylinder diesel engine under different loading conditions and different blending ratios of diesel and biodiesel. Two inputs, engine load and blending ratio, and five outputs, Brake Thermal Efficiency (BTE), Brake Specific Fuel Consumption (BSFC), smoke level, carbon monoxide (CO), and nitrogen oxide (NOx) emissions, are considered for ANN modeling. The proposed network is a back-propagation, feed-forward multilayer perceptron with ten neurons in the hidden layer, trained with the trainlm algorithm. The prediction ability of the model is high, as the difference between the predicted and experimentally measured values is minimal.

N. Acharya, S. Acharya, S. Panda, P. Nanda
An Application of NGBM for Forecasting Indian Electricity Power Generation

The average generation of electricity is increasing day by day due to rising demand, so forecasting the future needs of electricity is essential, especially in India. In this paper, a Grey Model (GM) and a Nonlinear Grey Model (NGM) are introduced with the concept of the Bernoulli Differential Equation (BDE) to obtain higher predictive precision and accuracy. To improve the prediction accuracy of GM, the Nonlinear Grey Bernoulli Model (NGBM) is used; it is capable of producing more reliable outcomes. The NGBM with power r is a nonlinear differential equation, and by tuning r the expected result can be controlled and adjusted to fit the 1-AGO historical raw data. NGBM(1, 1) is a recent grey prediction model that adjusts the correctness of the stable GM(1, 1) through a BDE. The difference between the desired outcome and the actual GM(1, 1) is displayed through a feasible NGBM(1, 1) forecasting model built on the accumulated decisive variables. This model may help the government plan future electricity generation.
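The GM(1, 1) baseline that NGBM refines can be sketched compactly: accumulate the raw series (1-AGO), fit the whitening equation dx1/dt + a*x1 = b by least squares on the background values, and restore forecasts by differencing the response equation. NGBM(1, 1) would add the Bernoulli power term x1^r on the right-hand side; this sketch shows only the plain base model, with our own function names:

```python
import math

def gm11_fit(x0):
    # Fit GM(1,1): x0(k) = -a * z1(k) + b, where x1 is the 1-AGO
    # series and z1(k) = 0.5 * (x1(k) + x1(k-1)) is the background
    # value. Returns the least-squares estimates of (a, b).
    x1, s = [], 0.0
    for v in x0:
        s += v
        x1.append(s)
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, len(x1))]
    y = x0[1:]
    n = len(z)
    szz = sum(v * v for v in z)
    sz = sum(z)
    szy = sum(zi * yi for zi, yi in zip(z, y))
    sy = sum(y)
    det = n * szz - sz * sz        # normal-equation determinant
    a = (sz * sy - n * szy) / det
    b = (szz * sy - sz * szy) / det
    return a, b

def gm11_predict(x0, a, b, k):
    # Restored forecast for step k >= 1, differencing the response
    # x1_hat(k) = (x0(0) - b/a) * exp(-a*k) + b/a.
    x1k = (x0[0] - b / a) * math.exp(-a * k) + b / a
    x1k_1 = (x0[0] - b / a) * math.exp(-a * (k - 1)) + b / a
    return x1k - x1k_1
```

For near-exponential series such as annual power generation, the fitted values track the raw data closely, and a negative development coefficient a indicates growth.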

Dushant P. Singh, Prashant J. Gadakh, Pravin M. Dhanrao, Sharmila Mohanty, Debabala Swain, Debabrata Swain
Publishing Personal Information by Preserving Privacy in Vertically Partitioned Distributed Databases

In vertically partitioned distributed databases, the data are distributed over multiple sites. To publish such data for research or business applications, the data may be collected at one site and published to those who need them. In many fields, such as banking, medicine, politics, and research, publishing data while preserving individual privacy is very important; apart from privacy, the anonymity of the publisher must also be preserved. To achieve these objectives, this paper proposes multidimensional k-anonymity with onion routing and mix-network methods to preserve privacy and provide anonymous communication. A mix-net is a multistage system that accepts quantities of data in an input batch and produces cryptographically transformed data in an output batch; the output batch is a permutation of the transformed input batch, achieving untraceability between the two. A mix-net can change the appearance of the data and randomly reorder it, which prevents trace-back. In onion routing, data is encapsulated in layers of encryption, analogous to the layers of an onion, and transmitted inside these layers. The final or exit node in the chain decrypts and delivers the data to the recipient after multidimensional k-anonymity has been applied to the collected data.

R. Srinivas, K. A. Sireesha, Shaik Vahida
Maintaining Security Concerns to Cloud Services and Its Application Vulnerabilities

Cloud computing is a service in high demand, widely used by the IT industry because of its flexible and scalable nature. The main purpose of this paper is to address cloud computing security concerns. A few steps need to be followed to secure the cloud, such as understanding the cloud, demanding transparency, reinforcing internal security, considering legal implications, and finally paying attention to the cloud environment. The major concern for the IT industry is providing security to cloud services (SaaS, PaaS, and IaaS). In this paper, we present a multiplicative homomorphic encryption scheme, namely an advanced RSA algorithm using three-factor authentication, for ensuring that the data is secured.
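The multiplicative homomorphic property on which such schemes rest is that of textbook RSA: multiplying two ciphertexts yields a ciphertext of the product of the plaintexts, so a cloud can multiply encrypted values without decrypting them. The paper's "advanced RSA" details are not given here; this is only a toy demonstration of the basic property, with deliberately tiny, insecure parameters:

```python
def rsa_keygen_demo():
    # Toy textbook-RSA parameters (tiny primes, demo only, NOT secure).
    p, q = 61, 53
    n = p * q                  # modulus, 3233
    phi = (p - 1) * (q - 1)
    e = 17                     # public exponent
    d = pow(e, -1, phi)        # private exponent (Python 3.8+)
    return n, e, d

def encrypt(m, e, n):
    return pow(m, e, n)

def decrypt(c, d, n):
    return pow(c, d, n)
```

Because Enc(m1) * Enc(m2) mod n = (m1 * m2)^e mod n, decrypting the product of ciphertexts recovers m1 * m2 mod n; real deployments need padding-free homomorphic use to be carefully constrained, which is part of why additional factors of authentication are layered on top.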

Lakkireddy Venkateswara Reddy, Vajragiri Viswanath
Fuzzy Clustering with Improved Swarm Optimization and Genetic Algorithm: Hybrid Approach

Fuzzy c-means clustering is one of the most popular algorithms in diverse areas of application due to its ease of implementation and suitability of parameter selection, but it suffers from one major limitation: it easily gets stuck at local optima. Particle swarm optimization is a widely adopted metaheuristic technique used to solve complex optimization problems; however, it needs many fitness evaluations to reach the desired optimal solution. In this paper, a hybridization of improved particle swarm optimization and the genetic algorithm is combined with the fuzzy c-means algorithm for data clustering. The proposed method is compared with some existing algorithms, such as the genetic algorithm, PSO, and the K-means method. Simulation results show that the proposed method is efficient and can produce encouraging results for finding globally optimal solutions.

Bighnaraj Naik, Sarita Mahapatra, Janmenjoy Nayak, H. S. Behera
RF-Based Thermal Validation and Monitoring Software for Temperature Sensitive Products

The pharmaceutical and healthcare industries are highly regulated all around the world. Nowadays, millions of temperature sensitive products are manufactured, warehoused, or circulated worldwide, and for all these sensitive products the control of temperature is essential. Some developing nations offer no automated processes: supervisors record sensed information manually, or thermometers or USB data loggers are used, which require bringing the sensor node into contact with a USB data reader that serves as the interface between the loggers and the system software. These allow only pre-study programming, and data can be downloaded only after study completion; such processes have proved inefficient in some cases and burdensome in most. To overcome the problems of non-automated processes, this paper introduces RF-based temperature validation and monitoring software for pharma, food processing, and warehouses. Depending on the nature of the application, sensor nodes are deployed in the sensing area, the base station and loggers communicate by means of radio waves, and temperature readings are recorded from the base station, which is connected to a PC through Ethernet or USB.

P. Siva Sowmya, P. Srinivasa Reddi
Quantitative Analysis of Frequent Itemsets Using Apriori Algorithm on Apache Spark Framework

Frequent itemset mining is one of the popular techniques used to discover hidden knowledge from large-scale transactional datasets in a wide range of applications. The Apriori algorithm is a typical algorithm for finding frequent itemsets in market basket analysis, and since its inception many efforts have been made to enhance its efficiency. The MapReduce model is one of the efficient tools for parallel and distributed computing, so large-scale algorithms such as Apriori can be made efficient in terms of speed-up and related parameters. One major drawback of the MapReduce model is that it is not suitable for iterative jobs due to the overheads imposed; nowadays, Apache Spark is getting huge attention for iterative jobs because of its in-memory processing capabilities. Most frequent pattern mining algorithms consider only distinct items in a transaction. For transactional data analysis, however, multiple occurrences of an item, in other words the "quantity" in which a particular item is purchased in the same transaction, can be important for deriving additional information about frequent itemsets. In this paper, we propose a modified version of the Apriori algorithm based on the Apache Spark framework that not only mines the frequent itemsets in the input transactional data but also analyzes the related quantities of the items in a particular itemset to find the most frequent quantity purchased for every frequent itemset. Experiments are conducted to gain insight into the effectiveness, efficiency, and scalability of the proposed approach.
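The level-wise Apriori core that such Spark variants distribute can be sketched on a single machine; frequent k-itemsets are joined into (k+1)-candidates, and a candidate is counted only if every k-subset is already frequent. This is a minimal sketch of the classic algorithm (without the paper's quantity analysis or Spark distribution):

```python
from itertools import combinations

def apriori(transactions, min_support):
    # Returns {frequent itemset (frozenset): support fraction}.
    transactions = [frozenset(t) for t in transactions]
    items = {i for t in transactions for i in t}

    def support(itemset):
        return sum(itemset <= t for t in transactions) / len(transactions)

    frequent = {}
    # Level 1: frequent single items.
    current = [c for c in (frozenset([i]) for i in sorted(items))
               if support(c) >= min_support]
    k = 1
    while current:
        for c in current:
            frequent[c] = support(c)
        k += 1
        # Join step: unions of frequent (k-1)-itemsets of size k.
        candidates = {a | b for a in current for b in current
                      if len(a | b) == k}
        # Prune step: every (k-1)-subset must itself be frequent.
        current = [c for c in candidates
                   if all(frozenset(s) in frequent
                          for s in combinations(c, k - 1))
                   and support(c) >= min_support]
    return frequent
```

In the Spark setting, the support-counting pass over the transactions is the part that is parallelized across partitions, while the join/prune bookkeeping stays cheap.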

Ramesh Dharavath, Shashi Raj
E-CLONALG: An Enhanced Classifier Developed from CLONALG

This paper proposes an improved version of CLONALG, the Clonal Selection Algorithm based on the Artificial Immune System, that matches conventional classifiers in accuracy on the same data sets. Instead of randomly selecting antibodies, it is proposed to take k memory pools consisting of all the learning cases; an array averaged over the pools is then created and considered for cloning. Instead of using the best clone, calculating the similarity measure, and comparing with the original cell, here the k best clones are selected, the average similarity measure is evaluated, and noise is filtered. This process enhances the accuracy from 76.9% to 94.2%, ahead of the conventional classification methods.

Arijit Panigrahy, Rama Krushna Das
Encryption First Split Next Model for Co-tenant Covert Channel Protection

Cloud computing technology nowadays fulfills the infrastructural needs of small and medium enterprises without the additional load of initial setup cost. Industrial infrastructure needs like storage and computing have become utilities in the cloud, and features like availability, scalability, and pay-per-use make the cloud a dependable choice for enterprises. The growth of cloud computing depends on vital technologies like distributed computing, parallel processing, the Internet, and virtualization. Virtualization enables efficient resource utilization and offers multi-tenancy, the sharing of the same physical space by many virtual machines. Co-residency of virtual machines, on the other hand, brings more security bumps: as co-residents share common channels like the memory bus, storage controllers, and cache, these channels can become sources of data leakage, and malicious attacker VMs use covert channels to access sensitive data from a target VM. This paper presents a novel model, Encryption First Split Next (EFSN), for covert channel protection. The model targets two key issues: isolating conflict-of-interest files from an attacker using a split share model, and applying encryption so that the attacker's task remains difficult even if he obtains data from covert channels. The suggested model gave positive, reliable results and can be a choice for lazy or immature cloud users to rely on for their data storage security needs against co-tenant covert channels.

S. Rama Krishna, B. Padmaja Rani
Cognitive Radio: A Technological Review on Technologies, Spectrum Awareness, Channel Awareness, and Challenges

Radio spectrum is a limited resource in this world of broadband connections. With the increase in demand for digital wireless communications in commercial and personal networks, judicious use of the frequency spectrum has become a serious concern for wireless engineers. Cognitive radios (CR) are a present-day solution to this crowded spectrum issue. Cognitive radio, or SDR, is a new concept/technology being implemented to maximize the utilization of the radio spectrum: at any time, unused frequency spectrum can be sensed by a cognitive radio from the wide range of the wireless radio spectrum, resulting in really efficient utilization of radio resources. Major factors governing cognitive radio networks (CRNs) relate to service contradiction, routing, and jamming attacks, which may interfere with other operating wireless technologies. The thrust areas in developing a good CR lie in efficient algorithm design for spectrum sensing, spectrum sharing, and radio re-configurability. This paper presents an overview of the required technologies, spectrum and channel awareness, and challenges of cognitive radio.

Gourav Misra, Arun Agarwal, Sourav Misra, Kabita Agarwal
Chessography: A Cryptosystem Based on the Game of Chess

Chess is one of the most strategic games played around the globe. It provides a player a platform to explore the exponential complexity of the game through intellectual moves. Though the number of players, pieces, board dimensions, and squares is finite, the combination of moves for each player differs distinctly from game to game. Cryptography, in turn, aims to achieve security for data by encrypting it, making the plain text unintelligible to intruders. Taking advantage of the numerous moves in the game of chess, we can form an amalgamation of the chess game and cryptography. Chessography is this confluence of chess and cryptography. Encryption takes place in the form of moves played by each player on the board. The resulting cipher text is a product cipher: the square positions on the board are substituted by new square positions during each move, and all the actual values dealt with throughout the game form the transposed cipher text. The Chessography algorithm uses two keys, one of fixed length and one of variable length, called the 'paired key'. Along with the strength of the generated keys, the algorithm produces a very strong cipher text: as each game of chess is different, so is the cipher text generated. The complexity of the game of chess forms Chessography's defensive strength against various attacks during data transmission.

Vaishnavi Ketan Kamat
Odia Compound Character Recognition Using Stroke Analysis

In an Odia optical character recognition (OCR) model, detection of compound characters plays a very important role, since about 80% of the allograph classes present in Odia script are compound. Here a zone-based stroke analysis model is used to recognize the compound characters. Each character is divided into nine zones, and each zone has some similarity to one or a few of the 12 considered strokes. For the similarity measure, the structural similarity index (SSIM) is used. The proposed feature has a high potential for recognizing compound characters. A recognition accuracy of 92% is obtained for characters in the Kalinga font.

Dibyasundar Das, Ratnakar Dash, Banshidhar Majhi
A Comparative Study and Performance Analysis of Routing Algorithms for MANET

The mobility of communication devices has generated the requirement for next-generation wireless network technology. Several types of wireless technologies have been developed and implemented; one of them is the mobile ad hoc network, which serves a cluster of mobile terminals that dynamically and randomly change locations, so that the interconnections need to be handled continually. A routing procedure is used to find routes between the mobile terminals for easy and reliable communication; the main purpose of such a protocol is to find a proper route from the calling to the called terminal with minimal overhead and bandwidth consumption during route construction. This paper provides an overview of a broad range of routing protocols proposed in the literature and, in addition, compares the routing protocols with a discussion of their respective merits and drawbacks.

Mohammed Abdul Bari, Sanjay Kalkal, Shahanawaj Ahmad
Classification of Research Articles Hierarchically: A New Technique

The amount of research work taking place in all streams of science, engineering, medicine, etc., is growing rapidly, and hence research articles are increasing every day. In this dynamic environment, identifying and maintaining such a large collection of articles in one place and classifying them manually is becoming very exhausting. Often, the allocation of articles to various subject areas is made simply on the basis of the journals in which they are published. This paper proposes an approach for handling such a huge volume of articles by classifying them into their respective categories based on the keywords extracted from the keyword section of each article. Query enrichment is used by generating unigrams and bigrams of these keywords and giving them proper weights using a probability measure. The Microsoft Academic Research dataset is used for experiments, and the empirical results show the effectiveness of the proposed approach.

Rajendra Kumar Roul, Jajati Keshari Sahoo
Experimental Study of Multi-fractal Geometry on Electronic Medical Images Using Differential Box Counting

This paper focuses on the use of fractal dimension and lacunarity to measure the roughness of digital medical images that have been contaminated with varying intensities of Gaussian noise as additive noise, periodic noise as multiplicative noise, and Poisson noise as distributive noise. The experimental study shows that the fractal dimension successfully reveals the roughness of an image, while lacunarity reveals the homogeneity or heterogeneity of the image; the differential box counting method provides both the fractal dimension and lacunarity values, which helps in the diagnosis of diseases.

Tina Samajdar, Prasant Kumar Pattnaik
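The differential box counting step described above can be sketched in a few lines. This is an illustrative implementation, not the authors' code; the box-height convention, grid sizes, and 256 gray levels are assumptions:

```python
import math

def differential_box_count(img, s, G=256):
    """N(s): boxes of size s needed to cover the gray-level surface (DBC)."""
    M = len(img)
    h = s * G / M                      # box height in gray levels
    total = 0
    for bi in range(0, M, s):
        for bj in range(0, M, s):
            block = [img[i][j] for i in range(bi, bi + s)
                               for j in range(bj, bj + s)]
            lo = math.floor(min(block) / h)
            hi = math.floor(max(block) / h)
            total += hi - lo + 1       # boxes spanned by this grid cell
    return total

def fractal_dimension(img, sizes):
    """FD = least-squares slope of log N(s) versus log(1/s)."""
    xs = [math.log(1.0 / s) for s in sizes]
    ys = [math.log(differential_box_count(img, s)) for s in sizes]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))
```

A perfectly flat image yields FD ≈ 2, and rougher intensity surfaces push the value toward 3.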
Tsallis Entropy Based Image Thresholding for Image Segmentation

Image segmentation is a method of segregating an image into required segments/regions. Image thresholding, being a simple and effective technique, is mostly used for image segmentation, and the thresholds are optimized by maximizing the Tsallis entropy. However, as two-level thresholding is extended to multi-level thresholding, the computational complexity of the algorithm increases further, so evolutionary and swarm optimization techniques are needed. In this paper, for the first time, optimal thresholds are obtained by maximizing the Tsallis entropy using a novel adaptive cuckoo search algorithm (ACS). The image segmentation performance of the proposed ACS algorithm is tested using natural and standard images. Experiments show that the proposed ACS is better than particle swarm optimization (PSO) and cuckoo search (CS).

M. S. R. Naidu, P. Rajesh Kumar
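For two-level thresholding, the Tsallis-entropy criterion that the optimizer maximizes can be brute-forced directly over a gray-level histogram. A minimal sketch: the entropic index q = 0.8 is an assumed value, and exhaustive search stands in here for the cuckoo search:

```python
def tsallis_threshold(hist, q=0.8):
    """Return the threshold t maximizing the two-class Tsallis entropy."""
    total = sum(hist)
    p = [h / total for h in hist]          # normalized histogram
    best_t, best_s = 0, float("-inf")
    for t in range(1, len(p)):
        pa = sum(p[:t])                    # background mass
        pb = 1.0 - pa                      # foreground mass
        if pa == 0 or pb == 0:
            continue
        sa = (1 - sum((x / pa) ** q for x in p[:t])) / (q - 1)
        sb = (1 - sum((x / pb) ** q for x in p[t:])) / (q - 1)
        s = sa + sb + (1 - q) * sa * sb    # pseudo-additive combination
        if s > best_s:
            best_t, best_s = t, s
    return best_t
```

For multi-level thresholding the search space grows combinatorially, which is why the paper turns to swarm-based optimization instead of exhaustive search.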
Efficient Techniques for Clustering of Users on Web Log Data

Web usage mining is one of the essential frameworks for deriving domain knowledge from the interaction of users with the web. This domain knowledge is used for effective management of predictive websites, creation of adaptive websites, enhancement of business and web services, personalization, and so on. For a nonprofit organization's website, it is difficult to identify who the users are, what information they need, and how their interests change with time. Web usage mining based on log data provides a solution to this problem. The proposed work focuses on web log data preprocessing, sparse matrix construction based on the web navigation of each user, and clustering of users with similar interests. The performance of web usage mining is also compared using the k-means, X-means, and farthest-first clustering algorithms.

P. Dhana Lakshmi, K. Ramani, B. Eswara Reddy
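The clustering step on the per-user navigation vectors can be illustrated with plain k-means. This toy version (binary page-visit vectors and deterministic initialization from the first distinct users) is an assumption for illustration, not the paper's implementation:

```python
def kmeans(points, k, iters=20):
    """Cluster user navigation vectors with Lloyd's k-means."""
    # deterministic init: first k distinct points as centers
    centers = []
    for p in points:
        if list(p) not in centers:
            centers.append(list(p))
        if len(centers) == k:
            break
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # assign to nearest center by squared Euclidean distance
            d = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
            clusters[d.index(min(d))].append(p)
        # recompute centers as cluster means (keep old center if empty)
        centers = [[sum(col) / len(cl) for col in zip(*cl)] if cl else centers[i]
                   for i, cl in enumerate(clusters)]
    return centers, clusters
```

Each row of the sparse user-page matrix becomes one point; users landing in the same cluster share similar browsing interests.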
Critique on Signature Analysis Using Cellular Automata and Linear Feedback Shift Register

In the surge to cater to the needs of modern technology and high-performance computation, the complexity of VLSI designs is increasing, with complex logic, more memory space, and large test vectors for testing the digital circuit. Signature analysis compresses the data. It is a compaction technique that follows the concept of cyclic redundancy checking (CRC), which in turn detects errors during transmission. It is implemented in hardware using shift registers, cellular automata, etc., as part of the VLSI design process. This paper deals with the popular mechanism of signature analysis in the context of digital system testing using LFSR- and CA-based signature analysis, along with a critique.

Shaswati Patra, Supriti Sinhamahapatra, Samaresh Mishra
SparshJa: A User-Centric Mobile Application Designed for Visually Impaired

Every person has the ability to explore the world by means of touch, and this specifically becomes the core ability if a person is visually impaired. This core ability is unfolded by the mobile application SparshJa, which accepts input through a touch-based Braille keypad. SparshJa is a Sanskrit word meaning "gaining knowledge through touch". SparshJa presents a user interface that mimics the Braille cell on a touch screen phone, where the position of each dot is identified by means of a distinct vibratory pattern. This haptic feedback is augmented by auditory feedback that spells out the input letter. This dual feedback system makes the design of the application, and especially the graphical interface, totally user centric. The user-centric design improves the utility of the application by simplifying navigation and reducing the learning curve. These claims are validated by means of on-field user trials which evaluate the usability of SparshJa.

Prasad Gokhale, Neha Pimpalkar, Nupur Sawke, Debabrata Swain
A Novel Approach for Tracking Sperm from Human Semen Particles to Avoid Infertility

Nowadays, infertility is a big problem for human beings, especially for men. The motility of sperm does not depend on the number of sperm present in the semen. To address infertility, the detection rate of multiple moving sperms needs to be measured. Different algorithms have been utilized for the detection of sperms in human semen, but their detection rates are not up to the mark. This article proposes a method to track and detect human sperm with a high detection rate compared to existing approaches. The sperm candidates are tracked using Kalman filters and the proposed algorithms.

Sumant Kumar Mohapatra, Sushil Kumar Mahapatra, Sakuntala Mahapatra, Santosh Kumar Sahoo, Shubhashree Ray, Smruti Ranjan Dash
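The Kalman filtering idea behind the tracker can be conveyed with a single-coordinate sketch. This scalar version (a static motion model with hand-picked noise variances) is illustrative only; the paper tracks two-dimensional sperm positions across video frames:

```python
class Kalman1D:
    """Scalar Kalman filter: smooth noisy position measurements of one track."""

    def __init__(self, x0, p0=1.0, q=1e-3, r=0.1):
        self.x = x0      # state estimate (position)
        self.p = p0      # estimate variance
        self.q = q       # process noise variance
        self.r = r       # measurement noise variance

    def step(self, z):
        # predict: position assumed locally constant, uncertainty grows
        self.p += self.q
        # update: blend prediction with measurement z via Kalman gain
        k = self.p / (self.p + self.r)
        self.x += k * (z - self.x)
        self.p *= (1 - k)
        return self.x
```

In a multi-target setting, one such filter per sperm candidate predicts the next position, and detections are matched to the nearest prediction before updating.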
The Use of Robotics for Underwater Research of Complex Objects

In the present article, the authors give an overview of the historical development of underwater technical vehicles and dwell on the modern representatives of this class: autonomous (unmanned) underwater vehicles (AUVs). The authors touch upon the structure of this class of vehicles, the problems to be solved, and promising directions of their development, covering scientific material based mainly on data about the benefits of applying these vehicles in a number of narrowly focused tasks.

Sergei Sokolov, Anton Zhilenkov, Anatoliy Nyrkov, Sergei Chernyi
Elicitation of Testing Requirements from the Selected Set of Software’s Functional Requirements Using Fuzzy-Based Approach

Software requirements elicitation is employed to find out different types of software requirements. In the literature, we find that goal-oriented requirements elicitation (GORE) techniques do not underpin the identification of testing requirements from the functional requirements (FR) in the early phase of requirements engineering. Therefore, to tackle this research issue, we propose an approach for the elicitation of testing requirements from FR. In real-life applications, only those requirements are implemented that are selected by stakeholders, and they are tested after implementation during different releases of the software. So, in the proposed method, we use a fuzzy-based technique for FR selection on the basis of nonfunctional requirements (NFR). Finally, an example is given to explain the proposed method.

Mohd. Sadiq, Neha
Analysis of TCP Variant Protocol Using Active Queue Management Techniques in Wired-Cum-Wireless Networks

In this work, we analyze the performance of TCP variant protocols in wired-cum-wireless networks considering active queue management (AQM) techniques such as random exponential marking (REM) and adaptive virtual queue (AVQ), along with DropTail. For the analysis, we consider Reno, NewReno, Sack1, and Vegas as TCP variants and propose a network model for the wired-cum-wireless scenario. Then, the performance of the TCP variants is analyzed using the delayed acknowledgement (DelACK) technique. The simulation results show that NewReno performs better than the others when DelACK is not used. However, when DelACK is used, the performance of Vegas is better than the others irrespective of the AQM technique.

Sukant Kishoro Bisoy, Bibudhendu Pati, Chhabi Rani Panigrahi, Prasant Kumar Pattnaik
Image Texture-Based New Cryptography Scheme Using Advanced Encryption Standard

Encapsulating information behind a mathematical barrier to forbid malicious access is a traditional approach from the past to the modern era of information technology. Recent advancement in the security field is not restricted to traditional symmetric and asymmetric cryptography; rather, numerous security algorithms have been proposed in the recent past, among which biometric-based security, steganography, visual cryptography, etc., have gained prominent focus within research communities. In this paper, we propose a robust cryptographic scheme for the original message. First, each message byte, drawn from the ASCII characters ranging from Space (ASCII 32) to Tilde (ASCII 126), is represented as an object with a flat texture in a binary image, rendered as an n × n geometrically shaped object in an image of size N × N. A chaotic arrangement pattern is then created using a prime number encrypted by the Advanced Encryption Standard (AES). The sub-images are shuffled and united as rows and columns to form a host covert or cipher image, which looks like a grid-structured image where each sub-grid represents the coded information. The performance of the proposed method has been analyzed with empirical examples.

Ram Chandra Barik, Suvamoy Changder, Sitanshu Sekhar Sahu
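The shuffling of sub-images into a grid-structured cipher image amounts to applying a key-derived permutation. The sketch below uses a SHA-256-seeded PRNG as a stand-in for the AES-encrypted prime in the paper, so it illustrates only the arrangement step, not the actual AES pipeline:

```python
import hashlib
import random

def shuffle_blocks(blocks, key):
    """Permute sub-image blocks with a permutation derived from the key.
    (Stand-in: a hash-seeded PRNG replaces the AES-encrypted prime.)"""
    seed = int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")
    order = list(range(len(blocks)))
    random.Random(seed).shuffle(order)
    return [blocks[i] for i in order], order

def unshuffle_blocks(shuffled, order):
    """Invert the permutation to recover the original block arrangement."""
    out = [None] * len(order)
    for pos, i in enumerate(order):
        out[i] = shuffled[pos]
    return out
```

The receiver, knowing the key, regenerates the same permutation and restores the original grid before decoding each sub-image back to its ASCII character.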
MusMed: Balancing Blood Pressure Using Music Therapy and ARBs

Recently, abnormal increase or decrease in blood pressure has become one of the main health problems for humans all over the world, causing heart attacks, brain strokes, and many other diseases. There are many causes of blood pressure problems, their incidence is increasing day by day, and controlling blood pressure is one of the most difficult tasks for every patient. Patients take different kinds of medicines according to their doctor's suggestion, but the effects of these medicines gradually decrease as time passes. In this work, we propose a treatment named MusMed, a combination of music therapy and medicine, which helps to control the blood pressure of the human body. We used an Indian classical raga played on instrumental guitar as the music therapy and the olmesartan molecule as the medicine. The results indicate that the combination of music therapy and medicine works well compared to medicine alone. Our results are validated through a mercury sphygmomanometer.

V. Ramasamy, Joyanta Sarkar, Rinki Debnath, Joy Lal Sarkar, Chhabi Rani Panigrahi, Bibudhendu Pati
Interprocedural Conditioned Slicing

A technique, named the Node Marking Conditioned Slicing (NMCS) algorithm, has been proposed to compute conditioned slices for interprocedural programs. First, the System Dependence Graph (SDG) is constructed as an intermediate representation of a given program. Then, the NMCS algorithm selects the nodes satisfying a given condition through a marking process and computes the conditioned slices for each variable at each statement during that process. A stack is used in the NMCS algorithm to preserve the context in which a method is called, and some edges of the SDG are labeled to signify which statement calls a method.

Madhusmita Sahu, Durga Prasad Mohapatra
Face Biometric-Based Document Image Retrieval Using SVD Features

Nowadays, many documents, such as passports, identity cards, voter IDs, and certificates, contain a photograph of a person. These documents are maintained on networks and used in various applications. This paper presents a novel method for the retrieval of documents using face biometrics. We use the trace of the singular value matrix to construct face biometric features in the proposed method. A k-nearest neighbor approach with correlation distance is used as the similarity measure to retrieve document images from the database. The proposed method is tested on a synthetic database of 810 document images created by borrowing face images from the face94 database [1]. Results are compared with discrete wavelet transform (DWT) features, the counterpart of singular value decomposition (SVD). The proposed features, in combination with the correlation similarity measure, provided a mean average precision (MAP) of 75.73% in our experiments.

Umesh D. Dixit, M. S. Shirdhonkar
Learning Visual Word Patterns Using BoVW Model for Image Retrieval

The bag of visual words (BoVW) model is popularly used for retrieving images relevant to a query image. Though it is a simple, compact, efficient, and scalable image representation, one of its major drawbacks is that the visual words it forms are noisy, which leads to mismatched visual words between two semantically irrelevant images and thus reduces discriminative power. In this paper, a new pattern is learnt from the generated visual words for each image category (group), and the learnt pattern is applied to the numerous images of each category. The uniqueness and correctness of the learnt pattern are verified, leading to a reduction of false image matches. This pattern learning is experimented with using the Caltech 256 dataset, and higher precision values are obtained.

P. Arulmozhi, S. Abirami
Test Scenario Prioritization Using UML Use Case and Activity Diagram

Software testing mainly aims at providing software quality assurance by verifying the behavior of software using a finite set of test cases. The continuous evolution of software makes it impossible to perform exhaustive testing. The need for regression testing is to uncover new software bugs in an existing system after changes have been made, ensuring that the existing functionalities still work. Re-executing the whole test suite is time-consuming as well as expensive. Hence, this issue can be handled by test case prioritization: organizing the test suite so that high-priority test cases are executed earlier than low-priority ones based on some criteria. In this paper, a new prioritization approach is proposed using the UML use case diagram and UML activity diagram. We have applied our technique to a particular case study, which indicates the effectiveness of our proposed approach in prioritizing test scenarios.

Prachet Bhuyan, Abhishek Ray, Manali Das
Conditioned Slicing of Aspect-Oriented Program

The different variants of slicing techniques for aspect-oriented programs (AOPs) are used in software maintenance, reuse, debugging, testing, program evolution, etc. In this paper, we propose a conditioned slicing algorithm for slicing AOPs, which computes a more precise slice in comparison with a dynamic slice. First, we construct an intermediate representation named the conditioned aspect-oriented dependence graph (CAODG) to represent aspect-oriented programs. The construction of the CAODG is based on the execution of an aspect-oriented program with respect to a pre-/post-condition rule defined in the aspect code. Then, we propose a conditioned slicing algorithm for AOP using the proposed CAODG.

Abhishek Ray, Chandrakant Kumar Niraj
Comparative Analysis of Different Land Use–Land Cover Classifiers on Remote Sensing LISS-III Sensors Dataset

Determination and identification of the land use–land cover (LULC) of an urban area has become a very challenging issue in planning a city's development. In this paper, we report the application of four classifiers to identify LULC using remote sensing data. In our study, a LISS-III image dataset of February 2015, obtained from NRSC Hyderabad, India, for the region of Aurangabad city (India), has been used. It was found that all classifiers provided similar results for water bodies, whereas significant differences were detected for regions related to residential areas, rock, barren land, and fallow land. The average values from these four classifiers are in satisfactory agreement with the Toposheet obtained from the Survey of India.

Ajay D. Nagne, Rajesh Dhumal, Amol Vibhute, Karbhari V. Kale, S. C. Mehrotra
Review on Assistive Reading Framework for Visually Challenged

The objective of this paper is to review various approaches to providing an assistive reading framework for visually challenged persons, i.e., persons who are blind or have difficulty reading printed material to acquire domain knowledge. A lot of research is being done to make the visually challenged independent in their lives. We study the existing assistive technology for the visually challenged and then propose a robust reading framework. The discussed framework will help a visually challenged person read normal printed books, typed documents, journals, magazines, newspapers, and computer displays of emails, web pages, etc., like a sighted person. It is a system based on image processing and pattern recognition in which the visually challenged person carries or wears a portable camera as a digitizing device and uses a computer as the processing device. The camera captures an image of the text to be read along with the relevant image and data. An optical character recognition (OCR) system segregates the image into text and non-text boxes, and the OCR then converts the text from the text boxes into an ASCII or text file. The text file is converted to voice by a text-to-speech (TTS) converter. Thus, a blind person can 'hear' the captured text. This technology can therefore help the visually challenged read captured textual information independently, communicated as the voice output of a text-to-speech converter.

Avinash Verma, Deepak Kumar Singh, Nitesh Kumar Singh
Performance Evaluation of the Controller in Software-Defined Networking

The classical Internet architecture poses major obstacles to IPv6 deployment, and there are different reasons to extend the classical approach into something more polished. The next generation of the future Internet demands routing not only within the same network domain but also outside it. Along with this, it must offer many attributes such as network availability, end-to-end connectivity, dynamic QoS management, and many more. The application area extends from small networks to big data centers. To address all these concerns, software-defined networking (SDN) has taken a major role. SDN separates the control plane and the data plane to make routing more versatile. We model the packet-in message processing of the SDN controller as an M/M/1 queueing system.

Suchismita Rout, Sudhansu Shekhar Patra, Bibhudatta Sahoo
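Under the M/M/1 model of packet-in processing, the controller's steady-state behavior follows from the standard formulas. A quick sketch, with hypothetical arrival and service rates:

```python
def mm1_metrics(lam, mu):
    """Steady-state metrics of an M/M/1 queue (requires lam < mu).

    lam: packet-in arrival rate, mu: controller service rate.
    """
    rho = lam / mu                 # controller utilization
    L = rho / (1 - rho)            # mean number of messages in system
    W = 1 / (mu - lam)             # mean sojourn time (Little's law: L = lam * W)
    Lq = rho ** 2 / (1 - rho)      # mean number waiting in the queue
    return {"rho": rho, "L": L, "W": W, "Lq": Lq}
```

For example, a controller serving 5 requests/ms under a 4 requests/ms load runs at 80% utilization with an average of 4 messages in the system; as lam approaches mu, delay grows without bound, which is the controller-bottleneck effect the paper studies.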
Task-Scheduling Algorithms in Cloud Environment

Cloud computing has grown in popularity and is now used in various sectors; it has come to light and is in demand because of improvements in technology. Many applications are submitted to data centers, and services are provided on a pay-per-use basis. As client demands increase, the workload increases, and since resources are limited, workload is moved to different data centers to handle client demands on a pay-as-you-go basis. Hence, scheduling the increasing workload in cloud environments is highly necessary. In this paper, we propose three task-scheduling algorithms, Minimum-Level Priority Queue (MLPQ), MIN-Median, and Mean-MIN-MAX, which aim to minimize the makespan with maximum utilization of the cloud. The results of our proposed algorithms are also compared with existing algorithms such as Cloud List Scheduling (CLS) and Minimum Completion Cloud (MCC) scheduling.

Preeta Sarkhel, Himansu Das, Lalit K. Vashishtha
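The Minimum Completion Cloud (MCC) baseline mentioned above can be sketched as a greedy assignment of each task to the machine that would finish it earliest. This sketch assumes identical machines and a made-up task list; it is not the authors' implementation:

```python
def min_completion_schedule(task_times, n_machines):
    """Assign each task to the machine with the earliest completion time.

    Returns (assignment, makespan), where makespan is the largest load.
    """
    loads = [0.0] * n_machines
    assign = []
    for t in task_times:
        m = loads.index(min(loads))   # machine that finishes this task first
        loads[m] += t
        assign.append(m)
    makespan = max(loads)
    return assign, makespan
```

The proposed MLPQ, MIN-Median, and Mean-MIN-MAX algorithms refine the task-ordering side of this idea to reduce the resulting makespan.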
Skin-Colored Gesture Recognition and Support Vector Machines-Based Classification of Light Sources by Their Illumination Properties

The illumination characteristics of a light source can determine whether the source is normal or faulty. The proposed work is based on moving a platform containing a light source in both horizontal and vertical directions through gesture recognition. The gesture recognition, done by fuzzy C-means and snake algorithm-based skin color detection, makes the recognition more accurate. The illumination values of the light source are obtained by a webcam. The collected data help in classifying the state of an unknown light source (normal or faulty) using support vector machines with a radial basis function kernel, yielding an error rate of about 0.6% and marking the efficacy of the system as a novel and sophisticated one.

Shreyasi Bandyopadhyay, Sabarna Choudhury, Riya Ghosh, Saptam Santra, Rabindranath Ghosh
Test Case Prioritization Using UML State Chart Diagram and End-User Priority

The intangible behaviour of software has given rise to various challenges in the field of software testing. One of the major challenges is to carry out regression testing efficiently. Regression testing is performed to ensure that modifications in one component of the software do not adversely affect the other components. However, retesting test cases during regression testing increases the testing time and leads to delayed delivery of the software product. In this paper, a dynamic model, the UML state chart diagram, is used for system modelling. The UML state chart diagram is converted into an intermediate representation, the State Chart Graph (SCG), which is traversed to identify the nodes affected by a given modification in the software. This information about the affected nodes is periodically stored in a historical data store across different versions of the software. The next time regression testing is carried out for any change, the stored data determines the pattern of frequently affected nodes for prioritizing the test cases and decides the criticality value (CV) of each test case. Along with this, to strengthen the prioritization of the test sequence, two more criteria, the priority set by the end-user for different functions and the browsing history of the end-user, are added. This approach is found to be very efficient, as we are able to model the dynamic nature of applications, maintain a historical data store of the test cases, and track the complete life of an object.

Namita Panda, Arup Abhinna Acharya, Prachet Bhuyan, Durga Prasad Mohapatra
Performance Analysis of Spectral Features Based on Narrowband Vegetation Indices for Cotton and Maize Crops by EO-1 Hyperion Dataset

The objective of this paper is to estimate and analyze selected narrowband vegetation indices for cotton and maize crops at the canopy level, generated using an EO-1 Hyperion dataset. EO-1 Hyperion data of 15th October 2014 were collected from the United States Geological Survey (USGS) Earth Explorer by data acquisition request (DAR). After performing atmospheric corrections using Quick Atmospheric Correction (QUAC), we applied selected narrowband vegetation indices, specifically those based on greenness/leaf pigments, namely NDVI, EVI, ARVI, and SGI, and red-edge indices such as RENDVI and VOG-I. Statistical analysis was done using the t-test; it is found that there is a more significant difference in the mean responses of cotton and maize for NDVI, ARVI, and VOG-I than for EVI and RENDVI, whereas the responses of both crops to SGI are very close to each other.

Rajesh K. Dhumal, Amol D. Vibhute, Ajay D. Nagne, Karbhari V. Kale, Suresh C. Mehrotra
Measuring Hit Ratio Metric for SOA-Based Application Using Black-Box Testing

In our proposed work, we discuss how to generate test cases automatically for BPEL processes to compute the Hit Ratio percentage of an SOA application. First, we design an SOA-based application using the OpenESB tool. That application is supplied to a code converter to get the XML code of the designed application. Then, we supply this XML code to the Tcases tool to generate test cases according to the black-box testing technique. These test cases are supplied to a Hit Ratio Calculator to compute the Hit Ratio percentage. On average over four SOA-based applications, we achieved a Hit Ratio percentage of 63.94%.

A. Dutta, S. Godboley, D. P. Mohapatra
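The final metric reduces to the fraction of generated test cases that exercise ("hit") a BPEL path, expressed as a percentage. A minimal sketch, assuming each test case's hit/miss outcome is available as a boolean (the encoding is an assumption, not the paper's tool interface):

```python
def hit_ratio(test_results):
    """Percentage of test cases that hit a BPEL path.

    test_results: iterable of booleans, one per generated test case.
    """
    results = list(test_results)
    if not results:
        return 0.0
    return 100.0 * sum(results) / len(results)
```

With this definition, the paper's reported 63.94% would mean that roughly 64 of every 100 generated test cases exercised a path in the composed services.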
Circularly Polarized MSA with Suspended L-shaped Strip for ISM Band Application

A broadband circularly polarized (CP) microstrip patch antenna (MSA) for ISM band (2.4 GHz) applications is proposed. The proposed antenna consists of a suspended square ring with a corner-chopped square slot and a suspended horizontal L-shaped strip line. Two cylindrical probes connecting the L-shaped strip and the radiating patch feed the patch at two different positions. This two-probe feed network excites two orthogonal signals of equal magnitude, generating CP radiation. A prototype of the proposed antenna was simulated and fabricated; experimental results show a 10 dB impedance bandwidth of 21.75% (2.13–2.65 GHz), a 3 dB axial ratio (AR) bandwidth of 19.6% (2.16–2.60 GHz), and a measured gain over the 3 dB AR bandwidth of 7.16 dBi.

Kishor B. Biradar, Mansi S. Subhedar
A Genetic Algorithm with Naive Bayesian Framework for Discovery of Classification Rules

Genetic algorithms (GAs) for the discovery of classification rules have gained importance due to their capability of finding globally optimal solutions. However, building a rule-based classification model from large datasets using GAs is a very time-consuming task. This paper proposes an efficient GA that seeds the initial population with a gain ratio-based probabilistic framework. The gain ratio is the normalized information gain of an attribute and is not biased toward multi-valued attributes. In addition, the proposed approach computes the fitness of individuals from a pre-computed matrix of posterior probabilities using a Bayesian framework instead of making repeated training database scans. This approach to fitness computation increases the efficiency of the proposed GA by eliminating a large number of database scans. The enhanced GA is validated on ten datasets from the UCI machine learning repository. The results confirm that the proposed approach performs better than a GA with a randomly initialized population (GARIP) and the Naïve Bayesian classifier.

Pooja Goyal, Saroj
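The gain ratio used to seed the initial population is information gain normalized by split information. A small sketch over a categorical attribute (illustrative code, not the authors' implementation):

```python
import math

def entropy(labels):
    """Shannon entropy of a label sequence, in bits."""
    n = len(labels)
    counts = {}
    for l in labels:
        counts[l] = counts.get(l, 0) + 1
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def gain_ratio(values, labels):
    """Information gain of an attribute divided by its split information."""
    n = len(labels)
    groups = {}
    for v, l in zip(values, labels):
        groups.setdefault(v, []).append(l)
    cond = sum(len(g) / n * entropy(g) for g in groups.values())
    gain = entropy(labels) - cond
    split = -sum(len(g) / n * math.log2(len(g) / n) for g in groups.values())
    return gain / split if split > 0 else 0.0
```

Dividing by the split information is what removes the bias toward multi-valued attributes: an attribute that splits the data into many tiny groups pays a large split-information penalty.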
Study of a Multiuser Kurtosis Algorithm and an Information Maximization Algorithm for Blind Source Separation

An attempt is made in this study to use two distinct algorithms to investigate blind source separation (BSS). In this paper, we use the multiuser kurtosis (MUK) algorithm for BSS and an information maximization algorithm for the separation and deconvolution of voice signals. Among the various criteria available for evaluating the objective function for BSS, we consider information-theoretic principles and kurtosis as measures of statistical independence. The MUK algorithm uses a combination of Gram–Schmidt orthogonalization and a stochastic gradient update to recover the non-Gaussian sources. A correlation coefficient is used as the evaluation criterion to analyze the performance of both algorithms. Simulation results are presented at the end of the study along with a performance tabulation for discussion and analysis.

Monorama Swain, Rachita Biswal, Rutuparna Panda, Prithviraj Kabisatpathy
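Kurtosis as a non-Gaussianity measure is the fourth standardized moment minus 3, so it vanishes for Gaussian data and moves away from zero for separable sources. A short sketch of the sample estimator:

```python
def excess_kurtosis(samples):
    """Sample excess kurtosis: ~0 for Gaussian data, nonzero otherwise."""
    n = len(samples)
    m = sum(samples) / n
    var = sum((x - m) ** 2 for x in samples) / n      # second central moment
    m4 = sum((x - m) ** 4 for x in samples) / n       # fourth central moment
    return m4 / var ** 2 - 3.0
```

A BSS contrast based on kurtosis drives the demixing matrix toward outputs whose kurtosis magnitudes are maximal, i.e., maximally non-Gaussian and hence statistically independent.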
Use of Possibilistic Fuzzy C-means Clustering for Telecom Fraud Detection

This paper presents a novel approach for detecting fraudulent activities in mobile telecommunication networks using possibilistic fuzzy c-means clustering. Initially, the optimal values of the clustering parameters are estimated experimentally. Behavioral profile modelling of subscribers is then done by applying the clustering algorithm to two relevant call features selected from each subscriber's historical call records. Any symptoms of intrusive activity are detected by comparing the most recent calling activity with the normal profile. A new calling instance is identified as malicious when its distance measured from the profile cluster centers exceeds a preset threshold. The effectiveness of our system is justified by carrying out large-scale experiments on a real-world dataset.

Sharmila Subudhi, Suvasini Panigrahi
A Classification Model to Analyze the Spread and Emerging Trends of the Zika Virus in Twitter

The Zika disease is a 2015–16 virus epidemic and continues to be a global health issue. The recent trend of sharing critical information on social networks such as Twitter motivated us to propose a classification model that classifies tweets related to Zika and thus enables us to extract helpful insights for the community. In this paper, we explain the process of data collection from Twitter, the preprocessing of the data, and the building of a model to fit the data; we then compare the accuracy of the support vector machine and Naïve Bayes algorithms for text classification and state the reason for the superiority of the support vector machine. Useful analytical tools such as word clouds are also presented in this work to provide a more sophisticated way to retrieve community support from social networks such as Twitter.

B. K. Tripathy, Saurabh Thakur, Rahul Chowdhury
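As a sketch of the Naïve Bayes side of the comparison, a multinomial model with Laplace smoothing fits in a few lines of plain Python. The class name and the toy tweets are illustrative assumptions, not the paper's data or code.

```python
import math
from collections import Counter

class NaiveBayesText:
    """Multinomial Naive Bayes with Laplace (add-one) smoothing for short texts."""

    def fit(self, docs, labels):
        self.classes = set(labels)
        self.prior = Counter(labels)            # class document counts
        self.word_counts = {c: Counter() for c in self.classes}
        for doc, y in zip(docs, labels):
            self.word_counts[y].update(doc.lower().split())
        self.vocab = {w for c in self.classes for w in self.word_counts[c]}
        self.total = {c: sum(self.word_counts[c].values()) for c in self.classes}
        self.n_docs = len(docs)
        return self

    def predict(self, doc):
        V = len(self.vocab)
        best, best_lp = None, -math.inf
        for c in self.classes:
            lp = math.log(self.prior[c] / self.n_docs)
            for w in doc.lower().split():
                lp += math.log((self.word_counts[c][w] + 1) / (self.total[c] + V))
            if lp > best_lp:
                best, best_lp = c, lp
        return best
```

The log-space sum avoids underflow on longer documents; an SVM over the same bag-of-words vectors would be the other arm of the comparison.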
Prediction of Child Tumours from Microarray Gene Expression Data Through Parallel Gene Selection and Classification on Spark

Microarray gene expression data play a major role in predicting chronic diseases at an early stage. They also help to identify the most appropriate drug for curing a disease. Such microarray gene expression data are huge in volume and difficult to handle, and not all gene expressions are necessary to predict a disease. Gene selection approaches pick only the genes that play a prominent role in detecting a disease and a drug for the same. In order to handle huge gene expression data, gene selection algorithms can be executed in parallel programming frameworks such as Hadoop MapReduce and Spark. Paediatric cancer is a threatening illness that affects children aged 0–14 years, and it is necessary to identify child tumours at an early stage to save children's lives. The authors therefore investigate paediatric cancer gene data to identify the optimal genes that cause cancer in children. They propose to execute a parallel Chi-Square gene selection algorithm on Spark; the selected genes are evaluated using parallel logistic regression and a support vector machine (SVM) for binary classification on the Spark Machine Learning library (Spark MLlib), and the accuracies of prediction and classification are compared. The results show that parallel Chi-Square selection followed by parallel logistic regression and SVM provides better accuracy than that obtained with the complete set of gene expression data.

Y. V. Lokeswari, Shomona Gracia Jacob
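The chi-square selection step scores each (here binarized) gene against the class label from a 2x2 contingency table; the Spark version would apply the same per-gene function across partitions. The single-machine sketch below, with the assumed helper name `chi_square`, is illustrative only.

```python
def chi_square(feature, labels):
    """Chi-square score of a binary feature against a binary class label,
    from the 2x2 observed-vs-expected contingency table."""
    n = len(labels)
    obs = {(f, y): 0 for f in (0, 1) for y in (0, 1)}
    for f, y in zip(feature, labels):
        obs[(f, y)] += 1
    chi2 = 0.0
    for f in (0, 1):
        for y in (0, 1):
            row = obs[(f, 0)] + obs[(f, 1)]   # marginal count of feature value f
            col = obs[(0, y)] + obs[(1, y)]   # marginal count of class y
            exp = row * col / n               # expected count under independence
            if exp > 0:
                chi2 += (obs[(f, y)] - exp) ** 2 / exp
    return chi2
```

Genes are then ranked by score in descending order and the top-k retained; a feature independent of the class scores 0.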
Handover Decision in Wireless Heterogeneous Networks Based on Feedforward Artificial Neural Network

In heterogeneous networks, the vertical handover decision is a significant issue due to the increasing demand of customers to access various service features among them. In order to provide a seamless transfer between various technologies, the effect of various user preference metrics and network conditions needs to be considered. This paper proposes a multilayer feedforward artificial neural network algorithm for handover decision in wireless heterogeneous networks. The neural network aids in taking the handover decision and selecting the best candidate network based on data rate, service cost, received signal strength indicator (RSSI), and velocity of the mobile device. Experimental results show an improvement in effectively reducing the number of handovers as compared to other existing systems. It is found that the handover decision probability is also improved.

Archa G. Mahira, Mansi S. Subhedar
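A forward pass of such a multilayer feedforward network over the four decision inputs can be sketched as follows. The weights here are illustrative placeholders; in the paper they would be learned from handover training data.

```python
import math

def forward(x, w1, b1, w2, b2):
    """One hidden layer of sigmoid units; returns a handover score in (0, 1).
    x = [data_rate, service_cost, rssi, velocity], each normalized to [0, 1]."""
    sigmoid = lambda z: 1.0 / (1.0 + math.exp(-z))
    hidden = [sigmoid(sum(wi * xi for wi, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    return sigmoid(sum(wi * hi for wi, hi in zip(w2, hidden)) + b2)
```

With hand-picked weights that reward data rate and RSSI while penalizing cost and velocity, a strong candidate network scores higher than a weak one, and the candidate with the highest score is selected for handover.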
Tweet Cluster Analyzer: Partition and Join-based Micro-clustering for Twitter Data Stream

Data stream mining is the process of extracting knowledge from continuously generated data. Since data stream processing is not a trivial task, the streams have to be analyzed with proper stream mining techniques. In the processing of many large-volume data streams, stream clustering helps to find valuable hidden information. Many works have concentrated on clustering data streams using various methods, but those approaches mostly lack some core tasks needed to improve cluster accuracy and process data streams quickly. To tackle the problem of improving cluster quality and reducing the data stream processing time in cluster generation, the partition-based DBStream clustering method is proposed. The results have been compared with various data stream clustering methods, and it is evident from the experiments that the purity of the clusters improves by 5% and the time taken is reduced by 10% compared with the average time taken by other methods for clustering the data streams.

M. Arun Manicka Raja, S. Swamynathan
Contour-Based Real-Time Hand Gesture Recognition for Indian Sign Language

Gesture recognition systems are being widely developed as gesture-controlled devices come into large-scale consumer use. Gestures may be static or dynamic and are typically applied in robot control, gaming control, sign language recognition, television control, etc. This paper focuses on the use of dynamic gestures for Indian sign language recognition. The methodology is implemented in real time for hand gestures, using contours and the convex hull for feature extraction and the Harris corner detector for gesture recognition. Accuracy results are obtained under strong, dark, and normal illumination. The overall accuracy achieved for Indian sign language recognition under dark illumination is 81.66%. Beyond the Indian sign language application, the recognized gestures can also be applied to any machine interaction.

Rajeshri R. Itkarkar, Anilkumar Nandi, Bhagyashri Mane
Firefly Algorithm for Feature Selection in Sentiment Analysis

Feature selection and extraction is a vital step in sentiment analysis. Statistical feature selection techniques such as document frequency thresholding produce sub-optimal feature subsets because of the NP-hard character of the problem. Swarm intelligence algorithms are used extensively in optimization problems. Swarm optimization renders feature subset selection effective by improving the classification accuracy and reducing the computational complexity and feature set size. In this work, we propose the firefly algorithm for feature subset selection optimization. An SVM classifier is used for the classification task. Four different datasets are used for the classification, of which two are in Hindi and two in English. The proposed method is compared with feature selection using a genetic algorithm. The method succeeds in optimizing the feature set and improving the performance of the system in terms of accuracy.

Akshi Kumar, Renu Khorwal
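The canonical firefly update — each dimmer firefly moves toward every brighter one with an attractiveness that decays with distance — can be sketched for a continuous objective. Mapping positions to binary feature masks, as feature selection requires, would add a thresholding step; the parameter defaults below are conventional assumptions, not the paper's settings.

```python
import math
import random

def firefly(objective, dim, n=15, iters=60,
            beta0=1.0, gamma=1.0, alpha=0.2, seed=1):
    """Canonical firefly algorithm (minimization). Dimmer fireflies move
    toward brighter ones with attractiveness beta0*exp(-gamma*r^2),
    plus a shrinking random step."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n)]
    light = [objective(p) for p in pop]
    for _ in range(iters):
        for i in range(n):
            for j in range(n):
                if light[j] < light[i]:  # j is brighter (lower cost)
                    r2 = sum((a - b) ** 2 for a, b in zip(pop[i], pop[j]))
                    beta = beta0 * math.exp(-gamma * r2)
                    pop[i] = [a + beta * (b - a) + alpha * (rng.random() - 0.5)
                              for a, b in zip(pop[i], pop[j])]
                    light[i] = objective(pop[i])
        alpha *= 0.97  # cool the random walk over time
    best = min(range(n), key=lambda k: light[k])
    return pop[best], light[best]
```

For feature selection, the objective would combine classifier accuracy with a penalty on subset size, and each coordinate would be thresholded (e.g. sigmoid) to decide whether the corresponding feature is kept.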
Adaptive Dynamic Genetic Algorithm Based Node Scheduling for Time-Triggered Systems

Nowadays, there has been a tremendous increase in the use of reliable systems for safety-critical applications such as avionics and automotives. As systems become reliable, fault-tolerant design is a must, often involving strict timing constraints. The implementation of such systems with the traditional event-triggered approach is inappropriate; consequently, the time-triggered approach is taking over. Time-triggered architectures are hard real-time embedded systems, so the scheduling process has to be redefined for optimality in resource allocation. The schedulability of tasks in such systems is analysed with a meta-heuristic approach: a genetic algorithm for the optimization of processing nodes. Further, an adaptive approach called the adaptive dynamic genetic algorithm has been derived, which allocates tasks to the available nodes in a better-optimized way for multiprocessor architectures.

B. Abdul Rahim, K. Soundara Rajan
Deformation Monitoring of Volcanic Eruption Using DInSAR Method

The high-resolution images provided by the TerraSAR-X satellite are very useful for monitoring the displacement of earth phenomena. In this paper, a model is developed for the deformation monitoring of the Kilauea volcano in the Hawaii Island region. This deformation monitoring is done with the help of differential interferometric synthetic aperture radar (DInSAR) algorithms, using a two-pass TerraSAR-X dataset of the Hawaii region. The input master and slave images are preprocessed to remove speckle, and an atmospheric correction is made to obtain noise-free data. The preprocessed data are co-registered, and an interferogram is then calculated from the co-registered master and slave images. Phase unwrapping is performed to resolve the 2π phase ambiguities in the images. After that, the vertical, horizontal, and line-of-sight (LOS) displacements are calculated. The results of the proposed model are validated against the existing DInSAR method.

P. Saranya, K. Vani
Effective Printed Tamil Text Segmentation and Recognition Using Bayesian Classifier

Text segmentation and recognition of Indian languages have gained a lot of research interest in recent years. The existence of a huge number of symbols and varying characteristics in these languages makes segmentation and extraction of text a challenging task. The Tamil language has a rich literature, and printed text is available in various forms such as newspapers, books, and magazines. In this paper, extraction of printed Tamil text from an image is done irrespective of text characteristics such as font style, color, and size. The proposed work uses scanned printed Tamil text as the input image. This input image is binarized, since text is always in the foreground, and histograms are used to segment it into lines and words. The morphological dilation operator is used to remove outliers such as dots and commas present in an underlying object and to segment the printed text into words to facilitate text detection. Further, each character is identified using the bounding box technique. Classification of Tamil letters is done by extracting features such as gradient information and curvature-based information obtained from the grayscale and binary images. These features are trained, and characters are classified using a Bayesian classifier. The recognized characters are documented as text in Unicode format. The performance of the approach is evaluated using precision, recall, and F-measure.

S. Manisha, T. Sree Sharmila
Retrieval of Homogeneous Images Using Appropriate Color Space Selection

In this paper, the issue of convenient color space selection for low-level feature mining in a content-based image retrieval system is addressed by exploring color edge histogram feature extraction in the HSV, YIQ, YUV, and YCbCr color spaces. Moreover, the Haar wavelet transform is applied to reduce the feature vector count and thereby speed up the retrieval process, and semantic retrieval is then obtained via a similarity metric. The retrieval accuracy of each color space is analyzed through parameters such as precision, recall, and response time of the system. Experimental results show that the HSV color space-based retrieval system gives, on average, 5%, 18.3%, and 26% higher retrieval accuracy than the YIQ, YUV, and YCbCr color spaces, respectively.

L. K. Pavithra, T. Sree Sharmila
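The RGB-to-HSV conversion that front-ends such a color histogram pipeline is available in Python's standard `colorsys` module. The sketch below, with assumed function names, converts pixels and bins a coarse hue histogram of the kind a retrieval system might index.

```python
import colorsys

def rgb_image_to_hsv(pixels):
    """Convert (r, g, b) pixels in [0, 255] to HSV triples in [0, 1]."""
    return [colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            for r, g, b in pixels]

def hue_histogram(pixels, bins=8):
    """Normalized histogram over the hue channel, a compact color signature."""
    hist = [0] * bins
    for h, s, v in rgb_image_to_hsv(pixels):
        hist[min(int(h * bins), bins - 1)] += 1
    n = len(pixels)
    return [c / n for c in hist]
```

Two images can then be compared by a distance between their histograms; the paper's edge histogram adds gradient information on top of this color binning.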
Stylometry Detection Using Deep Learning

Author profiling is one of the active research areas in the field of data mining. Rather than concentrating only on syntactic and stylometric features, this paper describes more relevant features that profile authors more accurately. Readability metrics, vocabulary richness, and emotional status are the features taken into consideration. Age and gender are detected as the metrics for author profiling. Stylometry is modelled using a deep learning algorithm. This approach has attained an accuracy of 97.7% for gender and 90.1% for age prediction.

K. Surendran, O. P. Harilal, P. Hrudya, Prabaharan Poornachandran, N. K. Suchetha
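Of the proposed features, vocabulary richness is the easiest to make concrete. The type-token and hapax-legomena ratios below are standard stylometric measures; the function is an illustration, not the paper's feature extractor.

```python
from collections import Counter

def vocabulary_richness(text):
    """Two classic stylometric measures: type-token ratio (distinct words /
    total words) and hapax ratio (words used exactly once / total words)."""
    tokens = text.lower().split()
    counts = Counter(tokens)
    n = len(tokens)
    ttr = len(counts) / n
    hapax = sum(1 for c in counts.values() if c == 1) / n
    return ttr, hapax
```

Such scalar features would be concatenated with readability and emotion scores to form the input vector of the deep network.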
SVPWM-Based DTC Controller for Brushless DC Motor

The brushless DC (BLDC) motor is one of the fastest-growing electrical drives at present, because of its higher efficiency, high power density, easy maintenance and control, and high torque-to-inertia ratio. In this paper, a sensorless space vector modulation-based direct torque and indirect flux control of a BLDC motor has been investigated. Several methods have been proposed for BLDC motors to gain better torque and current control, i.e., with minimum torque and current pulsations. The proposed sensorless method is similar to the usual direct torque control method used for sinusoidal alternating current motors, in that it controls torque directly and stator flux indirectly by varying the direct-axis current. The electric rotor position can be found using the winding inductance and the stationary-reference-frame stator fluxes and currents. In this paper, the space vector modulation technique is utilized, which permits finer regulation of the varying signals than PWM and vector control techniques. The validity of the proposed sensorless three-phase conduction direct torque control of the BLDC motor drive is established in Simulink, and the results are observed.

G. T. Chandra Sekhar, Budi Srinivasa Rao, Krishna Mohan Tatikonda
Implementation of IoT-Based Smart Video Surveillance System

Smart video surveillance is an IoT-based application, as it uses the Internet for various purposes. The proposed system gives intimation of the presence of any person in the premises and provides further security by recording that person's activity. While leaving the premises, the user activates the system by entering a password. The system's operation starts with motion detection, refined to human detection, followed by counting the humans in the room; neighbours are also notified of human presence by turning on an alarm. In addition, a notification is sent to the user through SMS and e-mail. The proposed system's hardware implementation is supported by Raspberry Pi and Arduino boards; the software side is provided by OpenCV (for video surveillance) and a GSM module (for SMS alerts and e-mail notification). Apart from the security aspect, the system is intelligent enough to reduce wasted power consumption if the user forgets to switch off any electronic appliance, by customizing the code for specific appliances.

Sonali P. Gulve, Suchitra A. Khoje, Prajakta Pardeshi
Context-Aware Recommendations Using Differential Context Weighting and Metaheuristics

Context plays a paramount role in language and conversations, and since the incorporation of contexts into traditional recommendation engines, which made use of just the user and item details, an effective method to utilize them in the best possible manner is of great importance. In this paper, we propose a novel approach to handle the sparsity of contextual data and their increasing dimensionality, and to develop an effective model for a context-aware recommender system (CARS). We further assign relevance, in the form of weights, even to the individual attributes of each context. Differential context weighting (DCW) is used as the rating model to obtain the desired ratings. Optimization of the weights required for DCW is done through metaheuristic techniques, and toward this we experimentally compare two of the most popular ones, namely particle swarm optimization (PSO) and the firefly algorithm (FA). Recommendations are then obtained using the better of the two.

Kunal Gusain, Aditya Gupta
A Multi-clustering Approach to Achieve Energy Efficiency Using Mobile Sink in WSN

A wireless sensor network (WSN) consists of a large number of interconnected sensors, which provide unique features for visualizing real-world scenarios. It opens up many research opportunities due to its wide range of applications in fields that require surveying and the periodic monitoring that is inevitable in our daily life. However, the main limitation of such sensors is their resource-constrained nature, chiefly the need to conserve battery power to extend the network lifetime. We have proposed an algorithm for energy efficiency in WSNs in which a mobile sink node operates the routing process, considering the shortest path between multiple unequal clusters, with reduced energy consumption. This model also ensures that energy-hole problems do not occur within the network area.

Samaleswari Pr. Nayak, S. C. Rai, Sipali Pradhan
Selection of Similarity Function for Context-Aware Recommendation Systems

Earlier recommendation engines used to work on just the user and item details; more recently, however, users' specific contexts have found an equally significant place as a metric in finding recommendations. The addition of contexts to the mix makes for more personalized suggestions, and the search for a truly efficient context-aware recommendation system (CARS) continues. A differential context weighting (DCW)-based CARS needs to compute similarities between different users to give recommendations. Our objective is to analyze, compare, and contrast various similarity functions, not only to find the best-suited one but also to implement an efficacious and economical CARS. To optimize the weights in DCW, we use a metaheuristic approach.

Aditya Gupta, Kunal Gusain
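Two of the candidate similarity functions such a comparison typically covers — cosine and Pearson over co-rated items — can be sketched as follows. The dict-of-ratings representation is an assumption for illustration.

```python
import math

def cosine_sim(a, b):
    """Cosine similarity over items both users rated (dicts: item -> rating)."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    num = sum(a[i] * b[i] for i in common)
    den = (math.sqrt(sum(a[i] ** 2 for i in common)) *
           math.sqrt(sum(b[i] ** 2 for i in common)))
    return num / den if den else 0.0

def pearson_sim(a, b):
    """Pearson correlation over co-rated items; removes per-user rating bias."""
    common = set(a) & set(b)
    if len(common) < 2:
        return 0.0
    ma = sum(a[i] for i in common) / len(common)
    mb = sum(b[i] for i in common) / len(common)
    num = sum((a[i] - ma) * (b[i] - mb) for i in common)
    den = (math.sqrt(sum((a[i] - ma) ** 2 for i in common)) *
           math.sqrt(sum((b[i] - mb) ** 2 for i in common)))
    return num / den if den else 0.0
```

The contrast matters: a user who rates everything one star lower than another is a perfect match under Pearson but not under cosine, which is one reason the choice of similarity function affects CARS accuracy.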
Medical Dataset Classification Using k-NN and Genetic Algorithm

This paper proposes a hybrid technique that applies the artificial bee colony (ABC) algorithm for feature selection and a k-nearest neighbor (k-NN) classifier combined with a genetic algorithm (GA) for effective classification. The aim of this paper is to select the finest features, eliminating the insignificant features of the datasets that severely affect classification accuracy. The proposed approach is applied to the diagnosis of heart disease and diabetes, which have a high impact in reducing quality of life throughout the world. The datasets, including heart disease, diabetes, and hepatitis, are taken from the UCI repository and evaluated with the proposed technique. The classification accuracy is measured by 10-fold cross-validation. Experimental results show the higher accuracy of our proposed algorithm compared to other existing systems.

Santosh Kumar, G. Sahoo
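The k-NN half of the hybrid is a majority vote among nearest neighbours; a minimal sketch follows (the GA tuning of k and of the retained features, and the ABC selection step, are omitted as they are specific to the paper).

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, x, k=3):
    """Classify x by majority vote among its k nearest training points
    under Euclidean distance."""
    nearest = sorted(range(len(train_X)),
                     key=lambda i: math.dist(train_X[i], x))[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]
```

On medical data, the feature vectors would be the ABC-selected attributes (e.g. blood pressure, glucose level), normalized so that no single attribute dominates the distance.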
Analysis of Static Power System Security with Support Vector Machine

Security analysis is the task of evaluating the security and reliability limits of a power system, i.e., up to what level the system is secure. Power system security is divided into four classes, namely secure, critically secure, insecure, and highly insecure, depending on the value of a security index. A multi-class support vector machine (SVM) classifier algorithm is used in this paper to categorize the patterns. These patterns are generated at different generating and loading conditions for the IEEE 6-bus, IEEE 14-bus, and New England 39-bus systems by the Newton–Raphson load flow method for line outage contingencies. The main aim is to give a forewarning to the system operator about the security level, which helps to actuate the requisite regulating actions at a suitable time to prevent system collapse.

B Seshasai, A Santhi, Ch Jagan Mohana Rao, B Manmadha Rao, G. T. Chandra Sekhar
Credit Card Fraud Detection Using a Neuro-Fuzzy Expert System

In this paper, a two-stage neuro-fuzzy expert system is proposed for credit card fraud detection. An incoming transaction is initially processed by a pattern-matching system in the first stage. This component comprises a fuzzy clustering module and an address-matching module, each of which assigns a score to the transaction based on its extent of deviation. A fuzzy inference system computes a suspicion score by combining these score values and accordingly classifies the transaction as genuine, suspicious, or fraudulent. Once a transaction is detected as suspicious, a neural network trained on historical transactions is employed in the second stage to verify whether it was an actual fraudulent action or an occasional deviation by the legitimate user. The effectiveness of the proposed system is verified by conducting experiments and a comparative analysis with other systems.

Tanmay Kumar Behera, Suvasini Panigrahi
Backmatter
Metadata
Title
Computational Intelligence in Data Mining
Edited by
Himansu Sekhar Behera
Durga Prasad Mohapatra
Copyright Year
2017
Publisher
Springer Singapore
Electronic ISBN
978-981-10-3874-7
Print ISBN
978-981-10-3873-0
DOI
https://doi.org/10.1007/978-981-10-3874-7