
2014 | Book

Machine Learning and Cybernetics

13th International Conference, Lanzhou, China, July 13-16, 2014. Proceedings

Edited by: Xizhao Wang, Witold Pedrycz, Patrick Chan, Qiang He

Publisher: Springer Berlin Heidelberg

Book series: Communications in Computer and Information Science

About this book

This book constitutes the refereed proceedings of the 13th International Conference on Machine Learning and Cybernetics, Lanzhou, China, in July 2014. The 45 revised full papers presented were carefully reviewed and selected from 421 submissions. The papers are organized in topical sections on classification and semi-supervised learning; clustering and kernel; application to recognition; sampling and big data; application to detection; decision tree learning; learning and adaptation; similarity and decision making; learning with uncertainty; improved learning algorithms and applications.

Table of contents

Frontmatter

Classification and Semi-Supervised Learning

Frontmatter
Combining Classifiers Based on Gaussian Mixture Model Approach to Ensemble Data

Combining multiple classifiers to achieve better performance than any single classifier is one of the most important research areas in machine learning. In this paper, we focus on combining different classifiers to form an effective ensemble system. By introducing a novel framework that operates on the outputs of different classifiers, we aim to build a powerful model that is competitive with other well-known combining algorithms such as Decision Template, Multiple Response Linear Regression (MLR), SCANN and fixed combining rules. Our approach differs from traditional approaches in that we use a Gaussian Mixture Model (GMM) to model the distribution of Level1 data and predict the label of an observation by maximizing the posterior probability obtained through the Bayes model. We also apply Principal Component Analysis (PCA) to the outputs of the base classifiers to reduce their dimensionality before GMM modeling. Experiments were conducted on 21 datasets from the University of California Irvine (UCI) Machine Learning Repository to demonstrate the benefits of our framework compared with several benchmark algorithms.

Tien Thanh Nguyen, Alan Wee-Chung Liew, Minh Toan Tran, Mai Phuong Nguyen
Sentiment Classification of Chinese Reviews in Different Domain: A Comparative Study

With the rapid development of micro-blogs, blogs and other types of social media, the number of user reviews on social media has increased dramatically. Mining user reviews plays an important role in applications such as product information and public opinion monitoring. Sentiment classification of user reviews is one of the key issues in review mining. A comparative study of sentiment classification results for reviews in different domains, and of the adaptability of sentiment classification methods, is an interesting research topic. This paper classifies user reviews in three different domains using a Support Vector Machine with six kinds of feature weighting methods. Experimental results in the three domains indicate that different domains have their own characteristics and that the selection of a feature weighting method should take the domain characteristics into account.

Qingqing Zhou, Chengzhi Zhang
A Computer-Aided System for Classification of Breast Tumors in Ultrasound Images via Biclustering Learning

The occurrence of breast cancer has increased significantly in the modern world. Therefore, the importance of computer-aided recognition of breast tumors in clinical diagnosis has also increased. This paper proposes a novel computer-aided diagnosis (CAD) method for classifying breast lesions as benign or malignant tumors using the biclustering learning technique. The medical data are graded based on the sonographic Breast Imaging Reporting and Data System (BI-RADS) lexicon. In biclustering learning, the training data are used to find significant grading patterns. The learned grading patterns are then applied to the test data. The k-Nearest Neighbors (k-NN) classifier is used to classify the breast tumors. Experimental results demonstrate that the proposed method effectively classifies breast tumors as benign or malignant. This indicates that it could perform well in real applications.

Qiangzhi Zhang, Huali Chang, Longzhong Liu, Anhua Li, Qinghua Huang
An Improved Approach to Ordinal Classification

A simple ordinal classification approach (SOCA) has been proposed by Frank and Hall. SOCA is a general method: any classification algorithm, such as C4.5, k nearest neighbors (KNN) or the extreme learning machine (ELM), can be used within it. We find that SOCA uses only the ordering information of the decision attribute to classify objects and does not consider the ordering information of the conditional attributes. Furthermore, we find experimentally that the ordering information of the conditional attributes can also improve the generalization ability of the classification method. In this paper, we propose an improved ordinal classification methodology that employs the ordering information of both the condition and decision attributes. In addition, we analyze how sensitive the performance of SOCA is to the underlying classification algorithm, for instance C4.5, KNN and ELM. A number of experiments are conducted, and the experimental results show that the proposed method is feasible and effective.
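The decomposition behind Frank and Hall's approach can be sketched in a few lines. The sketch below assumes that k-1 binary classifiers (any base learner) have already produced estimates of P(y > V_i) for the ordered class values V_1 < ... < V_k; the function names are illustrative and not taken from the paper, and the improved method proposed in the abstract is not reproduced here.

```python
def ordinal_class_probs(p_greater):
    """Convert estimates of P(y > V_i), i = 1..k-1, into P(y = V_i), i = 1..k."""
    k = len(p_greater) + 1
    probs = []
    for i in range(k):
        if i == 0:
            p = 1.0 - p_greater[0]              # P(y = V_1) = 1 - P(y > V_1)
        elif i == k - 1:
            p = p_greater[-1]                   # P(y = V_k) = P(y > V_{k-1})
        else:
            p = p_greater[i - 1] - p_greater[i] # P(y > V_{i-1}) - P(y > V_i)
        # Independently trained binary models can be inconsistent,
        # so negative differences are clipped to zero (an assumption of this sketch).
        probs.append(max(p, 0.0))
    return probs


def predict_rank(p_greater):
    """Return the index (0-based) of the most probable ordinal class."""
    probs = ordinal_class_probs(p_greater)
    return max(range(len(probs)), key=probs.__getitem__)
```

For example, with four ordered classes and binary outputs `[0.9, 0.6, 0.1]`, the class probabilities are approximately 0.1, 0.3, 0.5 and 0.1, so the third class is predicted.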

Donghui Wang, Junhai Zhai, Hong Zhu, Xizhao Wang
Classification Based on Lower Integral and Extreme Learning Machine

It is known that the non-linear integral has been generally used as an aggregation operator in classification problems, because it represents the potential interaction of a group of attributes. The lower integral is a type of non-linear integral with respect to non-additive set functions, which represents the minimum potential of efficiency for a group of attributes with interaction. Through solving a linear programming problem, the value of lower integral could be calculated. When we consider the lower integral as a classifier, the difficult step is the learning of the non-additive set function, which is used in lower integral. Then, the Extreme Learning Machine technique is applied to solve the problem and the ELM lower integral classifier is proposed in this paper. The implementations and performances of ELM lower integral classifier and single lower integral classifier are compared by experiments with six data sets.

Aixia Chen, Huimin Feng, Zhen Guo
User Input Classification for Chinese Question Answering System

A restricted-domain question answering system gives high-quality answers to questions within its domain, but gives no response or a wrong answer to out-of-domain questions. For ordinary users, the boundary between in-domain and out-of-domain questions is unclear, and users often send out-of-domain inputs to a restricted-domain question answering system. In such cases, both a missing answer and a wrong answer from the system lead to a bad user experience. In this paper, an approach is proposed to solve this response problem. Firstly, it uses a binary classifier to recognize in-domain user inputs and uses the restricted-domain question answering system to provide the correct answer. Secondly, a user input taxonomy for out-of-domain inputs is designed, and a classifier is trained to classify out-of-domain user inputs based on the taxonomy. Finally, different response strategies are designed to respond to the different classes of out-of-domain user inputs. Experiments and actual application on a restricted-domain question answering system show that the proposed approach effectively improves the user experience.

Yongshuai Hou, Xiaolong Wang, Qingcai Chen, Man Li, Cong Tan
Fusion of Classifiers Based on a Novel 2-Stage Model

The paper introduces a novel 2-Stage model for multi-classifier systems. Instead of gathering the posterior probabilities produced by the base classifiers into a single dataset, called meta-data or Level1 data, as in the original 2-Stage model, we separate the data into K Level1 matrices corresponding to the K base classifiers. These data matrices are, in turn, classified in sequence by a new classifier at the second stage to generate the output of that classifier, called Level2 data. Next, a Weight Matrix algorithm is proposed to combine the Level2 data and produce predictions for unlabeled observations. Experimental results on the CLEF2009 medical image database demonstrate the benefit of our model in comparison with several existing ensemble learning models.

Tien Thanh Nguyen, Alan Wee-Chung Liew, Minh Toan Tran, Thi Thu Thuy Nguyen, Mai Phuong Nguyen

Clustering and Kernel

Frontmatter
Comparative Analysis of Density Estimation Based Kernel Regression

The local linear kernel estimator (LLKE) is a typical kernel-type regression method, which is a non-parametric method to estimate the conditional expectation of a random variable and the non-linear mapping from input to output. There are three commonly used LLKEs, i.e., the Nadaraya-Watson kernel estimator, the Priestley-Chao kernel estimator and the Gasser-Müller kernel estimator. Existing studies show that the performance of LLKE mainly depends on the selection of an important parameter, i.e., the bandwidth $h$, when a special kernel function is employed. However, there is no comparative research conducted to study the effectiveness of different kernel functions. In this paper, we compare the performance of the three aforementioned LLKEs based on 6 different kernel functions (i.e., Gaussian, uniform, Epanechnikov, biweight, triweight and cosine kernels) on their estimation error measured by the mean squared error (i.e., $mse$) and stability of method measured by the standard deviation of $mse$ (i.e., $std$). Finally, we give guidelines for the selection of LLKE method and corresponding kernel function in practical applications.
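As an illustration of the kind of estimator compared here, the following is a minimal sketch of the Nadaraya-Watson estimator for one-dimensional inputs with the six kernels named in the abstract. The kernel formulas are the usual textbook definitions; the function and dictionary names are assumptions of this sketch and do not come from the paper.

```python
import math


def _compact(f):
    """Wrap a kernel so that it is zero outside the support |u| <= 1."""
    return lambda u: f(u) if abs(u) <= 1.0 else 0.0


# Textbook kernel functions K(u); each integrates to one.
KERNELS = {
    "gaussian": lambda u: math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi),
    "uniform": _compact(lambda u: 0.5),
    "epanechnikov": _compact(lambda u: 0.75 * (1.0 - u * u)),
    "biweight": _compact(lambda u: (15.0 / 16.0) * (1.0 - u * u) ** 2),
    "triweight": _compact(lambda u: (35.0 / 32.0) * (1.0 - u * u) ** 3),
    "cosine": _compact(lambda u: (math.pi / 4.0) * math.cos(math.pi * u / 2.0)),
}


def nadaraya_watson(x0, xs, ys, h, kernel="gaussian"):
    """Estimate E[y | x = x0] as a kernel-weighted average of the training outputs."""
    k = KERNELS[kernel]
    weights = [k((x0 - x) / h) for x in xs]
    total = sum(weights)
    if total == 0.0:
        # Can happen with compact-support kernels when h is too small.
        raise ValueError("no training point inside the kernel support; increase h")
    return sum(w * y for w, y in zip(weights, ys)) / total
```

The bandwidth `h` controls the amount of smoothing: a larger `h` averages over more training points, while a compact-support kernel with a very small `h` may leave no point inside the window, which is why the sketch raises an error in that case.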

Junying Chen, Yulin He
Thermal Power Units’ Energy Consuming Speciality Analysis Based on Support Vector Regression (SVR)

The thermal system of a large coal-fired power unit has multiple boundaries and nonlinear, time-varying behavior, and the relationships between operating parameters and energy consumption are complex; these characteristics affect the operating precision of thermal power units. Based on rigorous theoretical analysis, key operating parameters are identified and used to determine the standard coal consumption rate. On this basis, features are extracted and used as the inputs to SVR for training and testing. The resulting energy consumption distribution model for large coal-fired power units under full operating conditions achieves high precision.

Ming Zhao, Zhengbo Yan, Liukun Zhou
Bandwidth Selection for Nadaraya-Watson Kernel Estimator Using Cross-Validation Based on Different Penalty Functions

The traditional cross-validation usually selects an over-smoothing bandwidth for kernel regression. The penalty function based cross-validation methods (e.g., generalized cross-validation ($\mathrm{CV}_{\mathrm{GCV}}$), Shibata's model selector ($\mathrm{CV}_{\mathrm{S}}$), Akaike's information criterion ($\mathrm{CV}_{\mathrm{AIC}}$) and Akaike's finite prediction error ($\mathrm{CV}_{\mathrm{FPE}}$)) are introduced to relieve the problem of selecting an over-smoothing bandwidth parameter by the traditional cross-validation for kernel regression problems. In this paper, we investigate the influence of these four different penalty functions on the cross-validation based bandwidth selection in the framework of a typical kernel regression method, i.e., the Nadaraya-Watson kernel estimator (NWKE). Firstly, we discuss the mathematical properties of these four penalty functions. Then, experiments are given to compare the performance of the aforementioned cross-validation methods. Finally, we give guidelines for the selection of different penalty functions in practical applications.
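A rough sketch of penalty-based bandwidth selection for a Gaussian-kernel Nadaraya-Watson estimator is given below. The abstract does not state the penalty formulas, so the four forms used here are the ones commonly given in textbook treatments of kernel smoothing, and the penalty argument is taken to be u = tr(H)/n, where H is the smoother matrix. Both choices are assumptions of this sketch and may differ from the paper's exact definitions.

```python
import math

# Commonly cited penalty functions (assumed here, not quoted from the paper).
PENALTIES = {
    "gcv": lambda u: (1.0 - u) ** -2,          # generalized cross-validation
    "shibata": lambda u: 1.0 + 2.0 * u,        # Shibata's model selector
    "aic": lambda u: math.exp(2.0 * u),        # Akaike's information criterion
    "fpe": lambda u: (1.0 + u) / (1.0 - u),    # Akaike's finite prediction error
}


def gauss(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def penalized_score(xs, ys, h, penalty="gcv"):
    """Mean squared residual of the NW fit, inflated by the chosen penalty."""
    n = len(xs)
    rss, trace = 0.0, 0.0
    for xi, yi in zip(xs, ys):
        weights = [gauss((xi - xj) / h) for xj in xs]
        total = sum(weights)
        fit = sum(w * y for w, y in zip(weights, ys)) / total
        rss += (yi - fit) ** 2
        trace += gauss(0.0) / total  # diagonal entry of the smoother matrix
    # u approaches 1 as h shrinks, where the GCV and FPE penalties diverge.
    return (rss / n) * PENALTIES[penalty](trace / n)


def select_bandwidth(xs, ys, candidates, penalty="gcv"):
    """Pick the candidate bandwidth with the smallest penalized score."""
    return min(candidates, key=lambda h: penalized_score(xs, ys, h, penalty))
```

All four penalties equal 1 at u = 0 and grow with u, so they charge small bandwidths (which fit the training data closely) more heavily than large ones; they differ in how steeply they grow.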

Yumin Zhang
A Hough Transform-Based Biclustering Algorithm for Gene Expression Data

In pattern classification, when the feature space is of high dimensionality or patterns are “similar” on a subset of features only, the traditional clustering methods do not show good performance. Biclustering is a class of methods that simultaneously carry out grouping on two dimensions and has many applications to different fields, especially gene expression data analysis. Because of simultaneous classification on both rows and columns of a data matrix, the biclustering problem is inherently intractable and computationally complex. One of the most complex models in biclustering problem is linear coherent model. Several biclustering algorithms based on this model have been proposed in recent years. However, none of them is able to perfectly recognize all linear patterns in a bicluster. In this work, we propose a novel algorithm based on Hough transform that can find all linear coherent patterns. In the sequel we apply it to gene expression data.

Cuong To, Tien Thanh Nguyen, Alan Wee-Chung Liew
An Effective Biclustering Algorithm for Time-Series Gene Expression Data

Biclustering is a useful tool for analyzing massive gene expression data: it performs simultaneous clustering on the rows and columns of the data matrix to find subsets of coherently expressed genes and conditions. In the analysis of time-series gene expression data in particular, it is meaningful to restrict biclusters to contiguous time points with coherent evolutions. In this paper, the BCCC-Bicluster is proposed as an extension of the CCC-Bicluster. An algorithm based on frequent sequential mining is proposed to find all maximal BCCC-Biclusters. The newly defined Frequent-Infrequent Tree-Array (FITA) is constructed to speed up the traversal process, with strategies derived from the Apriori property to avoid redundant search. To make the algorithm more efficient, the bitwise XOR operation is applied to capture identical or opposite contiguous patterns between two rows. The algorithm is tested on yeast microarray data. Experimental results show that the proposed algorithm is able to find all embedded BCCC-Biclusters, which are shown to reveal significant GO terms involved in biological processes.
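The XOR idea can be illustrated with a small sketch. The abstract does not describe how expression values are encoded, so the encoding below (one bit per pair of consecutive time points, 1 for an increase and 0 otherwise) and the function names are hypothetical choices made only for illustration.

```python
def discretize(values):
    """Encode consecutive changes as bits: 1 for an increase, 0 otherwise.

    Returns the bit pattern as an integer together with its length in bits.
    """
    bits = 0
    for prev, cur in zip(values, values[1:]):
        bits = (bits << 1) | (1 if cur > prev else 0)
    return bits, len(values) - 1


def compare_window(bits_a, bits_b, n_bits, start, length):
    """Compare `length` contiguous changes, starting at change index `start`."""
    shift = n_bits - start - length
    mask = (1 << length) - 1
    diff = ((bits_a ^ bits_b) >> shift) & mask
    if diff == 0:
        return "identical"   # XOR is all zeros: same pattern in the window
    if diff == mask:
        return "opposite"    # XOR is all ones: exactly inverted pattern
    return "mixed"
```

A single XOR and mask therefore decides, for any contiguous window of time points, whether two rows rise and fall together or in exact opposition, without comparing the positions one by one.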

Huixin Xu, Yun Xue, Zhihao Lu, Xiaohui Hu, Hongya Zhao, Zhengling Liao, Tiechen Li
Multiple Orthogonal K-means Hashing

Hashing methods are efficient in dealing with large-scale image retrieval problems. Current hashing methods, such as orthogonal k-means, that use a coordinate descent algorithm to minimize the quantization error usually yield unstable performance, because the coordinate descent algorithm only provides a locally optimal solution. Orthogonal k-means develops a new model with a compositional parameterization of cluster centers to represent multiple centers efficiently. Its objective is to minimize the quantization error by using the coordinate descent algorithm to find the optimal rotation, scaling and translation of the descriptor vectors of images. The performance of orthogonal k-means depends on the initialization of the rotation matrix. In this work, we propose the multiple ok-means hashing method to reduce the performance instability of orthogonal k-means hashing. For large-scale retrieval problems, standard multiple hash table methods using M tables require M times the storage of single hash table schemes. We propose a binary code selection scheme that reduces the storage of the multiple orthogonal k-means to the same size as that of a single table. Experimental results show that the proposed method outperforms ok-means using the same amount of storage.

Ziqian Zeng, Yueming Lv, Wing W. Y. Ng

Application to Recognition

Frontmatter
Recognizing Bangladeshi Currency for Visually Impaired

Visually impaired people often have to face difficulty when they try to identify denominations of bank notes. Currently in Bangladesh, there is no system that can easily detect the monetary value of the note. Pattern recognition systems developed over the years are now fast enough to do image matching in real time. This enables us to develop a system able to analyze an input frame and generate the value of the paper-based currency in order to aid the visually impaired in their day-to-day life. The proposed system can recognize Bangladeshi paper currency notes with 89.4% accuracy on white paper background and with 78.4% accuracy tested on a complex background.

Mohammad M. Rahman, Bruce Poon, M. Ashraful Amin, Hong Yan
Face Recognition Using Genetic Algorithm

Human face recognition has recently become a significant problem in many fields, especially in criminal investigation. To narrow the scope of a search for a suspect, a method is needed that finds the suspect quickly and efficiently. This paper performs human face recognition using a genetic algorithm. The chosen chromosome coding method and the method for selecting a fitness function are presented. Since human faces show various expressions and are photographed from different angles, which adds to the difficulty of recognition, this article uses the face, eyes and mouth as the extracted features, which reduces the influence of adverse factors and increases the recognition rate. These three features are fused to form a new face. In the matching procedure, the similarity calculation is based on the principal components of each feature. The fitness function measures the characteristics of the suspect through the Euclidean distance between the principal components of each facial feature. It yields the fitness value of each chromosome and accomplishes the automatic recognition.

Qin Qing, Eric C. C. Tsang
Face Liveness Detection by Brightness Difference

This paper proposes a method to detect face liveness against video replay attacks. Live persons are distinguished from video replay attacks by analyzing the brightness difference on the face and the background. Photos are taken with and without a flashlight, and the brightness difference of the face is compared with that of the background. A live person and an attack should produce different brightness differences. In the experiments, the accuracy of liveness detection using the proposed model is satisfactory.

Patrick P. K. Chan, Ying Shu

Sampling and Big Data

Frontmatter
User Behavior Research Based on Big Data

In this paper, enterprise user behavior is studied based on big data. By combining cloud computing with the k-means clustering algorithm, we propose a parallel k-means clustering method. The chosen features include the power consumption rate at peak load time, and the load rate and power consumption rate at valley time, among others. The feature weights are calculated with the entropy weight method. The experimental data come from the intelligent industrial park of Gansu province. The enterprise users are classified into two classes, and each type of enterprise has its own electricity usage pattern. In the future, enterprises can optimize their working hours and lower their electricity cost for the same power consumption. This provides strong support for demand-side response in the power grid.
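The entropy weight method mentioned above can be sketched as follows, using its standard formulation: each feature column is normalized to proportions, its Shannon entropy is scaled to [0, 1], and features with lower entropy (more variation across users) receive larger weights. The function name and the input layout (one row per user, one column per feature, non-negative values) are assumptions of this sketch.

```python
import math


def entropy_weights(rows):
    """Return one weight per feature column using the entropy weight method.

    Assumes at least two rows, non-negative values, positive column sums,
    and at least one column that is not constant.
    """
    n = len(rows)
    m = len(rows[0])
    divergences = []
    for j in range(m):
        column = [row[j] for row in rows]
        total = sum(column)
        entropy = 0.0
        for value in column:
            p = value / total
            if p > 0.0:                      # treat 0 * log(0) as 0
                entropy -= p * math.log(p)
        entropy /= math.log(n)               # scale entropy to [0, 1]
        divergences.append(1.0 - entropy)    # degree of diversification
    norm = sum(divergences)
    return [d / norm for d in divergences]
```

A feature that takes the same value for every user has entropy 1 and therefore weight 0, which matches the intuition that it carries no information for separating users into clusters.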

Suxiang Zhang, Suxian Zhang
Stochastic Sensitivity Oversampling Technique for Imbalanced Data

Data-level techniques have proved to be effective in imbalanced learning. SMOTE is a well-known oversampling technique that generates synthetic minority samples by linear interpolation between adjacent minority samples. However, it becomes inefficient for datasets with sparse distributions. In this paper, we propose Stochastic Sensitivity Oversampling (SSO), which generates synthetic samples following Gaussian distributions in the Q-union of minority samples. The Q-union is the union of Q-neighborhoods (hypercubes centered at minority samples), so that new samples are synthesized around minority samples. Experimental results show that the proposed algorithm performs well on most datasets, especially those with a sparse distribution.
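For reference, the SMOTE baseline described above (linear interpolation between a minority sample and one of its nearest minority neighbours) can be sketched as below. This is a simplified illustration of SMOTE only, not of the proposed SSO method, whose sampling details are not given in the abstract; the function name and parameters are assumptions of this sketch.

```python
import random


def smote_sample(minority, k=5, rng=random):
    """Generate one synthetic minority sample by SMOTE-style interpolation.

    `minority` is a list of feature vectors (lists of floats) with at least
    two entries; `rng` is the random module or a random.Random instance.
    """
    x = rng.choice(minority)
    others = [p for p in minority if p is not x]
    # k nearest minority neighbours of x by squared Euclidean distance.
    others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))
    neighbour = rng.choice(others[:k])
    gap = rng.random()  # position along the segment from x to its neighbour
    return [a + gap * (b - a) for a, b in zip(x, neighbour)]
```

Because every synthetic point lies on a segment between two existing minority samples, sparse minority regions yield long segments that may cross majority territory, which is the weakness the abstract attributes to SMOTE on sparse data.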

Tongwen Rong, Huachang Gong, Wing W. Y. Ng

Application to Detection

Frontmatter
A Heterogeneous Graph Model for Social Opinion Detection

Microblogging services, such as Twitter, have become popular platforms for people to share their opinions on a broad range of topics. It is a great challenge to get an overview of important topics by reading all tweets every day. Previous research, such as opinion detection and opinion summarization, has addressed this problem. However, these works mainly focus on the content of the text without taking the quality of short texts and the features of social media into consideration. In this paper, we propose a heterogeneous graph model for detecting users' opinions on microblogs. We first extract the keywords of topics. Then, a three-level microblog graph is constructed by combining user influence, word importance, post significance, and topic periodicity. Microblog posts from different topics are ranked using the random walk algorithm. Experimental results on a real dataset validate the effectiveness of our approach. In comparison with baseline approaches, the proposed method achieves an 8% improvement.

Xiangwen Liao, Yichao Huang, Jingjing Wei, Zhiyong Yu, Guolong Chen
A Storm-Based Real-Time Micro-Blogging Burst Event Detection System

Micro-blogging is becoming an important information source for breaking news events. Since micro-blogs form a real-time, unbounded stream with complex relationships, traditional burst event detection techniques do not work well. This paper presents RBEDS, a real-time burst event detection system built on the Storm distributed stream processing framework. A K-Means clustering approach and a burst feature detection approach are each applied to identify candidate burst events. Their outputs are combined to generate the final event detection results. This operation is implemented as a Storm Topology. The proposed system is evaluated on a large Sina micro-blogging dataset. The achieved system performance shows that RBEDS can detect burst events with good timeliness, effectiveness and scalability.

Yiding Wang, Ruifeng Xu, Bin Liu, Lin Gui, Bin Tang
A Causative Attack Against Semi-supervised Learning

Semi-supervised learning plays an important role in pattern classification, as it learns not only from labeled samples but also from unlabeled samples. It saves the cost and time of sample labeling. Recently, semi-supervised learning has been applied in many security applications. An adversary may be present in these applications to confuse the learning process. In this paper, we investigate the influence of adversarial attacks on semi-supervised learning. We propose a causative attack, which injects attack samples into the training set to mislead the training of the semi-supervised learner. The experimental results show that the accuracy of a classifier trained by semi-supervised learning drops significantly after being attacked by our proposed model.

Yujiao Li, Daniel S. Yeung

Decision Tree Learning

Frontmatter
Study and Improvement of Ordinal Decision Trees Based on Rank Entropy

Decision trees are among the most commonly used machine learning methods, and the ordinal decision tree is an important tool for ordinal classification problems. Our analysis of ordinal decision trees based on rank entropy shows that, when selecting the expanded attribute during tree construction, the rank mutual information must be computed for every cut of each continuous-valued attribute. These values of rank mutual information are then compared to find the maximum, which corresponds to the expanded attribute. Because the computational complexity of this procedure is high, an improved algorithm based on a mathematical model is proposed. We prove theoretically that the improved algorithm only needs to traverse the unstable cut-points, without computing the values at stable cut-points. Therefore, the computational efficiency of constructing decision trees is greatly improved. Experiments also confirm that the improved algorithm greatly reduces the computational time.

Jiankai Chen, Junhai Zhai, Xizhao Wang
Extended Space Decision Tree

An extension of the attribute space of a dataset typically increases the prediction accuracy of a decision tree built for this dataset. Often attribute space is extended by randomly combining two or more attributes. In this paper, we propose a novel approach for the space extension where we only choose the combined attributes that have high classification capacity. We expect the inclusion of these attributes in the attribute space increases the prediction capacity of the trees built from the datasets with the extended space. We conduct experiments on five datasets coming from the UCI machine learning repository. Our experimental results indicate that the proposed space extension leads to the tree of higher accuracy than the case where original attribute space is used. Moreover, the experimental results demonstrate a clear superiority of the proposed technique over an existing space extension technique.

Md. Nasim Adnan, Md. Zahidul Islam, Paul W. H. Kwan
Monotonic Decision Tree for Interval Valued Data

Traditional decision tree algorithms for interval-valued data can only deal with non-ordinal classification problems. In this paper, we present an algorithm for ordinal classification problems in which both the interval-valued condition attributes and the decision attribute meet the monotonicity requirement. The algorithm uses the rank mutual information to select extended attributes, which guarantees that the resulting decision tree is monotonic. The proposed algorithm is illustrated by a numerical example, in which a monotonically consistent decision tree is generated. The design of the algorithm provides useful guidelines for extending real-valued to interval-valued attributes in ordinal decision tree induction.

Hong Zhu, Junhai Zhai, Shanshan Wang, Xizhao Wang
Parallel Ordinal Decision Tree Algorithm and Its Implementation in Framework of MapReduce

The ordinal decision tree (ODT) can effectively deal with monotonic classification problems. However, it is difficult for existing ordinal decision tree algorithms to learn an ODT from large data sets. To address the problem of generating an ODT from large datasets, this paper presents a parallel processing mechanism in the framework of MapReduce. As in general ordinal decision tree algorithms, the rank mutual information (RMI) is used to select the extended attributes. Unlike the calculation of RMI in previous algorithms, this paper applies a strategy of attribute parallelization to calculate the RMI. Experiments on large, artificially generated ordered data sets confirm that the proposed algorithm is feasible. Experimental results show that our algorithm is effective and efficient in three respects: speed-up, scale-up and size-up.

Shanshan Wang, Junhai Zhai, Hong Zhu, Xizhao Wang

Learning and Adaptation

Frontmatter
An Improved Iterative Closest Point Algorithm for Rigid Point Registration

Iterative Closest Point (ICP) is a popular rigid point set registration method that has been used to align two or more rigid shapes. In order to reduce the computation complexity and improve the flexibility of ICP algorithm, an efficient and robust subset-ICP rigid registration method is proposed in this paper. It searches for the corresponding pairs on subsets of the entire data, which can provide structural information to benefit the registration. Experimental results on 2D and 3D point sets demonstrate the efficiency and robustness of the proposed method.

Junfen Chen, Bahari Belaton
Approaches to Computing Maximal Consistent Block

Maximal consistent block is a technique for rule acquisition in incomplete information systems. It was first proposed by Yee Leung and Deyu Li in 2001. However, the maximal consistent blocks of an incomplete information system must be computed before they are put into use. In this paper, we introduced several approaches for computing maximal consistent block and their characteristics were further investigated. Each approach’s time complexity is provided as well.

Xiangrui Liu, Mingwen Shao
Missing Value Imputation for the Analysis of Incomplete Traffic Accident Data

Road traffic accidents are a major public health concern, resulting in an estimated 1.3 million deaths and 52 million injuries worldwide each year. Developed and developing countries alike suffer from the consequences of the increase in both human and vehicle populations. Therefore, methods to reduce accident severity are of great interest to traffic agencies and the public at large. To analyze traffic accident factors effectively, we need a complete historical traffic accident database without missing data. The road accident fatality rate depends on many factors, and investigating the dependencies between the attributes is a very challenging task because of the many environmental and road accident factors involved. Any missing data in the database could obscure the discovery of important factors and lead to invalid conclusions. For traffic accident datasets to be useful for analysis, they must be preprocessed properly. In this paper, we present a novel method, based on a decision tree and on imputed value sampling guided by a correlation measure, for imputing missing values to improve the quality of traffic accident data. We applied our algorithm to the large, publicly available traffic accident database of the United States (explore.data.gov), which is the largest open federal database in the United States. We compare our algorithm with three existing imputation methods using three evaluation criteria, i.e., mean absolute error, coefficient of determination and root mean square error. Our results indicate that the proposed method performs significantly better than the three existing algorithms.
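The three evaluation criteria can be computed as in the sketch below, which compares imputed values with the held-out true values. The coefficient of determination is taken here as 1 - SS_res / SS_tot; some imputation studies instead use the squared correlation between true and imputed values, so this choice, like the function names, is an assumption of the sketch.

```python
import math


def mean_absolute_error(actual, imputed):
    return sum(abs(a - p) for a, p in zip(actual, imputed)) / len(actual)


def root_mean_square_error(actual, imputed):
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, imputed)) / len(actual))


def coefficient_of_determination(actual, imputed):
    """R^2 = 1 - SS_res / SS_tot; requires that `actual` is not constant."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, imputed))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot
```

Lower mean absolute error and root mean square error indicate better imputation, while a coefficient of determination closer to 1 indicates that the imputed values track the true values more closely.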

Rupam Deb, Alan Wee-chung Liew
Learning Behaviour for Service Personalisation and Adaptation

Context-aware applications within pervasive environments are increasingly being developed as services and deployed in the cloud. As such these services are increasingly required to be adaptive to individual users to meet their specific needs or to reflect the changes of their behavior. To address this emerging challenge this paper introduces a service-oriented personalisation framework for service personalisation with special emphasis being placed on behavior learning for user model and service function adaptation. The paper describes the system architecture and the underlying methods and technologies including modelling and reasoning, behavior analysis and a personalisation mechanism. The approach has been implemented in a service-oriented prototype system, and evaluated in a typical scenario of providing personalised travel assistance for the elderly using the help-on-demand services deployed on smartphone.

Liming Chen, Kerry Skillen, William Burns, Susan Quinn, Joseph Rafferty, Chris Nugent, Mark Donnelly, Ivar Solheim
Extraction of Class Attributes from Online Encyclopedias

Class attributes are important resources in question answering, knowledge base building and semantic retrieval. In this paper, we propose an approach for extracting class attributes from online encyclopedias. This approach combines the tolerance rough set model and semantic relatedness computing. Firstly, the implementation of the tolerance rough set model ensures a high precision of the top-$k$ extracted class attributes, and then the semantic relatedness computing improves the coverage of the top-$k$ extracted class attributes in order to achieve higher accuracy. Finally, experiments on the extracted class attributes show the effectiveness of our approach.

Hongzhi Guo, Qincai Chen, Chunxiao Sun
Selective Ensemble of RBFNNs Based on Improved Negative Correlation Learning

In this paper, a novel selective ensemble method based on improved negative correlation learning is proposed. To make the proposed ensemble strategy more robust against noise, correntropy is used in place of the mean square error (MSE). Moreover, an L1-norm based regularization term on the ensemble weights is incorporated into the objective function of the proposed ensemble strategy to achieve selective ensemble. The half-quadratic optimization technique and the surrogate function method are used to solve the resulting optimization problem. Experimental results on two synthetic data sets and five benchmark data sets demonstrate that the proposed method is superior to the single radial basis function neural network (RBFNN).

Hongjie Xing, Lifei Liu, Sen Li
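
To illustrate why correntropy is more robust to noise than MSE, the sketch below compares the two on a set of residuals with and without a single outlier. It assumes the usual empirical correntropy estimate with a Gaussian kernel and a hypothetical kernel width `sigma`; the exact objective used in the paper is not given in the abstract.

```python
import math

def mse(errors):
    # Mean square error: grows quadratically with each residual.
    return sum(e * e for e in errors) / len(errors)

def correntropy(errors, sigma=1.0):
    # Empirical correntropy with a Gaussian kernel (assumed form):
    # each residual contributes a value in (0, 1], so a single
    # outlier can change the average by at most 1 / len(errors).
    total = sum(math.exp(-(e * e) / (2.0 * sigma ** 2)) for e in errors)
    return total / len(errors)

clean = [0.1, -0.2, 0.05, 0.0]   # small residuals
noisy = clean + [50.0]           # same residuals plus one outlier
```

With these values, `mse(noisy)` is several orders of magnitude larger than `mse(clean)`, whereas `correntropy(noisy)` stays within 0.2 of `correntropy(clean)`, because the contribution of each residual is bounded.
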
A Two-Phase RBF-ELM Learning Algorithm

A variant of the extreme learning machine (ELM) named RBF-ELM was proposed by Huang et al. in 2004. RBF-ELM is tailored for radial basis function (RBF) networks. Like ELM, RBF-ELM employs a randomized method to initialize the centers and widths of the RBF kernels and analytically calculates the output weights of the RBF network. In this paper, we propose a two-phase RBF-ELM learning algorithm which randomly initializes only the width parameters. The center parameters are determined by an instance selection method. The first phase of the proposed algorithm selects the centers of the RBF network rather than initializing them randomly. The second phase trains the RBF network with ELM. The experimental results show that the proposed algorithm can improve the testing accuracy compared with RBF-ELM.

Junhai Zhai, Wenxiang Hu, Sufang Zhang

Similarity and Decision Making

Frontmatter
A Study on Decision Making by Thai Software House Companies in Choosing Computer Programming Languages

Choosing good computer programming languages can support software development in software house companies. Managers of these companies have to decide which languages to choose based on various criteria. This study aims to investigate the factors and criteria that influence this decision. The research findings show that the characteristics of programmers, technology and tools, culture and society, and problems occurring in the software development process have a high, statistically significant influence on the choice of computer programming languages. Such information could be incorporated in a decision making system to optimize the choice of language to be adopted.

Vasin Chooprayoon
An Improved Method for Semantic Similarity Calculation Based on Stop-Words

Text similarity calculation has become one of the key issues in many applications such as information retrieval, semantic disambiguation and automatic question answering. There is an increasing need for similarity calculation at different levels, e.g. characters, vocabulary, syntactic structures and semantics. Most existing semantic similarity algorithms can be categorized into statistics-based methods, rule-based methods and combinations of the two. Statistical methods use knowledge bases to incorporate more comprehensive knowledge and are capable of reducing knowledge noise, so they are able to obtain better performance. Nevertheless, because of the unbalanced distribution of different items in the knowledge base, semantic similarity calculation performance for low-frequency words is usually poor. In this work, based on the distributions of stop-words, we propose a weight normalization method for semantic dimensions. The proposed method uses the semantic independence of stop-words to avoid the semantic bias of the corpus in statistical methods, which further improves the accuracy of semantic similarity computation. Experimental comparisons with several existing algorithms show the effectiveness of the proposed method.

Haodi Li, Qingcai Chen, Xiaolong Wang

Learning with Uncertainty

Frontmatter
Sensitivity Analysis of Radial-Basis Function Neural Network due to the Errors of the I.I.D Input

An important issue in the design and implementation of a Radial-Basis Function Neural Network (RBFNN) is the sensitivity of its output to input perturbations. Based on the central limit theorem, this paper proposes a method to compute the sensitivity of the RBFNN to errors in the inputs of the network. For simplicity and practicality, all inputs are assumed to be independent and identically distributed (i.i.d.) with a uniform distribution on the interval (a, b). A number of simulations are conducted, and the good agreement between the experimental and theoretical results verifies the reliability and feasibility of the proposed method. With this method, we establish not only the relationship among the sensitivity of the RBFNN, the input error ratios and the number of neurons in the input layer, but also the relationship among the sensitivity, the input error ratios and the number of neurons in the hidden layer.

Jie Li, Jun Li, Ying Liu
Fuzzy If-Then Rules Classifier on Ensemble Data

This paper introduces a novel framework that uses fuzzy IF-THEN rules in an ensemble system. Our model tackles several drawbacks. First, IF-THEN rule approaches have problems with high-dimensional data since their computational cost is exponential. In our framework, rules operate on the outputs of base classifiers, which frequently have lower dimensionality than the original data. Moreover, the outputs of base classifiers are scaled within the range [0, 1], so it is convenient to apply fuzzy rules directly instead of requiring data transformation and normalization before generating fuzzy rules. The performance of this model was evaluated through experiments on 6 commonly used datasets from the UCI Machine Learning Repository and compared with several state-of-the-art classifier combining algorithms and fuzzy IF-THEN rule approaches. The results show that our framework can improve the classification accuracy.

Tien Thanh Nguyen, Alan Wee-Chung Liew, Cuong To, Xuan Cuong Pham, Mai Phuong Nguyen
Image Segmentation Based on Graph-Cut Models and Probabilistic Graphical Models: A Comparative Study

Image segmentation has been one of the most important unsolved problems in computer vision for many years. Recently, there has been great effort in producing better segmentation algorithms. The purpose of this paper is to introduce two graph-based segmentation methods, namely, graph-cut models (deterministic) and a unified graphical model (probabilistic). We present some foreground/background segmentation results to illustrate the performance of the algorithms on images with complex background scenes.

Maedeh Beheshti, Alan Wee-Chung Liew
Tolerance Rough Fuzzy Approximation Operators and Their Properties

In the framework of classification, the rough fuzzy set (RFS) deals with fuzzy decision tables with discrete conditional attributes and a fuzzy decision attribute. However, in many applications, the conditional attributes are real-valued. To deal with this problem, this paper extends the RFS model to the tolerance RFS. The definitions of the tolerance rough fuzzy set approximation operators are given, and their properties are investigated.

Yao Zhang, Junhai Zhai, Sufang Zhang
Extreme Learning Machine for Interval-Valued Data

The extreme learning machine (ELM) is a fast learning algorithm for single hidden layer feed-forward neural networks, but it can only deal with data sets with numerical attributes. Interval-valued data is considered a direct attempt to extend precise real-valued data to imprecise scenarios. To deal with imprecise data, this paper proposes three ELM models for interval-valued data. As in previous works, the first model uses the mid-point and range of the interval as variables. The second model uses the endpoints as variables and produces better performance than the first model. The third model, a constrained ELM for interval-valued data, is built to guarantee that the left bound of an interval is always smaller than its right bound. Three different standards are used to test the effectiveness of the three models, and experimental results show that the latter two models offer better performance than the first.

Shixin Zhao, Xizhao Wang
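
The two interval representations mentioned in the abstract above (mid-point and range versus endpoints) can be sketched as follows. The function names are hypothetical, and the clamping step is only one simple way to keep the left bound from exceeding the right bound; it is not the constrained ELM model proposed in the paper.

```python
def to_mid_range(lower, upper):
    # Endpoint representation -> (mid-point, range) representation.
    return (lower + upper) / 2.0, upper - lower

def from_mid_range(mid, width):
    # (mid-point, range) -> endpoints. A predicted range may come out
    # negative; clamping it at zero (an assumption of this sketch)
    # guarantees that the left bound never exceeds the right bound.
    half = max(width, 0.0) / 2.0
    return mid - half, mid + half
```

For example, the interval [2, 6] has mid-point 4 and range 4, and converting back recovers the original endpoints.
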
Credibility Estimation of Stock Comments Based on Publisher and Information Uncertainty Evaluation

Recently, the number of stock-related comments shared on the Internet has been increasing rapidly. However, the quality of these comments varies widely. This paper presents an automatic approach to identifying high-quality stock comments by estimating the credibility of the comments from two aspects. Firstly, the credibility of the information source is evaluated by estimating its historical credibility and industry-related credibility using a linear regression model. Secondly, the credibility of the comment information is estimated by calculating the uncertainty of the comment content using a matching method based on an uncertainty glossary. The final stock comment credibility is obtained by combining these two credibility measures. Experiments on a real stock comment dataset show that the proposed approach identifies high-quality stock comments and institutions/individuals effectively.

Qiaoyun Qiu, Ruifeng Xu, Bin Liu, Lin Gui, Yu Zhou
A Fast Algorithm to Building a Fuzzy Rough Classifier

In this paper, by strict mathematical reasoning, we discover the relation between the similarity relation and the lower approximation. Based on this relation, we design a fast algorithm to build a rule-based fuzzy rough classifier. Finally, numerical experiments demonstrate the efficiency and effectiveness of the proposed algorithm.

Eric C. C. Tsang, Suyun Zhao

Improved Learning Algorithms and Applications

Frontmatter
Surface Electromyography Time Series Analysis for Regaining the Intuitive Grasping Capability After Thumb Amputation

The thumb enables most of the hand’s functions such as grasping, gripping and pinching. Therefore, amputation of the thumb results in many difficulties in object manipulation. In this paper, we present an experimental procedure for manipulating an artificial finger to regain the intuitive grasping capability after thumb amputation. We demonstrate a proportional surface electromyography (s-EMG) classifier, which can be used to obtain three key factors for grasping: the motor command from the user’s nervous system, the corresponding angle of rotation and the appropriate torque. The system was tested with both amputated and non-amputated subjects. Based on experiments, we offer evidence that our strategy can be used to intuitively manipulate a prosthetic finger in real time. The system provides dynamic, smooth and anthropomorphic manipulation of prosthetic fingers with a low training time of around 3-5 seconds and a fast response time of around 0.5 seconds.

Chithrangi Kaushalya Kumarasinghe, D. K. Withanage
Study on Orthogonal Basis NN-Based Storage Modelling for Lake Hume of Upper Murray River, Australia

The Murray-Darling Basin is Australia’s most iconic and largest catchment. It is also one of the largest river systems in the world and one of the driest. Hydrological modelling plays an important role in managing the sustainable use of the Basin’s water. The main models in use are mathematically represented models, which have difficulty capturing the full relationship between rainfall runoff, flow routing, upstream storage, evaporation and other water losses. Hume Reservoir is the main supply storage and one of the two major headwater storages for the River Murray system. It is crucial in managing flows and securing water supplies along the entire River Murray System, including Adelaide. In this paper, two Orthogonal Basis NN-Based storage models for Hume Reservoir are developed using flow data from upstream gauge stations. One considers only flow data from upstream gauge stations; the other considers both upstream flow data and rainfall. The Neural Network (NN) learning algorithm is based on Ying Li’s previous research outcome. The modelling results show that the approach has high accuracy, good adaptability and extensive applicability.

Ying Li, Yan Li, Xiaofen Wang
De Novo Gene Expression Analysis to Assess the Therapeutic and Toxic Effect of Tachyplesin I on Human Glioblastoma Cell Lines

Tachyplesin I (TP-I) is an antimicrobial peptide isolated from the hemocytes of the horseshoe crab. A series of biochemical analyses has been performed to gain insight into the mechanism of its strong antimicrobial and anticancer activity. In this study, we employ microarray technology to identify gene groups co-regulated by TP-I in human glioma cell lines. The cell lines are divided into 3 phenotypes treated with different doses of TP-I: 1 µg/ml, 4 µg/ml and a blank group. The differentially expressed genes are then identified by paired comparison of the phenotypes. Considering the consistency within the replicated samples, only 2572 differential genes are used for the biclustering analysis. Unlike standard clustering, biclustering algorithms cluster along both the row and column dimensions of the data matrix. The detected local patterns may provide clues about the biological processes associated with different physiological states. Using the expression data matrix of significant genes across 9 samples, we apply the geometrical biclustering algorithm to find significant co-expressed genes within every phenotype. Further GO analysis of the co-expressed genes is performed to infer the therapeutic and toxic effects of TP-I on human glioma cell lines at the genome level. Some biological processes are of interest. For example, the process related to actin is significantly enriched in glioblastoma without TP-I treatment, whereas genes involved in defense against viruses appear with TP-I treatment. With increasing doses of TP-I, some toxic effects, such as a defensive response to other organisms, are shown. Our findings provide an alternative choice in clinical pharmacy for treating glioma with TP-I.

Hongya Zhao, Hong Ding, Gang Jin
An Improved Reject on Negative Impact Defense

A causative attack, in which the training samples are manipulated in order to mislead the learning of a classifier, is a common scenario in adversarial learning. One countermeasure is data sanitization, which removes suspected attack or noisy samples before training. Data sanitization methods can be categorized into classifier-independent and classifier-dependent methods. Classifier-independent methods measure the characteristics of the samples, while classifier-dependent methods train classifiers. Although the accuracy of classifier-dependent methods is higher, they are time-consuming in comparison with classifier-independent methods. This paper proposes a data sanitization method using both classifier-dependent and classifier-independent information. In the Reject on Negative Impact (RONI) method, not just one sample but a set of similar samples, identified by the relative neighborhood graph, is considered at a time. The experimental results suggest that the performance of the proposed method is similar to that of RONI but with lower time complexity.

Hongjiang Li, Patrick P. K. Chan
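
The basic, per-sample form of Reject on Negative Impact can be sketched as follows. This is only an illustration of the baseline idea, not the improved method proposed in the paper (which evaluates sets of similar samples found with the relative neighborhood graph). The 1-nearest-neighbor classifier, the function names and the `tolerance` parameter are all assumptions made for this sketch.

```python
def nearest_neighbor_predict(train, x):
    # train is a list of (features, label) pairs; return the label of
    # the training sample closest to x in squared Euclidean distance.
    closest = min(train, key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], x)))
    return closest[1]

def accuracy(train, validation):
    # Fraction of validation samples classified correctly.
    hits = sum(1 for x, y in validation if nearest_neighbor_predict(train, x) == y)
    return hits / len(validation)

def roni_filter(base, candidates, validation, tolerance=0.0):
    # Accept a candidate only if adding it to the base training set
    # does not lower validation accuracy by more than `tolerance`.
    baseline = accuracy(base, validation)
    accepted = []
    for sample in candidates:
        with_sample = accuracy(base + [sample], validation)
        if baseline - with_sample <= tolerance:
            accepted.append(sample)
    return accepted
```

Because every candidate requires re-evaluating a classifier, the cost grows with the number of candidates, which is the time overhead that motivates evaluating groups of similar samples together.
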
Backmatter
Metadata
Title
Machine Learning and Cybernetics
Edited by
Xizhao Wang
Witold Pedrycz
Patrick Chan
Qiang He
Copyright Year
2014
Publisher
Springer Berlin Heidelberg
Electronic ISBN
978-3-662-45652-1
Print ISBN
978-3-662-45651-4
DOI
https://doi.org/10.1007/978-3-662-45652-1
