## About this book

The four-volume set LNCS 9489, LNCS 9490, LNCS 9491, and LNCS 9492 constitutes the proceedings of the 22nd International Conference on Neural Information Processing, ICONIP 2015, held in Istanbul, Turkey, in November 2015.

The 231 full papers presented were carefully reviewed and selected from 375 submissions. The 4 volumes represent topical sections containing articles on Learning Algorithms and Classification Systems; Artificial Intelligence and Neural Networks: Theory, Design, and Applications; Image and Signal Processing; and Intelligent Social Networks.

## Table of Contents

### Texture Classification with Patch Autocorrelation Features

Recently, a novel approach of capturing the autocorrelation of an image termed Patch Autocorrelation Features (PAF) was proposed. The PAF approach was successfully evaluated in a series of handwritten digit recognition experiments on the popular MNIST data set. However, the PAF representation has limited applications, because it is not invariant to affine transformations. In this work, the PAF approach is extended to become invariant to image transformations such as translation and rotation changes. First, several features are extracted from each image patch taken at a regular interval. Based on these features, a vector of similarity values is computed between each pair of patches. Then, the similarity vectors are clustered together such that the spatial offset between the patches of each pair is roughly the same. Finally, the mean and the standard deviation of each similarity value are computed for each group of similarity vectors. These statistics are concatenated in a feature vector called Translation and Rotation Invariant Patch Autocorrelation Features (TRIPAF). The TRIPAF vector essentially records information about the repeating patterns within an image at various spatial offsets. Several texture classification experiments are conducted on the Brodatz data set to evaluate the TRIPAF approach. The empirical results indicate that TRIPAF can improve the performance by up to 10% over a system that uses the same features, but extracts them from entire images. Furthermore, state-of-the-art accuracy rates are obtained when the TRIPAF approach is combined with a scale invariant model, namely a bag of visual words model based on SIFT features.

Radu Tudor Ionescu, Andreea Lavinia Popescu, Dan Popescu
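The four steps named in the abstract (patch features, pairwise similarity vectors, grouping by spatial offset, per-group mean and standard deviation) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the patch features (mean and standard deviation of intensity), the similarity (absolute feature difference), the grouping rule (binning the Euclidean distance between patch centers), and the function name `tripaf_sketch`.

```python
import numpy as np

def tripaf_sketch(image, patch=8, stride=8, n_bins=4):
    """Toy offset-grouped patch statistics (hypothetical, not the authors' code)."""
    h, w = image.shape
    centers, feats = [], []
    # Step 1: extract simple features from patches taken at a regular interval.
    for y in range(0, h - patch + 1, stride):
        for x in range(0, w - patch + 1, stride):
            p = image[y:y + patch, x:x + patch]
            centers.append((y + patch / 2.0, x + patch / 2.0))
            feats.append((p.mean(), p.std()))
    centers = np.asarray(centers)
    feats = np.asarray(feats)

    # Step 2: one similarity vector per pair of patches (assumed: |f_i - f_j|).
    i, j = np.triu_indices(len(feats), k=1)
    sims = np.abs(feats[i] - feats[j])

    # Step 3: group pairs whose spatial offset has roughly the same length.
    dists = np.linalg.norm(centers[i] - centers[j], axis=1)
    edges = np.linspace(0.0, dists.max(), n_bins + 1)
    groups = np.digitize(dists, edges[1:-1])  # values in 0 .. n_bins - 1

    # Step 4: mean and standard deviation of each similarity value per group.
    out = []
    for g in range(n_bins):
        s = sims[groups == g]
        if len(s) == 0:
            out.append(np.zeros(2 * sims.shape[1]))
        else:
            out.append(np.concatenate([s.mean(axis=0), s.std(axis=0)]))
    return np.concatenate(out)
```

Because only the length of the offset between patch centers is used, and the toy patch features ignore orientation, the resulting vector does not change when a square image is rotated by 90 degrees on this grid. That is a much weaker property than the invariance the paper targets, but it shows why grouping by offset helps.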

### Novel Architecture for Cellular Neural Network Suitable for High-Density Integration of Electron Devices-Learning of Multiple Logics

We propose a novel architecture for a cellular neural network suitable for high-density integration of electron devices. A neuron consists of only eight transistors, and a synapse consists of only one transistor. We fabricated a cellular neural network using thin-film devices. In particular, we confirmed that our neural network can learn multiple logics even at a small scale. We think that this result indicates that our proposal has great potential for future electronics using neural networks.

Mutsumi Kimura, Yusuke Fujita, Tomohiro Kasakawa, Tokiyoshi Matsuda

### Analyzing the Impact of Feature Drifts in Streaming Learning

Learning from data streams requires efficient algorithms capable of updating a model as new instances arrive. Data streams are by definition unbounded sequences of data that are possibly non-stationary, i.e. they may undergo changes in data distribution, a phenomenon named concept drift. Concept drifts force streaming learning algorithms to detect and adapt to such changes in order to maintain acceptable accuracy over time. Nonetheless, most works presented in the literature do not account for a specific kind of drift: feature drift. Feature drifts occur whenever the relevance of an arbitrary attribute changes through time, also impacting the concept to be learned. In this paper we (i) verify the occurrence of feature drift in a publicly available dataset, (ii) present a synthetic data stream generator capable of performing feature drifts and (iii) analyze the impact of this type of drift on stream learning algorithms, highlighting that there is room and need for dynamic feature selection strategies for data streams.

Jean Paul Barddal, Heitor Murilo Gomes, Fabrício Enembreck
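To make the notion of a feature drift concrete, the following is a minimal sketch of a synthetic stream in which the attribute that determines the label changes at a fixed time step. It is not the generator presented in the paper; the function name `feature_drift_stream`, the abrupt switch from feature 0 to feature 1, and the 0.5 threshold are all assumptions chosen for illustration.

```python
import numpy as np

def feature_drift_stream(n_instances, n_features=5, drift_at=500, seed=0):
    """Yield (x, y) pairs from a toy stream with one abrupt feature drift.

    Hypothetical setup: the label depends only on feature 0 before
    `drift_at` and only on feature 1 from `drift_at` onwards.
    Requires n_features >= 2.
    """
    rng = np.random.default_rng(seed)
    for t in range(n_instances):
        x = rng.random(n_features)           # uniform attributes in [0, 1)
        relevant = 0 if t < drift_at else 1  # the relevant attribute changes here
        yield x, int(x[relevant] > 0.5)
```

A learner that selected feature 0 once and never revisited the choice would drop to chance-level accuracy after the drift, which is the kind of effect that motivates dynamic feature selection.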

### Non-linear Metric Learning Using Metric Tensor

Manifold-based metric learning methods have become increasingly popular in recent years. In almost all these methods, however, the underlying manifold is approximated by a point cloud, and the metric tensor, which is the most basic concept to describe the manifold, is neglected. In this paper, we propose a non-linear metric learning framework based on the metric tensor. We construct a Riemannian manifold and its metric tensor on the sample space, and replace the Euclidean metric by the learned Riemannian metric. By doing this, the sample space is twisted into a form more suitable for classification, clustering and other applications. The classification and clustering results on several public datasets show that the learned metric is effective and promising.

Liangying Yin, Mingtao Pei
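For orientation, the standard way a metric tensor replaces the Euclidean metric can be written out as follows. This is textbook Riemannian geometry rather than the paper's specific construction; the symbols $$G$$ (a symmetric positive definite matrix field), $$\gamma$$ (a smooth curve) and $$M$$ are notation introduced here.

```latex
\[
L(\gamma) = \int_0^1 \sqrt{\dot{\gamma}(t)^{\top}\, G(\gamma(t))\, \dot{\gamma}(t)}\; dt,
\qquad
d_G(x, y) = \inf_{\gamma(0) = x,\ \gamma(1) = y} L(\gamma)
\]
% Special cases:
%   G(x) = I for all x        =>  d_G(x, y) = \lVert x - y \rVert_2          (Euclidean)
%   G(x) = M constant, M > 0  =>  d_G(x, y) = \sqrt{(x - y)^{\top} M (x - y)} (Mahalanobis)
```

A position-dependent $$G$$ is what makes the learned metric non-linear: the cost of moving a small step depends on where in the sample space the step is taken.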

### An Optimized Second Order Stochastic Learning Algorithm for Neural Network Training

The performance of a neural network depends critically on its model structure and the corresponding learning algorithm. This paper proposes bounded stochastic diagonal Levenberg-Marquardt (B-SDLM), an improved second order stochastic learning algorithm for supervised neural network training. The algorithm consists of a single hyperparameter only and requires negligible additional computations compared to conventional stochastic gradient descent (SGD) method while ensuring better learning stability. The experiments have shown very fast convergence and better generalization ability achieved by our proposed algorithm, outperforming several other learning algorithms.

Mohamed Khalil-Hani, Shan Sung Liew, Rabia Bakhteri

### Max-Pooling Dropout for Regularization of Convolutional Neural Networks

Recently, dropout has seen increasing use in deep learning. For deep convolutional neural networks, dropout is known to work well in fully-connected layers. However, its effect in pooling layers is still not clear. This paper demonstrates that max-pooling dropout is equivalent to randomly picking activation based on a multinomial distribution at training time. In light of this insight, we advocate employing our proposed probabilistic weighted pooling, instead of commonly used max-pooling, to act as model averaging at test time. Empirical evidence validates the superiority of probabilistic weighted pooling. We also compare max-pooling dropout and stochastic pooling, both of which introduce stochasticity based on multinomial distributions at pooling stage.

Haibing Wu, Xiaodong Gu
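The multinomial view can be checked on a small example. The sketch below assumes non-negative, distinct activations (for instance after a ReLU) and dropout applied independently to each unit in a pooling region with retain probability `retain_prob`. Under those assumptions the i-th smallest of n activations is the pooled output exactly when it is kept and all larger ones are dropped. The function names are hypothetical, and reading "probabilistic weighted pooling" as the probability-weighted sum of activations is an interpretation of the abstract, not a quotation of the paper.

```python
import numpy as np

def maxpool_dropout_probs(n, retain_prob):
    """Selection probabilities for a pooling region of n units.

    Index 0: every unit is dropped, so the output is 0.
    Index i (1..n): the i-th smallest activation is the output, which
    needs that unit kept and the n - i larger units dropped.
    """
    q = 1.0 - retain_prob
    probs = np.empty(n + 1)
    probs[0] = q ** n
    i = np.arange(1, n + 1)
    probs[1:] = retain_prob * q ** (n - i)
    return probs

def probabilistic_weighted_pooling(activations, retain_prob):
    """Test-time output: activations weighted by their selection probabilities."""
    a = np.sort(np.asarray(activations, dtype=float))  # ascending order
    probs = maxpool_dropout_probs(a.size, retain_prob)
    return float(np.dot(probs[1:], a))
```

For activations 1, 2, 3 and a retain probability of 0.5, the selection probabilities are 0.125, 0.25 and 0.5 (plus 0.125 for an all-zero output), so the weighted pooling value is 2.125, compared with 3 for plain max-pooling.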

### Predicting Box Office Receipts of Movies with Pruned Random Forest

Predicting box office receipts of movies in theatres is a challenging problem that many theatre managers have pondered. In this study, we use pruned random forest to predict the box office of the first week in Chinese theatres one month before movies’ theatrical release. In our model, the prediction problem is converted into a classification problem, where the box office receipt of a movie is discretized into eight categories. Experiments on 68 theatres show that the proposed method outperforms other statistical models. Because our model can predict the expected revenue range of a movie, it can be used as a powerful decision aid by theatre managers.

Zhenyu Guo, Xin Zhang, Yuexian Hou

### A Novel $$\ell^1$$-graph Based Image Classification Algorithm

In the original sparse representation based classification algorithms, each training sample belongs to exactly one class, neglecting the association between the training sample and the other classes. However, features of different classes are often visually similar and correlated (e.g. facial images), which means the association between a training sample and the different classes contains important information and must be taken into consideration. In this paper, we propose a novel $$\ell^1$$-graph based image classification algorithm (LGC). Our algorithm can automatically calculate associations between training samples and all classes, which are then used for subsequent classification. We evaluate our method on some popular visual benchmarks, and the experimental results demonstrate its effectiveness.

Jia-Yue Xu, Shu-Tao Xia

### Classification of Keystroke Patterns for User Identification in a Pressure-Based Typing Biometrics System with Particle Swarm Optimization (PSO)

Classification of users’ keystroke patterns captured from a typing biometrics system is discussed in this paper. Although the user identification system developed here requires users to key in their passwords as they normally would, users are identified only by their keystroke patterns rather than by the actual passwords. The keystroke pattern is represented by the force applied on a numerical keypad, and it is this set of features, extracted from a common password, that is submitted to the classifiers to identify the different users. The typing biometrics system was designed and developed with an 8-bit microcontroller based on the AVR enhanced RISC architecture. The keystroke patterns are classified with particle swarm optimization (PSO), and the results are compared with standard K-Means. The preliminary experimental results show that the identity of users can be authenticated based solely on their keystroke biometric patterns from a numeric keypad.

Weng Kin Lai, Beng Ghee Tan, Ming Siong Soo, Imran Khan

### Discriminative Orthonormal Dictionary Learning for Fast Low-Rank Representation

This paper presents a discriminative orthonormal dictionary learning method for low-rank representation. The orthonormal property is beneficial for the representative power of the dictionary by avoiding the dictionary redundancy. To enhance the discriminative power of the dictionary, all the class-specific dictionaries which are encouraged to well represent the samples from the same class are optimized simultaneously. With the learned discriminative orthonormal dictionary, the low-rank representation problem can be solved much faster than traditional methods. Experiments on three public datasets demonstrate the effectiveness and efficiency of our method.

Zhen Dong, Mingtao Pei, Yunde Jia

### Supervised Topic Classification for Modeling a Hierarchical Conference Structure

In this paper we investigate the problem of supervised latent modeling for extracting topic hierarchies from data. The supervised part is given in the form of expert information over document-topic correspondence. To exploit the expert information we use a regularization term that penalizes the difference between a predicted and an expert-given model. We hence add the regularization term to the log-likelihood function and use a stochastic EM based algorithm for parameter estimation. The proposed method is used to construct a topic hierarchy over the proceedings of the European Conference on Operational Research and helps to automatize the abstract submission system.

Mikhail Kuznetsov, Marianne Clausel, Massih-Reza Amini, Eric Gaussier, Vadim Strijov

### A Framework for Online Inter-subjects Classification in Endogenous Brain-Computer Interfaces

Inter-subjects classification and online adaptation techniques have been actively explored in the brain-computer interfaces (BCIs) research community during the last years. However, few works tried to conceive classification models that take advantage of both techniques. In this paper we propose an online inter-subjects classification framework for endogenous BCIs. Inter-subjects classification is performed using a weighted average ensemble in which base classifiers are learned using data recorded from different subjects and weighted according to their accuracies in classifying brain signals of current BCI user. Online adaptation is performed by updating base classifiers’ weights in a semi-supervised way based on ensemble predictions reinforced by interaction error-related potentials (iErrPs). The effectiveness of our approach is demonstrated using two electroencephalography (EEG) data sets and a previously proposed procedure for simulating interaction error potentials.

Sami Dalhoumi, Gérard Dray, Jacky Montmain, Stéphane Perrey

### A Bayesian Sarsa Learning Algorithm with Bandit-Based Method

We propose an efficient algorithm called Bayesian Sarsa (BS) on the consideration of balancing the tradeoff between exploration and exploitation in reinforcement learning. We adopt probability distributions to estimate Q-values and compute posterior distributions about Q-values by Bayesian Inference. It can improve the accuracy of Q-values function estimation. In the process of algorithm learning, we use a Bandit-based method to solve the exploration/exploitation problem. It chooses actions according to the current mean estimate of Q-values plus an additional reward bonus for state-action pairs that have been observed relatively little. We demonstrate that Bayesian Sarsa performs quite favorably compared to state-of-the-art reinforcement learning approaches.

Shuhua You, Quan Liu, Qiming Fu, Shan Zhong, Fei Zhu
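The action-selection rule described in the abstract (current mean estimate of the Q-value plus a bonus for rarely observed state-action pairs) can be sketched as follows. The abstract does not give the form of the bonus, so the UCB-style term used here, along with the names `select_action` and `bonus_scale`, is an assumption for illustration only. The Bayesian posterior update of the Q-value distributions is not shown.

```python
import math

def select_action(q_mean, counts, bonus_scale=1.0):
    """Pick the action with the largest mean Q estimate plus exploration bonus.

    q_mean[a]: current posterior mean of Q(s, a) for the state at hand.
    counts[a]: how often the pair (s, a) has been observed so far.
    The bonus sqrt(ln(1 + N) / (1 + n_a)) is an assumed UCB-like form.
    """
    total = sum(counts)

    def score(a):
        bonus = math.sqrt(math.log(1 + total) / (1 + counts[a]))
        return q_mean[a] + bonus_scale * bonus

    return max(range(len(q_mean)), key=score)
```

With `q_mean = [1.0, 0.9]` and `counts = [100, 0]`, the unvisited second action receives a much larger bonus and is selected, whereas setting `bonus_scale=0` recovers purely greedy selection of the first action.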

### Incrementally Built Dictionary Learning for Sparse Representation

Extracting sparse representations with Dictionary Learning (DL) methods has led to interesting image and speech recognition results. DL has recently been extended to supervised learning (SDL) by using the dictionary for feature extraction and classification. One challenge with SDL is imposing diversity for extracting more discriminative features. To this end, we propose Incrementally Built Dictionary Learning (IBDL), a supervised multi-dictionary learning approach. Unlike existing methods, IBDL maximizes diversity by optimizing the between-class residual error distance. It can be easily parallelized since it learns the class-specific parameters independently. Moreover, we propose an incremental learning rule that improves the convergence guarantees of stochastic gradient descent under sparsity constraints. We evaluated our approach on benchmark digit and face recognition tasks, and obtained comparable performances to existing sparse representation and DL approaches.

Ludovic Trottier, Brahim Chaib-draa, Philippe Giguère

### Learning to Reconstruct 3D Structure from Object Motion

In this paper, we propose a new approach for reconstructing 3D structure from motion parallax. Instead of obtaining 3D structure from multi-view geometry or factorization, a Deep Neural Network (DNN) based method is proposed without assuming the camera model explicitly. In the proposed method, the targets are first split into connected 3D corners, and then the DNN regressor is trained to estimate the relative 3D structure of each corner from the target rotation. Finally, a temporal integration is performed to further improve the reconstruction accuracy. The effectiveness of the method is proved by a typical experiment of the Kinetic Depth Effect (KDE) in human visual system, in which the DNN regressor reconstructs the structure of a rotating 3D bent wire. The proposed method is also applied to reconstruct another two real targets. Experimental results on both synthetic and real images show that the proposed method is accurate and effective.

Wentao Liu, Haobin Dou, Xihong Wu

### Convolutional Networks Based Edge Detector Learned via Contrast Sensitivity Function

Edge detection extracts rich geometric structures of the image and largely reduces the amount of data to be processed, providing essential input to many visual tasks. Traditional algorithms consist of three steps: smoothing, filtering and locating, in which the filters are usually designed manually and thresholds are selected without strict theoretical support. In this paper, convolutional networks (ConvNets) are trained to detect edges by learning a group of filters and classifiers simultaneously. In addition, the contrast sensitivity function (CSF) from visual psychology is adopted to determine whether an edge is visible to the human visual system (HVS). Edge samples of various appearance are synthesised, and then labelled via CSF for model training. Multi-channel ConvNets are trained to perceive edges of different frequencies and are combined at the end. Compared with classical algorithms, the ConvNets-CSF model is more robust to contrast variation and more biologically plausible. Evaluated on the USF edge detection dataset, it achieves performance comparable to the Canny edge detector and outperforms other classical algorithms.

Haobin Dou, Wentao Liu, Junnan Zhang, Xihong Wu

### Learning Algorithms and Frame Signatures for Video Similarity Ranking

Learning algorithms that harmonize standardized video similarity tools and an integrated system are presented. The learning algorithms extract exemplars reflecting time courses of video frames. There were five types of such clustering methods. Among them, this paper chooses a method called time-partition pairwise nearest-neighbor because of its reduced complexity. On the similarity comparison among videos whose lengths vary, the M-distance that can absorb the difference of the exemplar cardinalities is utilized both for global and local matching. Given the order-aware clustering and the M-distance comparison, system designers can build a basic similar-video retrieval system. This paper promotes further enhancement on the exemplar similarity that matches the video signature tools for the multimedia content description interface by ISO/IEC. This development showed the ability of the similarity ranking together with the detection of plagiarism of video scenes. Precision-recall curves showed a high performance in this experiment.

Teruki Horie, Akihiro Shikano, Hiromichi Iwase, Yasuo Matsuyama

### On Measuring the Complexity of Classification Problems

There has been a growing interest in describing the difficulty of solving a classification problem. This knowledge can be used, among other things, to support more grounded decisions concerning data pre-processing, as well as for the development of new data-driven pattern recognition techniques. Indeed, to estimate the intrinsic complexity of a classification problem, there are a variety of measures that can be extracted from a training data set. This paper presents some of them, performing a theoretical analysis.

Ana Carolina Lorena, Marcilio C. P. de Souto
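As one example of a measure that can be extracted from a training set, the sketch below computes Fisher's discriminant ratio for a two-class problem: the largest per-feature ratio of squared mean difference to summed class variances. The abstract does not say which measures the paper analyzes, so including this one is an assumption, and the function name is hypothetical.

```python
import numpy as np

def fisher_discriminant_ratio(X, y):
    """Maximum over features of (mu_a - mu_b)^2 / (var_a + var_b).

    Larger values suggest that at least one single feature separates
    the two classes well, i.e. an easier problem by this measure.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    if classes.size != 2:
        raise ValueError("this sketch handles exactly two classes")
    a, b = X[y == classes[0]], X[y == classes[1]]
    num = (a.mean(axis=0) - b.mean(axis=0)) ** 2
    den = a.var(axis=0) + b.var(axis=0)
    # Guard against division by zero for constant features.
    return float((num / np.maximum(den, 1e-12)).max())
```

For the four points (0, 0), (1, 0), (10, 1), (11, 0) with labels 0, 0, 1, 1, the first feature gives 100 / 0.5 = 200 and the second gives 0.25 / 0.25 = 1, so the measure is 200.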

### The Effect of Stemming and Stop-Word-Removal on Automatic Text Classification in Turkish Language

Text classification is defined simply as the labeling of natural and unstructured language text documents using predefined categories or classes. This classification not only helps organizations improve their business communication skills and their customer satisfaction levels, but also improves the usage of unstructured data in the academic and non-academic world. The aim of this study is to analyze the effect of stemming, over-sampling, and stopword-removal when doing automatic classification on Turkish content. After obtaining a Turkish corpus, stemming, balancing, and stopword-removal are applied and the results are evaluated.

Mustafa Çağataylı, Erbuğ Çelebi

### Example-Specific Density Based Matching Kernel for Classification of Varying Length Patterns of Speech Using Support Vector Machines

In this paper, we propose example-specific density based matching kernel (ESDMK) for the classification of varying length patterns of long duration speech represented as sets of feature vectors. The proposed kernel is computed between the pair of examples, represented as sets of feature vectors, by matching the estimates of the example-specific densities computed at every feature vector in those two examples. In this work, the number of feature vectors of an example among the K nearest neighbors of a feature vector is considered as an estimate of the example-specific density. The minimum of the estimates of two example-specific densities, one for each example, at a feature vector is considered as the matching score. The ESDMK is then computed as the sum of the matching score computed at every feature vector in a pair of examples. We study the performance of the support vector machine (SVM) based classifiers using the proposed ESDMK for speech emotion recognition and speaker identification tasks and compare the same with that of the SVM-based classifiers using the state-of-the-art kernels for varying length patterns.

Abhijeet Sachdev, A. D. Dileep, Veena Thenkanidiyoor
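The kernel computation described in the abstract can be sketched directly: at every feature vector of the two examples, count how many of its K nearest neighbors come from each example, take the minimum of the two counts as the matching score, and sum the scores. Several details are not given in the abstract and are assumed here: neighbors are searched in the pooled set of both examples, a vector counts as its own nearest neighbor, and Euclidean distance is used. The name `esdmk_sketch` is hypothetical.

```python
import numpy as np

def esdmk_sketch(A, B, k=3):
    """Toy density-matching score between two sets of feature vectors.

    A, B: arrays of shape (n_a, d) and (n_b, d), one row per feature vector.
    """
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    pooled = np.vstack([A, B])
    from_a = np.arange(len(pooled)) < len(A)  # True for rows that came from A

    # Pairwise Euclidean distances within the pooled set.
    d = np.linalg.norm(pooled[:, None, :] - pooled[None, :, :], axis=2)
    # Indices of the k nearest neighbors of every pooled vector (self included).
    nn = np.argsort(d, axis=1, kind="stable")[:, :k]

    count_a = from_a[nn].sum(axis=1)     # density estimate for example A
    count_b = (~from_a)[nn].sum(axis=1)  # density estimate for example B
    # Matching score at each vector is the smaller of the two estimates.
    return int(np.minimum(count_a, count_b).sum())
```

Two examples whose vectors lie far apart score 0, since every neighborhood is dominated by one example, while examples whose vectors interleave score high. The sketch makes no claim about positive semi-definiteness, which a kernel used in an SVM would need.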

### Possibilistic Information Retrieval Model Based on Relevant Annotations and Expanded Classification

The heterogeneity and sheer mass of information found on the web today mean that it must be processed before it can be used. Annotations, like all other information, must be filtered to determine which are relevant. The new concept of “relevant annotation” can then be considered as a new source of evidence. In addition to the vast number of annotations, we notice that annotations generally express brief ideas using a few words that cannot be understood independently of their context. This is why we classify annotations that share the same context and are semantically related into clusters. In this paper, we propose a new model based on clustering for the classification and a probabilistic model for the filtering. In the experiments, we consider the relevant annotation classes as a new source of information able to improve collaborative information retrieval.

Fatiha Naouar, Lobna Hlaoua, Mohamed Nazih Omri

### A Transfer Learning Method with Deep Convolutional Neural Network for Diffuse Lung Disease Classification

We introduce a deep convolutional neural network (DCNN) as a feature extraction method in a computer-aided diagnosis (CAD) system in order to support the diagnosis of diffuse lung diseases (DLD) on high-resolution computed tomography (HRCT) images. A DCNN is a kind of multi-layer neural network that can automatically extract feature representations from the input data; however, it requires a large amount of training data. In the field of medical image analysis, the amount of acquired data is sometimes insufficient to train the learning system. To overcome this problem, we apply a transfer learning method to the training of the DCNN. First, we use a massive number of natural images, which are easy to collect, for pre-training. After that, a small number of DLD HRCT images are used as labeled data for fine-tuning. We compare DCNNs trained with (i) DLD HRCT images only, (ii) natural images only, and (iii) DLD HRCT images + natural images, and show that case (iii) yields better DCNN features than the others.

Hayaru Shouno, Satoshi Suzuki, Shoji Kido

### Evaluation of Machine Learning Algorithms for Automatic Modulation Recognition

Automatic modulation recognition (AMR) is becoming more important because it is usable in advanced general-purpose communication such as cognitive radio as well as in specific applications. Therefore, developments should be made for widely used modulation types, and machine learning techniques should be tried for this problem. In this study, we evaluate the performance of different machine learning algorithms for AMR. Specifically, we propose a nonnegative matrix factorization (NMF) technique, and additionally we evaluate the performance of artificial neural networks (ANN), support vector machines (SVM), random forest tree, k-nearest neighbor (k-NN), Hoeffding tree, logistic regression and Naive Bayes methods to obtain comparative results. These are the most preferred feature extraction methods in the literature and they are used for a set of modulation types for general-purpose communication. We compare their recognition performance using the accuracy metric. Additionally, we prepare and donate the first AMR-related data set to the University of California-Machine Learning Repository.

Muhammed Abdurrahman Hazar, Niyazi Odabaşioğlu, Tolga Ensari, Yusuf Kavurucu

### Probabilistic Prediction in Multiclass Classification Derived for Flexible Text-Prompted Speaker Verification

We have previously presented a method for text-prompted multistep speaker verification using GEBI (Gibbs-distribution based extended Bayesian inference) for reducing single-step verification error. That method uses thresholds for acceptance and rejection, but tuning them is not easy and affects verification performance. To solve the threshold problem, this paper presents a method of probabilistic prediction in multiclass classification for solving the verification problem. We also present loss functions for evaluating the performance of probabilistic prediction. By means of numerical experiments using recorded real speech data, we examine the properties of the present method using GEBI and BI (Bayesian inference) and show the effectiveness and the risk of probability loss in the present method.

Shuichi Kurogi, Shota Sakashita, Satoshi Takeguchi, Takuya Ueki, Kazuya Matsuo

### Simple Feature Quantities for Learning of Dynamic Binary Neural Networks

This paper presents simple feature quantities for learning of dynamic binary neural networks. The teacher signal is a binary periodic orbit corresponding to control signal of switching circuits. The feature quantities characterize generation of spurious memories and stability of the teacher signal. We present a simple greedy search based algorithm where the two feature quantities are used as cost functions. Performing basic numerical experiments, the algorithm efficiency is confirmed.

Ryuji Sato, Toshimichi Saito

### Transfer Metric Learning for Kinship Verification with Locality-Constrained Sparse Features

Kinship verification between aged parents and their children based on facial images is a challenging problem, due to aging factor which makes their facial similarities less distinct. In this paper, we propose to perform kinship verification in a transfer learning manner, which introduces photos of parents in their earlier ages as intermediate references to facilitate the verification. Child-young parent pairs are regarded as source domain and child-old parent ones are considered as target domain. The transfer learning scheme contains two phases. In the transfer metric learning phase, the extracted locality-constrained sparse features of images are projected into an optimized subspace where the intra-class distances are minimized and the inter-class ones are maximized. In the transfer classifier learning phase, a cross domain classifier is learned by a transfer SVM algorithm. Experimental results on UB KinFace dataset indicate that our method outperforms state-of-the-art methods.

Yanli Zhang, Bo Ma, Lianghua Huang, Hongwei Hu

### Unsupervised Land Classification by Self-organizing Map Utilizing the Ensemble Variance Information in Satellite-Borne Polarimetric Synthetic Aperture Radar

Polarimetric satellite-borne synthetic aperture radar is expected to provide land usage information globally and precisely. In this paper, we propose a two-stage unsupervised-learning land state classification system using a self-organizing map (SOM) based on the ensemble variance. We find that the Poincare sphere parameters representing the polarization state of scattered wave have specific features of the land state, in particular, in their dispersion (or ensemble variance). We present two-stage clustering procedure to utilize the dispersion features of the clusters as well as the mean values. Experiments demonstrate its high capability of self-organizing and discovering classification based on the polarimetric scattering features representing the land states.

Yuto Takizawa, Fang Shang, Akira Hirose

### Algorithmic Robustness for Semi-Supervised $$(\epsilon, \gamma, \tau)$$-Good Metric Learning

Using the appropriate metric is crucial for the performance of most machine learning algorithms. For this reason, a lot of effort has been put into distance and similarity learning. However, it is worth noting that this research field lacks theoretical guarantees on the generalization capacity of the classifier associated with a learned metric. The theoretical framework of $$(\epsilon, \gamma, \tau)$$-good similarity functions [1] provides means to relate the properties of a similarity function and those of a linear classifier making use of it. In this paper, we extend this theory to a method where the metric and the separator are jointly learned in a semi-supervised way, a setting that has not been explored before. We furthermore prove the robustness of our algorithm, which allows us to provide a generalization bound for this approach. The behavior of our method is illustrated via some experimental results.

Maria-Irina Nicolae, Marc Sebban, Amaury Habrard, Eric Gaussier, Massih-Reza Amini

### Patchwise Tracking via Spatio-Temporal Constraint-Based Sparse Representation and Multiple-Instance Learning-Based SVM

This paper proposes a patch-based tracking algorithm via a hybrid generative-discriminative appearance model. For establishing the generative appearance model, we present a spatio-temporal constraint-based sparse representation (STSR), which not only exploits the intrinsic relationship among the target candidates and the spatial layout of the patches inside each candidate, but also preserves the temporal similarity in consecutive frames. To construct the discriminative appearance model, we utilize the multiple-instance learning-based support vector machine (MIL&SVM), which is robust to occlusion and alleviates the drifting problem. According to the classification result, the occlusion state can be predicted, and it is further used in the templates updating, making the templates more efficient both for the generative and discriminative model. Finally, we incorporate the hybrid appearance model into a particle filter framework. Experimental results on six challenging sequences demonstrate that our tracker is robust in dealing with occlusion.

Yuxia Wang, Qingjie Zhao

### An Autonomous Mobile Robot with Functions of Action Learning, Memorizing, Recall and Identifying the Environment Using Gaussian Mixture Model

In this paper, we propose a behavior scheme that lets autonomous mobile robots achieve their objectives in their environments. The robots identify the environment in which they are currently placed and learn, memorize and recall behaviors corresponding to each of several different environments. Specifically, each robot identifies the environment using behavioral statistical data for each environment. If the robot has already experienced the environment, it behaves by making use of its own experience data stored in the database; otherwise it performs new behavior learning and adds the learning results to the database.

Masanao Obayashi, Taiki Yamane, Takashi Kuremoto, Shingo Mabu, Kunikazu Kobayashi

### SEIR Immune Strategy for Instance Weighted Naive Bayes Classification

Naive Bayes (NB) has been popularly applied in many classification tasks. However, in real-world applications, the pronounced advantage of NB is often challenged by insufficient training samples. Specifically, high variance may occur when the number of training samples is limited. The estimated class distribution of an NB classifier is inaccurate if the number of training instances is small. To handle this issue, in this paper, we propose a SEIR (Susceptible, Exposed, Infectious and Recovered) immune-strategy-based instance weighting algorithm for naive Bayes classification, namely SWNB. The immune instance weighting allows the SWNB algorithm to adjust itself to the data without explicit specification of functional or distributional forms of the underlying model. Experiments and comparisons on 20 benchmark datasets demonstrated that the proposed SWNB algorithm outperformed the existing state-of-the-art instance weighted NB algorithm and other related computational intelligence methods.

Shan Xue, Jie Lu, Guangquan Zhang, Li Xiong

### Enhancing Competitive Island Cooperative Neuro-Evolution Through Backpropagation for Pattern Classification

Cooperative coevolution is a promising method for training neural networks, where it is also known as cooperative neuro-evolution. Cooperative neuro-evolution has been used for pattern classification, time series prediction and global optimisation problems. In the past, competitive island based cooperative coevolution has been proposed, which employs different instances of problem decomposition methods for competition. Neuro-evolution methods have limitations in terms of training time, although they are known as global search methods. The backpropagation algorithm employs gradient descent, which gives the faster convergence that neuro-evolution needs. Backpropagation suffers from premature convergence, and its combination with neuro-evolution can help eliminate the weaknesses of both approaches. In this paper, we propose a competitive island cooperative neuro-evolutionary method that takes advantage of the strengths of gradient descent and neuro-evolution. We use feedforward neural networks on benchmark pattern classification problems to evaluate the performance of the proposed algorithm. The results show improved performance when compared to related methods.

Gary Wong, Rohitash Chandra

### Email Personalization and User Profiling Using RANSAC Multi Model Response Regression Based Optimized Pruning Extreme Learning Machines and Gradient Boosting Trees

Email personalization is the process of customizing the content and structure of email according to each member’s specific and individual needs, taking advantage of the member’s navigational behavior. Personalization is a refined version of customization, where marketing is automated on the basis of customers’ user profiles rather than on the customer’s own requests. There is a very thin line between customization and personalization, which is achieved by leveraging customer-level information using analytical tools. E-commerce is growing fast, and with this growth companies are willing to spend more on improving the online experience. Thus, in this study, we propose a new architectural design of email personalization and user profiling using gradient boosting trees and optimized pruned extreme learning machines as base estimators. We also conducted an in-depth data analysis to find each member’s behavior and the important attributes that play a significant role in increasing click rates in personalized emails. From the experimental validation, we concluded that our proposed method works much better at predicting customers’ behavior on deals sent in personalized emails than other methods in the past literature.

Lavneet Singh, Girija Chetty

### An Auto-Encoder for Learning Conversation Representation Using LSTM

In this paper, an auto-encoder is proposed to learn conversation representation. First, a long short-term memory (LSTM) neural network is used to encode the sequence of sentences in a conversation. The interactive context is encoded into a fixed-length vector. Then, through the LSTM decoder, the learnt representation is used to reconstruct the sentence vectors of a conversation. To train our model, we construct a corpus of 32,881 conversations from an online shopping platform. Finally, experiments on a topic recognition task demonstrate the effectiveness of the proposed auto-encoder in learning conversation representation, especially when the training data for topic recognition is relatively small.

Xiaoqiang Zhou, Baotian Hu, Qingcai Chen, Xiaolong Wang

### On the Use of Score Ratio with Distance-Based Classifiers in Biometric Signature Recognition

Biometric user verification or authentication is a pattern recognition problem that can be stated as a basic hypothesis test: X is from client C ($$H_0$$) vs. X is not from client C ($$H_1$$), where X is the biometric input sample (face, fingerprint, etc.). When probabilistic classifiers are used (e.g., Hidden Markov Models), the decision is typically made by means of the likelihood ratio $$P(X \mid H_0)/P(X \mid H_1)$$. However, as far as we know, such a ratio is not usually used with distance-based classifiers (e.g., Dynamic Time Warping). Following that idea, we propose here to base the decision not only on the score (“score” being the classifier output) obtained supposing X is from the client ($$H_0$$), but also on the score obtained supposing X is not from the client ($$H_1$$), by means of the ratio between both scores: the score ratio. This work presents a first approach to this proposal, showing that using the score ratio can be an interesting technique to improve distance-based biometric systems. This research has focused on the biometric signature, where several state-of-the-art distance-based systems can be found. Here, the score ratio proposal is tested in three of them, achieving great improvements in the majority of the tests performed. The best verification results have been achieved with the use of the score ratio, improving the best ones without the score ratio by 24% on average.

Carlos Vivaracho-Pascual, Arancha Simon-Hurtado, Esperanza Manso-Martinez
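The score-ratio idea in the abstract above can be illustrated with a minimal sketch. The abstract does not say how the non-client score is obtained, which direction the ratio is taken in, or how the threshold is set, so everything here is an assumption: scores are treated as distances (lower means a better match), and the names `score_ratio`, `verify`, and the threshold values are hypothetical.

```python
def score_ratio(client_score: float, non_client_score: float, eps: float = 1e-12) -> float:
    """Ratio of the distance to the claimed client's model over the distance
    to a non-client model. Assumption: both scores are distances."""
    return client_score / (non_client_score + eps)


def verify(client_score: float, non_client_score: float, threshold: float = 1.0) -> bool:
    """Accept the claimed identity when the ratio falls below the threshold
    (hypothetical decision rule, not taken from the paper)."""
    return score_ratio(client_score, non_client_score) < threshold


# Hypothetical distances, e.g. from a Dynamic Time Warping comparison.
genuine = verify(client_score=12.0, non_client_score=30.0)                  # ratio about 0.4
forgery = verify(client_score=28.0, non_client_score=30.0, threshold=0.8)   # ratio about 0.93
```

A decision based on `client_score` alone would need a threshold tuned to the absolute scale of the distances; dividing by the non-client score normalizes that scale per sample, which is one plausible reading of why the ratio helps.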

### A Multifactor Dimensionality Reduction Based Associative Classification for Detecting SNP Interactions

Identification and characterization of interactions between genes have been increasingly explored in current Genome-wide association studies (GWAS). Several machine learning and data mining approaches have been proposed to identify the multi-locus interactions in higher order genomic data. However, detecting these interactions is challenging due to bio-molecular complexities and computational limitations. In this paper, a multifactor dimensionality reduction based associative classifier is proposed for detecting SNP interactions in genetic epidemiological studies. The approach is evaluated for one to six loci models by varying heritability, minor allele frequency, case-control ratios and sample size. The experimental results demonstrated significant improvements in accuracy for detecting interacting single nucleotide polymorphisms (SNPs) responsible for complex diseases when compared to the previous approaches. Further, the approach was successfully evaluated by using sporadic breast cancer data. The results show interactions among five polymorphisms in three different estrogen-metabolism genes.

Suneetha Uppu, Aneesh Krishna, Raj P. Gopalan

### Distributed Q-learning Controller for a Multi-Intersection Traffic Network

This paper proposes a Q-learning based controller for a network of multiple intersections. Given the increasing traffic congestion in modern cities, an efficient control system is in demand. The proposed controller is designed to adjust the green time of traffic signals with the aim of reducing vehicles’ travel delay time in a multi-intersection network. The designed system is a distributed traffic timing control model that applies an individual controller to each intersection. Each controller manages the congestion at its own intersection while attempting to reduce the travel delay time in the whole traffic network. The experimental results indicate that the developed distributed Q-learning controller performs satisfactorily.

Sahar Araghi, Abbas Khosravi, Douglas Creighton
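As a rough illustration of one controller per intersection, the sketch below implements the standard tabular Q-learning update. The abstract does not define the state, action, or reward, so the ones used here are assumptions: the action is a change in green time in seconds, the reward is a negative delay, and the class name `IntersectionAgent` and the state labels are hypothetical.

```python
import random
from collections import defaultdict

# Hypothetical action set: change in green time, in seconds.
ACTIONS = (-5, 0, 5)


class IntersectionAgent:
    """One tabular Q-learning controller; a distributed system would run one per intersection."""

    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.q = defaultdict(float)  # Q-values keyed by (state, action), default 0.0
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)

    def choose(self, state):
        """Epsilon-greedy action selection."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        """Standard Q-learning update toward reward + gamma * max_a Q(next_state, a)."""
        best_next = max(self.q[(next_state, a)] for a in ACTIONS)
        td_target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (td_target - self.q[(state, action)])


agent = IntersectionAgent()
# Hypothetical transition: extending green by 5 s led to 12 s of average delay.
agent.update(state="high_queue", action=5, reward=-12.0, next_state="low_queue")
```

How neighbouring intersections influence each agent's state or reward is left out, since the abstract gives no detail on that coupling.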

### Learning Rule for Linear Multilayer Feedforward ANN by Boosted Decision Stumps

A novel method for learning a linear multilayer feedforward artificial neural network (ANN) by using ensembles of boosted decision stumps is presented. Network parameters are adapted through a layer-wise iterative traversal of neurons, with the weights of each neuron learned by using a boosting-based ensemble and an appropriate reduction. The performance of several neural network models learned with the proposed method is compared, on a variety of datasets, with that of networks learned using three other algorithms, namely the Perceptron learning rule, the gradient descent backpropagation algorithm, and Boostron learning.

Mirza Mubasher Baig, El-Sayed M. El-Alfy, Mian M. Awais

### Class-Semantic Color-Texture Textons for Vegetation Classification

This paper proposes a new color-texture texton based approach for roadside vegetation classification in natural images. Two individual sets of class-semantic textons are first generated from color and filter bank texture features for each class. The color and texture features of testing pixels are then mapped into one of the generated textons using the nearest distance, resulting in two texton occurrence matrices – one for color and one for texture. The classification is achieved by aggregating color-texture texton occurrences over all pixels in each over-segmented superpixel using a majority voting strategy. Our approach outperforms previous benchmarking approaches and achieves 81% and 74.5% accuracies of classifying seven objects on a cropped region dataset and six objects on an image dataset collected by the Department of Transport and Main Roads, Queensland, Australia.

Ligang Zhang, Brijesh Verma, David Stockwell

### Towards Unsupervised Learning for Arabic Handwritten Recognition Using Deep Architectures

In the field of pattern recognition, and especially in handwriting recognition, deep learning is becoming the new trend in artificial intelligence, given the sheer size of the raw data available nowadays. In this paper, we highlight how deep learning techniques can be effectively applied to recognizing Arabic handwritten script, our field of interest, by investigating two deep architectures: Deep Belief Network (DBN) and Convolutional Neural Networks (CNN). The two proposed architectures take the raw data as input and proceed with a greedy layer-wise unsupervised learning algorithm. The experimental study shows promising results that are comparable or even superior to those of standard classifiers, with the DBN showing an advantage in efficiency over the CNN architecture.

Mohamed Elleuch, Najiba Tagougui, Monji Kherallah

### Optimum Colour Space Selection for Ulcerated Regions Using Statistical Analysis and Classification of Ulcerated Frames from WCE Video Footage

Wireless Capsule Endoscopy (WCE) is a painless and non-invasive procedure that allows clinicians to visualize the entire Gastrointestinal Tract (GIT) and detect various abnormalities. During the inspection of the GIT, numerous images are acquired at a rate of approximately 2 frames per second (fps) and recorded as video footage (containing about 55,000 images). Inspecting the WCE video is very tedious and time consuming for doctors, which limits the application of WCE. Therefore, it is crucial to develop a computer-aided intelligent algorithm to process the huge number of WCE frames. This paper proposes an ulcerated frame detection method based on the RGB and CIE Lab colour spaces. In order to select the bands containing the most ulcer information and provide them to the classifier, a pixel-based statistical analysis of ulcerated images is proposed. The resulting band selection enhances the classification results and increases the sensitivity and specificity with regard to ulcerated frame identification.

Shipra Suman, Nicolas Walter, Fawnizu Azmadi Hussin, Aamir Saeed Malik, Shiaw Hooi Ho, Khean Lee Goh, Ida Hilmi

### Learning the Optimal Product Design Through History

The search for novel and high-performing product designs is a ubiquitous problem in science and engineering: aided by advances in optimization methods, the conventional approaches usually optimize a (multi) objective function using simulations followed by experiments. However, in some scenarios such as vehicle layout design, simulations and experiments are restrictive, inaccurate and expensive. In this paper, we propose an alternative approach to search for novel and high-performing product designs by optimizing not only a proposed novelty metric, but also a performance function learned from historical data. Computational experiments using more than twenty thousand vehicle models from the last thirty years show the usefulness of the approach and promising results for a wider set of design engineering problems.

Victor Parque, Tomoyuki Miyashita

### Learning Shape-Driven Segmentation Based on Neural Network and Sparse Reconstruction Toward Automated Cell Analysis of Cervical Smears

The development of an automatic and accurate segmentation approach for both nuclei and cytoplasm remains an open problem due to the complexities of cell structures resulting from inconsistent staining, poor contrast, and the presence of mucus, blood, inflammatory cells, and highly overlapping cells. This paper introduces a computer vision slide analysis technique with two stages: 3-class cellular component classification, and individual cytoplasm segmentation. A feed-forward neural network, along with discriminative shape and texture features, is applied to classify the cervical cell images into the cellular components. Then, a learned shape prior incorporated into a variational framework is applied for accurate localization and delineation of overlapping cells. The shape prior is dynamically modelled during the segmentation process as a weighted linear combination of shape templates from an over-complete shape repository. The proposed approach is evaluated and compared to state-of-the-art methods on a dataset of synthetically generated overlapping cervical cell images, with competitive results in both nuclear and cytoplasmic segmentation accuracy.

Afaf Tareef, Yang Song, Weidong Cai, Heng Huang, Yue Wang, Dagan Feng, Mei Chen

### Adaptive Differential Evolution Based Feature Selection and Parameter Optimization for Advised SVM Classifier

This paper proposes a pattern recognition model for classification. Adaptive differential evolution based feature selection is used for dimensionality reduction, and a new advised version of the support vector machine is used for evaluation of the selected features and for classification. The tuning of the control parameters of the differential evolution algorithm, the parameter value optimization for the support vector machine, and the selection of the most relevant features from the datasets are all done together. This helps in dealing with their interdependent effect on the overall performance of the learning model. The proposed model is tested on some recent machine learning medical datasets and compared with some well-developed methods in the literature. The proposed model provided quite convincing results on all the test datasets.

Ammara Masood, Adel Al-Jumaily

### TNorm: An Unsupervised Batch Effects Correction Method for Gene Expression Data Classification

In the field of biomedical research, gene expression analysis helps to identify the disease-related genes as genetic markers for diagnosis. As there is a huge number of publicly available gene expression datasets, the ongoing challenge is to utilize those available data effectively. Merging microarray datasets from different batches to improve the statistical power of a study is one of the active research topics. However, various works have addressed the issue of batch effects variation, which describes variation in gene expression levels induced by different experimental environments. Ignoring this variation may result in erroneous findings in a study. This work proposes a method for batch effect correction by mapping underlying topology of different batches. The mapping process for cross-batch normalization is examined using basic linear transformation. The comparative study of three cancers is conducted to compare the proposed method with a proven batch effects correction method. The results show that our method outperforms the existing method in most cases.

Praisan Padungweang, Worrawat Engchuan, Jonathan H. Chan

### Finger-Vein Quality Assessment by Representation Learning from Binary Images

Finger-vein quality assessment is an important issue in finger-vein verification systems, as spurious and missing features in poor quality images may increase the verification error. Despite recent advances, current solutions depend on domain knowledge and are typically driven by visual inspection. In this work, we propose a deep neural network (DNN) for representation learning from binary images to predict vein quality. First, driven by the primary target of biometric quality assessment, i.e. verification error minimization, we assume that low quality images are falsely rejected finger-vein images in a verification system. Based on this assumption, the low and high quality images are labeled automatically. Second, as image processing approaches such as enhancement and segmentation may produce false features and ignore actual ones, thus degrading verification accuracy, we train a DNN on binary images and derive deep features from its last hidden layer for quality assessment. Our experiments on two large public finger-vein databases show that the proposed scheme accurately identifies high and low quality images and significantly outperforms existing approaches in terms of the impact on equal error rate (EER) improvement.

Huafeng Qin, Mounîm A. El-Yacoubi

### Learning to Predict Where People Look with Tensor-Based Multi-view Learning

Collecting eye movement data is very expensive and laborious. Moreover, there are usually missing values. Suppose that we are collecting eye movement data on a set of images from different users (views). We may not be able to collect the eye movements of all users on all images, so one or more views may not be represented for an image. We assume that the relationships among the views can be learnt from the complete items. The task is then to reproduce the missing part of the incomplete items from the relationships derived from the complete items and the known part of these items. Using the properties of tensor algebra, we show that this problem can be formulated consistently as a regression type learning task. Furthermore, there is a maximum margin based optimisation framework in which this problem can be solved in a tractable way. This problem is similar to learning to predict where humans look. The proposed algorithm is shown to be more effective than well-known saliency detection techniques.

Kitsuchart Pasupa, Sandor Szedmak

### Classification of the Scripts in Medieval Documents from Balkan Region by Run-Length Texture Analysis

The paper presents a script classification method for medieval documents originating from the Balkan region. It consists of a multi-step procedure that includes text mapping according to typographical features, creation of equivalent image patterns, run-length pattern analysis to establish a feature vector, and the state-of-the-art classification method Genetic Algorithms Image Clustering for Document Analysis (GA-ICDA), which successfully discriminates between documents written in different scripts. The proposed method is evaluated on custom oriented document databases, which include handprinted or printed documents written in old Cyrillic, angular and round Glagolitic, ancient Latin and Greek scripts. The experiment demonstrates very good results.

Darko Brodić, Alessia Amelio, Zoran N. Milivojević

### Accelerating Artificial Bee Colony Algorithm for Global Optimization

As an efficient optimization technique, the artificial bee colony (ABC) algorithm has attracted a lot of attention for its good performance. However, because of its solution search equation, ABC is good at exploration but poor at exploitation. Thus, how to enhance the exploitation has become an active research topic. In this paper, we propose a trigonometric search equation in which a hypergeometric triangle is formed to generate offspring. Additionally, the orthogonal learning strategy is integrated into the scout bee phase for generating a new food source. Experiments are conducted on 23 well-known benchmark functions, and the results show that our approach has promising performance.

Xinyu Zhou, Mingwen Wang, Jianyi Wan
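For context on the "solution search equation" that the abstract above criticizes, here is a minimal sketch of the standard ABC candidate generation, v_ij = x_ij + φ (x_ij − x_kj) with φ drawn uniformly from [−1, 1]. It is not the trigonometric equation proposed in the paper, which the abstract does not spell out. The function names and the sphere test function are assumptions chosen for illustration.

```python
import random


def abc_candidate(population, i, rng):
    """Standard ABC search step: perturb one random dimension j of solution i
    relative to a randomly chosen different solution k."""
    k = rng.choice([idx for idx in range(len(population)) if idx != i])
    j = rng.randrange(len(population[i]))
    phi = rng.uniform(-1.0, 1.0)
    candidate = list(population[i])  # copy, so the original solution is untouched
    candidate[j] = population[i][j] + phi * (population[i][j] - population[k][j])
    return candidate


def sphere(x):
    """Hypothetical benchmark objective to minimize."""
    return sum(v * v for v in x)


rng = random.Random(42)
population = [[rng.uniform(-5, 5) for _ in range(3)] for _ in range(6)]

# Greedy selection: keep the candidate only if it improves the objective.
candidate = abc_candidate(population, 0, rng)
if sphere(candidate) < sphere(population[0]):
    population[0] = candidate
```

Because the perturbation direction depends on a random neighbour rather than on the best solutions found so far, the step explores widely but exploits weakly, which is the limitation the abstract refers to.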

### Classification of High and Low Intelligent Individuals Using Pupil and Eye Blink

A commonly used method to determine the intelligence of individuals is a group test. It checks accuracy and response time while they solve a series of problems. However, it takes a long time and is often inaccurate if the difficulty level of the problems is high or the number of problems is too small. Therefore, there is an urgent need to find an objective, readily available, fast and more reliable method to determine the intelligence level of individuals. In this paper, we propose an alternative method to distinguish between high and low intelligent individuals using pupillary response and eye blink pattern. Studies have shown that these measures indicate the cognitive state of an individual more accurately and objectively. Our experimental results show that the bio-signals of high and low intelligent individuals are significantly different and that the proposed method has good performance.

Giyoung Lee, Amitash Ojha, Minho Lee

### Learning Task Specific Distributed Paragraph Representations Using a 2-Tier Convolutional Neural Network

We introduce a type of 2-tier convolutional neural network model for learning distributed paragraph representations for a specific task (e.g. paragraph or short document level sentiment analysis and text topic categorization). We decompose the paragraph semantics into 3 cascaded constituents: word representation, sentence composition and document composition. Specifically, we learn distributed word representations by a continuous bag-of-words model from a large unstructured text corpus. Then, using these word representations as pre-trained vectors, distributed task-specific sentence representations are learned from a sentence-level corpus with task-specific labels by the first tier of our model. Using these sentence representations as distributed paragraph representation vectors, distributed paragraph representations are learned from a paragraph-level corpus by the second tier of our model. The model is evaluated on the DBpedia ontology classification dataset and the Amazon review dataset. Empirical results show the effectiveness of our proposed learning model for generating distributed paragraph representations.

Tao Chen, Ruifeng Xu, Yulan He, Xuan Wang

### A Comparison of Supervised Learning Techniques for Clustering

The significance of data mining has experienced dramatic growth over the past few years. This growth has been so drastic that many industries and academic disciplines apply data mining in some form. Data mining is a broad subject that encompasses several topics and problems; however, this paper focuses on the supervised learning classification problem and on discovering ways to optimize the classification process. Four classification techniques (naive Bayes, support vector machine, decision tree, and random forest) were studied and applied to data sets from the UCI Machine Learning Repository. A Classification Learning Toolbox (CLT) was developed using the R statistical programming language to analyze the data sets and report the relationships and prediction accuracy between the four classifiers.

William Ezekiel, Umashanger Thayasivam

### Radar Pattern Classification Based on Class Probability Output Networks

Modern aircraft and ships are equipped with radars emitting specific patterns of electromagnetic signals. Radar antennas detect these patterns, which are required to identify the types of emitters. A conventional way of identifying emitters is for human experts to categorize the radar patterns according to the sequences of frequencies, times of arrival, and pulse widths of the emitted signals. In this respect, this paper presents a method of classifying the radar patterns automatically using a network, referred to as the class probability output network (CPON), that calculates the p-values for testing the hypotheses about the types of emitters. Simulations of radar pattern classification demonstrate the effectiveness of the proposed approach.

Lee Suk Kim, Rhee Man Kil, Churl Hee Jo

### Hierarchical Data Classification Using Deep Neural Networks

Deep Neural Networks (DNNs) are becoming an increasingly interesting, valuable and efficient machine learning paradigm with implementations in natural language processing, image recognition and hand-written character recognition. Application of deep architectures is increasing in domains that contain feature hierarchies (FHs) i.e. features from higher levels of the hierarchy formed by the composition of lower level features. This is because of a perceived relationship between on the one hand the hierarchical organisation of DNNs, with large numbers of neurons at the bottom layers and increasingly smaller numbers at upper layers, and on the other hand FHs, with comparatively large numbers of low level features resulting in a small number of high level features. However, it is not clear what the relationship between DNNs hierarchies and FHs should be, or whether there even exists one. Nor is it clear whether modelling FHs with a hierarchically organised DNN conveys any benefits over using non-hierarchical neural networks. This study is aimed at exploring these questions and is organized into two parts. Firstly, a taxonomic FH with associated data is generated and a DNN is trained to classify the organisms into various species depending on characteristic features. The second part involves testing the ability of DNNs to identify whether two given organisms are related or not, depending on the sharing of appropriate features in their FHs. The experimental results show that the accuracy of the classification results is reduced with the increase in ‘depth’. Further, improved performance was achieved when every hidden layer has the same number of nodes compared with DNNs with increasingly fewer hidden nodes at higher levels. In other words, our experiments show that the relationship between DNNs and FHs is not simple and may require further extensive experimental research to identify the best DNN architectures when learning FHs.

Sreenivas Sremath Tirumala, A. Narayanan

### Model Inclusive Learning for Shape from Shading with Simultaneously Estimating Illumination Directions

The problem of recovering shape from shading is important in computer vision and robotics, and several studies have addressed it. We have already proposed a versatile method of solving the problem by model inclusive learning of neural networks. The method is versatile in the sense that it can solve the problem in various circumstances. Almost all of the methods of recovering shape from shading proposed so far assume that illumination conditions are known a priori. It is, however, very difficult to identify them exactly. This paper discusses a method to solve this problem. We propose a model inclusive learning of neural networks which makes it possible to recover shape while simultaneously estimating illumination directions. The performance of the proposed method is demonstrated through some experiments.

Yasuaki Kuroe, Hajimu Kawakami

### A Computational Model of Match Decision-Making Problem Using Spiking SHESN with Reward-Modulated Reinforcement Learning

The match decision-making problem is one of the hot topics in the field of computational neuroscience. In this paper, we propose a spiking SHESN model with reward-modulated reinforcement learning so as to conduct computational modeling and prediction of this open problem in a manner that has more neurophysiological characteristics. The neural coding of two sequentially presented stimuli is read out from a collection of clustered neural populations in the state reservoir through reward-modulated reinforcement learning. To evaluate the match decision-making performance of our computational model, we set up three kinds of test datasets with different spike timing trains and present a criterion of maximum correlation coefficient for assessing whether match/nonmatch decision-making is successful or not. Finally, extensive experimental results show that the proposed model is strongly robust to both the interval of spike timings and spike shift, which is consistent with the monkey behavior records exhibited in a match decision-making experiment [1].

Zhidong Deng, Guorun Yang

### Identify Website Personality by Using Unsupervised Learning Based on Quantitative Website Elements

This paper reports a pilot study on identifying and ranking the personality of a website automatically and intelligently, to help users find a more suitable website and to help owners improve the quality of their websites. The mapping between the selected items defined in WPS and the quantitative elements of a website was developed first. Then 240 valid websites were classified using the unsupervised clustering algorithm K-means. The classification was run multiple times, from K = 2 to K = 15. The average values of each attribute in each cluster were calculated, and the standard deviation across all the clusters for a given K value was calculated to find a suitable K value. A preliminary verification suggested that the attributes and the method used can properly identify the personality of a website. Software written in Java, integrating other existing software packages, was developed for the required experiments.

Shafquat Chishti, Xiaosong Li, Abdolhossein Sarrafzadeh

### Discriminative Dictionary Learning for Skeletal Action Recognition

Human action recognition is an important yet challenging task. With the introduction of RGB-D sensors, human body joints can be extracted with high accuracy, and skeleton-based action recognition has been investigated with some success. In this paper, we split an entire action trajectory into several segments and represent each segment using a covariance descriptor of the joints’ coordinates. We further employ projective dictionary pair learning (PDPL) and majority voting for multi-class action classification. Experimental results on two benchmark datasets demonstrate the effectiveness of our approach.

Yang Xiang, Jinhua Xu
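The segment-then-vote pipeline described in the abstract above can be sketched as follows. This shows only the segmentation, the covariance descriptor, and the majority vote; the per-segment classifier (PDPL in the paper) is not implemented. The function names, the equal-length segmentation rule, and the example labels are assumptions.

```python
from collections import Counter


def split_segments(trajectory, num_segments):
    """Split a list of frames into equal-length segments.
    Assumption: trailing frames that do not fill a segment are dropped."""
    size = len(trajectory) // num_segments
    return [trajectory[s * size:(s + 1) * size] for s in range(num_segments)]


def covariance_descriptor(frames):
    """Sample covariance matrix of the joint coordinates over one segment.
    Each frame is a flat list of coordinates; at least two frames are needed."""
    n, d = len(frames), len(frames[0])
    mean = [sum(f[j] for f in frames) / n for j in range(d)]
    return [
        [sum((f[a] - mean[a]) * (f[b] - mean[b]) for f in frames) / (n - 1)
         for b in range(d)]
        for a in range(d)
    ]


def majority_vote(labels):
    """Return the most frequent per-segment label as the action label."""
    return Counter(labels).most_common(1)[0][0]


# Hypothetical trajectory: 6 frames, each holding 2 coordinates of one joint.
trajectory = [[0.0, 0.0], [2.0, 4.0], [1.0, 1.0], [3.0, 2.0], [0.5, 0.5], [1.5, 2.5]]
descriptors = [covariance_descriptor(seg) for seg in split_segments(trajectory, 3)]

# Per-segment predictions would come from a trained classifier; these are made up.
action = majority_vote(["wave", "clap", "wave"])
```

A covariance descriptor has a fixed size regardless of segment length, which is what lets segments of different durations be compared by the same classifier.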

### Single Face Image Super-Resolution via Multi-dictionary Bayesian Non-parametric Learning

Face image super-resolution is a domain-specific problem. The human face has complex, fixed domain-specific priors, which should be explored in detail in a super-resolution algorithm. This paper proposes an effective single-image face super-resolution method based on pre-clustering the training data and Bayesian non-parametric learning. After pre-clustering, face patches from different clusters represent different areas of the face, and also offer specific priors on these areas. Bayesian non-parametric learning captures a consistent and accurate mapping between coupled spaces. Experimental results show that our method produces results competitive with other state-of-the-art methods, with much less computational time.

Jingjing Wu, Hua Zhang, Yanbing Xue, Mian Zhou, Guangping Xu, Zan Gao

### Sparse LS-SVM in the Sorted Empirical Feature Space for Pattern Classification

In this paper, we discuss an improved sparse least squares support vector training in the reduced empirical feature space, which is generated by linearly independent training data. In this method, we select the linearly independent training data as the basis vectors of the empirical feature space. Before we select these data, we sort the training data in ascending order from the standpoint of classification, using the values of the objective function in training least squares support vector machines. Thus, training data that are good from the standpoint of classification can be selected preferentially as the basis vectors of the empirical feature space. Next, we train a least squares support vector machine in the empirical feature space. The solution is then sparse, since the number of support vectors is equal to that of the basis vectors. Using two-class problems, we evaluate the effectiveness of the proposed method over the conventional methods.

Takuya Kitamura, Kohei Asano

### A Cost Sensitive Minimal Learning Machine for Pattern Classification

The present work proposes a variant of the Minimal Learning Machine (MLM) in a cost-sensitive framework for classification. MLM is a recently proposed supervised learning algorithm with a simple formulation and few hyperparameters. The proposed method is tested on two classification problems: imbalanced classification and classification with reject option. The results are comparable to those of other state-of-the-art classifiers.

João Paulo P. Gomes, Amauri H. Souza, Francesco Corona, Ajalmar R. Rocha Neto

### A Minimal Learning Machine for Datasets with Missing Values

The Minimal Learning Machine (MLM) is a recently proposed supervised learning algorithm with a simple implementation and few hyper-parameters. Learning an MLM model consists of building a linear mapping between input and output distance matrices. In this work, the standard MLM is modified to deal with missing data. To this end, the expected squared distance approach is used to compute the input space distance matrix. The proposed approach showed promising results when compared to standard strategies for dealing with missing data.

Diego P. Paiva Mesquita, João Paulo P. Gomes, Amauri H. Souza Jr.

### Calibrated k-labelsets for Ensemble Multi-label Classification

RAndom k-labELsets (RAkEL) is an effective ensemble multi-label classification (MLC) model in which each base classifier is trained on a small random subset of k labels. However, the model construction does not fully benefit from the diversity of the ensemble, and the label probability estimates obtained with RAkEL are usually badly calibrated because of the imbalanced label representation. In this paper, we propose three practical solutions to overcome these drawbacks. The first is to increase the diversity of the base classifiers in the ensemble, the second is to smooth the label powerset probability estimates during the ensemble aggregation process, and the third is to calibrate the label decision thresholds. Experimental results on various benchmark data sets indicate that the proposed approach significantly outperforms recent state-of-the-art MLC algorithms, including RAkEL and its variants.

Ouadie Gharroudi, Haytham Elghazel, Alex Aussem
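
For orientation, the baseline RAkEL idea that this abstract builds on can be sketched as below. This is a hedged illustration of plain RAkEL only; the diversity, smoothing and threshold-calibration steps proposed in the paper are not shown. All function names are hypothetical, the base classifiers are left out, and the fixed 0.5 voting threshold is an assumption of this sketch.

```python
import math
import random

def sample_labelsets(num_labels, k, m, seed=0):
    """Draw m distinct random subsets of k label indices."""
    if m > math.comb(num_labels, k):
        raise ValueError("not enough distinct k-labelsets")
    rng = random.Random(seed)
    chosen = set()
    while len(chosen) < m:
        chosen.add(tuple(sorted(rng.sample(range(num_labels), k))))
    return sorted(chosen)

def powerset_class(label_vector, labelset):
    """Project a binary label vector onto a labelset.
    The resulting bit tuple is the single-label class for that base classifier."""
    return tuple(label_vector[j] for j in labelset)

def aggregate_votes(num_labels, labelsets, predictions, threshold=0.5):
    """Average the per-label votes of all base classifiers and threshold them.
    predictions[i] is the bit tuple predicted for labelsets[i]."""
    votes = [0] * num_labels
    counts = [0] * num_labels
    for labelset, bits in zip(labelsets, predictions):
        for j, bit in zip(labelset, bits):
            votes[j] += bit
            counts[j] += 1
    return [1 if counts[j] and votes[j] / counts[j] > threshold else 0
            for j in range(num_labels)]
```

The fixed `threshold` is the part that the third proposed solution would replace with calibrated, per-label decision thresholds.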

### EMG Signal Based Knee Joint Angle Estimation of Flexion and Extension with Extreme Learning Machine (ELM) for Enhancement of Patient-Robotic Exoskeleton Interaction

To capture the intended action of the patient and provide assistance as needed, the controller of a robotic rehabilitation device needs the patient's intended posture, intended joint angle, intended torque and desired impedance. These parameters can be extracted from the sEMG signals associated with the knee joint. Thus, an exoskeleton device requires a multilayer control mechanism to achieve a smooth human-machine interaction force. This paper proposes a method to estimate the required knee joint angles and associated parameters. The paper investigates the feasibility of the Extreme Learning Machine (ELM) as an estimator over the operation range of extension $$(0^\circ - 90^\circ)$$, and its performance is compared with the Generalized Regression Neural Network (GRNN) and Neural Network (NN). ELM performed relatively better than GRNN and NN.

Tanvir Anwar, Khairul Anam, Adel Al Jumaily

### Continuous User Authentication Using Machine Learning on Touch Dynamics

In the context of constantly evolving portable technology, namely smartphones and tablets, and its increasing accessibility, there is a growing need for reliable means of authentication. The current study offers an alternative solution based on uninterrupted testing to verify user legitimacy. A continuously collected dataset of 41 users’ touch-screen inputs provides a good starting point for modeling each user’s behavior and later differentiating among users. We introduce a system capable of processing features based on raw data extracted from user-screen interactions and attempting to assign each gesture to its originator. Achieving an accuracy of over 83 %, we show that this type of authentication system is feasible and that it can be further integrated as a continuous way of detecting intruders within given mobile applications.

Ştefania Budulan, Elena Burceanu, Traian Rebedea, Costin Chiru

### Information Theoretical Analysis of Deep Learning Representations

Although deep learning shows high performance in pattern recognition and machine learning, the reasons for this are poorly understood. To tackle this problem, we calculated information-theoretic quantities of the representations in hidden layers and analyzed their relationship to performance. We found that the entropy and the mutual information decrease in different ways as the layers get deeper. This suggests that these information-theoretic quantities may serve as a criterion for determining the number of layers in deep learning.

Yasutaka Furusho, Takatomi Kubo, Kazushi Ikeda
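
The two quantities named in this abstract can be illustrated with simple plug-in estimators over discrete symbols. The abstract does not say how the authors estimated them, so this is only a sketch under the assumption that hidden-layer activations have already been discretized (for example by binning); the function names are hypothetical.

```python
import math
from collections import Counter

def entropy(symbols):
    """Plug-in Shannon entropy (in bits) of a sequence of discrete symbols."""
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in Counter(symbols).values())

def mutual_information(xs, ys):
    """Plug-in mutual information I(X;Y) = H(X) + H(Y) - H(X,Y) in bits."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))
```

Here `xs` could be the discretized representation of each sample at one layer and `ys` the class labels, so that the two values can be tracked layer by layer.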

### Hybedrized NSGA-II and MOEA/D with Harmony Search Algorithm to Solve Multi-objective Optimization Problems

A multi-objective optimization problem is an optimization problem involving more than one objective function to be optimized simultaneously. Several techniques have been proposed to solve multi-objective optimization problems. The two most famous algorithms are NSGA-II and MOEA/D. Harmony Search is a relatively new heuristic evolutionary algorithm that has been applied successfully to single-objective optimization problems. In this paper, we hybridized two well-known multi-objective evolutionary algorithms, NSGA-II and MOEA/D, with Harmony Search. We studied the efficiency of the proposed novel algorithms in solving multi-objective optimization problems. To evaluate our work, we used the well-known benchmark suites ZDT, DTLZ and CEC2009, and measured algorithm performance using Inverted Generational Distance (IGD). The results showed that, in terms of IGD, the proposed algorithms outperform the original ones (i.e., NSGA-II and MOEA/D) on problems with multiple local fronts.
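
The IGD metric mentioned above is commonly defined as the average distance from each point of a reference Pareto front to its nearest point in the obtained solution set. A minimal sketch of that common definition follows; it assumes Euclidean distance in objective space, and the function name is hypothetical.

```python
import math

def igd(reference_front, obtained_set):
    """Inverted Generational Distance: mean, over the reference points,
    of the Euclidean distance to the closest obtained solution.
    Lower is better; 0 means every reference point is matched exactly."""
    total = sum(min(math.dist(r, s) for s in obtained_set)
                for r in reference_front)
    return total / len(reference_front)
```

For example, with reference front `[(0, 1), (1, 0)]` and obtained set `[(0, 1), (1, 1)]`, the first reference point is matched exactly and the second is at distance 1 from its nearest solution, so the IGD is 0.5.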

### A Complex Network-Based Anytime Data Stream Clustering Algorithm

Data stream mining is an active area of research that poses challenging research problems. In recent years, a variety of data stream clustering algorithms have been proposed to perform unsupervised learning using a two-step framework. Additionally, dealing with non-stationary, unbounded data streams requires algorithms capable of fast, incremental clustering that respect time and memory limitations without jeopardizing clustering quality. In this paper, we present CNDenStream, a one-step data stream clustering algorithm capable of finding non-hyper-spherical clusters which, in contrast to other data stream clustering algorithms, is able to maintain updated clusters after the arrival of each instance by using a complex network construction and evolution model based on homophily. Empirical studies show that CNDenStream is able to surpass other algorithms in clustering quality and requires a feasible amount of resources when compared to other algorithms presented in the literature.

Jean Paul Barddal, Heitor Murilo Gomes, Fabrício Enembreck

### Robust Online Multi-object Tracking by Maximum a Posteriori Estimation with Sequential Trajectory Prior

This paper addresses the problem of online multi-object tracking by using the Maximum a Posteriori (MAP) framework. Given the observations up to the current frame, we estimate the optimal object trajectories by solving two MAP estimation problems: object detection and trajectory-detection association. By introducing the sequential trajectory prior, i.e., the prior information from previous frames about “good” trajectories, into MAP estimation, the output of the pre-trained object detector is refined and the correctness of the association between trajectories and detections is enhanced. In addition, the sequential trajectory prior allows the two MAP stages to interact with each other in a sequential manner, which facilitates online multi-object tracking. Our experiments on publicly available challenging datasets demonstrate that the proposed algorithm provides superior performance in various complex scenes.

Min Yang, Mingtao Pei, Jiajun Shen, Yunde Jia

### Enhance Differential Evolution Algorithm Based on Novel Mutation Strategy and Parameter Control Method

The differential evolution (DE) algorithm is a very effective and efficient approach for solving global numerical optimization problems. However, DE still suffers from some limitations. Moreover, the performance of DE is sensitive to its mutation strategy and associated parameters. In this paper, an enhanced differential evolution algorithm called EDE is proposed, which includes a new mutation strategy and a new parameter control method. In a comparison with other DE algorithms, including four classical DE variants and two state-of-the-art DE variants, on ten numerical benchmarks, the experimental results indicate that the performance of EDE is better than that of the other algorithms.

Laizhong Cui, Genghui Li, Li Li, Qiuzhen Lin, Jianyong Chen, Nan Lu
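
As background for the mutation strategy and control parameters mentioned in this abstract, one generation of the classical DE/rand/1/bin scheme can be sketched as follows. This is the textbook baseline, not the EDE mutation strategy or parameter control proposed in the paper; the function name and the default values of the scale factor `f` and crossover rate `cr` are assumptions of this sketch.

```python
import random

def de_rand_1_bin(population, objective, f=0.5, cr=0.9, rng=None):
    """One generation of classical DE/rand/1/bin (minimization).
    population: list of at least 4 real-valued vectors of equal length."""
    if rng is None:
        rng = random.Random(0)
    dim = len(population[0])
    next_population = []
    for i, target in enumerate(population):
        # Pick three distinct individuals, all different from the target.
        others = [j for j in range(len(population)) if j != i]
        r1, r2, r3 = rng.sample(others, 3)
        a, b, c = population[r1], population[r2], population[r3]
        # Mutation: base vector plus scaled difference vector.
        mutant = [a[d] + f * (b[d] - c[d]) for d in range(dim)]
        # Binomial crossover; j_rand guarantees at least one mutant gene.
        j_rand = rng.randrange(dim)
        trial = [mutant[d] if (rng.random() < cr or d == j_rand) else target[d]
                 for d in range(dim)]
        # Greedy selection: keep the trial only if it is no worse.
        keep_trial = objective(trial) <= objective(target)
        next_population.append(trial if keep_trial else target)
    return next_population

def sphere(x):
    """Simple benchmark function with its minimum at the origin."""
    return sum(v * v for v in x)
```

The sensitivity noted in the abstract shows up here as the choice of `f`, `cr` and the mutation formula, which are fixed in this baseline.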

### Hybrid Model for the Training of Interval Type-2 Fuzzy Logic System

In this paper, a hybrid training model for the interval type-2 fuzzy logic system is proposed. The hybrid training model uses an extreme learning machine to tune the consequent part parameters and a genetic algorithm to optimize the antecedent part parameters. The proposed hybrid learning model of the interval type-2 fuzzy logic system is tested on the prediction of Mackey-Glass time series data sets with different levels of noise. The results are compared with existing models in the literature: extreme learning machine and Kalman filter based learning of the consequent part parameters with randomly generated antecedent part parameters. It is observed that the interval type-2 fuzzy logic system provides improved performance with the proposed hybrid learning model.

Saima Hassan, Abbas Khosravi, Jafreezal Jaafar, Mojtaba Ahmadieh Khanesar

### A Numerical Optimization Algorithm Based on Bacterial Reproduction

Drawing on the rapid speed and large quantity that characterize bacterial reproduction, and on natural selection and survival of the fittest in the process of evolution, a framework for the bacterial reproduction optimization (BRO) algorithm is proposed from a macro perspective of bacterial reproduction. The process of bacterial reproduction is divided into four periods: the lag period, logarithmic period, stable period and decline period. Likewise, the optimization algorithm proposed in this paper is segmented into four periods: the initial period, iteration period, stable period and decline period. Based on this framework, strategies are introduced to make BRO more efficient. Experimental results and theoretical analysis show that BRO has faster convergence speed and higher accuracy for high-dimensional problems.

Peng Shao, Zhijian Wu, Xuanyu Zhou, Xinyu Zhou, Zelin Wang, Dang Cong Tran

### Visual-Textual Late Semantic Fusion Using Deep Neural Network for Document Categorization

Multi-modality fusion has recently drawn much attention due to the rapid growth of multimedia data. A document that consists of multiple modalities, i.e., image, text and video, can be better understood by machines if information from the different modalities is semantically combined. In this paper, we propose to fuse image and text information with a deep neural network (DNN) based approach. By jointly fusing visual and textual features and taking the correlation between image and text into account, fusion features can be learned for representing a document. We investigated the fusion features on document categorization and found that DNN-based fusion outperforms mainstream algorithms, including K-Nearest Neighbor (KNN), Support Vector Machine (SVM), Naive Bayes (NB) and 3-layer Neural Network (3L-NN), in both early and late fusion strategies.

Cheng Wang, Haojin Yang, Christoph Meinel

### Prototype Selection on Large and Streaming Data

Since streaming data keeps arriving continuously as an ordered sequence, massive amounts of data are created. A big challenge in handling data streams is the limitation of time and space. Prototype selection on streaming data requires the prototypes to be updated incrementally as new data comes in. We propose an incremental algorithm for prototype selection. This algorithm can also be used to handle very large datasets. Results are presented on a number of large datasets, and our method is compared to an existing algorithm for streaming data. Our algorithm saves time, and the selected prototypes give good classification accuracy.

Lakhpat Meena, V. Susheela Devi

### GOS-IL: A Generalized Over-Sampling Based Online Imbalanced Learning Framework

Online imbalanced learning has two important characteristics: samples of one class (the minority class) are under-represented in the data set, and samples arrive at the learner online and incrementally. Such a data set may pose several problems to the learner. First, it is impossible to determine the minority class beforehand, as the learner has no complete view of the whole data. Second, the status of imbalance may change over time. To handle such a data set efficiently, we present a dynamic and adaptive algorithm called the Generalized Over-Sampling based Online Imbalanced Learning (GOS-IL) framework. The proposed algorithm works by updating a base learner incrementally. This update is triggered when the number of errors made by the learner crosses a threshold value. This deferred update helps the learner to avoid the immediate harm of noisy samples and to achieve better generalization ability in the long run. In addition, correctly classified samples are not used by the algorithm to update the learner, in order to avoid over-fitting. Simulation results on some artificial and real-world datasets show the effectiveness of the proposed method on two performance metrics: recall and g-mean.

Sukarna Barua, Md. Monirul Islam, Kazuyuki Murase
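
The deferred, error-triggered update described in this abstract can be sketched as below. This illustrates only the triggering logic: the over-sampling step of GOS-IL is not shown, the perceptron base learner is a stand-in chosen for brevity, and all class and attribute names are hypothetical. Treating "crosses a threshold" as "reaches the threshold" is also an assumption.

```python
class Perceptron:
    """Stand-in base learner with labels in {-1, +1} (an assumption)."""
    def __init__(self, dim):
        self.w = [0.0] * dim
        self.b = 0.0

    def predict(self, x):
        score = sum(wi * xi for wi, xi in zip(self.w, x)) + self.b
        return 1 if score > 0 else -1

    def update(self, samples):
        # Standard perceptron correction applied to each buffered sample.
        for x, y in samples:
            self.w = [wi + y * xi for wi, xi in zip(self.w, x)]
            self.b += y

class DeferredUpdater:
    """Buffers misclassified samples and updates the learner only when
    the number of buffered errors reaches error_threshold."""
    def __init__(self, learner, error_threshold):
        self.learner = learner
        self.error_threshold = error_threshold
        self.buffer = []
        self.num_updates = 0

    def observe(self, x, y):
        if self.learner.predict(x) == y:
            return  # correctly classified samples are not used for updates
        self.buffer.append((x, y))
        if len(self.buffer) >= self.error_threshold:
            self.learner.update(self.buffer)
            self.buffer = []
            self.num_updates += 1
```

Because a single misclassified sample never triggers an update on its own (for thresholds above 1), an isolated noisy sample cannot immediately change the learner.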

### A New Version of the Dendritic Cell Immune Algorithm Based on the K-Nearest Neighbors

In this paper, we propose a new classification approach based on the artificial immune Dendritic Cell Algorithm (DCA). Many studies have demonstrated promising DCA classification results in many real-world applications. Despite that, it has been shown that the DCA has a main limitation when performing its classification task. To classify a new data item, expert knowledge is required to calculate a set of signal values. Indeed, to achieve this, the expert has to provide some specific formula capable of generating these values. Yet, the mandatory presence of the expert has received criticism from researchers. Therefore, in order to overcome this restriction, we propose a new version of the DCA combined with K-Nearest Neighbors (KNN). KNN is used to provide a new way to calculate the signal values independently of expert knowledge. Experimental results demonstrate the significant performance of our proposed solution in terms of classification accuracy, in comparison to several state-of-the-art classifiers, while avoiding the mandatory presence of the expert.

Kaouther Ben Ali, Zeineb Chelly, Zied Elouedi

### Impact of Base Partitions on Multi-objective and Traditional Ensemble Clustering Algorithms

This paper presents a comparative study of cluster ensemble and multi-objective cluster ensemble algorithms. Our aim is to evaluate the extent to which such methods are able to identify the underlying structure hidden in a data set, given different levels of information they receive as input in the set of base partitions (BP). To do so, given a gold/reference partition, we produced nine sets of BP containing properties of interest for our analysis, such as a large number of subdivisions of true clusters. We aim to answer questions such as: Are the methods able to generate new and more robust partitions than those in the set of BP? Are the techniques influenced by poor-quality partitions present in the set of BP?

Jane Piantoni, Katti Faceli, Tiemi C. Sakata, Julio C. Pereira, Marcílio C. P. de Souto

### Multi-Manifold Matrix Tri-Factorization for Text Data Clustering

We propose a novel algorithm that we call Multi-Manifold Co-clustering (MMC). This algorithm considers the geometric structures of both the sample manifold and the feature manifold simultaneously. Specifically, multiple Laplacian graph regularization terms are constructed separately to take local invariance into account; the optimal intrinsic manifold is constructed by linearly combining multiple manifolds. We employ multi-manifold learning to approximate the intrinsic manifold using a subset of candidate manifolds, which better reflects the local geometrical structure through the graph Laplacian. The candidate manifolds are obtained using various representative manifold-based dimensionality reduction methods. These selected methods are based on different rationales and use different metrics for data distances. Experimental results on several real-world text data sets demonstrate the effectiveness of MMC.

Kais Allab, Lazhar Labiod, Mohamed Nadif

### Clustering of Binary Data Sets Using Artificial Ants Algorithm

As an important technique for data mining, clustering often consists in forming a set of groups according to a similarity measure such as the Hamming distance. In this paper, we present a new bio-inspired model based on artificial ants over a dynamical graph of clusters, using colonial odors and a pheromone-based reinforcement process. The results are analyzed in terms of the impact of parameter values on the purity index, which is a measure of clustering quality. The dynamic evolution of cluster graph topologies is presented on two databases from the Machine Learning Repository.

Nesrine Masmoudi, Hanane Azzag, Mustapha Lebbah, Cyrille Bertelle, Maher Ben Jemaa
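
The two measures named in this abstract can be illustrated in a few lines. This sketch uses the commonly used definition of cluster purity (the fraction of items that belong to the majority class of their cluster); the abstract does not spell out the exact variant used in the paper, and the function names are hypothetical.

```python
from collections import Counter

def hamming(a, b):
    """Hamming distance: number of positions at which two
    equal-length binary vectors differ."""
    return sum(x != y for x, y in zip(a, b))

def purity(cluster_ids, class_labels):
    """Fraction of items that carry the majority class of their cluster."""
    by_cluster = {}
    for c, y in zip(cluster_ids, class_labels):
        by_cluster.setdefault(c, []).append(y)
    majority_total = sum(Counter(ys).most_common(1)[0][1]
                         for ys in by_cluster.values())
    return majority_total / len(class_labels)
```

For instance, clusters `[0, 0, 0, 1, 1]` with true classes `["a", "a", "b", "b", "b"]` contain majority counts of 2 and 2, giving a purity of 0.8.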

### Inverse Reinforcement Learning Based on Behaviors of a Learning Agent

Reinforcement learning agents can acquire the optimal policy to achieve their objectives through trial and error. An appropriate design of the reward function is essential, because there is a variety of reward functions for the same objective, and different reward functions give rise to different learning processes. There is no systematic way to determine a good reward function for a given environment and objective. One possible way is to find a reward function that imitates the learning strategy of a reference agent which is intelligent enough to adapt efficiently even to variable environments. In this study, we extended the apprenticeship learning framework in order to imitate a learning reference agent, whose policy may change in the process of optimization. For this imitation, we propose a new inverse reinforcement learning method based on that agent’s history of states and actions. When mimicking a reference agent that was trained with a simple 2-state Markov decision process, the proposed method showed better performance than apprenticeship learning.

Shunsuke Sakurai, Shigeyuki Oba, Shin Ishii

### Backmatter
