About this book

The four-volume set LNCS 9947, LNCS 9948, LNCS 9949, and LNCS 9950 constitutes the proceedings of the 23rd International Conference on Neural Information Processing, ICONIP 2016, held in Kyoto, Japan, in October 2016. The 296 full papers presented were carefully reviewed and selected from 431 submissions. The 4 volumes are organized in topical sections on deep and reinforcement learning; big data analysis; neural data analysis; robotics and control; bio-inspired/energy efficient information processing; whole brain architecture; neurodynamics; bioinformatics; biomedical engineering; data mining and cybersecurity workshop; machine learning; neuromorphic hardware; sensory perception; pattern recognition; social networks; brain-machine interface; computer vision; time series analysis; data-driven approach for extracting latent features; topological and graph based clustering methods; computational intelligence; data mining; deep neural networks; computational and cognitive neurosciences; theory and algorithms.

Table of contents

Frontmatter

Machine Learning

Frontmatter

Non-parametric e-mixture of Density Functions

Mixture modeling is one of the simplest ways to represent complicated probability density functions, and to integrate information from different sources. There are two typical mixtures in the context of information geometry, the m- and e-mixtures. This paper proposes a novel framework of non-parametric e-mixture modeling by using a simple estimation algorithm based on geometrical insights into the characteristics of the e-mixture. An experimental result supports the proposed framework.

Hideitsu Hino, Ken Takano, Shotaro Akaho, Noboru Murata

An Entropy Estimator Based on Polynomial Regression with Poisson Error Structure

A method for estimating Shannon differential entropy is proposed based on the second-order expansion of the probability mass around the inspection point with respect to the distance from the point. Polynomial regression with Poisson error structure is utilized to estimate the values of the density function. The density estimates at every given data point are averaged to obtain entropy estimators. The proposed estimator is shown to perform well through numerical experiments for various probability distributions.

Hideitsu Hino, Shotaro Akaho, Noboru Murata

A Problem in Model Selection of LASSO and Introduction of Scaling

In this article, we consider assigning a single scaling parameter to LASSO estimators in order to investigate and improve the problem of excessive shrinkage at a sparse representation. This problem is important because it directly affects the quality of model selection in LASSO. We derived a prediction risk for LASSO with scaling and obtained an optimal scaling parameter value that minimizes the risk. We then showed that the risk is improved by assigning the optimal scaling value. In a numerical example, we found that an estimate of the optimal scaling value is larger than one, especially at a sparse representation; i.e., excessive shrinkage is relaxed by expansion via scaling. Additionally, we observed that the risk for LASSO is high at a sparse representation and is minimized at a relatively large model, while this is improved by the introduction of an estimate of the optimal scaling value. We constructed a fully empirical risk estimate that approximates the actual risk well. We then observed that, by applying the risk estimate as a model selection criterion, LASSO with scaling tends to obtain a model with low risk and high sparsity compared to LASSO without scaling.

Katsuyuki Hagiwara

A Theoretical Analysis of Semi-supervised Learning

We analyze the dynamical behaviors of semi-supervised learning in the framework of on-line learning by using the statistical-mechanical method. A student uses several correlated input vectors in each update. The student is given a desired output for only one input vector out of these correlated input vectors. In this model, we derive simultaneous differential equations with deterministic forms that describe the dynamical behaviors of order parameters using the self-averaging property in the thermodynamic limit. We treat the Hebbian and Perceptron learning rules. As a result, it is shown that using unlabeled data is effective in the early stages for both of the two learning rules. In addition, we show that the two learning rules have qualitatively different dynamical behaviors. Furthermore, we propose a new algorithm that improves the generalization performance by switching the number of input vectors used in an update as the time step proceeds.

Takashi Fujii, Hidetaka Ito, Seiji Miyoshi

Evolutionary Multi-task Learning for Modular Training of Feedforward Neural Networks

Multi-task learning enables learning algorithms to harness shared knowledge from several tasks in order to provide better performance. In the past, neuro-evolution has shown promising performance for a number of real-world applications. Recently, evolutionary multi-tasking has been proposed for optimisation problems. In this paper, we present a multi-task learning method for neural networks that evolves modular network topologies. In the proposed method, each task is defined by a specific network topology with a different number of hidden neurons. The method produces a modular network that could be effective even if some of the neurons and connections are removed from selected trained modules in the network. We demonstrate the effectiveness of the method using feedforward networks to learn selected n-bit parity problems of varying levels of difficulty. The results show better training and generalisation performance when the modules for representing additional knowledge are added by increasing hidden neurons during training.

Rohitash Chandra, Abhishek Gupta, Yew-Soon Ong, Chi-Keong Goh

On the Noise Resilience of Ranking Measures

Performance measures play a pivotal role in the evaluation and selection of machine learning models for a wide range of applications. Using both synthetic and real-world data sets, we investigated the resilience to noise of various ranking measures. Our experiments revealed that the area under the ROC curve (AUC) and a related measure, the truncated average Kolmogorov-Smirnov statistic (taKS), can reliably discriminate between models with truly different performance under various types and levels of noise. With increasing class skew, however, the H-measure and estimators of the area under the precision-recall curve become preferable measures. Because of its simple graphical interpretation and robustness, the lower trapezoid estimator of the area under the precision-recall curve is recommended for highly imbalanced data sets.

Daniel Berrar
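
As a small illustration of the ranking measures discussed in the abstract above, the sketch below computes the area under the ROC curve (AUC) in its pairwise (Mann-Whitney) form: the fraction of positive/negative pairs in which the positive example receives the higher score, with ties counted as one half. The function name `auc` and the toy scores and labels are illustrative assumptions, not taken from the paper, and the sketch does not reproduce the paper's noise experiments or the taKS, H-measure, or precision-recall estimators.

```python
def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels are 1 (positive) or 0 (negative); ties count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: six model scores with their true class labels.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4]
labels = [1, 1, 0, 1, 0, 0]
print(auc(scores, labels))
```

The pairwise form is quadratic in the number of examples; it is chosen here only because it makes the definition explicit.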

BPSpike II: A New Backpropagation Learning Algorithm for Spiking Neural Networks

Using gradient descent, we propose a new backpropagation learning algorithm for spiking neural networks with multiple layers, multiple synapses between neurons, and multi-spiking neurons. It adjusts synaptic weights, delays, and time constants, as well as neurons’ thresholds in output and hidden layers. It guarantees convergence to a minimum error point and, unlike SpikeProp and its extensions, does not need a one-to-one correspondence between actual and desired spikes in advance. It is therefore stably and widely applicable to practical problems.

Satoshi Matsuda

Group Dropout Inspired by Ensemble Learning

Deep learning is a state-of-the-art learning method that is used in fields such as visual object recognition and speech recognition. This learning uses a large number of layers and a huge number of units and connections, so overfitting occurs. Dropout learning is a kind of regularizer that neglects some inputs and hidden units in the learning process with a probability p; then, the neglected inputs and hidden units are combined with the learned network to express the final output. We compared dropout learning and ensemble learning from three viewpoints and found that dropout learning can be regarded as ensemble learning that divides the student network into two groups of hidden units. From this insight, we explored novel dropout learning that divides the student network into more than two groups of hidden units to enhance the benefit of ensemble learning.

Kazuyuki Hara, Daisuke Saitoh, Takumi Kondou, Satoshi Suzuki, Hayaru Shouno

Audio Generation from Scene Considering Its Emotion Aspect

Scenes can convey emotion like music. If that is so, it might be possible, given an image, to generate music that elicits a similar emotional reaction from users. The challenge lies in how to do that. In this paper, we use the Hue, Saturation and Lightness features from a number of image samples extracted from video excerpts, and the tempo, loudness and rhythm from a number of audio samples extracted from the same video excerpts, to train a group of neural networks, including a Recurrent Neural Network and a Neuro-Fuzzy Network, and obtain the desired audio signal to evoke a similar emotional response in a listener. This work could prove to be an important contribution to the field of Human-Computer Interaction because it can improve the interaction between computers and humans. Experimental results show that this model effectively produces audio that matches the video, evoking a similar emotion in the viewer.

Gwenaelle Cunha Sergio, Minho Lee

Semi Supervised Autoencoder

Autoencoders are self-supervised learning tools, but are unsupervised in the sense that class information is not required for training; yet almost invariably they are used for supervised classification tasks. We propose to learn the autoencoder for a semi-supervised paradigm, i.e. with both labeled and unlabeled samples available. Given labeled and unlabeled data, our proposed autoencoder automatically adjusts: for unlabeled data it acts as a standard autoencoder (unsupervised), and for labeled data it additionally learns a linear classifier. We use our proposed semi-supervised autoencoder to (greedily) construct a stacked architecture. We demonstrate the efficacy of our design in terms of both accuracy and run time requirements for the case of image classification. Our model is able to provide high classification accuracy with even simple classification schemes as compared to existing models for deep architectures.

Anupriya Gogna, Angshul Majumdar

Sampling-Based Gradient Regularization for Capturing Long-Term Dependencies in Recurrent Neural Networks

The vanishing (and exploding) gradient effect is a common problem for recurrent neural networks that use the backpropagation method to calculate derivatives. We construct an analytical framework to estimate the contribution of each training example to the norm of the long-term components of the target function's gradient and use it to hold the norm of the gradients in a suitable range. Using this subroutine we can construct mini-batches for stochastic gradient descent (SGD) training that lead to high performance and accuracy of the trained network even for very complex tasks. To check our framework experimentally we use some special synthetic benchmarks for testing the ability of RNNs to capture long-term dependencies. Our network can detect links between events in the (temporal) sequence at a range of 100 and longer.

Artem Chernodub, Dimitri Nowicki

Face Hallucination Using Correlative Residue Compensation in a Modified Feature Space

Local linear embedding (LLE) is a promising manifold learning method in the field of machine learning. A number of face hallucination (FH) methods have been proposed due to its neighborhood-preserving nature. However, the projection of a low resolution (LR) image to high resolution (HR) is a “one-to-multiple” mapping; therefore the manifold assumption does not hold well. To solve this inconsistency problem we propose a new approach. First, an intermediate HR patch is constructed based on the nonlinear relationship between LR and HR patches, which is established using the partial least squares (PLS) method. Second, we incorporate the correlative residue compensation into the intermediate HR results by using only the HR residue manifold. We use the same combination coefficient as for the intermediate hallucination of the first phase. Extensive experiments show that the proposed method outperforms some state-of-the-art methods in both reconstruction error and visual quality.

Javaria Ikram, Yao Lu, Jianwu Li, Nie Hui

Modal Regression via Direct Log-Density Derivative Estimation

Regression is aimed at estimating the conditional expectation of output given input, which is suitable for analyzing functional relation between input and output. On the other hand, when the conditional density with multiple modes is analyzed, modal regression comes in handy. Partial mean shift (PMS) is a promising method of modal regression, which updates data points toward conditional modes by gradient ascent. In the implementation, PMS first obtains an estimate of the joint density by kernel density estimation and then computes its derivative for gradient ascent. However, this two-step approach can be unreliable because a good density estimator does not necessarily mean a good density derivative estimator. In this paper, we propose a novel method for modal regression based on direct estimation of the log-density derivative without density estimation. Experiments show the superiority of our direct method over PMS.

Hiroaki Sasaki, Yurina Ono, Masashi Sugiyama

Simplicial Nonnegative Matrix Tri-factorization: Fast Guaranteed Parallel Algorithm

Nonnegative matrix factorization (NMF) is a powerful linear dimension reduction method with various important applications. However, existing models remain limited in terms of interpretability, guaranteed convergence, computational complexity, and sparse representation. In this paper, we propose to add simplicial constraints to the classical NMF model and to reformulate it into a new model called simplicial nonnegative matrix tri-factorization, which offers more concise interpretability via the values of the factor matrices. Then, we propose an effective algorithm based on a combination of three-block alternating direction and Frank-Wolfe’s scheme to attain linear convergence, low iteration complexity, and easily controlled sparsity. The experiments indicate that the proposed model and algorithm outperform the NMF model and its state-of-the-art algorithms.

Duy-Khuong Nguyen, Quoc Tran-Dinh, Tu-Bao Ho

Active Consensus-Based Semi-supervised Growing Neural Gas

In this paper, we propose a new active semi-supervised growing neural gas (GNG) model, named Active Consensus-Based Semi-Supervised GNG, or ACSSGNG. This model extends the former CSSGNG model by introducing an active mechanism for querying more representative samples in comparison to a random, or passive, selection. Moreover, as a semi-supervised model, the ACSSGNG takes both labelled and unlabelled samples in the training procedure. In comparison to other adaptations of the GNG to semi-supervised classification, the ACSSGNG does not assign a single scalar label value to each neuron. Instead, a vector containing the representativeness level of each class is associated with each neuron. Here, this information is used to select which sample the specialist might label instead of using a random selection of samples. Computer experiments show that our model can deliver, on average, better classification results than state-of-the-art semi-supervised algorithms, including the CSSGNG.

Vinícius R. Máximo, Mariá C. V. Nascimento, Fabricio A. Breve, Marcos G. Quiles

Kernel L1-Minimization: Application to Kernel Sparse Representation Based Classification

Sparse representation based classification (SRC) was initially proposed for face recognition problems. However, SRC was found to excel in a variety of classification tasks. There have been many extensions to SRC, of which group SRC and kernel SRC are the most prominent. Prior methods in kernel SRC used greedy methods like Orthogonal Matching Pursuit (OMP). It is well known that for solving a sparse recovery problem, both in theory and in practice, l1-minimization is a better approach compared to OMP. The standard l1-minimization is a solved problem. For the first time in this work, we propose a technique for kernel l1-minimization. Through simulation results we show that our proposed method outperforms prior kernelised greedy sparse recovery techniques.

Anupriya Gogna, Angshul Majumdar

Nuclear Norm Regularized Randomized Neural Network

Extreme Learning Machine (ELM) or Randomized Neural Network (RNN) is a feedforward neural network where the network weights between the input and the hidden layer are not learned; they are assigned from some probability distribution. The weights between the hidden layer and the output targets are learnt. Neural networks are believed to mimic the human brain; it is well known that the brain is a redundant network. In this work we propose to explicitly model the redundancy of the human brain. We model redundancy as linear dependency of link weights; this leads to a low-rank model of the output (hidden layer to target) network. This is solved by imposing a nuclear norm penalty. The proposed technique is compared with the basic ELM and the Sparse ELM. Results on benchmark datasets show that our method outperforms both of them.

Anupriya Gogna, Angshul Majumdar

Gram-Schmidt Orthonormalization to the Adaptive ICA Function for Fixing the Permutation Ambiguity

Recently, we have proposed a new objective function of ICA called the adaptive ICA function (AIF). AIF is a summation of weighted 4th-order statistics, where the weights are determined by adaptively estimated kurtoses. In this paper, the Gram-Schmidt orthonormalization is applied to the optimization of AIF. The proposed method is theoretically guaranteed to extract the independent components in the unique order of the degree of non-Gaussianity. Consequently, it enables us to fix the permutation ambiguity. Experimental results on blind image separation problems show the usefulness of the proposed method.

Yoshitatsu Matsuda, Kazunori Yamaguchi
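
The abstract above relies on Gram-Schmidt orthonormalization. As a reminder of what that procedure does in general, here is a minimal sketch of the (modified) Gram-Schmidt process on plain Python lists. It is not the authors' AIF optimization: the function name `gram_schmidt`, the tolerance `tol`, and the example vectors are illustrative assumptions. The point it illustrates is that the procedure is order-dependent: each vector is orthogonalized only against those processed before it.

```python
import math


def gram_schmidt(vectors, tol=1e-12):
    """Return an orthonormal basis for the span of `vectors`.

    Vectors are processed in the given order; each one has its components
    along the previously accepted basis vectors removed, then is normalized.
    Vectors that become (numerically) zero are skipped.
    """
    basis = []
    for v in vectors:
        w = list(v)
        for q in basis:
            coeff = sum(wi * qi for wi, qi in zip(w, q))
            w = [wi - coeff * qi for wi, qi in zip(w, q)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        if norm > tol:
            basis.append([wi / norm for wi in w])
    return basis


# Hypothetical example: three linearly independent vectors in R^3.
basis = gram_schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
for q in basis:
    print([round(x, 4) for x in q])
```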

Data Cleaning Using Complementary Fuzzy Support Vector Machine Technique

In this paper, a Complementary Fuzzy Support Vector Machine (CMTFSVM) technique is proposed to handle outliers and noise in classification problems. Fuzzy membership values are applied to each input point to reflect the degree of importance of the instances. Datasets from the UCI and KEEL repositories are used for the comparison. In order to confirm the proposed methodology, 40% random noise is added to the datasets. The experimental results of CMTFSVM are analysed and compared with the Complementary Neural Network (CMTNN). The outcome indicates that the combined CMTFSVM outperforms the CMTNN approach.

Ratchakoon Pruengkarn, Kok Wai Wong, Chun Che Fung

Fault-Tolerant Incremental Learning for Extreme Learning Machines

The extreme learning machine (ELM) framework provides an efficient way of constructing single-hidden-layer feedforward networks (SLFNs). Its main idea is that the input bias terms and the input weights of the hidden nodes are selected at random. During training, we only need to adjust the output weights of the hidden nodes. The existing incremental learning algorithms for ELMs, called incremental-ELM (I-ELM) and convex I-ELM (CI-ELM), cannot handle the fault situation. This paper proposes two fault-tolerant incremental ELM algorithms, namely fault-tolerant I-ELM (FTI-ELM) and fault-tolerant CI-ELM (FTCI-ELM). The FTI-ELM only tunes the output weight of the newly added node to minimize the training set error of faulty networks. It keeps all the previously learned weights unchanged. Its fault-tolerant performance is better than that of I-ELM and CI-ELM. To further improve the performance, the FTCI-ELM is proposed. It tunes the output weight of the newly added node and also uses a simple scheme to modify the existing output weights, so as to maximize the reduction in the training set error of faulty networks.

Ho-Chun Leung, Chi-Sing Leung, Eric W. M. Wong
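
To make the incremental idea in the abstract above concrete, the following is a minimal sketch of the plain (non-fault-tolerant) incremental ELM scheme for one-dimensional inputs: each new hidden node gets random input weight and bias, only its output weight is fitted by least squares against the current residual, and earlier weights are left unchanged. This is a sketch under stated assumptions, not the paper's FTI-ELM or FTCI-ELM; the names `incremental_elm` and `predict`, the sigmoid activation, the uniform sampling range, and the toy target are all illustrative choices.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def incremental_elm(xs, ys, max_nodes=30, seed=0):
    """Grow a single-hidden-layer network one random node at a time.

    For each new node, the output weight beta minimizes the squared norm of
    (residual - beta * h), i.e. beta = <residual, h> / <h, h>.
    Returns the node list and the sum of squared errors after each step.
    """
    rng = random.Random(seed)
    residual = list(ys)
    nodes = []  # each entry: (input weight, bias, output weight)
    history = [sum(e * e for e in residual)]
    for _ in range(max_nodes):
        a = rng.uniform(-1.0, 1.0)  # random input weight, never retrained
        b = rng.uniform(-1.0, 1.0)  # random bias, never retrained
        h = [sigmoid(a * x + b) for x in xs]
        beta = sum(e * hi for e, hi in zip(residual, h)) / sum(hi * hi for hi in h)
        residual = [e - beta * hi for e, hi in zip(residual, h)]
        nodes.append((a, b, beta))
        history.append(sum(e * e for e in residual))
    return nodes, history


def predict(nodes, x):
    return sum(beta * sigmoid(a * x + b) for a, b, beta in nodes)


# Hypothetical toy regression task: y = x^2 on a grid in [0, 1].
xs = [i / 20.0 for i in range(21)]
ys = [x * x for x in xs]
nodes, history = incremental_elm(xs, ys)
print(history[0], history[-1])
```

Because each output weight is the least-squares projection of the residual onto the new node's activations, the training error cannot increase when a node is added, which is why earlier weights can be frozen.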

Character-Aware Convolutional Neural Networks for Paraphrase Identification

Convolutional Neural Networks (CNNs) have been successfully used for many natural language processing applications. In this paper, we propose a novel CNN model for sentence-level paraphrase identification. We learn the sentence representations using a character-aware convolutional neural network that relies on character-level input and gives a sentence-level representation. Our model adopts both random and one-hot initialization methods for character representation and is trained on two paraphrase identification corpora comprising news and social media sentences. A comparison between the results of our approach and those of typical systems participating in the challenge on news sentences suggests that our model obtains performance comparable to these baselines. The experimental result on the tweet corpus shows that the proposed model significantly outperforms the baselines. The results also suggest that character inputs are effective for modeling sentences.

Jiangping Huang, Donghong Ji, Shuxin Yao, Wenzhi Huang

Learning a Discriminative Dictionary with CNN for Image Classification

In this paper, we propose a novel framework for image recognition based on an extended sparse model. First, inspired by the impressive results of CNNs over different tasks in computer vision, we use CNN models pre-trained on large datasets to generate features. Then we propose an extended sparse model which learns a dictionary from the CNN features by incorporating the reconstruction residual term and the coefficients adjustment term. Minimizing the reconstruction residual term guarantees that the class-specific sub-dictionary has good representation power for the samples from the corresponding class, and minimizing the coefficients adjustment term encourages samples from different classes to be reconstructed by different class-specific sub-dictionaries. With this learned dictionary, not only the representation residual but also the representation coefficients will be discriminative. Finally, a metric involving this discriminative information is introduced for image classification. Experiments on the Caltech101 and PASCAL VOC 2012 datasets show the effectiveness of the proposed method on image classification.

Shuai Yu, Tao Zhang, Chao Ma, Lei Zhou, Jie Yang, Xiangjian He

Online Weighted Multi-task Feature Selection

The goal of multi-task feature selection is to learn explanatory features across multiple related tasks. In this paper, we develop a weighted feature selection model to enhance the sparsity of the learning variables and propose an online algorithm to solve this model. The worst-case bounds of the time complexity and the memory cost of this algorithm at each iteration are both in $$\mathcal{O}(N\times Q)$$, where N is the number of feature dimensions and Q is the number of tasks. At each iteration, the learning variables can be solved analytically based on a memory of the previous (sub)gradients and the whole weighted regularization, and the weight coefficients used for the next iteration are updated by the current learned solution. A theoretical analysis for the regret bound of the proposed algorithm is presented, along with experiments on public data demonstrating that it can yield better performance, e.g., in terms of convergence speed and sparsity.

Wei Xue, Wensheng Zhang

Multithreading Incremental Learning Scheme for Embedded System to Realize a High-Throughput

Recent improvements in microcomputers enable them to execute complex intelligent algorithms on embedded systems. However, with conventional incremental learning methods, resource usage often grows as learning proceeds, and continuing incremental learning becomes difficult on small embedded systems. Moreover, for real applications, the response time should be reduced. This paper proposes a technique for implementing incremental learning methods on a budget. Normally, such methods perform online learning by alternating recognition and learning, so they cannot respond to the next new instance until the previous learning is finished. Unfortunately, their computational learning complexity is too high to realize a quick response to new input. Therefore, this paper introduces a multithreading technique for such learning schemes. The recognition and learning threads are executed in parallel so that the system can respond to a new instance even while learning is in progress. Moreover, this paper shows that such multithreading learning schemes sometimes need a “sleep-period” to complete the learning, similar to a biological brain. During the “sleep-period,” the learning system neither receives sensory inputs nor yields outputs.

Daisuke Nishio, Koichiro Yamauchi

Hyper-Parameter Tuning for Graph Kernels via Multiple Kernel Learning

Kernelized learning algorithms have seen a steady growth in popularity during the last decades. The procedure to estimate the performance of these kernels in real applications is typically computationally demanding due to the process of hyper-parameter selection. This is especially true for graph kernels, which are computationally quite expensive. In this paper, we study an approach that substitutes the commonly adopted procedure for kernel hyper-parameter selection by a multiple kernel learning procedure that learns a linear combination of kernel matrices obtained by the same kernel with different values for the hyper-parameters. Empirical results on real-world graph datasets show that the proposed methodology is faster than the baseline method when the number of parameter configurations is large, while always maintaining comparable and in some cases superior performance.

Carlo M. Massimo, Nicolò Navarin, Alessandro Sperduti

A Corrector for the Sample Mahalanobis Distance Free from Estimating the Population Eigenvalues of Covariance Matrix

To correct the deterioration in the recognition performance of the sample Mahalanobis distance caused by a small number of learning samples, a new corrector that brings the sample Mahalanobis distance toward the corresponding population Mahalanobis distance is proposed. It does not require the population eigenvalues to be estimated from the sample covariance matrix that defines the sample Mahalanobis distance. To avoid computing the population eigenvalues, which are difficult to estimate, the corrector uses Stein’s estimator of the covariance matrix. The corrector also uses an accurate expectation of the principal component of the sample Mahalanobis distance, obtained by the delta method in statistics. Numerical experiments show that the proposed corrector improves the probability distribution and the recognition performance in comparison with the sample Mahalanobis distance.

Yasuyuki Kobayashi

Online Learning Neural Network for Adaptively Weighted Hybrid Modeling

Soft sensor models constructed from historical data generalize poorly because of strong non-linearity and time-varying dynamics. Moving-window and recursive sample-updating online modeling methods cannot achieve a balance between accuracy and training speed. To address these problems, a novel online learning neural network (LNN) selects high-quality samples with just-in-time learning (JITL) for modeling. The local samples can be further determined by principal component analysis (PCA). The LNN model shows better performance but poor stability. By weighting multiple sub-models, the hybrid model improves accuracy by covering their deficiencies. Additionally, the weights can be derived from the mean square error (MSE) of each sub-model. Detailed simulation results verify the superiority of the adaptively weighted hybrid model.

Shao-Ming Yang, Ya-Lin Wang, Yong-fei Xue, Bei Sun, Bu-song Yang

Semi-supervised Support Vector Machines - A Genetic Algorithm Approach

Semi-supervised learning combines both labeled and unlabeled examples in order to find better future predictions. Semi-supervised support vector machines (SSSVM) present a non-convex optimization problem. In this paper a genetic algorithm is used to optimize the non-convex error (GSSSVM). It is evaluated on multiple datasets; the performance of the genetic algorithm is compared to its supervised equivalent and shows very good results. A tailor-made modification of the genetic algorithm is also proposed which uses fewer unlabeled examples, namely the closest neighbors of the labeled instances.

Gergana Lazarova

Hinge Loss Projection for Classification

Hinge loss is a one-sided function that gives a better solution than the squared error (SE) loss function for classification. It allows data points to have a value greater than 1 and less than $$-1$$ for the positive and negative classes, respectively. These have zero contribution to the hinge function. However, in most classification tasks, least squares (LS) methods such as ridge regression use SE instead of the hinge function. In this paper, a simple projection method is used to minimize the hinge loss function through LS methods. We modify ridge regression and its kernel-based version, i.e. kernel ridge regression, so that they can adapt to the hinge function instead of using SE in classification problems. The results show the effectiveness of the hinge loss projection method, especially on imbalanced data sets, in terms of geometric mean (GM).

Syukron Abu Ishaq Alfarozi, Kuntpong Woraratpanya, Kitsuchart Pasupa, Masanori Sugimoto
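
The one-sidedness described in the abstract above can be seen directly from the standard definitions. The sketch below compares the usual hinge loss, max(0, 1 - y·f), with the squared error, (y - f)^2, for labels y in {-1, +1}. It only illustrates the two loss functions and is not the paper's projection method; the function names and sample predictions are illustrative assumptions.

```python
def hinge_loss(y, f):
    """Standard hinge loss for a label y in {-1, +1} and a real-valued output f."""
    return max(0.0, 1.0 - y * f)


def squared_error(y, f):
    """Squared error between the label y and the output f."""
    return (y - f) ** 2


# Hypothetical predictions: confidently correct, inside the margin, and wrong.
examples = [(+1, 3.0), (-1, -2.5), (+1, 0.5), (-1, 0.5)]
for y, f in examples:
    print(y, f, hinge_loss(y, f), squared_error(y, f))
```

For the correctly classified points beyond the margin, (+1, 3.0) and (-1, -2.5), the hinge loss is zero while the squared error still penalizes them (4.0 and 2.25), which is the behaviour the abstract contrasts.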

Analytical Incremental Learning: Fast Constructive Learning Method for Neural Network

Extreme learning machine (ELM) is a fast learning algorithm for single hidden layer feed-forward neural networks (SLFNs) based on random input weights, which usually requires a large number of hidden nodes. Recently, novel constructive and destructive parsimonious (CP and DP)-ELMs, which provide effective generalization and compact hidden nodes, have been proposed. However, the performance might be unstable due to the randomization in either ordinary ELM or CP- and DP-ELM. In this study, an analytical incremental learning (AIL) algorithm is proposed in which all weights of the neural network are calculated analytically without any randomization. The hidden nodes of AIL are incrementally generated based on the residual error using the least squares (LS) method. Based on an evaluation on seven benchmark data sets, the results show the effectiveness of AIL, which not only has the smallest number of hidden nodes and is more stable but also generalizes well compared with ELM, CP-ELM and DP-ELM.

Syukron Abu Ishaq Alfarozi, Noor Akhmad Setiawan, Teguh Bharata Adji, Kuntpong Woraratpanya, Kitsuchart Pasupa, Masanori Sugimoto

Acceleration of Word2vec Using GPUs

Word2vec is a widely used word embedding toolkit which generates word vectors by training on an input corpus. Since word vectors can represent an exponential number of word clusters and enable reasoning about words with simple algebraic operations, they have become a widely used representation for subsequent NLP tasks. In this paper, we present an efficient parallelization of word2vec using GPUs that preserves the accuracy. With two K20 GPUs, the proposed acceleration technique achieves 1.7M words/sec, which corresponds to a speedup of about 20× compared to single-threaded CPU execution.

Seulki Bae, Youngmin Yi

Automatic Design of Neural Network Structures Using AiS

Structures of neural networks are usually designed by experts to fit target problems. This study proposes a method to automate small network design for a regression problem based on the Add-if-Silent (AiS) function used in the neocognitron. Because the original AiS is designed for image pattern recognition, this study modifies the intermediate function to be Radial Basis Function (RBF). This study shows that the proposed method can determine an optimized network structure using the Bike Sharing Dataset as one case study. The generalization performance is also shown.

Toshisada Mariyama, Kunihiko Fukushima, Wataru Matsumoto

Sequential Collaborative Ranking Using (No-)Click Implicit Feedback

We study Recommender Systems in the context where they suggest a list of items to users. Several crucial issues are raised in such a setting: first, identify the relevant items to recommend; second, account for the feedback given by the user after clicking and rating an item; third, since new feedback arrives in the system at any moment, incorporate such information to improve future recommendations. In this paper, we take these three aspects into consideration and present an approach handling click/no-click feedback information. Experiments on real-world datasets show that our approach outperforms state-of-the-art algorithms.

Frédéric Guillou, Romaric Gaudel, Philippe Preux

Group Information-Based Dimensionality Reduction via Canonical Correlation Analysis

As an effective way of avoiding the curse of dimensionality and improving predictive performance in high-dimensional regression analysis, dimension reduction suffers from small sample size. We propose to utilize group information generated from pairwise data to learn a low-dimensional representation highly correlated with the target value. Experimental results on four public datasets indicate that the proposed method can reduce regression error by effective dimension reduction.

Haiping Zhu, Hongming Shan, Youngjoo Lee, Yiwei He, Qi Zhou, Junping Zhang

Neuromorphic Hardware

Frontmatter

Simplification of Processing Elements in Cellular Neural Networks

Working Confirmation Using Circuit Simulation

Simplification of processing elements is greatly desired in cellular neural networks to realize ultra-large scale integration. First, we propose reducing a neuron to a two-inverter two-switch circuit, a two-inverter one-switch circuit, or a two-inverter circuit. Next, we propose reducing a synapse to only one variable resistor or one variable capacitor. Finally, we confirm the correct operation of the cellular neural networks using circuit simulation. These results will be one of the theoretical bases for applying cellular neural networks to brain-type integrated circuits.

Mutsumi Kimura, Nao Nakamura, Tomoharu Yokoyama, Tokiyoshi Matsuda, Tomoya Kameda, Yasuhiko Nakashima

Pattern and Frequency Generation Using an Opto-Electronic Reservoir Computer with Output Feedback

Reservoir Computing is a bio-inspired computing paradigm for processing time dependent signals. The performance of its analogue implementations matches other digital algorithms on a series of benchmark tasks. Their potential can be further increased by feeding the output signal back into the reservoir, which would allow the algorithm to be applied to time series generation. This requires, in principle, implementing a sufficiently fast readout layer for real-time output computation. Here we achieve this with a digital output layer driven by an FPGA chip. We demonstrate the first opto-electronic reservoir computer with output feedback and test it on two examples of time series generation tasks: pattern and frequency generation. The good results we obtain open new possible applications for analogue Reservoir Computing.

Piotr Antonik, Michiel Hermans, Marc Haelterman, Serge Massar

A Retino-Morphic Hardware System Simulating the Graded and Action Potentials in Retinal Neuronal Layers

We recently developed a retino-morphic hardware system operating at a frame interval of 5 ms, which is short enough to simulate the graded voltage responses of neurons in the retinal circuit in a quasi-continuous manner. In the present study, we made further progress by implementing the Izhikevich model so that spatial spike distributions in a ganglion-cell layer can be simulated with millisecond-order timing precision. This system is useful for examining the retinal spike encoding of natural visual scenes.

Yuka Kudo, Yuki Hayashida, Ryoya Ishida, Hirotsugu Okuno, Tetsuya Yagi
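
As an illustration of the Izhikevich model named in the abstract above, the following is a minimal software sketch of a single neuron. The abstract does not state the parameters or the integration scheme used in the hardware, so the regular-spiking values (`a`, `b`, `c`, `d`), the forward Euler update, the 0.5 ms step and the function name `simulate_izhikevich` are all assumptions made for this example.

```python
def simulate_izhikevich(current, duration_ms=1000.0, dt=0.5,
                        a=0.02, b=0.2, c=-65.0, d=8.0):
    """Euler-integrate one Izhikevich neuron and return spike times in ms.

    Parameters a, b, c, d are the commonly cited regular-spiking values;
    they are assumptions here, not values taken from the paper.
    """
    v = c          # membrane potential (mV), started at the reset value
    u = b * v      # recovery variable
    spike_times = []
    steps = int(duration_ms / dt)
    for step in range(steps):
        # Membrane equation: v' = 0.04 v^2 + 5 v + 140 - u + I
        v += dt * (0.04 * v * v + 5.0 * v + 140.0 - u + current)
        # Recovery equation: u' = a (b v - u)
        u += dt * a * (b * v - u)
        if v >= 30.0:
            # Spike: record the time, then reset v and bump u
            spike_times.append(step * dt)
            v = c
            u += d
    return spike_times


spikes = simulate_izhikevich(current=10.0)   # constant input drives spiking
silent = simulate_izhikevich(current=0.0)    # no input: neuron stays at rest
```

The spike times are resolved only to the step size `dt`, so a sub-millisecond step is needed to reach the millisecond-order timing precision the abstract mentions.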

Stability Analysis of Periodic Orbits in Digital Spiking Neurons

This paper considers the stability of various periodic spike-trains from a digital spiking neuron constructed from two coupled shift registers. The dynamics is integrated into a digital spike map defined on a set of points. In order to analyze the stability, we introduce two simple feature quantities that characterize plentifulness and superstability of the periodic spike-trains. Using the feature quantities, the stability of typical examples is investigated.

Tomoki Hamaguchi, Kei Yamaoka, Toshimichi Saito

Letter Reproduction Simulator for Hardware Design of Cellular Neural Network Using Thin-Film Synapses

Crosspoint-Type Synapses and Simulation Algorithm

Recently, neural networks have been developed for various purposes including image and voice recognition. However, those based only on software implementation require a huge amount of calculation and energy. Therefore, we are now designing hardware with a cellular neural network (CNN) that features low power, high density, and high functionality. In this study, we developed a CNN simulator for evaluating a letter reproduction algorithm. In this simulator, each neuron is connected only to neighboring neurons through surrounding synapses. The learning process is executed by modifying the strength of each connection. In particular, we assumed the use of a-IGZO films for crosspoint-type synapses, which utilize the phenomenon that the conductance changes when an electric current flows. We modeled this phenomenon and implemented it in the simulator to determine the network architecture and device parameters. In this paper, the structure, the allocation method of a-IGZO, and the algorithm are described. Finally, we confirmed that our cellular neural network can learn two letters. Furthermore, it was found that the estimated time for learning is around 100 h based on the current characteristic change model of a-IGZO film, and that some conditions to enhance the deterioration speed of a-IGZO film should be explored.

Tomoya Kameda, Mutsumi Kimura, Yasuhiko Nakashima

Sensory Perception

Frontmatter

An Analysis of Current Source Density Profiles Activated by Local Stimulation in the Mouse Auditory Cortex in Vitro

To examine microcircuit properties of the mouse auditory cortex (AC) in vitro, we extracellularly recorded spatiotemporal laminar profiles driven by short electric microstimulation on a planar multielectrode array (MEA) substrate. The recorded local field potentials (LFPs) were subsequently evaluated using current source density (CSD) analysis to identify sources and sinks. Current sinks are thought to be an indicator of net synaptic current in a small volume of cortex surrounding the recording site. Thus, CSD analysis combined with MEAs enabled us to compare mean synaptic activity in response to current stimuli on a layer-by-layer basis. Here, we used senescence-accelerated mice (SAM), some strains of which show age-related hearing loss, to examine characteristic spatiotemporal CSD patterns stimulated by electrodes in specific cortical layers. The CSD patterns were classified into several clusters based on the stimulation sites in the cortical layers. We also found, in a reduced space obtained by principal component analysis, some CSD pattern differences between the two SAM strains in terms of aging and stimulation layers. Finally, on the basis of these results, we discuss the effects of aging on AC microcircuit properties.

Daiki Yamamura, Sano Ayaka, Takashi Tateno

Differential Effect of Two Types of Anesthesia on Sound-Driven Oscillations in the Rat Primary Auditory Cortex

Neural oscillations are considered to reflect the activity of neural populations, and are thus closely associated with brain function. However, the extent to which different anesthetic agents exert unique effects on such oscillations is unclear. A mixture of three anesthetics (medetomidine, midazolam, and butorphanol) was recently developed as an alternative to ketamine, which has potential addictive effects. Yet, little is known about the effects of this combination of anesthetics on neural oscillations. In this study, we used multi-channel electrophysiological recording and flavoprotein endogenous imaging to compare sound-driven oscillations in primary auditory cortical neurons after administration of ketamine vs. a medetomidine, midazolam, and butorphanol mixture. We observed differences in high gamma activities (over 120 Hz) between these two anesthetics, independent of cortical layers, but found no differences in activities including lower frequency components (<120 Hz). Our results provide new information about how specific anesthetics influence sound-driven neural oscillations.

Hisayuki Osanai, Takashi Tateno

Developing an Implantable Micro Magnetic Stimulation System to Induce Neural Activity in Vivo

Although electromagnetic stimulation is widely used in neurological studies and clinical applications, conventional electromagnetic stimulation methods have several limitations. Recent studies have reported that micro magnetic stimulation (µMS), which can directly activate neural tissue and cells via sub-millimeter solenoids, has the potential to overcome such limitations. However, the development and application of µMS using implantable sub-millimeter solenoids has not yet been reported. Here, we propose a new implantable µMS system and evaluate its validity. In particular, using flavoprotein fluorescence imaging with a high spatial resolution, we evaluated whether the stimuli delivered by our system were large enough to activate the mouse auditory cortex in vivo. The results indicated that our system successfully activated neural tissue, and propagation of the activity was observed on the brain surface. Thus, this study is a first step toward applying implantable µMS devices as tools for basic neuroscience and clinical applications.

Shunsuke Minusa, Takashi Tateno

“Figure” Salience as a Meta-Rule for Rule Dynamics in Visual Perception

The brain faces many ill-posed problems whose solutions cannot be achieved solely on the basis of external conditions. To solve such problems, certain constraints or rules are required. Using a fixed set of rules, however, is not necessarily advantageous in ever-changing environments. Here, I revisit two of our previous psychophysical experiments. One pertains to visual depth perception based on spatial frequency cues, and the other involves apparent group motion. Results from these experiments demonstrate that perceptual rules change dynamically depending on the experimental conditions. They also suggest the existence of a meta-rule governing the dynamics of perceptual rules, which I refer to as a meta-rule of “figure” salience.

Kazuhiro Sakamoto

A Neural Network Model for Retaining Object Information Required in a Categorization Task

Categorization is our ability to generalize the properties of objects, and is clearly a fundamental cognitive capacity. A delayed match-to-categorization task requires working memory of category information, shaped by the interaction between prefrontal cortex (PFC) and inferior temporal (IT) cortex. In the present study, we present the neural mechanism by which working memory is shaped and retained in PFC and show how top-down signals from PFC to IT affect the categorization ability.

Yuki Abe, Kazuhisa Fujita, Yoshiki Kashimori

Pattern Recognition

Frontmatter

Weighted Discriminant Analysis and Kernel Ridge Regression Metric Learning for Face Verification

A new formulation of metric learning is introduced by combining kernel ridge regression (KRR) and weighted side-information linear discriminant analysis (WSILD) to enjoy the best of both worlds for the unconstrained face verification task. To be specific, we formulate a doublet constrained metric learning problem by means of a second degree polynomial kernel function. Owing to the simplicity of KRR, this metric learning problem can be solved analytically for the Mahalanobis distance metric; we name the resulting method KRRML. In addition, WSILD further enhances the learned Mahalanobis distance metric by leveraging the within-class and between-class scatter matrices of doublets. We evaluate the proposed method on the Labeled Faces in the Wild database, a large benchmark dataset targeted for unconstrained face verification. The promising result attests to the robustness and feasibility of the proposed method.

Siew-Chin Chong, Andrew Beng Jin Teoh, Thian-Song Ong

An Incremental One Class Learning Framework for Large Scale Data

In this paper, we propose a novel one class learning method for large scale data. In the context of one class learning, the proposed method can automatically learn the appropriate number of prototypes needed to represent the original target examples, and acquire the essential topological structure of the target distribution. Then, based on the learned topological structure, a neighbor analysis technique is utilized to separate the target examples from outlier examples. Experimental results show that our method can accommodate the large scale data environment, and achieve comparable or better performance than other contemporary methods on both artificial and real-world data sets.

Qilin Deng, Yi Yang, Furao Shen, Chaomin Luo, Jinxi Zhao

Gesture Spotting by Using Vector Distance of Self-organizing Map

This paper proposes a dynamic hand gesture recognition algorithm with a function of gesture spotting. The algorithm consists of two self-organizing maps (SOMs) and a Hebb learning network. Feature vectors are extracted from input images and fed to one of the SOMs, which generates a vector that represents the sequence of postures in the given frame. Using this vector, gesture classification is performed using another SOM. In this second SOM, the vector distance between the input vector and the winner neuron’s weight vector is used for the gesture spotting. The following Hebb network identifies the gesture class. The experimental results show that the system recognizes eight gestures with an accuracy of 95.8 %.

Yuta Ichikawa, Shuji Tashiro, Hidetaka Ito, Hiroomi Hikawa
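
The spotting step described in the abstract above, which uses the distance between the input vector and the winner neuron's weight vector, can be sketched as a simple threshold test. The abstract does not say how the distance is turned into a decision, so the Euclidean metric, the fixed `threshold` and the names `euclidean` and `spot_gesture` are assumptions made for this example.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))


def spot_gesture(input_vector, som_weights, threshold):
    """Find the winner neuron and reject inputs that are too far from it.

    Returns (winner_index, distance) when the input is accepted as a
    gesture, or (None, distance) when the winner distance exceeds the
    (hypothetical) threshold and the input is treated as a non-gesture.
    """
    distances = [euclidean(input_vector, w) for w in som_weights]
    winner = min(range(len(distances)), key=distances.__getitem__)
    if distances[winner] > threshold:
        return None, distances[winner]
    return winner, distances[winner]


# Toy weight vectors for a two-neuron map (illustrative values only)
weights = [[0.0, 0.0], [1.0, 1.0]]
accepted = spot_gesture([0.9, 1.0], weights, threshold=0.5)
rejected = spot_gesture([5.0, 5.0], weights, threshold=0.5)
```

In the system the abstract describes, an accepted input would then be passed to the Hebb network for class identification; that stage is not sketched here.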

Cross-Database Facial Expression Recognition via Unsupervised Domain Adaptive Dictionary Learning

Dictionary learning based methods have achieved state-of-the-art performance in the task of conventional facial expression recognition (FER), where the distributions of training and testing data are implicitly assumed to be matched. In practical scenes, however, this assumption is usually broken, especially when testing samples and training samples come from different databases, a.k.a. the cross-database FER problem. To address this problem, we propose a novel method called unsupervised domain adaptive dictionary learning (UDADL) to deal with the unsupervised case in which all samples in the target database are completely unlabeled. In UDADL, to obtain more robust representations of facial expressions and to reduce the time complexity in training and testing phases, we introduce a dual dictionary pair consisting of a synthesis one and an analysis one to mutually bridge the samples and their codes. Meanwhile, to relieve the distribution disparity of source and target samples, we further integrate the learning of unlabeled testing data into UDADL to adaptively adjust the misaligned distribution in an embedded space, where geometric structures of both domains are also encouraged to be preserved. The UDADL model can be solved by an iterative optimization strategy with each sub-optimization in a closed analytic form. Extensive experiments on the Multi-PIE and BU-3DFE databases demonstrate that the proposed UDADL is superior to most widely-used domain adaptation methods in dealing with cross-database FER, and achieves state-of-the-art performance.

Keyu Yan, Wenming Zheng, Zhen Cui, Yuan Zong

Adaptive Multi-view Semi-supervised Nonnegative Matrix Factorization

Multi-view clustering, which explores complementary information between multiple distinct feature sets, has received considerable attention. For accurate clustering, all data with the same label should be clustered together regardless of their multiple views. However, this is not guaranteed in existing approaches. To address this issue, we propose Adaptive Multi-View Semi-Supervised Nonnegative Matrix Factorization (AMVNMF), which uses label information as hard constraints to ensure that data with the same label are clustered together, so that the discriminating power of the new representations is enhanced. In addition, AMVNMF provides a viable solution to learn the weight of each view adaptively with only a single parameter. Using the $L_{2,1}$-norm, AMVNMF is also robust to noise and outliers. We further develop an efficient iterative algorithm for solving the optimization problem. Experiments carried out on five well-known datasets have demonstrated the effectiveness of AMVNMF in comparison to other existing state-of-the-art approaches in terms of accuracy and normalized mutual information.

Jing Wang, Xiao Wang, Feng Tian, Chang Hong Liu, Hongchuan Yu, Yanbei Liu

Robust Soft Semi-supervised Discriminant Projection for Feature Learning

Image feature extraction and noise/outlier processing have received more and more attention. In this paper, we first make full use of labeled and unlabeled samples, which leads to a semi-supervised model. Based on the soft label, we combine unlabeled samples with their predicted labels so that all the samples have their own soft labels. Our ratio based model maximizes the soft between-class scatter and minimizes the soft within-class scatter plus a neighborhood preserving term, so that our approach can explicitly extract discriminant and locality preserving features. Further, to make the result more robust to outliers, all the distance metrics are configured as L1-norm instead of L2-norm. An effective iterative method is used to solve the optimization problem. Finally, we conduct simulation experiments on the CASIA-HWDB1.1 and MNIST handwritten digit datasets. The results verify the effectiveness of our approach compared with other related methods.

Xiaoyu Wang, Zhao Zhang, Yan Zhang

A Hybrid Pooling Method for Convolutional Neural Networks

The convolutional neural network (CNN) is an effective machine learning model which has been successfully used in computer vision tasks such as image recognition and object detection. The pooling step is an important process in the CNN to decrease the dimensionality of the input image data and keep the transformation invariance for preventing the overfitting problem. There are two major pooling methods, i.e. max pooling and average pooling. Their performance depends on the data and the features to be extracted. In this study, we propose a hybrid system of the two pooling methods to improve the feature extraction performance. We randomly choose one of them for each pooling zone with a fixed probability. We show that the hybrid pooling method (HPM) enhances the generalization ability of CNNs in numerical experiments with handwritten digit images.

Zhiqiang Tong, Kazuyuki Aihara, Gouhei Tanaka
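
The random per-zone choice between max and average pooling described in the abstract above can be sketched in a few lines. This is only an illustration under stated assumptions: non-overlapping square zones, a single-channel image given as nested lists, and the names `hybrid_pool` and `p_max` are choices made for this example and are not taken from the paper.

```python
import random


def hybrid_pool(image, size=2, p_max=0.5, rng=None):
    """Pool non-overlapping size x size zones of a 2-D image.

    For each zone, max pooling is used with probability p_max and
    average pooling otherwise. Rows or columns that do not fill a
    complete zone are dropped (an assumption of this sketch).
    """
    rng = rng or random.Random()
    rows, cols = len(image), len(image[0])
    pooled = []
    for r in range(0, rows - size + 1, size):
        out_row = []
        for c in range(0, cols - size + 1, size):
            zone = [image[r + i][c + j]
                    for i in range(size) for j in range(size)]
            if rng.random() < p_max:
                out_row.append(max(zone))               # max pooling
            else:
                out_row.append(sum(zone) / len(zone))   # average pooling
        pooled.append(out_row)
    return pooled


image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
mixed = hybrid_pool(image, p_max=0.5, rng=random.Random(0))
```

Setting `p_max` to 1.0 or 0.0 recovers pure max pooling or pure average pooling, which gives a quick sanity check for the mixed case.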

Multi-nation and Multi-norm License Plates Detection in Real Traffic Surveillance Environment Using Deep Learning

This paper aims to highlight the problems of license plate detection in a real traffic surveillance environment. We notice that existing systems require strong assumptions on license plate norm and environment. We propose a novel solution based on deep learning using self-taught features to localize multi-nation and multi-norm license plates under real road conditions such as poor illumination, complex background and several positions. Our method is insensitive to illumination (day, night, sunrise, sunset, ...), translation and pose. Despite the low resolution of images collected from a real road surveillance environment, a series of experiments shows interesting results and the fastest processing time compared with traditional algorithms.

Amira Naimi, Yousri Kessentini, Mohamed Hammami

A Study on Cluster Size Sensitivity of Fuzzy c-Means Algorithm Variants

Detecting clusters of different sizes represents a serious difficulty for all c-means clustering models. This study investigates a set of modified fuzzy c-means clustering algorithms within the bounds of the probabilistic constraint, from the point of view of their sensitivity to cluster sizes. Two numerical frameworks are constructed, one of them addressing clusters of different cardinalities but relatively similar diameter, while the other manipulates both cluster cardinality and diameter. The numerical evaluations have shown the existence of algorithms that can effectively handle both cases. However, these are difficult to automatically adjust to the input data through their parameters.

László Szilágyi, Sándor M. Szilágyi, Călin Enăchescu
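
For readers unfamiliar with the probabilistic constraint mentioned in the abstract above, the sketch below computes the membership values of the standard fuzzy c-means algorithm for one data point, where the memberships over all clusters sum to one. It shows only the textbook baseline, not any of the modified variants the study compares; the function name `fcm_memberships` and the fuzzifier default `m=2.0` are assumptions made for this example.

```python
def fcm_memberships(distances, m=2.0):
    """Standard fuzzy c-means memberships for a single data point.

    distances[i] is the distance from the point to cluster centre i
    (all distances must be positive) and m > 1 is the fuzzifier.
    Uses u_i = 1 / sum_j (d_i / d_j) ** (2 / (m - 1)), so the returned
    memberships satisfy the probabilistic constraint sum_i u_i = 1.
    """
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for d_i in distances:
        denom = sum((d_i / d_j) ** exponent for d_j in distances)
        memberships.append(1.0 / denom)
    return memberships


# A point three times closer to the first centre than to the second
u = fcm_memberships([1.0, 3.0])
```

Because the memberships depend only on distance ratios, this baseline has no term for cluster cardinality or diameter, which is the setting in which the abstract's size sensitivity question arises.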

Social Networks

Frontmatter

Influence Spread Evaluation and Propagation Rebuilding

In social networks, studies about influence maximization mainly focus on algorithms for finding seed nodes, but ignore the intrinsic properties of influence propagation. In this paper, we consider the relationship between seed sets and influence spread. For static propagation, we abstract the relationship between the size of the seed set and the influence spread in the influence maximization problem as a logarithmic function. We also provide experiments on large collaboration networks, showing the rationality and the accuracy of the proposed function. For dynamic influence propagation, we rebuild it as a continuous linear dynamical system called 3DS, which is based on Newton’s law of cooling. Furthermore, we give an efficient method to compute the influence spread as a function of time without much loss of accuracy. Its efficiency is demonstrated by complexity analysis.

Qianwen Zhang, Cheng-Chao Huang, Jinkui Xie
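
The abstract above states that influence spread is modelled as a logarithmic function of seed set size but does not give the functional form. As a hedged illustration, the sketch below assumes a two-parameter form, spread ≈ a·ln(k) + b, and fits it by ordinary least squares; the form, the fitting method and the name `fit_log_curve` are assumptions of this example, and the data are synthetic.

```python
import math


def fit_log_curve(seed_sizes, spreads):
    """Least-squares fit of spread = a * ln(k) + b; returns (a, b).

    The two-parameter logarithmic form is an assumption of this sketch.
    seed_sizes must contain at least two distinct positive values.
    """
    xs = [math.log(k) for k in seed_sizes]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(spreads) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, spreads))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


# Synthetic data generated from a known curve, used only to check the fit
sizes = [1, 2, 4, 8, 16]
observed = [2.0 * math.log(k) + 3.0 for k in sizes]
a, b = fit_log_curve(sizes, observed)
```

A curve of this kind grows with diminishing returns as seeds are added, which is why a logarithm is a natural candidate for the static case.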

A Tag Probability Correlation Based Microblog Recommendation Method

In order to improve users’ experience, it is necessary to recommend valuable and interesting content to users. A tag probability correlation based microblog recommendation method (TPCMR) is presented, based on an analysis of microblog features and the deficiencies of existing microblog recommendation algorithms. First, our method takes advantage of the probability correlation between tags to construct the tag similarity matrix. Then the weight of each tag for each user is enhanced based on the relevance weighting scheme, and the user tag matrix is constructed. This matrix is updated using the tag similarity matrix, so that it contains both the user interest information and the relationships between tags. Experimental results show that the algorithm is effective for microblog recommendation.

Di Zhang, Huifang Ma, Junjie Jia, Li Yu

A New Model and Heuristic for Infection Minimization by Cutting Relationships

Models of infection spreading have been used and applied to economic, health, and social contexts. When spreading is treated as an optimization problem, it can be maximized or minimized. This paper presents a novel optimization problem for infection spreading control applied to networks. It takes as a parameter the number of relations (edges) that must be cut, and the optimal solution is the set of edges that must be cut to ensure the minimal infection over time. The problem uses SEIS node states, which are based on the SEIR and SIS models. We refer to the problem as Min-SEIS-Cluster. The model also considers that infections occur with different probabilities in different clusters of individuals (nodes). We also report a heuristic to solve Min-SEIS-Cluster. The analysis of the obtained results shows a positive correlation between the proportion of removed edges and the relative increase in mitigation effectiveness.

Rafael de Santiago, Wellington Zunino, Fernando Concatto, Luís C. Lamb

Sentiment and Behavior Analysis of One Controversial American Individual on Twitter

Social media is a convenient tool for expressing ideas and a powerful means for opinion formation. In this paper, we apply sentiment analysis and machine learning techniques to study a controversial American individual on Twitter, aiming to grasp temporal patterns of opinion changes and the geographical distribution of sentiments (positive, neutral or negative) across the American territory. Specifically, we choose the American TV presenter and candidate for the Republican party nomination, Donald J. Trump. The results aim to elucidate some interesting points about the data, such as: what is the distribution of users considering a match between their sentiment and their relevance? Which clusters can we get from the temporal data of each state? How are sentiments distributed before and after the first two Republican party debates?

J. Eliakin M. de Oliveira, Moshe Cotacallapa, Wilson Seron, Rafael D. C. dos Santos, Marcos G. Quiles

Brain-Machine Interface

Frontmatter

Emotion Recognition Using Multimodal Deep Learning

To enhance the performance of affective models and reduce the cost of acquiring physiological signals for real-world applications, we adopt a multimodal deep learning approach to construct affective models with the SEED and DEAP datasets to recognize different kinds of emotions. We demonstrate that high level representation features extracted by the Bimodal Deep AutoEncoder (BDAE) are effective for emotion recognition. With the BDAE network, we achieve mean accuracies of 91.01 % and 83.25 % on the SEED and DEAP datasets, respectively, which are much superior to those of the state-of-the-art approaches. By analysing the confusion matrices, we found that EEG and eye features contain complementary information and that the BDAE network could fully take advantage of this complementary property to enhance emotion recognition.

Wei Liu, Wei-Long Zheng, Bao-Liang Lu

Continuous Vigilance Estimation Using LSTM Neural Networks

In this paper, we propose a novel continuous vigilance estimation approach using LSTM Neural Networks and combining Electroencephalogram (EEG) and forehead Electrooculogram (EOG) signals. We combine these two modalities to leverage their complementary information using a multimodal deep learning method. Moreover, since the change of vigilance level is a time dependent process, temporal dependency information is explored in this paper, which significantly improves the performance of vigilance estimation. We introduce two LSTM Neural Network architectures, the F-LSTM and the S-LSTM, to encode the time sequences of EEG and EOG into a high level combined representation, from which we can predict the vigilance levels. The experimental results demonstrate that both LSTM multimodal structures can improve the performance of vigilance estimation models in comparison with single modality models and models without temporal dependency.

Nan Zhang, Wei-Long Zheng, Wei Liu, Bao-Liang Lu

Motor Priming as a Brain-Computer Interface

This paper reports on a project to overcome a difficulty associated with motor imagery (MI) in a brain–computer interface (BCI), in which user training relies on discovering how to best carry out the MI given only open-ended instructions. To address this challenge we investigate the use of motor priming (MP), a similar mental task but one linked to a tangible behavioural goal. To investigate the efficacy of this approach in creating the changes in brain activity necessary to drive a BCI, an experiment is carried out in which the user is required to prepare and execute predefined movements. Significant lateralisations of alpha activity are discussed and significant classification accuracies of movement preparation versus no preparation are also reported, indicating that this method is a promising alternative to motor imagery in driving a BCI.

Tom Stewart, Kiyoshi Hoshino, Andrzej Cichocki, Tomasz M. Rutkowski

Discriminating Object from Non-object Perception in a Visual Search Task by Joint Analysis of Neural and Eyetracking Data

The single-trial classification of neural responses to stimuli is an essential element of non-invasive brain-machine interfaces (BMI) based on the electroencephalogram (EEG). However, typically, these stimuli are artificial and the classified neural responses are only indirectly related to the content of the stimulus. Fixation-related potentials (FRP) promise to overcome these limitations by directly reflecting the content of visual information that is perceived. We present a novel approach for discriminating between single-trial FRP related to fixations on objects versus on a plain background. The approach is based on a source power decomposition that exploits fixation parameters as target variables to guide the optimization. Our results show that this method is able to classify object versus non-object epochs with a much better accuracy than reported previously. Hence, we provide a further step toward exploiting FRP for more versatile and natural BMI.

Andrea Finke, Helge Ritter

Assessing the Properties of Single-Trial Fixation-Related Potentials in a Complex Choice Task

Event-related potentials (ERP) are usually studied by means of their grand averages, or, as in brain-machine interfaces (BMI), classified on a single-trial level. Neither approach offers a detailed insight into the individual, qualitative variations of the ERP occurring between single trials. These variations, however, convey valuable information on subtle but relevant differences in the neural processes that generate these potentials. Understanding these differences is even more important when ERP are studied in more complex, natural and real-life scenarios, which is essential to improve and extend current BMI. We propose an approach for assessing these variations, namely amplitude, latency and morphology, in a recently introduced ERP, fixation-related potentials (FRP). To this end, we conducted a study with a complex, real-world-like choice task to acquire FRP data. Then, we present our method based on multiple linear regression and outline how this method may be used for a detailed, qualitative analysis of single-trial FRP data.

Dennis Wobrock, Andrea Finke, Thomas Schack, Helge Ritter

Computer Vision

Frontmatter

Unconstrained Face Detection from a Mobile Source Using Convolutional Neural Networks

We present unconstrained mobile face detection using convolutional neural networks, which has potential application in guidance systems for visually impaired persons. We develop a dataset of videos captured from a mobile source that features motion blur and noise from camera shakes. This makes the application a very challenging aspect of unconstrained face detection. The performance of the convolutional neural network is compared with a cascade classifier. The results show promising performance in daylight and artificial lighting conditions, while challenges remain in moonlight conditions, where false positives must be reduced in order to develop a robust system.

Shonal Chaudhry, Rohitash Chandra

Driver Face Detection Based on Aggregate Channel Features and Deformable Part-Based Model in Traffic Camera

We explore the problem of detecting driver faces in cabs from images taken by traffic cameras. Dim light in cabs, occlusion and low resolution make it a challenging problem. We employ aggregate channel features instead of a single feature to reduce the miss rate, which will introduce more false positives. Based on the observation that most running vehicles have a license plate and that the relative position between the plate and the driver face has an approximately fixed pattern, we draw on the concept of the deformable part-based model and regard a candidate face and a plate as two deformable parts of a face-plate couple. A candidate face will be rejected if it has a low confidence score. Experimental results demonstrate the effectiveness of our method.

Yang Wang, Xiaoma Xu, Mingtao Pei

Segmentation with Selectively Propagated Constraints

This paper presents a novel selective constraint propagation method for constrained image segmentation. In the literature, many pairwise constraint propagation methods have been developed to exploit pairwise constraints for cluster analysis. However, since these methods mostly have polynomial time complexity, they are not well suited to segmenting images of even moderate size, which is equivalent to cluster analysis on a large data set. In this paper, we therefore perform pairwise constraint propagation only over a selected subset of pixels rather than over the whole image. This selective constraint propagation problem is then solved by an efficient graph-based learning algorithm. Finally, the selectively propagated constraints are exploited, based on $L_1$-minimization, for normalized cuts over the whole image. The experimental results show the promising performance of the proposed method.

Peng Han, Guangzhen Liu, Songfang Huang, Wenwu Yuan, Zhiwu Lu

Gaussian-Bernoulli Based Convolutional Restricted Boltzmann Machine for Images Feature Extraction

Image feature extraction is an essential step in image recognition. In this paper, we combine the effectiveness of the Gaussian-Bernoulli Restricted Boltzmann Machine (GRBM) in learning discriminative image features with the capability of the Convolutional Neural Network (CNN) to learn spatial features, and propose a hybrid model called the Convolutional Gaussian-Bernoulli Restricted Boltzmann Machine (CGRBM) for image feature extraction. Experimental results on several benchmark datasets show that our model is more effective for natural image recognition tasks than some popular methods, which suggests that the proposed method is potentially applicable to real-valued image feature extraction and recognition.

Ziqiang Li, Xun Cai, Ti Liang

Gaze Movement Control Neural Network Based on Multidimensional Topographic Class Grouping

Target search is an important ability of the human visual system. One major problem is how to model the real human visual cognitive process, which requires only a few samples for learning and can use acquired knowledge for inference when searching for a new target. Based on Topographic Class Grouping (TCG) [1] and a series of models of Visual Perceiving and Eyeball-Motion Controlling Neural Networks [2–5], we improve these models by incorporating the cerebral self-organizing feature mapping function in terms of multidimensional TCG. In this paper, we propose a gaze movement control neural network based on multidimensional TCG. Experiments show that, by adding a block of multidimensional TCG and by self-organized clustering of the relationship between visual field image features and spatial positions, the gaze movement control neural network achieves visual inference and stable results on target search tasks.

Wenqi Zhong, Jun Miao, Laiyun Qing

Incremental Robust Nonnegative Matrix Factorization for Object Tracking

Nonnegative Matrix Factorization (NMF) has received considerable attention in visual tracking. However, noise and outliers are not handled well because of the Frobenius norm in NMF's objective function. To address this issue, this paper introduces NMF with an $L_{2,1}$ norm loss function (robust NMF) into appearance modelling for visual tracking. Compared to standard NMF, robust NMF not only handles noise and outliers but also provides a sparsity property. In our visual tracking framework, the basis matrix from robust NMF is used for appearance modelling with an additional $\ell_1$ constraint on the reconstruction error. A corresponding iterative algorithm is proposed to solve this problem. To strengthen its practicality in visual tracking, multiplicative update rules for incremental learning of robust NMF are proposed for the model update. Experiments on the benchmark show that the proposed method achieves favorable performance compared with other state-of-the-art methods.
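
As a rough illustration of why an $L_{2,1}$ loss is less sensitive to outliers than the Frobenius norm, the following sketch compares the two on a small residual matrix. It is not the authors' algorithm: the column-as-sample convention, the function names and the example values are assumptions made for illustration.

```python
import math

def frobenius_sq(E):
    """Squared Frobenius norm: sum of all squared entries."""
    return sum(x * x for row in E for x in row)

def l21_norm(E):
    """L2,1 norm, assuming each column is one sample:
    the sum of the Euclidean norms of the columns."""
    n_cols = len(E[0])
    return sum(math.sqrt(sum(row[j] ** 2 for row in E)) for j in range(n_cols))

# Hypothetical residual matrix: two well-reconstructed samples (columns 0, 1)
# and one outlier sample (column 2).
E = [[0.1, 0.0, 3.0],
     [0.0, 0.1, 4.0]]

frob = frobenius_sq(E)   # outlier enters squared
l21 = l21_norm(E)        # outlier enters only linearly
```

In this toy case the outlier column contributes its squared length to the Frobenius loss but only its length to the $L_{2,1}$ loss, so it dominates the former much more strongly.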

Fanghui Liu, Mingna Liu, Tao Zhou, Yu Qiao, Jie Yang

High Precision Direction-of-Arrival Estimation for Wideband Signals in Environment with Interference Based on Complex-Valued Neural Networks

We propose a null steering scheme for wideband acoustic pulses and narrow band interference (NBI) using direction of arrival (DoA) estimation based on complex-valued spatio-temporal neural networks (CVSTNNs) and a power inversion adaptive array (PIAA). For acoustic imaging, the pulse spectrum should be wide to make the pulses less invasive. When a pulse has a wide frequency band, narrow band interference causes DoA errors in the conventional CVSTNN. We use the weights of the PIAA as the initial weights of the CVSTNN to achieve higher precision in DoA estimation. Simulations demonstrate that the proposed method achieves more accurate DoA estimation than the conventional CVSTNN method.

Kazutaka Kikuta, Akira Hirose

Content-Based Image Retrieval Using Deep Search

The aim of Content-based Image Retrieval (CBIR) is to find a set of images that best match the query based on visual features. Most existing CBIR systems find similar images based on low-level features, while Text-based Image Retrieval (TBIR) systems find images with relevant tags regardless of the contents of the images. Generally, people are more interested in images that are similar both in contours and in high-level concepts. We therefore propose a new strategy called Deep Search to meet this requirement. It mines knowledge from images similar to the original query in order to compensate for information lost in the feature extraction process. To evaluate the performance of the Deep Search approach, we apply it to three different CBIR systems (HOF [5], HOG and GIST) in our experiments. The results show that Deep Search greatly improves the performance of the original algorithms and is not restricted to any particular method.

Zhengzhong Zhou, Liqing Zhang

Robust Part-Based Correlation Tracking

Visual tracking is a challenging task in which the target may undergo background clutter, deformation, severe occlusion and out-of-view motion in video sequences. In this paper, we propose a novel tracking method that uses representative parts of the target to handle occlusion. For the sake of efficiency, we train a classifier for each part using a correlation filter, which has recently been used in visual tracking because of its computational efficiency. In addition, we exploit the motion vectors of reliable parts between two consecutive frames to estimate the position of the target, and we use the spatial relationship between each representative part and the target center to estimate the scale of the target. Furthermore, part models are adaptively updated to avoid introducing errors that can cause model drift. Extensive experiments show that our algorithm is comparable to state-of-the-art methods on a visual tracking benchmark in terms of accuracy and robustness.

Xiaodong Liu, Yue Zhou

A New Weight Adjusted Particle Swarm Optimization for Real-Time Multiple Object Tracking

This paper proposes a novel Weight Adjusted Particle Swarm Optimization (WAPSO) to overcome the occlusion problem and reduce the computational cost in multiple object tracking. To this end, a new update strategy for the inertia weight of the particles in WAPSO is designed to maintain particle diversity and prevent premature convergence. Meanwhile, a mechanism that enlarges the search space when occlusion is detected enhances WAPSO's robustness to non-linear target motion. In addition, the choice of Root Sum Squared Errors as the fitness function further increases the speed of the proposed approach. The experimental results show that, in combination with the model feature that enables initialization of multiple independent swarms, the high-speed WAPSO algorithm can be applied to multiple non-linear object tracking in real-time applications.
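
The two ingredients named above, an inertia weight that changes over time and a Root Sum Squared Errors fitness, can be sketched as follows. This is a minimal toy, not WAPSO itself: the abstract does not give the authors' inertia update, so a standard linearly decreasing weight is swapped in, and the function names, parameter values and the use of raw positions as "features" are all assumptions.

```python
import math
import random

def rsse(candidate, target):
    """Root sum squared errors between two equal-length feature vectors."""
    return math.sqrt(sum((c - t) ** 2 for c, t in zip(candidate, target)))

def pso(target, n_particles=20, n_iter=50, w_max=0.9, w_min=0.4,
        c1=2.0, c2=2.0, seed=0):
    """Plain PSO minimising rsse(position, target).
    Inertia decreases linearly from w_max to w_min (an assumed schedule)."""
    rng = random.Random(seed)
    dim = len(target)
    pos = [[rng.uniform(-10, 10) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [rsse(p, target) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    history = [gbest_fit]  # best fitness before and after every iteration
    for it in range(n_iter):
        w = w_max - (w_max - w_min) * it / max(n_iter - 1, 1)
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            fit = rsse(pos[i], target)
            if fit < pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit < gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
        history.append(gbest_fit)
    return gbest, history

best, history = pso([1.0, -2.0])
```

In a tracker the target vector would typically be a descriptor of the object template and each particle a candidate window; here the particle position is compared to the target directly to keep the sketch short.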

Guang Liu, Zhenghao Chen, Henry Wing Fung Yeung, Yuk Ying Chung, Wei-Chang Yeh

Fast Visual Object Tracking Using Convolutional Filters

Recently, a class of tracking techniques called synthetic exact filters has been shown to give promising results at impressive speeds. Synthetic exact filters are trained using a large number of training images and associated continuous labels; however, there is not much theory behind them. In this paper, we explain theoretically why methods based on synthetic exact filters work well and propose a novel visual object tracking algorithm based on convolutional filters, which are trained only on training images without labels. Compared with prior methods such as synthetic exact filters, which are trained on training images and labels, the convolutional filters are faster and more robust, insensitive to parameters and simpler in the pre-processing of training images. Convolutional filters are theoretically optimal in terms of the signal-to-noise ratio. Furthermore, we use spatial context information to improve the robustness of our tracking system. Experiments on many challenging video sequences demonstrate that our tracker based on convolutional filters is competitive with state-of-the-art trackers in accuracy and outperforms most trackers in efficiency.

Mingxuan Di, Guang Yang, Qinchuan Zhang, Kang Fu, Hongtao Lu

An Effective Approach for Automatic LV Segmentation Based on GMM and ASM

In this paper, we propose a novel approach for automatic left ventricle (LV) segmentation in cardiac magnetic resonance images (CMRI). The algorithm incorporates three key techniques: (1) mid-ventricular coarse segmentation based on a Gaussian mixture model (GMM); (2) mid-slice endo-/epi-cardial initialization based on geometric transformation; (3) myocardium tracking based on active shape models (ASM). Experimental results on a standard database demonstrate the effectiveness and competitiveness of the proposed method.

Yurun Ma, Deyuan Wang, Yide Ma, Ruoming Lei, Min Dong, Kemin Wang, Li Wang

Position Gradient and Plane Consistency Based Feature Extraction

Labeling scene objects is an essential task for many computer vision applications. However, differentiating visually similar scene objects is very challenging. To overcome this challenge, this paper proposes a feature based on position gradient and plane consistency, which is designed to distinguish visually similar objects and improve the overall labeling accuracy. Using the proposed feature, we can differentiate objects that have the same histogram of gradients, as well as horizontal and vertical objects. Integrating the proposed feature with low-level texture features and a neural network classifier, we achieve superior performance (82 %) compared to state-of-the-art scene labeling methods on the Stanford background dataset.

Sujan Chowdhury, Brijesh Verma, Ligang Zhang

Fusion of Multi-view Multi-exposure Images with Delaunay Triangulation

In this paper, we present a completely automatic method for multi-view multi-exposure image fusion. The technique adopts the normalized cross-correlation (NCC) as the similarity measure for interest points. With the matched feature points, we divide the images into a set of triangles by Delaunay triangulation. We then apply an affine transformation to each matched triangle pair to register the multi-view images. After the images are aligned, we partition the image domain into uniform regions and, for each block, select the image that provides the most information. The selected images are fused together using monotonic blending functions.
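
A minimal sketch of normalized cross-correlation between two patches may help show why it suits multi-exposure matching: the score is unchanged when one patch is a brightened or contrast-scaled copy of the other. The function name and the flat-list patch representation are assumptions for illustration, not the authors' implementation.

```python
import math

def ncc(a, b):
    """Normalized cross-correlation of two equal-size patches,
    each given as a flat list of intensities. Result lies in [-1, 1]."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0.0:
        return 0.0  # a constant patch has no structure to correlate
    return sum(x * y for x, y in zip(da, db)) / denom

patch = [10, 20, 30, 40]
brighter = [2 * v + 5 for v in patch]   # same structure, different exposure
inverted = list(reversed(patch))        # opposite gradient

same_structure = ncc(patch, brighter)
opposite_structure = ncc(patch, inverted)
```

Subtracting the mean removes a brightness offset and dividing by the norms removes a gain factor, which is the property the two example patches exercise.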

Hanyi Yu, Yue Zhou

Detection of Human Faces Using Neural Networks

Human face detection is a key technology in machine vision applications such as human recognition, access control and security surveillance. This research proposes a precise scheme for human face detection using a hybrid neural network. The system is based on the visual information in face image sequences and begins by estimating the skin area from color components. In this paper, we consider the HSV and YCbCr color spaces to extract the visual features. These features are used to train a hybrid network consisting of a bidirectional associative memory (BAM) and a back propagation neural network (BPNN). The BAM is used for dimensionality reduction and the multi-layer BPNN is trained on the facial color features. Our system compares favorably with existing methods in terms of both accuracy and computational efficiency. The low computation time required for face detection makes it suitable for real-time applications.
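
The skin-area estimation step can be illustrated with a simple per-pixel rule in the two color spaces mentioned. This is only a sketch: the RGB to YCbCr conversion is the standard full-range BT.601 (JPEG) formula, but the threshold ranges are common heuristics chosen here as assumptions, not values taken from the paper, and the function names are hypothetical.

```python
import colorsys

def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601 (JPEG) conversion for 8-bit RGB values."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

def is_skin(r, g, b):
    """Heuristic skin test; all thresholds below are assumed for illustration."""
    _, cb, cr = rgb_to_ycbcr(r, g, b)
    h, s, _ = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    in_cbcr = 77 <= cb <= 127 and 133 <= cr <= 173   # chrominance box
    in_hsv = h * 360.0 <= 50.0 and 0.2 <= s <= 0.7   # reddish hue, moderate saturation
    return in_cbcr and in_hsv

skin_like = is_skin(224, 172, 105)   # a light skin tone
pure_blue = is_skin(0, 0, 255)
```

A mask built this way would only provide candidate regions; in the described system the color features are then passed on to the BAM and BPNN stages.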

Mozammel Chowdhury, Junbin Gao, Rafiqul Islam

Compound PDE-Based Image Restoration Algorithm Using Second-Order and Fourth-Order Diffusions

A hybrid nonlinear diffusion-based image restoration technique is proposed in this article. The novel compound PDE denoising model combines nonlinear second-order and fourth-order diffusions to achieve a more effective image enhancement. The weak solution of the combined PDE, representing the restored digital image, is determined by developing a robust explicit numerical approximation scheme using the finite-difference method. The performed denoising tests and method comparison are also described in this paper.
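
To make the idea of an explicit finite-difference scheme for nonlinear diffusion concrete, the sketch below applies the classical Perona-Malik second-order model to a 1D signal. It is a stand-in, not the compound model of the paper: the fourth-order term is omitted, and the diffusivity function, step size `dt`, contrast parameter `k` and test signal are assumptions.

```python
def perona_malik_1d(u, n_steps=20, dt=0.2, k=0.5):
    """Explicit scheme for u_t = (g(|u_x|) u_x)_x with g(s) = 1 / (1 + (s/k)^2).
    Zero-flux boundaries; dt <= 0.5 keeps each update a convex combination."""
    u = list(u)
    n = len(u)
    for _ in range(n_steps):
        # flux[i] is the diffusive flux between samples i and i + 1
        flux = []
        for i in range(n - 1):
            d = u[i + 1] - u[i]
            g = 1.0 / (1.0 + (d / k) ** 2)  # small across strong edges
            flux.append(g * d)
        new = u[:]
        for i in range(n):
            right = flux[i] if i < n - 1 else 0.0
            left = flux[i - 1] if i > 0 else 0.0
            new[i] = u[i] + dt * (right - left)
        u = new
    return u

# Noisy step edge (hypothetical values): small noise on each plateau, jump of about 5.
noisy = [0.0, 0.1, -0.1, 0.05, 0.0, 5.0, 5.1, 4.9, 5.05, 5.0]
restored = perona_malik_1d(noisy)
```

Because the diffusivity is close to 1 for small differences and close to 0 across the large jump, the plateaus are smoothed while the edge is largely kept, which is the behaviour that nonlinear diffusion models of this kind aim for.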

Tudor Barbu

Multi-swarm Particle Grid Optimization for Object Tracking

In recent years, Particle Swarm Optimization (PSO), one of the most popular swarm intelligence algorithms, has been shown to track object movement efficiently and accurately. However, the problem of multiple interferences in object tracking still needs to be overcome. In this paper, we propose a new multiple-swarm approach to improve the efficiency of PSO in object tracking. The proposed algorithm allocates multiple swarms to separate frame grids, providing higher accuracy and a wider search domain; this overcomes some interference problems and produces a stable and precise tracking trajectory. It also achieves better quality in target focusing and retrieval. Experiments in real environments show better performance than traditional methods such as Particle Filter, Genetic Algorithm and traditional PSO.

Feng Sha, Henry Wing Fung Yeung, Yuk Ying Chung, Guang Liu, Wei-Chang Yeh

Energy-Based Multi-plane Detection from 3D Point Clouds

Detecting multiple planes in 3D point clouds can provide concise and meaningful abstractions of 3D data and give users higher-level interaction possibilities. However, existing algorithms are deficient in accuracy and robustness and depend heavily on thresholds. To overcome these deficiencies, a novel method is proposed that detects multiple planes in 3D point clouds by labeling points instead of greedily searching for planes. First, it generates initial models. Second, it computes energy terms and constructs the energy function. Third, it solves the point labeling problem by minimizing the energy function. Then, it refines the labels and the parameters of the detected planes. This process is iterated until the energy no longer decreases, at which point the multiple planes have been detected. Experimental results validate the proposed method: it outperforms existing algorithms in accuracy and robustness, and it alleviates both the heavy dependence on thresholds and the problem of the unknown number of planes in 3D point clouds.
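
The labeling view described above can be sketched with a deliberately reduced energy. The abstract does not specify the energy terms, so this example assumes a data term (point-to-plane distance) plus a fixed cost per plane in use, and assigns labels by the nearest plane rather than by a proper energy minimizer; all names and values are hypothetical.

```python
import math

def point_plane_distance(p, plane):
    """Distance from point p = (x, y, z) to plane (a, b, c, d): ax + by + cz + d = 0."""
    a, b, c, d = plane
    return abs(a * p[0] + b * p[1] + c * p[2] + d) / math.sqrt(a * a + b * b + c * c)

def label_points(points, planes):
    """Assign each point the index of its nearest plane (data term only)."""
    return [min(range(len(planes)), key=lambda j: point_plane_distance(p, planes[j]))
            for p in points]

def energy(points, planes, labels, label_cost=1.0):
    """Assumed energy: sum of data costs plus a cost for every plane in use."""
    data = sum(point_plane_distance(p, planes[l]) for p, l in zip(points, labels))
    return data + label_cost * len(set(labels))

planes = [(0.0, 0.0, 1.0, 0.0),    # z = 0
          (1.0, 0.0, 0.0, -5.0)]   # x = 5
points = [(1.0, 2.0, 0.05), (5.1, 3.0, 2.0), (0.0, 0.0, -0.02)]

labels = label_points(points, planes)
total = energy(points, planes, labels)
```

An iterative scheme in the spirit of the abstract would alternate such a labeling step with refitting the plane parameters to their assigned points and stop once the energy no longer decreases.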

Liang Wang, Chao Shen, Fuqing Duan, Ping Guo

Bi-Lp-Norm Sparsity Pursuiting Regularization for Blind Motion Deblurring

Blind motion deblurring from a single image is essentially an ill-posed problem that requires regularization to solve. In this paper, we introduce an efficient and fast method for estimating the motion blur kernel through a bi-lp-norm regularization applied to both the sharp image and the blur kernel in the MAP framework. Without requiring any prior information about the latent image or the blur kernel, the proposed approach is able to restore high-quality images from given blurred images. Moreover, a fast numerical scheme is used for alternately calculating the sharp image and the blur kernel, combining the split Bregman method with a look-up table trick. Experiments on both synthesized and real images show that our algorithm can compete with much more sophisticated state-of-the-art methods.

Wanlin Gan, Yue Zhou, Liming He

Backmatter
