About this book

The three-volume set LNCS 7062, LNCS 7063, and LNCS 7064 constitutes the proceedings of the 18th International Conference on Neural Information Processing, ICONIP 2011, held in Shanghai, China, in November 2011.

The 262 regular session papers presented were carefully reviewed and selected from numerous submissions.

The papers of part I are organized in topical sections on perception, emotion and development, bioinformatics, biologically inspired vision and recognition, bio-medical data analysis, brain signal processing, brain-computer interfaces, brain-like systems, brain-realistic models for learning, memory and embodied cognition, Clifford algebraic neural networks, combining multiple learners, computational advances in bioinformatics, and computational-intelligent human computer interaction.

The second volume is structured in topical sections on cybersecurity and data mining workshop, data mining and knowledge discovery, evolutionary design and optimisation, graphical models, human-originated data analysis and implementation, information retrieval, integrating multiple nature-inspired approaches, kernel methods and support vector machines, and learning and memory.

The third volume contains all the contributions connected with multi-agent systems, natural language processing and intelligent Web information processing, neural encoding and decoding, neural network models, neuromorphic hardware and implementations, object recognition, visual perception modelling, and advances in computational intelligence methods based pattern recognition.



Multi-agent Systems

Multimodal Identity Verification Based on Learning Face and Gait Cues

In this paper we propose a novel multimodal Bayesian approach based on PCA-LDA processing for person identification from low resolution surveillance video with cues extracted from gait and face biometrics. The experimental evaluation of the proposed scheme on a publicly available database [2] showed that the combined PCA-LDA face and gait features can lead to powerful identity verification and can capture the inherent multimodality in walking gait patterns and discriminate the identity from low resolution surveillance videos.

Emdad Hossain, Girija Chetty

Robust Control of Nonlinear System Using Difference Signals and Multiple Competitive Associative Nets

This paper describes a robust control method using difference signals and multiple competitive associative nets (CAN2s). Using difference signals of a plant to be controlled, the CAN2 is capable of learning piecewise Jacobian matrices of the nonlinear dynamics of the plant. By employing the GPC (generalized predictive controller), a robust control method that switches between multiple CAN2s to cope with plant parameter changes is introduced. We show the effectiveness of the present method via numerical experiments on a crane system.

Shuichi Kurogi, Hiroshi Yuno, Takeshi Nishida, Weicheng Huang

Selective Track Fusion

In this paper, the relationship between the fusion result and the number of sensor tracks taking part in fusion is investigated, which reveals that it may be better to fuse many instead of all of the sensor tracks at hand. This result is interesting because at present, most approaches fuse all the available sensor tracks and treat all sensor data equally, without regard to their different quality and different contribution to the system tracks. Then, in order to show that the appropriate sensor tracks for a fusion can be effectively selected from a set of available sensor tracks, an approach named STF is presented. STF is based on a two-stage paradigm of heuristic function construction and track state estimation fusion. First, the outliers in the tracks are eliminated by the orthogonal polynomial regression method. Then a heuristic function is constructed by evaluating the quality of each track using the grey correlation degree. Last, the track state estimation fusion is guided by the heuristic function, in which an optimal number of tracks are fused. In addition, the paper discusses its implementation in the multi-sensor and multi-target environment. The effectiveness and the superiority of STF are verified experimentally.

Li Xu, Peijun Ma, Xiaohong Su

The Bystander Effect: Agent-Based Simulation of People’s Reaction to Norm Violation

The bystander effect is a well-known phenomenon in criminology, stating that bystanders tend to inhibit people’s tendency to intervene in situations where norms are violated. This paper presents an agent-based simulation model of this phenomenon. The simulation model presented demonstrates the decision process of an agent for norm violation situations with different characteristics, such as high versus low personal implications. The model has been tested by performing a number of case studies. The outcomes of these case studies show that the model is able to represent the behaviour of bystanders as expected based on various experimental studies.

Charlotte Gerritsen

Multi Agent Carbon Trading Incorporating Human Traits and Game Theory

Carbon trading schemes are being established around the world as an instrument for reducing global GHG emissions. As this is an emerging market, only a few simple simulation studies related to carbon trading have been reported. In this paper, we propose a novel carbon trading simulator capable of modeling traits of human traders in carbon markets. The model is driven by the concept of Nash equilibrium within an agent-based modeling paradigm. The model is capable of implementing crucial issues such as carbon emissions, Marginal Abatement Cost Curve (MAC), and complex human trading behaviour. The experiments carried out provide insights into the interaction between traders’ behaviour and how the interaction affects profitability.

Long Tang, Madhu Chetty, Suryani Lim

Fast and Incremental Neural Associative Memory Based Approach for Adaptive Open-Loop Structural Control in High-Rise Buildings

A novel neural associative memory-based structural control method, coined as AMOLCO, is proposed in this study. AMOLCO is an open-loop control system that autonomously and incrementally learns to suppress the structural vibration caused by dynamic loads such as wind excitations and earthquakes to stabilize high-rise buildings. First, AMOLCO incrementally learns the associative pair of input excitation from either winds or earthquakes and the corresponding output control response generated by standard optimal control only under a single simple condition (i.e., low wind conditions). After learning for a short period of time, i.e., 15 min, AMOLCO becomes capable of efficiently suppressing more intense structural vibrations such as those caused by very strong winds or even earthquakes. In this study, evaluation of the AMOLCO method is performed by using the physical simulation data. The results show that the control signal generated by AMOLCO is similar to that generated by the state-of-the-art control system used in a building. In addition, the resulting control signal is tested on a realistic simulation to affirm that the signal can control the structures. These results show that for the first time, AMOLCO offers another approach of structural control, which is inexpensive and stable similar to a standard open-loop system and also adaptive against disturbances and dynamic changes similar to a closed-loop system.

Aram Kawewong, Yuji Koike, Osamu Hasegawa, Fumio Sato

Emergence of Leadership in Evolving Robot Colony

Growing interest in robot colonies has led to initial experimental applications in biology, sociology, and synecology. In particular, some researchers have tried to study robot colonies using evolutionary computation. In this paper, we present an evolutionary robot colony model and analyze its behavior for leadership characteristics in a group of robots. Each robot has its own social position: leader, follower, or stranger. Leaders are responsible for the existence of their group, while followers choose their behavior by following their leaders’. Strangers behave independently without a leader or a follower. Transition between social positions is controlled by simple rules and probability, and behaviors change adaptively to the environment using evolutionary computation. Simulation has been conducted with the 2-D robot simulator Enki for EPuck mobile robots. Through experiments, we have found that a more centralized structure, with a few leaders and a safety behavior policy, emerges in the evolutionary robot colony when it faces a difficult condition.

Seung-Hyun Lee, Si-Hyuk Yi, Sung-Bae Cho

Emergence of Purposive and Grounded Communication through Reinforcement Learning

Communication is not just the manipulation of words: an agent must decide what to communicate given the surrounding situation, and must understand the communicated signals in terms of how to reflect them in its actions. In this paper, aiming at the emergence of purposive and grounded communication, communication is seamlessly involved in an entire process consisting of one neural network, and only reinforcement learning, with no special learning for communication, is used to train it. A real robot control task was carried out in which a transmitter agent generates two sounds from 1,785 camera image signals of the robot field, and a receiver agent controls the robot according to the received sounds. After learning, appropriate communication was established to lead the robot to the goal. It was found that, for the learning, the experience of controlling the robot by the transmitter is useful, and the correlation between the communication signals and robot motion is important.

Katsunari Shibata, Kazuki Sasahara

An Action Selection Method Based on Estimation of Other’s Intention in Time-Varying Multi-agent Environments

An action selection method based on the estimation of other’s intention is proposed to deal with time-varying multi-agent environments. Firstly, the estimation level of other’s intention is stratified into active, passive and thoughtful levels. Secondly, the three estimation levels are formulated by a policy estimation method. Thirdly, a new action selection method that switches between the three estimation levels is proposed to cope with time-varying environments. Fourthly, the estimation methods of other’s intention are applied to the Q-learning method. Finally, through computer simulations using pursuit problems, the performance of the estimation methods is investigated. As a result, it is shown that the proposed method can select the appropriate estimation level in time-varying environments.

Kunikazu Kobayashi, Ryu Kanehira, Takashi Kuremoto, Masanao Obayashi
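
The abstract above applies its intention estimation methods to Q-learning. As a reminder of the underlying update only, here is a minimal tabular Q-learning sketch. The state and action names, the learning rate `alpha`, and the discount factor `gamma` are illustrative assumptions and are not taken from the paper; the intention estimation levels themselves are not modeled.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Apply one standard tabular Q-learning update and return the new value.

    q maps (state, action) pairs to estimated returns.
    alpha and gamma are assumed values chosen for illustration.
    """
    # Greedy estimate of the value of the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    # Move the current estimate a fraction alpha towards the target.
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical two-action example: all values start at zero.
q_table = defaultdict(float)
moves = ["left", "right"]
new_value = q_update(q_table, "s0", "right", 1.0, "s1", moves)
print(new_value)
```

With all values initialized to zero, a single update with reward 1.0 moves the estimate by `alpha` times the reward.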

Describing Human Identity Using Attributes

Smart surveillance of wide areas requires a system of multiple cameras to keep tracking people by their identities. In such multi-view systems, the captured body figures and appearances of humans, as well as the orientation and the backgrounds, usually differ from camera to camera, which poses challenges for a view-invariant representation of humans for correct identification. In order to tackle this problem, we introduce an attribute-based description of human identity in this paper. Firstly, two groups of attributes responsible for figure and appearance are obtained respectively. Then, Predict-Taken and Predict-Not-Taken schemes are defined to overcome the attribute-loss problem caused by the different views of multiple cameras, and the attribute representation of a human is obtained as a result. Thirdly, human identification based on a voter-candidate scheme is carried out, taking into account humans outside of the training data. Experimental results show that our method is robust to view changes, attribute loss and different backgrounds.

Zhuoli Zhou, Jiajun Bu, Dacheng Tao, Luming Zhang, Mingli Song, Chun Chen

Visual Information of Endpoint Position Is Not Required for Prism Adaptation of Shooting Task

Humans can easily adapt to a visually distorted environment: We can make correct movements after a few dozen actions with visual guidance in the new environment. However, it is unclear what visual information our brain uses for this visuo-motor adaptation. To answer this question, we conducted a behavioral experiment on prism adaptation in a ball shooting task, manipulating the visual information of the ball. We found that prism adaptation occurred when the position of ball impact (or endpoint) was not visually presented. A similar result was replicated in a modified experimental setup where the vision of the body was completely eliminated. These results imply that the error information at the time of hit/impact (i.e., the displacement between the target and the hit position) is not required for prism adaptation. This suggests that the visual information of on-the-fly ball trajectory can be utilized for prism adaptation.

Takumi Ishikawa, Yutaka Sakaguchi

Q-Learning with Double Progressive Widening: Application to Robotics

Discretization of state and action spaces is a critical issue in Q-Learning. In our contribution, we propose a real-time adaptation of the discretization by the progressive widening technique which has been already used in bandit-based methods. Results are consistently converging to the optimum of the problem, without changing the parametrization for each new problem.

Nataliya Sokolovska, Olivier Teytaud, Mario Milone
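
To make the progressive widening idea mentioned above concrete, here is a minimal sketch of one common form of the rule from bandit-based methods, in which the number of options allowed at a node grows as a power of its visit count. The constants `c` and `alpha`, the function names, and the reading of "options" as discretization cells are assumptions made for illustration; the abstract does not specify the exact rule the authors use.

```python
import math


def widening_limit(visits, c=1.0, alpha=0.5):
    """Number of options allowed after `visits` visits: ceil(c * visits**alpha).

    c and alpha are assumed tuning constants; at least one option is
    always allowed so that an unvisited node can be explored.
    """
    return max(1, math.ceil(c * visits ** alpha))


def should_add_cell(num_cells, visits, c=1.0, alpha=0.5):
    """Return True if a new discretization cell may be added now."""
    return num_cells < widening_limit(visits, c, alpha)


# The allowed number of cells grows slowly with experience.
for n in (1, 4, 10, 100):
    print(n, widening_limit(n))
```

In a "double" variant, the same kind of rule would be applied twice, once to the action side and once to the state side, so that both discretizations are refined only where enough samples have been collected.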

Natural Language Processing and Intelligent Web Information Processing

User Identification for Instant Messages

In this paper we study how to recognize a user’s identity based on instant messages. Considering the special characteristics of chat text, we mainly focus on three problems: the first is how to extract the features of chat text, the second is how the user’s model is affected by the size of the training data, and the third is which classification model fits this problem. The chat corpus used in this paper is collected from a Chinese IM tool, and different feature selection methods and classification models are evaluated on it.

Yuxin Ding, Xuejun Meng, Guangren Chai, Yan Tang

Using Hybrid Kernel Method for Question Classification in CQA

A new question classification approach is presented for questions in CQA (Community Question and answering Systems). In CQA, most of the questions are non-factoid questions and can hardly be classified according to their answer types as factoid questions can. A coarse-grained category scheme is introduced and a multi-label classification method is used for question classification. That is, a question can belong to several categories instead of a specific one, and the classification result is a category set. A two-step strategy is used for multi-label question classification. In the first step, a series of binary classifiers, one for each question category, are applied separately. In the second step, the results of those classifiers are combined and a set of question categories is given as the classification result. A hybrid kernel model, which combines a tree kernel and a polynomial kernel, is used for each binary classifier. A data set with 22000 questions is built, of which 20000 are used as training data and the other 2000 as test data. The experimental result shows that the hybrid model is effective. A question paraphrase recognition experiment is carried out to verify the effectiveness of multi-label classification. The experimental results show that multi-label classification is better than single-label classification for questions in CQA.

Shixi Fan, Xiaolong Wang, Xuan Wang, Xiaohong Yang

Towards Understanding Spoken Tunisian Dialect

This paper presents a method for semantic interpretation designed for Tunisian dialect. Our method is based on lexical semantics to overcome the lack of resources for the studied dialect. This method is Ontology-based which allows exploiting the ontological concepts for semantic annotation and ontological relations for interpretation. This combination reduces inaccuracies and increases the rate of comprehension. This paper also details the process of building the Ontology used for annotation and interpretation of Tunisian dialect utterances in the context of speech understanding in dialogue systems.

Marwa Graja, Maher Jaoua, Lamia Hadrich Belguith

Topic Modeling of Chinese Language Using Character-Word Relations

Topic models are hierarchical Bayesian models for language modeling and document analysis. They have been widely used and have achieved a lot of success in modeling English documents. However, unlike English and the majority of alphabetic languages, the basic structural unit of the Chinese language is the character instead of the word, and Chinese words are written without spaces between them. Most previous research on using topic models for Chinese documents did not take the Chinese character-word relationship into consideration and simply took the Chinese word as the basic term of documents. In this paper, we propose a novel model that incorporates the character-word relation into topic modeling by placing an asymmetric prior on the topic-word distribution of the standard Latent Dirichlet Allocation (LDA) model. Compared to LDA, this model can improve performance in document classification, especially when the test data contains a considerable number of Chinese words that did not appear in the training data.

Qi Zhao, Zengchang Qin, Tao Wan

Enrichment and Reductionism: Two Approaches for Web Query Classification

Classifying web queries into predefined target categories, also known as web query classification, is important to improve search relevance and online advertising. Web queries are however typically short, ambiguous and in constant flux. Moreover, target categories often lack standard taxonomies and precise semantic descriptions. These challenges make the web query classification task a non-trivial problem. In this paper, we present two complementary approaches for the web query classification task. The first is the enrichment method, which uses the World Wide Web (WWW) to enrich target categories and further models web query classification as a search problem. Our second approach, the reductionist approach, works by reducing web queries to a few central tokens. We evaluate the two approaches on a few thousand human-labeled local and non-local web queries. From our study, we find the two approaches to be complementary to each other, as the reductionist approach exhibits high precision but low recall, whereas the enrichment method exhibits high recall but low precision.

Ritesh Agrawal, Xiaofeng Yu, Irwin King, Remi Zajac
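
The trade-off described above rests on the standard definitions of precision and recall. The following sketch computes both for a set of predicted labels against a set of correct labels. The query identifiers and the two example outputs are invented for illustration and are not results from the paper.

```python
def precision_recall(predicted, relevant):
    """Return (precision, recall) for two collections of item identifiers.

    precision = correct predictions / all predictions
    recall    = correct predictions / all relevant items
    Empty denominators are reported as 0.0 by convention here.
    """
    predicted, relevant = set(predicted), set(relevant)
    true_positives = len(predicted & relevant)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical ground truth: four queries belong to the target category.
relevant_queries = {"q1", "q2", "q3", "q4"}

# A cautious classifier labels few queries, but labels them correctly.
cautious = precision_recall({"q1"}, relevant_queries)

# A liberal classifier finds every relevant query, plus four wrong ones.
liberal = precision_recall(
    {"q1", "q2", "q3", "q4", "q5", "q6", "q7", "q8"}, relevant_queries
)
print(cautious, liberal)
```

The cautious output has high precision and low recall, while the liberal output shows the opposite pattern, which is the kind of complementarity the abstract reports for its two approaches.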

Dynamic Template Based Online Event Detection

In this paper, we propose a new dynamic template based event detection algorithm (DTED). A candidate template of an event is first constructed from a set of texts or their surrogates. Each candidate template contains several terms automatically extracted by the term weighting algorithm proposed in this paper. Then, we classify each text into a candidate event through a new similarity function. Some insignificant candidate templates are deleted. Whether an event template represents a newly occurring event or not is determined by comparing it with the event templates constructed in the previous time window. Some events are merged into existing events and their templates are updated again. To evaluate the proposed DTED algorithm, we construct two datasets for the experiments, and the F-measure is used as the performance metric. The experimental results show that DTED outperforms the single-pass algorithm and the clustering algorithms implemented in the Cluto toolkit; meanwhile, experimental results on the Linguistic Data Consortium (LDC) dataset TDT4 show that DTED achieves promising results.

Dandan Wang, Qingcai Chen, Xiaolong Wang, Jiacai Weng

Effect of Dimensionality Reduction on Different Distance Measures in Document Clustering

In document clustering, semantically similar documents are grouped together. The dimensionality of document collections is often very large, thousands or tens of thousands of terms. Thus, it is common to reduce the original dimensionality before clustering for computational reasons. Cosine distance is widely seen as the best choice for measuring the distances between documents in k-means clustering. In this paper, we experiment with three dimensionality reduction methods and a selection of distance measures, and show that after dimensionality reduction into small target dimensionalities, such as 10 or below, the superiority of the cosine measure does not hold anymore. Also, for small dimensionalities, the PCA dimensionality reduction method performs better than SVD. We also show how normalization affects different distance measures. The experiments are run on three document sets in English and one in Hindi.

Mari-Sanna Paukkeri, Ilkka Kivimäki, Santosh Tirunagari, Erkki Oja, Timo Honkela
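
For readers unfamiliar with the distance measures compared above, the following sketch contrasts cosine distance with Euclidean distance on toy term-count vectors. The vectors are invented for illustration, and the functions are plain textbook definitions rather than the experimental setup of the paper.

```python
import math


def cosine_distance(u, v):
    """1 minus the cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def euclidean_distance(u, v):
    """Straight-line distance between two vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Hypothetical term counts: doc_b is doc_a repeated twice, doc_c is unrelated.
doc_a = [1.0, 0.0]
doc_b = [2.0, 0.0]
doc_c = [0.0, 1.0]

print(cosine_distance(doc_a, doc_b), euclidean_distance(doc_a, doc_b))
print(cosine_distance(doc_a, doc_c), euclidean_distance(doc_a, doc_c))
```

Cosine distance treats `doc_a` and `doc_b` as identical because it ignores vector length, whereas Euclidean distance separates them. This sensitivity to length is one reason normalization can change how distance measures compare.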

Diversifying Question Recommendations in Community-Based Question Answering

Question retrieval is an important research topic in community-based question answering (QA). Conventionally, questions semantically equivalent to the query question are ranked at the top. However, traditional question retrieval techniques have difficulty processing users’ information needs that are implicitly embedded in the question. This paper proposes a novel method of question recommendation that considers the user’s diverse information needs. By estimating information need compactness in the question retrieval results, we further identify the retrieval results that need to be diversified. For these results, the scores of the information retrieval model, the importance and novelty of both question types, and the informational aspects of question content are combined to produce diverse question recommendations. Comparative experiments on a large-scale real community-based QA dataset show that the proposed method effectively improves information need coverage and diversity through relevant question recommendation.

Yaoyun Zhang, Xiaolong Wang, Xuan Wang, Ruifeng Xu, Buzhou Tang

Neural Encoding and Decoding

Classification of Multi-spike Trains and Its Application in Detecting Task Relevant Neural Cliques

Prediction of an animal’s behavior and detection of task relevant neural cliques using multi-spike trains are of great importance and pose great challenges. We propose a robust and highly accurate approach to classify multi-spike trains based on a point process model and Bayesian rules. To detect task relevant neural cliques, a graph is constructed with its edge weights indicating the collaboration degree of neurons’ trial-to-trial response to tasks. Then a minimum graph cut algorithm is introduced to detect neural cliques. Tested on data synchronously recorded in the hippocampus during five sets of mouse U maze experiments (about 500 trials), the prediction accuracy is rather high and the statistical significance of the cliques is demonstrated.

Fanxing Hu, Bao-Ming Li, Hui Wei

Dreaming Your Fear Away: A Computational Model for Fear Extinction Learning during Dreaming

In this paper a computational model is presented that models how dreaming is used to learn fear extinction. The approach addresses dreaming as internal simulation incorporating memory elements in the form of sensory representations and their associated fear. During dream episodes regulation of fear takes place, which is strengthened by Hebbian learning. The model was evaluated by a number of simulation experiments for different scenarios.

Jan Treur

Simple Models for Synaptic Information Integration

Neural information processing is extremely complicated. A core challenge in theoretical neuroscience is to develop properly simplified models, which, on one hand, capture the fundamental features of the complex systems, and on the other hand, allow us to pursue analytic treatments. In the present study, we aim to develop simple models for synaptic information integration. We use simple current-based models to approximate the dynamics of conductance-based multi-compartment ones. The nonlinear shunting inhibition is expressed as a product between the contributions of excitatory and inhibitory currents, and its strength depends on the spatial configuration of excitatory and inhibitory inputs, agreeing with the experimental data. We expect that the current study will serve as a building brick for analyzing the dynamics of large-size networks.

Danke Zhang, Yuwei Cui, Yuanqing Li, Si Wu

On Rationality of Decision Models Incorporating Emotion-Related Valuing and Hebbian Learning

In this paper an adaptive decision model based on predictive loops through feeling states is analysed from the perspective of rationality. Four different variations of Hebbian learning are considered for different types of connections in the decision model. To assess the extent of rationality, a measure is introduced reflecting the environment’s behaviour. Simulation results and the extents of rationality of the different models over time are presented and analysed.

Jan Treur, Muhammad Umair

Evolving Probabilistic Spiking Neural Networks for Spatio-temporal Pattern Recognition: A Preliminary Study on Moving Object Recognition

This paper proposes a novel architecture for continuous spatio-temporal data modeling and pattern recognition utilizing evolving probabilistic spiking neural network ‘reservoirs’ (epSNNr). The paper demonstrates on simple experimental data for moving object recognition that: (1) The epSNNr approach is more accurate and flexible than using a standard SNN; (2) The use of probabilistic neuronal models is superior in several aspects when compared with the traditional deterministic SNN models, including a better performance on noisy data.

Nikola Kasabov, Kshitij Dhoble, Nuttapod Nuntalid, Ammar Mohemmed

Nonlinear Effect on Phase Response Curve of Neuron Model

One of the more useful tools for better understanding population dynamics is the phase response curve (PRC). Recent physiological experiments on the PRCs using real neurons showed that different shapes of the PRCs are generated depending on the perturbation, which has a finite amplitude. In order to clarify the origin of the nonlinear response of the PRCs, we analytically derived the PRCs from single neurons by using a spike response model. We clarified the relation between the subthreshold membrane response property and the PRC. Furthermore, we performed numerical simulations using the Hodgkin-Huxley model and their results have shown that a nonlinear change of the PRCs is generated. Our theory and numerical results imply that the nonlinear change of PRCs is due to the nonlinear element in spike time shift of firing neurons induced by the finite amplitude of the perturbation stimuli.

Munenori Iida, Toshiaki Omori, Toru Aonishi, Masato Okada

Modulations of Electric Organ Discharge and Representation of the Modulations on Electroreceptors

Weakly electric fish can recognize an object’s parameters, such as material, size, distance and shape, in complete darkness. The ability to recognize these parameters is provided by the electrosensory system of the fish. The fish generates an electric field using its electric organ (EOD: electric organ discharge). An object around the fish distorts the self-generated EOD and produces an EOD modulation on the fish’s body surface. The EOD modulation is converted into firings of electroreceptor afferents on the fish’s body surface. The fish can extract the object’s parameters from the firings. In the present study, we investigated features of the EOD modulations that include information about an object’s shape. To this end, using computer simulation, we calculated the EOD modulations generated by objects of various shapes and the firing patterns of electroreceptors evoked by the electric images. We found that the shape of an object near the fish was represented by the maximum firing rate of the receptor network. However, the difference in the maximum firing rate between various objects was small when the distance of the object from the fish was more than about 3-4 cm. This result suggests that the detection limit of the fish for an object’s shape would be about 3-4 cm and that the limit would be smaller than that of other sensory systems.

Kazuhisa Fujita

Spiking Neural PID Controllers

A PID controller is a simple and general-purpose way of providing responsive control of dynamic systems with reduced overshoot and oscillation. Spiking neural networks offer some advantages for dynamic systems control, including an ability to adapt, but it is not obvious how to alter such a control network’s parameters to shape its response curve. In this paper we present a spiking neural PID controller: a small network of neurons that mimics a PID controller by using the membrane recovery variable in Izhikevich’s simple model of spiking neurons to approximate derivative and integral functions.

Andrew Webb, Sergio Davies, David Lester
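
As background for the abstract above, the following is a minimal sketch of a conventional discrete-time PID controller, i.e. the behaviour the spiking network is said to mimic. It is not the spiking implementation from the paper: the class name, the gains, and the time step are illustrative assumptions, and the derivative and integral are computed arithmetically rather than with Izhikevich neurons.

```python
class PID:
    """Textbook discrete PID controller (assumed form, for illustration)."""

    def __init__(self, kp, ki, kd):
        self.kp = kp  # proportional gain
        self.ki = ki  # integral gain
        self.kd = kd  # derivative gain
        self.integral = 0.0
        self.prev_error = None

    def step(self, error, dt):
        """Return the control output for the current error and time step dt."""
        # Integral term: accumulated error over time.
        self.integral += error * dt
        # Derivative term: rate of change of the error (zero on the first call).
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


# Hypothetical gains and error sequence.
controller = PID(kp=2.0, ki=0.5, kd=0.1)
first = controller.step(error=1.0, dt=1.0)
second = controller.step(error=0.5, dt=1.0)
print(first, second)
```

The integral term removes steady-state error and the derivative term damps overshoot, which are the two functions the abstract says are approximated with the membrane recovery variable.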

Neural Network Models

Analysis on Wang’s kWTA with Stochastic Output Nodes

Recently, a Dual Neural Network-based kWTA has been proposed, in which the output nodes are defined as a Heaviside step activation function. In this paper, we extend this model by considering that the output nodes are stochastic. Precisely, we define this stochastic behavior by the logistic function. It is shown that the DNN-based kWTA with stochastic output nodes is able to converge and the convergence rates of this network are three folds. Finally, the energy function governing the dynamical behavior of the network is unveiled.

John Pui-Fai Sum, Chi-Sing Leung, Kevin Ho
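
The difference between a Heaviside output node and a stochastic output node governed by the logistic function can be sketched as follows. This illustrates only a single node's output rule, not the network dynamics or the convergence analysis of the paper. The gain parameter `beta` and the function names are assumptions introduced here.

```python
import math
import random


def heaviside(x):
    """Deterministic step output: 1 for non-negative input, else 0."""
    return 1 if x >= 0 else 0


def logistic(x, beta=1.0):
    """Logistic function; beta is an assumed gain (steepness) parameter."""
    return 1.0 / (1.0 + math.exp(-beta * x))


def stochastic_output(x, beta=1.0, rng=random):
    """Output 1 with probability logistic(x, beta), otherwise 0."""
    return 1 if rng.random() < logistic(x, beta) else 0


# Near zero input the stochastic node is uncertain; far from zero it
# behaves almost like the Heaviside node.
rng = random.Random(0)
samples = [stochastic_output(0.0, rng=rng) for _ in range(1000)]
print(sum(samples) / len(samples))
print(heaviside(5.0), stochastic_output(50.0, rng=rng))
```

As `beta` grows, the logistic curve sharpens towards the step function, so the deterministic node can be viewed as a limiting case of the stochastic one. For very large negative inputs `math.exp` can overflow, so a production version would need a numerically stable formulation.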

Regularizer for Co-existing of Open Weight Fault and Multiplicative Weight Noise

This paper first derives the training objective function of faulty radial basis function (RBF) networks, in which open weight fault and multiplicative weight noise co-exist. A regularizer is then identified from the objective function. Finally, the corresponding learning algorithm is developed. Compared to the conventional approach, our approach has a better fault tolerant ability. We then develop a faulty mean prediction error (FMPE) formula to estimate the generalization ability of faulty RBF networks. The FMPE formula helps us to understand the generalization ability of faulty networks without using a test set or generating a number of potential faulty networks. We then demonstrate how to use our FMPE formula to optimize the RBF width for the co-existing fault situation.

Chi-Sing Leung, John Pui-Fai Sum

Research on a RBF Neural Network in Stereo Matching

Current stereo matching algorithms have many shortcomings. For example, they have low robustness and are easily influenced by the environment, especially the intensity of the light and the number of occluded areas; they also often perform poorly in practice, because they have difficulty dealing with the matching problem when the disparity range is unknown and have high complexity when using global optimization. In order to solve these problems, we design a new stereo matching algorithm called RBFSM, which mainly uses the RBF neural network (RBFNN). RBFSM obtains the correspondence between the input layer nodes and hidden layer nodes by the Gaussian function and then uses the weight matrix between the hidden layer and output layer to calculate the disparity of the input pixels. We analyze this new RBF neural network matching algorithm through extensive experiments, and the results show that the new algorithm not only overcomes the shortcomings of the traditional methods, such as low robustness and poor practical performance, but also significantly improves the matching precision with low complexity.

Sheng Xu, Ning Ye, Fa Zhu, Shanshan Xu, Liuliu Zhou

An Evolutionary Algorithm Based Optimization of Neural Ensemble Classifiers

Ensemble classifiers are very useful tools and can be applied in many real world applications for classifying unseen data patterns into one of the known or unknown classes. However, ensemble classifiers face many problems, such as finding the appropriate number of layers, clusters or even base classifiers that can produce the best diversity and accuracy. There has been very little research conducted in this area, and there is a lack of an automatic method to find these parameters. This paper presents an evolutionary algorithm based approach to identify the optimal number of layers and clusters in hierarchical neural ensemble classifiers. The proposed approach has been evaluated on UCI machine learning benchmark datasets. A comparative analysis of results using the proposed approach and recently published approaches in the literature is presented in this paper.

Chien-Yuan Chiu, Brijesh Verma

Stability Criterion of Discrete-Time Recurrent Neural Networks with Periodic Delays

This paper deals with the stability criterion of discrete-time recurrent neural networks with periodic delays. The network is written as a discrete-time multi-switched linear system (DMSLS). By applying parameter- and time-dependent Lyapunov functions, we obtain several new sufficient conditions for the asymptotic stability of these systems.

Xing Yin, Weigen Wu, Qianrong Tan

Improved Global Robust Stability Criteria for Delayed BAM Neural Networks

This paper is concerned with uniqueness and global robust stability for the equilibrium point of the interval bidirectional associative memory (BAM) delayed neural networks. By employing linear matrix inequality and Lyapunov functional, a new criterion is proposed for the global robust stability of BAM neural networks. An example is given to show the effectiveness of the present results.

Xiaolin Li, Ming Liu

High Order Hopfield Network with Self-feedback to Solve Crossbar Switch Problem

A high order network has a higher storage capacity and a faster convergence speed than a first order network. To improve the convergence speed of the energy function, in this paper a new kind of high order discrete neural network with self-feedback is proposed to solve the crossbar switch problem. The construction method of the high order energy function for this problem is presented and the neural computing method is given. We also discuss the strategies for the network to escape from local minima. Experimental results show that the high order network with self-feedback converges faster and performs better than the first order Hopfield network.

Yuxin Ding, Li Dong, Bin Zhao, Zhanjun Lu

Use of a Sparse Structure to Improve Learning Performance of Recurrent Neural Networks

The objective of our study is to find out how a sparse structure affects the performance of a recurrent neural network (RNN). Only a few existing studies have dealt with the sparse structure of RNNs with learning methods such as Back Propagation Through Time (BPTT). In this paper, we propose an RNN with sparse connection and BPTT called Multiple time scale RNN (MTRNN). Then, we investigate how sparse connection affects generalization performance and noise robustness. In the experiments using data composed of alphabetic sequences, the MTRNN showed the best generalization performance when the connection rate was 40%. We also measured the sparseness of neural activity and found that it corresponds to generalization performance. These results mean that sparse connection improved learning performance and that the sparseness of neural activity could be used as a metric of generalization performance.

Hiromitsu Awano, Shun Nishide, Hiroaki Arie, Jun Tani, Toru Takahashi, Hiroshi G. Okuno, Tetsuya Ogata

Recall Time Reduction of a Morphological Associative Memory Employing a Reverse Recall

The morphological associative memory (MAM) is an associative memory model proposed by Ritter. The model has the advantages of large memory capacity and a high perfect recall rate in comparison with other associative memory models. Unfortunately, the conventional MAM cannot recall the correct pattern for a pattern that is completely included in other stored patterns. To overcome this problem, we previously proposed a MAM employing a reverse recall. However, this model needs additional calculations for the reverse recall, and the extra recall time increases as the number of included patterns increases. In this paper, as one solution, we propose a MAM employing a simplified reverse recall. The extra recall time of the proposed model can be reduced by simplifying the calculation of the reverse recall for binary patterns. We confirm the validity of the proposed method by evaluating the recall time in hetero-association experiments.

Hidetaka Harada, Tsutomu Miki
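
For readers unfamiliar with the conventional MAM that the abstract above builds on, the sketch below shows the standard min-memory formulation in its auto-associative form: the memory stores the minimum of pairwise differences, and recall uses a max-plus product. The function names and toy patterns are assumptions for illustration, and the reverse recall proposed by the authors is not reproduced here.

```python
def mam_train(patterns):
    """Build the min-memory W with w[i][j] = min over stored patterns of (x[i] - x[j])."""
    n = len(patterns[0])
    return [[min(p[i] - p[j] for p in patterns) for j in range(n)] for i in range(n)]


def mam_recall(w, x):
    """Max-plus product: y[i] = max over j of (w[i][j] + x[j])."""
    return [max(w_ij + x_j for w_ij, x_j in zip(row, x)) for row in w]


# Toy binary patterns (hypothetical).
stored = [[1, 0, 0, 1], [0, 1, 1, 0], [1, 1, 0, 0]]
memory = mam_train(stored)
recalled = [mam_recall(memory, p) for p in stored]
```

With undistorted inputs, every stored pattern is recalled exactly in this auto-associative case, because each diagonal entry of the memory is zero and no other term can exceed the stored value.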

Analyzing the Dynamics of Emotional Scene Sequence Using Recurrent Neuro-Fuzzy Network

In this paper, we propose a framework to analyze the temporal dynamics of emotional stimuli. For this framework, both the EEG signal and visual information are of great importance. The fusion of visual information with brain signals allows us to capture the users' emotional state. Thus we adopt the previously proposed fuzzy-GIST as an emotional feature to summarize the emotional feedback. In order to model the dynamics of the emotional stimuli sequence, we develop a recurrent neuro-fuzzy (RNF) network for modeling the dynamic events of emotional dimensions including valence and arousal. It can incorporate human expertise through IF-THEN fuzzy rules, while recurrent connections allow the network's fuzzy rules to see the network's own previous output. The results show that such a framework can interact with human subjects and generate arbitrary emotional sequences after learning the dynamics of an emotional sequence from a sufficient number of samples.

Qing Zhang, Minho Lee

Stress Classification for Gender Bias in Reading

The paper investigates classification of stress in reading for males and females based on an artificial neural network (ANN) model. An experiment was conducted, with stressful and non-stressful reading material as stimuli, to obtain galvanic skin response (GSR) signals, a good indicator of stress. GSR signals formed the input of the ANN, with stressed and non-stressed states as the two output classes. Results show that stress in reading for males is significantly different from that for females (p < 0.01), with males showing different patterns in GSR signals from females.

Nandita Sharma, Tom Gedeon

Self-Adjusting Feature Maps Network

In this paper, we propose a novel artificial neural network, called self-adjusting feature map (SAM), and its unsupervised learning algorithm with a self-adjusting mechanism. After training the SAM network, we obtain a map composed of a set of representative connected neurons. The trained network structure of representative connected neurons not only displays the spatial relation of the input data distribution but also quantizes the data well. SAM can automatically isolate sets of connected neurons, and the number of such sets may indicate the number of clusters to be used. The idea of the self-adjusting mechanism is based on combining mathematical statistics with neurological advance and retreat of waste. Each representative neuron goes through three periods in its training process: growth, adaptation and decline. In the growth period, the network of representative neurons first creates the necessary neurons according to the local density of the input data. In the adaptation period, it constantly adjusts the connected/disconnected topology of neighboring neuron pairs according to the statistics of the input feature data. Lastly, in the decline period, the unnecessary neurons of the network are merged or deleted. In this study, we exploit SAM to handle some peculiar cases that cannot be dealt with well by classical unsupervised learning networks such as the self-organizing feature map (SOM) network. Furthermore, we also use several real world cases to exhibit the remarkable characteristics of SAM.

Chin-Teng Lin, Dong-Lin Li, Jyh-Yeong Chang

Neuromorphic Hardware and Implementations

Statistical Nonparametric Bivariate Isotonic Regression by Look-Up-Table-Based Neural Networks

Bivariate regression allows inferring a model underlying two data-sets. We consider the case of regression from possibly incomplete data sets, namely the case that data in the two sets do not necessarily correspond in size and might come unmatched/unpaired. The paper proposes to tackle the problem of bivariate regression through a non-parametric neural-learning method that is able to match the statistics of the available data sets. The devised neural algorithm is based on a look-up-table representation of the involved functions. A numerical experiment, performed on a real-world data set, serves to illustrate the features of the proposed statistical regression procedure.

Simone Fiori

Recovery of Sparse Signal from an Analog Network Model

This paper presents an analog neural network model to recover sparse signals. In the original constrained optimization task for recovering sparse signals, the objective function is not differentiable. Hence, we recast the original nonlinear programming problem as a linear programming problem with linear inequality constraints and equality constraints. However, the second order gradient of the objective function is not convex at an equilibrium point. To solve this problem, we further modify the objective function such that the second order gradient is convex at the equilibrium point. This paper presents two sets of network dynamics. One is for the standard recovery of sparse signals. Another one is for the noisy situation.

Chi-Sing Leung, John Pui-Fai Sum, Ping-Man Lam, A. G. Constantinides
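
The abstract above mentions recasting a non-differentiable sparse recovery objective as a linear program with inequality and equality constraints. One standard way to do this for an L1 objective is sketched below. The symbols (unknown signal x, measurement matrix Φ, measurements y, auxiliary bounds u) are assumed notation, and the paper's exact formulation may differ.

```latex
% Assumed notation: x \in \mathbb{R}^n is the unknown sparse signal,
% \Phi \in \mathbb{R}^{m \times n} the measurement matrix, y \in \mathbb{R}^m the measurements,
% and u \in \mathbb{R}^n an auxiliary variable bounding |x| elementwise.
\begin{align*}
\min_{x} \ \|x\|_1 \quad &\text{s.t.} \quad \Phi x = y \\
\Longleftrightarrow \quad \min_{x,\,u} \ \sum_{i=1}^{n} u_i \quad &\text{s.t.} \quad \Phi x = y, \quad -u_i \le x_i \le u_i, \quad i = 1, \dots, n
\end{align*}
```

At the optimum each u_i equals |x_i|, so the linear objective reproduces the L1 norm while every constraint stays linear.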

A VLSI Spiking Neural Network with Symmetric STDP and Associative Memory Operation

This paper proposes an analog CMOS VLSI circuit which implements integrate-and-fire spiking neural networks with spike-timing dependent synaptic plasticity (STDP). The designed VLSI chip includes 25 neurons and 600 synapse circuits with symmetric all-to-all connection STDP. Using the fabricated VLSI chip, we implement a Hopfield-type feedback network and demonstrate its associative memory operation. In our chip, analog information is represented by the relative timing of spike firing events. Symmetric STDP provides an auto-correlation learning function that depends on the relative timing between the spikes constituting a learning pattern. Each learning and test pattern consists of 20 spike pulses, each of which has a relative delay corresponding to a gray-scale pixel intensity. Given an input pattern, the chip successfully recalled the most similar learning pattern.

Frank L. Maldonado Huayaney, Hideki Tanaka, Takayuki Matsuo, Takashi Morie, Kazuyuki Aihara

Method of Solving Combinatorial Optimization Problems with Stochastic Effects

A network with higher order connections is useful for solving combinatorial optimization problems. However, its topology is so complicated that hardware implementation is not easy. To implement the higher order connections more simply, we introduce the stochastic logic architecture into the discrete hysteresis network with higher order connections. The proposed network can solve Traveling Salesman Problems as the conventional network does.

Takahiro Sota, Yoshihiro Hayakawa, Shigeo Sato, Koji Nakajima

Dynamic Response Behaviors of a Generalized Asynchronous Digital Spiking Neuron Model

A generalized asynchronous digital spiking neuron model that can be implemented by an asynchronous sequential logic circuit is presented. The presented model is the most generalized version of asynchronous sequential logic circuit based neurons, in which the sensitivity of the vector field to a stimulation input is generalized. It is clarified that the generalization enables the model to exhibit various nonlinear response characteristics, which are classified into four groups. In addition, it is clarified that the generalization enables the model to exhibit typical dynamic response behaviors having prominent features observed in biological and model neurons.

Takashi Matsubara, Hiroyuki Torikai

Generalized PWC Analog Spiking Neuron Model and Reproduction of Fundamental Neurocomputational Properties

An artificial spiking neuron model which has a generalized piece-wise constant (abbr. PWC) vector field and state-dependent reset is proposed. Advantages of the PWC vector field include simplicity of hardware implementation, ease of parameter tuning, and suitability for theoretical analysis based on theories of discontinuous ordinary differential equations (abbr. ODEs). Using the analysis techniques of discontinuous ODEs, it is shown that the model can reproduce 6 types of typical neuron-like responses (neurocomputational properties), the occurrence mechanisms of which have qualitative similarities to those of Izhikevich's simple neuron model.

Yutaro Yamashita, Hiroyuki Torikai

Implementation of Visual Attention System Using Artificial Retina Chip and Bottom-Up Saliency Map Model

This paper proposes a new hardware system for visual selective attention, in which a neuromorphic silicon retina chip is used as an input camera and a bottom-up saliency map model is implemented by a Field-Programmable Gate Array (FPGA) device. The proposed system mimics the roles of retina cells, V1 cells, and parts of the lateral inferior parietal lobe (LIP), namely edge extraction, orientation, and selective attention response, respectively. The center-surround difference and normalization that mimic the on-center and off-surround function in the lateral geniculate nucleus (LGN) are implemented by the FPGA. The artificial retina chip integrated with the FPGA successfully produces a human-like visual attention function with small computational overhead. In order to apply this system to mobile robotic vision, the proposed system aims at low power dissipation and compactness. The experimental results show that the proposed system successfully generates saliency information from natural scenes.

Bumhwi Kim, Hirotsugu Okuno, Tetsuya Yagi, Minho Lee
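
The center-surround difference and normalization steps named in the abstract above can be illustrated on a one-dimensional signal. The sketch below is a simplified software analogue under stated assumptions: the window `radius`, the absolute-difference response, and the min-max normalization are illustrative choices, not the FPGA design described in the paper.

```python
def center_surround(signal, radius):
    """Absolute difference between each sample and the mean of its neighbours."""
    n = len(signal)
    response = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        neighbours = [signal[j] for j in range(lo, hi) if j != i]
        surround = sum(neighbours) / len(neighbours)
        response.append(abs(signal[i] - surround))
    return response


def normalize(values):
    """Rescale values to the range [0, 1]; a flat input maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Toy signal (hypothetical): one bright sample on a dark background.
saliency = normalize(center_surround([0, 0, 0, 9, 0, 0, 0], radius=1))
```

The isolated bright sample receives the largest normalized response, which is the behaviour a bottom-up saliency map relies on. The sketch assumes a signal of at least two samples and a radius of at least one.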

Event-Driven Simulation of Arbitrary Spiking Neural Networks on SpiNNaker

Programming supercomputers correctly and optimally is non-trivial, which presents a problem for scientists simulating large areas of the brain. Researchers face the challenges of learning how to fully exploit hardware whilst avoiding the numerous pitfalls of parallel programming such as race conditions, deadlock and poor scaling. The SpiNNaker architecture is designed to exploit up to a million processors in modelling as many as one billion neurons in real-time. We present a programming interface for the architecture to allow modelling of arbitrary neuron and synapse dynamics using standard sequential C code, without concern for parallel-programming techniques or interprocessor communication mechanisms. An example is presented in which SpiNNaker is programmed to model multiple synaptic dynamics that are exchanged on the fly and the results of the different synaptic efficacies are shown.

Thomas Sharp, Luis A. Plana, Francesco Galluppi, Steve Furber

Object Recognition

Geometry vs. Appearance for Discriminating between Posed and Spontaneous Emotions

Spontaneous facial expressions differ from posed ones in appearance, timing and accompanying head movements. Still images cannot provide timing or head movement information directly. However, the distances between key points on a face, extracted from a still image using active shape models, can indirectly capture some movement and pose changes. This information is superposed on information about non-rigid facial movement that is also part of the expression. Does geometric information improve the discrimination between spontaneous and posed facial expressions arising from discrete emotions? We investigate the performance of a machine vision system that discriminates between posed and spontaneous versions of six basic emotions using SIFT appearance based features and FAP geometric features. Experimental results on the NVIE database demonstrate that fusion of geometric information leads only to marginal improvement over appearance features. Using fusion features, surprise is the easiest emotion to distinguish (83.4% accuracy), while disgust is the most difficult (76.1%). Our results show that the important facial regions differ between discriminating posed versus spontaneous versions of one emotion and classifying the same emotion versus other emotions. The distribution of the selected SIFT features shows that the mouth is more important for sadness, while the nose is more important for surprise; both the nose and mouth are important for disgust, fear, and happiness. Eyebrows, eyes, nose and mouth are important for anger.

Ligang Zhang, Dian Tjondronegoro, Vinod Chandran

Towards Learning Inverse Kinematics with a Neural Network Based Tracking Controller

Learning an inverse kinematic model of a robot is a well studied subject. However, achieving this without information about the geometric characteristics of the robot is less investigated. In this work, a novel control approach is presented based on a recurrent neural network. Without any prior knowledge about the robot, this control strategy learns to control the iCub’s robot arm online by solving the inverse kinematic problem in its control region. Because of its exploration strategy the robot starts to learn by generating and observing random motor behavior. The modulation and generalization capabilities of this approach are investigated as well.

Tim Waegeman, Benjamin Schrauwen

Enhanced Codebook Model for Real-Time Background Subtraction

The CodeBook is one of the popular real-time background models for moving object detection in video. However, for some complex scenes, it does not achieve satisfactory results due to the lack of an automatic parameter estimation mechanism. In this paper, we present an improved CodeBook model, which is robust to sudden illumination changes and quasi-periodic motions. The major contributions of the paper are a robust statistical parameter estimation method, a controlled adaptation procedure, a simple but effective technique to suppress shadows, and a novel block based approach to utilize local spatial information. The proposed model was tested on numerous complex scenes, and the results show a significant performance improvement over the standard model.

Munir Shah, Jeremiah Deng, Brendon Woodford

Color Image Segmentation Based on Blocks Clustering and Region Growing

In order to overcome the discontinuity in clustering segmentation, a novel color image segmentation algorithm is proposed, which is based on seeds clustering and can locate the seeds of regions quickly. Firstly, the image is divided into a series of non-overlapping blocks with the size of [...] pixels in HSI color space. For each block, the centroid pixel of the salient homogeneous region is selected as a feature point of the block. Secondly, based on the principles of color similarity, the centroids are clustered to obtain the clustered centroids as seeds for region growing. Finally, invalid and noisy regions are merged to get the complete segmentation results. The experimental results demonstrate that the proposed method can accurately segment regions and objects, and that it outperforms other segmentation algorithms in terms of human visual perception.

Haifeng Sima, Lixiong Liu, Ping Guo

Speed Up Spatial Pyramid Matching Using Sparse Coding with Affinity Propagation Algorithm

Recently, support vector machines (SVMs) combined with the spatial pyramid matching (SPM) kernel have been highly successful in image annotation. The linear spatial pyramid matching using sparse coding (ScSPM) scheme was proposed to enhance the performance of SPM in both time and annotation accuracy. However, both of these algorithms suffer from an expansibility problem, and ScSPM needs quite a long time for codebook construction. In this paper, we propose an adjusted framework for the ScSPM algorithm, which applies a multi-level affinity propagation (AP) algorithm to the codebook construction process (AP-ScSPM). This novel approach can remarkably reduce the time complexity of the codebook construction process. Furthermore, as the AP algorithm can automatically determine the number of representative vectors, the expansibility of the algorithm is improved. Through a series of experiments, we find that the proposed framework greatly reduces the time of the codebook construction process and matches ScSPM in annotation accuracy.

Rukun Hu, Ping Guo

Airport Detection in Remote Sensing Images Based on Visual Attention

This paper proposes an airport detection and recognition method for remote sensing images based on a visual attention mechanism. Traditional methods analyze remote sensing images pixel by pixel. To address this disadvantage, we introduce visual attention models into airport detection and greatly improve the efficiency of automatic target detection. Firstly, the Hough transform is used to judge the existence of an airport, and then the improved graph-based visual saliency (GBVS) visual attention model is used to extract regions of candidates (ROCs). The airport areas are then recognized from scale-invariant feature transform (SIFT) features that are extracted from the ROCs and classified by an HDR tree. Experimental results show that the proposed method has faster speed, a higher recognition rate and a lower false alarm rate than other current methods, and is robust against white noise.

Xin Wang, Bin Wang, Liming Zhang

A Method to Construct Visual Recognition Algorithms on the Basis of Neural Activity Data

Visual recognition by animals significantly outperforms man-made algorithms. The brain's intelligent choice of visual features is considered to underlie this performance gap. In order to attain better performance for man-made algorithms, we suggest using in these algorithms the visual features that are used in the brain. To this end, we propose to obtain visual features correlated with brain activity by applying a kernel canonical correlation analysis (KCCA) method to pairs of image data and neural data recorded from the brain of an animal exposed to the images. It is expected that only the visual features that are highly correlated with the neural activity provide useful information for visual recognition. Applied to hand-written digits as image data and to activity data of a multi-layer neural network model as a model for a brain, the method successfully extracted the visual features used in the neural network model. Indeed, the use of these visual features in a support vector machine (SVM) made it possible to discriminate the hand-written digits. Since this discrimination required utilizing the knowledge possessed by the neural network model, a simple application of the usual SVM without these features could not discriminate them. We further demonstrate that even the use of non-digit hand-written characters for the KCCA extracts visual features which enable the SVM to discriminate the hand-written digits. This indicates the versatile applicability of our method.

Hiroki Kurashige, Hideyuki Câteau

Adaptive Colour Calibration for Object Tracking under Spatially-Varying Illumination Environments

In the context of a Fuzzy-Genetic system, auto-calibration of colour classifiers under spatially varying illumination conditions, to produce near perfect object recognition accuracy, requires a balancing act for the fitness function. One general approach would be to maximise the true positives while minimising the false positives. This has been found effective in the presence of a large amount of noise. However, experiments show that this fitness function needs improvement for cases where there are target colours with similar hues. In this paper, we present an extension to our fuzzy-genetic colour contrast fusion algorithm, now utilising a fitness function that detects clusters of false positives and limits the search space for finding the properties of the colour classifier. We tested the performance of the auto-calibrated colour classifiers by subjecting them to object recognition tasks in the robot soccer domain, under varying illumination conditions, until we found their limits. It was observed that the accuracy of the object recognition began to degrade, on average, at illumination settings that are either about three times brighter (starting from 797.4 lux) or two times darker (less than 138 lux) than what it was trained for (average of 285.47 lux). Otherwise, near perfect recognition accuracy is achieved.

Heesang Shin, Napoleon H. Reyes, Andre L. Barczak
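
The general fitness approach mentioned in the abstract above, maximising true positives while minimising false positives, can be written as a single score over a binary pixel classification. The sketch below is one hypothetical form (true-positive rate minus false-positive rate); the function name and the scoring formula are assumptions, and the authors' extended fitness function with false-positive cluster detection is not reproduced.

```python
def tp_fp_fitness(predicted, target):
    """Score a binary classification as true-positive rate minus false-positive rate."""
    tp = sum(1 for p, t in zip(predicted, target) if p and t)
    fp = sum(1 for p, t in zip(predicted, target) if p and not t)
    positives = sum(1 for t in target if t)
    negatives = len(target) - positives
    tpr = tp / positives if positives else 0.0
    fpr = fp / negatives if negatives else 0.0
    return tpr - fpr


# Toy pixel labels (hypothetical): 1 marks a pixel of the target colour.
target = [1, 0, 1, 0]
perfect_score = tp_fp_fitness([1, 0, 1, 0], target)
mixed_score = tp_fp_fitness([1, 1, 0, 0], target)
```

A score of this kind treats every false positive equally, wherever it falls in the image.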

Analog-Digital Circuit for Motion Detection Based on Vertebrate Retina and Its Application to Mobile Robot

In this study, we propose simple analog-digital circuits for detecting motion direction based on the information processing of the vertebrate retina. An array of the circuits was applied to a mobile robot. The test circuit was built from discrete metal oxide semiconductor (MOS) transistors on a breadboard. The measured results of the test circuit showed that the unit circuit can output the motion signal. The motion sensor for detecting the movement direction, constructed from an array of the unit circuits, was connected to the microcomputer installed in the mobile robot. It was clarified that the proposed circuits can control the mobile robot.

Kimihiro Nishio, Taiki Yasuda

Spatial Finite Non-gaussian Mixture for Color Image Segmentation

A new color image segmentation algorithm based on the integration of spatial information into finite generalized Dirichlet mixture models is presented. The spatial information is integrated by considering the neighborhoods of image pixels. The segmentation model presented is learned using maximum likelihood estimation within an expectation maximization (EM) optimization framework. The results obtained on real images, evaluated quantitatively, are very encouraging and are better than those obtained using similar approaches.

Ali Sefidpour, Nizar Bouguila

A Motion Detection Model Inspired by Hippocampal Function and Its FPGA Implementation

We propose a motion detection model inspired by hippocampal function and its FPGA implementation. The proposed model detects the motion of edges extracted from monocular image sequences. The motion is detected on segmented 2D maps without image matching, which allows the model to operate faster than the video rate. We introduce gating units into our original CA3-CA1 model to improve the detection rate, where CA3 and CA1 are the names of hippocampal regions. We have evaluated the performance of our model by using artificial and real image sequences. The results show that the proposed model can achieve a high detection rate. We have implemented the model on an FPGA, with which we can achieve motion detection within 1.0 msec/frame with power dissipation of about 1.4 W when 64 × 60 segmented blocks are used for 320 × 240 pixel images.

Haichao Liang, Takashi Morie
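
The figures quoted in the abstract above can be turned into a few derived quantities. The short calculation below assumes that the 64 × 60 blocks tile the 320 × 240 image evenly, with 64 blocks across the width and 60 down the height; the variable names are illustrative.

```python
# Figures quoted in the abstract.
IMAGE_W, IMAGE_H = 320, 240      # image size in pixels
BLOCKS_X, BLOCKS_Y = 64, 60      # segmented blocks (assumed columns x rows)
FRAME_TIME_S = 1.0e-3            # upper bound: 1.0 msec per frame
POWER_W = 1.4                    # approximate power dissipation in watts

block_w = IMAGE_W // BLOCKS_X    # pixels per block, horizontally
block_h = IMAGE_H // BLOCKS_Y    # pixels per block, vertically
n_blocks = BLOCKS_X * BLOCKS_Y   # total number of blocks per frame
min_frame_rate = round(1.0 / FRAME_TIME_S)        # frames per second, lower bound
max_energy_per_frame_j = POWER_W * FRAME_TIME_S   # joules per frame, upper bound

print(block_w, block_h, n_blocks, min_frame_rate, max_energy_per_frame_j)
```

Under that assumption each block covers 5 × 4 pixels, there are 3840 blocks per frame, the processing rate is at least 1000 frames per second, and the energy per frame is at most about 1.4 mJ.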

An Automated System for the Analysis of the Status of Road Safety Using Neural Networks

This paper presents a neural network based novel automated system that can analyze vehicle mounted video data for improving road safety. There are video data collection systems currently available although no tools exist which could be used to automatically analyze vehicle mounted video data and estimate future crash sites. The main aim of the research presented in this paper is to develop a technique to segment roadside data obtained from vehicle mounted video into regions of interest, classify roadside objects and estimate the risk factor based on roadside conditions and objects for various crashes. A clustering technique for segmentation of roadside frames into regions of interest and a neural network to classify the regions of interest into objects are investigated. The preliminary segmentation and classification results on a small dataset taken from Transport and Main Roads’ vehicle mounted video data collection are promising.

Brijesh Verma, David Stockwell

Decision Tree Based Recognition of Bangla Text from Outdoor Scene Images

This article proposes a scheme for automatic recognition of Bangla text extracted from outdoor scene images. For extraction, we obtain the headline, then apply certain conditions to distinguish between text and non-text. By removing the headline we partition the text into two zones. We further observe an association among the text symbols in these two different zones. For recognition, we design a decision tree classifier with a Multilayer Perceptron (MLP) at the leaf nodes. The root node takes into account all possible text symbols. Further nodes highlight distinguishable features and act as two-class classifiers. Finally, at the leaf nodes, a few text symbols remain, which are recognized using MLP classifiers. The association between the two zones makes recognition simpler and more efficient. The classifiers are trained using about 7100 samples of 52 classes. Experiments are performed on 250 images (200 scene images and 50 scanned images).

Ranjit Ghoshal, Anandarup Roy, Tapan Kumar Bhowmik, Swapan K. Parui

Learning Global and Local Features for License Plate Detection

This paper proposes an intelligent system that is capable of automatically detecting license plates in static images captured by a digital still camera. A supervised learning approach is used to extract features from license plates, and global and local features are organized into a cascaded structure. In general, our framework can be divided into two stages. The first stage extracts global correlation features, from which a posterior probability can be estimated to quickly determine the degree of resemblance between the evaluated image region and a license plate. The second stage further extracts local dense-SIFT (dSIFT) features for an AdaBoost supervised learning approach, and the selected dSIFT features are used to construct a strong classifier. Using dSIFT as a type of highly distinctive local feature, our algorithm gives a high detection rate under various complex conditions. The proposed framework is compared with existing works and promising results are obtained.

Sheng Wang, Wenjing Jia, Qiang Wu, Xiangjian He, Jie Yang

Intelligent Video Surveillance System Using Dynamic Saliency Map and Boosted Gaussian Mixture Model

In this paper, we propose an intelligent video camera system for traffic surveillance, which can detect moving objects on the road, recognize the types of objects, and track their trajectories. A dynamic saliency map based object detection model is proposed to robustly detect a moving object under changing light conditions. A Gaussian mixture model (GMM) integrated with an AdaBoost algorithm is proposed for classifying the detected objects into vehicles, pedestrians and background. The GMM uses C1-like features of the HMAX model as input features, which are robust to image translation and scaling. A local appearance model is also proposed for object tracking. Experimental results demonstrate the excellent performance of the proposed system.

Wono Lee, Giyoung Lee, Sang-Woo Ban, Ilkyun Jung, Minho Lee

Contour-Based Large Scale Image Retrieval

The paper presents a contour-based method for large scale image retrieval. With the contour saliency map of the object, it can address the shift-invariance problem, and with hierarchical and multi-scale feature extraction, it is able to deal with the scale-invariance problem to a certain extent. Unlike in existing algorithms, the features used in the retrieval system contain not only local information but also global information about the object. By taking advantage of this characteristic, we can build a hierarchical index structure which enables fast retrieval from a large scale database. Furthermore, our method allows two kinds of query image: a hand-drawn sketch or a natural image. Thus it is possible to refine the search results by choosing one image from the list of previous sketch retrieval results as the new query. This provides a better interactive user experience and is convenient for those who are not good at drawing. The experimental results verify the performance of our method on a database of four million images.

Rong Zhou, Liqing Zhang

Visual Perception Modelling

Three Dimensional Surface Temperature Measurement System

In this paper, a three-dimensional surface shape measurement system with temperature information is introduced. The measurement is established using a three-dimensional surface measurement system and a thermography device. The measurement system is composed of a CCD camera, a laser and a thermography device. The laser is projected onto the object, and the laser streak image that appears on the surface of the object is observed by the CCD camera and the thermography device. The streak image recorded by the CCD camera is used to reconstruct the object shape on a computer, and the corresponding temperature data obtained by the thermography device are allocated to the reconstructed surfaces of the object. The obtained data can be used for a quantitative analysis of heat radiation that considers the area and the roughness of the heat source object. Experimental results show the feasibility of our system.

Tao Li, Kikuhito Kawasue, Satoshi Nagatomo

A Markov Random Field Model for Image Segmentation Based on Gestalt Laws

This paper proposes a Markov Random Field model for image segmentation based on statistical characteristics of contours. Unlike previous approaches, we use the Gestalt Laws of Perceptual Organization as natural constraints for segmentation by integrating contour orientations into segmentation labels. The basic framework of our model consists of three modules: foreground/background separation, attentive selection, and information integration. The model can be used for both automatic and semiautomatic image segmentation. Our algorithm achieves smooth segmentation boundaries and outperforms other popular algorithms.

Yuan Ren, Huixuan Tang, Hui Wei

Weber’s Law Based Center-Surround Hypothesis for Bottom-Up Saliency Detection

Inspired by Weber’s Law and the biological model of the synergistic center-surround receptive field, this paper proposes a center-surround hypothesis for saliency detection. Specifically, the detector defines two types of salient stimuli. One is the local stimulus, represented as a set of differential excitations of gradient orientation for each pixel. The other is the global stimulus, which is the relative intensity difference of the center region against the overall mean. A center-surround model with a ring topology is then designed to extract salient responses to these two types of stimuli. For a given color image, the salient responses are computed on each color channel separately and then combined linearly to obtain the final saliency map. Comparison experiments demonstrate that this detector not only generates high-quality saliency maps with the same resolution as the input image, but also responds more strongly in activation regions and inhibits other regions more effectively.

Lili Lin, Wenhui Zhou, Hua Zhang
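
The global stimulus described in the abstract above (the relative intensity difference of a center region against the overall mean) can be illustrated with a small sketch. This is only a reading of that one sentence, not the authors' detector: the function name `global_stimulus`, the square region shape, and the toy image are all assumptions made for illustration.

```python
def global_stimulus(image, top, left, size):
    """Weber-style contrast of a square region against the whole image.

    image: list of rows of positive intensities (one color channel).
    Returns (region_mean - overall_mean) / overall_mean.
    Assumes the overall mean is non-zero.
    """
    pixels = [p for row in image for p in row]
    overall = sum(pixels) / len(pixels)
    region = [image[r][c]
              for r in range(top, top + size)
              for c in range(left, left + size)]
    center = sum(region) / len(region)
    return (center - overall) / overall


# Hypothetical 4x4 channel: a bright 2x2 patch on a dark background.
toy = [
    [10, 10, 10, 10],
    [10, 50, 50, 10],
    [10, 50, 50, 10],
    [10, 10, 10, 10],
]
bright = global_stimulus(toy, 1, 1, 2)   # patch is brighter than average
corner = global_stimulus(toy, 0, 0, 1)   # corner is darker than average
```

A positive value marks a region brighter than the image as a whole and a negative value a darker one; the local stimulus and the ring-shaped center-surround model from the abstract are not covered by this sketch.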

Multi-scale Image Analysis Based on Non-Classical Receptive Field Mechanism

In the real world, the biological visual system is more efficient than machine vision systems at analyzing visual information. Physiological theories attribute this efficiency to the multi-layer neural network of the human visual system, in which every layer accomplishes a different task and is connected to the other layers. The low-level stages of the human visual system, especially the retina, can provide scale information to the high-level stages through the non-classical receptive field (nCRF) mechanism. Because the ganglion cell (GC) can adjust the size of its nCRF automatically, this mechanism makes multi-scale image analysis possible. The results, which reflect the distribution of the image information, can be shared by several algorithms or processes that solve different visual tasks, such as contour detection and image segmentation. This paper proposes a GC-based model of multi-scale image analysis that retains the key information and reduces redundant information for the later stages of the visual system. Experimental results on N-cut and contour detection show that this multi-scale image analysis model provides a distinct improvement for these image processing tasks.

Hui Wei, Qingsong Zuo, Bo Lang

Visual Constructed Representations for Object Recognition and Detection

We propose a neurally inspired model of parallel visual processing for recognition and detection. The model is based on the explicit construction of Gabor feature representations. An input image is decomposed into features of different scales by a low-pass filter. By recycling and overlapping these scale features again, the most likely object stored in memory can be detected in the input image. This is done by finding correspondences between scale features. Simultaneously, the Gabor feature representations stored in memory are constructed by selecting the scale features most similar to the input. We also test the recognition ability of our model using a number of facial images of different persons. Distortion-invariant recognition is also demonstrated.

Yasuomi D. Sato, Yasutaka Kuriya

Multiview Range Image Registration Using Competitive Associative Net and Leave-One-Image-Out Cross-Validation Error

This paper presents a method for multiview range image registration that fuses 3D surfaces in range images taken from around an object by a laser range finder (LRF). The method uses a competitive associative net (CAN2) to learn a piecewise linear approximation of the surfaces in the noisy LRF range images, and then performs pairwise registration of consecutive range images approximated by piecewise planes. To reduce the error propagated by consecutive pairwise registration, the method introduces leave-one-image-out cross-validation (LOOCV) and tries to minimize the LOOCV registration error. The effectiveness is shown using real LRF range images of several objects.

Shuichi Kurogi, Tomokazu Nagi, Shoichi Yoshinaga, Hideaki Koya, Takeshi Nishida

Multi-view Pedestrian Recognition Using Shared Dictionary Learning with Group Sparsity

Pedestrian tracking across multiple cameras is an important task in intelligent visual surveillance systems, but it suffers from large appearance variations of the same person under different cameras. Inspired by the success of the existing view transformation model in multi-view gait recognition, we present a novel view-transformation-model-based approach, named shared dictionary learning with group sparsity, to address the problem. It projects the pedestrian appearance feature descriptor in the probe view into the gallery view before the feature descriptors are matched. In this case, ℓ1,∞ regularization over the latent embedding ensures a lower reconstruction error and more stable generation of feature descriptors compared with the existing Singular Value Decomposition. Although the overall optimization function is not globally convex, Nesterov's optimal gradient scheme ensures efficiency and reliability. Experiments on the VIPeR dataset show that our approach reaches state-of-the-art performance.

Shuai Zheng, Bo Xie, Kaiqi Huang, Dacheng Tao

A Feature Selection Approach for Emulating the Structure of Mental Representations

In order to develop artificial agents operating in complex ever-changing environments, advanced technical memory systems are required. At this juncture, two central questions are which information needs to be stored and how it is represented. On the other hand, cognitive psychology provides methods to measure the structure of mental representations in humans. But the nature and the characteristics of the underlying representations are largely unknown. We propose to use feature selection methods to determine adequate technical features for approximating the structure of mental representations found in humans. Although this approach does not allow for drawing conclusions transferable to humans, it constitutes an excellent basis for creating technical equivalents of mental representations.

Marko Tscherepanow, Marco Kortkamp, Sina Kühnel, Jonathan Helbach, Christoph Schütz, Thomas Schack

Super Resolution of Text Image by Pruning Outlier

We propose a learning-based super-resolution (SR) algorithm for single-frame text images. Selecting example candidates by distance alone cannot avoid outliers, and irrelevant outliers disturb the super-resolution result. In this work, the unique constraints of text images are used to reject the outliers in the learning-based SR algorithm. The final image is obtained by a Markov random field network with k-nearest-neighbor candidates from an image database that contains pairs of corresponding low-resolution and high-resolution text image patches. We demonstrate our algorithm on simulated and real scanned documents with promising results.

Ziye Yan, Yao Lu, JianWu Li

Integrating Local Features into Discriminative Graphlets for Scene Classification

Scene classification plays an important role in multimedia information retrieval. Since local features are robust to image transformation, they have been used extensively for scene classification. However, it is difficult to encode the spatial relations of local features in the classification process. To solve this problem, Geometric Local Features Integration (GLFI) is proposed. By segmenting a scene image into a set of regions, a so-called Region Adjacency Graph (RAG) is constructed to model their spatial relations. To measure the similarity of two RAGs, we select a few discriminative templates and then use them to extract the corresponding discriminative graphlets (connected subgraphs of an RAG). These discriminative graphlets are further integrated by a boosting strategy for scene classification. Experiments on five datasets validate the effectiveness of our GLFI.

Luming Zhang, Wei Bian, Mingli Song, Dacheng Tao, Xiao Liu

Opponent and Feedback: Visual Attention Captured

Visual attention has been an important topic in computer vision for decades, and many approaches, mainly based on bottom-up or top-down computing models, have been put forward to address it. In this paper, we propose a new and effective saliency model that considers the inner opponent relationships in the image information. Inspired by the opponent and feedback mechanisms in human perceptual learning, we first propose several opponent models based on an analysis of the original color image information. Second, since both positive and negative feedback can be learned from the opponent models, we construct the saliency map from the optimal combination of these feedbacks using constrained least-squares regression. Experimental results indicate that our model achieves better performance in both simple and complex natural scenes.

Senlin Wang, Mingli Song, Dacheng Tao, Luming Zhang, Jiajun Bu, Chun Chen

Depth from Defocus via Discriminative Metric Learning

In this paper, we propose a discriminative learning-based method for recovering the depth of a scene from multiple defocused images. The proposed method consists of a discriminative learning phase and a depth estimation phase. In the discriminative learning phase, we formalize depth from defocus (DFD) as a multi-class classification problem which can be solved by learning the discriminative metrics from the synthetic training set by minimizing a criterion function. To enhance the discriminative and generalization performance of the learned metrics, the criterion takes both within-class and between-class variations into account, and incorporates margin constraints. In the depth estimation phase, for each pixel, we compute the discriminative functions and determine the depth level according to the minimum discriminant value. Experimental results on synthetic and real images show the effectiveness of our method in providing a reliable estimation of the depth of a scene.

Qiufeng Wu, Kuanquan Wang, Wangmeng Zuo, Yanjun Chen

Analysis of the Proton Mediated Feedback Signals in the Outer Plexiform Layer of Goldfish Retina

The center-surround antagonistic receptive field in the retina is generated by negative feedback from horizontal cells (HCs) via a proton feedback mechanism [1]. In this study, the contribution of protons to color opponent signal formation is analyzed. Increasing the buffer capacity of the external medium with 10 mM HEPES depolarized the dark membrane potential of HCs and substantially increased hyperpolarizing responses to light stimulation. In contrast, feedback-mediated depolarizing responses of H2 and H3 HCs were suppressed by HEPES. Moreover, the onset of the depolarizing response of H2 and H3 HCs was significantly delayed compared to the hyperpolarizing responses. These results indicate that protons play an important role in the color opponent signal formation of HCs, and that the feedback from H1 to H2 HCs is delayed by 10 to 20 ms. A similar delay might apply to other feedback pathways as well.

Nilton Liuji Kamiji, Masahiro Yamada, Kazunori Yamamoto, Hajime Hirasawa, Makoto Kurokawa, Shiro Usui

Modeling Manifold Ways of Scene Perception

In this paper, under the efficient coding theory, we propose a computational model to explore the dimensionality of scene perception. The model is constructed hierarchically according to the information pathway of the visual cortex: by pooling together the activity of local low-level feature detectors across large regions of the visual field, we build the population feature representation as a statistical summary of the input image. Then, a large number of population feature representations of scene images are embedded, in an unsupervised manner, into a low-dimensional space called the perceptual manifold. Further analysis of the perceptual manifold reveals two topographic properties: 1) scene images that are perceptually similar lie close together in the manifold space, and 2) the dimensions of the space describe continuous perceptual changes in the spatial layout of scenes, representing the degree of naturalness, openness, etc. Moreover, a scene classification task is implemented to validate the topographic properties of the perceptual manifold space.

Mengyuan Zhu, Bolei Zhou

Advances in Computational Intelligence Methods Based Pattern Recognition

Utilization of a Virtual Patient Model to Enable Tailored Therapy for Depressed Patients

Major depression is a prominent mental disorder that has a significant impact on the patient suffering from it as well as on society as a whole. Currently, therapies are offered via the Internet in the form of self-help modules, and they have been shown to be as effective as face-to-face counseling. To take automated therapies a step further, models that describe the development of the internal states associated with depression can be of great help in giving dedicated advice and feedback to the patient, e.g. by making predictions using the model. In this paper, an existing computational model for states related to depression (e.g. mood) is taken as a basis, in combination with models that express the influence of various therapies on these states. These models are used to give dedicated feedback to the patient, tailor the parameters to the observed patient behavior, and give appropriate advice regarding the therapy to be followed.

Fiemke Both, Mark Hoogendoorn

Learning Based Visibility Measuring with Images

Visibility is one of the major items of meteorological observation. Its accuracy is very important for air, sea, and highway transport. This paper introduces a method of visibility calculation based on image analysis and learning. First, a visibility image is represented by contrast-based vision features. Then, a Support Vector Regression (SVR) based learning system is constructed between the image features and the target visibility. Consequently, visibility can be measured directly from a single input image with this learning system. The method makes use of existing video cameras to calculate visibility in real time. Experiments show that the method is low-cost, fast, and convenient. Moreover, our proposed technology can be used anywhere to measure visibility.

Xu-Cheng Yin, Tian-Tian He, Hong-Wei Hao, Xi Xu, Xiao-Zhong Cao, Qing Li

Polynomial Time Algorithm for Learning Globally Optimal Dynamic Bayesian Network

This paper is concerned with the problem of learning the globally optimal structure of a dynamic Bayesian network (DBN). We propose using a recently introduced information-theoretic criterion named MIT (Mutual Information Test) for evaluating the goodness-of-fit of the DBN structure. MIT has previously been shown to be effective for learning static Bayesian networks, yielding results competitive with other popular scoring metrics, such as BIC/MDL, K2 and BD, and with the well-known constraint-based PC algorithm. This paper adapts MIT to the case of DBNs. Using a modified variant of MIT, we show that learning the globally optimal DBN structure can be achieved efficiently in polynomial time.

Nguyen Xuan Vinh, Madhu Chetty, Ross Coppel, Pramod P. Wangikar
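
As a rough illustration of the quantity that a mutual-information-based score for a DBN builds on, the sketch below estimates the empirical mutual information between a candidate parent at time t-1 and a child at time t. It is not the MIT criterion from the abstract above: any penalty or significance-test component is omitted, and the function names and toy series are assumptions.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) of two discrete sequences."""
    n = len(xs)
    px, py = Counter(xs), Counter(ys)
    pxy = Counter(zip(xs, ys))
    # Sum over observed pairs of p(x, y) * log(p(x, y) / (p(x) * p(y))).
    return sum((c / n) * log(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())


def lagged_mi(parent, child):
    """Mutual information between parent[t-1] and child[t]."""
    return mutual_information(parent[:-1], child[1:])


# Hypothetical binary time series: `follower` copies `driver` one step later.
driver = [0, 1, 0, 1, 0, 1, 0, 1, 0]
follower = [0] + driver[:-1]
score = lagged_mi(driver, follower)
```

In this toy case the lagged dependence is deterministic, so the score equals the entropy of the parent series (log 2 nats); a structure-learning procedure would compare such scores across candidate parent sets.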

A Hybrid FMM-CART Model for Fault Detection and Diagnosis of Induction Motors

A new approach to detecting and classifying fault conditions of induction motors using a hybrid of the Fuzzy Min-Max (FMM) neural network and the Classification and Regression Tree (CART) is proposed. The hybrid model, known as FMM-CART, exploits the advantages of both FMM and CART for data classification and rule extraction problems. A series of experiments is conducted using real measurements of motor currents from healthy and faulty induction motors. FMM-CART is able to detect and classify the associated induction motor faults with good accuracy rates. Useful rules in the form of a decision tree are also elicited from FMM-CART to analyze and understand different fault conditions of induction motors.

Manjeevan Seera, CheePeng Lim, Dahaman Ishak

A Multimodal Information Collector for Content-Based Image Retrieval System

Explicit relevance feedback requires the user to explicitly refine the search queries for content-based image retrieval. This may become laborious or even impossible due to the ever-increasing volume of digital databases. We present a multimodal information collector that can unobtrusively record and asynchronously transmit the user’s relevance feedback on a displayed image to the remote CBIR server for assisting in retrieving relevant images. The modalities of user interaction include eye movements, pointer tracks and clicks, keyboard strokes, and audio including speech. The client-side information collector has been implemented as a browser extension using the JavaScript programming language and has been integrated with an existing CBIR server. We verify its functionality by evaluating the performance of the gaze-enhanced CBIR system in on-line image tagging tasks.

He Zhang, Mats Sjöberg, Jorma Laaksonen, Erkki Oja

Graphical Lasso Quadratic Discriminant Function for Character Recognition

The quadratic discriminant function (QDF) derived from the multivariate Gaussian distribution is effective for classification in many pattern recognition tasks. In particular, a variant of QDF, called MQDF, has achieved great success and is widely recognized as the state-of-the-art method in character recognition. However, when the number of training samples is small, the covariance estimation involved in QDF is usually ill-posed, which leads to a loss of classification accuracy. To address this problem, we use the graphical lasso method to estimate the covariance and propose a new classification method called the Graphical Lasso Quadratic Discriminant Function (GLQDF). By exploiting a coordinate descent procedure for the lasso, GLQDF can estimate the covariance matrix (and its inverse) more precisely. Experimental results demonstrate that the proposed method can perform better than competing methods on two artificial and six real data sets (including both benchmark digit and Chinese character data).

Bo Xu, Kaizhu Huang, Irwin King, Cheng-Lin Liu, Jun Sun, Naoi Satoshi
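
For orientation, the two standard ingredients named in the abstract above can be written out as follows. These are the textbook forms of the Gaussian quadratic discriminant function and the graphical lasso objective, not formulas taken from the paper; the symbols (class mean \mu_i, class covariance \Sigma_i, prior P(\omega_i), sample covariance S, precision matrix \Theta, penalty weight \rho) are notation assumed here.

```latex
% Quadratic discriminant function for class i (smaller is better):
g_i(x) = (x - \mu_i)^{\top} \Sigma_i^{-1} (x - \mu_i)
         + \log \lvert \Sigma_i \rvert - 2 \log P(\omega_i)

% Graphical lasso estimate of the precision matrix \Theta = \Sigma^{-1}:
\hat{\Theta} = \arg\max_{\Theta \succ 0}
               \; \log \det \Theta - \operatorname{tr}(S \Theta)
               - \rho \lVert \Theta \rVert_{1}
```

The first expression needs the inverse covariance of each class, which is exactly what becomes unreliable with few training samples; the second estimates that inverse directly under a sparsity penalty, which is presumably why the two are combined in GLQDF.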

Denial-of-Service Attack Detection Based on Multivariate Correlation Analysis

The reliability and availability of network services are being threatened by the growing number of Denial-of-Service (DoS) attacks, and effective mechanisms for DoS attack detection are in demand. We therefore propose a multivariate correlation analysis approach to investigate and extract second-order statistics from the observed network traffic records. These second-order statistics provide important correlative information hidden among the features. By making use of this hidden information, the detection accuracy can be significantly enhanced. The effectiveness of the proposed multivariate correlation analysis approach is evaluated on the KDD CUP 99 dataset. The evaluation shows encouraging results, with an average detection rate of 99.96% and a false positive rate of 2.08%. Comparisons also show that our detection approach based on multivariate correlation analysis outperforms some other current approaches in detecting DoS attacks.

Zhiyuan Tan, Aruna Jamdagni, Xiangjian He, Priyadarsi Nanda, Ren Ping Liu
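
The idea of using second-order statistics of traffic features for detection can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' method: the covariance matrix stands in for "second-order statistics", the Frobenius distance to a normal profile and the threshold are arbitrary choices, and the toy records (packets per second, mean packet size) are invented.

```python
from math import sqrt


def covariance_matrix(records):
    """Sample covariance matrix of equal-length feature vectors."""
    n, d = len(records), len(records[0])
    means = [sum(r[j] for r in records) / n for j in range(d)]
    return [
        [sum((r[i] - means[i]) * (r[j] - means[j]) for r in records) / (n - 1)
         for j in range(d)]
        for i in range(d)
    ]


def frobenius_distance(a, b):
    """Frobenius norm of the difference of two same-sized matrices."""
    return sqrt(sum((x - y) ** 2
                    for row_a, row_b in zip(a, b)
                    for x, y in zip(row_a, row_b)))


def is_anomalous(window, normal_profile, threshold):
    """Flag a window whose covariance deviates from the normal profile."""
    return frobenius_distance(covariance_matrix(window), normal_profile) > threshold


# Hypothetical traffic windows: (packets per second, mean packet size in bytes).
normal = [(10, 500), (12, 520), (11, 510), (9, 490)]
flood = [(10, 500), (400, 60), (900, 64), (50, 480)]
profile = covariance_matrix(normal)
```

Calling `is_anomalous(flood, profile, 1.0)` flags the flood-like window because its feature correlations differ sharply from the normal profile, while the normal window itself is not flagged. In practice the threshold would be learned from normal traffic.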

Deep Belief Networks for Financial Prediction

Financial business prediction has lately attracted great interest due to the recent world crisis events. Although many advanced shallow computational methods have been proposed, most algorithms have not yet attained a desirable level of applicability. All perform well for a given financial setup but generally fail to produce better and more reliable models. The main focus of this paper is to present a deep learning model with a strong ability to generate high-level feature representations for accurate financial prediction. The proposed Deep Belief Network (DBN) approach, tested on a real dataset of French companies, compares favorably to shallow architectures such as Support Vector Machines (SVM) and a single Restricted Boltzmann Machine (RBM). We show that the underlying financial model with deep machine technology has strong potential, thus empowering the finance industry.

Bernardete Ribeiro, Noel Lopes

Uncertainty Measure for Selective Sampling Based on Class Probability Output Networks

This paper presents a novel method of selective sampling using conditional class probabilities estimated from a network referred to as the class probability output network (CPON). For selective sampling, an uncertainty measure is defined using the confidence level for the CPON output. As a result, the proposed uncertainty measure represents how confident the CPON output is. We compared the recognition performance between other sampling methods and the proposed one. The relationship between the uncertainty measure and recognition rate was also investigated.

Ho-Gyeong Kim, Rhee Man Kil, Soo-Young Lee
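
Selective sampling driven by an uncertainty measure can be sketched generically as follows. The abstract above defines its measure through the confidence level of the CPON output, whose details are not given here, so this sketch substitutes the common least-confidence measure (one minus the largest class probability); the function names and the toy probability table are assumptions.

```python
def uncertainty(class_probs):
    """Least-confidence uncertainty: 1 minus the largest class probability."""
    return 1.0 - max(class_probs)


def select_most_uncertain(pool, k):
    """Return the k sample ids whose predicted class is least certain.

    pool: dict mapping sample id -> list of class probabilities.
    """
    ranked = sorted(pool, key=lambda s: uncertainty(pool[s]), reverse=True)
    return ranked[:k]


# Hypothetical class-probability outputs for four unlabeled samples.
pool = {
    "a": [0.95, 0.05],
    "b": [0.55, 0.45],
    "c": [0.70, 0.30],
    "d": [0.51, 0.49],
}
chosen = select_most_uncertain(pool, 2)
```

Here the samples closest to the decision boundary ("d" and "b") are chosen for labeling first; swapping in a different uncertainty function changes only the ranking key.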

