About this book

This book includes the proceedings of the International Conference on Artificial Neural Networks (ICANN 2006), held on September 10-14, 2006 in Athens, Greece, with tutorials presented on September 10, the main conference taking place during September 11-13, and accompanying workshops on perception, cognition and interaction held on September 14, 2006. The ICANN conference is organized annually by the European Neural Network Society in cooperation with the International Neural Network Society, the Japanese Neural Network Society and the IEEE Computational Intelligence Society. It is the premier European event covering all topics concerned with neural networks and related areas. The ICANN series of conferences was initiated in 1991 and soon became the major European gathering for experts in these fields. In 2006 the ICANN conference was organized by the Intelligent Systems Laboratory and the Image, Video and Multimedia Systems Laboratory of the National Technical University of Athens, Greece. From the 475 papers submitted to the conference, the International Program Committee selected, following a thorough peer-review process, 208 papers for publication and presentation in 21 regular and 10 special sessions. The quality of the papers received was in general very high; as a consequence, it was not possible to accept and include in the conference program many papers of good quality.

Table of contents

Frontmatter

Feature Selection and Dimension Reduction for Regression (Special Session)

Dimensionality Reduction Based on ICA for Regression Problems

When manipulating data, for example in supervised learning, we often extract new features from the original ones in order to reduce the dimension of the feature space and achieve better performance. In this paper, we show how standard algorithms for independent component analysis (ICA) can be applied to extract features for regression problems. The advantage is that general ICA algorithms become applicable to feature extraction for regression by maximizing the joint mutual information between the target variable and the new features. Using the new features, we can greatly reduce the dimension of the feature space without degrading the regression performance.

Nojun Kwak, Chunghoon Kim
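
A minimal sketch of this kind of pipeline (illustrative only, not the authors' exact algorithm; component count and data are invented): extract independent components with FastICA and keep those with the highest estimated mutual information with the target.

    import numpy as np
    from sklearn.decomposition import FastICA
    from sklearn.feature_selection import mutual_info_regression

    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 20))                 # original features
    y = X[:, 0] * X[:, 1] + rng.normal(size=500)   # toy regression target

    S = FastICA(n_components=10, random_state=0).fit_transform(X)
    mi = mutual_info_regression(S, y)              # MI of each component with y
    X_reduced = S[:, np.argsort(mi)[::-1][:5]]     # keep the 5 most informative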

A Functional Approach to Variable Selection in Spectrometric Problems

In spectrometric problems, objects are characterized by high-resolution spectra that correspond to hundreds to thousands of variables. In this context, even fast variable selection methods lead to a high computational load. However, spectra are generally smooth and can therefore be accurately approximated by splines. In this paper, we propose to use a B-spline expansion as a pre-processing step before variable selection, in which the original variables are replaced by the coefficients of the B-spline expansions. Using a simple leave-one-out procedure, the optimal number of B-spline coefficients can be found efficiently. As there are generally an order of magnitude fewer coefficients than original spectral variables, selecting optimal coefficients is faster than selecting variables. Moreover, a B-spline coefficient depends only on a limited range of original variables: this preserves the interpretability of the selected variables. We demonstrate the benefits of the proposed method on real-world data.

Fabrice Rossi, Damien François, Vincent Wertz, Michel Verleysen
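
The preprocessing step can be sketched as follows (a rough illustration; knot placement and the leave-one-out selection of the coefficient count are simplified away):

    import numpy as np
    from scipy.interpolate import make_lsq_spline

    def spline_coefficients(spectrum, n_coef=20, k=3):
        # Fit a least-squares cubic B-spline and return its coefficients,
        # which replace the original spectral variables.
        x = np.linspace(0.0, 1.0, len(spectrum))
        interior = np.linspace(0.0, 1.0, n_coef - k + 1)[1:-1]
        t = np.concatenate(([0.0] * (k + 1), interior, [1.0] * (k + 1)))
        return make_lsq_spline(x, spectrum, t, k=k).c

    spectra = np.random.rand(100, 1000)            # 100 spectra, 1000 channels
    Z = np.array([spline_coefficients(s) for s in spectra])   # 100 x 20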

The Bayes-Optimal Feature Extraction Procedure for Pattern Recognition Using Genetic Algorithm

The paper deals with the extraction of features for statistical pattern recognition. The Bayes probability of correct classification is adopted as the extraction criterion. The problem with complete probabilistic information is discussed, and the Bayes-optimal feature extraction procedure is presented in detail. The case of recognition with learning is also considered. A genetic algorithm is proposed as the method of solution for optimal feature extraction. A numerical example demonstrating the capability of the proposed approach to solve the feature extraction problem is presented.

Marek Kurzynski, Edward Puchala, Aleksander Rewak

Speeding Up the Wrapper Feature Subset Selection in Regression by Mutual Information Relevance and Redundancy Analysis

A hybrid filter/wrapper feature subset selection algorithm for regression is proposed. First, features are filtered by means of a relevance and redundancy filter using mutual information between the regression variables and the target variable. We introduce permutation tests to find statistically significant relevant and redundant features. Second, a wrapper searches for good candidate feature subsets by taking the regression model into account. The advantage of a hybrid approach is threefold. First, the filter provides interesting features independently of the regression model and hence allows for an easier interpretation. Second, because the filter part is computationally less expensive, the global algorithm provides good candidate subsets faster than a stand-alone wrapper approach. Finally, the wrapper takes the bias of the regression model into account, because the regression model guides the search for optimal features. Results are shown for the ‘Boston housing’ and ‘orange juice’ benchmarks based on the multilayer perceptron regression model.

Gert Van Dijck, Marc M. Van Hulle
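
One plausible form of the permutation test for relevance (our reading of the abstract, not the authors' exact procedure): compare the observed mutual information against its null distribution under shuffled targets.

    import numpy as np
    from sklearn.feature_selection import mutual_info_regression

    def mi_permutation_test(x, y, n_perm=200, alpha=0.05, seed=0):
        # Is MI(x; y) significantly above what chance alignment produces?
        rng = np.random.default_rng(seed)
        mi_obs = mutual_info_regression(x.reshape(-1, 1), y)[0]
        null = np.array([
            mutual_info_regression(x.reshape(-1, 1), rng.permutation(y))[0]
            for _ in range(n_perm)
        ])
        p_value = (np.sum(null >= mi_obs) + 1) / (n_perm + 1)
        return mi_obs, p_value < alpha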

Effective Input Variable Selection for Function Approximation

Input variable selection is a key preprocessing step in any I/O modelling problem. Normally, better generalization performance is obtained when unneeded parameters coming from irrelevant or redundant variables are eliminated. Information theory provides a robust theoretical framework for performing input variable selection thanks to the concept of mutual information. Nevertheless, for continuous variables it is usually more difficult to determine the mutual information between the input variables and the output variable than it is for classification problems. This paper presents a modified approach to variable selection for continuous variables, adapted from a previous approach for classification problems, making use of a mutual information estimator based on the k-nearest neighbors.

L. J. Herrera, H. Pomares, I. Rojas, M. Verleysen, A. Guillén
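
For reference, scikit-learn's mutual information estimator for continuous targets is itself kNN-based (Kraskov-style), so a crude version of such a relevance ranking, though not the paper's exact selection scheme, can be written as:

    import numpy as np
    from sklearn.feature_selection import mutual_info_regression

    X = np.random.rand(400, 30)
    y = np.sin(2 * np.pi * X[:, 3]) + 0.5 * X[:, 7] + 0.1 * np.random.randn(400)

    mi = mutual_info_regression(X, y, n_neighbors=5)   # kNN-based MI estimate
    ranking = np.argsort(mi)[::-1]    # variables 3 and 7 should rank first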

Comparative Investigation on Dimension Reduction and Regression in Three Layer Feed-Forward Neural Network

The three-layer feed-forward neural network (3-LFFNN) has been widely used for nonlinear regression. It is well known that its hidden layer can be regarded as performing feature extraction and dimension reduction, and that the regression performance relies on the feature dimension, or equivalently the number of hidden units, being determined appropriately. There are many publications on determining the number of hidden units for a desired generalization error; however, few comparative studies have been made of the different approaches proposed, especially of the typical model selection criteria used for this purpose. This paper targets such an aim. Using both simulated data and several real-world data sets, a comparative study is made of the regression performance with the number of hidden units determined by several typical model selection criteria, including Akaike’s Information Criterion (AIC), the consistent Akaike’s Information Criterion (CAIC), Schwarz’s Bayesian Inference Criterion (BIC), which coincides with Rissanen’s Minimum Description Length (MDL) criterion, the well-known technique of cross-validation (CV), and the Bayesian Ying-Yang harmony criterion for small sample sizes (BYY-S). As shown in experiments on small sample sizes, BIC and CV are clearly better than AIC and CAIC. Moreover, BIC may be better than CV on certain data sets, while CV may be better than BIC on others. Interestingly, BYY-S generally outperforms both BIC and CV.

Lei Shi, Lei Xu
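
To make the criteria concrete, here is an illustrative Gaussian-residual computation of AIC and BIC over candidate hidden-layer sizes; CAIC and BYY-S are omitted, and the paper's experimental details differ.

    import numpy as np
    from sklearn.neural_network import MLPRegressor

    X = np.random.rand(200, 5)
    y = np.sin(X.sum(axis=1)) + 0.1 * np.random.randn(200)
    n = len(y)

    for h in (2, 4, 8, 16):
        net = MLPRegressor(hidden_layer_sizes=(h,), max_iter=5000,
                           random_state=0).fit(X, y)
        rss = np.sum((y - net.predict(X)) ** 2)
        k = h * (X.shape[1] + 2) + 1      # free parameters of the 3-LFFNN
        aic = n * np.log(rss / n) + 2 * k
        bic = n * np.log(rss / n) + k * np.log(n)
        print(h, round(aic, 1), round(bic, 1))   # pick the minimizer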

Learning Algorithms (I)

On-Line Learning with Structural Adaptation in a Network of Spiking Neurons for Visual Pattern Recognition

This paper presents an on-line training procedure for a hierarchical neural network of integrate-and-fire neurons. The training is done through synaptic plasticity and changes in the network structure. Event-driven computation optimizes processing speed in order to simulate networks with a large number of neurons. The training procedure is applied to the face recognition task. Preliminary experiments on a publicly available face image dataset show the same performance as the optimized off-line method. A comparison with other classical methods of face recognition demonstrates the properties of the system.

Simei Gomes Wysoski, Lubica Benuskova, Nikola Kasabov

Learning Long Term Dependencies with Recurrent Neural Networks

Recurrent neural networks (RNNs) unfolded in time are in theory able to map any open dynamical system. Still, they are often claimed to be unable to identify long-term dependencies in the data. Especially when they are trained with backpropagation through time (BPTT), it is claimed that RNNs unfolded in time fail to learn inter-temporal influences more than ten time steps apart.

This paper provides a disproof of this often-cited statement. We show that RNNs, and especially normalised recurrent neural networks (NRNNs), unfolded in time are indeed very capable of learning time lags of at least a hundred time steps. We further demonstrate that the problem of a vanishing gradient does not apply to these networks.

Anton Maximilian Schäfer, Steffen Udluft, Hans Georg Zimmermann

Adaptive On-Line Neural Network Retraining for Real Life Multimodal Emotion Recognition

Emotions play a major role in human-to-human communication, enabling people to express themselves beyond the verbal domain. In recent years, important advances have been made in unimodal speech and video emotion analysis, where facial expression information and prosodic audio features are treated independently. However, the need to combine the two modalities in a naturalistic context, where adaptation to specific human characteristics and expressivity is required and where single modalities alone cannot provide satisfactory evidence, is clear. This paper proposes appropriate neural network classifiers for multimodal emotion analysis in an adaptive framework, which is able to trigger retraining of each modality whenever deterioration of the respective performance is detected. Results are presented based on the IST HUMAINE NoE naturalistic database; both facial expression information and prosodic audio features are extracted from the same data, and feature-based emotion analysis is performed through the proposed adaptive neural network methodology.

Spiros Ioannou, Loic Kessous, George Caridakis, Kostas Karpouzis, Vered Aharonson, Stefanos Kollias

Time Window Width Influence on Dynamic BPTT(h) Learning Algorithm Performances: Experimental Study

The purpose of the research addressed in this paper is to study the influence of the time window width in dynamic truncated BackPropagation Through Time (BPTT(h)) learning algorithms. Statistical experiments based on the identification of a real biped robot balancing mechanism are carried out to establish the link between the window width and the stability, speed and accuracy of the learning. The choice of time window width is shown to be crucial for the convergence speed of the learning process and the generalization ability of the network. Particular attention is paid to a divergence problem (gradient blow-up) observed under the assumption that the net parameters are constant along the window. The limit of this assumption is demonstrated, and the storage of parameter evolutions, used as a solution to this problem, is detailed.

V. Scesa, P. Henaff, F. B. Ouezdou, F. Namoun

Framework for the Interactive Learning of Artificial Neural Networks

We propose a framework for the interactive learning of artificial neural networks. In this paper we study interaction during the training of visualizable supervised tasks. If the activity of a hidden node in the network is visualized in a similar way to the network outputs, a human observer may deduce the effect of this particular node on the resulting output. We allow the human to interfere with the learning process of the network, so that he or she can improve the learning performance by incorporating his or her lifelong experience. This interaction is similar to the process of teaching children, where the teacher observes their responses to questions and guides the process of learning. Several methods of interaction with neural network training are described and demonstrated in the paper.

Matúš Užák, Rudolf Jakša

Analytic Equivalence of Bayes a Posteriori Distributions

Many learning machines that have hidden variables or hierarchical structures are singular statistical models: they have singular Fisher information matrices and different learning performance from regular statistical models. In this paper, we prove mathematically that the learning coefficient is determined by the analytic equivalence class of the Kullback information, and show experimentally that the stochastic complexity obtained by the MCMC method is also given by the equivalence class.

Takeshi Matsuda, Sumio Watanabe

Learning Algorithms (II)

Neural Network Architecture Selection: Size Depends on Function Complexity

The relationship between generalization ability, neural network size and function complexity is analyzed in this work. The dependence of the generalization process on the complexity of the function implemented by the neural architecture is studied using a recently introduced measure of the complexity of Boolean functions. Furthermore, an association rule discovery (ARD) technique was used to find associations among subsets of items in the whole set of simulation results. The main result of the paper is that, for a set of quasi-randomly generated Boolean functions, large neural networks generalize better on high-complexity functions than smaller ones, which perform better on low- and medium-complexity functions.

Iván Gómez, Leonardo Franco, José L. Subirats, José M. Jerez

Competitive Repetition-suppression (CoRe) Learning

The paper introduces Competitive Repetition-suppression (CoRe) learning, a novel paradigm inspired by a cortical mechanism of perceptual learning called repetition suppression. CoRe learning is an unsupervised, soft-competitive [1] model with conscience [2] that can be used for self-generating compact neural representations of the input stimuli. The key idea underlying the development of CoRe learning is to exploit the temporal distribution of neuron activations as a source of training information and to drive memory formation. As a case study, the paper reports the CoRe learning rules that have been derived for the unsupervised training of a Radial Basis Function network.

Davide Bacciu, Antonina Starita

Real-Time Construction of Neural Networks

A stepwise two-stage algorithm is proposed for real-time construction of generalized single-layer networks (GSLNs). The first stage of this algorithm generates a network using a forward selection procedure, which is then reviewed at the second stage to replace insignificant neural nodes. The main contribution of this paper is that these two stages are performed within one regression context using Cholesky decomposition, leading to significantly improved neural network performance and concise real-time network construction procedures.

Kang Li, Jian Xun Peng, Minrui Fei

MaxMinOver Regression: A Simple Incremental Approach for Support Vector Function Approximation

The well-known MinOver algorithm is a simple modification of the perceptron algorithm and provides the maximum margin classifier without a bias in linearly separable two-class classification problems. In [1] and [2] we presented DoubleMinOver and MaxMinOver as extensions of MinOver which provide the maximal margin solution in the primal and the Support Vector solution in the dual formulation by dememorising non-Support Vectors. These two approaches were augmented to soft margins based on the ν-SVM and the C2-SVM. We extended the latter approach to SoftDoubleMaxMinOver [3], and finally this method leads to a Support Vector regression algorithm which is as efficient, and as simple to implement, as the C2-SoftDoubleMaxMinOver classification algorithm.

Daniel Schneegaß, Kai Labusch, Thomas Martinetz

A Variational Formulation for the Multilayer Perceptron

In this work we present a theory of the multilayer perceptron from the perspective of functional analysis and variational calculus. Within this formulation, the learning problem for the multilayer perceptron consists of finding a function which is an extremal of some functional. As we will see, a variational formulation for the multilayer perceptron provides a direct method for the solution of general variational problems, in any dimension and up to any degree of accuracy. In order to validate this technique we use a multilayer perceptron to solve some classical problems in the calculus of variations.

Roberto Lopez, Eugenio Oñate

Advances in Neural Network Learning Methods (Special Session)

Natural Conjugate Gradient Training of Multilayer Perceptrons

For maximum log-likelihood estimation, the Fisher matrix defines a Riemannian metric in weight space and, as shown by Amari and his coworkers, the resulting natural gradient greatly accelerates on-line multilayer perceptron (MLP) training. While its batch gradient descent counterpart also improves on standard gradient descent (as it gives a Gauss-Newton approximation to mean square error minimization), it may no longer be competitive with more advanced gradient-based function minimization procedures. In this work we show how to introduce natural gradients in a conjugate gradient (CG) setting, demonstrating numerically that, when applied to batch MLP learning, they lead to faster convergence to better minima than standard Euclidean CG descent achieves. Since a drawback of the full natural gradient is its larger computational cost, we also consider some cost-reducing variants and show that one of them, diagonal natural CG, also gives better minima than standard CG, with comparable complexity.

Ana González, José R. Dorronsoro
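
For reference, the natural gradient mentioned above replaces the Euclidean gradient by its Fisher-preconditioned counterpart (a standard formulation, not the paper's full derivation):

    \tilde{\nabla} E(w) = F(w)^{-1} \nabla E(w),
    F(w) = \mathbb{E}\big[ \nabla_w \log p(x, y; w) \, \nabla_w \log p(x, y; w)^{\top} \big]

In the CG variant studied here, search directions are built from \tilde{\nabla} E instead of \nabla E; the diagonal variant presumably approximates F by its diagonal to cut the cost of forming and inverting the full matrix.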

Building Ensembles of Neural Networks with Class-Switching

This article investigates the properties of ensembles of neural networks, in which each network in the ensemble is constructed using a perturbed version of the training data. The perturbation consists in switching the class labels of a subset of training examples selected at random. Experiments on several UCI and synthetic datasets show that these class-switching ensembles can obtain improvements in classification performance over both individual networks and bagging ensembles.

Gonzalo Martínez-Muñoz, Aitor Sánchez-Martínez, Daniel Hernández-Lobato, Alberto Suárez

K-Separability

Neural networks use their hidden layers to transform input data into linearly separable data clusters, with a linear or perceptron-type output layer making the final projection on the line perpendicular to the discriminating hyperplane. For complex data with multimodal distributions this transformation is difficult to learn. Projection on k ≥ 2 line segments is the simplest extension of linear separability, defining a much easier goal for the learning process. The difficulty of learning non-linear data distributions is shifted to the separation of line intervals, making the main part of the transformation much simpler. For classification of difficult Boolean problems, such as the parity problem, linear projection combined with k-separability is sufficient.

Włodzisław Duch
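
Stated compactly (our paraphrase of the usual definition, given here for convenience): a dataset {(x_i, c_i)} is k-separable if there exist a direction w and thresholds t_1 < ... < t_{k-1} such that every interval of projected values y_i = w^{\top} x_i delimited by the thresholds contains vectors of a single class only. For n-bit parity, w = (1, ..., 1) projects each input to its count of ones, y \in \{0, 1, ..., n\}, with the class alternating between consecutive counts, so parity is (n+1)-separable.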

Lazy Training of Radial Basis Neural Networks

Usually, training data are not evenly distributed in the input space. This makes non-local methods, like neural networks, not very accurate in such cases. On the other hand, local methods face the problem of knowing which examples are best for each test pattern. In this work, we present a way of performing a trade-off between local and non-local methods: a Radial Basis Neural Network is used as the learning algorithm, and a selection of the training patterns is made for each query. Moreover, the RBNN initialization algorithm has been modified in a deterministic way to eliminate the influence of any initial condition. Finally, the new method has been validated in two time series domains, an artificial one and a real-world one.

José M. Valls, Inés M. Galván, Pedro Isasi

Investigation of Topographical Stability of the Concave and Convex Self-Organizing Map Variant

We investigate, by a systematic numerical study, the parameter dependence of the stability of the Kohonen Self-Organizing Map and of the Zheng and Greenleaf concave and convex learning, with respect to different input distributions and input and output dimensions.

Topical groups: Advances in Neural Network Learning Methods; Neural and hybrid architectures and learning algorithms; Self-organization.

Fabien Molle, Jens Christian Claussen

Alternatives to Parameter Selection for Kernel Methods

In this paper we propose alternatives to parameter selection techniques for building a kernel matrix for classification purposes using Support Vector Machines (SVMs). We describe several methods to build a unique kernel matrix from a collection of kernels built using a wide range of values for the unknown parameters. The proposed techniques have been successfully evaluated on a variety of artificial and real data sets. The new methods outperform the best individual kernel under consideration, and they can be used as an alternative to the parameter selection problem in kernel methods.

Alberto Muñoz, Isaac Martín de Diego, Javier M. Moguerza
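
One of the simplest combination schemes in this spirit (the paper proposes several more refined ones): average RBF kernel matrices computed over a grid of candidate width parameters instead of selecting a single one.

    import numpy as np
    from sklearn.metrics.pairwise import rbf_kernel
    from sklearn.svm import SVC

    def averaged_rbf_kernel(X, Y=None, gammas=(0.01, 0.1, 1.0, 10.0)):
        # A convex combination of PSD kernels is PSD, so the result is a
        # valid kernel matrix.
        return sum(rbf_kernel(X, Y, gamma=g) for g in gammas) / len(gammas)

    # usage with a precomputed kernel:
    # clf = SVC(kernel='precomputed').fit(averaged_rbf_kernel(X_train), y_train)
    # clf.predict(averaged_rbf_kernel(X_test, X_train))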

Faster Learning with Overlapping Neural Assemblies

Cell assemblies in neural networks are often assumed to overlap, i.e. a neuron may belong to several of them simultaneously. We argue that network structures with overlapping cell assemblies can exhibit faster learning compared to non-overlapping ones. In such structures, newly trained assemblies take advantage of their overlaps with already trained neighbors. The assemblies learned in this manner nevertheless preserve the ability to fire separately afterwards. We discuss the implications this may have for speeding up neural network training methods, and we also propose to view this learning speed-up in the broader context of inter-assembly cooperation, useful for modeling concept formation in human thinking.

Andrei Kursin, Dušan Húsek, Roman Neruda

Improved Storage Capacity of Hebbian Learning Attractor Neural Network with Bump Formations

Recently, bump formations in attractor neural networks with distance-dependent connectivities have become of increasing interest in biological and computational neuroscience. Although distance-dependent connectivity is common in biological networks, a common fault of these networks is the sharp drop in the number of patterns p that can be remembered when the activity changes from global to bump-like, which effectively makes these networks ineffective.

In this paper we present a bump-based recursive network specially designed to increase its capacity, which is comparable with that of a randomly connected sparse network. To this aim, we have tested a selection of 700 natural images on a network with N = 64K neurons and connectivity C per neuron. We have shown that the capacity of the network is of order C, in accordance with the capacity of a highly diluted network. Preserving the number of connections per neuron, non-trivial behavior with the radius of the connectivity has been observed. Our results show that the decrease in capacity of the bumpy network can be avoided.

Kostadin Koroutchev, Elka Korutcheva

Error Entropy Minimization for LSTM Training

In this paper we present a new training algorithm for the Long Short-Term Memory (LSTM) recurrent neural network. This algorithm uses entropy instead of the usual mean squared error as the cost function for the weight update. More precisely, we use the Error Entropy Minimization (EEM) approach, where the entropy of the error is minimized after each symbol is presented to the network. Our experiments show that this approach enables the convergence of the LSTM more frequently than the traditional learning algorithm. This in turn relaxes the burden of parameter tuning, since learning is achieved for a wider range of parameter values. The use of EEM also reduces, in some cases, the number of epochs needed for convergence.

Luís A. Alexandre, J. P. Marques de Sá
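
The quantity minimized in EEM is typically Renyi's quadratic entropy of the error, estimated with a Parzen window; a sketch of that estimate follows (LSTM specifics and the symbol-by-symbol update are omitted):

    import numpy as np

    def renyi_quadratic_entropy(errors, sigma=0.5):
        # H2(e) = -log( (1/N^2) sum_ij G(e_i - e_j; 2 sigma^2) )
        d = errors[:, None] - errors[None, :]
        info_potential = np.mean(np.exp(-d ** 2 / (4 * sigma ** 2))
                                 / np.sqrt(4 * np.pi * sigma ** 2))
        return -np.log(info_potential)

Minimizing this entropy concentrates the error distribution, playing the role that minimizing the mean squared error plays in standard training.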

Ensemble Learning

Can AdaBoost.M1 Learn Incrementally? A Comparison to Learn++ Under Different Combination Rules

We had previously introduced Learn++, inspired in part by the ensemble-based AdaBoost algorithm, for incrementally learning from new data, including new concept classes, without forgetting what had been previously learned. In this effort, we compare the incremental learning performance of Learn++ and AdaBoost under several combination schemes, including their native weighted majority voting. We show on several databases that changing AdaBoost’s distribution update rule from hypothesis-based update to ensemble-based update allows significantly more efficient incremental learning, regardless of the combination rule used to combine the classifiers.

Hussein Syed Mohammed, James Leander, Matthew Marbach, Robi Polikar
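
Schematically (our paraphrase of the standard rules, not an excerpt from the paper), AdaBoost.M1 reweights examples using only the latest hypothesis h_t,

    D_{t+1}(i) \propto D_t(i) \, \beta_t^{[h_t(x_i) = y_i]},
    \beta_t = \varepsilon_t / (1 - \varepsilon_t),

whereas the ensemble-based update substitutes the composite (weighted-majority) hypothesis H_t of all classifiers generated so far for h_t, so that examples already handled correctly by the ensemble as a whole, rather than by the last classifier alone, are de-emphasized.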

Ensemble Learning with Local Diversity

The concept of diversity is now recognized as a key characteristic of successful ensembles of predictors. In this paper we investigate an algorithm to generate diversity locally in regression ensembles of neural networks, based on the idea of imposing a neighborhood relation over the set of learners. In this algorithm, each predictor iteratively improves its state considering only information about the performance of its neighbors, generating a sort of local negative correlation. We assess our technique on two real data sets and compare it with Negative Correlation Learning, an effective technique for obtaining diverse ensembles. We demonstrate that the local approach exhibits results better than or comparable to the global one.

Ricardo Ñanculef, Carlos Valle, Héctor Allende, Claudio Moraga

A Machine Learning Approach to Define Weights for Linear Combination of Forecasts

The linear combination of forecasts is a procedure that has improved forecasting accuracy for different time series. In this procedure, each method being combined is associated with a numerical weight that indicates its contribution to the combined forecast. We present the use of machine learning techniques to define the weights for the linear combination of forecasts. In this paper, a machine learning technique uses features of the series at hand to define adequate weights for a pre-defined number of forecasting methods. In order to evaluate this solution, we implemented a prototype that uses an MLP network to combine two widespread methods. The experiments performed revealed significantly accurate forecasts.

Ricardo Prudêncio, Teresa Ludermir
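
In the two-method case evaluated here, the combined forecast takes the form below; the mapping from series features to the weight is what the MLP learns (a schematic sketch, with the convex-combination constraint as one common choice, not necessarily the paper's):

    import numpy as np

    def combined_forecast(f1, f2, w):
        # w in [0, 1] would be produced by an MLP fed with features of the
        # series (e.g. trend strength, autocorrelation -- invented examples).
        w = float(np.clip(w, 0.0, 1.0))
        return w * f1 + (1.0 - w) * f2

    print(combined_forecast(102.0, 98.0, 0.7))   # -> 100.8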

A Game-Theoretic Approach to Weighted Majority Voting for Combining SVM Classifiers

A new approach, from the game-theoretic point of view, is proposed for the problem of optimally combining classifiers in dichotomous choice situations. The analysis of weighted majority voting from the viewpoint of coalition gaming leads to the existence of analytical solutions for the optimal weights of the classifiers, based on their prior competencies. The general framework of weighted majority rules (WMR) is tested against common rank-based and simple majority models, as well as two soft-output averaging rules. Experimental results with combined support vector machine (SVM) classifiers on benchmark classification tasks show that WMR, employing the theoretically optimal solution for combination weights proposed in this work, outperformed all the other rank-based, simple majority and soft-output averaging methods. It also provides a very generic and theoretically well-defined framework for all hard-output (voting) combination schemes for any type of classifier architecture.

Harris Georgiou, Michael Mavroforakis, Sergios Theodoridis

Improving the Expert Networks of a Modular Multi-Net System for Pattern Recognition

A Modular Multi-Net System consists of several networks, each of which solves part of a problem. The original problem is decomposed into subproblems, and each network focuses on solving one subproblem. The Mixture of Neural Networks (MixNN) consists of expert networks that solve the subproblems and a gating network that weights the outputs of the expert networks. The expert networks and the gating network are trained together in order to reduce the correlation among the networks and minimize the error of the system. In this paper we present the Mixture of Multilayer Feedforward (MixMF), a method based on MixNN which uses Multilayer Feedforward networks at the expert level. Finally, we have performed a comparison among Simple Ensemble, MixNN and MixMF, and the results show that MixMF is the best performing method.

Mercedes Fernández-Redondo, Joaquín Torres-Sospedra, Carlos Hernández-Espinosa

Learning Random Neural Networks and Stochastic Agents (Special Session)

Evaluating Users’ Satisfaction in Packet Networks Using Random Neural Networks

Quantifying the quality of a video or audio transmission over the Internet is usually a hard task, as it is based on the statistical processing of evaluations made by a panel of humans (the corresponding standardized area is called subjective testing). In this paper we describe a methodology called Pseudo-Subjective Quality Assessment (PSQA), based on Random Neural Networks, which is able to perform this task automatically, accurately and efficiently. RNNs were chosen here because of their good performance compared with other possibilities; this is discussed in the paper. Some new insights on PSQA’s use and performance are also given. In particular, we discuss new results concerning PSQA-based dynamic quality control and conversational quality assessment.

Gerardo Rubino, Pierre Tirilly, Martín Varela

Random Neural Networks for the Adaptive Control of Packet Networks

The Random Neural Network (RNN) has been used in a wide variety of applications, including image compression, texture generation and pattern recognition. Our work focuses on the use of the RNN as a routing decision maker which uses Reinforcement Learning (RL) techniques to explore a search space (i.e. the set of all possible routes) to find the optimal route in terms of the Quality of Service metrics that are most important to the underlying traffic. We have termed this algorithm the Cognitive Packet Network (CPN), and have shown in previous works its application to a variety of network domains. In this paper, we present a set of experiments which demonstrate how CPN performs in a realistic environment compared to a priori-computed optimal routes. We show that the RNN with RL can autonomously learn the best route in the network simply through exploration in a very short time-frame. We also demonstrate the quickness with which our algorithm is able to adapt to a disruption along its current route, switching to the new optimal route in the network. These results serve as strong evidence for the benefits of the RNN Reinforcement Learning algorithm which we employ.

Michael Gellman, Peixiang Liu

Hardware Implementation of Random Neural Networks with Reinforcement Learning

In this paper, we present a hardware implementation of a random neural network (RNN) model. The RNN, introduced by Gelenbe, is a spiked neural network model that possesses several useful mathematical properties, such as the existence and uniqueness of the solution and the convergence of the learning algorithm. In particular, we discuss the implementation details for an RNN which uses a reinforcement learning algorithm. We also illustrate an example where this circuit implementation is used as a building block in a recently proposed network routing protocol called cognitive packet networks (CPN). CPN does not employ a routing table; instead, it relies on the RNN with a reinforcement learning algorithm to route probing packets.

Taskin Kocak

G-Networks and the Modeling of Adversarial Agents

As a result of the structure and content transformation of an evolving society, many large-scale autonomous systems have emerged in diverse areas such as biology, ecology and finance. Inspired by the desire to better understand and make the best of these systems, we propose an approach which builds stochastic mathematical models, in particular G-network models, that allow the efficient representation of systems of agents and offer the possibility of analyzing their behavior using mathematics. This approach is capable of modelling the system at different abstraction levels, both in terms of the number of agents and the size of the geographical location. We demonstrate our approach with some urban military planning scenarios, and the results suggest that it tackles the problem of modelling autonomous systems at low computational cost. Apart from offering numerical estimates of the outcome, the approach helps us identify the characteristics that impact the system most and allows us to compare alternative strategies.

Yu Wang

Hybrid Architectures

Development of a Neural Net-Based, Personalized Secure Communication Link

This paper describes a novel ultra-secure, unidirectional communication channel for use in public communication networks, which is based on

a) learning algorithms in combination with neural nets for fabrication of a unique pair of modules for encryption and decryption, and

b) in combination with decision trees for the decryption process,

c) signal transformation from spatial to temporal patterns by means of ambiguous spatial-temporal filters (ST filters),

d) absence of public or private keys, and

e) requirement of biometric data of one of the users for both generation of the pair of hardware/software modules and for the decryption by the receiver.

To achieve these features we have implemented an encryption-unit (EU) using ST filters for encryption and a decryption unit (DU) using learning algorithms and decision trees for decryption.

Dirk Neumann, Rolf Eckmiller, Oliver Baruth

Exact Solutions for Recursive Principal Components Analysis of Sequences and Trees

We show how a family of exact solutions to the Recursive Principal Components Analysis learning problem can be computed for sequences and tree-structured inputs. These solutions are derived from the eigenanalysis of extended vectorial representations of the input structures and substructures. Experimental results on sequences and trees generated by a context-free grammar show the effectiveness of the proposed approach.

Alessandro Sperduti

Active Learning with the Probabilistic RBF Classifier

In this work we present an active learning methodology for training the probabilistic RBF (PRBF) network. It is a special case of the RBF network, and constitutes a generalization of the Gaussian mixture model. We propose an incremental method for semi-supervised learning based on the Expectation-Maximization (EM) algorithm. Then we present an active learning method that iteratively applies the semi-supervised method for learning the labeled and unlabeled observations concurrently, and then employs a suitable criterion to select an unlabeled observation and query its label. The proposed criterion selects points near the decision boundary, and facilitates the incremental semi-supervised learning that also exploits the decision boundary. The performance of the algorithm in experiments using well-known data sets is promising.

Constantinos Constantinopoulos, Aristidis Likas

Merging Echo State and Feedforward Neural Networks for Time Series Forecasting

Echo state neural networks, which are a special case of recurrent neural networks, are studied from the viewpoint of their learning ability, with the goal of achieving greater prediction ability. The standard training of these neural networks uses a pseudoinverse matrix for one-step learning of the weights from hidden to output neurons. Here, this learning was replaced by the error backpropagation algorithm, and the output neurons were replaced by a feedforward neural network. This approach was tested in temperature forecasting, and the prediction error was substantially smaller than that achieved either by a standard echo state neural network or by a standard multi-layered perceptron with backpropagation.

Štefan Babinec, Jiří Pospíchal
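
The idea can be sketched as follows (hyperparameters and the toy task are invented; the paper's setting is temperature forecasting): drive a fixed random reservoir, then train a feedforward readout by backpropagation on the collected states instead of solving for output weights with a pseudoinverse.

    import numpy as np
    from sklearn.neural_network import MLPRegressor

    rng = np.random.default_rng(1)
    n_res = 200
    W_in = rng.uniform(-0.5, 0.5, size=n_res)
    W = rng.normal(size=(n_res, n_res))
    W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))   # echo state property

    u = np.sin(np.linspace(0, 20 * np.pi, 2000))      # toy input series
    x, states = np.zeros(n_res), []
    for u_t in u:
        x = np.tanh(W_in * u_t + W @ x)
        states.append(x.copy())
    states = np.array(states)

    # feedforward readout trained by backpropagation, one-step-ahead target
    readout = MLPRegressor(hidden_layer_sizes=(50,), max_iter=500,
                           random_state=0).fit(states[:-1], u[1:])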

Language and Cognition Integration Through Modeling Field Theory: Category Formation for Symbol Grounding

Neural Modeling Field Theory is based on the principle of associating lower-level signals (e.g., inputs, bottom-up signals) with higher-level concept-models (e.g., internal representations, categories/concepts, top-down signals), avoiding the combinatorial complexity inherent in such a task. In this paper we present an extension of the Modeling Field Theory neural network for the classification of objects. Simulations show (i) that the system is able to dynamically adapt when an additional feature is introduced during learning, (ii) that the algorithm can be applied to the classification of action patterns in the context of cognitive robotics, and (iii) that it is able to classify multi-feature objects from a complex stimulus set. The use of Modeling Field Theory for studying the integration of language and cognition in robots is discussed.

Vadim Tikhanoff, José F. Fontanari, Angelo Cangelosi, Leonid I. Perlovsky

A Methodology for Estimating the Product Life Cycle Cost Using a Hybrid GA and ANN Model

Although most of the product life cycle cost (LCC) is committed at the early design stage, designers do not consider at this stage the costs incurred in subsequent phases of the life cycle. An estimation method for product LCC in early design processes is required, because both the detailed information and the time for a detailed LCC analysis of a wide range of design alternatives are lacking. This paper proposes a hybrid genetic algorithm (GA) and artificial neural network (ANN) model to estimate the product LCC that allows the designer to make comparative LCC estimates between different product concepts. In this study, GAs are employed to select feature subsets by eliminating irrelevant factors and to determine the number of hidden nodes and processing elements. In addition, GAs simultaneously optimize the connection weights between the layers of the ANN. Experimental results show that the hybrid GA and ANN model outperforms the conventional backpropagation neural network and verify the effectiveness of the proposed method.

Kwang-Kyu Seo

Self Organization

Using Self-Organizing Maps to Support Video Navigation

Content-based video navigation is an efficient method for browsing video information. A common approach is to cluster shots into groups and visualize them afterwards. In this paper, we present a prototype that generally follows this approach. Unlike existing systems, the clustering is based on a growing self-organizing map algorithm. We focus on studying the applicability of SOMs for video navigation support. We ignore the temporal aspect completely during the clustering, but we project the grouped data onto an original time-bar control afterwards. This complements our interface by providing, at the same time, an integrated view of time- and content-based information. The aim is to supply the user with as much information as possible on one single screen, without overwhelming him or her. Special attention is also given to the interaction possibilities, which are hierarchically organized.

Thomas Bärecke, Ewa Kijak, Andreas Nürnberger, Marcin Detyniecki

Self-Organizing Neural Networks for Signal Recognition

In this paper we introduce a self-organizing neural network that is capable of recognizing temporal signals. Conventional self-organizing neural networks, like the recurrent variant of the Self-Organizing Map, provide clustering of input sequences in space and time, but when such a network is used, the identification of the sequence itself requires a supervised recognition process. In our network, called TICALM, recognition is expressed by the speed of convergence of the network while processing either a learned or an unknown signal. TICALM's capabilities are shown in a handwriting recognition experiment.

Jan Koutník, Miroslav Šnorek

An Unsupervised Learning Rule for Class Discrimination in a Recurrent Neural Network

A number of well-known unsupervised feature extraction neural network models are present in the literature. The development of unsupervised pattern classification systems, although they share many of the principles of the aforementioned network models, has proven to be more elusive. This paper describes in detail a neural network capable of performing class separation through self-organizing Hebbian-like dynamics, i.e., the network is able to autonomously find classes of patterns without help from any external agent. The model is built around a recurrent network performing winner-takes-all competition. Automatic labelling of input data samples is based upon the induced activity pattern after presentation of the sample. Neurons compete against each other through recurrent interactions to code the input sample. The resulting active neurons update their parameters to improve the classification process. The learning dynamics are, moreover, absolutely stable.

Juan Pablo de la Cruz Gutiérrez

On the Variants of the Self-Organizing Map That Are Based on Order Statistics

Two well-known variants of the self-organizing map (SOM) that are based on order statistics are the marginal median SOM and the vector median SOM. In the past, their efficiency was demonstrated for color image quantization. In this paper, we employ the well-known IRIS data set and assess their performance with respect to the accuracy, the mean squared error (averaged over all neurons) between the patterns assigned to a neuron and the neuron’s weight vector, and the Rand index. All figures of merit favor the marginal median SOM and the vector median SOM over the standard SOM. Based on these findings, the marginal median SOM and the vector median SOM are used to re-distribute emotional speech patterns from the Danish Emotional Speech database that were originally classified as neutral to four emotional states: hot anger, happiness, sadness, and surprise.

Vassiliki Moschou, Dimitrios Ververidis, Constantine Kotropoulos

On the Basis Updating Rule of Adaptive-Subspace Self-Organizing Map (ASSOM)

This paper gives further views on the basis updating rule of the ASSOM proposed by Kohonen. We first show that the traditional basis vector rotation rule can be expressed as a correction to the basis vector which is a scaling of the component vectors in the episode. With the latter form, some intermediate computations can be reused, leading to a computational load only linear in the input dimension and the subspace dimension, whereas a naive implementation of the traditional rotation rule has a computational load quadratic in the input dimension. We then propose a batch-mode updating of the basis vectors. We show that the correction made to each basis vector is a linear combination of the component vectors in the input episode, so computations can be saved further. Experiments show that the proposed methods preserve the ability to generate topologically ordered invariant-feature filters and that the learning procedure is largely accelerated.

Huicheng Zheng, Christophe Laurent, Grégoire Lefebvre

Composite Algorithm for Adaptive Mesh Construction Based on Self-Organizing Maps

A neural network approach to adaptive mesh construction based on Kohonen’s Self-Organizing Maps (SOM) is considered. The approach belongs to a class of methods in which an adaptive mesh is the result of mapping a computational domain onto a physical domain. Using the SOM for mesh construction in its pure form has some imperfections, and a composite algorithm to overcome them is proposed. The algorithm is based on the idea of alternating mesh construction on the border and inside the physical domain, and it includes techniques to control the consistency between boundary and interior mesh nodes and to provide an appropriate distribution of boundary nodes along the border of the domain. To increase the quality and speed of mesh construction, a number of experiments were carried out to improve the learning rate. It is shown that the quality of meshes constructed using the proposed algorithm is admissible according to the generally accepted quality criteria for finite difference meshes.

Olga Nechaeva

A Parameter in the Learning Rule of SOM That Incorporates Activation Frequency

In the traditional self-organizing map (SOM), the best matching unit (BMU) affects other neurons, through the learning rule, as a function of distance. Here, we propose a new parameter in the learning rule so that neurons are affected by the BMU not only as a function of distance, but also as a function of the frequency of activation, from both the BMU and the input vectors, of the affected neurons. This frequency parameter allows non-radial neighborhoods, and the quality of the formed maps is improved with respect to those formed by the traditional SOM, as we show by comparing several error measures on five data sets.

Antonio Neme, Pedro Miramontes
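
A toy version of such a rule on a 1-D map (our guess at the frequency term, for illustration only): the Gaussian neighborhood of the BMU is modulated by how often each neuron has been updated from that BMU before, so effective neighborhoods need not stay radial.

    import numpy as np

    n_neurons, dim = 10, 3
    W = np.random.rand(n_neurons, dim)        # weight vectors of a 1-D map
    freq = np.ones((n_neurons, n_neurons))    # freq[b, j]: co-activation counts

    def som_step(x, lr=0.1, sigma=1.5):
        bmu = np.argmin(np.linalg.norm(W - x, axis=1))
        d = np.abs(np.arange(n_neurons) - bmu)        # grid distances
        h = np.exp(-d ** 2 / (2 * sigma ** 2))        # radial neighborhood
        f = freq[bmu] / freq[bmu].max()               # frequency modulation
        W[:] += lr * (h * f)[:, None] * (x - W)
        freq[bmu] += h                                # record this activation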

Nonlinear Projection Using Geodesic Distances and the Neural Gas Network

A nonlinear projection method that uses geodesic distances and the neural gas network is proposed. First, the neural gas algorithm is used to obtain codebook vectors, and a connectivity graph is concurrently created using the competitive Hebbian rule. A procedure is added to tear or break non-contractible cycles in the connectivity graph, in order to project efficiently ‘circular’ manifolds such as the cylinder or torus. In the second step, the nonlinear projection is created by applying an adaptation rule for codebook positions in the projection space. The mapping quality obtained with the proposed method outperforms CDA and Isotop in terms of the trustworthiness, continuity, and topology preservation measures.

Pablo A. Estévez, Andrés M. Chong, Claudio M. Held, Claudio A. Perez

Connectionist Cognitive Science

Contextual Learning in the Neurosolver

In this paper, we introduce an enhancement to the Neurosolver, a neuromorphic planner and problem solving system. The enhanced architecture enables contextual learning. The Neurosolver was designed and tested on several problem solving and planning tasks, such as re-arranging blocks and controlling a software-simulated artificial rat running in a maze. In these tasks, the Neurosolver learned temporal patterns independently of context. However, in the real world no skill is acquired in a vacuum; contextual cues are part of every situation, and the brain can incorporate such stimuli, as evidenced by experiments with live rats, which use cues from the environment to navigate inside mazes. The enhanced architecture of the Neurosolver accommodates similar learning.

Andrzej Bieszczad, Kasia Bieszczad

A Computational Model for the Effect of Dopamine on Action Selection During Stroop Test

Based on a connectionist model of the cortex-basal ganglia-thalamus loop recently proposed by the authors, a simple connectionist model realizing the Stroop effect is established. The connectionist model of the cortex-basal ganglia-thalamus loop is a nonlinear dynamical system, and the model is not only capable of revealing the action selection property of the basal ganglia but is also capable of modelling the effect of dopamine on action selection. While the interpretation of the action selection function is based on solutions of the nonlinear dynamical system, the effect of dopamine is modelled by a parameter. The effect of dopamine in inhibiting the habitual behaviour corresponding to word reading in the Stroop test and letting the novel one, corresponding to colour naming, occur is investigated using the model established in this work.

Ozkan Karabacak, N. Serap Sengor

A Neural Network Model of Metaphor Understanding with Dynamic Interaction Based on a Statistical Language Analysis

The purpose of this study is to construct a human-like neural network model that represents the process of metaphor understanding with dynamic interaction, based on data obtained from statistical language analysis. In this paper, the probabilistic relationships between concepts and their attribute values are first computed from the statistical analysis of language data. Secondly, a computational model of the metaphor understanding process is constructed, including dynamic interaction among attribute values. Finally, a psychological experiment is conducted to examine the psychological validity of the model.

Asuka Terai, Masanori Nakagawa

Strong Systematicity in Sentence Processing by an Echo State Network

For neural networks to be considered as realistic models of human linguistic behavior, they must be able to display the level of systematicity that is present in language. This paper investigates the systematic capacities of a sentence-processing Echo State Network. The network is trained on sentences in which particular nouns occur only as subjects and others only as objects. It is then tested on novel sentences in which these roles are reversed. Results show that the network displays so-called strong systematicity.

Stefan L. Frank

Modeling Working Memory and Decision Making Using Generic Neural Microcircuits

Classical behavioral experiments to study working memory typically involve three phases: first the subject receives a stimulus, then holds it in working memory, and finally makes a decision by comparing it with another stimulus. A neurocomputational model using generic neural microcircuits with feedback is presented here that integrates the three computational stages into a single unified framework. The architecture is tested using the two-interval discrimination and delayed-match-to-sample experimental paradigms as benchmarks.

Prashant Joshi

A Virtual Machine for Neural Computers

Neural networks are mainly seen as algorithmic solutions for optimization and learning tasks, where the ability to spread the acquired knowledge over several neurons, i.e., the use of sub-symbolic computation, is the key. We have shown in previous works that neural networks can perform other types of computation, namely symbolic and chaotic computations. Herein, we show how these nets can be decomposed into tuples which can be efficiently calculated by software or hardware simpler than previous neural solutions.

João Pedro Neto

Cognitive Machines (Special Session)

Machine Cognition and the EC Cognitive Systems Projects: Now and in the Future

The strong support for the development of cognitive machines by the EC (under INFSO E5 – Cognition) will be reviewed, covering the main ideas of the 23 projects in this unit funded under FP6. The variety of approaches to cognition contained in these will be summarized, and future developments in FP7 considered. Conclusions on the future of the development of cognitive machines, seen from this European perspective, will conclude the paper.

John G Taylor

A Complex Neural Network Model for Memory Functioning in Psychopathology

In an earlier paper [1], we described the mental pathology known as neurosis in terms of its relation to memory function. We proposed a mechanism whereby neurotic behavior may be understood as an associative memory process in the brain, and the symbolic associative process involved in psychoanalytic working-through can be mapped onto a process of reconfiguration of the neuronal network. Memory was modeled by a Boltzmann machine represented by a complete graph. However, it is known that brain neuronal topology is selectively structured. Here, we further develop the memory model by including known mechanisms that control synaptic properties, showing that the network self-organizes to a hierarchical, clustered structure. Two modules corresponding to sensorial and declarative memory interact, producing sensorial and symbolic activity, representing unconscious and conscious mental processes. This extension of the model allows an evaluation of the idea of working-through in a hierarchical network structure.

Roseli S. Wedemann, Luís Alfredo V. de Carvalho, Raul Donangelo

Modelling Working Memory Through Attentional Mechanisms

Recent studies of working memory have shown that the network of brain areas that supports working memory function overlaps heavily with the well studied network of selective attention. It has thus been suggested that working memory may operate by means of a repeated focusing of attention on the internal representations of the items that need to be maintained. We have employed our CODAM model of attention to simulate a specific working memory paradigm based on precisely this concept of ‘refreshing’ internal representations using attention. We propose here that the well known capacity limit of working memory can be attributed to the ‘scarceness’ of attentional resources. The specific mechanism of CODAM for modelling such scarceness is used in the paradigm to explain the behavioural and brain imaging data. This and related paradigms allow us to extend the specification of CODAM sites and functions to more detailed executive functions under executive control.

John Taylor, Nickolaos Fragopanagos, Nienke Korsten

A Cognitive Model of Multi-objective Multi-concept Formation

The majority of previous computational models of high-order human cognition incorporate gradient descent algorithms for their learning mechanisms and strict error minimization as the sole objective of learning. Recently, however, the validity of gradient descent as a descriptive model of real human cognitive processes has been criticized. In the present paper, we introduce a new framework for descriptive models of human learning that offers qualitatively plausible interpretations of cognitive behaviors. Specifically, we apply a simple multi-objective evolutionary algorithm as a learning method for modeling human category learning, where the definition of the learning objective is not based solely on the accuracy of knowledge, but also on the subjectively and contextually determined utility of knowledge being acquired. In addition, unlike gradient descent, our model assumes that humans entertain multiple hypotheses and learn not only by modifying a single existing hypothesis but also by combining a set of hypotheses. This learning-by-combination has been empirically supported, but largely overlooked in computational modeling research. Simulation studies show that our new modeling framework successfully replicated observed phenomena.

Toshihiko Matsuka, Yasuaki Sakamoto, Jeffrey V. Nickerson, Arieta Chouchourelou

A Basis for Cognitive Machines

We propose a general attention-based approach to thinking and cognition (more specifically reasoning and planning) in cognitive machines, based on the ability to manipulate neural activity in a virtual manner so as to achieve certain goals; this can then lead to decisions to make movements or to take no action whatever. The basic components are proposed to consist of forward/inverse model motor control pairs in an attention-control architecture, in which buffers are used to achieve sequencing by recurrence of virtual actions and attended states. How this model can apply to various reasoning paradigms will be described, and first simulations are presented using a virtual robot environment.

J. G. Taylor, S. Kasderidis, P. Trahanias, M. Hartley

Neural Model of Dopaminergic Control of Arm Movements in Parkinson’s Disease Bradykinesia

Patients suffering from Parkinson’s disease display a number of symptoms such as resting tremor, bradykinesia, etc. Bradykinesia is the hallmark and most disabling symptom of Parkinson’s disease (PD). Herein, a basal ganglia-cortico-spinal circuit for the control of voluntary arm movements in PD bradykinesia is extended by incorporating DAergic innervation of cells in the cortical and spinal components of the circuit. The resulting model successfully simulates several of the main reported effects of DA depletion on neuronal, electromyographic and movement parameters of PD bradykinesia.

Vassilis Cutsuridis

Occlusion, Attention and Object Representations

Occlusion is currently at the centre of analysis in machine vision. We present an approach to it that uses attention feedback to an occluded object to obtain its correct recognition. Various simulations are performed using a hierarchical visual attention feedback system based on contrast gain (whose relation to possible feedback-induced hallucinations we discuss). We then discuss the implications of our results for object representations per se.

Neill R. Taylor, Christo Panchev, Matthew Hartley, Stathis Kasderidis, John G. Taylor

A Forward / Inverse Motor Controller for Cognitive Robotics

Before making a movement aimed at achieving a task, human beings either run a mental process that attempts to find a feasible course of action, one compatible with a number of internal and external constraints and near-optimal according to some criterion, or select one from a repertoire of previously learned actions, according to the parameters of the task. If neither reasoning process succeeds, a typical backup strategy is to look for a tool that might allow the operator to match all the task constraints. A cognitive robot should support a similar reasoning system. A central element of this architecture is a coupled pair of controllers: the FMC (forward motor controller), which maps tentative trajectories in the joint space into the corresponding trajectories of the end-effector variables in the workspace, and the IMC (inverse motor controller), which maps desired trajectories of the end-effector into feasible trajectories in the joint space. The proposed FMC/IMC architecture operates with any degree of redundancy and can deal with geometric constraints (range of motion in the joint space, internal and external constraints in the workspace) and effort-related constraints (range of torque of the actuators, etc.). It operates by alternating two basic operations: 1) relaxation in the configuration space (for reaching a target pose); 2) relaxation in the null space of the kinematic transformation (for producing the required interaction force). The failure of either relaxation can trigger a higher level of reasoning. For both elements of the architecture we propose a closed-form solution and a solution based on ANNs.

Vishwanathan Mohan, Pietro Morasso
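
As a concrete illustration of the relaxation step in configuration space, the following minimal sketch implements a forward model for a two-link planar arm and an inverse mapping obtained by Jacobian-transpose relaxation; the link lengths, gain and iteration count are illustrative assumptions, and the authors' FMC/IMC pair (closed-form or ANN-based) handles the general redundant, constrained case:

import numpy as np

L1, L2 = 1.0, 0.8  # link lengths (illustrative)

def fmc(q):
    # Forward model: joint angles -> end-effector position.
    return np.array([L1 * np.cos(q[0]) + L2 * np.cos(q[0] + q[1]),
                     L1 * np.sin(q[0]) + L2 * np.sin(q[0] + q[1])])

def jacobian(q):
    s1, s12 = np.sin(q[0]), np.sin(q[0] + q[1])
    c1, c12 = np.cos(q[0]), np.cos(q[0] + q[1])
    return np.array([[-L1 * s1 - L2 * s12, -L2 * s12],
                     [ L1 * c1 + L2 * c12,  L2 * c12]])

def imc_relax(target, q0, gain=0.2, iters=300):
    # Inverse model as relaxation in configuration space: descend the
    # end-effector error through the Jacobian transpose.
    q = np.array(q0, dtype=float)
    for _ in range(iters):
        q += gain * jacobian(q).T @ (target - fmc(q))
    return q

q = imc_relax(np.array([1.2, 0.5]), [0.3, 0.3])
print(fmc(q))  # close to the target if it is reachable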

A Computational Model for Multiple Goals

The paper discusses a computational model suitable for the monitoring and execution of multiple co-existing goals inside an autonomous agent. The model uses a number of mechanisms to dynamically calculate goal priority. We provide an overview of the model, a discussion of a Cognitive Agent architecture that includes it, and results that support the current design. We conclude with a discussion of the results, points of interest and future work.

Stathis Kasderidis

Neural Dynamics and Complex Systems

Detection of a Dynamical System Attractor from Spike Train Analysis

The dynamics of the activity of neuronal networks have been intensively studied from the viewpoint of nonlinear dynamical systems. The neuronal activities are recorded as multivariate time series of the epochs of spike occurrences – the spike trains – which are often affected by intrinsic and measurement noise. The spike trains can be considered a mixture of realizations of deterministic and stochastic processes. Within this framework we considered several simulated spike trains derived from the Zaslavskii map with additive noise. The ensemble of all preferred firing sequences detected by the pattern grouping algorithm (PGA) in the noisy spike trains forms a new time series that retains the dynamics of the original mapping.

Yoshiyuki Asai, Takashi Yokoi, Alessandro E. P. Villa

Recurrent Neural Networks Are Universal Approximators

Neural networks represent a class of functions for the efficient identification and forecasting of dynamical systems. It has been shown that feedforward networks are able to approximate any (Borel-)measurable function on a compact domain [1,2,3]. Recurrent neural networks (RNNs) have been developed for a better understanding and analysis of open dynamical systems. Compared to feedforward networks they have several advantages, which have been discussed extensively in several papers and books, e.g. [4]. Still, the question often arises whether RNNs are able to map every open dynamical system, which would be desirable for a broad spectrum of applications. In this paper we give a proof of the universal approximation ability of RNNs in state space model form. The proof is based on the work of Hornik, Stinchcombe, and White on feedforward neural networks [1].

Anton Maximilian Schäfer, Hans Georg Zimmermann
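
For reference, the state space model form in which such results are usually stated can be written as follows (standard notation; the paper's exact symbols may differ):

\[ s_{t+1} = f(A s_t + B x_t), \qquad y_t = C s_t \]

where x_t is the external input, s_t the internal state, f a sigmoidal activation function and A, B, C weight matrices. Universal approximation then means that any open dynamical system s_{t+1} = g(s_t, x_t), y_t = h(s_t) can be reproduced arbitrarily well on a compact domain by a network of this form.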

A Discrete Adaptive Stochastic Neural Model for Constrained Optimization

The ability to map combinatorial optimization problems with constraints onto neural networks and solve them there has frequently motivated proposals for using such a model of computation.

We introduce a new stochastic neural model, tailored to a specific class of constraints, which adaptively chooses its weights in order to find solutions within a proper subspace (the feasible region) of the search space.

We show its asymptotic convergence properties and give evidence of its ability to find high-quality solutions on benchmark and randomly generated instances of a specific problem.

Giuliano Grossi

Quantum Perceptron Network

A novel neural network, the quantum perceptron network (QPN), is presented, built upon a combination of the classical perceptron network and quantum computing. By adequately exploiting quantum phase, the quantum perceptron network attains computing power that the conventional perceptron is unable to realize. Performance analysis and simulation of test cases show that a quantum perceptron with only one neuron can realize the XOR function, which is unrealizable with a single-neuron classical perceptron. A simple network structure can thus achieve a comparatively complicated network function, which should strongly influence the fields of artificial intelligence and control engineering.

Rigui Zhou, Ling Qin, Nan Jiang
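
To see how phase can lend a single unit more separating power than a real-valued perceptron, consider the following toy construction of our own (not the authors' QPN): each bit is encoded as a unit phasor, and constructive versus destructive interference separates XOR with one neuron:

import numpy as np

def phase_neuron(x1, x2, w=(1.0, 1.0), theta=1.0):
    # Encode each bit as a phase: 0 -> e^{i*0} = +1, 1 -> e^{i*pi} = -1.
    z = w[0] * np.exp(1j * np.pi * x1) + w[1] * np.exp(1j * np.pi * x2)
    # Equal bits interfere constructively (|z| = 2), unequal bits cancel (|z| = 0).
    return int(abs(z) < theta)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, phase_neuron(a, b))  # reproduces the XOR truth table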

Critical Echo State Networks

We are interested in the optimization of the recurrent connection structure of Echo State Networks (ESNs), because their topology can strongly influence performance. We study ESN predictive capacity by numerical simulations on Mackey-Glass time series, and find that a particular small subset of ESNs is much better than ordinary ESNs provided that the topology of the recurrent feedback connections satisfies certain conditions. We argue that the small subset separates two large sets of ESNs and this separation can be characterized in terms of phase transitions. With regard to the criticality of this phase transition, we introduce the notion of Critical Echo State Networks (CESN). We discuss why CESNs perform better than other ESNs.

Márton Albert Hajnal, András Lőrincz
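
For readers unfamiliar with the baseline being optimized, a minimal "ordinary" ESN for one-step Mackey-Glass prediction looks roughly as follows; the reservoir size, sparsity, spectral radius and ridge parameter are illustrative choices, and the paper's critical networks impose additional structure on the recurrent matrix W:

import numpy as np

rng = np.random.default_rng(0)

def mackey_glass(n, tau=17, dt=1.0):
    # Simple Euler discretization of the Mackey-Glass delay equation.
    x = 1.2 * np.ones(n + tau)
    for t in range(tau, n + tau - 1):
        x[t + 1] = x[t] + dt * (0.2 * x[t - tau] / (1 + x[t - tau] ** 10) - 0.1 * x[t])
    return x[tau:]

data = mackey_glass(3000)
N, rho, washout = 300, 0.95, 200
W = rng.normal(size=(N, N)) * (rng.random((N, N)) < 0.05)   # sparse reservoir
W *= rho / np.max(np.abs(np.linalg.eigvals(W)))             # set spectral radius
W_in = rng.uniform(-0.5, 0.5, size=N)

def run(u):
    s, states = np.zeros(N), []
    for ut in u:
        s = np.tanh(W @ s + W_in * ut)
        states.append(s.copy())
    return np.array(states)[washout:]

train = data[:2000]
S = run(train[:-1])                          # reservoir states for each input
y = train[washout + 1:]                      # one-step-ahead targets
W_out = np.linalg.solve(S.T @ S + 1e-6 * np.eye(N), S.T @ y)  # ridge readout
print("train MSE:", np.mean((S @ W_out - y) ** 2))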

Rapid Correspondence Finding in Networks of Cortical Columns

We describe a neural network able to rapidly establish correspondence between neural fields. The network is based on a cortical columnar model described earlier. It realizes dynamic links with the help of specialized columns that evaluate similarities between the activity distributions of local feature cell populations, are subject to a topology constraint, and gate the transfer of feature information between the neural fields. Correspondence finding requires little time (estimated to 10-40 ms in physiological terms) and is robust to noise in feature signals.

Jörg Lücke, Christoph von der Malsburg

Adaptive Thresholds for Layered Neural Networks with Synaptic Noise

The inclusion of a macroscopic adaptive threshold is studied for the retrieval dynamics of layered feedforward neural network models with synaptic noise. It is shown that if the threshold is chosen appropriately as a function of the cross-talk noise and of the activity of the stored patterns, adapting itself automatically in the course of the recall process, an autonomous functioning of the network is guaranteed. This self-control mechanism considerably improves the quality of retrieval, in particular the storage capacity, the basins of attraction and the mutual information content.

D. Bollé, R. Heylen

Backbone Structure of Hairy Memory

This paper presents a new memory for the Hopfield model that remedies many drawbacks of the model, such as limited loading capacity, limit cycles and poor error tolerance. This memory is derived from the hairy model [15]. The paper also constructs a training process to further balance the vulnerable memory parts and improve the memory.

Cheng-Yuan Liou

Dynamics of Citation Networks

The aim of this paper is to give theoretical and experimental tools for measuring the driving force in evolving complex networks. First a discrete-time stochastic model framework is introduced to state the question of how the dynamics of these networks depend on the properties of the parts of the system. Then a method is presented to determine this dependence, given the required data about the system. This measurement method is applied to the citation network of high energy physics papers to extract the in-degree and age dependence of the dynamics. It is shown that the method yields close to “optimal” results.

Gábor Csárdi
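
A simplified version of such a measurement, estimating only the in-degree dependence of the attachment rate (the paper also conditions on age), can be sketched as follows; the event format and estimator below are our own illustrative assumptions:

from collections import Counter, defaultdict

def attachment_kernel(events):
    # events: time-ordered list of (new_paper_id, [cited_paper_ids]).
    # Returns the estimated relative rate A(k) at which papers currently
    # of in-degree k attract new citations (hits normalized by exposure).
    indeg, papers = Counter(), set()
    hits, exposure = defaultdict(int), defaultdict(int)
    for paper, refs in events:
        if refs and papers:
            degs = Counter(indeg[p] for p in papers)
            for k, n in degs.items():
                exposure[k] += n * len(refs)
            for r in refs:
                hits[indeg[r]] += 1
                indeg[r] += 1
        papers.add(paper)
    return {k: hits[k] / exposure[k] for k in hits if exposure[k]}

events = [(1, []), (2, [1]), (3, [1, 2]), (4, [1, 3])]
print(attachment_kernel(events))  # on real data, A(k) growing with k signals preferential attachment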

Computational Neuroscience

Processing of Information in Synchronously Firing Chains in Networks of Neurons

The Abeles model of cortical activity assumes that, in the absence of stimulation, neural activity can to zeroth order be described by a Poisson process. Here the model is extended to describe information processing by synfire chains within a network whose activity is uncorrelated with the synfire chain. A quantitative derivation of the transfer function from this concept is given.

Jens Christian Claussen
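
The zero-order background assumption can be made concrete by generating homogeneous Poisson spike trains; a minimal sketch, with arbitrary illustrative rate and duration:

import numpy as np

def poisson_train(rate_hz, duration_s, seed=0):
    # Exponential inter-spike intervals yield a homogeneous Poisson process.
    rng = np.random.default_rng(seed)
    n_max = int(rate_hz * duration_s * 3) + 10   # generous bound on spike count
    spikes = np.cumsum(rng.exponential(1.0 / rate_hz, size=n_max))
    return spikes[spikes < duration_s]

train = poisson_train(rate_hz=10.0, duration_s=5.0)
print(len(train))  # close to rate * duration on average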

Phase Precession and Recession with STDP and Anti-STDP

We show that standard, Hebbian spike-timing-dependent plasticity (STDP) induces the precession of the firing phase of neurons in oscillatory networks, while anti-Hebbian STDP induces phase recession. In networks that are subject to oscillatory inhibition, the intensity of the excitatory input relative to the inhibitory one determines whether the phase can precess due to STDP or whether the phase is fixed. This phenomenon offers a very simple explanation of the experimentally observed hippocampal phase precession. Modulation of STDP can lead, through precession and recession, to the synchronization of the firing of a trained neuron to a target phase.

Răzvan V. Florian, Raul C. Mureşan
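
The asymmetric pair-based STDP window underlying these effects is standard; a minimal sketch, with typical illustrative amplitudes and time constant, and with the anti-Hebbian rule taken simply as the mirrored window:

import numpy as np

def stdp(dt, a_plus=0.010, a_minus=0.012, tau=20.0, hebbian=True):
    # Weight change for one pre/post spike pair; dt = t_post - t_pre (ms).
    # Pre-before-post (dt > 0) potentiates, post-before-pre depresses.
    dw = a_plus * np.exp(-dt / tau) if dt >= 0 else -a_minus * np.exp(dt / tau)
    return dw if hebbian else -dw     # anti-STDP mirrors the window

for dt in (-20.0, -5.0, 5.0, 20.0):
    print(dt, stdp(dt), stdp(dt, hebbian=False))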

Visual Pathways for Detection of Landmark Points

We describe a multi-layered neural architecture that extracts landmark points of high curvature from 2D shapes and resembles the visual pathway of primates. We demonstrate how the rotated, orientation-specific receptive fields of the simple neurons discovered by Hubel and Wiesel can perform landmark point detection on the 2D contour of the shape projected on the retina of the eye. Detecting landmark points of high curvature is a trivial task with sophisticated machine equipment, but we demonstrate how it can be accomplished using only the hardware of the primate visual cortex, abiding by the discoveries of Hubel and Wiesel regarding the rotated arrangements of orientation-specific simple neurons. The proposed layered architecture first extracts the two-dimensional shape from the projection on the retina and then rotates the extracted shape in multiple layers in order to detect the landmark points. Since rotating the image about the focal origin is equivalent to rotating the simple cells' orientation field, our model offers an explanation of the mystery of the arrangement of the cortical cells in layers 2 and 3 on the basis of shape cognition from landmark points.

Konstantinos Raftopoulos, Nikolaos Papadakis, Klimis Ntalianis

A Model of Grid Cells Based on a Path Integration Mechanism

The grid cells of the dorsocaudal medial entorhinal cortex (dMEC) in rats show higher firing rates when the position of the animal correlates with the vertices of regular triangular tessellations covering the environment. Strong evidence indicates that these neurons are part of a path integration system. This raises the question of how such a system could be implemented in the brain. Here, we present a cyclically connected artificial neural network based on a path integration mechanism, implementing grid cells on a simulated mobile agent. Our results show that the synaptic connectivity of the network, which can be represented by a twisted torus, allows the generation of regular triangular grids across the environment. These tessellations share the same spacing and orientation, like neighboring grid cells in the dMEC. A simple gain and bias mechanism allows the spacing and orientation of the grids to be controlled, which suggests that these different characteristics can be generated by a unique algorithm in the brain.

Alexis Guanella, Paul F. M. J. Verschure

Temporal Processing in a Spiking Model of the Visual System

An increasing amount of evidence suggests that the brain has the mechanisms necessary to generate and process temporal information from the very early stages of sensory pathways, and indeed does so. This paper presents a novel biologically motivated model of the visual system based on temporal encoding of the visual stimuli and temporally precise lateral geniculate nucleus (LGN) spikes. The work investigates whether such a network can be developed using an extended type of integrate-and-fire neuron (ADDS) and trained to recognise objects of different shapes using STDP learning. The experimental results contribute further support to the argument that temporal encoding can provide a mechanism for representing information in the visual system and has the potential to complement firing-rate-based architectures toward building more realistic and powerful models.

Christo Panchev

Accelerating Event Based Simulation for Multi-synapse Spiking Neural Networks

The simulation of large spiking neural networks (SNNs) is still a very time-consuming task. Therefore most simulations are limited to unrealistically small or medium-sized networks (typically hundreds of neurons). In this paper, some methods for the fast simulation of large SNNs are discussed. Our results show, among other things, that event-based simulation is an efficient way of simulating SNNs, although not all neuron models are suited to an event-based approach. We compare some models and discuss several techniques for accelerating the simulation of more complex models. Finally we present an algorithm that is able to handle multi-synapse models efficiently.

Michiel D’Haene, Benjamin Schrauwen, Dirk Stroobandt
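
The core of an event-driven simulator is a priority queue of spike-delivery events, with neuron state updated analytically between events. The sketch below covers the easy case of a single-exponential integrate-and-fire model (all parameters and data structures are illustrative); it is precisely the more complex multi-synapse models that need the acceleration techniques discussed in the paper:

import heapq
import math

class EventDrivenLIF:
    # Membrane potentials are only updated when an event arrives, using
    # the analytic exponential decay since the neuron's last event.
    def __init__(self, n, tau=20.0, v_th=1.0):
        self.v, self.last = [0.0] * n, [0.0] * n
        self.tau, self.v_th = tau, v_th

    def run(self, events, synapses):
        # events: initial (time, neuron, weight) tuples;
        # synapses[i]: list of (target, weight, delay) for neuron i.
        q = list(events)
        heapq.heapify(q)
        spikes = []
        while q:
            t, i, w = heapq.heappop(q)
            self.v[i] = self.v[i] * math.exp(-(t - self.last[i]) / self.tau) + w
            self.last[i] = t
            if self.v[i] >= self.v_th:
                self.v[i] = 0.0
                spikes.append((t, i))
                for j, wij, d in synapses.get(i, []):
                    heapq.heappush(q, (t + d, j, wij))
        return spikes

net = EventDrivenLIF(n=3)
syn = {0: [(1, 1.2, 1.0), (2, 0.6, 2.0)], 1: [(2, 0.6, 1.0)]}
print(net.run([(0.0, 0, 1.2)], syn))  # a small causal cascade of spikes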

A Neurocomputational Model of an Imitation Deficit Following Brain Lesion

This paper investigates the neural mechanisms of visuo-motor imitation in humans through convergent evidence from neuroscience. In particular, we consider a deficit in imitation following callosal brain lesion, based on the rationale that looking at how imitation is impaired can unveil its underlying neural principles. We ground the functional architecture and information flow of our model in brain imaging studies and use findings from monkey brain neurophysiological studies to drive the choice of implementation of our processing modules. Our neural model of visuo-motor imitation is based on self-organizing maps with associated activities. Patterns of impairment of the model, realized by adding uncertainty in the transfer of information between the networks, account for the scores found in a clinical examination of imitation [1]. The model also allows several interesting predictions.

Biljana Petreska, Aude G. Billard

Temporal Data Encoding and Sequence Learning with Spiking Neural Networks

Sequence Learning using a Spiking Neural Network (SNN) was performed. An SNN is a type of Artificial Neural Network (ANN) that uses input signal arrival time information to process temporal data. An SNN can learn not only combinational inputs but also sequential inputs over some limited amount of time without using a recurrent network. Music melodies were encoded using unit amplitude spikes having various inter-spike interval times. These spikes were then fed into an SNN learning system. The SNN learning system was able to recognize various melodies after learning. The SNN could identify the original and noise-added melody versions properly in most cases.

Robert H. Fujii, Kenjyu Oozeki

Neural Control, Reinforcement Learning and Robotics Applications

Optimal Tuning of Continual Online Exploration in Reinforcement Learning

This paper presents a framework that allows continual exploration to be tuned in an optimal way. It first quantifies the rate of exploration by defining the degree of exploration of a state as the entropy of the probability distribution for choosing an admissible action. Then, the exploration/exploitation tradeoff is stated as a global optimization problem: find the exploration strategy that minimizes the expected cumulated cost while maintaining fixed degrees of exploration at the nodes. In other words, “exploitation” is maximized for constant “exploration”. This formulation leads to a set of nonlinear updating rules reminiscent of the value-iteration algorithm. Convergence of these rules to a local minimum can be proved for a stationary environment. Interestingly, in the deterministic case, when there is no exploration, these equations reduce to the Bellman equations for finding the shortest path while, when exploration is maximal, a full “blind” exploration is performed.

Youssef Achbany, Francois Fouss, Luh Yen, Alain Pirotte, Marco Saerens
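
The degree of exploration can be made concrete with a Boltzmann policy whose temperature is tuned so that the entropy of the action distribution matches a prescribed value; the bisection scheme below is our own illustrative device, not the paper's updating rules:

import numpy as np

def softmax(q, temp):
    z = np.exp((q - q.max()) / temp)
    return z / z.sum()

def entropy(p):
    return -np.sum(p * np.log(p + 1e-12))

def policy_with_degree(q, target_entropy, lo=1e-3, hi=1e3):
    # Entropy grows monotonically with temperature, so bisect on it.
    for _ in range(60):
        mid = np.sqrt(lo * hi)
        if entropy(softmax(q, mid)) < target_entropy:
            lo = mid
        else:
            hi = mid
    return softmax(q, np.sqrt(lo * hi))

p = policy_with_degree(np.array([1.0, 0.5, 0.1]), target_entropy=0.7)
print(p, entropy(p))  # a policy whose degree of exploration is ~0.7 nats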

Vague Neural Network Controller and Its Applications

Fuzzy neural networks are promising intelligent systems that combine artificial neural networks and fuzzy logic. However, fuzzy neural networks have a shortcoming: since a fuzzy membership function has only a single value, they cannot produce sufficiently refined classification and recognition results. Vague set theory, a generalization of fuzzy set theory, is distinguished by having both a truth-membership function and a false-membership function, and thus represents both of the opposing factors needed to deal with the nonlinearities and uncertainties of control systems. This paper develops a vague neural network controller based on vague set theory; the controller combines the advantage of vague sets in handling uncertain information with the learning capability of artificial neural networks. Moreover, the character of the vague neural network controller is demonstrated in an application to the inverted pendulum.

Yibiao Zhao, Rui Fang, Shun Zhang, Siwei Luo

Parallel Distributed Profit Sharing for PC Cluster

This paper presents a parallel reinforcement learning method that takes communication cost into account. In our method, each agent communicates only action sequences, at a constant episode interval. The longer the communication interval, the smaller the communication cost, but the lower the parallelism. Implementing our method on a PC cluster, we investigate this trade-off. We show that the computation time for learning can be reduced by properly adjusting the communication interval.

Takuya Fujishiro, Hidehiro Nakano, Arata Miyauchi

Feature Extraction for Decision-Theoretic Planning in Partially Observable Environments

In this article, we propose a feature extraction technique for decision-theoretic planning problems in partially observable stochastic domains and show a novel approach for solving them. To maximize the expected future reward, all the agent has to do is estimate a Markov chain over a statistic related to rewards. In our approach, an auxiliary state variable whose stochastic process satisfies the Markov property, called the internal state, is introduced into the model under the assumption that the rewards depend on the pair of an internal state and an action. The agent then estimates the dynamics of an internal state model based on maximum likelihood inference made while acquiring its policy; the internal state model represents an essential feature necessary for decision-making. Computer simulation results show that our technique can find an appropriate feature for acquiring a good policy, and can achieve faster learning with fewer policy parameters than a conventional algorithm in a reasonably sized partially observable problem.

Hajime Fujita, Yutaka Nakamura, Shin Ishii

Reinforcement Learning with Echo State Networks

Function approximators are often used in reinforcement learning tasks with large or continuous state spaces. Artificial neural networks, among them recurrent neural networks, are popular function approximators, especially in tasks where some kind of memory is needed, as in real-world partially observable scenarios. However, convergence guarantees for such methods are rarely available. Here, we propose a method using a class of novel RNNs, the echo state networks. Proof of convergence to a bounded region is provided for k-order Markov decision processes. Runs on POMDPs were performed to test and illustrate the working of the architecture.

István Szita, Viktor Gyenes, András Lőrincz

Reward Function and Initial Values: Better Choices for Accelerated Goal-Directed Reinforcement Learning

An important issue in Reinforcement Learning (RL) is to accelerate or improve the learning process. In this paper, we study the influence of some RL parameters on the learning speed. Indeed, although RL convergence properties have been widely studied, no precise rules exist for correctly choosing the reward function and initial Q-values. Our method guides the choice of these RL parameters within the context of reaching a goal in minimal time. We develop a theoretical study and also provide experimental justification for choosing, on the one hand, the reward function, and on the other hand, particular initial Q-values based on a goal bias function.

Laëtitia Matignon, Guillaume J. Laurent, Nadine Le Fort-Piat
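
The interaction between the reward function and the initial Q-values can be probed with a tabular sketch such as the one below; the corridor task, parameter values and initializations are our own illustrative assumptions (note that with a reward of -1 per step, even an all-zero initialization is optimistic relative to the true negative values, which by itself encourages exploration):

import numpy as np

def chain(s, a):
    # Hypothetical corridor: reach state 9 in minimal time, reward -1 per step.
    s2 = min(s + 1, 9) if a == 1 else max(s - 1, 0)
    return s2, -1.0, s2 == 9

def q_learning(q0, episodes=100, gamma=0.9, alpha=0.1, eps=0.1, seed=0):
    rng = np.random.default_rng(seed)
    Q, steps = q0.copy(), []
    for _ in range(episodes):
        s, done, t = 0, False, 0
        while not done:
            a = int(rng.integers(2)) if rng.random() < eps else int(Q[s].argmax())
            s2, r, done = chain(s, a)
            target = r + (0.0 if done else gamma * Q[s2].max())
            Q[s, a] += alpha * (target - Q[s, a])
            s, t = s2, t + 1
        steps.append(t)
    return steps

# Neutral versus goal-biased initial values (biased = less negative near the goal):
neutral = q_learning(np.zeros((10, 2)))
biased = q_learning(np.tile(-np.arange(10.0, 0.0, -1.0)[:, None], (1, 2)))
print(np.mean(neutral[:20]), np.mean(biased[:20]))  # early-episode step counts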

Nearly Optimal Exploration-Exploitation Decision Thresholds

While trading off exploration and exploitation in reinforcement learning is hard in general, relatively simple solutions exist under some formulations. Optimal decision thresholds for the multi-armed bandit problem are derived, one for the infinite horizon discounted reward case and one for the finite horizon undiscounted reward case, which make the link between the reward horizon, uncertainty and the need for exploration explicit. From this result follow two practical approximate algorithms, which are illustrated experimentally.

Christos Dimitrakakis
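
The flavour of such a rule can be conveyed by the following hypothetical sketch, which exploits the empirically best arm once its estimated lead over the runner-up exceeds an uncertainty term that shrinks with the remaining horizon, and explores otherwise; this is a stand-in of our own devising, not the thresholds actually derived in the paper:

import numpy as np

def choose_arm(counts, means, t, horizon, c=1.0):
    # counts/means: per-arm pull counts and empirical mean rewards.
    order = np.argsort(means)[::-1]
    best, second = int(order[0]), int(order[1])
    gap = means[best] - means[second]
    slack = c * np.sqrt(np.log(max(horizon - t, 2)) / max(counts[best], 1))
    # Exploit when the gap is decisive relative to the remaining uncertainty,
    # otherwise explore the least-sampled arm.
    return best if gap > slack else int(np.argmin(counts))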

Dual Adaptive ANN Controllers Based on Wiener Models for Controlling Stable Nonlinear Systems

This paper presents two nonlinear adaptive predictive algorithms based on an Artificial Neural Network (ANN) and a Wiener structure for controlling asymptotically stable nonlinear plants. The first algorithm is based on the minimization of a cost function taking into account the future tracking error and on the Certainty Equivalence (CE) principle, under which the estimated parameters are used as if they were the true parameters. In order to improve the performance of the adaptive algorithm, we propose a cost function considering not only the future tracking error but also the effect of the control signal on the estimated parameters. A simulated chemical reactor example illustrates the performance and feasibility of both approaches.

D. Sbarbaro

Online Stabilization of Chaotic Maps Via Support Vector Machines Based Generalized Predictive Control

In this study, the previously proposed Online Support Vector Machines Based Generalized Predictive Control method [1] is applied to the problem of stabilizing discrete-time chaotic systems with small parameter perturbations. The method combines the Accurate Online Support Vector Regression (AOSVR) algorithm [2] with the Support Vector Machines Based Generalized Predictive Control (SVM-Based GPC) approach [3] and thus provides a powerful scheme for controlling chaotic maps in an adaptive manner. The simulation results on chaotic maps have revealed that Online SVM-Based GPC provides an excellent online stabilization performance and maintains it when some measurement noise is added to output of the underlying map.

Serdar Iplikci
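
The overall loop can be sketched as follows, with a batch support vector regressor standing in for the online AOSVR model and a one-step grid search standing in for the full GPC optimization; the controlled logistic map, its parameters and the control bounds are illustrative assumptions:

import numpy as np
from sklearn.svm import SVR

a = 3.9
x_star = 1 - 1 / a                        # unstable fixed point to stabilize
def plant(x, u):
    return a * x * (1 - x) + u            # logistic map with additive control

rng = np.random.default_rng(0)
X = np.column_stack([rng.uniform(0, 1, 300),          # states x_t
                     rng.uniform(-0.1, 0.1, 300)])    # controls u_t
y = np.array([plant(x, u) for x, u in X])
model = SVR(kernel="rbf", C=100.0, epsilon=0.005).fit(X, y)

u_grid = np.linspace(-0.1, 0.1, 41)
x = 0.3
for t in range(30):
    # Pick the control whose predicted next state is closest to the target.
    preds = model.predict(np.column_stack([np.full_like(u_grid, x), u_grid]))
    u = u_grid[int(np.argmin((preds - x_star) ** 2))]
    x = plant(x, u)
print(abs(x - x_star))  # small if the stabilization succeeded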

Robotics, Control, Planning

Morphological Neural Networks and Vision Based Mobile Robot Navigation

Morphological Associative Memories (MAM) have been proposed for image denoising and pattern recognition. We have shown that they can be applied to other domains, such as image retrieval and unsupervised hyperspectral image segmentation. In both cases the key idea is that the selective sensitivity of Morphological Autoassociative Memories (MAAM) to erosive and dilative noise can be applied to detect morphological independence between patterns. The convex coordinates obtained by linear unmixing based on the sets of morphologically independent patterns define a feature extraction process. These features may be useful for pattern classification. We present some results on the task of visual landmark recognition for a mobile robot self-localization task.

I. Villaverde, M. Graña, A. d’Anjou
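
A minimal sketch of a Ritter-style morphological autoassociative memory illustrates the noise selectivity being exploited: the min-memory W below recalls stored patterns exactly and corrects erosive noise, while overshooting under dilative noise (the dual max-memory behaves oppositely); the tiny patterns are illustrative:

import numpy as np

def build_W(X):
    # Min-memory: w_ij = min over stored patterns of (x_i - x_j).
    return (X[:, :, None] - X[:, None, :]).min(axis=0)

def recall(W, x):
    # Max-plus product: y_i = max_j (w_ij + x_j).
    return (W + x[None, :]).max(axis=1)

X = np.array([[0.0, 3.0, 1.0],
              [2.0, 0.0, 2.0]])
W = build_W(X)
print(recall(W, X[0]))                          # exact recall of a stored pattern
print(recall(W, X[0] - np.array([1, 0, 0])))    # erosive noise is corrected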

Position Control Based on Static Neural Networks of Anthropomorphic Robotic Fingers

A dynamic neurocontroller for positioning robot manipulators with a tendon-driven transmission system has been developed, allowing desired trajectories to be tracked and external disturbances to be rejected. The controller is characterised as providing motor torques rather than joint torques. In this sense, the redundancy problem associated with tendon-driven transmission systems is solved using neural networks that learn the linear transformation mapping motor torques into joint torques. The neurocontroller learns not only the dynamics associated with the robot manipulator but also the parameters attached to the transmission system, such as pulley radii. A theorem relying on Lyapunov theory has been developed, guaranteeing the uniformly ultimately bounded stability of the whole system and providing both the control laws and the weight-update laws.

Juan Ignacio Mulero-Martínez, Francisco García-Córdova, Juan López-Coronado

Learning Multiple Models of Non-linear Dynamics for Control Under Varying Contexts

For stationary systems, efficient techniques for adaptive motor control exist which learn the system’s inverse dynamics online and use this single model for control. However, in realistic domains the system dynamics often change depending on an external unobserved context, for instance the work load of the system or contact conditions with other objects. A solution to context-dependent control is to learn multiple inverse models for different contexts and to infer the current context by analyzing the experienced dynamics. Previous multiple model approaches have only been tested on linear systems. This paper presents an efficient multiple model approach for non-linear dynamics, which can bootstrap context separation from context-unlabeled data and realizes simultaneous online context estimation, control, and training of multiple inverse models. The approach formulates a consistent probabilistic model used to infer the unobserved context and uses Locally Weighted Projection Regression as an efficient online regressor which provides local confidence bounds estimates used for inference.

Georgios Petkos, Marc Toussaint, Sethu Vijayakumar
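
The probabilistic context inference can be illustrated with a MOSAIC-style responsibility update, where each candidate context is scored by how well its model predicts the experienced dynamics; the Gaussian error likelihood and the numbers below are illustrative stand-ins for the paper's LWPR-based formulation, in which sigma would come from the regressor's local confidence bounds:

import numpy as np

def infer_context(pred_errors, prior, sigma):
    # p(c | error) is proportional to p(error | c) p(c), with Gaussian likelihoods.
    like = np.exp(-0.5 * (pred_errors / sigma) ** 2)
    post = like * prior
    return post / post.sum()

prior = np.ones(3) / 3
for err in ([0.9, 0.1, 0.8], [0.7, 0.2, 0.9]):  # model 2's predictions fit best
    prior = infer_context(np.array(err), prior, sigma=0.3)
print(prior)  # responsibility concentrates on the matching context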

A Study on Optimal Configuration for the Mobile Manipulator: Using Weight Value and Mobility

A mobile manipulator is inherently redundant. Using its redundant degrees of freedom, a mobile manipulator can perform various tasks. In this paper, to improve task-execution efficiency by exploiting the redundancy, optimal configurations of the mobile manipulator are maintained while it is moving to a new task point. Using a cost function for optimality, defined as a combination of the squared errors between the desired and actual configurations of the mobile robot and of the task robot, the job that the mobile manipulator performs is optimized. The proposed algorithm is experimentally verified and discussed on a mobile manipulator, PURL-II.

Jin-Gu Kang, Kwan-Houng Lee

VSC Perspective for Neurocontroller Tuning

Compact representation of knowledge with strong internal interactions has become possible with the developments in neurocomputing and neural information processing. The field of neural networks has offered various solutions for complex problems; however, the problems associated with learning performance have constituted a major drawback in terms of realization performance and computational requirements. This paper discusses the use of variable structure systems theory in the learning process. The objective is to incorporate the robustness of the approach into the training dynamics and to ensure stability in the adjustable parameter space. The results discussed demonstrate the fulfillment of the design specifications and display how the strength of a robust control scheme can be an integral part of a learning system. This paper discusses how Gaussian radial basis function neural networks can be utilized to drive a mechatronic system's behavior into a predefined sliding regime, and the results are seen to be promising.

Mehmet Önder Efe

A Neural Network Module with Pretuning for Search and Reproduction of Input-Output Mapping

A neural network that uses a pretuning procedure for function approximation is presented. Unlike traditional neural network algorithms, in which the changeable parameters are the multiplicative weights of connections between neurons, the pretuning procedure deals with the additive thresholds of the interneurons of the proposed network and consists in a dynamic combinatorial inhibition of these neurons. It is shown that in this case the neural network can combine local approximation with distributed activation. The usefulness of the neural network with pretuning (NNP) for the tasks of search and reproduction of a robot's sensorimotor mapping is briefly discussed.

Igor Shepelev

Bio-inspired Neural Network On-Chip Implementation and Applications (Special Session)

Physical Mapping of Spiking Neural Networks Models on a Bio-inspired Scalable Architecture

The paper deals with the physical implementation of biologically plausible spiking neural network models onto a hardware architecture with bio-inspired capabilities. After presenting the model, the work will illustrate the major steps taken in order to provide a compact and efficient digital hardware implementation of the model. Special emphasis will be given to the scalability features of the architecture, that will permit the implementation of large-scale networks. The paper will conclude with details about the physical mapping of the model, as well as with experimental results obtained when applying dynamic input stimuli to the implemented network.

J. Manuel Moreno, Javier Iglesias, Jan L. Eriksson, Alessandro E. P. Villa

A Time Multiplexing Architecture for Inter-neuron Communications

This paper presents a hardware implementation of a Time Multiplexing Architecture (TMA) that can interconnect arrays of neurons in an Artificial Neural Network (ANN) using a single metal wire. The approach exploits the relative slow operational speed of the biological system by using fast digital hardware to sequentially sample neurons in a layer and transmit the associated spikes to neurons in other layers. The motivation for this work is to develop minimal area inter-neuron communication hardware. An estimate of the density of on-chip neurons afforded by this approach is presented. The paper verifies the operation of the TMA and investigates pulse transmission errors as a function of the sampling rate. Simulations using the Xilinx System Generator (XSG) package demonstrate that the effect of these errors on the performance of an SNN, pre-trained to solve the XOR problem, is negligible if the sampling frequency is sufficiently high.

Fergal Tuffy, Liam McDaid, Martin McGinnity, Jose Santos, Peter Kelly, Vunfu Wong Kwan, John Alderman

Neuronal Cell Death and Synaptic Pruning Driven by Spike-Timing Dependent Plasticity

The embryonic nervous system is refined over the course of development as a result of two main processes: apoptosis (programmed cell death) and selective axon pruning. We simulated a large-scale spiking neural network characterized by an initial apoptotic phase, driven by an excessive firing rate, followed by the onset of spike-timing-dependent plasticity (STDP), driven by spatiotemporal patterns of stimulation. In the apoptotic phase the cell death affected the inhibitory more than the excitatory units. The network activity stabilized such that recurrent preferred firing sequences appeared during the STDP phase, suggesting the emergence of cell assemblies from large randomly connected networks.

Javier Iglesias, Alessandro E. P. Villa

Effects of Analog-VLSI Hardware on the Performance of the LMS Algorithm

Device mismatch, charge leakage and nonlinear transfer functions limit the resolution of analog-VLSI arithmetic circuits and degrade the performance of neural networks and adaptive filters built with this technology. We present an analysis of the impact of these issues on the convergence time and residual error of a linear perceptron using the Least-Mean-Square (LMS) algorithm. We also identify design tradeoffs and derive guidelines to optimize system performance while minimizing circuit die area and power dissipation.

Gonzalo Carvajal, Miguel Figueroa, Seth Bridges
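
For reference, the ideal floating-point update being analyzed is the textbook LMS rule; in the sketch below, the leak and resolution arguments are illustrative stand-ins for two of the analog non-idealities studied (charge leakage and finite weight resolution), not the paper's exact circuit model:

import numpy as np

def lms(x, d, n_taps=4, mu=0.01, leak=0.0, resolution=None):
    # w <- (1 - leak) * w + mu * e * x, optionally quantized.
    w = np.zeros(n_taps)
    for n in range(n_taps - 1, len(x)):
        xn = x[n - n_taps + 1:n + 1][::-1]     # most recent sample first
        e = d[n] - w @ xn                      # instantaneous error
        w = (1.0 - leak) * w + mu * e * xn
        if resolution is not None:
            w = np.round(w / resolution) * resolution
    return w

rng = np.random.default_rng(0)
x = rng.normal(size=5000)
h = np.array([0.8, -0.4, 0.2, 0.1])            # unknown system to identify
d = np.convolve(x, h)[:len(x)]
print(lms(x, d))                               # converges towards h
print(lms(x, d, leak=1e-3, resolution=0.05))   # degraded by the non-idealities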

A Portable Electronic Nose (E-Nose) System Based on PDA

The electronic nose (e-nose) has been used in food investigation and quality control in industry. Recently it has found applications in medical diagnosis and environmental monitoring. Moreover, the use of a portable e-nose enables on-site measurement and analysis of vapors without extra gas-sampling units. In this study, a PDA-based portable e-nose was developed using a micro-machined gas sensor array and miniaturized electronic interfaces. The computing power and flexible interface of the PDA are expected to allow rapid, application-specific development of diagnostic devices and easy connection to other information appliances. To verify the performance of the developed portable e-nose system, six different vapors were measured with it. The results showed the reproducibility of the measured data and distinguishable patterns between the vapor species. The application of two different artificial neural networks verified the possibility of automatic vapor recognition based on the portable measurements.

Yoon Seok Yang, Yong Shin Kim, Seung-chul Ha

Optimal Synthesis of Boolean Functions by Threshold Functions

We introduce a new method for obtaining optimal architectures that implement arbitrary Boolean functions using threshold functions. The standard threshold circuits using threshold gates and weights are replaced by nodes that directly compute a threshold function of the inputs. The method can be considered exhaustive: if a solution exists, the algorithm will eventually find it. At all stages, different optimization strategies are introduced to make the algorithm as efficient as possible. The method is applied to the synthesis of circuits implementing a flip-flop and a multi-configurable gate. The advantages and disadvantages of the method are analyzed.

José Luis Subirats, Iván Gómez, José M. Jerez, Leonardo Franco
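
The node type in question computes f(x) = [w · x >= t]; a brute-force search over small integer weights shows which Boolean functions a single such node can realize (the weight bound and search order are illustrative, and the paper's optimization strategies prune this search):

from itertools import product

def find_threshold_gate(f, n, wmax=3):
    # Exhaustively search integer weights w and threshold t such that
    # f(x) == [w . x >= t] for every x in {0,1}^n; None means f is not
    # a threshold function and a network of such nodes is required.
    xs = list(product((0, 1), repeat=n))
    for w in product(range(-wmax, wmax + 1), repeat=n):
        for t in range(-wmax * n, wmax * n + 1):
            if all((sum(wi * xi for wi, xi in zip(w, x)) >= t) == bool(f(x)) for x in xs):
                return w, t
    return None

print(find_threshold_gate(lambda x: x[0] and x[1], 2))  # AND is a threshold function
print(find_threshold_gate(lambda x: x[0] ^ x[1], 2))    # XOR is not -> None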

Pareto-optimal Noise and Approximation Properties of RBF Networks

Neural networks are intended to be robust to noise and tolerant of failures in their architecture. These systems are therefore particularly attractive for hardware integration and for operation in noisy environments. In this work, measures are introduced that can decrease the sensitivity of Radial Basis Function networks to noise without any degradation of their approximation capability. For this purpose, Pareto-optimal solutions are determined for the parameters of the network.

Ralf Eickhoff, Ulrich Rückert

Backmatter
