
2009 | Book

Artificial Neural Networks – ICANN 2009

19th International Conference, Limassol, Cyprus, September 14-17, 2009, Proceedings, Part I

Edited by: Cesare Alippi, Marios Polycarpou, Christos Panayiotou, Georgios Ellinas

Publisher: Springer Berlin Heidelberg

Book series: Lecture Notes in Computer Science


About this book

This volume is part of the two-volume proceedings of the 19th International Conference on Artificial Neural Networks (ICANN 2009), which was held in Cyprus during September 14–17, 2009. The ICANN conference is an annual meeting sponsored by the European Neural Network Society (ENNS), in cooperation with the International Neural Network Society (INNS) and the Japanese Neural Network Society (JNNS). ICANN 2009 was technically sponsored by the IEEE Computational Intelligence Society. This series of conferences has been held annually since 1991 in various European countries and covers the field of neurocomputing, learning systems and related areas. Artificial neural networks provide an information-processing structure inspired by biological nervous systems. They consist of a large number of highly interconnected processing elements, with the capability of learning by example. The field of artificial neural networks has evolved significantly in the last two decades, with active participation from diverse fields, such as engineering, computer science, mathematics, artificial intelligence, system theory, biology, operations research, and neuroscience. Artificial neural networks have been widely applied for pattern recognition, control, optimization, image processing, classification, signal processing, etc.

Table of Contents

Frontmatter

Learning Algorithms

Mutual Information Based Initialization of Forward-Backward Search for Feature Selection in Regression Problems

Pure feature selection, where variables are chosen or not to be in the training data set, remains an unsolved problem, especially when the dimensionality is high. Recently, the Forward-Backward Search algorithm using the Delta Test to evaluate a possible solution was presented, showing good performance. However, due to the locality of the search procedure, the initial starting point of the search becomes crucial for obtaining good results. This paper presents new heuristics to find a more adequate starting point that could lead to a better solution. The heuristic is based on sorting the variables using the Mutual Information criterion, and then performing parallel local searches. These local searches provide an initial starting point for the actual parallel Forward-Backward algorithm.

Alberto Guillén, Antti Sorjamaa, Gines Rubio, Amaury Lendasse, Ignacio Rojas
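As a toy illustration of the search these heuristics initialize (a sketch under our own assumptions, not the authors' MI-sorted, parallel algorithm), the snippet below runs a plain forward-backward search scored by a simple nearest-neighbour Delta Test; the synthetic data and all names are ours:

```python
import numpy as np

def delta_test(X, y, mask):
    """Delta Test: half the mean squared output difference between each
    point and its nearest neighbour in the selected-feature subspace."""
    Xs = X[:, mask]
    if Xs.shape[1] == 0:
        return np.inf
    d2 = ((Xs[:, None, :] - Xs[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)          # exclude self-matches
    nn = d2.argmin(axis=1)
    return 0.5 * np.mean((y - y[nn]) ** 2)

def forward_backward(X, y, init_mask):
    """Toggle one feature at a time (add or drop) while the score improves."""
    mask = init_mask.copy()
    best = delta_test(X, y, mask)
    improved = True
    while improved:
        improved = False
        for j in range(X.shape[1]):
            cand = mask.copy()
            cand[j] = ~cand[j]
            score = delta_test(X, y, cand)
            if score < best:
                best, mask, improved = score, cand, True
    return mask, best

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1]       # only features 0 and 1 matter
mask, score = forward_backward(X, y, np.zeros(5, dtype=bool))
```

Starting the toggling loop from an MI-ranked subset, as the paper proposes, amounts to passing a better `init_mask` than the empty one used here.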
Kernel Learning for Local Learning Based Clustering

For most kernel-based clustering algorithms, performance hinges heavily on the choice of kernel. In this paper, we propose a novel kernel learning algorithm within the framework of Local Learning based Clustering (LLC) (Wu & Schölkopf 2006). Given multiple kernels, we associate a non-negative weight with the Hilbert space of each kernel, and then extend our previous work on feature selection (Zeng & Cheung 2009) to select the suitable Hilbert spaces for LLC. We show that this naturally yields a linear combination of kernels. Accordingly, the kernel weights are estimated iteratively together with the local learning based clustering. The experimental results demonstrate the effectiveness of the proposed algorithm on benchmark document datasets.

Hong Zeng, Yiu-ming Cheung
Projective Nonnegative Matrix Factorization with α-Divergence

A new matrix factorization algorithm which combines two recently proposed nonnegative learning techniques is presented. Our new algorithm, α-PNMF, inherits the advantages of Projective Nonnegative Matrix Factorization (PNMF) for learning a highly orthogonal factor matrix. When the Kullback-Leibler (KL) divergence is generalized to the α-divergence, it gives our method more flexibility in approximation. We provide multiplicative update rules for α-PNMF and present their convergence proof. The resulting algorithm is empirically verified to give a good solution by using a variety of real-world datasets. For feature extraction, α-PNMF is able to learn highly sparse and localized part-based representations of facial images. For clustering, the new method is also advantageous over Nonnegative Matrix Factorization with α-divergence and ordinary PNMF in terms of higher purity and smaller entropy.

Zhirong Yang, Erkki Oja
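For orientation, the classical multiplicative updates for NMF under the KL divergence (the α = 1 member of the α-divergence family) can be sketched as follows; this is the well-known baseline, not the authors' projective α-PNMF updates, and the function names are ours:

```python
import numpy as np

def nmf_kl(V, r, iters=200, seed=0):
    """Multiplicative updates for NMF under the KL divergence
    (the alpha = 1 special case of the alpha-divergence family)."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, r)) + 0.1
    H = rng.random((r, m)) + 0.1
    for _ in range(iters):
        WH = W @ H + 1e-12
        H *= (W.T @ (V / WH)) / W.sum(axis=0, keepdims=True).T
        WH = W @ H + 1e-12
        W *= ((V / WH) @ H.T) / H.sum(axis=1, keepdims=True).T
    return W, H

def kl_div(V, WH):
    """Generalized KL divergence between V and its reconstruction WH."""
    WH = WH + 1e-12
    return np.sum(V * np.log((V + 1e-12) / WH) - V + WH)
```

The updates monotonically decrease `kl_div`; the projective variant additionally ties the two factors together (V ≈ W Wᵀ V), which is what yields the high orthogonality discussed above.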
Active Generation of Training Examples in Meta-Regression

Meta-Learning predicts the performance of learning algorithms based on features of the learning problems. Meta-Learning acquires knowledge from a set of meta-examples, which store the experience obtained from applying the algorithms to problems in the past. A limitation of Meta-Learning is related to the generation of meta-examples. In order to construct a meta-example, it is necessary to empirically evaluate the algorithms on a given problem. Hence, the generation of a set of meta-examples may be costly depending on the context. In order to minimize this limitation, the use of Active Learning is proposed to reduce the number of required meta-examples. In this paper, we evaluate this proposal on a promising Meta-Learning approach, called Meta-Regression. Experiments were performed in a case study to predict the performance of learning algorithms for MLP networks. A significant performance gain was observed in the case study when Active Learning was used to support the generation of meta-examples.

Ricardo B. C. Prudêncio, Teresa B. Ludermir
A Maximum-Likelihood Connectionist Model for Unsupervised Learning over Graphical Domains

Supervised relational learning over labeled graphs, e.g. via recursive neural nets, has received considerable attention from the connectionist community. Surprisingly, with the exception of recursive self organizing maps, unsupervised paradigms have been far less investigated. In particular, no algorithms for density estimation over graphs are found in the literature. This paper first introduces a formal notion of probability density function (pdf) over graphical spaces. It then proposes a maximum-likelihood pdf estimation technique, relying on the joint optimization of a recursive encoding network and a constrained radial basis function-like net. Preliminary experiments on synthetically generated samples of labeled graphs are analyzed and tested statistically.

Edmondo Trentin, Leonardo Rigutini
Local Feature Selection for the Relevance Vector Machine Using Adaptive Kernel Learning

A Bayesian learning algorithm is presented that is based on a sparse Bayesian linear model (the Relevance Vector Machine (RVM)) and learns the parameters of the kernels during model training. The novel characteristic of the method is that it enables the introduction of parameters called ‘scaling factors’ that measure the significance of each feature. Using the Bayesian framework, a sparsity promoting prior is then imposed on the scaling factors in order to eliminate irrelevant features. Feature selection is local, because different values are estimated for the scaling factors of each kernel, therefore different features are considered significant at different regions of the input space. We present experimental results on artificial data to demonstrate the advantages of the proposed model and then we evaluate our method on several commonly used regression and classification datasets.

Dimitris Tzikas, Aristidis Likas, Nikolaos Galatsanos
MINLIP: Efficient Learning of Transformation Models

This paper studies a risk minimization approach to estimate a transformation model from noisy observations. It is argued that transformation models are a natural candidate for studying ranking models and ordinal regression in a machine learning context. We implement a structural risk minimization strategy based on a Lipschitz smoothness condition on the transformation model. It is then shown how the estimate can be obtained efficiently by solving a convex quadratic program with O(n) linear constraints and unknowns, with n the number of data points. A set of experiments supports these findings.

Vanya Van Belle, Kristiaan Pelckmans, Johan A. K. Suykens, Sabine Van Huffel
Efficient Uncertainty Propagation for Reinforcement Learning with Limited Data

In a typical reinforcement learning (RL) setting, details of the environment are not given explicitly but have to be estimated from observations. Most RL approaches only optimize the expected value. However, if the number of observations is limited, considering expected values only can lead to false conclusions. Instead, it is crucial to also account for the estimator's uncertainties. In this paper, we present a method to incorporate those uncertainties and propagate them to the conclusions. By being only approximate, the method is computationally feasible. Furthermore, we describe a Bayesian approach to designing the estimators. Our experiments show that the method considerably increases the robustness of the derived policies compared to the standard approach.

Alexander Hans, Steffen Udluft
Optimal Training Sequences for Locally Recurrent Neural Networks

The problem of determining an optimal training schedule for a locally recurrent neural network is discussed. Specifically, the proper choice of the most informative measurement data guaranteeing reliable prediction of the neural network response is considered. Based on a scalar measure of performance defined on the Fisher information matrix related to the network parameters, the problem is formulated in terms of optimal experimental design. Its solution can then be readily achieved via the adaptation of effective numerical algorithms based on convex optimization theory. Finally, some illustrative experiments are provided to verify the presented approach.

Krzysztof Patan, Maciej Patan
Statistical Instance-Based Ensemble Pruning for Multi-class Problems

Recent research has shown that the provisional count of votes of an ensemble of classifiers can be used to estimate the probability that the final ensemble prediction coincides with the current majority class. For a given instance, querying can be stopped when this probability is above a specified threshold. This instance-based ensemble pruning procedure can be efficiently implemented if these probabilities are pre-computed and stored in a lookup table. However, the size of the table and the cost of computing the probabilities grow very rapidly with the number of classes of the problem. In this article we introduce a number of computational optimizations that can be used to make the construction of the lookup table feasible. As a result, the application of instance-based ensemble pruning is extended to multi-class problems. Experiments on several UCI multi-class problems show that instance-based pruning speeds up classification by a factor between 2 and 10 without any significant variation in the prediction accuracy of the ensemble.

Gonzalo Martínez-Muñoz, Daniel Hernández-Lobato, Alberto Suárez
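The stopping rule can be caricatured with a simple binomial model of the remaining votes (the paper instead precomputes exact probabilities into a lookup table and extends them to the multi-class case; the two-class sketch, its threshold, and all names below are ours):

```python
import math

def p_majority_stands(T, votes_for, votes_against, p=None):
    """Probability that the current leader still wins after all T votes,
    assuming each remaining vote goes to the leader independently with
    probability p (here estimated from the votes seen so far)."""
    t = votes_for + votes_against
    remaining = T - t
    lead_for = votes_for >= votes_against
    if p is None:
        p = votes_for / t if lead_for else votes_against / t
    lead = abs(votes_for - votes_against)
    # the leader wins iff it gets more than (remaining - lead) / 2 of the rest
    need = max(0, math.ceil((remaining - lead + 1) / 2))
    return sum(math.comb(remaining, k) * p**k * (1 - p)**(remaining - k)
               for k in range(need, remaining + 1))

def prune_votes(votes, T, threshold=0.99):
    """Query classifiers one at a time; stop once the leader is safe enough."""
    for_, against = 0, 0
    for i, v in enumerate(votes, 1):
        for_ += v
        against += 1 - v
        if i >= 3 and p_majority_stands(T, for_, against) >= threshold:
            return (1 if for_ >= against else 0), i
    return (1 if for_ >= against else 0), T
```

With a unanimous early vote, querying stops after a handful of classifiers instead of all `T`, which is the source of the reported speed-up.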
Robustness of Kernel Based Regression: A Comparison of Iterative Weighting Schemes

It has been shown that Kernel Based Regression (KBR) with a least squares loss has some undesirable properties from a robustness point of view. KBR with more robust loss functions, e.g. Huber or logistic losses, often gives rise to more complicated computations. In this work the practical consequences of this sensitivity are explained, including the breakdown of Support Vector Machines (SVM) and weighted Least Squares Support Vector Machines (LS-SVM) for regression. In classical statistics, robustness is improved by reweighting the original estimate. We study the influence of reweighting the LS-SVM estimate using four different weight functions. Our results give practical guidelines for choosing the weights, providing robustness and fast convergence. It turns out that Logistic and Myriad weights are suitable reweighting schemes when outliers are present in the data. In fact, the Myriad shows better performance than the others in the presence of extreme outliers (e.g. Cauchy distributed errors). These findings are then illustrated on a toy example as well as on real-life data sets.

Kris De Brabanter, Kristiaan Pelckmans, Jos De Brabanter, Michiel Debruyne, Johan A. K. Suykens, Mia Hubert, Bart De Moor
Mixing Different Search Biases in Evolutionary Learning Algorithms

This work investigates the benefits of using different distribution functions in evolutionary learning algorithms with respect to the generalization ability of Artificial Neural Networks (ANNs). We examine two modifications of the recently proposed network weight-based evolutionary algorithm (NWEA), mixing mutation strategies based on three distribution functions at the chromosome and gene levels. The utilization of combined search strategies in ANN training implies that different step sizes determined by mixed distributions will direct the evolution towards ANNs that generalize well.

Kristina Davoian, Wolfram-M. Lippe
Semi-supervised Learning for Regression with Co-training by Committee

Semi-supervised learning is a paradigm that exploits the unlabeled data in addition to the labeled data to improve the generalization error of a supervised learning algorithm. Although in real-world applications regression is as important as classification, most of the research in semi-supervised learning concentrates on classification. In particular, although Co-Training is a popular semi-supervised learning algorithm, there is not much work on developing new Co-Training style algorithms for semi-supervised regression. In this paper, a semi-supervised regression framework, denoted by CoBCReg, is proposed, in which an ensemble of diverse regressors is used for semi-supervised learning that requires neither redundant independent views nor different base learning algorithms. Experimental results show that CoBCReg can effectively exploit unlabeled data to improve the regression estimates.

Mohamed Farouk Abdel Hady, Friedhelm Schwenker, Günther Palm
An Analysis of Meta-learning Techniques for Ranking Clustering Algorithms Applied to Artificial Data

Meta-learning techniques can be very useful for supporting non-expert users in the algorithm selection task. In this work, we investigate the use of different components in an unsupervised meta-learning framework. In such a scheme, the system aims to predict, for a new learning task, the ranking of the candidate clustering algorithms according to the knowledge previously acquired.

In the context of unsupervised meta-learning techniques, we analyzed two different sets of meta-features, nine different candidate clustering algorithms and two learning methods as meta-learners.

Such analysis showed that the system, using MLP and SVR meta-learners, was able to successfully associate the proposed sets of dataset characteristics with the performance of the candidate algorithms. In fact, a hypothesis test showed that the correlation between the predicted and ideal rankings was significantly higher than that of the default ranking method. In this sense, we could also validate the use of the proposed sets of meta-features for describing the artificial learning tasks.

Rodrigo G. F. Soares, Teresa B. Ludermir, Francisco A. T. De Carvalho
Probability-Based Distance Function for Distance-Based Classifiers

In the paper a new measure of distance between events/observations in the pattern space is proposed and experimentally evaluated using a k-NN classifier in the context of binary classification problems. The application of the proposed approach visibly improves the results, in terms of speed and accuracy, compared to the case of training without the postulated enhancements.

Numerical results are very promising and outperform the reference literature results of k-NN classifiers built with other distance measures.

Cezary Dendek, Jacek Mańdziuk
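The evaluation setup generalizes naturally: a k-NN classifier that accepts any distance callable, so a probability-based measure like the one proposed could be plugged in where the Euclidean default sits (the measure itself is not reproduced here, and the names are ours):

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3, dist=None):
    """k-NN majority vote with a pluggable distance function, so a custom
    (e.g. probability-based) measure can replace the Euclidean default."""
    if dist is None:
        dist = lambda a, b: np.linalg.norm(a - b)
    d = np.array([dist(x, xi) for xi in X_train])
    nearest = np.argsort(d)[:k]
    votes = y_train[nearest]
    return int(np.round(votes.mean()))   # binary labels in {0, 1}
```

Swapping in a different `dist` leaves the rest of the classifier untouched, which is what makes head-to-head comparisons of distance measures straightforward.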
Constrained Learning Vector Quantization or Relaxed k-Separability

Neural networks and other sophisticated machine learning algorithms frequently miss simple solutions that can be discovered by more constrained learning methods. The transition from a single neuron solving linearly separable problems, to a multithreshold neuron solving k-separable problems, to neurons implementing prototypes solving q-separable problems, is investigated. Using the Learning Vector Quantization (LVQ) approach, this transition is presented as going from two prototypes defining a single hyperplane, to many co-linear prototypes defining parallel hyperplanes, to unconstrained prototypes defining a Voronoi tessellation. For most datasets relaxing the co-linearity condition improves accuracy at the cost of increased model complexity, but for data with inherent logical structure LVQ algorithms with constraints significantly outperform the original LVQ and many other algorithms.

Marek Grochowski, Włodzisław Duch
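For reference, plain unconstrained LVQ1 (the endpoint of the transition described above) can be sketched as below; the co-linearity constraints that the paper studies are not implemented here, and all names are ours:

```python
import numpy as np

def lvq1(X, y, P, proto_labels, lr=0.05, epochs=30, seed=0):
    """Plain LVQ1: pull the winning prototype toward a same-class sample,
    push it away from a different-class sample."""
    rng = np.random.default_rng(seed)
    P = P.astype(float).copy()
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            w = np.argmin(((P - X[i]) ** 2).sum(axis=1))   # winning prototype
            sign = 1.0 if proto_labels[w] == y[i] else -1.0
            P[w] += sign * lr * (X[i] - P[w])
    return P

def lvq_predict(P, proto_labels, x):
    """Assign the label of the nearest prototype."""
    return proto_labels[np.argmin(((P - x) ** 2).sum(axis=1))]
```

Constraining the prototypes to stay co-linear, as the paper does, would roughly amount to projecting `P` back onto a common line after each update.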
Minimization of Quadratic Binary Functional with Additive Connection Matrix

An (N × N)-matrix is called additive when its elements are pair-wise sums of N real numbers a_i. For a quadratic binary functional with an additive connection matrix we succeeded in finding the global minimum, expressing it through the external parameters of the problem. Computer simulations show that the energy surface of a quadratic binary functional with an additive matrix is quite complicated.

Leonid Litinskii
Mutual Learning with Many Linear Perceptrons: On-Line Learning Theory

We propose a new mutual learning scheme using many weak learners (students) which converges to the same state as Bagging, a kind of ensemble learning, within the framework of on-line learning, and we analyze its asymptotic properties using methods from statistical mechanics. Mutual learning involving three or more students differs essentially from the two-student case in the variety of choices for the student acting as teacher. The proposed model consists of two learning steps: many students independently learn from a teacher, and then the students learn from each other through mutual learning. In mutual learning, students learn from other students and the generalization error improves even though the teacher takes no part in the mutual learning. We demonstrate, using principal component analysis, that selecting the student to act as teacher at random is superior to selecting in cyclic order.

Kazuyuki Hara, Yoichi Nakayama, Seiji Miyoshi, Masato Okada
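A back-of-the-envelope numerical illustration (ours; it does not reproduce the paper's statistical-mechanics analysis) of why converging toward a Bagging-like state helps: by convexity of the squared error, the error of the averaged student never exceeds the average error of the individual students.

```python
import numpy as np

rng = np.random.default_rng(1)
N, K = 100, 10                        # input dimension, number of students
B = rng.normal(size=N)                # teacher weight vector
B /= np.linalg.norm(B)

# students: noisy copies of the teacher after independent learning
students = B + 0.5 * rng.normal(size=(K, N))

err = lambda w: np.sum((w - B) ** 2)              # squared weight error
individual = np.mean([err(w) for w in students])  # average individual error
averaged = err(students.mean(axis=0))             # error of the averaged student
```

Averaging cancels the independent noise components of the students, which is the same mechanism that makes the mutual-learning fixed point attractive.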

Computational Neuroscience

Synchrony State Generation in Artificial Neural Networks with Stochastic Synapses

In this study, the generation of temporal synchrony within an artificial neural network is examined considering a stochastic synaptic model. A network is introduced and driven by Poisson distributed trains of spikes along with white Gaussian noise that is added to the internal synaptic activity representing the background activity (neuronal noise). A Hebbian-based learning rule for the update of synaptic parameters is introduced. Only arbitrarily selected synapses are allowed to learn, i.e. change parameter values. The average of the cross-correlation coefficients between a smoothed version of the responses of all the neurons is taken as an indicator of synchrony. Results show that a network using such a framework is able to achieve different states of synchrony via learning. Thus, the plausibility of using stochastic models in modeling neural processes is supported. This is also consistent with arguments claiming that synchrony is a part of the memory-recall process, and fits within the accepted framework of biological neural systems.

Karim El-Laithy, Martin Bogdan
Coexistence of Cell Assemblies and STDP

We implement a model of leaky integrate-and-fire neurons with conductance-based synapses. Neurons are structurally coupled in terms of an ideal cell assembly. Synaptic changes occur through parameterized spike timing-dependent plasticity rules, which allows us to investigate whether cell assemblies can survive or even be strengthened by such common learning rules. It turns out that for different delays there are parameter settings which support cell assembly structures and others which do not.

Florian Hauser, David Bouchain, Günther Palm
Controlled and Automatic Processing in Animals and Machines with Application to Autonomous Vehicle Control

There are two modes of control recognised in the cognitive psychological literature. Controlled processing is slow, requires serial attention to sub-tasks, and requires effortful memory retrieval and decision making. In contrast, automatic control is less effortful, less prone to interference from simultaneous tasks, and is driven largely by the current stimulus. Neurobiological analogues of these are goal-directed and habit-based behaviour respectively. Here, we suggest how these control modes might be deployed in an engineering solution to Autonomous Vehicle Control. We present pilot data on a first step towards instantiating automatised control in the architecture, and suggest a synergy between the engineering and biological investigation of this dual-process approach.

Kevin Gurney, Amir Hussain, Jon Chambers, Rudwan Abdullah
Multiple Sound Source Localisation in Reverberant Environments Inspired by the Auditory Midbrain

This paper proposes a spiking neural network (SNN) of the mammalian auditory midbrain to achieve binaural multiple sound source localisation. The network is inspired by neurophysiological studies on the organisation of binaural processing in the medial superior olive (MSO), lateral superior olive (LSO) and the inferior colliculus (IC) to achieve a sharp azimuthal localisation of sound sources over a wide frequency range in a reverberant environment. Three groups of artificial neurons are constructed to represent the neurons in the MSO, LSO and IC that are sensitive to interaural time difference (ITD), interaural level difference (ILD) and azimuth angle respectively. The ITD and ILD cues are combined in the IC to estimate the azimuth direction of a sound source. To deal with echo, we propose an inter-inhibited onset network in the IC, which can extract the azimuth information from the direct path sound and avoid the effects of reverberation. Experiments show that the proposed onset cell network can localise two sound sources efficiently taking into account the room reverberation.

Jindong Liu, David Perez-Gonzalez, Adrian Rees, Harry Erwin, Stefan Wermter
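The ITD cue that the MSO-like stage relies on can be illustrated with a toy cross-correlation estimate (a deliberate simplification; the paper uses a spiking network, and the signal, sample rate, and names here are ours):

```python
import numpy as np

fs = 16000                        # sample rate (Hz), to convert lag to time
delay = 5                         # true interaural delay in samples
rng = np.random.default_rng(0)
s = rng.normal(size=2000)         # broadband source signal
left = s
right = np.roll(s, delay)         # right channel lags the left

# estimate the interaural time difference as the cross-correlation peak
lags = np.arange(-20, 21)
xc = [np.dot(left, np.roll(right, -l)) for l in lags]
itd_samples = lags[int(np.argmax(xc))]
itd_seconds = itd_samples / fs
```

Reverberation adds spurious correlation peaks from delayed reflections, which is exactly what the proposed onset network suppresses by weighting the direct-path wavefront.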
A Model of Neuronal Specialization Using Hebbian Policy-Gradient with “Slow” Noise

We study a model of neuronal specialization using a policy gradient reinforcement approach. (1) The neurons stochastically fire according to their synaptic input plus a noise term; (2) the environment is a closed-loop system composed of a rotating eye and a punctual visual target; (3) the network is composed of a foveated retina, a primary layer and a motoneuron layer; (4) the reward depends on the distance between the subjective target position and the fovea; and (5) the weight update depends on a Hebbian trace defined according to a policy gradient principle. In order to take into account the mismatch between neuronal and environmental integration times, we distort the firing probability with a "pink noise" term whose autocorrelation is of the order of 100 ms, so that the firing probability is overestimated (or underestimated) for periods of about 100 ms. The rewards occurring meanwhile assess the "value" of those elementary shifts, and modify the firing probability accordingly. Since every motoneuron is associated with a particular angular direction, at the end of the learning process we test the preferred output of the visual cells. We find that, in accordance with the observed final behavior, the visual cells preferentially excite the motoneurons heading in the opposite angular direction.

Emmanuel Daucé
How Bursts Shape the STDP Curve in the Presence/Absence of GABAergic Inhibition

It has been known for some time that the synapses of the CA1 pyramidal cells are surprisingly unreliable at signalling the arrival of single spikes to the postsynaptic neuron [2]. On the other hand, bursts of spikes are reliably signalled, because transmitter release is facilitated. In the hippocampus, a single burst can produce long-term synaptic modifications. In addition to increasing the reliability of synaptic transmission [3], bursts of spikes have been shown to provide effective mechanisms for selective communication between neurons in a network [4]. We investigate via computer simulations how the profile of spike-timing-dependent plasticity (STDP) in the CA1 pyramidal cell synapses is affected when an excitatory burst of spikes applied to the dendrites is paired with an excitatory single spike applied to the soma, in the absence and presence of a 100 Hz GABAergic inhibitory spike train applied to the dendrites. We report that the shape of the STDP curve strongly depends on the burst interspike interval in the presence/absence of GABA-A inhibition when a presynaptic burst and a postsynaptic spike are paired together.

Vassilis Cutsuridis, Stuart Cobb, Bruce P. Graham
Optimizing Generic Neural Microcircuits through Reward Modulated STDP

How can we characterize whether a given neural circuit is optimal for the class of computational operations that it has to perform on a certain input distribution? We show that modifying the efficacies of recurrent synapses in a generic neural microcircuit via spike timing dependent plasticity (STDP) can optimize the circuit in an unsupervised fashion for a particular input distribution if STDP is modulated by a global reward signal. More precisely, optimizing microcircuits through reward modulated STDP leads to a lower eigenvalue spread of the cross-correlation matrix, higher entropy, highly decorrelated neural activity, and tunes the circuit dynamics to a regime that requires a large number of principal components for representing the information contained in the liquid state as compared to randomly drawn microcircuits. Another set of results shows that such optimization brings the mean firing rate into a realistic regime, while increasing the sparseness and the information content of the network. We also show that the performance of optimized circuits improves for several linear and non-linear tasks.

Prashant Joshi, Jochen Triesch
Calcium Responses Model in Striatum Dependent on Timed Input Sources

The striatum is the input nucleus of the basal ganglia and is thought to be involved in reinforcement learning. The striatum receives glutamate input from the cortex, which carries sensory information, and dopamine input from the substantia nigra, which carries reward information. Dopamine-dependent plasticity of cortico-striatal synapses is supposed to play a critical role in reinforcement learning. Recently, a number of labs reported contradictory results on its dependence on the timing of cortical inputs and spike output. To clarify the mechanisms behind spike timing-dependent plasticity of striatal synapses, we investigated the spike timing-dependence of the intracellular calcium concentration by constructing a striatal neuron model with realistic morphology. Our simulation predicted that the calcium transient will be maximal when cortical spike input and dopamine input precede the postsynaptic spike. The gain of the calcium transient is enhanced during the "up-state" of striatal cells and depends critically on NMDA receptor currents.

Takashi Nakano, Junichiro Yoshimoto, Jeff Wickens, Kenji Doya
Independent Component Analysis Aided Diagnosis of Cuban Spino Cerebellar Ataxia 2

Previous studies have found abnormalities in the oculomotor system of patients with the severe SCA2 form of autosomal dominant cerebellar ataxia (ADCA), including changes in the latency, peak velocity, and deviation of saccadic movements, causing changes in the morphology of the patient's response waveform. This different response suggests a higher degree of statistical independence in sick patients when compared to healthy individuals regarding the response to the visual saccadic stimulus. We processed electro-oculogram records of six patients diagnosed with severe SCA2 ataxia and six healthy subjects used as controls, employing independent component analysis (ICA); significant differences were found in the statistical independence between the person's response and the stimulus for 60° saccadic tests.

Rodolfo V. García, Fernando Rojas, Jesús González, Belén San Román, Olga Valenzuela, Alberto Prieto, Luis Velázquez, Roberto Rodríguez
Hippocampus, Amygdala and Basal Ganglia Based Navigation Control

In this paper we present a novel robot navigation system aimed at testing hypotheses about the roles of key brain areas in foraging behavior of rats. The key components of the control network are: 1. a Hippocampus inspired module for spatial localization based on associations between sensory inputs and places; 2. an Amygdala inspired module for the association of values with places and sensory stimuli; 3. a Basal Ganglia inspired module for the selection of actions based on the evaluated sensory inputs. By implementing this Hippocampus-Amygdala-Basal Ganglia based control network with a simulated rat embodiment we intend to test not only our understanding of the individual brain areas but especially the interaction between them. Understanding the neural circuits that allow rats to efficiently forage for food will also help to improve the ability of robots to autonomously evaluate and select navigation targets.

Ansgar Koene, Tony J. Prescott
A Framework for Simulation and Analysis of Dynamically Organized Distributed Neural Networks

We present a framework for modelling and analyzing emerging neural activity from multiple interconnected modules, where each module is formed by a neural network. The neural network simulator operates a 2D lattice tissue of leaky integrate-and-fire neurons with genetic, ontogenetic and epigenetic features. The Java Agent DEvelopment (JADE) environment allows the implementation of an efficient automata-like, virtually unbounded and platform-independent system of agents exchanging hierarchically organized messages. This framework allowed us to develop linker agents capable of handling dynamic configurations characterized by the entrance and exit of additional modules at any time, following simple rewiring rules. The development of a virtual electrode allows the recording of a "neural" generated signal, called an electrochipogram (EChG), characterized by dynamics close to biological local field potentials and electroencephalograms (EEG). These signals can be used to compute Evoked Potentials from complex sensory inputs and to make comparisons with neurophysiological signals of a similar kind.

Vladyslav Shaposhnyk, Pierre Dutoit, Victor Contreras-Lámus, Stephen Perrig, Alessandro E. P. Villa
Continuous Attractors of Lotka-Volterra Recurrent Neural Networks

Continuous attractor neural network (CANN) models have been studied in conjunction with many diverse brain functions including local cortical processing, working memory, and spatial representation. There is good evidence that continuous stimuli, such as orientation, moving direction, and the spatial location of objects, could be encoded as continuous attractors in neural networks. Despite their wide application to information processing in the brain, the representation and stability analysis of continuous attractors in non-linear recurrent neural networks (RNNs) has received little attention so far. This paper studies the continuous attractors of Lotka-Volterra (LV) recurrent neural networks. Conditions are given to ensure that the network has continuous attractors. A representation of the continuous attractors is obtained under these conditions. Simulations are employed to illustrate the theory.

Haixian Zhang, Jiali Yu, Zhang Yi
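Lotka-Volterra recurrent network dynamics can be integrated numerically; a toy sketch under the commonly used formulation dx_i/dt = x_i(h_i − x_i + Σ_j w_ij x_j) (the paper's exact conditions and representation results are not reproduced here):

```python
import numpy as np

def simulate_lv_rnn(W, h, x0, dt=0.01, steps=5000):
    """Euler integration of a Lotka-Volterra recurrent network:
    dx_i/dt = x_i * (h_i - x_i + sum_j W_ij x_j).
    States are clipped at zero, as LV variables stay non-negative."""
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        x = x + dt * x * (h - x + W @ x)
        x = np.maximum(x, 0.0)
    return x

# Tiny example: two weakly, mutually inhibiting units settle on the
# interior fixed point x1 = x2 = 2/3.
W = np.array([[0.0, -0.5],
              [-0.5, 0.0]])
h = np.array([1.0, 1.0])
x = simulate_lv_rnn(W, h, x0=[0.6, 0.4])
```

With inhibition weaker than the self-decay term, the interior fixed point is stable and the trajectory converges to it from nearby initial states.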
Learning Complex Population-Coded Sequences

In humans and primates, the sequential structure of complex actions is apparently learned at an abstract “cognitive” level in several regions of the frontal cortex, independent of the control of the immediate effectors by the motor system. At this level, actions are represented in terms of kinematic parameters – especially direction of end effector movement – and encoded using population codes. Muscle force signals are generated from this representation by downstream systems in the motor cortex and the spinal cord.

In this paper, we consider the problem of learning population-coded kinematic sequences in an abstract neural network model of the medial frontal cortex. For concreteness, the sequences are represented as line drawings in a two-dimensional workspace. Learning such sequences presents several challenges because of the internal complexity of the individual sequences and extensive overlap between sequences. We show that, by using a simple module-selection mechanism, our model is capable of learning multiple sequences with complex structure and very high cross-sequence similarity.

Kiran V. Byadarhaly, Mithun Perdoor, Suresh Vasa, Emmanuel Fernandez, Ali A. Minai
Structural Analysis on STDP Neural Networks Using Complex Network Theory

Synaptic plasticity is one of the essential and central functions for memory, learning, and the development of the brain. Triggered by recent physiological experiments, the basic mechanisms of spike-timing-dependent plasticity (STDP) have been widely analyzed in model studies. In this paper, we analyze the complex structures of neural networks evolved by STDP. In particular, we apply complex network theory to analyze the spatiotemporal network structures constructed through STDP. As a result, we show that nonrandom structures emerge in the neural network through STDP.

Hideyuki Kato, Tohru Ikeguchi, Kazuyuki Aihara
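The pair-based STDP rule that drives such structural evolution is typically an exponential window over the spike-time difference; a generic sketch (parameter values are illustrative, not the authors'):

```python
import math

def stdp_dw(delta_t, a_plus=0.1, a_minus=0.12, tau_plus=20.0, tau_minus=20.0):
    """Pair-based STDP window: potentiation when the presynaptic spike
    precedes the postsynaptic spike (delta_t = t_post - t_pre > 0),
    depression when it follows; magnitude decays exponentially in |delta_t|
    (times in ms)."""
    if delta_t > 0:
        return a_plus * math.exp(-delta_t / tau_plus)
    elif delta_t < 0:
        return -a_minus * math.exp(delta_t / tau_minus)
    return 0.0
```

Applied to every pre/post spike pair, this rule selectively strengthens causal connections, which is what gives rise to the nonrandom network structures analyzed in the paper.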
Time Coding of Input Strength Is Intrinsic to Synapses with Short Term Plasticity

Many neocortical synapses adapt their postsynaptic response to the input rate of the presynaptic neuron through different mechanisms of short term plasticity: steady-state postsynaptic firing rates become invariant to the presynaptic frequency. Still, timing may convey information about presynaptic rate: the postsynaptic current is shown here analytically to peak earlier when the presynaptic input frequency increases. An approximate 1 ms/10 Hz coding sensitivity for AMPA receptors, and 1 ms/1 Hz for NMDA receptors, in postsynaptic potentials was found by a multicompartmental synapse simulation using detailed kinetic channel models. The slower the ion channels, the more pronounced the time-lag signal, but at the same time the less the available headroom when compared at identical frequencies. Such a timing code of input strength is transmitted most efficiently when the postsynaptic amplitude is normalized by the input rate. Short term plasticity is a mechanism local to the synapse that provides such a normalizing framework.

Márton A. Hajnal
Information Processing and Timing Mechanisms in Vision

Research on the neural mechanisms of time perception is one of the fastest-growing areas of neuroscience. The visual system presents several examples of timing mechanisms. Its activity is characterized by a complex network of synchronized elements that cooperate. Some authors have recently proposed that neural circuits should be inherently capable of temporal processing as a result of the natural complexity of cortical networks coupled with the presence of time-dependent network properties. We present an adaptive feedback model which, through a temporal-to-spatial transformation, is able to explain recent experiments on the relationships between vision and time/space perception.

Andrea Guazzini, Pietro Lió, Andrea Passarella, Marco Conti
Review of Neuron Types in the Retina: Information Models for Neuroengineering

Powerful information-processing functions are performed in the mammalian retina, whose basic units are different types of neurons. This paper presents the types of neurons and their roles in the visual processing system. The aim is to review the principles by which an artificial visual system could be constructed based on a comprehensive understanding of biological systems.

German D. Valderrama-Gonzalez, T. M. McGinnity, Liam Maguire, QingXiang Wu
Brain Electric Microstate and Perception of Simultaneously Audiovisual Presentation

Associations between picture and sound form the basis of reading. Learning the correspondences between them is a crucial step in reading acquisition. This study was designed to investigate whether task-related processing of audio and visual features is independent or whether task-related processing in one modality might influence the processing of the other. The present study employed simultaneous audio-visual stimuli in the oddball paradigm to re-examine the effects of attention on audio, visual and audio-visual perception in the non-musician brain. Electroencephalograms (EEG) were recorded from 28 normal participants. None of them had more than three years of formal musical training and none had had any musical training within the past five years. Chinese and Korean subjects were presented with tones (auditory: A), pictures (visual: V), and simultaneous tones and pictures (audio-visual: AV). The neural basis of this interaction was investigated by subtracting the event-related potentials (ERPs) to the A and V stimuli alone from the ERP to the combined AV stimuli (i.e. interaction = AV − (A+V)). The Korean group showed a larger mean interaction amplitude, extended over a longer time, than the Chinese group. This reveals that experience influences the early cortical automatic processing of linguistically relevant suprasegmental pitch contours. These results suggest that efficient processing of associations between pictures and sounds relies on neural mechanisms similar to those naturally evolved for integrating audiovisual perception.

Wichian Sittiprapaporn, Jun Soo Kwon
A Model for Neuronal Signal Representation by Stimulus-Dependent Receptive Fields

Image coding by the mammalian visual cortex has been modeled through linear combinations of receptive-field-like functions. The spatial receptive field of a visual neuron is typically assumed to be signal-independent, a view that has been challenged by recent neurophysiological findings. Motivated by these findings, we propose a model for conjoint space-frequency image coding based on stimulus-dependent receptive-field-like functions. For any given frequency, the parameters of the coding functions are obtained from the Fourier transform of the stimulus. The representation is initially presented in terms of Gabor functions, but can be extended to more general forms, and we find that the resulting coding functions show properties consistent with those of the receptive fields of simple cortical cells of the macaque.

José R. A. Torreão, João L. Fernandes, Silvia M. C. Victer

Hardware Implementations and Embedded Systems

Area Chip Consumption by a Novel Digital CNN Architecture for Pattern Recognition

The implementation of digital neural networks on a chip requires a large chip area. This paper therefore deals with the design of a novel type of digital CNN architecture focused on pattern-recognition applications. We compare the newly designed network with two other on-chip digital CNN implementations used for pattern recognition, with respect to parameters such as speed and chip-area consumption. The comparison shows that our proposed digital CNN network outperforms the others.

Emil Raschman, Daniela Ďuračková
Multifold Acceleration of Neural Network Computations Using GPU

With the emergence of graphics processing units (GPU) of the latest generation, it became possible to undertake neural-network-based computations using GPUs on serially produced video display adapters. In this study, NVIDIA CUDA technology has been used to implement the standard back-propagation algorithm for training multiple perceptrons simultaneously on a GPU. For the problem considered, the GPU-based implementation (on an NVIDIA GTX 260 GPU) has led to a 50x speed increase compared to a highly optimized CPU-based computer program, and more than 150x compared to commercially available CPU-based software (NeuroShell 2, on an AMD Athlon 64 Dual Core 6000+ processor).

Alexander Guzhva, Sergey Dolenko, Igor Persiantsev
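The CUDA implementation itself is hardware-specific, but the per-network update it parallelizes is plain back-propagation; a NumPy sketch of one training step for a one-hidden-layer perceptron (illustrative, not the paper's code):

```python
import numpy as np

def backprop_step(X, y, W1, W2, lr=0.5):
    """One gradient-descent step of standard back-propagation for a
    one-hidden-layer MLP with sigmoid units and squared error.
    Returns the mean squared error before the update."""
    sig = lambda z: 1.0 / (1.0 + np.exp(-z))
    h = sig(X @ W1)                       # hidden-layer activations
    out = sig(h @ W2)                     # network outputs
    d_out = (out - y) * out * (1 - out)   # output-layer delta
    d_h = (d_out @ W2.T) * h * (1 - h)    # back-propagated hidden delta
    W2 -= lr * h.T @ d_out                # in-place weight updates
    W1 -= lr * X.T @ d_h
    return float(((out - y) ** 2).mean())

# Toy run with a fixed seed: the error shrinks over iterations.
rng = np.random.default_rng(0)
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [0.]])
W1, W2 = rng.normal(size=(2, 4)), rng.normal(size=(4, 1))
errors = [backprop_step(X, y, W1, W2) for _ in range(500)]
```

Training many such perceptrons at once maps naturally onto a GPU because each network's step is an independent batch of dense matrix products.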
Training Recurrent Neural Network Using Multistream Extended Kalman Filter on Multicore Processor and Cuda Enabled Graphic Processor Unit

Recurrent neural networks are popular tools for modeling time series. Common gradient-based algorithms are frequently used for training recurrent neural networks. On the other hand, approaches based on Kalman filtration are considered to be the most appropriate general-purpose training algorithms with respect to modeling accuracy. Their main drawbacks are high computational requirements and difficult implementation. In this work we first provide a clear description of the training algorithm using a simple pseudo-language. The problem of high computational requirements is addressed by performing the calculation on a multicore processor and a CUDA-enabled graphics processing unit. We show that a significant reduction in execution time can be achieved by performing the computation on a many-core graphics processing unit.

Michal Čerňanský
A Non-subtraction Configuration of Self-similitude Architecture for Multiple-Resolution Edge-Filtering CMOS Image Sensor

The self-similitude architecture developed in our previous work for multiple-resolution image perception [1] has been transformed into a non-subtraction configuration. In contrast to the previous work, the subtraction operations are entirely eliminated from the computation repertory of processing elements. As a result, the hardware organization of the multiple-resolution edge-filtering image sensor has been greatly simplified. In addition, fully pixel-parallel self-similitude processing has been established without any complexity in the interconnects. A proof-of-concept chip capable of performing four-directional edge filtering at full, half and quarter resolutions was designed in a 0.18 μm 5-metal CMOS technology and was sent to fabrication. The performance was verified by circuit simulation (Synopsys NanoSim), showing that the four-directional edge filtering at multiple resolutions is carried out at more than 1000 frames/sec with a clock rate of 500 kHz.

Norihiro Takahashi, Tadashi Shibata
Current-Mode Computation with Noise in a Scalable and Programmable Probabilistic Neural VLSI System

This paper presents the VLSI implementation of a scalable and programmable Continuous Restricted Boltzmann Machine (CRBM), a probabilistic model proven useful for recognising biomedical data. Each single-chip system contains 10 stochastic neurons and 25 adaptable connections. The scalability allows the network size to be expanded by interconnecting multiple chips, and the programmability allows all parameters to be set and refreshed to optimum values. In addition, current-mode computation is employed to increase the dynamic ranges of signals, and a noise generator is included to induce continuous-valued stochasticity on chip. The circuit design and corresponding measurement results are described and discussed.

Chih-Cheng Lu, H. Chen
Minimising Contrastive Divergence with Dynamic Current Mirrors

Implementing probabilistic models in Very-Large-Scale Integration (VLSI) is attractive for implantable biomedical devices aiming to improve sensor fusion. However, hardware non-idealities can introduce training errors, hindering optimal modelling through on-chip adaptation. This paper investigates the feasibility of using dynamic current mirrors to implement a simple and precise training circuit. The precision required for training the Continuous Restricted Boltzmann Machine (CRBM) is first identified. A training circuit based on accumulators formed by dynamic current mirrors is then proposed. By measuring the accumulators in VLSI, the feasibility of training the CRBM on chip according to its minimizing-contrastive-divergence rule is established.

Chih-Cheng Lu, H. Chen
Spiking Neural Network Self-configuration for Temporal Pattern Recognition Analysis

In this work we provide design guidelines for the hardware implementation of Spiking Neural Networks. The proposed methodology is applied to temporal pattern recognition analysis. For this purpose the networks are trained using a simplified Genetic Algorithm. The proposed solution is applied to estimate the processing efficiency of Spiking Neural Networks.

Josep L. Rosselló, Ivan de Paúl, Vincent Canals, Antoni Morro
Image Recognition in Analog VLSI with On-Chip Learning

We present an analog-VLSI neural network for image recognition which features a dimensionality reduction network and a classification stage. We implement local learning rules to train the network on chip or program the coefficients from a computer, while compensating for the negative effects of device mismatch and circuit nonlinearity. Our experimental results show that the circuits perform closely to equivalent software implementations, reaching 87% accuracy for face classification and 89% for handwritten digit classification. The circuit dissipates 20 mW and occupies 2.5 mm² of die area in a 0.35 μm CMOS process.

Gonzalo Carvajal, Waldo Valenzuela, Miguel Figueroa
Behavior Modeling by Neural Networks

Modeling of human and animal behavior is of interest for a number of diagnostic purposes. Convolutional neural networks offer a constructive approach that allows learning from a limited number of examples. Chaotic tendencies mean that learning is not always successful. The paper looks into a number of applications to find the reason for this anomaly and identifies the need for behavioral references to provide determinism in the diagnostic model.

Lambert Spaanenburg, Mona Akbarniai Tehrani, Richard Kleihorst, Peter B. L. Meijer
Statistical Parameter Identification of Analog Integrated Circuit Reverse Models

We solve the manufacturing problem of identifying the model statistical parameters ensuring a satisfactory quality of analog circuits produced in a photolithographic process. We formalize it in a statistical framework as the problem of inverting the mapping from the population of the circuit production variables to the population of performances. Both variables and performances are random. From a sample of the joint population we want to identify the statistical features of the former producing a performance distribution that satisfies the design constraints with a good preassigned probability. The key idea of the solution method we propose consists of describing the above mapping in terms of a mixture of granular functions, where each is responsible for a fuzzy set within the input-output space, hence for a cluster therein. The way of synthesizing the whole space as a mixture of these clusters is learnt directly from the examples. As a result we have an analytical form both of the mapping approximating complex Spice models in terms of polynomials in the production variables, and of the distribution law of the induced performances, which allows relatively quick and easy management of the production variables' statistical parameters as a function of the probability with which we plan to satisfy the design constraints. We apply the method to case studies and real production data, where our method outperforms current methods in both running time and accuracy.

Bruno Apolloni, Simone Bassis, Cristian Mesiano, Salvatore Rinaudo, Angelo Ciccazzo, Angelo Marotta
A New FGMOST Euclidean Distance Computational Circuit Based on Algebraic Mean of the Input Potentials

A new Euclidean distance circuit focused on high-speed operation is presented in this paper. The computing accuracy is improved by compensating for the error introduced by the second-order effects that affect MOS transistor operation (short-channel effect and mobility degradation), through a proper common-mode input voltage excitation of the squarer circuit. Because the elementary approach to designing a Euclidean distance circuit (exclusively based on classical MOS transistors in saturation) requires an additional threshold-voltage extractor circuit, the newly proposed idea is to use a FGMOST (Floating Gate MOS Transistor), which has the advantage of greatly reducing the circuit complexity.

Cosmin Radu Popa
FPGA Implementation of Support Vector Machines for 3D Object Identification

In this paper we present a hardware architecture for a Support Vector Machine intended for vision applications, to be implemented on an FPGA device. The architecture computes the contribution of each support vector in parallel without performing multiplications, by using a CORDIC algorithm and a hardware-friendly kernel function. Additionally, input images are not preprocessed for feature extraction, as each image is treated as a point in a high-dimensional space.

Marta Ruiz-Llata, Mar Yébenes-Calvino
Reconfigurable MAC-Based Architecture for Parallel Hardware Implementation on FPGAs of Artificial Neural Networks Using Fractional Fixed Point Representation

In this paper, we devise a hardware architecture for ANNs that takes advantage of the dedicated adder blocks, commonly called MACs, to compute both the weighted sum and the activation function. The proposed architecture requires a reduced silicon area considering the fact that the MACs come for free, as these are the FPGA's built-in cores. The implementation uses integer fixed-point arithmetic and operates with fractions to represent real numbers. The hardware is fast because it is massively parallel. Besides, the proposed architecture can adjust itself on the fly to the user-defined configuration of the neural network, i.e., the number of layers and neurons per layer of the ANN can be set with no extra hardware changes.

Rodrigo Martins da Silva, Nadia Nedjah, Luiza de Macedo Mourelle
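The fractional fixed-point arithmetic such an architecture relies on can be emulated in software; a sketch of a Q-format multiply-accumulate (the bit widths are illustrative, not the paper's):

```python
def to_fixed(x, frac_bits=8):
    """Quantize a real number to a signed fixed-point integer (Q-format
    with frac_bits fractional bits)."""
    return int(round(x * (1 << frac_bits)))

def mac_fixed(weights, inputs, frac_bits=8):
    """Multiply-accumulate in pure integer arithmetic, as a MAC block would:
    each product carries 2*frac_bits fractional bits, so the accumulated
    sum is rescaled once at the end."""
    acc = 0
    for w, x in zip(weights, inputs):
        acc += to_fixed(w, frac_bits) * to_fixed(x, frac_bits)
    return acc / float(1 << (2 * frac_bits))

# Weighted sum 0.5*0.25 + (-0.75)*0.5 = -0.25, computed with integers only.
y = mac_fixed([0.5, -0.75], [0.25, 0.5])
```

Keeping the full double-width product in the accumulator and rescaling once at the end is what lets the hardware chain MAC blocks without intermediate rounding error.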

Self Organization

A Two Stage Clustering Method Combining Self-Organizing Maps and Ant K-Means

This paper proposes a clustering method, SOMAK, which is composed of Self-Organizing Maps (SOM) followed by the Ant K-means (AK) algorithm. SOM is an Artificial Neural Network (ANN), one of whose characteristics is the nonlinear projection of a high-dimensional sensory space. AK is based on Ant Colony Optimization (ACO), a recently proposed meta-heuristic approach for solving hard combinatorial optimization problems. The AK algorithm modifies K-means in the way objects are located; these are then clustered according to probabilities, which are in turn updated by the pheromone. SOMAK performs well when compared with other clustering techniques and reduces the computational time.

Jefferson R. Souza, Teresa B. Ludermir, Leandro M. Almeida
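The SOM half of the pipeline follows Kohonen's standard update; a minimal sketch (the Ant K-means stage and the authors' parameter schedule are omitted):

```python
import numpy as np

def train_som(data, grid=(5, 5), epochs=20, lr0=0.5, sigma0=2.0, seed=0):
    """Minimal online SOM: for each sample, find the best-matching unit
    (BMU) and pull the prototypes of nearby grid units toward the sample,
    with learning rate and neighbourhood radius decaying over epochs."""
    rng = np.random.default_rng(seed)
    rows, cols = grid
    W = rng.random((rows * cols, data.shape[1]))
    coords = np.array([(r, c) for r in range(rows) for c in range(cols)], float)
    for e in range(epochs):
        lr = lr0 * (1 - e / epochs)
        sigma = sigma0 * (1 - e / epochs) + 0.5
        for x in data:
            bmu = np.argmin(((W - x) ** 2).sum(axis=1))
            d2 = ((coords - coords[bmu]) ** 2).sum(axis=1)
            hnb = np.exp(-d2 / (2 * sigma ** 2))   # neighbourhood kernel
            W += lr * hnb[:, None] * (x - W)
    return W
```

In the two-stage scheme, the trained prototypes (rather than the raw data) are then handed to the Ant K-means stage, which is what cuts the overall computational time.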
Image Theft Detection with Self-Organising Maps

In this paper an application of the TS-SOM variant of the self-organising map algorithm on the problem of copyright theft detection for bitmap images is shown. The algorithm facilitates the location of originals of copied, damaged or modified images within a database of hundreds of thousands of stock images. The method is shown to outperform binary decision tree indexing with invariant frame detection.

Philip Prentis, Mats Sjöberg, Markus Koskela, Jorma Laaksonen
Improved Kohonen Feature Map Associative Memory with Area Representation for Sequential Analog Patterns

In this paper, we propose an improved Kohonen feature map associative memory with area representation for sequential analog patterns. This model is based on the conventional Kohonen feature map associative memory with area representation for sequential analog patterns. The proposed model is sufficiently robust to noisy inputs and damaged neurons. Moreover, the learning speed of the proposed model is faster than that of the conventional model. We carried out a series of computer experiments and confirmed the effectiveness of the proposed model.

Tomonori Shirotori, Yuko Osana
Surface Reconstruction Method Based on a Growing Self-Organizing Map

This work introduces a method that produces a triangular mesh representation of a target object's surface. The new surface reconstruction method is based on Growing Self-Organizing Maps, which learn both the geometry and the topology of the input data set. Each map grows incrementally, producing meshes of different resolutions according to different application needs. Experimental results show that the proposed method can produce triangular meshes with approximately equilateral faces that approximate the shape of an object very well, including its concave regions and holes, if any.

Renata L. M. E. do Rego, Hansenclever F. Bassani, Daniel Filgueiras, Aluizio F. R. Araujo
Micro-SOM: A Linear-Time Multivariate Microaggregation Algorithm Based on Self-Organizing Maps

The protection of personal privacy is paramount, and consequently many efforts have been devoted to the study of data protection techniques. Governments, statistical agencies and corporations must protect the privacy of individuals while guaranteeing the right of society to knowledge. Microaggregation is one of the most promising solutions for this task. However, its high computational cost prevents its use with large amounts of data. In this article we propose a new microaggregation algorithm that uses self-organizing maps to scale down the computational cost while keeping the information loss reasonable.

Agusti Solanas, Arnau Gavalda, Robert Rallo
Identifying Clusters Using Growing Neural Gas: First Results

Growing Neural Gas is a self-organizing network capable of building a lattice of neural units that grows in the input pattern manifold. The structure of the obtained network is often not a planar graph and may be unsuitable for visualization; cluster identification is possible only if a set of disconnected subgraphs is produced. In this work we propose a method to select the neural units in order to extract information on the pattern clusters, even if the obtained network graph is connected. The proposed method creates a new structure called the Labeling Network (LNet) that replicates the topology of the GNG network and adds a set of weights to the links of the neuron graph. These weights are trained using an anti-Hebbian algorithm, producing a new structure capable of labeling input patterns according to their clusters.

Riccardo Rizzo, Alfonso Urso
Hierarchical Architecture with Modular Network SOM and Modular Reinforcement Learning

We propose a hierarchical architecture composed of a modular network SOM (mnSOM) layer and a modular reinforcement learning (mRL) layer. The mnSOM layer models characteristics of a target system, and the mRL layer provides control signals to the target system. Given a set of inputs and outputs from the target system, a winner module which minimizes the mean square output error is determined in the mnSOM layer. The corresponding module in the mRL layer is trained by reinforcement learning to maximize accumulated future rewards. An essential point, here, is that neighborhood learning is adopted at both layers, which guarantees a topology preserving map based on similarity between modules. Its application to a pursuit-evasion game demonstrates usefulness of interpolated modules in providing appropriate control signals. A modular approach to both modeling and control proposed in the paper provides a promising framework for wide-ranging tasks.

Masumi Ishikawa, Kosuke Ueno
Hybrid Systems for River Flood Forecasting Using MLP, SOM and Fuzzy Systems

This article presents an approach to data partitioning using specialist knowledge incorporated into intelligent solutions for river flow prediction. The main idea is to model the processes through hybrid systems, combining neural networks and fuzzy systems, characterizing their physical process. As a case study, the results obtained with these models on three basins, Três Marias, Tucuruí and Foz do Areia, all situated in Brazil, are investigated.

Ivna Valença, Teresa Ludermir
Topographic Mapping of Astronomical Light Curves via a Physically Inspired Probabilistic Model

We present a probabilistic generative approach for constructing topographic maps of light curves from eclipsing binary stars. The model defines a low-dimensional manifold of local noise models induced by a smooth non-linear mapping from a low-dimensional latent space into the space of probabilistic models of the observed light curves. The local noise models are physical models that describe how such light curves are generated. Due to the principled probabilistic nature of the model, a cost function arises naturally and the model parameters are fitted via MAP estimation using the Expectation-Maximisation algorithm. Once the model has been trained, each light curve may be projected to the latent space as the mean posterior probability over the local noise models. We demonstrate our approach on a dataset of artificially generated light curves and on a dataset comprised of light curves from real observations.

Nikolaos Gianniotis, Peter Tiňo, Steve Spreckley, Somak Raychaudhury
Generalized Self-Organizing Mixture Autoregressive Model for Modeling Financial Time Series

The mixture autoregressive (MAR) model regards a time series as a mixture of linear regressive processes. A self-organizing algorithm has been used together with the LMS algorithm for learning the parameters of the MAR model. The self-organizing map has been used to simplify the mixture as a winner-takes-all selection of local models, combined with an autocorrelation-coefficient-based similarity measure for identifying the correct local models, and has previously been shown to be able to uncover underlying autoregressive processes from a mixture. In this paper the self-organizing network is further generalized so that it fully considers the mixing mechanism and individual model variances in the modeling and prediction of time series. Experiments on both benchmark time series and several financial time series are presented. The results demonstrate the superiority of the proposed method over other time-series modeling techniques on a range of performance measures, including mean square error, prediction rate and accumulated profit.

Hujun Yin, He Ni
Self-Organizing Map Simulations Confirm Similarity of Spatial Correlation Structure in Natural Images and Cortical Representations

The goal of this paper is to extend the ecological theory of perception to the larger scale spatial organization of cortical maps. This leads to the hypothesis that cortical organization of responses to visual features reflects the environmental organization of these same features. In our previous work we have shown that the spatial statistics of natural images can be characterized by a slowly decaying, or low frequency correlational structure for color, and a rapidly decaying, or high-frequency structure for orientation features. A similar contrasting behavior of spatial statistics for color and orientation was measured in parallel in the cortical response of macaque visual cortex.

In order to explore whether this parallel is meaningful, we performed a cortical simulation using an adaptation of Kohonen’s self-organizing map algorithm. The simulated cortex responds to both low-frequency and high-frequency input visual features, and learns to represent these features through weight modification. We demonstrate that the learnt cortical weights show the same spatial correlation structure that is observed both in natural image statistics and the measured cortical responses.

A. Ravishankar Rao, Guillermo Cecchi

Intelligent Control and Adaptive Systems

Height Defuzzification Method on L∞ Space

A mathematical framework for the study of fuzzy approximate reasoning is presented in this paper. One of the defuzzification methods besides the center-of-gravity method, which is the best-known defuzzification method, is described. The continuity of the defuzzification methods and their application to fuzzy feedback control are discussed.

Takashi Mitsuishi, Yasunari Shidama
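The two defuzzification schemes contrasted in the abstract can be stated in a few lines; an illustrative discrete-sampling sketch (not the paper's L∞ formulation):

```python
def center_of_gravity(xs, mus):
    """Center-of-gravity defuzzification: centroid of the output membership
    function sampled at points xs with membership values mus."""
    return sum(x * m for x, m in zip(xs, mus)) / sum(mus)

def height_defuzzify(peaks, heights):
    """Height method: each rule contributes only the peak location of its
    output set, weighted by the rule's firing strength."""
    return sum(p * h for p, h in zip(peaks, heights)) / sum(heights)
```

The height method ignores the shape of each output set, which makes it cheaper than the center-of-gravity method while agreeing with it for symmetric output sets.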
An Additive Reinforcement Learning

In reinforcement learning, preparing basis functions requires a certain amount of prior knowledge and is in general a difficult task. To overcome this difficulty, an adaptive basis function construction technique was recently proposed by Keller et al., but it requires excessive computational cost. We propose an efficient approach in this context, in which the problem of approximating the value function is naturally decomposed into a number of sub-problems, each of which can be solved at small computational cost. Computer experiments show that the CPU time needed by our method is much smaller than that of the existing method.

Takeshi Mori, Shin Ishii
Neural Spike Suppression by Adaptive Control of an Unknown Steady State

A FitzHugh–Nagumo type spiking neuron model equipped with an asymmetric activation function is investigated. An analogue nonlinear electrical circuit imitating the dynamics of the model is proposed. It is demonstrated that a simple first order linear filter coupled to the system can inhibit spiking and stabilize the system on an unstable steady state, the position of which is not required to be known, since the filter operates as an adaptive controller. Analytical, numerical and experimental results are presented.

Arūnas Tamaševičius, Elena Tamaševičiūtė, Gytis Mykolaitis, Skaidra Bumelienė, Raimundas Kirvaitis, Ruedi Stoop
Combined Mechanisms of Internal Model Control and Impedance Control under Force Fields

In daily life, humans must compensate for loads arising from interaction with the physical environment. Recent studies in human motor control have shown that when compensating for these loads, humans combine two feedforward control strategies, internal model control and impedance control. However, the combined mechanisms of the two control strategies have not been clarified. We propose a computational model of human arm movements and discuss how humans combine the two control strategies. We use an optimal regulator and simulate human arm movements under dynamic environments.

Naoki Tomi, Manabu Gouko, Koji Ito
Neural Network Control of Unknown Nonlinear Systems with Efficient Transient Performance

In this paper we provide a neural-based semi-global stabilization design for unknown nonlinear state-feedback stabilizable systems. The proposed design is shown to guarantee arbitrarily good transient performance outside the regions where the system is uncontrollable. This is made possible through an appropriate combination of recent results developed by the author in the areas of adaptive control and adaptive optimization and a new result on the convex construction of Control Lyapunov Functions (CLF) for nonlinear systems.

Elias B. Kosmatopoulos, Diamantis Manolis, M. Papageorgiou
High-Order Fuzzy Switching Neural Networks: Application to the Tracking Control of a Class of Uncertain SISO Nonlinear Systems

In this paper, a high-order neuro-fuzzy network (HONFN) with improved approximation capability w.r.t. the standard high-order neural network (HONN) is proposed. In order to reduce the overall approximation error, a decomposition of the neural network (NN) approximation space into overlapping sub-regions is created and different NN approximations for each sub-region are considered. To this end, the HONFN implements a fuzzy switching among different HONNs as its input vector switches along the different sub-regions of the approximation space. The HONFN is then used to design an adaptive controller for a class of uncertain single-input single-output nonlinear systems. The proposed scheme ensures the semiglobal uniform ultimate boundedness of the tracking error within a neighborhood of the origin and the boundedness of the NN weights and control law. Furthermore, a minimal HONFN, with two properly selected fuzzy rules, guarantees that the resulting ultimate bound does not depend on the unknown optimal approximation error (as is the case for classical adaptive NN control schemes) but solely on constants chosen by the designer. A simulation study is carried out to compare the proposed scheme with a classical HONN controller.

Haris E. Psillakis

Neural and Hybrid Architectures

A Guide for the Upper Bound on the Number of Continuous-Valued Hidden Nodes of a Feed-Forward Network

This study proposes and validates a construction concept for the realization of a real-valued single-hidden-layer feed-forward neural network (SLFN) with continuous-valued hidden nodes for arbitrary mapping problems. The proposed construction concept states that, for a specific application problem, the upper bound on the number of hidden nodes used depends on the characteristics of the adopted SLFN and the observed properties of the collected data samples. A positive validation result is obtained from the experiment of applying the construction concept to the m-bit parity problem learned by constructing two types of SLFN network solutions.

Rua-Huan Tsaih, Yat-wah Wan
Comparative Study of the CG and HBF ODEs Used in the Global Minimization of Nonconvex Functions

This paper presents a unified control Liapunov function (CLF) approach to the design of heavy ball with friction (HBF) and conjugate gradient (CG) neural networks that aim to minimize scalar nonconvex functions that have continuous first- and second-order derivatives and a unique global minimum. This approach leads naturally to the design of second-order differential equations which are the mathematical models of the corresponding implementations as neural networks. Preliminary numerical simulations indicate that, on a small suite of benchmark test problems, a continuous version of the well-known conjugate gradient algorithm, designed by the proposed CLF method, has better performance than its HBF competitor.

Amit Bhaya, Fernando A. Pazos, Eugenius Kaszkurewicz
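The HBF dynamics referred to above can be illustrated with a minimal numerical sketch (our own illustration, not the authors' CLF-designed network): a semi-implicit Euler discretization of the second-order ODE x'' + gamma x' + grad f(x) = 0, in which the friction term dissipates the "ball's" kinetic energy as the trajectory descends toward a minimum of f.

```python
import numpy as np

def hbf_minimize(grad, x0, gamma=1.0, dt=0.05, steps=2000):
    # Semi-implicit Euler discretization of the heavy-ball-with-friction ODE
    #   x'' + gamma * x' + grad f(x) = 0
    # (a sketch of the HBF dynamics; parameters are illustrative choices)
    x = np.asarray(x0, dtype=float)
    v = np.zeros_like(x)
    for _ in range(steps):
        v += dt * (-gamma * v - grad(x))  # update velocity first (damping + force)
        x += dt * v                       # then move the "ball"
    return x
```

For a quadratic f the trajectory is a damped oscillation around the minimizer; the CG network discussed in the paper adds a conjugacy mechanism on top of this basic second-order dynamic.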
On the Knowledge Organization in Concept Formation: An Exploratory Cognitive Modeling Study

Recent cognitive modeling studies suggest the effectiveness of meta-heuristic optimization in describing human cognitive behaviors. Such models are built on the basis of population-based algorithms (e.g., genetic algorithms) and thus hold multiple solutions or notions. There are, however, important yet unaddressed issues in the cognitive mechanisms associated with the possession of multiple notions. The issue we address in the present research is how multiple notions are organized in the mind. In particular, we paid close attention to how each notion interacts with other notions while learning a new concept. In so doing, we incorporated Particle Swarm Optimization (PSO) in a cognitive model of concept learning. Three PSO-based concept learning models were developed and compared in the present exploratory cognitive modeling study.

Toshihiko Matsuka, Hidehito Honda, Arieta Chouchourelou, Sachiko Kiyokawa
Dynamics of Incremental Learning by VSF-Network

In this paper, we report on the dynamics of VSF-Network, a neural network for incremental learning that hybridizes a chaos neural network with a hierarchical network. VSF-Network can find unknown elements in the input using clusters generated by the chaos neurons. We introduce a new incremental learning model to explain the dynamics of VSF-Network and present an analysis of these dynamics, focusing on the connection weights between layers and the neuron clusters generated by the chaotic behavior.

Yoshitsugu Kakemoto, Shinchi Nakasuka
Kernel CMAC with Reduced Memory Complexity

The Cerebellar Model Articulation Controller (CMAC) has some attractive features: fast learning capability and the possibility of efficient digital hardware implementation. Besides these attractive features it has a serious drawback: its memory complexity may be very large. In the multidimensional case it may be so large that the network practically cannot be implemented. Several different approaches to reducing memory complexity have been suggested so far. Although these approaches may greatly reduce memory complexity, we have to pay a price for this reduction: either both the modelling and generalization capabilities deteriorate, or the training process becomes much more complicated. This paper proposes a new approach to complexity reduction, where properly constructed hash-coding is combined with a regularized kernel representation. The proposed version exploits the benefits of kernel representation and the complexity-reduction effect of hash-coding, while smoothing regularization helps to reduce the performance degradation.

Gábor Horváth, Kristóf Gáti
Model Complexity of Neural Networks and Integral Transforms

Model complexity of neural networks is investigated using tools from nonlinear approximation and integration theory. Estimates of network complexity are obtained from inspection of upper bounds on decrease of approximation errors in approximation of multivariable functions by networks with increasing numbers of units. The upper bounds are derived using integral transforms with kernels corresponding to various types of computational units. The results are applied to perceptron networks.

Věra Kůrková
Function Decomposition Network

A novel neural network architecture is proposed to solve the nonlinear function decomposition problem. A top-down approach is applied that does not require prior knowledge about the function’s properties. The abilities of our method are demonstrated using synthetic test functions and confirmed by the solution of a real-world problem. Possible directions for further development of the presented approach are discussed.

Yevgeniy Bodyanskiy, Sergiy Popov, Mykola Titov
Improved Storage Capacity in Correlation Matrix Memories Storing Fixed Weight Codes

In this paper we introduce an improved binary correlation matrix memory (CMM) with better storage capacity when storing sparse fixed weight codes generated with the algorithm of Baum et al. [3]. We outline associative memory, and describe the binary correlation matrix memory, a specific example of a distributed associative memory. The importance of the representation used in a CMM for input and output codes is discussed, with specific regard to sparse fixed weight codes. We present an algorithm for generating fixed weight codes, originally given by Baum et al. [3]. The properties of this algorithm are briefly discussed, including possible thresholding functions which could be used when storing these codes in a CMM: L-max and L-wta. Finally, results generated from a series of simulations are used to demonstrate that the use of L-wta as a thresholding function provides an increase in storage capacity over L-max.

Stephen Hobson, Jim Austin
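The storage and recall operations of the binary CMM described above can be sketched in a few lines (our own illustration with hypothetical function names; only L-max thresholding is shown, while L-wta would instead pick winners within competing groups of output units):

```python
import numpy as np

def cmm_store(pairs, n_in, n_out):
    # Binary correlation matrix memory: superimpose the outer products of
    # the (input, output) code pairs with a logical OR.
    W = np.zeros((n_out, n_in), dtype=bool)
    for x, z in pairs:
        W |= np.outer(z, x).astype(bool)
    return W

def cmm_recall_lmax(W, x, L):
    # L-max thresholding: activate every output unit whose summed response
    # reaches the L-th largest activation (ties may yield more than L bits).
    s = W.astype(int) @ x
    theta = np.sort(s)[-L]
    return (s >= theta).astype(int)
```

With sparse fixed weight codes of weight L, a stored pair is recovered exactly as long as the superimposed outer products do not saturate the matrix.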
Multiagent Reinforcement Learning with Spiking and Non-Spiking Agents in the Iterated Prisoner’s Dilemma

This paper investigates Multiagent Reinforcement Learning (MARL) in a general-sum game where the payoff structure is such that the agents are required to exploit each other in a way that benefits all agents. The contradictory nature of these games makes their study in multiagent systems quite challenging. In particular, we investigate MARL with spiking and non-spiking agents in the Iterated Prisoner’s Dilemma by exploring the conditions required to enhance its cooperative outcome. According to the results, cooperation is enhanced by: (i) a mixture of positive and negative payoff values and a high discount factor in the case of non-spiking agents and (ii) a longer eligibility trace time constant in the case of spiking agents. Moreover, it is shown that spiking and non-spiking agents have similar behaviour and therefore can equally well be used in any multiagent interaction setting. For training the spiking agents, a novel and necessary modification that enhances competition is made to an existing learning rule based on stochastic synaptic transmission.

Vassilis Vassiliades, Aristodemos Cleanthous, Chris Christodoulou
Unsupervised Learning in Reservoir Computing: Modeling Hippocampal Place Cells for Small Mobile Robots

Biological systems (e.g., rats) have efficient and robust localization abilities provided by the so-called place cells, which are found in the hippocampus of rodents and primates (these cells encode locations of the animal’s environment). This work seeks to model these place cells by employing three (biologically plausible) techniques: Reservoir Computing (RC), Slow Feature Analysis (SFA), and Independent Component Analysis (ICA). The proposed architecture is composed of three layers, where the bottom layer is a dynamic reservoir of recurrent nodes with fixed weights. The upper layers (SFA and ICA) provide a self-organized formation of place cells, learned in an unsupervised way. Experiments show that a simulated mobile robot with 17 noisy short-range distance sensors is able to self-localize in its environment with the proposed architecture, forming a spatial representation which is dependent on the robot’s direction.

Eric A. Antonelo, Benjamin Schrauwen
Switching Hidden Markov Models for Learning of Motion Patterns in Videos

Building on the current understanding of the neural architecture of the visual cortex, we present a graphical model for learning and classification of motion patterns in videos. The model is composed of an arbitrary number of Hidden Markov Models (HMMs) with shared Gaussian mixture models. The novel extension of our model is the use of an additional Markov chain serving as a switch that indicates the currently active HMM. We therefore call the model a Switching Hidden Markov Model (SHMM). The SHMM learns from input optical flow in an unsupervised fashion. The functionality of the model is tested with artificially simulated time sequences. Tests with real videos show that the model is capable of learning and recognizing the motion activities of single individuals and of classifying motion patterns exhibited by groups of people. Classification rates of about 75 percent for real videos are satisfactory taking into account the relative simplicity of the model.

Matthias Höffken, Daniel Oberhoff, Marina Kolesnik
Multimodal Belief Integration by HMM/SVM-Embedded Bayesian Network: Applications to Ambulating PC Operation by Body Motions and Brain Signals

Methods to integrate multimodal beliefs by Bayesian Networks (BNs) comprising Hidden Markov Models (HMMs) and Support Vector Machines (SVMs) are presented. The integrated system is applied to the operation of ambulating PCs (biped humanoids) across the network. The new features in this paper are twofold. First, the HMM/SVM-embedded BN for the multimodal belief integration is newly presented; its subsystem also has a new structure, a committee SVM array. Second, there are the new applications: body and brain signals are applied to ambulating PC operation by using the recognition of multimodal signal patterns. The body signals here are human gestures; the brain signals are either HbO2 signals from NIRS or neural spike trains. For such ambulating PC operation, the total system shows better performance than HMM and BN systems alone.

Yasuo Matsuyama, Fumiya Matsushima, Youichi Nishida, Takashi Hatakeyama, Nimiko Ochiai, Shogo Aida
A Neural Network Model of Metaphor Generation with Dynamic Interaction

The purpose of this study is to construct a computational model that generates understandable metaphors of the form “A (target) like B (vehicle)” from the features of the target based on a language statistical analysis. The model outputs candidate nouns for the vehicle from inputs for the target and its features that are represented by adjectives and verbs. First, latent classes among nouns and adjectives (or verbs) are estimated from statistical language analysis. Secondly, a computational model of metaphor generation, including dynamic interaction among features, is constructed based on the statistical analysis results. Finally, a psychological experiment is conducted to examine the validity of the model.

Asuka Terai, Masanori Nakagawa
Almost Random Projection Machine

Backpropagation of errors is not only hard to justify from a biological perspective but also fails to solve problems requiring complex logic. A simpler algorithm based on the generation and filtering of useful random projections has better biological justification, is faster and easier to train, and may in practice solve non-separable problems of higher complexity than typical feedforward neural networks. Estimation of confidence in network decisions is done by visualization of the number of nodes that agree with the final decision.

Włodzisław Duch, Tomasz Maszczyk
Optimized Learning Vector Quantization Classifier with an Adaptive Euclidean Distance

This paper presents a classifier based on Optimized Learning Vector Quantization (optimized version of the basic LVQ1) and an adaptive Euclidean distance. The classifier furnishes discriminative class regions of the input data set that are represented by prototypes. In order to compare prototypes and patterns, the classifier uses an adaptive Euclidean distance that changes at each iteration but is the same for all the class regions. Experiments with real and synthetic data sets demonstrate the usefulness of this classifier.

Renata M. C. R. de Souza, Telmo de M. Silva Filho
Efficient Parametric Adjustment of Fuzzy Inference System Using Error Backpropagation Method

This paper presents a new methodology for the adjustment of fuzzy inference systems, using a technique based on the error backpropagation method. The free parameters of the fuzzy inference system, such as the intrinsic parameters of the membership functions and the weights of the inference rules, are automatically adjusted. This methodology is interesting not only for the results obtained through computer simulations, but also for its generality concerning the kind of fuzzy inference system used: it extends to both the Mamdani architecture and that suggested by Takagi and Sugeno. The validation of the presented methodology is accomplished through time series estimation and a mathematical modeling problem. More specifically, the Mackey-Glass chaotic time series is used for the validation of the proposed methodology.

Ivan da Silva, Rogerio Flauzino
Neuro-fuzzy Rough Classifier Ensemble

The paper proposes a new ensemble of neuro-fuzzy rough set classifiers. The ensemble uses fuzzy rules derived by the Adaboost metalearning. The rules are used in an ensemble of neuro-fuzzy rough set systems to gain the ability to work with incomplete data (in terms of missing features). This feature is not common among different machine learning methods like neural networks or fuzzy systems. The systems are combined into the larger ensemble to achieve better accuracy. Simulations on a well-known benchmark showed the ability of the proposed system to perform relatively well.

Marcin Korytkowski, Robert Nowicki, Rafał Scherer
Combining Feature Selection and Local Modelling in the KDD Cup 99 Dataset

In this work, a new approach for intrusion detection in computer networks is introduced. Using the KDD Cup 99 dataset as a benchmark, the proposed method consists of a combination between feature selection methods and a novel local classification method. This classification method –called FVQIT (Frontier Vector Quantization using Information Theory)– uses a modified clustering algorithm to split up the feature space into several local models, in each of which the classification task is performed independently. The method is applied over the KDD Cup 99 dataset, with the objective of improving performance achieved by previous authors. Experimental results obtained indicate the adequacy of the proposed approach.

Iago Porto-Díaz, David Martínez-Rego, Amparo Alonso-Betanzos, Oscar Fontenla-Romero
An Automatic Parameter Adjustment Method of Pulse Coupled Neural Network for Image Segmentation

A Pulse Coupled Neural Network (PCNN) has been proposed as a numerical model of the cat visual cortex, and it has been applied in engineering fields, especially in image processing, e.g., segmentation, edge enhancement, and so on. The PCNN model consists of neurons with two kinds of inputs, namely the feeding input and the linking input, each of which has many parameters. The parameters have usually been defined empirically, and their optimization has been known as one of the remaining problems of the PCNN. According to recent studies, the parameters of a PCNN can be obtained using a parameter learning rule or evolutionary programming; however, these methods require teaching images for the learning. In this study, we propose a parameter adjustment method of the PCNN for image segmentation. The proposed method changes the parameters through iterations of trial segmentations and does not require any teaching signal or teaching pattern. Successful results are obtained in the simulations, and we conclude that the proposed method shows good performance for the parameter adjustment of PCNNs.

Masato Yonekawa, Hiroaki Kurokawa
Pattern Identification by Committee of Potts Perceptrons

A method for estimating the quality of data identification by a parametric perceptron is presented. The method allows one to combine parametric perceptrons into a committee. It is shown, by the example of Potts perceptrons, that the storage capacity of the committee grows linearly with the number of perceptrons forming the committee. The combination of perceptrons into a committee is useful when, for the given task parameters (image dimension and chromaticity, the number of patterns, distortion level, identification reliability), one perceptron alone is unable to solve the identification task. The method can be applied to q-ary or binary pattern identification tasks.

Vladimir Kryzhanovsky

Support Vector Machine

Is Primal Better Than Dual

Chapelle proposed to train support vector machines (SVMs) in the primal form by Newton’s method and discussed its advantages. In this paper we propose training L2 SVMs in the dual form in a manner similar to that proposed by Chapelle. Namely, we solve the quadratic programming problem for the initial working set of training data by Newton’s method, delete from the working set the data with negative Lagrange multipliers as well as the data with associated margins larger than or equal to 1, add to the working set the training data with associated margins less than 1, and repeat training the SVM until the working set does not change. The matrix associated with the dual quadratic form is positive definite, while that associated with the primal quadratic form is positive semi-definite, and the former matrix requires fewer kernel evaluations. Computer experiments show that in most cases training the SVM by the proposed method is more stable and faster than training the SVM in the primal.

Shigeo Abe
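The working-set loop described above can be sketched under our own simplifying assumptions (linear kernel, bias embedded by adding 1 to the kernel, illustrative function names): the L2 slack term adds I/C to the dual matrix, making it positive definite, so each Newton step on the working set reduces to a single linear solve; negative multipliers are then deleted and margin violators added until the working set stabilizes.

```python
import numpy as np

def train_l2svm_dual(X, y, C=10.0, max_outer=50):
    # Sketch of L2-SVM dual training with a working-set loop (after the
    # procedure summarized in the abstract; linear kernel only).
    n = len(y)
    K = X @ X.T + 1.0                  # linear kernel; the +1 embeds the bias
    Yp = y[:, None] * y[None, :]
    Q = Yp * K + np.eye(n) / C         # positive definite dual matrix
    work = list(range(min(2, n)))      # small initial working set
    alpha = np.zeros(n)
    for _ in range(max_outer):
        # One Newton step on the quadratic dual = an exact linear solve;
        # delete working-set data whose multiplier turns negative, re-solve.
        while True:
            a = np.linalg.solve(Q[np.ix_(work, work)], np.ones(len(work)))
            if (a >= 0).all() or len(work) == 1:
                break
            work = [i for i, ai in zip(work, a) if ai >= 0]
        alpha[:] = 0.0
        alpha[work] = np.maximum(a, 0.0)
        margins = (Yp * K) @ alpha     # y_i * f(x_i) for all training data
        new_work = sorted(set(np.flatnonzero(margins < 1.0)) |
                          set(np.flatnonzero(alpha > 0)))
        if new_work == sorted(work):   # working set unchanged: done
            break
        work = new_work
    w = (alpha * y) @ X                # primal weights (linear kernel only)
    b = float((alpha * y).sum())       # bias recovered from the +1 in K
    return w, b
```

Because Q is positive definite, the linear solve is the Newton step on the dual quadratic, which is the property the abstract contrasts with the positive semi-definite primal Hessian.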
A Fast BMU Search for Support Vector Machine

In this paper, we propose a fast learning algorithm for the support vector machine (SVM). Our work is based on Learning Vector Quantization (LVQ): we compress the data so as to perform properly in the context of clustered-data margin maximization. To solve the problem faster, we propose a fast Best Matching Unit (BMU) search and introduce it into the Threshold Order-Dependent (TOD) algorithm, which is one of the simplest forms of LVQ. Experimental results demonstrate that our method is as accurate as the existing implementation, but faster in most situations. We also show an extension of the proposed learning framework to the online re-training problem.

Wataru Kasai, Yutaro Tobe, Osamu Hasegawa
European Option Pricing by Using the Support Vector Regression Approach

We explore the pricing performance of Support Vector Regression for pricing S&P 500 index call options. Support Vector Regression is a novel nonparametric methodology that has been developed in the context of statistical learning theory, and until now it has not been widely used in financial econometric applications. This new method is compared with the Black and Scholes (1973) option pricing model, using standard implied parameters and parameters derived via the Deterministic Volatility Functions approach. The empirical analysis has shown promising results for the Support Vector Regression models.

Panayiotis C. Andreou, Chris Charalambous, Spiros H. Martzoukos
Learning SVMs from Sloppily Labeled Data

This paper proposes a modelling of Support Vector Machine (SVM) learning to address the problem of learning with sloppy labels. In binary classification, learning with sloppy labels is the situation where a learner is provided with labelled data, where the observed labels of each class are possibly noisy (flipped) versions of their true class and where the probability of flipping a label y to –y only depends on y. The noise probability is therefore constant and uniform within each class: learning with positive and unlabeled data is, for instance, a motivating example for this model. In order to learn with sloppy labels, we propose SloppySvm, an SVM algorithm that minimizes a tailored nonconvex functional that is shown to be a uniform estimate of the noise-free SVM functional. Several experiments validate the soundness of our approach.

Guillaume Stempfel, Liva Ralaivola
The GMM-SVM Supervector Approach for the Recognition of the Emotional Status from Speech

Emotion recognition from speech is an important field of research in human-machine interfaces and has various applications, for instance in call centers. In the proposed classifier system, RASTA-PLP (perceptual linear prediction) features are extracted from the speech signals. The first step is to compute a universal background model (UBM) representing the general structure of the underlying feature space of speech signals. This UBM is modeled as a Gaussian mixture model (GMM). After computing the UBM, the sequence of feature vectors extracted from the utterance is used to re-train the UBM. From this GMM the mean vectors are extracted and concatenated into the so-called GMM supervectors, which are then applied to a support vector machine classifier. The overall system has been evaluated using utterances from the public Berlin emotional database. Utilizing the proposed features, a recognition rate of 79% (utterance based) has been achieved, which is close to the performance of humans on this database.

Friedhelm Schwenker, Stefan Scherer, Yasmine M. Magdi, Günther Palm
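The mean-supervector construction described above can be sketched as follows (a simplified illustration with our own function names; the actual system re-trains the UBM on the utterance, whereas this sketch uses a single standard MAP mean-adaptation step):

```python
import numpy as np

def gmm_supervector(frames, ubm_means, ubm_vars, ubm_weights, relevance=16.0):
    # MAP-adapt the diagonal-covariance UBM means toward an utterance's
    # feature frames, then concatenate the adapted means into one
    # fixed-length vector to feed an SVM classifier.
    d = frames[:, None, :] - ubm_means[None, :, :]
    log_p = -0.5 * (np.log(2 * np.pi * ubm_vars) + d**2 / ubm_vars).sum(-1)
    log_p += np.log(ubm_weights)
    gamma = np.exp(log_p - log_p.max(1, keepdims=True))
    gamma /= gamma.sum(1, keepdims=True)            # frame responsibilities
    n_k = gamma.sum(0)                              # soft counts per component
    ex_k = (gamma.T @ frames) / np.maximum(n_k[:, None], 1e-12)
    alpha = (n_k / (n_k + relevance))[:, None]      # MAP adaptation factor
    adapted = alpha * ex_k + (1 - alpha) * ubm_means
    return adapted.ravel()                          # the GMM supervector
```

Components that "see" many frames move toward the utterance statistics, while rarely visited components stay at the UBM means, which is what makes the supervector a stable fixed-length utterance representation.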
A Simple Proof of the Convergence of the SMO Algorithm for Linearly Separable Problems

We give a new proof of the convergence of the SMO algorithm for SVM training over linearly separable problems that partly builds on the one by Mitchell et al. for the convergence of the MDM algorithm to find the point of a convex set closest to the origin. Our proof relies on a simple derivation of SMO that we also present here and, while less general, it is considerably simpler than previous ones and yields algorithmic insights into the working of SMO.

Jorge López, José R. Dorronsoro
Spanning SVM Tree for Personalized Transductive Learning

Personalized Transductive Learning (PTL) builds a unique local model for the classification of each test sample and is therefore practically neighborhood-dependent. While existing PTL methods usually define the neighborhood by a predefined (dis)similarity measure, in this paper we introduce a new concept of knowledgeable neighborhood and a transductive SVM classification tree (t-SVMT) for PTL. The neighborhood of a test sample is constructed over the classification knowledge modelled by regional SVMs, and a set of such SVMs adjacent to the test sample is aggregated systematically into a t-SVMT. Compared to a regular SVM and other SVMTs, the proposed t-SVMT, by virtue of the aggregation of SVMs, has an inherent superiority in classifying class-imbalanced datasets. Furthermore, the t-SVMT has solved the over-fitting problem of all previous SVMTs, as it aggregates neighborhood knowledge and thus significantly reduces the size of the SVM tree.

Shaoning Pang, Tao Ban, Youki Kadobayashi, Nik Kasabov
Improving Text Classification Performance with Incremental Background Knowledge

Text classification is generally the process of extracting interesting and non-trivial information and knowledge from text. One of the main problems with text classification systems is the lack of labeled data, as well as the cost of labeling unlabeled data. Thus, there is a growing interest in exploring the use of unlabeled data as a way to improve classification performance in text classification. The ready availability of this kind of data in most applications makes it an appealing source of information.

In this work we propose an Incremental Background Knowledge (IBK) technique to introduce unlabeled data into the training set by expanding it using initial classifiers to deliver oracle decisions. The defined incremental SVM margin-based method was tested in the Reuters-21578 benchmark showing promising results.

Catarina Silva, Bernardete Ribeiro
Empirical Study of the Universum SVM Learning for High-Dimensional Data

Many applications of machine learning involve sparse high-dimensional data, where the number of input features is (much) larger than the number of data samples, d ≫ n. Predictive modeling of such data is very ill-posed and prone to overfitting. Several recent studies for modeling high-dimensional data employ a new learning methodology called Learning through Contradictions or Universum Learning due to Vapnik (1998, 2006). This method incorporates a priori knowledge about application data, in the form of additional Universum samples, into the learning process. This paper investigates the generalization properties of the Universum-SVM and how they are related to characteristics of the data. We describe practical conditions for evaluating the effectiveness of Random Averaging Universum.

Vladimir Cherkassky, Wuyang Dai
Relevance Feedback for Content-Based Image Retrieval Using Support Vector Machines and Feature Selection

A relevance feedback (RF) approach for content-based image retrieval (CBIR) is proposed, which is based on Support Vector Machines (SVMs) and uses a feature selection technique to reduce the dimensionality of the image feature space. Specifically, each image is described by a multidimensional vector combining color, texture and shape information. In each RF round, the positive and negative examples provided by the user are used to determine a relatively small number of the most important features for the corresponding classification task, via a feature selection methodology. After the feature selection has been performed, an SVM classifier is trained to distinguish between relevant and irrelevant images according to the preferences of the user, using the restriction of the user examples on the set of selected features. The trained classifier is subsequently used to provide an updated ranking of the database images represented in the space of the selected features. Numerical experiments are presented that demonstrate the merits of the proposed relevance feedback methodology.

Apostolos Marakakis, Nikolaos Galatsanos, Aristidis Likas, Andreas Stafylopatis

Recurrent Neural Network

Understanding the Principles of Recursive Neural Networks: A Generative Approach to Tackle Model Complexity

Recursive Neural Networks are non-linear adaptive models that are able to learn deeply structured information. However, these models have not yet been broadly accepted, mainly due to their inherent complexity: not only are they extremely complex information-processing models, but their learning phase is also computationally expensive. The most popular training method for these models is back-propagation through the structure. This algorithm has been shown not to be the most appropriate for structured processing due to problems of convergence, while more sophisticated training methods enhance the speed of convergence at the expense of significantly increasing the computational cost. In this paper, we first perform an analysis of the underlying principles behind these models aimed at understanding their computational power. Second, we propose an approximate second-order stochastic learning algorithm. The proposed algorithm dynamically adapts the learning rate throughout the training phase of the network without incurring excessively expensive computational effort. The algorithm operates in both on-line and batch modes. Furthermore, the resulting learning scheme is robust against the vanishing gradients problem. The advantages of the proposed algorithm are demonstrated with a real-world application example.

Alejandro Chinea
An EM Based Training Algorithm for Recurrent Neural Networks

Recurrent neural networks serve as black-box models for nonlinear dynamical systems identification and time series prediction. Training of recurrent networks typically minimizes the quadratic difference between the network output and an observed time series. This implicitly assumes that the dynamics of the underlying system are deterministic, which is not a realistic assumption in many cases. In contrast, state-space models allow for noise in both the internal state transitions and the mapping from internal states to observations. Here, we consider recurrent networks as nonlinear state-space models and suggest a training algorithm based on Expectation-Maximization. A nonlinear transfer function for the hidden neurons leads to an intractable inference problem. We investigate the use of a Particle Smoother to approximate the E-step and simultaneously estimate the expectations required in the M-step. The method is demonstrated on a synthetic data set and on a time series prediction task arising in radiation therapy, where the goal is to predict the motion of a lung tumor during respiration.

Jan Unkelbach, Sun Yi, Jürgen Schmidhuber
Modeling Dst with Recurrent EM Neural Networks

Recurrent Neural Networks have been used extensively for space weather forecasts of geomagnetospheric disturbances. One of the major drawbacks for reliable forecasts has been the use of training algorithms that are unable to account for model uncertainty and noise in data. We propose a probabilistic training algorithm based on the Expectation-Maximization framework for parameterization of the model, which makes use of a forward-filtering and backward-smoothing Expectation step, and a Maximization step in which the model uncertainty and measurement noise estimates are computed. Through numerical experimentation it is shown that the proposed model allows for reliable forecasts and also outperforms other neural time series models trained with the Extended Kalman Filter and gradient descent learning.

Derrick Takeshi Mirikitani, Lahcen Ouarbya
On the Quantification of Dynamics in Reservoir Computing

Reservoir Computing (RC) offers a computationally efficient and well-performing technique for using the temporal processing power of Recurrent Neural Networks (RNNs), while avoiding the traditional long training times and stability problems. The method is both simple and elegant: a random RNN (called the reservoir) is constructed using only a few global parameters to tune the dynamics into a desirable regime, and the dynamic response of the reservoir is used to train a simple linear regression function called the readout function; the reservoir itself remains untrained. This technique has shown some experimentally very convincing results on a variety of tasks, but a thorough understanding of the importance of the dynamics for the performance is still lacking. This contribution aims to extend this understanding by presenting a more sophisticated extension of the traditional way of characterizing the reservoir dynamics, using the dynamic profile of the Jacobian of the reservoir instead of static, a priori measures such as the standard spectral radius. We show that this measure gives a more accurate description of the reservoir dynamics and can serve as a predictor for the performance. Additionally, due to the theoretical background from dynamical systems theory, this measure offers some insight into the underlying mechanisms of RC.

David Verstraeten, Benjamin Schrauwen
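The basic RC recipe the abstract refers to, a random reservoir scaled to a target spectral radius plus a trained linear readout, can be sketched as follows. This is the standard echo state network construction, not the paper's Jacobian-based analysis; all sizes and parameter values are illustrative.

```python
import numpy as np

def build_reservoir(n, spectral_radius=0.9, connectivity=0.1, seed=0):
    """Random sparse reservoir, rescaled so that the magnitude of its
    largest eigenvalue equals the requested spectral radius."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=(n, n)) * (rng.random((n, n)) < connectivity)
    w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))
    return w

def run_esn(u, w, w_in):
    """Drive the reservoir with the input sequence u and collect states."""
    n = w.shape[0]
    states = np.zeros((len(u), n))
    x = np.zeros(n)
    for t, ut in enumerate(u):
        x = np.tanh(w @ x + w_in * ut)     # reservoir update (untrained)
        states[t] = x
    return states

# Train only the linear readout (ridge regression) on a one-step-memory
# task: the target is the input delayed by one step.
rng = np.random.default_rng(1)
u = rng.uniform(-0.5, 0.5, 500)
y = np.roll(u, 1); y[0] = 0.0
w = build_reservoir(100)
w_in = rng.uniform(-1.0, 1.0, 100)
s = run_esn(u, w, w_in)
ridge = 1e-6
w_out = np.linalg.solve(s.T @ s + ridge * np.eye(100), s.T @ y)
mse = np.mean((s @ w_out - y) ** 2)
```

Note that only `w_out` is trained; the reservoir weights `w` and `w_in` stay fixed after construction, which is exactly what makes the approach cheap compared to full RNN training.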
Solving the CLM Problem by Discrete-Time Linear Threshold Recurrent Neural Networks

The competitive layer model (CLM) can be described by an optimization problem formulated with the CLM energy function. The minimum points of the CLM energy function can be reached by running suitable recurrent neural networks; in other words, the CLM can be implemented by recurrent neural networks. This paper proposes discrete-time linear threshold recurrent networks to solve the CLM problem. Conditions for the stable attractors of the networks are obtained, and these correspond exactly to the conditions for the minimum points of the CLM energy function established earlier in the literature. Therefore, the proposed network can be used to implement the CLM.

Lei Zhang, Pheng Ann Heng, Zhang Yi
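The class of dynamics the paper studies, discrete-time recurrent networks with a linear threshold (rectified) activation, can be illustrated generically. The sketch below only iterates such a network to a fixed point; it does not reproduce the paper's CLM energy function or its stability conditions, and the example weights are arbitrary.

```python
import numpy as np

def lt_network_fixed_point(w, h, x0, steps=200):
    """Discrete-time linear threshold recurrent dynamics:
    x(t+1) = max(0, W x(t) + h).
    Stable attractors of networks of this form are the candidates
    for minima of a CLM-style energy function."""
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x = np.maximum(0.0, w @ x + h)
    return x
```

For a small contractive example (spectral radius of W below one), the iteration settles on a point satisfying the fixed-point equation x = max(0, Wx + h).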
Scalable Neural Networks for Board Games

Learning to solve small instances of a problem should help in solving large instances. Unfortunately, most neural network architectures do not exhibit this form of scalability. Our Multi-Dimensional Recurrent LSTM Networks, however, exhibit a high degree of scalability, as we demonstrate empirically in the domain of flexible-size board games. This allows them to be trained from scratch up to the level of human beginners, without using domain knowledge.

Tom Schaul, Jürgen Schmidhuber
Reservoir Size, Spectral Radius and Connectivity in Static Classification Problems

Reservoir computing is a recent paradigm that has proved quite effective given the classical difficulty of training recurrent neural networks. An approach using reservoir recurrent neural networks has recently been proposed for static problems, and in this paper we examine the influence of reservoir size, spectral radius and connectivity on the classification error in such problems. The main conclusion drawn from the experiments is that only the size of the reservoir is relevant; the spectral radius and the connectivity of the reservoir do not affect classification performance.

Luís A. Alexandre, Mark J. Embrechts
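A study of the kind the abstract describes can be set up by projecting static inputs through a reservoir and comparing a linear classifier across parameter settings. The harness below is a toy illustration under assumed parameter values, not the authors' experimental setup, and the dataset is synthetic.

```python
import numpy as np

def reservoir_features(X, n_res=100, spectral_radius=0.9,
                       connectivity=0.1, seed=0):
    """Project static inputs through a random recurrent reservoir,
    iterated a few steps per sample, and return the final states."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    w = rng.normal(size=(n_res, n_res)) * (rng.random((n_res, n_res)) < connectivity)
    eig = np.max(np.abs(np.linalg.eigvals(w)))
    if eig > 0:
        w *= spectral_radius / eig         # scale to the target spectral radius
    w_in = rng.uniform(-1.0, 1.0, (n_res, d))
    feats = np.zeros((X.shape[0], n_res))
    for i, x in enumerate(X):
        h = np.zeros(n_res)
        for _ in range(5):                 # iterate reservoir on the static input
            h = np.tanh(w @ h + w_in @ x)
        feats[i] = h
    return feats
```

Sweeping `spectral_radius` (and likewise `connectivity` or `n_res`) while keeping the linear readout fixed gives a direct way to probe which of the three parameters actually moves the classification error.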
Backmatter
Metadata
Title
Artificial Neural Networks – ICANN 2009
Edited by
Cesare Alippi
Marios Polycarpou
Christos Panayiotou
Georgios Ellinas
Copyright Year
2009
Publisher
Springer Berlin Heidelberg
Electronic ISBN
978-3-642-04274-4
Print ISBN
978-3-642-04273-7
DOI
https://doi.org/10.1007/978-3-642-04274-4