
2005 | Book

Artificial Neural Networks: Formal Models and Their Applications – ICANN 2005

15th International Conference, Warsaw, Poland, September 11-15, 2005. Proceedings, Part II

Edited by: Włodzisław Duch, Janusz Kacprzyk, Erkki Oja, Sławomir Zadrożny

Publisher: Springer Berlin Heidelberg

Book series: Lecture Notes in Computer Science


About this book

This volume is the second part of the two-volume proceedings of the International Conference on Artificial Neural Networks (ICANN 2005), held on September 11–15, 2005 in Warsaw, Poland, with several accompanying workshops held on September 15, 2005 at the Nicolaus Copernicus University, Toruń, Poland. The ICANN conference is an annual meeting organized by the European Neural Network Society in cooperation with the International Neural Network Society, the Japanese Neural Network Society, and the IEEE Computational Intelligence Society. It is the premier European event covering all topics concerned with neural networks and related areas. The ICANN series of conferences was initiated in 1991 and soon became the major European gathering for experts in those fields. In 2005 the ICANN conference was organized by the Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland, and the Nicolaus Copernicus University, Toruń, Poland. From over 600 papers submitted to the regular sessions and some 10 special conference sessions, the International Program Committee selected – after a thorough peer-review process – about 270 papers for publication. The large number of papers accepted is certainly a proof of the vitality and attractiveness of the field of artificial neural networks, but it also shows a strong interest in the ICANN conferences.

Table of contents

Frontmatter

New Neural Network Models

Neuro-fuzzy Kolmogorov’s Network

A new computationally efficient learning algorithm for a hybrid system, further called Neuro-Fuzzy Kolmogorov's Network (NFKN), is proposed. The NFKN is based on, and further develops, previously proposed neural and fuzzy systems using the famous superposition theorem of A.N. Kolmogorov (KST). The network consists of two layers of neo-fuzzy neurons (NFNs) and is linear in both the hidden and output layer parameters, so it can be trained with very fast and simple procedures. The validity of the theoretical results and the advantages of the NFKN in comparison with other techniques are confirmed by experiments.

Yevgeniy Bodyanskiy, Yevgen Gorshkov, Vitaliy Kolodyazhniy, Valeriya Poyedyntseva
A Neural Network Model for Inter-problem Adaptive Online Time Allocation

One aim of meta-learning techniques is to minimize the time needed for problem solving, and the effort of parameter hand-tuning, by automating algorithm selection. The predictive model of algorithm performance needed for this task often requires long training times. We address the problem in an online fashion, running multiple algorithms in parallel on a sequence of tasks, continually updating their relative priorities according to a neural model that maps their current state to the expected time to the solution. The model itself is updated at the end of each task, based on the actual performance of each algorithm. Censored sampling allows us to train the model effectively, without need of additional exploration after each task's solution. We present a preliminary experiment in which this new inter-problem technique learns to outperform a previously proposed intra-problem heuristic.

Matteo Gagliolo, Jürgen Schmidhuber
Discriminant Parallel Perceptrons

Parallel perceptrons (PPs), a novel approach to committee machine training requiring minimal communication between outputs and hidden units, allow the construction of efficient and stable nonlinear classifiers. In this work we explore how to improve their performance by allowing their output weights to take real values, computed by applying Fisher's linear discriminant analysis to the committee machine's perceptron outputs. We shall see that the final performance of the resulting classifiers is comparable to that of the more complex and costlier-to-train multilayer perceptrons.

Ana González, Iván Cantador, José R. Dorronsoro
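The Fisher step mentioned in the abstract can be sketched in isolation. The following is a minimal illustration of computing Fisher's linear discriminant direction for two classes; the function name and toy layout are ours, not the paper's:

```python
import numpy as np

def fisher_direction(X0, X1):
    """Fisher's linear discriminant direction w = S_w^{-1} (m1 - m0)
    for two classes given as (n_samples, n_features) arrays."""
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Within-class scatter: sum of the per-class scatter matrices
    S0 = (X0 - m0).T @ (X0 - m0)
    S1 = (X1 - m1).T @ (X1 - m1)
    return np.linalg.solve(S0 + S1, m1 - m0)
```

In the paper's setting, the two "classes" would be the committee machine's perceptron output vectors for each label, and the resulting direction supplies the real-valued output weights.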
A Way to Aggregate Multilayer Neural Networks

In this paper we consider a way of aggregating multilayer neural networks. For this we use the generalized nets methodology as well as index matrix operators. The generalized net methodology was developed as a counterpart of Petri nets for modelling discrete event systems. First, a short introduction to these tools is given. Next, three different kinds of neuron aggregation are considered. The application of the index matrix operators allows us to develop three different generalized net models. The methodology seems to be a very good tool for knowledge description.

Maciej Krawczak
Generalized Net Models of MLNN Learning Algorithms

In this paper we consider generalized net models of learning algorithms for multilayer neural networks. Using the standard backpropagation algorithm, we construct its generalized net model. The methodology seems to be a very good tool for knowledge description of learning algorithms. Next, it is shown that different learning algorithms have similar knowledge representation – that is, very similar generalized net models. The generalized net methodology was developed as a counterpart of Petri nets for modelling discrete event systems. A short introduction is given in the Appendix.

Maciej Krawczak
Monotonic Multi-layer Perceptron Networks as Universal Approximators

Multi-layer perceptron networks as universal approximators are well-known methods for system identification. For many applications a multi-dimensional mathematical model has to guarantee monotonicity with respect to one or more inputs. We introduce the MONMLP, which fulfils the requirements of monotonicity regarding one or more inputs by constraints on the signs of the weights of the multi-layer perceptron network. The monotonicity of the MONMLP does not depend on the quality of the training because it is guaranteed by its structure. Moreover, it is shown that in spite of its sign constraints the MONMLP is a universal approximator. As an example for model predictive control we present an application in the steel industry.

Bernhard Lang
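As a sketch of the structural idea (not the paper's exact formulation): if every weight is non-negative and the activation is increasing, the network output is non-decreasing in every input by construction, regardless of training quality. Function and parameter names below are illustrative:

```python
import numpy as np

def monotone_mlp(x, W1_raw, b1, w2_raw, b2):
    """One-hidden-layer MLP that is non-decreasing in every input.
    Non-negative weights are obtained by squaring unconstrained
    parameters; tanh is a monotonically increasing activation."""
    W1 = W1_raw ** 2          # non-negative input-to-hidden weights
    w2 = w2_raw ** 2          # non-negative hidden-to-output weights
    h = np.tanh(x @ W1 + b1)  # shape (n_samples, n_hidden)
    return h @ w2 + b2        # shape (n_samples,)
```

Because monotonicity is enforced by the parametrization, any gradient-based training of the raw parameters preserves it, which mirrors the MONMLP's structural guarantee.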
Short Term Memories and Forcing the Re-use of Knowledge for Generalization

Despite the well-known performances and the theoretical power of neural networks, learning and generalizing are sometimes very difficult. In this article, we investigate how short term memories and forcing the agent to re-use its knowledge on-line can enhance the generalization capabilities. For this purpose, a system is described in a temporal framework, where communication skills are increased, thus enabling the teacher to supervise the way the agent “thinks”.

Laurent Orseau
Interpolation Mechanism of Functional Networks

In this paper, the interpolation mechanism of functional networks is discussed. Three-layer functional networks with a single input unit and a single output unit, and four-layer functional networks with two input units and a single output unit, are designed. A learning algorithm for function approximation, based on minimizing a sum of squares with a unique minimum, is proposed; it can approximate a given one-variable or two-variable continuous function to a given precision. Finally, several examples show that the interpolation method is effective and practical.

Yong-Quan Zhou, Li-Cheng Jiao

Supervised Learning Algorithms

Neural Network Topology Optimization

The determination of the optimal architecture of a supervised neural network is an important and difficult task. Classical neural network topology optimization methods select weights or units to remove from the architecture while preserving the performance of the learning algorithm. However, no existing topology optimization method guarantees the optimal solution. In this work, we propose a hybrid approach which combines a variable selection method with a classical optimization method in order to improve the topology optimization solution. The proposed approach first identifies the relevant subset of variables which gives good classification performance, and then applies a classical topology optimization method to eliminate unnecessary hidden units or weights. A comparison of our approach to classical techniques for architecture optimization is given.

Mohammed Attik, Laurent Bougrain, Frédéric Alexandre
Rough Sets-Based Recursive Learning Algorithm for Radial Basis Function Networks

A recursive learning algorithm based on the rough sets approach to parameter estimation for radial basis function neural networks is proposed. The algorithm is intended for the pattern recognition and classification problems. It can also be applied to neuro control, identification, and emulation.

Yevgeniy Bodyanskiy, Yevgen Gorshkov, Vitaliy Kolodyazhniy, Irina Pliss
Support Vector Neural Training

An SVM learning strategy based on progressive reduction of the number of training vectors is used for MLP training. The threshold for acceptance of useful vectors for training is dynamically adjusted during learning, leading to a small number of support vectors near decision borders and higher accuracy of the final solutions. Two problems for which neural networks have previously failed to provide good results are presented to illustrate the usefulness of this approach.

Włodzisław Duch
Evolutionary Algorithms for Real-Time Artificial Neural Network Training

This paper reports on experiments investigating the use of Evolutionary Algorithms to train Artificial Neural Networks in real time. A simulated legged mobile robot was used as a test bed in the experiments. Since the algorithm is designed to be used with a physical robot, the population size was one and the recombination operator was not used. The algorithm is therefore rather similar to the original Evolutionary Strategies concept. The idea is that such an algorithm could eventually be used to alter the locomotive performance of the robot on different terrain types. Results are presented showing the effect of various algorithm parameters on system performance.

Ananda Jagadeesan, Grant Maxwell, Christopher MacLeod
Developing Measurement Selection Strategy for Neural Network Models

The paper deals with an application of the theory of optimum experimental design to the problem of selecting the data set for developing neural models. Another objective is to show that a neural network trained with samples obtained according to a D-optimum design exhibits lower parameter uncertainty, which yields a more reliable tool for modelling purposes.

Przemysław Prętki, Marcin Witczak
Nonlinear Regression with Piecewise Affine Models Based on RBFN

In this paper, a modeling method for high dimensional piecewise affine models is proposed. Because the model interpolates the outputs at the orthogonal grid points in the input space, the shape of the piecewise affine model is easily understood. The interpolation is realized by an RBFN whose function is defined with max-min functions. By increasing the number of RBFs, the capability to express nonlinearity can be improved. An algorithm to determine the number and locations of the RBFs is also proposed.

Masaru Sakamoto, Dong Duo, Yoshihiro Hashimoto, Toshiaki Itoh
Batch-Sequential Algorithm for Neural Networks Trained with Entropic Criteria

The use of entropy as a cost function in the neural network learning phase usually implies that, in the back-propagation algorithm, the training is done in batch mode. Apart from the higher complexity of the algorithm in batch mode, we know that this approach has some limitations compared with the sequential mode. In this paper we present a way of combining both modes when using entropic criteria. We present some experiments that validate the proposed method, and we also show some comparisons of this method with the pure batch mode algorithm.

Jorge M. Santos, Joaquim Marques de Sá, Luís A. Alexandre
Multiresponse Sparse Regression with Application to Multidimensional Scaling

Sparse regression is the problem of selecting a parsimonious subset of all available regressors for an efficient prediction of a target variable. We consider a general setting in which both the target and regressors may be multivariate. The regressors are selected by a forward selection procedure that extends the Least Angle Regression algorithm. Instead of the common practice of estimating each target variable individually, our proposed method chooses sequentially those regressors that allow, on average, the best predictions of all the target variables. We illustrate the procedure by an experiment with artificial data. The method is also applied to the task of selecting relevant pixels from images in multidimensional scaling of handwritten digits.

Timo Similä, Jarkko Tikka
Training Neural Networks Using Taguchi Methods: Overcoming Interaction Problems

Taguchi Methods (and other orthogonal arrays) may be used to train small Artificial Neural Networks very quickly in a variety of tasks. These include, importantly, Control Systems. Previous experimental work has shown that they could be successfully used to train single layer networks with no difficulty. However, interaction between layers precluded the successful reliable training of multi-layered networks. This paper describes a number of successful strategies which may be used to overcome this problem and demonstrates the ability of such networks to learn non-linear mappings.

Alagappan Viswanathan, Christopher MacLeod, Grant Maxwell, Sashank Kalidindi
A Global-Local Artificial Neural Network with Application to Wave Overtopping Prediction

We present a hybrid Radial Basis Function (RBF) – sigmoid neural network with a three-step training algorithm that utilises both global search and gradient descent training. We test the effectiveness of our method using four synthetic datasets and demonstrate its use in wave overtopping prediction. It is shown that the hybrid architecture is often superior to architectures containing neurons of a single type in several ways: lower errors are often achievable using fewer hidden neurons and with less need for regularisation. Our Global-Local Artificial Neural Network (GL-ANN) is also seen to compare favourably with both Perceptron Radial Basis Net (PRBFN) and Regression Tree RBFs.

David Wedge, David Ingram, David McLean, Clive Mingham, Zuhair Bandar

Ensemble-Based Learning

Learning with Ensemble of Linear Perceptrons

In this paper we introduce a model of an ensemble of linear perceptrons. The objective of the ensemble is to automatically divide the feature space into several regions, assign one ensemble member to each region, and train that member to develop an expertise within the region. Utilizing the proposed ensemble model, the learning difficulty of each member can be reduced, thus achieving faster learning while guaranteeing the overall performance.

Pitoyo Hartono, Shuji Hashimoto
Combination Methods for Ensembles of RBFs

Building an ensemble of classifiers is a useful way to improve performance. In the case of neural networks, the literature has centered on the use of the Multilayer Feedforward (MF) network. However, there are other interesting networks, like Radial Basis Function (RBF) networks, that can be used as elements of the ensemble. In a previous paper we presented results of different methods to build an ensemble of RBF networks. The results showed that the best method is in general the Simple Ensemble. The combination method used in that research was averaging. In this paper we present results of fourteen different combination methods for a simple ensemble of RBF networks. The best methods are Borda Count, Weighted Average and Majority Voting.

Carlos Hernández-Espinosa, Joaquín Torres-Sospedra, Mercedes Fernández-Redondo
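Two of the combination methods compared here, averaging and majority voting, can be sketched as follows. The array layout (members × samples × classes) is our assumption for illustration:

```python
import numpy as np

def average_combine(probs):
    """Average combination: mean the class-probability outputs of all
    members, then pick the highest-scoring class per sample.
    probs has shape (n_members, n_samples, n_classes)."""
    return probs.mean(axis=0).argmax(axis=1)

def majority_vote(probs):
    """Majority voting: each member votes for its top class; the class
    with the most votes wins (ties broken by lowest class index)."""
    votes = probs.argmax(axis=2)                  # (n_members, n_samples)
    n_classes = probs.shape[2]
    counts = np.stack([np.bincount(v, minlength=n_classes)
                       for v in votes.T])         # (n_samples, n_classes)
    return counts.argmax(axis=1)
```

Methods such as Borda Count and Weighted Average follow the same pattern but aggregate full rankings or learned per-member weights instead of raw votes.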
Ensemble Techniques for Credibility Estimation of GAME Models

When a real world system is described either by means of a mathematical model or by any soft computing method, the most important questions are whether the model is of good quality and for which configurations of input features the model is credible. Traditional methods restrict the credibility of a model to areas where training data are present. These approaches are ineffective when non-relevant or redundant input features are present in the modeled system, and for non-uniformly distributed data. Even for simple models, it is often hard to find out how credible the output is for a given input vector. We propose a novel approach based on ensemble techniques that allows us to estimate the credibility of models. We experimentally derived an equation to estimate the credibility of models generated by the Group of Adaptive Models Evolution (GAME) method for any configuration of input features.

Pavel Kordík, Miroslav Šnorek
Combination Methods for Ensembles of MF

As shown in the literature, training an ensemble of networks is an interesting way to improve performance. The two key factors in designing an ensemble are how to train the individual networks and how to combine their outputs. In this paper, we focus on the combination methods. We study the performance of fourteen different combination methods for ensembles of the type “simple ensemble” (SE) and “decorrelated” (DECO). In the case of the SE with a low number of networks in the ensemble, the Zimmermann method gets the best performance. When the number of networks is in the range of 9 to 20, the weighted average is the best alternative. Finally, in the case of the DECO ensemble the best performing method is averaging.

Joaquín Torres-Sospedra, Mercedes Fernández-Redondo, Carlos Hernández-Espinosa
New Results on Ensembles of Multilayer Feedforward

As shown in the literature, training an ensemble of networks is an interesting way to improve performance. However, there are several methods to construct the ensemble. In this paper we present some new results from a comparison of twenty different methods. We have trained ensembles of 3, 9, 20 and 40 networks to show results over a wide spectrum of values. The results show that the improvement in performance above 9 networks in the ensemble depends on the method but is usually low. Also, the best method for an ensemble of 3 networks is called “Decorrelated” and uses a penalty term in the usual backpropagation cost function to decorrelate the network outputs in the ensemble. For the case of 9 and 20 networks the best method is conservative boosting, and finally for 40 networks the best method is Cels.

Joaquín Torres-Sospedra, Carlos Hernández-Espinosa, Mercedes Fernández-Redondo

Unsupervised Learning

On Variations of Power Iteration

The power iteration is a classical method for computing the eigenvector associated with the largest eigenvalue of a matrix. The subspace iteration is an extension of the power iteration in which the subspace spanned by the n largest eigenvectors of a matrix is determined. The natural power iteration is an exemplary instance of the subspace iteration, providing a general framework for many principal subspace algorithms. In this paper we present variations of the natural power iteration in which the n largest eigenvectors of a symmetric matrix are determined without rotation ambiguity, whereas the subspace iteration or the natural power iteration finds an invariant subspace (consisting of rotated eigenvectors). The resulting method is referred to as the constrained natural power iteration, and its fixed point analysis is given. Numerical experiments confirm the validity of our algorithm.

Seungjin Choi
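For reference, the classical power iteration that this paper generalizes can be sketched as:

```python
import numpy as np

def power_iteration(A, num_iters=500):
    """Classical power iteration: repeatedly multiply and renormalize
    to obtain the dominant eigenvector of a square matrix A.
    (A random start vector is typical; a fixed one is used here for
    reproducibility and must not be orthogonal to the eigenvector.)"""
    v = np.ones(A.shape[0]) / np.sqrt(A.shape[0])
    for _ in range(num_iters):
        w = A @ v
        v = w / np.linalg.norm(w)
    eigenvalue = v @ A @ v  # Rayleigh quotient at convergence
    return eigenvalue, v
```

The subspace iteration replaces the vector v by an n-column orthonormal matrix and the normalization by a QR factorization, which is where the rotation ambiguity discussed in the abstract arises.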
Linear Dimension Reduction Based on the Fourth-Order Cumulant Tensor

In high dimensional data analysis, finding non-Gaussian components is an important preprocessing step for efficient information processing. By modifying the contrast function of the JADE algorithm for Independent Component Analysis, we propose a new linear dimension reduction method to identify the non-Gaussian subspace based on the fourth-order cumulant tensor. A numerical study demonstrates the validity of our method and its usefulness for extracting sub-Gaussian structures.

M. Kawanabe
On Spectral Basis Selection for Single Channel Polyphonic Music Separation

In this paper we present a method of separating musical instrument sound sources from their monaural mixture. We take the harmonic structure of music into account and use sparseness and the overlapping NMF to select representative spectral basis vectors, which are used to reconstruct the unmixed sound. A method of spectral basis selection is illustrated, and experimental results with monaural instantaneous mixtures of voice/cello and saxophone/viola are shown to confirm the validity of our proposed method.

Minje Kim, Seungjin Choi
Independent Subspace Analysis Using k-Nearest Neighborhood Distances

A novel algorithm called independent subspace analysis (ISA) is introduced to estimate independent subspaces. The algorithm solves the ISA problem by estimating multi-dimensional differential entropies. Two variants are examined; both of them utilize distances between the k-nearest neighbors of the sample points. Numerical simulations demonstrate the usefulness of the algorithms.

Barnabás Póczos, András Lőrincz
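The core quantity both variants rely on, the distance from each sample to its k-th nearest neighbour, can be computed as in this brute-force sketch (the function name is ours):

```python
import numpy as np

def kth_neighbor_distances(X, k):
    """Distance from each row of X (n_samples, n_features) to its k-th
    nearest neighbour, excluding the point itself."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    D.sort(axis=1)      # column 0 is now the zero self-distance
    return D[:, k]
```

Multi-dimensional differential entropies are then estimated from these distances, e.g. with Kozachenko-Leonenko-style estimators, which is the ingredient the ISA objective needs.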

Recurrent Neural Networks

Study of the Behavior of a New Boosting Algorithm for Recurrent Neural Networks

We present an algorithm for improving the accuracy of recurrent neural networks (RNNs) for time series forecasting. The improvement is achieved by combining a large number of RNNs, each of which is generated by training on a different set of examples. This algorithm is based on the boosting algorithm and allows the training to concentrate on difficult examples but, unlike the original algorithm, takes into account all the available examples. We study the behavior of our method applied to three reference time series, with three loss functions and with different values of a parameter. We compare the performances obtained with other regression methods.

Mohammad Assaad, Romuald Boné, Hubert Cardot
Time Delay Learning by Gradient Descent in Recurrent Neural Networks

Recurrent Neural Networks (RNNs) possess an implicit internal memory and are well adapted for time series forecasting. Unfortunately, the gradient descent algorithms commonly used for their training have two main weaknesses: slowness and difficulty in dealing with long-term dependencies in time series. Adding well chosen connections with time delays to the RNNs often reduces learning times and allows gradient descent algorithms to find better solutions. In this article, we demonstrate that the principle of time delay learning by gradient descent, although efficient for feed-forward neural networks and theoretically adaptable to RNNs, proves difficult to use in the latter case.

Romuald Boné, Hubert Cardot
Representation and Identification Method of Finite State Automata by Recurrent High-Order Neural Networks

This paper presents a new architecture of neural networks for representing deterministic finite state automata (FSA). The proposed model is a class of high-order recurrent neural networks. It is capable of representing FSA with a network size smaller than those of the existing models proposed so far. We also propose an identification method for FSA from a given set of input and output data by training the proposed model of neural networks.

Yasuaki Kuroe
Global Stability Conditions of Locally Recurrent Neural Networks

The paper deals with a discrete-time recurrent neural network designed with dynamic neural models. Dynamics is reproduced within each single neuron; hence the considered network is a locally recurrent, globally feed-forward one. In the paper, conditions for the global stability of the considered neural network are derived using pole placement and Lyapunov's second method.

Krzysztof Patan, Józef Korbicz, Przemysław Prętki

Reinforcement Learning

An Agent-Based PLA for the Cascade Correlation Learning Architecture

The paper proposes an implementation of the agent-based population learning algorithm (PLA) within the cascade correlation (CC) learning architecture. The first step of the CC procedure uses a standard learning algorithm. It is suggested that using the agent-based PLA as such an algorithm could improve the efficiency of the approach. The paper gives a short overview of both the CC algorithm and the PLA, and then explains the main features of the proposed agent-based PLA implementation. The approach is evaluated experimentally.

Ireneusz Czarnowski, Piotr Jędrzejowicz
Dual Memory Model for Using Pre-existing Knowledge in Reinforcement Learning Tasks

Reinforcement learning agents explore their environment in order to collect reward that allows them to learn what actions are good or bad in what situations. The exploration is performed using a policy that has to keep a balance between getting more information about the environment and exploiting what is already known about it. This paper presents a method for guiding exploration by pre-existing knowledge expressed as heuristic rules. A dual memory model is used where the value function is stored in long-term memory while the heuristic rules for guiding exploration act on the weights in a short-term memory. Experimental results from a grid task illustrate that exploration is significantly improved when appropriate heuristic rules are available.

Kary Främling
Stochastic Processes for Return Maximization in Reinforcement Learning

In the framework of reinforcement learning, an agent learns an optimal policy via return maximization, not via the instructed choices by a supervisor. The framework is in general formulated as an ergodic Markov decision process and is designed by tuning some parameters of the action-selection strategy so that the learning process eventually becomes almost stationary. In this paper, we examine a theoretical class of more general processes such that the agent can achieve return maximization by considering the asymptotic equipartition property of such processes. As a result, we show several necessary conditions that the agent and the environment have to satisfy for possible return maximization.

Kazunori Iwata, Hideaki Sakai, Kazushi Ikeda
Maximizing the Ratio of Information to Its Cost in Information Theoretic Competitive Learning

In this paper, we introduce costs into the framework of information maximization and try to maximize the ratio of information to its associated cost. We have shown that competitive learning is realized by maximizing mutual information between input patterns and competitive units. One shortcoming of the method is that maximizing information does not necessarily produce representations faithful to input patterns. Information maximization primarily focuses on those parts of input patterns used to distinguish between patterns. Thus, we introduce a cost that represents the distance between input patterns and connection weights. By maximizing the ratio of information to this cost, the final connection weights reflect input patterns well. We applied unsupervised information maximization to a voting attitude problem and supervised learning to a chemical data analysis. Experimental results confirmed that by maximizing the ratio, the cost is decreased with better generalization performance.

Ryotaro Kamimura, Sachiko Aida-Hyugaji
Completely Self-referential Optimal Reinforcement Learners

We present the first class of mathematically rigorous, general, fully self-referential, self-improving, optimal reinforcement learning systems. Such a system rewrites any part of its own code as soon as it has found a proof that the rewrite is useful, where the problem-dependent utility function and the hardware and the entire initial code are described by axioms encoded in an initial proof searcher which is also part of the initial code. The searcher systematically and efficiently tests computable proof techniques (programs whose outputs are proofs) until it finds a provably useful, computable self-rewrite. We show that such a self-rewrite is globally optimal (no local maxima!), since the code first had to prove that it is not useful to continue the proof search for alternative self-rewrites. Unlike previous non-self-referential methods based on hardwired proof searchers, ours not only boasts an optimal order of complexity but can optimally reduce any slowdowns hidden by the O()-notation, provided the utility of such speed-ups is provable at all.

Jürgen Schmidhuber
Model Selection Under Covariate Shift

A common assumption in supervised learning is that the training and test input points follow the same probability distribution. However, this assumption is not fulfilled, e.g., in interpolation, extrapolation, or active learning scenarios. The violation of this assumption, known as covariate shift, causes a heavy bias in standard generalization error estimation schemes such as cross-validation, and thus they result in poor model selection. In this paper, we therefore propose an alternative estimator of the generalization error. Under covariate shift, the proposed generalization error estimator is unbiased if the learning target function is included in the model at hand, and it is asymptotically unbiased in general. Experimental results show that model selection with the proposed generalization error estimator compares favorably to cross-validation in extrapolation.

Masashi Sugiyama, Klaus-Robert Müller
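A standard ingredient in this line of work is importance weighting: training losses are reweighted by the density ratio p_test(x)/p_train(x) so that the error estimate reflects the test distribution. The sketch below illustrates that idea only; it is not the paper's specific estimator, and the function names and densities are illustrative:

```python
import numpy as np

def importance_weighted_mse(y_true, y_pred, x, p_train, p_test):
    """Mean squared error on training points, reweighted by the
    density ratio p_test(x)/p_train(x) to correct for covariate
    shift. p_train and p_test are callables returning densities."""
    w = p_test(x) / p_train(x)
    return np.mean(w * (y_true - y_pred) ** 2)
```

When the training and test densities coincide, all weights equal 1 and the estimate reduces to the ordinary mean squared error.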

Bayesian Approaches to Learning

Smooth Performance Landscapes of the Variational Bayesian Approach

We consider the practical advantage of the Bayesian approach over maximum a posteriori methods in its ability to smoothen the landscape of generalization performance measures in the space of hyperparameters, which is vitally important for determining the optimal hyperparameters. The variational method is used to approximate the intractable distribution. Using the leave-one-out error of support vector regression as an example, we demonstrate a further advantage of this method in the analytical estimation of the leave-one-out error, without doing the cross-validation. Comparing our theory with the simulations on both artificial (the “sinc” function) and benchmark (the Boston Housing) data sets, we get a good agreement.

Zhuo Gao, K. Y. Michael Wong
Jacobi Alternative to Bayesian Evidence Maximization in Diffusion Filtering

Nonlinear diffusion filtering presents a way to define and iterate Gaussian process regression so that large variance noise can be efficiently filtered from observations of size n in m iterations by performing approximately O(mn) multiplications, while at the same time preserving the edges of the signal. Experimental evidence indicates that an optimal stopping time exists and that the steady state solutions obtained by setting m to an arbitrarily large number are suboptimal. This work discusses the Bayesian evidence criterion, gives an interpretation to its basic components and proposes an alternative, simple optimal stopping method. A synthetic large-scale example indicates the usefulness of the proposed stopping criterion.

Ramūnas Girdziušas, Jorma Laaksonen
Bayesian Learning of Neural Networks Adapted to Changes of Prior Probabilities

We treat Bayesian neural networks adapted to changes in the ratio of prior probabilities of the categories. If an ordinary Bayesian neural network is equipped with m−1 additional input units, it can learn simultaneously m distinct discriminant functions which correspond to the m different ratios of the prior probabilities.

Yoshifusa Ito, Cidambi Srinivasan, Hiroyuki Izumi
A New Method of Learning Bayesian Networks Structures from Incomplete Data

This paper describes a new data mining algorithm for learning Bayesian network structures from incomplete data, based on an extended Evolutionary Programming (EP) method and the Minimum Description Length (MDL) principle. The problem is characterized by a huge solution space with a highly multimodal landscape. The algorithm uses a fitness function based on expectation, which converts incomplete data to complete data using the current best structure of the evolutionary process, and adopts a strategy to alleviate the resulting fitness oscillation. To prevent and overcome premature convergence, the algorithm combines a niche technique with the selection mechanism of EP. In addition, our algorithm, like some previous work, does not need a complete variable ordering as input. The experimental results illustrate that our algorithm can learn a good structure from incomplete data.

Xiaolin Li, Xiangdong He, Senmiao Yuan
Bayesian Hierarchical Ordinal Regression

We present a Bayesian approach to ordinal regression. Our model is based on a hierarchical mixture of experts model and performs a soft partitioning of the input space into different ranks, such that the order of the ranks is preserved. Experimental results on benchmark data sets show a comparable performance to support vector machine and Gaussian process methods.

Ulrich Paquet, Sean Holden, Andrew Naish-Guzman
Traffic Flow Forecasting Using a Spatio-temporal Bayesian Network Predictor

A novel predictor for traffic flow forecasting, namely a spatio-temporal Bayesian network predictor, is proposed. Unlike existing methods, our approach incorporates all the spatial and temporal information available in a transportation network to carry out traffic flow forecasting for the current site. The Pearson correlation coefficient is adopted to rank the input variables (traffic flows) for prediction, and a best-first strategy is employed to select a subset as the cause nodes of a Bayesian network. Given the derived cause nodes and the corresponding effect node in the spatio-temporal Bayesian network, a Gaussian Mixture Model is applied to describe the statistical relationship between the input and output. Finally, traffic flow forecasting is performed under the criterion of Minimum Mean Square Error (M.M.S.E.). Experimental results with urban vehicular flow data from Beijing demonstrate the effectiveness of the presented spatio-temporal Bayesian network predictor.
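The correlation-ranking step described in this abstract can be sketched in a few lines; this is our own minimal illustration (names and toy data are ours, not the authors'), assuming candidate input series are ranked by the absolute value of their Pearson coefficient with the target flow:

```python
import numpy as np

def rank_inputs_by_pearson(candidates, target):
    """Rank candidate input series by |Pearson correlation| with the target.

    candidates: array of shape (n_series, n_samples)
    target:     array of shape (n_samples,)
    Returns candidate indices, most correlated first.
    """
    scores = [abs(np.corrcoef(c, target)[0, 1]) for c in candidates]
    return np.argsort(scores)[::-1]

# Toy example: series 0 is a noisy copy of the target, series 1 is noise.
rng = np.random.default_rng(0)
target = rng.normal(size=200)
candidates = np.vstack([target + 0.1 * rng.normal(size=200),
                        rng.normal(size=200)])
order = rank_inputs_by_pearson(candidates, target)
print(order[0])  # → 0
```

A best-first search would then grow the cause-node subset in this ranked order, keeping only additions that improve the prediction criterion.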

Shiliang Sun, Changshui Zhang, Yi Zhang

Learning Theory

Manifold Constrained Variational Mixtures

In many data mining applications, the data manifold is of lower dimension than the input space. In this paper, we propose to take advantage of this additional information in the framework of variational mixtures. The responsibilities computed in the VBE step are constrained according to a discrepancy measure between the Euclidean and the geodesic distance. The methodology is applied to variational Gaussian mixtures as a particular case and outperforms the standard approach, as well as Parzen windows, on both artificial and real data.

Cédric Archambeau, Michel Verleysen
Handwritten Digit Recognition with Nonlinear Fisher Discriminant Analysis

To generalize the Fisher Discriminant Analysis (FDA) algorithm to the case of discriminant functions belonging to a nonlinear, finite-dimensional function space $\mathcal{F}$ (Nonlinear FDA or NFDA), it is sufficient to expand the input data by computing the output of a basis of $\mathcal{F}$ when applied to it [1,2,3,4]. The solution to NFDA can then be found as in the linear case, by solving a generalized eigenvalue problem on the between- and within-class covariance matrices (see e.g. [5]). The goal of NFDA is to find linear projections of the expanded data (i.e., nonlinear transformations of the original data) that minimize the variance within a class and maximize the variance between different classes. Such a representation is of course ideal for classification. The application of NFDA to pattern recognition is particularly appealing because, for a given input signal and a fixed function space, it has no parameters and is easy to implement and apply. Moreover, given C classes, only C–1 projections are relevant [5]. As a consequence, the feature space is very small and the algorithm has low memory requirements and high speed during recognition.
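The generalized eigenvalue formulation mentioned above can be illustrated as follows; this is a minimal sketch under our own assumptions (a hand-rolled quadratic expansion and toy data, not the paper's digit-recognition setup):

```python
import numpy as np
from scipy.linalg import eigh

def fda_directions(X, y, n_components):
    """Linear FDA on (already expanded) data X: solve S_b w = lambda S_w w."""
    classes = np.unique(y)
    mean = X.mean(axis=0)
    Sw = np.zeros((X.shape[1], X.shape[1]))
    Sb = np.zeros_like(Sw)
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)           # within-class scatter
        d = (mc - mean)[:, None]
        Sb += len(Xc) * (d @ d.T)               # between-class scatter
    # Largest generalized eigenvalues give the most discriminative projections;
    # a small ridge keeps S_w well conditioned.
    w, V = eigh(Sb, Sw + 1e-8 * np.eye(Sw.shape[0]))
    return V[:, ::-1][:, :n_components]

# Quadratic expansion x -> (x1, x2, x1^2, x2^2, x1*x2) makes a radially
# separated two-class problem linearly separable in the expanded space.
rng = np.random.default_rng(1)
inner = 0.5 * rng.normal(size=(100, 2))
outer = rng.normal(size=(100, 2))
outer = 3.0 * outer / np.linalg.norm(outer, axis=1, keepdims=True)
X = np.vstack([inner, outer])
y = np.array([0] * 100 + [1] * 100)
E = np.column_stack([X, X**2, X[:, 0] * X[:, 1]])  # output of a basis of F
proj = E @ fda_directions(E, y, 1)
```

For C = 2 classes only C–1 = 1 projection is relevant, which is why a single component suffices here.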

Pietro Berkes
Separable Data Aggregation in Hierarchical Networks of Formal Neurons

In this paper we consider principles of data aggregation in hierarchical networks of formal neurons which allow one to preserve the separability of the categories. The postulate of category separation in the layers of formal neurons is examined by means of the concept of clear and mixed dipoles. The dependence of category separation on feature selection is analysed.

Leon Bobrowski
Induced Weights Artificial Neural Network

It is widely believed in the pattern recognition field that the number of examples needed to achieve an acceptable level of generalization ability depends on the number of independent parameters needed to specify the network configuration. The paper presents a neural network for classification of high-dimensional patterns. The network architecture proposed here uses a layer which extracts the global features of patterns. The layer contains neurons whose weights are induced by a neural subnetwork. The method reduces the number of independent parameters describing the layer to the parameters describing the inducing subnetwork.

Slawomir Golak
SoftDoubleMinOver: A Simple Procedure for Maximum Margin Classification

The well-known MinOver algorithm is a simple modification of the perceptron algorithm and provides the maximum margin classifier without a bias in linearly separable two-class classification problems. DoubleMinOver, a slight modification of MinOver that includes a bias, is introduced. It is shown how this simple and iterative procedure can be extended to SoftDoubleMinOver for classification with soft margins and with kernels. On benchmarks the extremely simple SoftDoubleMinOver algorithm achieves the same classification performance with the same computational effort as sophisticated Support-Vector-Machine software.
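The MinOver idea — repeatedly reinforcing the training pattern with the smallest margin — can be sketched as below. This is an illustrative reconstruction from the description above, not the authors' SoftDoubleMinOver code (no bias, no soft margin, no kernels):

```python
import numpy as np

def minover(X, y, t_max=1000):
    """MinOver: repeatedly reinforce the pattern with the minimum overlap.

    X: (n, d) patterns, y: labels in {-1, +1}. Returns a unit weight vector
    approximating the maximum-margin separator through the origin, assuming
    the classes are linearly separable without a bias.
    """
    w = np.zeros(X.shape[1])
    for _ in range(t_max):
        margins = y * (X @ w)
        i = np.argmin(margins)      # current worst (minimum-margin) pattern
        w = w + y[i] * X[i]         # perceptron-style update on that pattern
    return w / np.linalg.norm(w)

# Separable toy problem: the sign of the first coordinate decides the class,
# with a clear margin gap along that axis.
rng = np.random.default_rng(2)
X = rng.normal(size=(80, 2))
X[:, 0] += np.where(X[:, 0] > 0, 2.0, -2.0)
y = np.sign(X[:, 0])
w = minover(X, y)
print(np.all(np.sign(X @ w) == y))  # → True
```

Unlike the plain perceptron, which stops at any separating hyperplane, continuing these minimum-margin updates drives w toward the maximum-margin direction.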

Thomas Martinetz, Kai Labusch, Daniel Schneegaß
On the Explicit Use of Example Weights in the Construction of Classifiers

We present a novel approach to two-class classification, in which a classifier is parameterised in terms of a distribution over examples. The optimal distribution is determined by the solution of a linear program; it is found experimentally to be highly sparse, and to yield a classifier resistant to noise whose error rates are competitive with the best existing methods.

Andrew Naish-Guzman, Sean Holden, Ulrich Paquet
A First Approach to Solve Classification Problems Based on Functional Networks

In this paper the ability of the functional networks approach to solve classification problems is explored. Functional networks were introduced by Castillo et al. [1] as an alternative to neural networks. They have the same purpose but, unlike neural networks, neural functions are learned instead of weights, using families of linearly independent functions. This is illustrated by applying several models of functional networks to a set of simulated data and to the well-known Iris and Pima Indian data sets.

Rosa Eva Pruneda, Beatriz Lacruz, Cristina Solares
A Description of a Simulation Environment and Neural Architecture for A-Life

This paper describes a project in progress, a modular environment for a-life experiments. The main purpose of the project is to design a neural architecture that would allow artificial creatures (biots) to learn to perform certain simple tasks within the environment, having to deal with only the information they can gather during exploration and semi-random trials. That means that the biots are given no explicit information about their position, distance from surrounding objects or even any measure of progress in a task. Information that a task has been started and either accomplished or failed is to be the only reinforcement passed to the learning process.

Leszek Rybicki
Neural Network Classifiers in Arrears Management

The literature suggests that an ensemble of classifiers outperforms a single classifier across a range of classification problems. This paper investigates the application of an ensemble of neural network classifiers to the prediction of potential defaults for a set of personal loan accounts drawn from a medium-sized Australian financial institution. The imbalanced nature of the data sets necessitates strategies to avoid under-learning of the minority class, and two such approaches (minority over-sampling and majority under-sampling) were adopted here. The ensemble outperformed the single networks irrespective of which strategy was used. The results also compared more than favourably with those reported in the literature for a similar application area.

Esther Scheurmann, Chris Matthews
Sequential Classification of Probabilistic Independent Feature Vectors Based on Multilayer Perceptron

The paper presents methods of classification based on a sequence of feature vectors extracted from the signal generated by the object. The feature vectors are assumed to be probabilistically independent. Each feature vector is separately classified by a multilayer perceptron, giving a set of local classification decisions. This set of statistically independent decisions is the base for a global classification rule. The rule is derived from statistical decision theory: according to it, an object belongs to the class for which the product of the corresponding neural network outputs is the largest. The neural outputs are modified to prevent them from vanishing to zero. The performance of the proposed rule was tested in an automatic, text-independent speaker identification task, and the achieved results are presented.
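The global product rule can be sketched directly; a minimal illustration with invented numbers, where the flooring step stands in for the paper's unspecified modification that keeps outputs from vanishing:

```python
import numpy as np

def classify_sequence(posteriors, floor=1e-6):
    """Combine per-frame class posteriors by the product rule.

    posteriors: (n_frames, n_classes), each row the classifier's output for
    one feature vector. Flooring the outputs and summing logs keeps the
    product from collapsing to zero over long sequences.
    """
    logp = np.log(np.clip(posteriors, floor, 1.0))
    return int(np.argmax(logp.sum(axis=0)))

# Three frames, two classes: class 1 wins on most frames.
P = np.array([[0.6, 0.4],
              [0.2, 0.8],
              [0.3, 0.7]])
print(classify_sequence(P))  # → 1
```

Under the independence assumption, the product of the per-frame posteriors is (up to normalization) the posterior of the whole sequence, so this argmax is the Bayes decision.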

Tomasz Walkowiak
Multi-class Pattern Classification Based on a Probabilistic Model of Combining Binary Classifiers

We propose a novel probabilistic model for constructing a multi-class pattern classifier by weighted aggregation of general binary classifiers, including one-versus-the-rest, one-versus-one, and others. Our model has a latent variable that represents class membership probabilities, and it is estimated by fitting it to the probability estimates output by the binary classifiers. We apply our method to classification problems on synthetic datasets and a real-world dataset of gene expression profiles. We show that our method achieves performance comparable to conventional voting heuristics.

Naoto Yukinawa, Shigeyuki Oba, Kikuya Kato, Shin Ishii
Evaluating Performance of Random Subspace Classifier on ELENA Classification Database

This work describes the model of the random subspace classifier and provides benchmarking results on the ELENA database. The classifier uses a coarse coding technique to transform the input real vector into a binary vector of high dimensionality; thus, class representatives are likely to become linearly separable. Taking into account the training time, recognition time and error rate, the RSC network in many cases surpasses well-known classification algorithms.
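The coarse-coding step can be illustrated as follows; the thresholds and dimensions are our own invention, the point being only that nearby real vectors map to binary codes sharing many bits, which is what makes classes more likely to become linearly separable in the high-dimensional space:

```python
import numpy as np

def coarse_code(x, thresholds):
    """Coarse-code a real vector into a high-dimensional binary vector.

    thresholds: (n_features, n_thresholds_per_feature) random cut points;
    each real component is replaced by the binary pattern of its threshold
    comparisons, so nearby values share many active bits.
    """
    return (x[:, None] > thresholds).astype(np.uint8).ravel()

rng = np.random.default_rng(3)
thresholds = rng.uniform(-1.0, 1.0, size=(2, 16))
a = coarse_code(np.array([0.10, -0.50]), thresholds)   # 32-bit code
b = coarse_code(np.array([0.12, -0.48]), thresholds)   # near a
c = coarse_code(np.array([-0.90, 0.90]), thresholds)   # far from a
# Nearby inputs disagree on far fewer bits than distant ones.
print((a != b).sum() < (a != c).sum())  # → True
```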

Dmitry Zhora

Artificial Neural Networks for System Modeling, Decision Making, Optimalization and Control

A New RBF Neural Network Based Non-linear Self-tuning Pole-Zero Placement Controller

In this paper a new self-tuning controller algorithm for non-linear dynamical systems is derived using the Radial Basis Function neural network (RBF). In the proposed controller, the unknown non-linear plant is represented by an equivalent model consisting of a linear time-varying sub-model plus a non-linear sub-model. The parameters of the linear sub-model are identified by a recursive least squares algorithm with a directional forgetting factor, whereas the unknown non-linear sub-model is modelled using the RBF network, resulting in a new non-linear controller with a generalised minimum variance performance index. In addition, the proposed controller overcomes the shortcomings of other linear designs and provides an adaptive mechanism which ensures that both the closed-loop poles and zeros are placed at their pre-specified positions. Example simulation results using a non-linear plant model demonstrate the effectiveness of the proposed controller.

Rudwan Abdullah, Amir Hussain, Ali Zayed
Using the Levenberg-Marquardt for On-line Training of a Variant System

This paper presents an application of the Levenberg-Marquardt algorithm to on-line modelling of a variant system. Because there is no iterative version of the Levenberg-Marquardt algorithm, a batch version is used with a double sliding window and early stopping to produce models of a system whose poles change during operation. The models are used in an Internal Model Controller to control the system, which is kept functioning in the initial phase by a PI controller.

Fernando Morgado Dias, Ana Antunes, José Vieira, Alexandre Manuel Mota
Optimal Control Yields Power Law Behavior

Power law tails can be observed in the statistics of human motor control such as the balancing of a stick at the fingertip. We derive a simple control algorithm that employs optimal parameter estimation based on past observations. The resulting control system self-organizes into a critical regime, whereby the exponents of power law tails do not depend on system parameters. The occurrence of power laws is robust with respect to the introduction of delays and a variation in the length of the memory trace. Our results suggest that multiplicative noise causing scaling behavior may result from optimal control.

Christian W. Eurich, Klaus Pawelzik
A NeuroFuzzy Controller for 3D Virtual Centered Navigation in Medical Images of Tubular Structures

In this paper we address the problem of virtual central navigation in 3D tubular structures. A virtual mobile robot, equipped with a neuro-fuzzy controller, is trained to navigate inside image datasets of tubular structures while keeping a central position; virtual range sensors are used to sense the surrounding walls and to provide input to the controller. The aim of this research is the identification of smooth and continuous central paths, which are useful in several medical applications: virtual endoscopy, virtual colonoscopy, virtual angioscopy, virtual bronchoscopy, etc. We fully validated the algorithm on synthetic datasets and performed successful experiments on a colon dataset.

Luca Ferrarini, Hans Olofsen, Johan H. C. Reiber, Faiza Admiraal-Behloul
Emulating Process Simulators with Learning Systems

We explore the possibility of replacing a process simulator with a learning system. This is motivated in the presented test case setting by a need to speed up a simulator that is to be used in conjunction with an optimisation algorithm to find near optimal process parameters. Here we will discuss the potential problems and difficulties in this application, how to solve them and present the results from a paper mill test case.

Daniel Gillblad, Anders Holst, Björn Levin
Evolving Modular Fast-Weight Networks for Control

In practice, almost all control systems in use today implement some form of linear control. However, there are many tasks for which conventional control engineering methods are not directly applicable because there is not enough information about how the system should be controlled (i.e. reinforcement learning problems). In this paper, we explore an approach to such problems that evolves fast-weight neural networks. These networks, although capable of implementing arbitrary non-linear mappings, can more easily exploit the piecewise linearity inherent in most systems in order to produce simpler and more comprehensible controllers. The method is tested on a 2D mobile robot version of the pole balancing task, where the controller must learn to switch between two operating modes: one using a single pole and the other using a jointed-pole version that has not been solved before.

Faustino Gomez, Jürgen Schmidhuber
Topological Derivative and Training Neural Networks for Inverse Problems

We consider the problem of locating small openings inside the domain of definition of an elliptic equation, using as observation data the values of a finite number of integral functionals. The application of neural networks requires a great number of training sets. Approximating these functionals by means of the topological derivative allows training data to be generated very quickly. The results of computations for 2D examples show that the method allows one to determine an approximation of the global solution to the inverse problem that is sufficiently close to the exact solution.

Lidia Jackowska-Strumiłło, Jan Sokołowski, Antoni Żochowski
Application of Domain Neural Network to Optimization Tasks

A new model of neural network (the domain model) is proposed. In this model the neurons are joined together into larger groups (domains), and the updating rule is modified accordingly. It is shown that memory capacity grows linearly as a function of the domain size. In optimization tasks, this kind of neural network allows one to find deeper local minima of the energy than the standard asynchronous dynamics.

Boris Kryzhanovsky, Bashir Magomedov
Eigenvalue Problem Approach to Discrete Minimization

The problem of finding the deepest local minimum of a quadratic functional of binary variables is discussed. Our approach is based on asynchronous neural dynamics and utilizes the eigenvalues and eigenvectors of the connection matrix. We discuss the role of the largest eigenvalues. We report the results of intensive computer experiments with random matrices of large dimensions N ~ 10^2–10^3.

Leonid B. Litinskii
A Neurocomputational Approach to Decision Making and Aging

The adaptive toolbox approach to human rationality analyzes environments and proposes detailed cognitive mechanisms that exploit the structures identified. This paper argues that the posited mechanisms are suitable for implementation as connectionist networks and that this allows (1) integrating behavioral, biological, and information processing levels, an attractive feature of any approach in cognitive science; and (2) addressing developmental issues. These claims are supported by reporting implementations of decision strategies using simple recurrent networks and by showing how age differences related to attenuation in catecholaminergic modulation can be modeled by lowering the G parameter in these networks. This approach is shown to be productive by deriving empirically testable predictions of age differences in decision making tasks.

Rui Mata
Comparison of Neural Network Robot Models with Not Inverted and Inverted Inertia Matrix

The mathematical model of an industrial robot is usually described in the form of Lagrange-Euler equations, Newton-Euler equations or generalized d’Alembert equations. However, these equations require the physical parameters of a robot, which are difficult to obtain. In this paper, two methods for calculating a Lagrange-Euler model of a robot using neural networks are presented and compared. The proposed network structure is based on an approach in which either a non-inverted or an inverted inertia matrix is calculated. The presented models show good performance for different sets of data.

Jakub Możaryn, Jerzy E. Kurek
Causal Neural Control of a Latching Ocean Wave Point Absorber

A causal neural control strategy is described for a simple “heaving” wave energy converter. It is shown that effective control can be produced over a range of off-resonant frequencies. A latching strategy is investigated, utilising a biologically inspired neural oscillator as the basis for the control.

T. R. Mundon, A. F. Murray, J. Hallam, L. N. Patel
An Off-Policy Natural Policy Gradient Method for a Partial Observable Markov Decision Process

The field of reinforcement learning has long faced the “exploration-exploitation problem”: an agent must decide whether to explore for a better action, which may not necessarily exist, or to exploit many rewards by taking the current best action. In this article, we propose an off-policy reinforcement learning method based on natural policy gradient learning as a solution to the exploration-exploitation problem. In our method, the policy gradient is estimated from a sequence of state-action pairs sampled by performing an arbitrary “behavior policy”; this allows us to deal with the exploration-exploitation problem by handling the generation process of behavior policies. By applying the method to an autonomous control problem of a three-dimensional cart-pole, we show that it can realize optimal control efficiently in a partially observable domain.

Yutaka Nakamura, Takeshi Mori, Shin Ishii
A Simplified Forward-Propagation Learning Rule Applied to Adaptive Closed-Loop Control

In terms of computational neuroscience, several theoretical learning schemes have been proposed for acquiring suitable motor controllers in the human brain. The controllers have been classified into feedforward and feedback types, as inverse models of the controlled objects. For learning a feedforward controller, we have proposed a forward-propagation learning (FPL) rule which propagates error “forward” in a multi-layered neural network to solve the credit assignment problem. In the current work, FPL is simplified to realize accurate learning and is extended to adaptive feedback control. The suitability of the proposed scheme is confirmed by computer simulation.

Yoshihiro Ohama, Naohiro Fukumura, Yoji Uno
Improved, Simpler Neural Controllers for Lamprey Swimming

Swimming in the lamprey (an eel-like fish) is governed by activity in its spinal neural network, called a central pattern generator (CPG). Simpler, alternative controllers can be evolved which provide improved performance over the biological prototype (modelled by Ekeberg). Results of computational evolutions demonstrate that several possible outcomes exist, with reduced connectivity (16 connections instead of 26) and a diminished equation set for describing the model. Furthermore, the resulting oscillators operate over a wider frequency range (0.99–12.67 Hz), outperforming the biological prototype (1.74–5.56 Hz). Evolving advanced yet simpler controllers provides solutions which are more attainable in silicon (VLSI), determines the extent to which nature’s solutions are unique, and generates efficient task-specific versions.

Leena N. Patel, John Hallam, Alan Murray
Supervision of Control Valves in Flotation Circuits Based on Artificial Neural Network

Flotation circuits play an important role in extracting valuable minerals from ore. To control this process, the level is used to manipulate either the concentrate or the tailings grade. One of the key elements in controlling the level of a flotation cell is the control valve, and the timely detection of any problem in these valves could mean big operational savings. This paper compares two Artificial Neural Network architectures for detecting clogging in control valves: the first is based on the traditional autoassociative feedforward architecture with a bottleneck layer, and the other is based on discrete principal curves. We show that clogging can be promptly detected by both methods; however, the second alternative can carry out the detection more efficiently than the first.

D. Sbarbaro, G. Carvajal
Comparison of Volterra Models Extracted from a Neural Network for Nonlinear Systems Modeling

In this paper, a Time-Delayed feed-forward Neural Network (NN) is used to obtain an input-output time-domain characterization of a nonlinear electronic device. The procedure also provides an analytical expression for its behavior, the Volterra series model, to predict the device response to multiple input power levels. This model, however, can be built to different degrees of accuracy, depending on the activation function chosen for the NN. We compare two Volterra series models extracted from different networks, having hyperbolic tangent and polynomial activation functions. This analysis is applied to the modeling of a Power Amplifier (PA).

Georgina Stegmayer
Identification of Frequency-Domain Volterra Model Using Neural Networks

In this paper, a new method is introduced for the identification of a Volterra model for the representation of a nonlinear electronic device in the frequency domain. The Volterra model is a numerical series with some particular terms named kernels. Our proposal is the use of feedforward neural networks (FNN) for the modeling of the nonlinearities in the device behavior, and a special procedure which uses the neural networks parameters for the kernels identification. The proposed procedure has been tested with simulation data from a class “A” Power Amplifier (PA) which validate our approach.

Georgina Stegmayer, Omar Chiotti
Hierarchical Clustering for Efficient Memory Allocation in CMAC Neural Network

The CMAC Neural Network is a popular choice for control applications. One of the main problems with CMAC is that the memory needed for the network grows exponentially with each added input variable. In this paper, we present a new CMAC architecture with more effective allocation of the available memory space. The proposed architecture employs hierarchical clustering to perform adaptive quantization of the input space by capturing the degree of variation in the output target function to be learned. We show through a car maneuvering control application that, using this new architecture, the memory requirement can be reduced significantly compared with conventional CMAC while maintaining the desired performance quality.

Sintiani D. Teddy, Edmund M. -K. Lai

Special Session: Knowledge Extraction from Neural Networks Organizer and Chair: D. A. Elizondo

Knowledge Extraction from Unsupervised Multi-topographic Neural Network Models

This paper presents a new approach whose aim is to extend the scope of numerical models by providing them with knowledge extraction capabilities. The basic model considered in this paper is a multi-topographic neural network model. One of the most powerful features of this model is its generalization mechanism, which allows rule extraction to be performed. The extraction of association rules is itself based on original quality measures which evaluate to what extent a numerical classification model behaves as a natural symbolic classifier such as a Galois lattice. A first experimental illustration of rule extraction is presented on documentary data constituted by a set of patents issued from a patent database.

Shadi Al Shehabi, Jean-Charles Lamirel
Current Trends on Knowledge Extraction and Neural Networks

The extraction of knowledge from trained neural networks provides a way for explaining the functioning of a neural network. This is important for artificial networks to gain a wider degree of acceptance. An increasing amount of research has been carried out to develop mechanisms, procedures and techniques for extracting knowledge from trained neural networks. This publication presents some of the current research trends on extracting knowledge from trained neural networks.

David A. Elizondo, Mario A. Góngora
Prediction of Yeast Protein–Protein Interactions by Neural Feature Association Rule

In this paper, we present an association rule based protein interaction prediction method. We use a neural network to cluster protein interaction data and a feature selection method to reduce the protein feature dimension. After model training, association rules for protein interaction prediction are generated by decoding a set of learned weights of the trained neural network and by association rule mining. For model training, the initial network model was constructed with existing protein interaction data in terms of their functional categories and interactions. The protein interaction data of Yeast (S. cerevisiae) from MIPS and SGD are used. The prediction performance was compared with a traditional simple association rule mining method. According to the experimental results, the proposed method shows about 96.1% accuracy, compared with about 91.4% for the simple association mining approach.

Jae-Hong Eom, Byoung-Tak Zhang
A Novel Method for Extracting Knowledge from Neural Networks with Evolving SQL Queries

While artificial neural networks (ANNs) are undoubtedly powerful classifiers their results are sometimes treated with suspicion. This is because their decisions are not open to inspection – the knowledge they contain is hidden. In this paper we describe a method for extracting and representing the knowledge within an ANN. Mappings between inputs and output classifications are stored in a table and, for each classification, Structured Query Language (SQL) queries are evolved using a genetic algorithm. Each evolved query is a simple, human-readable representation of the knowledge used by the ANN to decide on the classification based on the inputs. This method can also be used to show how the knowledge within an ANN develops as it is trained, and can help to identify problems that are particularly hard, or easy, for ANNs to classify.

Mario A. Góngora, Tim Watson, David A. Elizondo
CrySSMEx, a Novel Rule Extractor for Recurrent Neural Networks: Overview and Case Study

In this paper, it will be shown that it is feasible to extract finite state machines in a domain of, for rule extraction, previously unencountered complexity. The algorithm used is called the Crystallizing Substochastic Sequential Machine Extractor, or CrySSMEx. It extracts the machine from sequence data generated by the RNN in interaction with its domain. CrySSMEx is parameter-free, deterministic, and generates a sequence of increasingly deterministic extracted stochastic models until a fully deterministic machine is found.

Henrik Jacobsson, Tom Ziemke
Computational Neurogenetic Modeling: Integration of Spiking Neural Networks, Gene Networks, and Signal Processing Techniques

The paper presents a theory and a new generic computational model of a biologically plausible artificial neural network (ANN), the dynamics of which is influenced by the dynamics of an internal gene regulatory network (GRN). We call this model a “computational neurogenetic model” (CNGM) and this new area of research Computational Neurogenetics. We aim at developing a novel computational modeling paradigm that can potentially bring original insights into how genes and their interactions influence the function of brain neural networks in normal and diseased states. In the proposed model, the FFT and spectral characteristics of the ANN output are analyzed and compared with the brain EEG signal. The model includes a large set of biologically plausible parameters and interactions related to genes/proteins and spiking neuronal activities. These parameters are optimized, based on targeted EEG data, using a genetic algorithm (GA). Open questions and future directions are outlined.

Nikola Kasabov, Lubica Benuskova, Simei Gomes Wysoski
Information Visualization for Knowledge Extraction in Neural Networks

In this paper, a user-centred innovative method of knowledge extraction in neural networks is described. This is based on information visualization techniques and tools for artificial and natural neural systems. Two case studies are presented. The first demonstrates the use of various information visualization methods for the identification of neuronal structure (e.g. groups of neurons that fire synchronously) in spiking neural networks. The second study applies similar techniques to the study of embodied cognitive robots in order to identify the complex organization of behaviour in the robot’s neural controller.

Liz Stuart, Davide Marocco, Angelo Cangelosi
Combining GAs and RBF Neural Networks for Fuzzy Rule Extraction from Numerical Data

The idea of using RBF neural networks for fuzzy rule extraction from numerical data is not new. The structure of this kind of architecture, which supports clustering of data samples, is favorable for treating clusters as if-then rules. However, in order for real if-then rules to be derived, a proper antecedent part for each cluster needs to be constructed by selecting the subspace of the input space that best matches each cluster’s properties. In this paper we address the problem of antecedent part construction by (a) initializing the hidden layer of an RBF-Resource Allocating Network using an unsupervised clustering technique whose metric is based on the input dimensions that best relate the data samples in a cluster, and (b) pruning input connections to hidden nodes on a per-node basis, using an innovative Genetic Algorithm optimization scheme.

Manolis Wallace, Nicolas Tsapatsoulis

Temporal Data Analysis, Prediction and Forecasting

Neural Network Algorithm for Events Forecasting and Its Application to Space Physics Data

Many practical tasks require discovering interconnections between the behavior of a complex object and events initiated by this behavior or correlating with it. In such cases it is supposed that emergence of an event is preceded by some phenomenon – a combination of values of the features describing the object, in a known range of time delays. Recently the authors suggested a neural network based method of analysis of such objects. In this paper, the results of experiments on real-world data are presented. The method aims at revealing morphological and dynamical features causing the event or preceding its emergence.

S. A. Dolenko, Yu. V. Orlov, I. G. Persiantsev, Ju. S. Shugai
Counterpropagation with Delays with Applications in Time Series Prediction

The paper presents a method for time series prediction using a complete counterpropagation network with delay kernels. Our network takes advantage of the clustering and mapping capability of the original CPN combined with dynamical elements, and thus becomes able to discover and approximate the strongest topological and temporal relationships among the fields in the data. Experimental results using two chaotic time series and a set of astrophysical data validate the performance of the proposed method.

Carmen Fierascu
Bispectrum-Based Statistical Tests for VAD

In this paper we propose a voice activity detection (VAD) algorithm for improving speech recognition performance in noisy environments. The approach is based on statistical tests applied to a multiple-observation window, with the speech/non-speech bispectra determined by means of third-order auto-cumulants. This algorithm differs from many others in the way the decision rule is formulated (detection tests) and in the domain used (bispectrum). It is shown that applying statistical detection tests leads to a better separation of the speech and noise distributions, thus allowing more effective discrimination and a tradeoff between complexity and performance. The experimental analysis carried out on the AURORA databases and tasks provides an extensive performance evaluation together with an exhaustive comparison to standard VADs such as ITU G.729, GSM AMR and ETSI AFE for distributed speech recognition (DSR), and other recently reported VADs. Clear improvements in speech recognition are obtained when the proposed VAD is used as part of an ASR system.

J. M. Górriz, J. Ramírez, C. G. Puntonet, F. Theis, E. W. Lang
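The bispectrum estimation this abstract relies on starts from third-order auto-cumulants. As an illustration only (the paper's actual estimator and window handling are not given here, and all names are illustrative), a direct sample estimate of C3(t1, t2) = E[x(n) x(n+t1) x(n+t2)] for a zero-mean frame can be sketched as:

```python
import numpy as np

def third_order_cumulant(x, max_lag):
    """Biased sample estimate of the third-order auto-cumulant
    C3(t1, t2) = E[x(n) x(n+t1) x(n+t2)] for a zero-mean frame."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                      # cumulants assume a zero-mean signal
    n = len(x)
    lags = range(-max_lag, max_lag + 1)
    c3 = np.zeros((2 * max_lag + 1, 2 * max_lag + 1))
    for i, t1 in enumerate(lags):
        for j, t2 in enumerate(lags):
            s = 0.0
            for k in range(n):
                if 0 <= k + t1 < n and 0 <= k + t2 < n:
                    s += x[k] * x[k + t1] * x[k + t2]
            c3[i, j] = s / n              # biased (1/n) estimator
    return c3

# For Gaussian noise the third-order cumulants vanish asymptotically,
# which is the property that lets such tests separate speech from noise.
rng = np.random.default_rng(0)
frame = rng.normal(size=2048)
c3 = third_order_cumulant(frame, max_lag=3)
```

The bispectrum itself is the 2-D Fourier transform of this cumulant lattice; the sketch stops at the cumulant stage.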
Back-Propagation as Reinforcement in Prediction Tasks

The back-propagation (BP) training scheme is widely used for training network models in cognitive science despite its well-known technical and biological shortcomings. In this paper we contribute to making the BP training scheme more acceptable from a biological point of view in cognitively motivated prediction tasks, overcoming one of its major drawbacks.

Traditionally, recurrent neural networks in symbolic time series prediction (e.g. language) are trained with gradient descent based learning algorithms, notably with back-propagation (BP) through time. A major drawback for the biological plausibility of BP is that it is a supervised scheme in which a teacher has to provide a fully specified target answer. Yet, agents in natural environments often receive only a summary feedback about the degree of success or failure, a view adopted in reinforcement learning schemes.

In this work we show that for simple recurrent networks in prediction tasks for which there is a probability interpretation of the network’s output vector, Elman BP can be reimplemented as a reinforcement learning scheme for which the expected weight updates agree with the ones from traditional Elman BP, using ideas from the AGREL learning scheme (van Ooyen and Roelfsema 2003) for feed-forward networks.

André Grüning
Mutual Information and k-Nearest Neighbors Approximator for Time Series Prediction

This paper presents a method that combines Mutual Information and the k-Nearest Neighbors approximator for time series prediction. Mutual Information is used for input selection. The k-Nearest Neighbors approximator is used to improve the input selection and to provide a simple but accurate prediction method. Due to its simplicity, the method is repeated to build a large number of models that are used for long-term prediction of time series. The Santa Fe A time series is used as an example.

Antti Sorjamaa, Jin Hao, Amaury Lendasse
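As a rough illustration of the second ingredient above (the k-NN approximator on lagged inputs; the paper's mutual-information input selection is not reproduced here, and all names and parameters are illustrative):

```python
import numpy as np

def knn_predict(X_train, y_train, x_query, k=3):
    """k-nearest-neighbours approximator: average the targets of the
    k training inputs closest (in Euclidean distance) to the query."""
    d = np.linalg.norm(X_train - x_query, axis=1)
    idx = np.argsort(d)[:k]
    return y_train[idx].mean()

def make_lagged(series, lags):
    """Build a regression set from a time series using the given lags
    (the candidate inputs that mutual information would rank)."""
    series = np.asarray(series, dtype=float)
    max_lag = max(lags)
    X = np.column_stack([series[max_lag - l:len(series) - l] for l in lags])
    y = series[max_lag:]
    return X, y

# toy example: predict x(t) from x(t-1), x(t-2) on a noisy AR(1) series
rng = np.random.default_rng(1)
s = np.zeros(300)
for t in range(1, 300):
    s[t] = 0.9 * s[t - 1] + 0.1 * rng.normal()
X, y = make_lagged(s, lags=[1, 2])
pred = knn_predict(X[:-1], y[:-1], X[-1], k=5)
```

Repeating such a cheap model over many lag subsets and horizons is what makes the abstract's long-term prediction strategy feasible.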
Some Issues About the Generalization of Neural Networks for Time Series Prediction

Some issues about the generalization of ANN training are investigated through experiments with several synthetic and real-world time series. One commonly accepted view is that when the ratio of the training sample size to the number of weights is larger than 30, overfitting will not occur. However, it is found that even with a ratio higher than 30, overfitting still exists. In cross-validated early stopping, the ratio of cross-validation data size to training data size has no significant impact on the testing error; for stationary time series, 10% may be a practical choice. Both the Bayesian regularization method and the cross-validated early stopping method are helpful when the ratio of training sample size to the number of weights is less than 20. However, the performance of early stopping is highly variable. The Bayesian method outperforms early stopping in most cases, and in some cases even outperforms no-stop training when the training data set is large.

Wen Wang, Pieter H. A. J. M. Van Gelder, J. K. Vrijling
Multi-step-ahead Prediction Based on B-Spline Interpolation and Adaptive Time-Delay Neural Network

The availability of accurate empirical models for multi-step-ahead (MS) prediction is desirable in many areas. Motivated by B-spline interpolation and adaptive time-delay neural network (ATNN) which have proven successful in addressing different complicated problems, we aim at investigating the applicability of ATNN for MS prediction and propose a hybrid model SATNN. The annual sunspots and Mackey-Glass equation considered as benchmark chaotic nonlinear systems were selected to test our model. Validation studies indicated that the proposed model is quite effective in MS prediction, especially for single factor time series.

Jing-Xin Xie, Chun-Tian Cheng, Bin Yu, Qing-Rui Zhang

Support Vector Machines and Kernel-Based Methods

Training of Support Vector Machines with Mahalanobis Kernels

Radial basis function (RBF) kernels are widely used for support vector machines, but model selection requires optimizing the kernel parameter and the margin parameter by time-consuming cross-validation. To solve this problem, in this paper we propose using Mahalanobis kernels, which are generalized RBF kernels. We determine the covariance matrix for the Mahalanobis kernel using the training data of the associated classes. Model selection is done by line search: first the margin parameter is optimized, and then the Mahalanobis kernel parameter. According to computer experiments on two-class problems, a Mahalanobis kernel with a diagonal covariance matrix shows better generalization ability than one with a full covariance matrix, and a Mahalanobis kernel optimized by line search shows performance comparable to that of an RBF kernel optimized by grid search.

Shigeo Abe
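A minimal sketch of a Mahalanobis kernel of the kind the abstract describes, with a diagonal covariance estimated from training data; the exact parametrization used in the paper may differ, and `delta` merely stands in for the single kernel parameter tuned by line search:

```python
import numpy as np

def mahalanobis_kernel(X1, X2, inv_cov, delta=1.0):
    """Generalized RBF (Mahalanobis) kernel:
    K(x, x') = exp(-delta * (x - x')^T S^{-1} (x - x')).
    With S = I this reduces to an ordinary RBF kernel."""
    diff = X1[:, None, :] - X2[None, :, :]            # (n1, n2, d)
    d2 = np.einsum('ijk,kl,ijl->ij', diff, inv_cov, diff)
    return np.exp(-delta * d2)

# Diagonal covariance estimated from training data, so each feature is
# scaled by its own variance (delta would then be optimized by line search).
rng = np.random.default_rng(0)
X = rng.normal(scale=[1.0, 5.0], size=(100, 2))       # unequal feature scales
S_diag = np.diag(X.var(axis=0))
K = mahalanobis_kernel(X, X, np.linalg.inv(S_diag), delta=0.5)
```

Because the covariance already adapts the kernel to the data's scale, only the scalar `delta` remains to tune, which is what makes a one-dimensional line search sufficient.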
Smooth Bayesian Kernel Machines

In this paper, we consider the possibility of obtaining a kernel machine that is sparse in feature space and smooth in output space. Smoothness in output space means that the underlying function is assumed to have continuous derivatives up to some order. Smoothness is achieved by applying a roughness penalty, a concept from the area of functional data analysis. Sparseness is taken care of by automatic relevance determination. Both are combined in a Bayesian model, which has been implemented and tested. Test results are presented in the paper.

Rutger W. ter Borg, Léon J. M. Rothkrantz
A New Kernel-Based Algorithm for Online Clustering

This paper presents a kernel-based clustering algorithm called SAKM (Self-Adaptive Kernel Machine), developed to learn continuously evolving clusters from non-stationary data. Dedicated to online clustering in a multi-class environment, this algorithm is based on an unsupervised learning process with self-adaptive abilities. This process is achieved through three main stages: cluster creation (with an initialization procedure), online cluster adaptation, and cluster fusion. Thanks to a new kernel-induced similarity measure, the SAKM algorithm is very computationally efficient, which makes it attractive for online applications. Finally, some experiments illustrate the capacities of our algorithm in a non-stationary environment.

Habiboulaye Amadou Boubacar, Stéphane Lecoeuche
The LCCP for Optimizing Kernel Parameters for SVM

Tuning hyper-parameters is a necessary step to improve learning algorithm performance. For Support Vector Machine classifiers, adjusting kernel parameters increases the recognition accuracy drastically. Basically, cross-validation is performed by sweeping the parameter space exhaustively. The complexity of such a grid search is exponential in the number of optimized parameters. Recently, a gradient descent approach has been introduced in [1] which drastically reduces the number of search steps for the optimal parameters. In this paper, we define the LCCP (Log Convex Concave Procedure) optimization scheme, derived from the CCCP (Convex ConCave Procedure), for optimizing kernel parameters by minimizing the radius-margin bound. To apply the LCCP, we prove, for a particular choice of kernel, that the radius is log convex and the margin is log concave. The LCCP is more efficient than the gradient descent technique since it ensures that the radius-margin bound decreases monotonically and converges to a local minimum without a step-size search. Experiments with standard data sets are provided and discussed.

Sabri Boughorbel, Jean Philippe Tarel, Nozha Boujemaa
The GCS Kernel for SVM-Based Image Recognition

In this paper, we present a new compactly supported kernel for SVM-based image recognition. This kernel, which we call Geometric Compactly Supported (GCS), can be viewed as a generalization of spherical kernels to higher dimensions. The construction of the GCS kernel is based on a geometric approach using the intersection volume of two n-dimensional balls. The compactness property of the GCS kernel leads to a sparse Gram matrix, which enhances computational efficiency by using sparse linear algebra algorithms. Comparisons of the GCS kernel performance on an image recognition task with other known kernels demonstrate the interest of this new kernel.

Sabri Boughorbel, Jean-Philippe Tarel, François Fleuret, Nozha Boujemaa
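The link between compact support and a sparse Gram matrix can be illustrated in one dimension, where the intersection volume of two radius-r balls (intervals) reduces to max(0, 2r − |x − x′|); this is only a 1-D stand-in for the paper's n-dimensional GCS construction, with illustrative names and parameters:

```python
import numpy as np

def interval_overlap_kernel(x1, x2, r=1.0):
    """1-D instance of the geometric construction: K(x, x') is the
    length of the intersection of two radius-r intervals centred at
    x and x', i.e. max(0, 2r - |x - x'|).  It vanishes whenever the
    points are more than 2r apart, so the Gram matrix is sparse."""
    d = np.abs(x1[:, None] - x2[None, :])
    return np.maximum(0.0, 2.0 * r - d)

# On points spread over [0, 100] with r = 1, only a narrow band of the
# Gram matrix is nonzero, which sparse linear algebra can exploit.
x = np.linspace(0.0, 100.0, 500)
K = interval_overlap_kernel(x, x, r=1.0)
sparsity = np.mean(K == 0.0)
```

In higher dimensions the same principle holds: the support radius caps the bandwidth of the Gram matrix regardless of the dataset size.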
Informational Energy Kernel for LVQ

We describe a kernel method which uses the maximization of Onicescu’s informational energy as a criterion for computing the relevances of input features. This adaptive relevance determination is used in combination with the neural-gas and the generalized relevance LVQ algorithms. Our quadratic optimization function, as an L2-type method, leads to a linear gradient and thus easier computation. We obtain an approximation formula similar to the mutual information based method, but in a simpler way.

Angel Caţaron, Răzvan Andonie
Reducing the Effect of Out-Voting Problem in Ensemble Based Incremental Support Vector Machines

Although Support Vector Machines (SVMs) have been successfully applied to solve a large number of classification and regression problems, they suffer from the catastrophic forgetting phenomenon. In our previous work, integrating SVM classifiers into an ensemble framework using Learn++ (SVMLearn++) [1], we have shown that SVM classifiers can in fact be equipped with the incremental learning capability. However, Learn++ suffers from an inherent out-voting problem: when asked to learn new classes, an unnecessarily large number of classifiers are generated. In this paper, we propose a new ensemble-based incremental learning approach using SVMs that is based on the incremental Learn++.MT algorithm. Experiments on real-world and benchmark datasets show that the proposed approach can reduce the number of SVM classifiers generated, thus reducing the effect of the out-voting problem. It also provides performance improvements over the previous approach.

Zeki Erdem, Robi Polikar, Fikret Gurgen, Nejat Yumusak
A Comparison of Different Initialization Strategies to Reduce the Training Time of Support Vector Machines

This paper presents a comparison of different initialization algorithms combined with decomposition methods, in order to reduce the training time of Support Vector Machines (SVMs). Training an SVM involves the solution of a quadratic optimization problem (QP). The QP problem is very resource consuming (in computational time and memory), because the quadratic form is dense and the memory requirements grow quadratically with the number of data points. The SVM-QP problem can be solved by several optimization strategies but, for large scale applications, they must be combined with decomposition algorithms that break up the entire SVM-QP problem into a series of smaller ones. The support vectors found in the training of SVMs represent a small subgroup of the training patterns. Some algorithms are used to initialize the SVMs, making a fast approximation of the points likely to be support vectors, so as to train the SVM only with those data. Combinations of these initialization algorithms and the decomposition approach, coupled with a QP solver specially arranged for the SVM-QP problem, are compared using some well-known benchmarks in order to show their capabilities.

Ariel García-Gamboa, Neil Hernández-Gress, Miguel González-Mendoza, Rodolfo Ibarra-Orozco, Jaime Mora-Vargas
A Hierarchical Support Vector Machine Based Solution for Off-line Inverse Modeling in Intelligent Robotics Applications

A novel approach is presented for continuous function approximation using a two-stage neural network model involving Support Vector Machines (SVMs) and an adaptive unsupervised neural network, to be applied to real functions of many variables. It involves an adaptive Kohonen feature map (SOFM) in the first stage, which aims at quantizing the input variable space into smaller regions representative of the input space probability distribution, preserving its original topology while rapidly increasing cluster distances. During the convergence phase of the map, a group of Support Vector Machines associated with its codebook vectors is simultaneously trained in an online fashion, so that each SVM learns to respond when the input data belong to the topological space represented by its corresponding codebook vector. The proposed methodology is applied, with promising results, to the design of a neural-adaptive controller using the computed-torque approach, which combines the proposed two-stage neural network model with a servo PD feedback controller. The results achieved by the suggested SVM approach compare favorably to those obtained when the role of the SVMs is undertaken, instead, by Radial Basis Functions (RBFs).

D. A. Karras
LS-SVM Hyperparameter Selection with a Nonparametric Noise Estimator

This paper presents a new method for the selection of the two hyperparameters of Least Squares Support Vector Machine (LS-SVM) approximators with Gaussian kernels: the width σ of the Gaussian kernels and the regularization parameter λ. For different values of σ, a Nonparametric Noise Estimator (NNE) is introduced to estimate the variance of the noise on the outputs. The NNE allows the determination of the best λ for each given σ. A leave-one-out methodology is then applied to select the best σ. Therefore, this method transforms the double optimization problem into a single optimization one. The method is tested on two problems: a toy example and the Pumadyn regression benchmark.

Amaury Lendasse, Yongnan Ji, Nima Reyhani, Michel Verleysen
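One simple nonparametric noise estimator of the family the abstract refers to is the first-nearest-neighbour "delta test"; whether this is the exact NNE used in the paper is not stated, so the sketch below (with illustrative names and data) is only indicative:

```python
import numpy as np

def delta_test_noise_variance(X, y):
    """First-nearest-neighbour ('delta test') estimate of the output
    noise variance: if x_NN(i) is the nearest neighbour of x_i, then
    E[(y_i - y_NN(i))^2] / 2 approaches the noise variance as n grows."""
    X = np.asarray(X, dtype=float)
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)            # exclude self-matches
    nn = d.argmin(axis=1)
    return np.mean((y - y[nn]) ** 2) / 2.0

# Smooth target plus known noise: the estimate should sit near 0.25**2.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(2000, 1))
y = np.sin(2 * np.pi * X[:, 0]) + rng.normal(scale=0.25, size=2000)
est = delta_test_noise_variance(X, y)
```

Knowing the noise variance pins down a sensible λ for each σ, which is what collapses the two-dimensional hyperparameter search into a one-dimensional one.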
Building Smooth Neighbourhood Kernels via Functional Data Analysis

In this paper we address the problem of estimating high density regions from univariate or multivariate data samples. More precisely, we propose a method based on the use of functional data analysis techniques for the construction of smooth kernel functions oriented to solving the One-Class problem. The proposed kernels increase the precision of One-Class estimation procedures. The advantages of this new point of view are shown using data sets drawn from representative density functions.

Alberto Muñoz, Javier M. Moguerza
Recognition of Heartbeats Using Support Vector Machine Networks – A Comparative Study

The paper compares the performance of individual and ensemble SVM classifiers for the recognition of abnormal heartbeats on the basis of registered ECG waveforms. The recognition system applies two different Support Vector Machine based classifiers, and ensemble systems composed of the individual classifiers combined in different ways to obtain the best possible performance on the ECG data. The results of numerical experiments using the data of the MIT-BIH Arrhythmia Database have confirmed the superior performance of the proposed solution.

Stanisław Osowski, Tran Haoi Linh, Tomasz Markiewicz
Componentwise Support Vector Machines for Structure Detection

This paper extends recent advances in Support Vector Machines and kernel machines to estimating additive models for classification from observed multivariate input/output data. Specifically, we address the question of how to obtain predictive models which give insight into the structure of the dataset. This contribution extends the framework of structure detection, as introduced in recent publications by the authors, towards the estimation of componentwise Support Vector Machines (cSVMs). The result is applied to a benchmark classification task in which the input variables all take binary values.

K. Pelckmans, J. A. K. Suykens, B. De Moor
Memory in Backpropagation-Decorrelation O(N) Efficient Online Recurrent Learning

We consider regularization methods to improve the recently introduced backpropagation-decorrelation (BPDC) online algorithm for O(N) training of fully recurrent networks. While BPDC combines one-step error backpropagation and the usage of the temporal memory of the network dynamics by means of decorrelation of activations, it is an online algorithm using only instantaneous states and errors. As an enhancement, we propose several ways to introduce memory into the algorithm for regularization. Simulation results on standard tasks show that these strategies cause different effects, either improving training performance at the cost of overfitting or degrading training errors.

Jochen J. Steil

Soft Computing Methods for Data Representation, Analysis and Processing

Incremental Rule Pruning for Fuzzy ARTMAP Neural Network

Fuzzy ARTMAP is capable of incrementally learning interpretable rules. To remove unused or inaccurate rules, a rule pruning method has been proposed in the literature. This paper addresses its limitations when incremental learning is used, and modifies it so that it does not need to store previously learnt samples. Experiments show better performance, especially in concept drift problems.

A. Andrés-Andrés, E. Gómez-Sánchez, M. L. Bote-Lorenzo
An Inductive Learning Algorithm with a Partial Completeness and Consistence via a Modified Set Covering Problem

We present an inductive learning algorithm that allows for partial completeness and consistence, i.e. one that derives classification rules correctly describing, e.g., most of the examples belonging to a class while not describing most of the examples not belonging to this class. The problem is represented as a modification of the set covering problem, which is solved by a greedy algorithm. The approach is illustrated on some medical data.

Janusz Kacprzyk, Grażyna Szkatuła
A Neural Network for Text Representation

Text categorization and retrieval tasks are often based on a good representation of textual data. Departing from the classical vector space model, several probabilistic models have been proposed recently, such as PLSA. In this paper, we propose the use of a neural network based, non-probabilistic solution, which jointly captures a rich representation of words and documents. Experiments performed on two information retrieval tasks using the TDT2 database and the TREC-8 and 9 sets of queries yielded better performance for the proposed neural network model, as compared to PLSA and the classical TFIDF representations.

Mikaela Keller, Samy Bengio
A Fuzzy Approach to Some Set Approximation Operations

In many real-life problems we deal with a set of objects together with their properties. Due to incompleteness and/or imprecision of the available data, the true knowledge about subsets of objects can only be determined approximately. In this paper we present a fuzzy generalisation of two relation-based operations suitable for set approximations. The first approach is based on relationships between objects and their properties, while in the second approach the set approximation operations are based on similarities between objects. Some properties of these operations are presented.

Anna Maria Radzikowska
Connectionist Modeling of Linguistic Quantifiers

This paper presents a new connectionist model of the grounding of linguistic quantifiers in perception that takes into consideration the contextual factors affecting the use of vague quantifiers. A preliminary validation of the model is presented through the training and testing of the model with experimental data on the rating of quantifiers. The model is able to perform the “psychological” counting of objects (fish) in visual scenes and to select the quantifier that best describes the scene, as in psychological experiments.

Rohana K. Rajapakse, Angelo Cangelosi, Kenny R. Coventry, Steve Newstead, Alison Bacon
Fuzzy Rule Extraction Using Recombined RecBF for Very-Imbalanced Datasets

We describe how to use RecBFs to work with very imbalanced datasets. In this paper, given a very imbalanced dataset from medicine, a set of Membership Functions (MFs) and fuzzy rules are extracted. The core of the method is a recombination of the Membership Functions given by the RecBF algorithm, which provides better generalization than the original one. The results thus obtained can be interpreted as small sets of rules and MFs.

Vicenç Soler, Jordi Roig, Marta Prim
An Iterative Artificial Neural Network for High Dimensional Data Analysis

We present an Iterative Artificial Neural Network (IANN) in which computation is performed through a set of successive layers sharing the same weights. This network requires fewer weights while it can handle high-dimensional inputs. IANN is applied, with good results, to a time series prediction and two classification problems.

Armando Vieira
Towards Human Friendly Data Mining: Linguistic Data Summaries and Their Protoforms

We show how linguistic database summaries can provide tools for human friendly data mining. The relevance of Zadeh’s concept of a protoform is indicated. We present the use of our fuzzy database querying interface for the effective and efficient mining of such linguistic data summaries. We outline an implementation for a computer retailer involving both data from an internal database of the company and data downloaded from external databases via the Internet.

Sławomir Zadrożny, Janusz Kacprzyk, Magdalena Gola

Special Session: Data Fusion for Industrial, Medical and Environmental Applications Organizers and Chairs: D. Mandic, D. Obradovic

Localization of Abnormal EEG Sources Incorporating Constrained BSS

An effective method has been developed to solve the localization problem of brain sources. A priori knowledge about normal source locations is exploited in estimating the rotation matrix, which inherently permutes the estimated separating matrix in the blind source separation (BSS) algorithm. An important application of this method is to localize focal epilepsy sources, which cause changes in attention, movement and behavior. Here, an effective and simple technique for both separation and localization of the EEG sources has been developed incorporating BSS. The criterion assumes that some of the sources are known. The constraint is then incorporated into the separation objective function using Lagrange multipliers, thereby converting it into an unconstrained problem.

Mohamed Amin Latif, Saeid Sanei, Jonathon A. Chambers
Myocardial Blood Flow Quantification in Dynamic PET: An Ensemble ICA Approach

Linear models such as factor analysis, independent component analysis (ICA), and nonnegative matrix factorization (NMF) have been successfully applied to dynamic myocardial $H_{2}^{15}O$ PET image data, showing that meaningful factor images and appropriate time activity curves can be estimated for the quantification of myocardial blood flow. In this paper we apply ensemble ICA to dynamic myocardial $H_{2}^{15}O$ PET image data. The benefit of ensemble ICA (or Bayesian ICA) in such a task is that it decomposes the image data into a linear sum of independent components as in ICA, while imposing nonnegativity constraints on the basis vectors as well as the encoding variables through a rectified Gaussian prior. We show that the major cardiac components are separated successfully by the ensemble ICA method and that blood flow could be estimated in 15 patients. Mean myocardial blood flow was 1.2 ± 0.40 ml/min/g at rest and 1.85 ± 1.12 ml/min/g in the stress state. Blood flow values obtained by an operator on two different occasions were highly correlated (r = 0.99). In the myocardium component images, the image contrast between the left ventricle and the myocardium was 1:2.7 on average.

Byeong Il Lee, Jae Sung Lee, Dong Soo Lee, Seungjin Choi
Data Fusion for Modern Engineering Applications: An Overview

An overview of data fusion approaches is provided from the signal processing viewpoint. The general concept of data fusion is introduced, together with the related architectures, algorithms and performance aspects. Benefits of such an approach are highlighted and potential applications are identified. Case studies illustrate the merits of applying data fusion concepts in real world applications.

Danilo P. Mandic, Dragan Obradovic, Anthony Kuh, Tülay Adali, Udo Trutschell, Martin Golz, Philippe De Wilde, Javier Barria, Anthony Constantinides, Jonathon Chambers
Modified Cost Functions for Modelling Air Quality Time Series by Using Neural Networks

In this paper a new Backpropagation algorithm specifically designed for modelling air pollution time series is proposed. The underlying idea is to modify the error definition in order to improve the capability of the model to forecast episodes of poor air quality. Five different expressions of the error definition are proposed and their performances are rigorously evaluated in the framework of a real case study: the modelling of the 1-hour average daily maximum ozone concentration recorded in the industrial area of Melilli (Siracusa, Italy). Furthermore, two new performance indices to evaluate the model prediction capabilities, referred to as the Probability Index and the Global Index, are introduced. Results indicate that the traditional and the proposed versions of Backpropagation perform quite similarly in terms of the Global Index, which gives a cumulative evaluation of the model; however, the latter algorithm performs better in terms of the percentage of exceedances correctly forecast. Finally, a criterion for choosing among various air quality prediction models is proposed.

Giuseppe Nunnari, Flavio Cannavó
Troubleshooting in GSM Mobile Telecommunication Networks Based on Domain Model and Sensory Information

Mobile cellular telecommunication networks are complex dynamic systems whose troubleshooting presents formidable challenges. Typically, the network performance analysis is carried out on a network cell basis and it is based on the traffic information obtained from various sensors such as the number of requested calls, number of dropped calls, number of handovers, etc. This paper presents a novel troubleshooting system, which provides likelihood of different user-specified root causes of performance degradation based on the observed sensory information and the underlying domain model. This domain model has a form of a Causal Network whose structure is appropriately chosen. The novelty of the herein presented approach is that the domain model is initially based on expert knowledge and later on refined via supervised learning with the data gathered during system operation.

Dragan Obradovic, Ruxandra Lupas Scheiterer
Energy of Brain Potentials Evoked During Visual Stimulus: A New Biometric?

We further explore the possibility of using the energy of brain potentials evoked during the processing of visual stimuli (VS) as a new biometric tool, where biometric features representing the energy of high-frequency electroencephalogram (EEG) spectra are used in a person identification paradigm. For convenience and ease of cognitive processing, simple black and white drawings of common objects are used as VS in the experiments. In the classification stage, an Elman neural network is employed to classify the generated EEG features. The high recognition rate of 99.62% on an ensemble of 800 raw EEG signals indicates the potential of the proposed method.

Ramaswamy Palaniappan, Danilo P. Mandic
Communicative Interactivity – A Multimodal Communicative Situation Classification Approach

The problem of modality detection in so-called Communicative Interactivity is addressed. Multiple audio and video recordings of human communication are analyzed within this framework, based on fusion of the extracted features. At the decision level, Support Vector Machines (SVMs) are utilized to discriminate between the communication modalities. The proposed approach is verified through simulations on real-world recordings.

Tomasz M. Rutkowski, Danilo Mandic
Bayesian Network Modeling Aspects Resulting from Applications in Medical Diagnostics and GSM Troubleshooting

This paper addresses issues in constructing a Bayesian Network (BN) domain model for diagnostic purposes from expert knowledge. The novelty of this paper is the approach for structured generation of a model that incorporates the unstructured multifaceted and possibly conflicting probabilistic information provided by the experts.

Ruxandra Lupas Scheiterer, Dragan Obradovic
Fusion of State Space and Frequency-Domain Features for Improved Microsleep Detection

A novel approach for Microsleep Event detection is presented. It is based on multisensor electroencephalogram (EEG) and electrooculogram (EOG) measurements recorded during an overnight driving simulation task. First, using video clips of the driving sessions, clear Microsleep (MSE) and Non-Microsleep (NMSE) events were identified. Next, segments of EEG and EOG for the selected events were analyzed, and features were extracted using Power Spectral Density and Delay Vector Variance. The features so obtained are used in several combinations for MSE detection and classification by means of populations of Learning Vector Quantization (LVQ) networks. The best classification results, with test errors down to 13%, were obtained by combining all the recorded EEG and EOG channels, all features, and feature relevance adaptation using Genetic Algorithms.

David Sommer, Mo Chen, Martin Golz, Udo Trutschel, Danilo Mandic
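The abstract does not give the LVQ configuration; a bare-bones LVQ1 classifier on toy two-dimensional features (cluster positions, learning rate, and labels are all illustrative stand-ins for the paper's MSE/NMSE feature vectors) might look like:

```python
import random

def dist2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train_lvq1(data, labels, prototypes, proto_labels, lr=0.1, epochs=30):
    """LVQ1: move the nearest prototype toward the sample if labels match,
    away from it otherwise."""
    for _ in range(epochs):
        for x, y in zip(data, labels):
            j = min(range(len(prototypes)), key=lambda k: dist2(x, prototypes[k]))
            sign = 1.0 if proto_labels[j] == y else -1.0
            prototypes[j] = [p + sign * lr * (xi - p) for p, xi in zip(prototypes[j], x)]
    return prototypes

def predict(x, prototypes, proto_labels):
    j = min(range(len(prototypes)), key=lambda k: dist2(x, prototypes[k]))
    return proto_labels[j]

random.seed(0)
# Two well-separated toy clusters standing in for NMSE / MSE feature vectors.
data = [[random.gauss(0, 0.3), random.gauss(0, 0.3)] for _ in range(40)] + \
       [[random.gauss(3, 0.3), random.gauss(3, 0.3)] for _ in range(40)]
labels = ["NMSE"] * 40 + ["MSE"] * 40
protos = [[0.5, 0.5], [2.5, 2.5]]
plabels = ["NMSE", "MSE"]
train_lvq1(data, labels, protos, plabels)
acc = sum(predict(x, protos, plabels) == y for x, y in zip(data, labels)) / len(data)
print(acc)  # high on these separated clusters
```

The paper trains populations of such networks on EEG/EOG feature combinations; this sketch only shows the core update rule.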
Combining Measurement Quality into Monitoring Trends in Foliar Nutrient Concentrations

Quality of measurements is an important factor affecting the reliability of analyses in environmental sciences. In this paper we combine foliar measurement data from Finland and results of multiple measurement quality tests from different sources in order to study the effect of measurement quality on the reliability of foliar nutrient analysis. In particular, we study the use of weighted linear regression models in detecting trends in foliar time series data and show that the development of measurement quality has a clear effect on the significance of results.

Mika Sulkava, Pasi Rautio, Jaakko Hollmén
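The weighted linear regression the abstract mentions has a simple closed form; as a sketch (the data and quality weights below are invented, not the Finnish foliar measurements), down-weighting low-quality years changes the fitted trend:

```python
def weighted_linfit(x, y, w):
    """Closed-form weighted least squares fit y ~ a + b*x.

    Each weight would come from a measurement-quality score
    (higher quality -> larger weight); here they are illustrative.
    """
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    b = sxy / sxx
    a = my - b * mx
    return a, b

# Yearly nutrient concentrations with a downward trend; the noisy early
# years get low weight, so the fitted slope tracks the reliable data.
years = [1990, 1992, 1994, 1996, 1998, 2000]
conc = [12.0, 11.9, 11.1, 10.6, 10.1, 9.6]
quality = [0.2, 0.2, 1.0, 1.0, 1.0, 1.0]
a, b = weighted_linfit(years, conc, quality)
print(b < 0)  # True: declining trend
```

Trend significance in the paper would additionally involve a test on the slope; only the weighted fit itself is shown here.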
A Fast and Efficient Method for Compressing fMRI Data Sets

We present a new lossless compression method named FTTcoder, which compresses images and 3D sequences collected during a typical functional MRI experiment. The large data sets involved in this popular medical application necessitate novel compression algorithms that take into account the structure of the recorded data as well as the experimental conditions, which include the 4D recordings, the stimulus protocol used, and marked regions of interest (ROI). We propose to use simple temporal transformations and entropy coding with context modeling to encode the 4D scans after preprocessing with ROI masking. Experiments confirm the superior performance of FTTcoder in contrast to previously proposed algorithms, both in terms of speed and compression.

Fabian J. Theis, Toshihisa Tanaka

Special Session: Non-linear Predictive Models for Speech Processing Organizers and Chairs: M. Chetouani, M. Faundez-Zanuy, B. Gas, A. Hussain

Non-linear Predictive Models for Speech Processing

This paper aims to provide an overview of the emerging area of non-linear predictive modelling for speech processing. Traditional predictors are linear models related to the speech production model. However, non-linear phenomena involved in the production process justify the use of non-linear models. This paper investigates certain statistical and signal processing perspectives and reviews a number of non-linear models, including their structure and key parameters (such as the prediction context).

M. Chetouani, Amir Hussain, M. Faundez-Zanuy, B. Gas
Predictive Speech Coding Improvements Based on Speaker Recognition Strategies

This paper compares the speech coder and speaker recognizer applications, showing some parallelism between them. In this paper, some approaches used for speaker recognition are applied to speech coding based on neural networks, in order to improve the prediction accuracy. Experimental results show an improvement in Segmental SNR (SEGSNR) up to 1.7 dB.

Marcos Faundez-Zanuy
Predictive Kohonen Map for Speech Features Extraction

Some well-known theoretical results concerning the universal approximation property of MLP neural networks with one hidden layer have shown that for any function f from [0,1]^n to ℝ, only the output layer weights depend on f. We use this result to propose a network architecture, called the predictive Kohonen map, that allows the design of a new speech feature extractor. We give experimental results of this approach on a phoneme recognition task.

Bruno Gas, Mohamed Chetouani, Jean-Luc Zarader, Christophe Charbuillet
Bidirectional LSTM Networks for Improved Phoneme Classification and Recognition

In this paper, we carry out two experiments on the TIMIT speech corpus with bidirectional and unidirectional Long Short-Term Memory (LSTM) networks. In the first experiment (framewise phoneme classification) we find that bidirectional LSTM outperforms both unidirectional LSTM and conventional Recurrent Neural Networks (RNNs). In the second (phoneme recognition) we find that a hybrid BLSTM-HMM system improves on an equivalent traditional HMM system, as well as on unidirectional LSTM-HMM.

Alex Graves, Santiago Fernández, Jürgen Schmidhuber
Improvement in Language Detection by Neural Discrimination in Comparison with Predictive Models

In this paper, we present a new method of language detection. It is based on language-pair discrimination, using neural networks as classifiers of acoustic features. No acoustic decomposition of the speech signal is needed. We present an improvement of our method applied to the detection of English for signal durations of less than 3 seconds (Call Friend corpus), as well as a comparison with a neural predictive model. The results obtained show scores ranging from 74.7% to 76.9%, depending on the method used.

Sébastien Herry

Special Session: Intelligent Multimedia and Semantics Organizers and Chairs: Y. Avrithis, S. Kollias

Learning Ontology Alignments Using Recursive Neural Networks

The Semantic Web is based on technologies that make the content of the Web machine-understandable. In that framework, ontological knowledge representation has become an important tool for the analysis and understanding of multimedia information. Because of the distributed nature of the Semantic Web, however, ontologies describing similar fields of knowledge are being developed independently, and the data coming from similar but non-identical ontologies can be combined only if a semantic mapping between them is first established. This has led to the development of several ontology alignment tools. We propose an automatic ontology alignment method based on the recursive neural network model that uses ontology instances to learn similarities between ontology concepts. Recursive neural networks are an extension of common neural networks, designed to process structured data efficiently. Since ontologies are a structured data representation, the model is inherently suitable for use with ontologies.

Alexandros Chortaras, Giorgos Stamou, Andreas Stafylopatis
Minimizing Uncertainty in Semantic Identification When Computing Resources Are Limited

In this paper we examine the problem of automatic semantic identification of entities in multimedia documents from a computing point of view. Specifically, we identify as main points to consider the storage of the required knowledge and the computational complexity of the handling of the knowledge as well as of the actual identification process. In order to tackle the above we utilize (i) a sparse representation model for storage, (ii) a novel transitive closure algorithm for handling and (iii) a novel approach to identification that allows for the specification of computational boundaries.

Manolis Falelakis, Christos Diou, Manolis Wallace, Anastasios Delopoulos
Automated Extraction of Object- and Event-Metadata from Gesture Video Using a Bayesian Network

In this work a method for metadata extraction from sign language videos is proposed, employing high-level domain knowledge. The metadata concern the depicted objects of the head and the right/left hand, as well as the occlusion events, which are essential for interpretation and therefore for subsequent higher-level semantic indexing. Occlusions between the two hands, between a hand and the head, and between a hand and the body can easily confuse metadata extraction and can consequently lead to wrong gesture interpretation. Therefore, a Bayesian network is employed to bridge the gap between the high-level knowledge about the valid spatiotemporal configurations of the human body and the metadata extractor. The approach is applied here to sign-language videos, but it can be generalized to video indexing based on gestures.

Dimitrios I. Kosmopoulos
f-SWRL: A Fuzzy Extension of SWRL

In an attempt to extend existing knowledge representation systems to deal with the imperfect nature of real world information involved in several applications like multimedia analysis and understanding, the AI community has devoted considerable attention to the representation and management of uncertainty, imprecision and vague knowledge. Moreover, a lot of work has been carried out on the development of reasoning engines that can interpret imprecise knowledge. The need to deal with imperfect and imprecise information is likely to be common in the context of multimedia and the (Semantic) Web. In anticipation of such requirements, this paper presents a proposal for fuzzy extensions of SWRL, which is a rule extension to OWL DL.

Jeff Z. Pan, Giorgos Stamou, Vassilis Tzouvaras, Ian Horrocks
An Analytic Distance Metric for Gaussian Mixture Models with Application in Image Retrieval

In this paper we propose a new distance metric for probability density functions (PDF). The main advantage of this metric is that, unlike the popular Kullback-Leibler (KL) divergence, it can be computed in closed form when the PDFs are modeled as Gaussian Mixtures (GM). The application in mind for this metric is histogram-based image retrieval. We experimentally show that in an image retrieval scenario the proposed metric provides results as good as the KL divergence at a fraction of the computational cost. The metric is also compared to a Bhattacharyya-based distance metric that can be computed in closed form for GMs, and is found to produce better results.

G. Sfikas, C. Constantinopoulos, A. Likas, N. P. Galatsanos
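The abstract does not state which closed-form metric is used; one distance that is analytic for Gaussian mixtures (unlike KL, which has no closed form for mixtures) is the L2 distance, which only needs the identity that the integral of a product of two Gaussians is itself a Gaussian evaluated at the mean difference. A one-dimensional sketch, not necessarily the paper's metric:

```python
import math

def gauss_overlap(m1, v1, m2, v2):
    """Closed form for the integral of the product of two 1-D Gaussians:
    N(m1; m2, v1 + v2)."""
    v = v1 + v2
    return math.exp(-(m1 - m2) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)

def gmm_l2(w1, m1, v1, w2, m2, v2):
    """Squared L2 distance between two 1-D Gaussian mixtures,
    entirely in closed form (no sampling needed)."""
    def cross(wa, ma, va, wb, mb, vb):
        return sum(a * b * gauss_overlap(x, u, y, v)
                   for a, x, u in zip(wa, ma, va)
                   for b, y, v in zip(wb, mb, vb))
    return (cross(w1, m1, v1, w1, m1, v1)
            - 2 * cross(w1, m1, v1, w2, m2, v2)
            + cross(w2, m2, v2, w2, m2, v2))

# Identical mixtures -> distance 0; shifted mixture -> positive distance.
w, m, v = [0.5, 0.5], [0.0, 3.0], [1.0, 1.0]
print(gmm_l2(w, m, v, w, m, v))               # 0 (up to rounding)
print(gmm_l2(w, m, v, w, [1.0, 4.0], v) > 0)  # True
```

The multivariate case used for image histograms replaces the scalar variances with covariance matrices in the same overlap formula.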
Content-Based Retrieval of Web Pages and Other Hierarchical Objects with Self-organizing Maps

We propose a content-based information retrieval (CBIR) method that models known relationships between multimedia objects as a hierarchical tree-structure incorporating additional implicit semantic information. The objects are indexed based on their contents by mapping automatically extracted low-level features to a set of Self-Organized Maps (SOMs). The retrieval result is formed by estimating the relevance of each object by using the SOMs and relevance sharing in the hierarchical object structure. We demonstrate the usefulness of this approach with a small-scale experiment by using our PicSOM CBIR system.

Mats Sjöberg, Jorma Laaksonen
Fusing MPEG-7 Visual Descriptors for Image Classification

This paper proposes three content-based image classification techniques based on fusing various low-level MPEG-7 visual descriptors. Fusion is necessary, as the descriptors would otherwise be incompatible and inappropriate to include directly in, e.g., a Euclidean distance. Three approaches are described: a “merging” fusion combined with an SVM classifier, a back-propagation fusion combined with a KNN classifier, and a Fuzzy-ART neurofuzzy network. In the latter case, fuzzy rules can be extracted in an effort to bridge the “semantic gap” between the low-level descriptors and the high-level semantics of an image. All networks were evaluated using content from the repository of the aceMedia project, and more specifically on a beach/urban scene classification problem.

Evaggelos Spyrou, Hervé Le Borgne, Theofilos Mailis, Eddie Cooke, Yannis Avrithis, Noel O’Connor

Applications to Natural Language Processing

The Method of Inflection Errors Correction in Texts Composed in Polish Language – A Concept

The idea of verifying the inflectional correctness of sentences composed in the Polish language is presented in this paper. The idea and its realization are based on a formal model of modified link grammar, the grammatical rules of the Polish language, and a neural network as a tool for classifying individual words by gender, number, person, mood and case. Crucial to determining inflectional correctness is the third of the items mentioned. A proposal for the artificial neural classifier is presented. Finally, the application and expected results are discussed.

Tomasz Kapłon, Jacek Mazurkiewicz
Coexistence of Fuzzy and Crisp Concepts in Document Maps

SOM document-map based search engines require initial document clustering in order to present results in a meaningful way. This paper reports on our ongoing research into applications of Bayesian Networks for document map creation at various stages of document processing. Modifications to the original algorithms are proposed, based on our experience of the superiority of crisp edge points between classes/groups of documents.

Mieczysław A. Kłopotek, Sławomir T. Wierzchoń, Krzysztof Ciesielski, Michał Dramiński, Dariusz Czerski
Information Retrieval Based on a Neural-Network System with Multi-stable Neurons

Neurophysiological findings of graded persistent activity suggest that memory retrieval in the brain is described by dynamical systems with continuous attractors. It has recently been shown that robust graded persistent activity is generated in single cells. Multiple levels of stable activity at a single cell can be replicated by a model neuron with multiple hysteretic compartments. Here we propose a framework to simply calculate the dynamical behavior of a network of multi-stable neurons. We applied this framework to spreading activation for document retrieval. Our method shows higher retrieval performance than other spreading activation methods. The present study thus presents a novel and useful information-processing algorithm inferred from neuroscience.

Yukihiro Tsuboshita, Hiroshi Okamoto
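The multi-stable neuron dynamics are the paper's contribution and are not reproduced here; for orientation, the plain spreading-activation baseline it improves upon can be sketched on a toy term-document graph (the graph, decay factor, and step count are invented):

```python
def spread(adj, activation, decay=0.5, steps=3):
    """Iterative spreading activation: each step, every node passes a
    decayed, evenly divided share of its activation to its neighbours."""
    for _ in range(steps):
        new = dict(activation)
        for node, nbrs in adj.items():
            if not nbrs:
                continue
            share = decay * activation.get(node, 0.0) / len(nbrs)
            for nbr in nbrs:
                new[nbr] = new.get(nbr, 0.0) + share
        activation = new
    return activation

# Tiny term-document graph: the query term "neuron" activates doc1
# directly and doc2 indirectly through the related term "synapse".
adj = {
    "neuron": ["doc1", "synapse"],
    "synapse": ["doc2", "neuron"],
    "doc1": [],
    "doc2": [],
}
act = spread(adj, {"neuron": 1.0})
print(act["doc1"] > act["doc2"] > 0)  # True: direct link beats indirect
```

Documents are then ranked by their final activation; the paper replaces the simple accumulation above with hysteretic multi-stable units.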
Neural Coding Model of Associative Ontology with Up/Down State and Morphoelectrotonic Transform

We propose a new coding model for an associative ontology based on the results of word-association experiments with human subjects. A semantic network with semantic distances between words is constructed on a neural network, and associative relations are expressed using up and down states. The associated words change depending on the context, and the ambiguity of polysemous words and homonyms is resolved in a self-organizing manner using the up and down states. In addition, the relations of new words are computed depending on the context using morphoelectrotonic transform theory. On this basis, a simulation model of dynamic cell assembly on a neural network, with context dependence and word sense disambiguation, is constructed.

Norifumi Watanabe, Shun Ishizaki

Various Applications

Robust Structural Modeling and Outlier Detection with GMDH-Type Polynomial Neural Networks

The paper presents a new version of a GMDH-type algorithm able to perform automatic model structure synthesis, robust model parameter estimation, and model validation in the presence of outliers. The algorithm allows controlling the complexity – the number and maximal power of terms – in the models, and provides stable results and computational efficiency. The performance of this algorithm is demonstrated on artificial and real data sets. As an example, we present an application to the study of the association between clinical symptoms of Parkinson's disease and temporal patterns of neuronal activity recorded in the subthalamic nucleus of human patients.

Tatyana Aksenova, Vladimir Volkovich, Alessandro E. P. Villa
A New Probabilistic Neural Network for Fault Detection in MEMS

Micro Electro Mechanical Systems (MEMS) will soon usher in a new technological renaissance, from inertial sensors to microfluidic devices [1]. Over the last few years, considerable effort has gone into the study of the failure mechanisms and reliability of MEMS. Although still very incomplete, our knowledge of the reliability issues relevant to MEMS is growing. One of the major problems in MEMS production is fault detection. After fault diagnosis, hardware or software methods can be used to overcome the fault. Most MEMS have nonlinear and complex models, so it is difficult or impossible to detect faults by traditional methods, which are model-based. In this paper, we use the Robust Heteroscedastic Probabilistic Neural Network, a highly capable neural network, for fault detection. The Least Mean Square algorithm is used to readjust some weights in order to increase the fault detection capability.

Reza Asgary, Karim Mohammadi
Analog Fault Detection Using a Neuro Fuzzy Pattern Recognition Method

There are various methods for detecting digital faults in electronic and computer systems, but analog faults pose problems. This class of faults comprises many different parametric faults which cannot be detected by digital fault detection methods. One proposed approach to analog fault detection is neural networks. Fault detection is essentially a pattern recognition task: faulty and fault-free data are different patterns which must be recognized. In this paper we use a probabilistic neural network to recognize different faults (patterns) in analog systems. A fuzzy system is used to improve the performance of the network. Finally, the results of the different networks are compared.

Reza Asgary, Karim Mohammadi
Support Vector Machine for Recognition of Bio-products in Gasoline

The paper presents the application of a Support Vector Machine to the recognition and classification of bio-products in gasoline. We consider the supplementation with such bio-products as ethanol, MTBE, ETBE and benzene. The recognition system contains a measuring part in the form of a semiconductor sensor array responding with a signal pattern characteristic for each gasoline blend type. The SVM network, working in classification mode, processes these signals and associates them with an appropriate class. It will be shown that the proposed measurement system represents an excellent tool for the recognition of different types of gasoline blends. The results are compared with those of a multilayer perceptron.

Kazimierz Brudzewski, Stanisław Osowski, Tomasz Markiewicz, Jan Ulaczyk
Detecting Compounded Anomalous SNMP Situations Using Cooperative Unsupervised Pattern Recognition

This research employs unsupervised pattern recognition to approach the thorny issue of detecting anomalous network behavior. It applies a connectionist model to identify user behavior patterns and successfully demonstrates that such models respond well to the demands and dynamic features of the problem. It illustrates the effectiveness of neural networks in the field of Intrusion Detection (ID) by exploiting their strong points: recognition, classification and generalization. Its main novelty lies in its connectionist architecture, which up until the present has never been applied to Intrusion Detection Systems (IDS) and network security. The IDS presented in this research is used to analyse network traffic in order to detect anomalous SNMP (Simple Network Management Protocol) traffic patterns. The results also show that the system is capable of detecting independent and compounded anomalous SNMP situations. It is therefore of great assistance to network administrators in deciding whether such anomalous situations represent real intrusions.

Emilio Corchado, Álvaro Herrero, José Manuel Sáiz
Using Multilayer Perceptrons to Align High Range Resolution Radar Signals

In this paper we propose the use of Multilayer Perceptrons (MLPs) to align High Range Resolution (HRR) radar signals circularly shifted in time. To study the performance, the error of the shift estimate is measured for different values of Signal-to-Noise Ratio (SNR). The Zero Phase method is used for comparison purposes. Results show that the Zero Phase method performs best on completely misaligned patterns, while the MLP performs best at low degrees of misalignment. Using these results, a new method is proposed: first, the Zero Phase algorithm is used to pre-align the signals; then, an MLP is trained on the pre-aligned signals in order to obtain a more accurate estimate of the shift. Results show an improvement of up to 30%.

R. Gil-Pita, M. Rosa-Zurera, P. Jarabo-Amores, F. López-Ferreras
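Neither the Zero Phase method nor the MLP is reproduced here; as a point of reference, the circular-shift estimation task itself can be illustrated with the naive maximum-correlation baseline below (the toy "range profile" is invented):

```python
def estimate_shift(ref, sig):
    """Return the circular shift s maximizing the correlation between
    ref and sig rotated back by s (naive O(N^2) search over all shifts)."""
    n = len(ref)
    best_s, best_c = 0, float("-inf")
    for s in range(n):
        c = sum(ref[i] * sig[(i + s) % n] for i in range(n))
        if c > best_c:
            best_s, best_c = s, c
    return best_s

ref = [0, 1, 3, 7, 3, 1, 0, 0]       # toy range profile
shifted = ref[-3:] + ref[:-3]        # circularly shifted right by 3
print(estimate_shift(ref, shifted))  # 3
```

The interest of the paper's MLP lies in doing better than such correlation-style estimators once the signals are already roughly aligned and noisy.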
Approximating the Neyman-Pearson Detector for Swerling I Targets with Low Complexity Neural Networks

This paper deals with the application of neural networks to approximate the Neyman-Pearson detector. The detection of Swerling I targets in white Gaussian noise is considered. For this case, the optimum detector and the optimum decision boundaries are calculated. Results prove that the optimum detector is independent of TSNR, so, under good training conditions, neural network performance should also be independent of it. We have demonstrated that the minimum number of hidden units required for enclosing the optimum decision boundaries is three. This result allows us to evaluate the influence of the training algorithm. Results demonstrate that the LM algorithm is capable of finding excellent solutions for MLPs with only 4 hidden units, while the best BP results are obtained with 32 or more hidden units and are worse than those obtained with the LM algorithm and 4 hidden units.

D. de la Mata-Moya, P. Jarabo-Amores, M. Rosa-Zurera, F. López-Ferreras, R. Vicen-Bueno
Completing Hedge Fund Missing Net Asset Values Using Kohonen Maps and Constrained Randomization

Analysis of financial databases is sensitive to missing values (no reported information, provider errors, outlier filters...). Risk analysis and portfolio asset allocation require cylindrical and complete samples. Moreover, return distributions are characterised by non-normalities due to heteroskedasticity, leverage effects, volatility feedbacks and asymmetric local correlations. This makes completion algorithms very useful for portfolio management applications, specifically if they can deal properly with the empirical stylised facts of asset returns. Kohonen maps constitute powerful nonlinear financial classification tools (see [3], [4] or [6] for instance). Following the approach of Cottrell et al. (2003), we use a Kohonen algorithm (see [2]) together with the Constrained Randomization Method (see [8]) to deal with missing mutual fund Net Asset Values. The accuracy of the rebuilt NAV series is then evaluated by comparing the first moments of the series.

Paul Merlin, Bertrand Maillet
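The core idea of Kohonen-map imputation is that a missing entry can be filled from the best-matching codebook unit found using only the observed entries. A toy sketch (the data, map size, and training schedule are all illustrative; the paper additionally applies constrained randomization, which is not shown):

```python
import math
import random

random.seed(1)

def som_train(rows, n_units=5, epochs=60, lr0=0.5, radius0=2.0):
    """Tiny 1-D SOM trained on complete data rows."""
    dim = len(rows[0])
    units = [[random.random() for _ in range(dim)] for _ in range(n_units)]
    for e in range(epochs):
        lr = lr0 * (1 - e / epochs)
        radius = max(radius0 * (1 - e / epochs), 0.5)
        for x in rows:
            bmu = min(range(n_units),
                      key=lambda u: sum((a - b) ** 2 for a, b in zip(units[u], x)))
            for u in range(n_units):
                h = math.exp(-((u - bmu) ** 2) / (2 * radius ** 2))
                units[u] = [c + lr * h * (a - c) for c, a in zip(units[u], x)]
    return units

def impute(units, row):
    """Fill None entries with the matching BMU coordinates, where the BMU
    is found using only the observed entries of the row."""
    obs = [i for i, v in enumerate(row) if v is not None]
    bmu = min(units, key=lambda u: sum((u[i] - row[i]) ** 2 for i in obs))
    return [bmu[i] if v is None else v for i, v in enumerate(row)]

# Toy "NAV" rows where the two columns move together.
rows = [[v, v + 0.1] for v in [0.1, 0.2, 0.3, 0.6, 0.7, 0.8]]
units = som_train(rows)
filled = impute(units, [0.7, None])
print(abs(filled[1] - 0.8) < 0.3)  # imputed value lands near the truth
```

Because the two toy columns are correlated, the unit matching the observed column carries a plausible value for the missing one.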
Neural Architecture for Concurrent Map Building and Localization Using Adaptive Appearance Maps

This paper describes a novel omnivision-based Concurrent Map-building and Localization (CML) approach which is able to localize a mobile robot in complex and dynamic environments. The approach extends or improves known CML techniques in essential aspects. For example, a more flexible model of the environment is used to represent experienced observations. By applying an improved learning regime, observations which are no longer of importance for the localization task are actively forgotten in order to limit complexity. Furthermore, a generalized scheme for hypothesis fusion is presented that enables the integration of further multi-sensory position estimators.

St. Mueller, A. Koenig, H. -M. Gross
New Neural Network Based Mobile Location Estimation in a Metropolitan Area

This paper presents a new neural-network-based approach to the prediction of mobile locations using signal strength measurements in a simulated metropolitan area. Predicting a mobile location from propagation path loss (signal strength) is a very difficult and complex task. Several techniques have been proposed recently, mostly based on linearized, geometrical and maximum likelihood methods. An alternative approach based on artificial neural networks is proposed in this paper, which offers the advantages of increased flexibility to adapt to different environments and high-speed parallel processing. The paper first gives an overview of conventional location estimation techniques and the various propagation models reported to date; a new signal-strength-based neural network technique is then described. A simulated mobile architecture based on the COST-231 Non-Line-of-Sight (NLOS) Walfisch-Ikegami implementation of a metropolitan environment is used to assess the generalization performance of a Multi-Layered Perceptron (MLP) neural network based mobile location predictor, with promising initial results.

Javed Muhammad, Amir Hussain, Alexander Neskovic, Evan Magill
Lagrange Neural Network for Solving CSP Which Includes Linear Inequality Constraints

We propose a neural network called LPPH-CSP (Lagrange Programming neural network with Polarized High-order connections for the Constraint Satisfaction Problem) to solve the CSP. The CSP is the problem of finding a variable assignment which satisfies all given constraints. Because the CSP has a well-defined representational ability, it can represent many problems in AI compactly. From experimental results for LPPH-CSP and GENET, a well-known CSP solver, we confirmed that our method is as efficient as GENET. In addition, unlike other conventional CSP solvers, which are discrete-valued methods, our method is continuous-valued and can update all variables simultaneously; the conventional CSP solvers cannot find a solution by updating all variables simultaneously because of oscillation of the states. Therefore, we can expect a speed-up of LPPH-CSP if it is implemented in hardware such as an FPGA. In this paper, we extend LPPH-CSP to deal with linear inequality constraints. Using this type of constraint, we can represent various practical problems more concisely. We also define the CSP with an objective function and extend LPPH-CSP to solve this problem. In experiments, we apply our method and OPBDP to the warehouse location problem and compare their effectiveness.

Takahiro Nakano, Masahiro Nagamatu
Modelling Engineering Problems Using Dimensional Analysis for Feature Extraction

The performance of a method for reducing the input space dimensionality of a physical or engineering problem is analyzed. The results of its application to several engineering problems are compared with those obtained by other well-known methods for input space dimensionality reduction, such as Principal Component Analysis and Independent Component Analysis. To carry out this study, the features extracted by the three methods were used as inputs to a feedforward neural network. The advantages of the proposed method are that its computational complexity depends on the number of variables and that it guarantees dimensional homogeneity in the new space.

Noelia Sánchez-Maroño, Oscar Fontenla-Romero, Enrique Castillo, Amparo Alonso-Betanzos
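Dimensional analysis reduces the input variables to dimensionless groups whose products of powers cancel over the base dimensions. The abstract does not give the construction procedure; a minimal sketch of the homogeneity check it relies on, using an invented variable set over the base dimensions (M, L, T):

```python
# Each variable's dimensions as exponents over (M, L, T).
dims = {
    "force":    (1, 1, -2),   # kg * m / s^2
    "density":  (1, -3, 0),   # kg / m^3
    "velocity": (0, 1, -1),   # m / s
    "length":   (0, 1, 0),    # m
}

def is_dimensionless(group):
    """`group` maps variable name -> exponent; the product of powers is
    dimensionless iff the exponent-weighted sum of dimension vectors is zero."""
    total = [0, 0, 0]
    for var, p in group.items():
        for k in range(3):
            total[k] += p * dims[var][k]
    return all(t == 0 for t in total)

# Drag-coefficient-like group: force / (density * velocity^2 * length^2).
group = {"force": 1, "density": -1, "velocity": -2, "length": -2}
print(is_dimensionless(group))  # True
```

In the full method the valid exponent vectors form the null space of the dimension matrix, and each such group becomes one input feature for the network.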
Research on Electrotactile Representation Technology Based on Spatiotemporal Dual-Channel

An electrotactile representation technology based on a spatiotemporal dual channel is presented and discussed. Both the stimulus current on the temporal channel and the signal for tactile element selection on the spatial channel are provided by sound waves. The signals on the two channels are composed into a WAV-format file by a special wave editor; this file can be converted to a waveform by the sound card of a computer. When the output of the sound card is connected to the current stimulator, it provides the control signal for the tactile stimulus current on the temporal channel and the tactile element selection signal on the spatial channel. Analysis of the electrotactile representation model shows that the whole tactile perception can be divided into the base volume and the fluctuant amount. To obtain a comfortable electrotactile sensation, a limit on the fluctuant amount is needed.

Shuai Liguo, Kuang Yinghui, Xuemei Wang, Xu Yanfang
Application of Bayesian MLP Techniques to Predicting Mineralization Potential from Geoscientific Data

Conventional neural network training methods attempt to find a single set of values for the network weights by minimizing an error function using a gradient descent based technique. In contrast, the Bayesian approach estimates the posterior distribution of weights, and produces predictions by integrating over this distribution. A distinct advantage of the Bayesian approach is that the optimization of parameters such as weight decay regularization coefficients can be performed without use of a cross-validation procedure. In the context of mineral potential mapping, this leads to maps which display far less variability than maps produced using conventional MLP training techniques, the latter which are highly sensitive to factors such as initial weights and cross-validation partitioning.

Andrew Skabar
Solving Satisfiability Problem by Parallel Execution of Neural Networks with Biases

We have proposed a neural network named LPPH (Lagrange programming neural network with polarized high-order connections) for solving the SAT (SATisfiability problem of propositional calculus), together with parallel execution of LPPHs to increase efficiency. Experimental results demonstrate a high speedup ratio of this parallel execution. LPPH dynamics has an important parameter named attenuation coefficient which strongly affects LPPH execution speed. We have proposed a method in which LPPHs have different attenuation coefficients generated by a probabilistic generating function. Experimental results show the efficiency of this method. In this paper, to increase the diversity we propose a parallel execution in which LPPHs have mutually different kinds of biases, e.g., positive bias, negative bias, and centripetal bias. Experimental results show the efficiency of this method.

Kairong Zhang, Masahiro Nagamatu

Special Session: Computational Intelligence in Games Organizer and Chair: J. Mańdziuk

Local vs Global Models in Pong

We review two previous simulations in which opponent modelling was performed within the computer game of Pong. These results suggested that sums of local models were better than a single global model on this data set. We compare two supervised methods, the multilayered perceptron, which is global, and the radial basis function network, which is a sum of local models, on this data, and again find that the latter gives better performance. Finally we introduce a new topology-preserving network which can give very local or more global estimates of results and show that, while the local estimates are more accurate, they result in game play which is less human-like in behaviour.

Colin Fyfe
Evolution of Heuristics for Give-Away Checkers

The efficacy of two evolutionary approaches to the generation of heuristic linear and non-linear evaluation functions for the game of give-away checkers is tested in the paper. Experimental results show that both tested methods lead to heuristics of reasonable quality, and that evolutionary algorithms can be successfully applied to heuristic generation when not enough expert knowledge is available.

Magdalena Kusiak, Karol Walędzik, Jacek Mańdziuk
Nonlinear Relational Markov Networks with an Application to the Game of Go

It would be useful to have a joint probabilistic model for a general relational database. Objects in a database can be related to each other by indices, and they are described by a number of discrete and continuous attributes. Many models have been developed for relational discrete data, and for data with nonlinear dependencies between continuous values. This paper combines two of these methods, relational Markov networks and hierarchical nonlinear factor analysis, joining nonlinear models in a structure determined by the relations in the data. Experiments on collective regression in the board game Go suggest that regression accuracy can be improved by taking into account both relations and nonlinearities.

Tapani Raiko
Flexible Decision Process for Astronauts in Marsbase Simulator

We present a user-friendly software platform called Marsbase for the simulation of human activities on Mars. The interface allows the user to command every astronaut activity on Mars, typically exploring the region, analyzing rocks or building new facilities. In the AI mode, astronauts’ decisions and actions are determined by a competition between different basic processes. Human factors such as tiredness and perseverance have also been implemented.

Jean Marc Salotti

Issues in Hardware Implementation

Tolerance of Radial Basis Functions Against Stuck-At-Faults

Neural networks are intended to be used in future nanoelectronic systems since neural architectures seem to be robust against malfunctioning elements and noise in their weights. In this paper we analyze the fault-tolerance of Radial Basis Function networks to Stuck-At-Faults at the trained weights and at the output of neurons. Moreover, we determine upper bounds on the mean square error arising from these faults.

Ralf Eickhoff, Ulrich Rückert
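The fault model in the abstract above can be illustrated numerically. Below is a minimal sketch (not the authors' analysis) that injects a single stuck-at-0 fault into a trained output weight of a Gaussian RBF network and measures the resulting mean square error; all names and parameter values are illustrative assumptions.

```python
import numpy as np

def rbf_output(x, centers, widths, weights):
    """Gaussian RBF network: weighted sum of radial basis activations."""
    sq_dist = np.sum((x[:, None, :] - centers[None, :, :]) ** 2, axis=2)
    phi = np.exp(-sq_dist / (2.0 * widths ** 2))
    return phi @ weights

rng = np.random.default_rng(0)
centers = rng.uniform(-1.0, 1.0, size=(8, 1))  # 8 hidden units, 1-D input
widths = np.full(8, 0.5)
weights = rng.normal(size=8)                   # "trained" output weights

x = np.linspace(-1.0, 1.0, 100)[:, None]
y_clean = rbf_output(x, centers, widths, weights)

# Inject a single stuck-at-0 fault on one trained output weight
faulty = weights.copy()
faulty[3] = 0.0
y_faulty = rbf_output(x, centers, widths, faulty)

mse = np.mean((y_clean - y_faulty) ** 2)
print(f"MSE caused by one stuck-at-0 output weight: {mse:.4f}")
```

Because the output is linear in the weights, the error contributed by each stuck weight can be studied independently, which is what makes analytic upper bounds of the kind the paper derives tractable.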
The Role of Membrane Threshold and Rate in STDP Silicon Neuron Circuit Simulation

Spike-timing dependent synaptic plasticity (STDP) circuitry is designed in 0.35 μm CMOS VLSI. By setting different circuit parameters and generating diverse spike inputs, we obtained different steady-state weight distributions. By analysing these simulation results, we show the effect of membrane threshold and input rate on STDP adaptation.

Juan Huo, Alan Murray
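As a rough illustration of the adaptation the abstract describes, here is the standard pair-based exponential STDP rule in software form; the amplitudes and time constant are illustrative assumptions, not the circuit's actual parameters.

```python
import math

def stdp_dw(delta_t, a_plus=0.01, a_minus=0.012, tau=20.0):
    """Weight change for a spike pair with delta_t = t_post - t_pre (ms)."""
    if delta_t > 0:   # pre fires before post: potentiation
        return a_plus * math.exp(-delta_t / tau)
    if delta_t < 0:   # post fires before pre: depression
        return -a_minus * math.exp(delta_t / tau)
    return 0.0

# Causal pairings strengthen the synapse, anti-causal ones weaken it
print(stdp_dw(10.0), stdp_dw(-10.0))
```

Iterating this rule over spike trains of different rates is what drives the weight distributions toward the different steady states studied in the paper.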
Systolic Realization of Kohonen Neural Network

The paper focuses on a partially parallel realization of both the retrieving phase and the learning phase of Kohonen neural network algorithms. The proposed method is based on pipelined systolic arrays, an example of SIMD architecture. The discussion is organized around the operations that make up the successive steps of the learning and retrieving algorithms; the data transferred among the calculation units form the second criterion of the problem.

Jacek Mazurkiewicz
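The two phases that the systolic array pipelines can be sketched in software. The following is a minimal 1-D Kohonen step (retrieving: find the best matching unit; learning: Gaussian neighbourhood update), with an illustrative learning rate and neighbourhood width; it is not the paper's hardware mapping.

```python
import numpy as np

def som_step(weights, x, lr=0.1, sigma=1.0):
    """One retrieving + learning step of a 1-D Kohonen map.
    weights: (n_units, dim) codebook; x: (dim,) input vector."""
    # Retrieving phase: find the best matching unit (BMU)
    dists = np.linalg.norm(weights - x, axis=1)
    bmu = int(np.argmin(dists))
    # Learning phase: Gaussian neighbourhood pulls units toward the input
    idx = np.arange(weights.shape[0])
    h = np.exp(-((idx - bmu) ** 2) / (2.0 * sigma ** 2))
    return weights + lr * h[:, None] * (x - weights), bmu

rng = np.random.default_rng(1)
w0 = rng.random((10, 2))
x = np.array([0.5, 0.5])
w1, bmu = som_step(w0, x)
```

The distance computations and the per-unit updates are independent across units, which is exactly the parallelism a systolic SIMD array exploits.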
A Real-Time, FPGA Based, Biologically Plausible Neural Network Processor

A real-time, large-scale, leaky-integrate-and-fire neural network processor realized using an FPGA is presented. It has been designed, as part of a collaborative project, to investigate and implement biologically plausible models of the rodent vibrissae-based somatosensory system to control a robot. Emphasis has been placed on hard real-time performance of the processor, as it is to be used as part of a feedback control system; this has led to a revision of some of the established modelling protocols used in other hardware spiking neural network processors. The underlying neuron model can represent synaptic noise and inter-neural propagation delays to provide a greater degree of biological plausibility. The processor has been demonstrated modelling real neural circuitry in real time, independently of the underlying neural network activity.

Martin Pearson, Ian Gilhespy, Kevin Gurney, Chris Melhuish, Benjamin Mitchinson, Mokhtar Nibouche, Anthony Pipe
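A software sketch of the underlying neuron model may help: a leaky integrate-and-fire update with optional additive synaptic noise. The parameters and noise model here are illustrative assumptions; the processor's fixed-point arithmetic and propagation-delay handling are not reproduced.

```python
import numpy as np

def simulate_lif(input_current, dt=1.0, tau=20.0, v_rest=0.0,
                 v_thresh=1.0, v_reset=0.0, noise_std=0.0, seed=0):
    """Leaky integrate-and-fire neuron with optional additive synaptic noise.
    Returns the membrane trace and the spike times (in time steps)."""
    rng = np.random.default_rng(seed)
    v = v_rest
    trace, spikes = [], []
    for t, i_in in enumerate(input_current):
        # Leak toward rest, then add input and (optionally) synaptic noise
        v += (dt / tau) * (v_rest - v) + i_in + noise_std * rng.standard_normal()
        if v >= v_thresh:          # threshold crossing: emit spike, reset
            spikes.append(t)
            v = v_reset
        trace.append(v)
    return np.array(trace), spikes

# A constant suprathreshold drive produces regular spiking
trace, spikes = simulate_lif(np.full(200, 0.1))
```

In hard real time, each such update must complete within one fixed simulation time step regardless of how many neurons spike, which is the constraint that drove the design revisions the abstract mentions.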
Balancing Guidance Range and Strength Optimizes Self-organization by Silicon Growth Cones

We characterize the first hardware implementation of a self-organizing map algorithm based on axon migration. A population of silicon growth cones automatically wires a topographic mapping by migrating toward sources of a diffusible guidance signal that is released by postsynaptic activity. We varied the diffusion radius of this signal, trading strength for range. Best performance is achieved by balancing signal strength against signal range.

Brian Taba, Kwabena Boahen
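The strength-versus-range trade-off can be mimicked in a toy 1-D model (an illustrative sketch, not the chip's analog dynamics): a Gaussian guidance signal with fixed total release, so widening the diffusion radius weakens the peak, and a growth cone that follows the local signal gradient.

```python
import numpy as np

def guidance_signal(x, source, radius):
    """Diffusible guidance signal with fixed total release: a larger
    diffusion radius gives more range but a weaker peak."""
    return (1.0 / radius) * np.exp(-((x - source) ** 2) / (2.0 * radius ** 2))

def migrate(x, source, radius, step=0.1, n_iters=300, eps=1e-3):
    """A growth cone climbs the local signal gradient toward the source."""
    for _ in range(n_iters):
        grad = (guidance_signal(x + eps, source, radius)
                - guidance_signal(x - eps, source, radius)) / (2.0 * eps)
        if abs(grad) < 1e-12:   # signal effectively out of range: cone stalls
            break
        x += step * np.sign(grad)
    return x

# A wide, weak signal still reaches a distant cone; a strong, narrow one stalls
print(migrate(0.0, 5.0, radius=3.0), migrate(0.0, 5.0, radius=0.5))
```

Too small a radius leaves distant cones with no detectable gradient, while too large a radius flattens the signal everywhere; balancing the two is the optimum the paper characterizes.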

Erratum

Erratum
Editorial Board
Backmatter
Metadata
Title
Artificial Neural Networks: Formal Models and Their Applications – ICANN 2005
edited by
Włodzisław Duch
Janusz Kacprzyk
Erkki Oja
Sławomir Zadrożny
Copyright year
2005
Publisher
Springer Berlin Heidelberg
Electronic ISBN
978-3-540-28756-8
Print ISBN
978-3-540-28755-1
DOI
https://doi.org/10.1007/11550907