Active Learning and Dynamic Environments

Frontmatter

Deep Active Learning for Autonomous Navigation

Abstract

Imitation learning refers to an agent’s ability to mimic a desired behavior by learning from observations. A major challenge facing learning from demonstrations is to represent the demonstrations in a manner that is adequate for learning and efficient for real-time decisions. Creating feature representations is especially challenging when they are extracted from high-dimensional visual data. In this paper, we present a method for imitation learning from raw visual data. The proposed method is applied to a popular imitation learning domain that is relevant to a variety of real-life applications, namely navigation. To create a training set, a teacher uses an optimal policy to perform a navigation task, and the actions taken are recorded along with visual footage from the first-person perspective. Features are automatically extracted and used to learn a policy that mimics the teacher via a deep convolutional neural network. A trained agent can then predict an action to perform based on the scene it finds itself in. This method is generic, and the network is trained without knowledge of the task, targets or environment in which it is acting. Another common challenge in imitation learning is generalizing a policy to situations not seen in the training data. To address this challenge, the learned policy is subsequently improved by employing active learning. While the agent is executing a task, it can query the teacher for the correct action to take in situations where it has low confidence. The active samples are added to the training set and used to update the initial policy. The proposed approach is demonstrated on four different tasks in a 3D simulated environment. The experiments show that an agent can effectively perform imitation learning from raw visual data for navigation tasks and that active learning can significantly improve the initial policy using a small number of samples. The simulated testbed facilitates reproduction of these results and comparison with other approaches.

Ahmed Hussein, Mohamed Medhat Gaber, Eyad Elyan
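
The confidence-based querying described in the abstract above can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how confidence is measured, so the maximum softmax probability, the threshold of 0.6, and the names `softmax`, `should_query_teacher` and `act` are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Convert raw action scores into probabilities (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def should_query_teacher(scores, threshold=0.6):
    """Return True when the most likely action is below the threshold.

    Assumption: confidence is the maximum softmax probability, and 0.6 is
    an arbitrary illustrative threshold.
    """
    return max(softmax(scores)) < threshold


def act(scores, teacher_action, training_set, observation, threshold=0.6):
    """Choose an action; when unsure, take and record the teacher's action."""
    if should_query_teacher(scores, threshold):
        # Active sample: stored so the policy can be updated later.
        training_set.append((observation, teacher_action))
        return teacher_action
    probs = softmax(scores)
    return probs.index(max(probs))
```

In this sketch the teacher's answer is passed in directly; in a real system it would be requested only when `should_query_teacher` returns True, and the collected samples would be used to retrain the network.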

2D Recurrent Neural Networks for Robust Visual Tracking of Non-Rigid Bodies

Abstract

The efficient tracking of articulated bodies over time is an essential element of pattern recognition and dynamic scene analysis. This paper proposes a novel method for robust visual tracking, based on the combination of image-based prediction and weighted correlation. Starting from an initial guess, neural computation is applied to predict the position of the target in each video frame. Normalized cross-correlation is then applied to refine the predicted target position.

Image-based prediction relies on a novel architecture, derived from Elman’s Recurrent Neural Network and adopting nearest-neighbor connections between the input and context layers in order to store the temporal information content of the video. The proposed architecture, named 2D Recurrent Neural Network, ensures both limited complexity and a very fast learning stage. At the same time, it guarantees fast execution times and excellent accuracy for the considered tracking task. The effectiveness of the proposed approach is demonstrated on a very challenging set of dynamic image sequences, extracted from the triple jump final at the London 2012 Summer Olympics. The system shows remarkable performance in all considered cases, which are characterized by changing backgrounds and a large variety of articulated motions.

G. L. Masala, B. Golosio, M. Tistarelli, E. Grosso
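
The refinement step mentioned in the abstract above relies on normalized cross-correlation. The sketch below shows the standard zero-mean form on a 1-D signal, which is a simplifying assumption: the paper works on video frames and mentions a weighted correlation whose weights are not given in the abstract. The function names `ncc` and `best_offset` are hypothetical.

```python
import math


def ncc(patch, template):
    """Zero-mean normalized cross-correlation of two equal-length sequences."""
    if len(patch) != len(template) or not patch:
        raise ValueError("patch and template must be non-empty and equal in length")
    mean_p = sum(patch) / len(patch)
    mean_t = sum(template) / len(template)
    num = sum((p - mean_p) * (t - mean_t) for p, t in zip(patch, template))
    den = math.sqrt(
        sum((p - mean_p) ** 2 for p in patch)
        * sum((t - mean_t) ** 2 for t in template)
    )
    # A constant patch has no variance; treat it as uncorrelated.
    return num / den if den > 0 else 0.0


def best_offset(signal, template):
    """Slide the template over the signal and return the best-matching offset."""
    n = len(template)
    scores = [ncc(signal[i:i + n], template) for i in range(len(signal) - n + 1)]
    return max(range(len(scores)), key=scores.__getitem__)
```

Because the score is normalized, a patch that is a brightness-scaled copy of the template still scores close to 1, which is what makes the measure useful when lighting changes between frames.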

Choice of Best Samples for Building Ensembles in Dynamic Environments

Abstract

Machine learning approaches often focus on optimizing the algorithm rather than assuring that the source data is as rich as possible. However, when it is possible to enhance the input examples to construct models, one should consider it thoroughly. In this work, we propose a technique to define the best set of training examples using dynamic ensembles in text classification scenarios. In dynamic environments, where new data is constantly appearing, old data is usually disregarded, but sometimes some of those disregarded examples may carry substantial information. We propose a method that determines the most relevant examples by analysing their behaviour when defining separating planes or thresholds between classes. Those examples, deemed better than others, are kept for a longer time-window than the rest. Results on a Twitter scenario show that keeping those examples enhances the final classification performance.

Joana Costa, Catarina Silva, Mário Antunes, Bernardete Ribeiro

Semi-supervised Modeling

Frontmatter

Semi-supervised Hybrid Modeling of Atmospheric Pollution in Urban Centers

Abstract

Air pollution is directly linked with the development of technology and science, whose progress brings significant benefits to mankind but also has adverse effects on the environment and hence on human health. The problem has begun to take on worrying proportions, especially in large urban centers: 60,000 deaths are reported each year in Europe’s towns and 3,000,000 worldwide due to long-term air pollution exposure (according to the European Environment Agency, http://www.eea.europa.eu/). In this paper we propose a novel and flexible hybrid machine learning system that combines Semi-Supervised Classification and Semi-Supervised Clustering in order to predict air pollutant outliers and to study the conditions that favor their high concentration.

Ilias Bougoudis, Konstantinos Demertzis, Lazaros Iliadis, Vardis-Dimitris Anezakis, Antonios Papaleonidas

Classification Applications

Frontmatter

Predicting Abnormal Bank Stock Returns Using Textual Analysis of Annual Reports – a Neural Network Approach

Abstract

This paper aims to extract both sentiment and bag-of-words information from the annual reports of U.S. banks. The sentiment analysis is based on two commonly used finance-specific dictionaries, while the bag-of-words features are selected according to their tf-idf weights. We combine these features with financial indicators to predict abnormal bank stock returns using a neural network with dropout regularization and rectified linear units. We show that this method outperforms other machine learning algorithms (Naïve Bayes, Support Vector Machine, C4.5 decision tree, and k-nearest neighbour classifier) in predicting positive/negative abnormal stock returns. Thus, this neural network seems to be well suited for text classification tasks working with sparse high-dimensional data. We also show that the quality of the prediction increased significantly when financial indicators were combined with bigrams and with trigrams, respectively.

Petr Hájek, Jana Boháčová
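
The tf-idf based term selection mentioned in the abstract above can be sketched as follows. The exact weighting variant used by the authors is not stated, so this sketch assumes relative term frequency multiplied by the logarithm of the inverse document frequency; the function names `tf_idf` and `top_terms` and the toy documents in the usage note are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumption: tf is the relative frequency within the document and
    idf is log(N / df), one common textbook variant.
    """
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append(
            {t: (c / len(doc)) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return weights


def top_terms(docs, k):
    """Rank terms by their highest tf-idf weight in any document."""
    best = {}
    for doc_weights in tf_idf(docs):
        for term, weight in doc_weights.items():
            best[term] = max(best.get(term, 0.0), weight)
    # Ties are broken alphabetically so the result is deterministic.
    return sorted(best, key=lambda t: (-best[t], t))[:k]
```

For example, with the toy documents `[["loss", "risk", "bank"], ["profit", "bank"], ["bank", "growth", "growth"]]`, the term `bank` occurs in every document and so receives weight zero, while `growth` and `profit` rank highest.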

Emotion Recognition Using Facial Expression Images for a Robotic Companion

Abstract

Social robots are gradually becoming part of society. However, social robots lack the ability to adequately interact with users in a natural manner and are in need of more human-like abilities. In this paper we present experimental results on emotion recognition through the use of facial expression images obtained from the KDEF database, a fundamental first step towards the development of an empathic social robot. We compare the performance of Support Vector Machines (SVM) and a Multilayer Perceptron Network (MLP) on facial expression classification. We employ Gabor filters as an image pre-processing step before classification. Our SVM model achieves an accuracy rate of 97.08 %, whereas our MLP achieves 93.5 %. These experiments serve as a benchmark for our current research project in the area of social robotics.

Ariel Ruiz-Garcia, Mark Elshaw, Abdulrahman Altahhan, Vasile Palade

Application of Artificial Neural Networks for Analyses of EEG Record with Semi-Automated Etalons Extraction: A Pilot Study

Abstract

This paper presents the application of artificial neural network (ANN) classification, specifically a multilayer perceptron (MLP) with simulated annealing for initialization and a genetic algorithm for weight optimization, to multi-channel EEG records. The novelty of the approach lies in the semi-automated etalon extraction: the etalons are suggested by the k-means algorithm and verified or edited by an expert. The whole EEG record processing pipeline consists of multichannel adaptive segmentation, feature extraction from segments, semi-automatic etalon extraction by k-means cluster analysis leading to color identification of segments, manual choice of segments for etalons by the expert, and feature extraction from the chosen etalons. Subsequent classification by the ANN leads to unique color identification of segments in the EEG record and, additionally, in the temporal profile. Our goal is to help the physician with mimetic software, because the examination of long multichannel EEG records is tedious work.

Hana Schaabova, Vladimir Krajca, Vaclava Sedlmajerova, Olena Bukhtaieva, Lenka Lhotska, Jitka Mohylova, Svojmil Petranek

Clustering Applications

Frontmatter

Economies Clustering Using SOM-Based Dissimilarity

Abstract

Clustering of countries and economies has long been done by experts, by comparison with ideal theoretical entities. More recently, a data-driven approach has been taken, including so-called black box methods. In this paper, a SOM-based dissimilarity measure is presented and used for agglomerative hierarchical clustering of economies. It turns out that the results differ significantly from those obtained via a more traditional approach based on Euclidean distance.

Adam Chudziak

Elastic Net Application: Case Study to Find Solutions for the TSP in a Beowulf Cluster Architecture

Abstract

This study aims to apply the Durbin-Willshaw elastic net using parallel algorithms in order to solve the Traveling Salesman Problem (TSP) on a Beowulf cluster architecture for High-Performance Computing. The solutions to the TSP for different numbers of cities are obtained by minimizing the internal energy and maximizing the entropy in the information system. In this way, approximate solutions to the TSP can be determined. This work proposes a framework for implementing a parallel algorithm on the Beowulf cluster. In order to find solutions to the TSP, we worked with problem sizes ranging from 5000 cities with a net of 12500 nodes to 10000 cities with 25000 nodes.

Marcos Lévano, Andrea Albornoz

Comparison of Methods for Automated Feature Selection Using a Self-organising Map

Abstract

The effective modelling of high-dimensional data with hundreds to thousands of features remains a challenging task in the field of machine learning. One of the key challenges is the implementation of effective methods for selecting a set of relevant features, which are buried in high-dimensional data along with irrelevant noisy features, by choosing a subset of the complete set of input features that predicts the output with accuracy comparable to, or higher than, that of the complete input set. Kohonen’s Self-Organising Map (SOM) neural network has been utilized in various ways for this task. In this work, a review of the appropriate application of multiple methods for this task is carried out. A feature selection approach based on analysis of the Self-Organising network result after training is presented, along with a comparison of the performance of two methods.

Aliyu Usman Ahmad, Andrew Starkey

EEG-Based Condition Clustering using Self-Organising Neural Network Map

Abstract

Electroencephalography (EEG) has recently emerged as a useful neurophysiological biomarker for characterizing different physiological and pathological conditions in measurements of healthy and unhealthy brain activity. However, the complexity and high temporal resolution of EEG signal data have brought about the need for efficient and accurate automated methods for distinguishing mental task activities and recording conditions. Distinguishing mental tasks with high accuracy is pertinent for the early detection and clinical diagnosis of several neurodegenerative diseases. Expert clinicians are needed in order to distinguish between mental tasks and EEG recording conditions, which is a manual process that is prone to inefficiencies and errors, especially when the EEG data is mis-annotated at the recording stage. This paper proposes the application of a Self-organizing neural network Map (SOM) with Learning Vector Quantization (LVQ) for EEG Eyes Open (EO) and Eyes Closed (EC) condition classification. A classification accuracy of 88.5 % was achieved. The proposed approach shows good performance, and hence the method can be readily applied to other classification/clustering problems on brain measurements in the Brain Computer Interface (BCI) arena.

Hassan Hamdoun, Aliyu Ahmad Usman

Cyber-Physical Systems and Cloud Applications

Frontmatter

Intelligent Measurement in Unmanned Aerial Cyber Physical Systems for Traffic Surveillance

Abstract

An adaptive framework for building intelligent measurement systems is proposed in this paper and tested on simulated traffic surveillance data. The use of the framework enables intelligent decisions to be made about the presence of anomalies in the surveillance data with the help of statistical analysis, computational intelligence and machine learning. Computational intelligence can also be effectively utilised for identifying the main contributing features in detecting anomalous data points within the surveillance data. The experimental results demonstrate that reasonable performance is achieved in terms of inferential accuracy and data processing speed.

Andrei Petrovski, Prapa Rattadilok, Sergey Petrovskii

Predictive Model for Detecting MQ2 Gases Using Fuzzy Logic on IoT Devices

Abstract

This paper presents the design, implementation and analysis of a fuzzy system for monitoring and alert generation for gas detection in enclosed spaces, which can be very useful in either home or industrial environments. Furthermore, this could be a useful application in the field of Home Automation, which may be developed by integrating devices and technologies of the Internet of Things. The application relies on sensors that constantly receive signals about gases in the environment. Subsequently, the information is analyzed by a fuzzy system that determines when to generate alert notifications, identifying the times when levels are high, whether due to fire or high-pollution situations. The prototype connects an MQ-2 sensor to a Raspberry Pi, which receives the information provided and analyses it using fuzzy logic, thus determining in which cases it is necessary to raise an alarm for sensitive events, generating alert emails and historical data.

Catalina Hernández, Sergio Villagrán, Paulo Gaona

A Multi-commodity Network Flow Model for Cloud Service Environments

Abstract

Next-generation systems, such as the big data cloud, have to cope with several challenges, e.g., moving excessive amounts of data at a dictated speed, and thus require the investigation of concepts additional to security in order to ensure their orderly function. Resilience is one such concept: systems or networks that ensure it are able to provide and maintain an acceptable level of service in the face of various faults and challenges. In this paper, we investigate the multi-commodity flows problem as a task within our \(D^2R^2+DR\) resilience strategy and in the context of big data cloud systems. Specifically, proximal gradient optimization is proposed for determining optimal computation flows, since such algorithms are highly attractive for solving big data problems. Many such problems can be formulated as global consensus optimization problems and can be solved in a distributed manner by the alternating direction method of multipliers (ADMM) algorithm. Numerical evaluation of the proposed model is carried out in the context of specific deployments of a situation-aware information infrastructure.

Ioannis M. Stephanakis, Syed Noor-Ul-Hassan Shirazi, Antonios Gouglidis, David Hutchison

Designing a Context-Aware Cyber Physical System for Smart Conditional Monitoring of Platform Equipment

Abstract

An adaptive multi-tiered framework, which can be utilised for designing a context-aware cyber physical system is proposed and applied within the context of assuring offshore asset integrity. Adaptability is achieved through the combined use of machine learning and computational intelligence techniques. The proposed framework has the generality to be applied across a wide range of problem domains requiring processing, analysis and interpretation of data obtained from heterogeneous resources.

Farzan Majdani, Andrei Petrovski, Daniel Doolan

Time-Series Prediction

Frontmatter

Convolutional Radio Modulation Recognition Networks

Abstract

We study the adaptation of convolutional neural networks to the complex-valued temporal radio signal domain. We compare the efficacy of radio modulation classification using naively learned features against expert feature based methods, which are widely used today, and we show significant performance improvements. We show that blind temporal learning on large and densely encoded time series using deep convolutional neural networks is viable and a strong candidate approach for this task, especially at low signal-to-noise ratio.

Timothy J. O’Shea, Johnathan Corgan, T. Charles Clancy

Mutual Information with Parameter Determination Approach for Feature Selection in Multivariate Time Series Prediction

Abstract

For the modeling of multivariate time series, input variable selection is a key problem. Feature selection aims to select a relevant subset of variables in order to reduce the dimensionality of the problem without significant loss of information. This paper presents the estimation of mutual information and its application to the feature selection problem. Mutual information is one of the most common strategies borrowed from information theory for feature selection. However, the calculation of the probability density function (PDF) required by the definition of mutual information is difficult, especially for high-dimensional variables. A k-nearest neighbor (k-NN) based estimator is widely used to estimate the mutual information between two variables directly from the data set. Nevertheless, this estimator depends on a smoothing parameter, and there is no theoretical method for choosing this parameter. This paper aims to solve two problems: one is to employ resampling methods to help the mutual information estimator improve feature selection, and the other is to apply these methods to a wind power prediction problem.

Tianhong Liu, Haikun Wei, Chi Zhang, Kanjian Zhang

Learning-Algorithms

Frontmatter

On Learning Parameters of Incremental Learning in Chaotic Neural Network

Abstract

Incremental learning is a method for composing an associative memory using a chaotic neural network; it provides larger capacity than correlative learning at the cost of a large amount of computation. A chaotic neuron contains a spatio-temporal sum, and the temporal sum makes the learning stable with respect to input noise. When there is no noise in the input, the neuron may not need the temporal sum. In this paper, to reduce the computations, a simplified network without the temporal sum is introduced and investigated through computer simulations, in comparison with the previous network. Then, to shorten the learning steps, the learning parameters are changed during learning according to three functions.

Toshinori Deguchi, Naohiro Ishii

Accelerated Optimal Topology Search for Two-Hidden-Layer Feedforward Neural Networks

Abstract

Two-hidden-layer feedforward neural networks are investigated for the existence of an optimal hidden node ratio. In the experiments, the heuristic \( n_{1} = \mathrm{int}(0.5n_{h} + 1) \), where \( n_{1} \) is the number of nodes in the first hidden layer and \( n_{h} \) is the total number of hidden nodes, found networks with generalisation errors, on average, just 0.023 %–0.056 % greater than those found by exhaustive search. This reduced the complexity of an exhaustive search from quadratic to linear in \( n_{h} \), with very little penalty. Further reductions in search complexity to logarithmic could be possible using existing methods developed by the authors.

Alan J. Thomas, Simon D. Walters, Miltos Petridis, Saeed Malekshahi Gheytassi, Robert E. Morgan
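
The heuristic quoted in the abstract above is easy to work through numerically. The sketch below compares the number of candidate topologies it produces with an exhaustive enumeration of all splits. It assumes that `int` truncates toward zero, that the second layer receives the remaining \( n_{h} - n_{1} \) nodes, and that both layers need at least one node; the function names are hypothetical and the code does not train any network.

```python
def first_layer_nodes(n_h):
    """Heuristic from the abstract: n_1 = int(0.5 * n_h + 1)."""
    return int(0.5 * n_h + 1)


def heuristic_topologies(max_hidden):
    """One (n_1, n_2) candidate per total node count: linear in n_h."""
    candidates = []
    for n_h in range(2, max_hidden + 1):
        n_1 = first_layer_nodes(n_h)
        n_2 = n_h - n_1  # assumption: remaining nodes go to the second layer
        if n_2 >= 1:  # skip splits that leave the second layer empty
            candidates.append((n_1, n_2))
    return candidates


def exhaustive_topologies(max_hidden):
    """Every split of every total node count: quadratic in n_h."""
    return [
        (n_1, n_h - n_1)
        for n_h in range(2, max_hidden + 1)
        for n_1 in range(1, n_h)
    ]
```

With up to 10 hidden nodes, the exhaustive enumeration yields 45 topologies while the heuristic yields 8, for example (6, 4) when \( n_{h} = 10 \).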

An Outlier Ranking Tree Selection Approach to Extreme Pruning of Random Forests

Abstract

Random Forest (RF) is an ensemble classification technique that was developed by Breiman over a decade ago. Compared with other ensemble techniques, it has proved its accuracy and superiority. Many researchers, however, believe that there is still room for improving its predictive accuracy. This explains why, over the past decade, there have been many extensions of RF, each employing a variety of techniques and strategies to improve certain aspect(s) of RF. Since it has been proven empirically that ensembles tend to yield better results when there is significant diversity among the constituent models, the objective of this paper is twofold. First, it investigates how an unsupervised learning technique, namely Local Outlier Factor (LOF), can be used to identify diverse trees in the RF. Second, trees with the highest LOF scores are then used to create a new RF, termed LOFB-DRF, that is much smaller in size than RF and yet performs at least as well as RF, and in most cases exhibits higher accuracy. The latter refers to a known technique called ensemble pruning. Experimental results on 10 real datasets prove the superiority of our proposed method over the traditional RF. Unprecedented pruning levels reaching as high as 99 % have been achieved while at the same time boosting the predictive accuracy of the ensemble. The notably extreme pruning level makes the technique a good candidate for real-time applications.

Khaled Fawagreh, Mohamed Medhat Gaber, Eyad Elyan

Lower Bounds on Complexity of Shallow Perceptron Networks

Abstract

Model complexity of shallow (one-hidden-layer) perceptron networks computing multivariable functions on finite domains is investigated. Lower bounds are derived on growth of the number of network units or sizes of output weights in terms of variations of functions to be computed. A concrete construction of a class of functions which cannot be computed by perceptron networks with considerably smaller numbers of units and output weights than the sizes of the function’s domains is presented. In particular, functions on Boolean d-dimensional cubes are constructed which cannot be computed by shallow perceptron networks with numbers of hidden units and sizes of output weights depending on d polynomially. A subclass of these functions is described whose elements can be computed by two-hidden-layer networks with the number of units depending on d linearly.

Věra Kůrková

Kernel Networks for Function Approximation

Abstract

Capabilities of radial convolution kernel networks to approximate multivariate functions are investigated. A necessary condition for universal approximation property of convolution kernel networks is given. Kernels that satisfy the condition in arbitrary dimension are investigated in terms of their Hankel and Fourier transforms. A computational example is presented to assess approximation capabilities of different convolution kernel networks.

David Coufal

Short Papers

Frontmatter

Simple and Stable Internal Representation by Potential Mutual Information Maximization

Abstract

The present paper aims to interpret final representations obtained by neural networks by maximizing the mutual information between neurons and data sets. Because complex procedures are needed to maximize information, the computational procedures are simplified as much as possible using the present method. The simplification lies in realizing mutual information maximization indirectly by focusing on the potentiality of neurons. The method was applied to restaurant data for which the ordinary regression analysis could not show good performance. For this problem, we tried to interpret final representations and obtain improved generalization performance. The results revealed a simple configuration where just a single important feature was extracted to explicitly explain the motivation to visit the restaurant.

Ryotaro Kamimura

Urdu Speech Corpus and Preliminary Results on Speech Recognition

Abstract

Language resources for the Urdu language are not well developed. In this work, we summarize our work on the development of an Urdu speech corpus for isolated words. The corpus comprises 250 isolated words of Urdu recorded by ten individuals. The speakers include both native and non-native, male and female individuals. The corpus can be used for both speech and speaker recognition tasks. We also report our results on the automatic speech recognition task for this corpus. The framework extracts Mel Frequency Cepstral Coefficients along with the velocity and acceleration coefficients, which are then fed to different classifiers to perform the recognition task. The classifiers used are Support Vector Machines, Random Forest and Linear Discriminant Analysis. Experimental results show that the best results are provided by the Support Vector Machines, with a test set accuracy of 73 %. The results reported in this work may provide a useful baseline for future research on automatic speech recognition of Urdu.

Hazrat Ali, Nasir Ahmad, Abdul Hafeez

Bio-inspired Audio-Visual Speech Recognition Towards the Zero Instruction Set Computing

Abstract

The traditional approach to automatic speech recognition continues to push the limits of its implementation. The multimodal approach to audio-visual speech recognition and its neuromorphic computational modeling is a novel data-driven paradigm that will lead towards zero instruction set computing and will enable proactive capabilities in audio-visual recognition systems. An engineering-oriented deployment of the audio-visual processing framework is discussed in this paper, proposing a bimodal speech recognition framework to process speech utterances and lip reading data, applying soft computing paradigms according to a bio-inspired and holistic modeling of speech.

Mario Malcangi, Hao Quan

Tutorials

Frontmatter

Classification of Unbalanced Datasets and Detection of Rare Events in Industry: Issues and Solutions

Abstract

Classification of unbalanced datasets is a critical task that is attracting growing interest due to its relevance in many contexts, especially the industrial one, where machine faults and quality deviations belong to the class of rare events whose identification is fundamental. This work introduces and outlines the main themes related to this problem, including an analysis of the factors that make the detection of infrequent events complicated, a list of the metrics used for classifier assessment, and a review of the most popular and emerging approaches for handling class imbalance, with a special focus on the detection of rare events.

Marco Vannucci, Valentina Colla

Variable Selection for Efficient Design of Machine Learning-Based Models: Efficient Approaches for Industrial Applications

Abstract

In many real-world applications of neural networks and other machine learning approaches, large experimental datasets are available, containing a huge number of variables whose effect on the considered system or phenomenon is not completely known or not deeply understood. Variable selection procedures identify a small subset of the original feature space in order to point out the input variables that mainly affect the considered target. The identification of such variables leads to very important advantages, such as lower complexity of the model and of the learning algorithm, savings in computational time and improved performance. Moreover, variable selection procedures can help to acquire a deeper knowledge of the considered problem, system or phenomenon by identifying the factors that most affect it. This concept is strictly linked to the crucial aspect of the stability of the variable selection, defined as the sensitivity of a machine learning model with respect to variations in the dataset that is exploited in its training phase. In the present review, different categories of variable selection procedures are presented and discussed, in order to highlight the strengths and weaknesses of each method in relation to the different tasks and to the variables of the considered dataset.

Silvia Cateni, Valentina Colla
