
2017 | Book

Artificial Neural Networks and Machine Learning – ICANN 2017

26th International Conference on Artificial Neural Networks, Alghero, Italy, September 11-14, 2017, Proceedings, Part II

Edited by: Dr. Alessandra Lintas, Stefano Rovetta, Paul F.M.J. Verschure, Alessandro E.P. Villa

Publisher: Springer International Publishing

Book series: Lecture Notes in Computer Science


About this book

The two-volume set, LNCS 10613 and 10614, constitutes the proceedings of the 26th International Conference on Artificial Neural Networks, ICANN 2017, held in Alghero, Italy, in September 2017.

The 128 full papers included in this volume were carefully reviewed and selected from 270 submissions. They were organized in topical sections named: From Perception to Action; From Neurons to Networks; Brain Imaging; Recurrent Neural Networks; Neuromorphic Hardware; Brain Topology and Dynamics; Neural Networks Meet Natural and Environmental Sciences; Convolutional Neural Networks; Games and Strategy; Representation and Classification; Clustering; Learning from Data Streams and Time Series; Image Processing and Medical Applications; Advances in Machine Learning.
In addition, 63 short paper abstracts are included in the back matter of the volume.

Table of Contents

Frontmatter

Convolutional Neural Networks

Frontmatter
Spiking Convolutional Deep Belief Networks

Understanding visual input as perceived by humans is a challenging task for machines. Today, most successful methods work by learning features from static images. Based on classical artificial neural networks, those methods are not adapted to process event streams as provided by the Dynamic Vision Sensor (DVS). Recently, an unsupervised learning rule to train Spiking Restricted Boltzmann Machines has been presented [9]. Relying on synaptic plasticity, it can learn features directly from event streams. In this paper, we extend this method by adding convolutions, lateral inhibition and multiple layers. We evaluate our method on a self-recorded DVS dataset as well as the Poker-DVS dataset. Our results show that our convolutional method performs better and needs fewer parameters. It also achieves results comparable to previous event-based classification methods while learning features in an unsupervised fashion.

Jacques Kaiser, David Zimmerer, J. Camilo Vasquez Tieck, Stefan Ulbrich, Arne Roennau, Rüdiger Dillmann
Convolutional Neural Network for Pixel-Wise Skyline Detection

Outdoor augmented reality applications are an emerging class of software systems that demand the fast identification of natural objects, such as plant species or mountain peaks, on low-power mobile devices. Convolutional Neural Networks (CNN) have exhibited superior performance in a variety of computer vision tasks, but their training is a labor-intensive task and their execution requires non-negligible memory and CPU resources. This paper presents the results of training a CNN for the fast extraction of mountain skylines, which exhibits a good balance between accuracy (94.45% in the best conditions and 86.87% in the worst conditions), memory consumption (9.36 MB on average) and runtime execution overhead (273 ms on a Nexus 6 mobile phone), and thus has been exploited for implementing a real-world augmented reality application for mountain peak recognition running on low- to mid-end mobile phones.

Darian Frajberg, Piero Fraternali, Rocio Nahime Torres
1D-FALCON: Accelerating Deep Convolutional Neural Network Inference by Co-optimization of Models and Underlying Arithmetic Implementation

Deep convolutional neural networks (CNNs), which are at the heart of many new emerging applications, achieve remarkable performance in audio and visual recognition tasks, at the expense of high computational complexity, limiting their deployability. In modern CNNs it is typical for the convolution layers to consume the vast majority of the compute resources during inference. This has made the acceleration of these layers an important research and industrial goal. In this paper, we examine the effects of co-optimizing the internal structures of the convolutional layers and the underlying implementation of the fundamental convolution operation. We demonstrate that a combination of these methods can have a big impact on the overall speed-up of a CNN, achieving a tenfold increase over the baseline. We also introduce a new class of fast 1-D convolutions for CNNs using the Toom-Cook algorithm. We show that our proposed scheme is mathematically well grounded, robust, does not require any time-consuming retraining, and still achieves speed-ups solely from the convolutional layers with no loss in baseline accuracy.

Partha Maji, Robert Mullins
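For illustration, the Toom-Cook (Winograd) family of fast algorithms referenced above trades extra additions for fewer multiplications in short 1-D convolutions. Below is a minimal Python sketch of the classical F(2,3) variant, which produces two outputs of a 3-tap correlation with four multiplications instead of six; it illustrates the general idea only and is not the authors' exact scheme.

    import numpy as np

    def winograd_f23(d, g):
        """Toom-Cook/Winograd F(2,3): two outputs of a 3-tap correlation from a
        4-element input tile d and a 3-tap filter g, using 4 multiplications."""
        m1 = (d[0] - d[2]) * g[0]
        m2 = (d[1] + d[2]) * (g[0] + g[1] + g[2]) / 2
        m3 = (d[2] - d[1]) * (g[0] - g[1] + g[2]) / 2
        m4 = (d[1] - d[3]) * g[2]
        return np.array([m1 + m2 + m3, m2 - m3 - m4])

    d = np.array([1.0, 2.0, 3.0, 4.0])
    g = np.array([0.5, -1.0, 2.0])
    assert np.allclose(winograd_f23(d, g), np.correlate(d, g, mode="valid"))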
Shortcut Convolutional Neural Networks for Classification of Gender and Texture

Convolutional neural networks are globally trainable multi-stage architectures that automatically learn translation-invariant features from raw input images. Traditionally, however, they only allow connections between adjacent layers, limiting the integration of multi-scale information. To further improve their classification performance, we present a new architecture called shortcut convolutional neural networks. This architecture concatenates multi-scale feature maps via shortcut connections to form the fully-connected layer that is directly fed to the output layer. We investigate the proposed shortcut convolutional neural networks on gender classification and texture classification. Experimental results show that shortcut convolutional neural networks perform better than those without shortcut connections and are more robust to different settings of pooling schemes, activation functions, initializations, and optimizations.

Ting Zhang, Yujian Li, Zhaoying Liu
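A hedged sketch of the shortcut idea described above, assuming (purely for illustration, not as the authors' exact network) that globally pooled feature maps from each convolutional stage are concatenated before the classifier:

    import torch
    import torch.nn as nn

    class ShortcutCNN(nn.Module):
        """Illustrative sketch: pooled feature maps from every stage are
        concatenated via shortcut connections and fed to the classifier."""
        def __init__(self, num_classes=2):
            super().__init__()
            self.stage1 = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
            self.stage2 = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
            self.gap = nn.AdaptiveAvgPool2d(1)          # collapse each feature map to one value
            self.fc = nn.Linear(16 + 32, num_classes)   # fed by multi-scale features

        def forward(self, x):
            f1 = self.stage1(x)                         # low-level features
            f2 = self.stage2(f1)                        # higher-level features
            multi_scale = torch.cat([self.gap(f1).flatten(1),
                                     self.gap(f2).flatten(1)], dim=1)
            return self.fc(multi_scale)

    logits = ShortcutCNN()(torch.randn(4, 1, 64, 64))   # -> shape (4, 2)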
Word Embedding Dropout and Variable-Length Convolution Window in Convolutional Neural Network for Sentiment Classification

Research on sentiment analysis has recently been attracting growing attention because of the popularity of opinion-rich resources, such as internet movie databases and e-commerce websites. Convolutional neural networks (CNN) have been widely used in sentiment analysis to classify the polarity of reviews. For deep convolutional neural networks, dropout is known to work well in the fully-connected layer. In this paper, we use the dropout technique in the word embedding layer and prove that it is equivalent to randomly picking activations based on a multinomial distribution at training time. Empirical results also support this and show that using dropout in the word embedding layer can reduce over-fitting. Meanwhile, we investigate the effect of the convolution window size on the classification results, and use variable-length convolution windows in the proposed method. Experimental results show that our method obtains state-of-the-art performance on ASR. Compared with other similar architectures, the accuracies of our method are also competitive on IMDB and Subj.

Shangdi Sun, Xiaodong Gu
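A minimal sketch of dropout applied at the embedding layer, as discussed above; the inverted-scaling variant shown here is an assumption made for illustration, not necessarily the authors' exact formulation:

    import numpy as np

    def embedding_dropout(embeddings, word_ids, p=0.5, train=True):
        """Look up word vectors and apply dropout directly to the embedding
        layer, with inverted scaling so no rescaling is needed at test time."""
        looked_up = embeddings[word_ids]                          # (seq_len, dim)
        if not train:
            return looked_up
        mask = np.random.binomial(1, 1.0 - p, size=looked_up.shape)
        return looked_up * mask / (1.0 - p)

    E = np.random.normal(size=(1000, 50))     # toy vocabulary: 1000 words, 50-d vectors
    sentence = np.array([3, 17, 256, 4])
    dropped = embedding_dropout(E, sentence, p=0.5)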
Reducing Overfitting in Deep Convolutional Neural Networks Using Redundancy Regularizer

Recently, deep convolutional neural networks (CNNs) have achieved excellent performance in many modern applications. These high-performance models normally come with deep architectures and a huge number of convolutional kernels. Such deep architectures may cause overfitting, especially when applied to small training datasets. We observe a potential reason: there exists (linear) redundancy among these kernels. To mitigate this problem, we propose a novel regularizer to reduce kernel redundancy in a deep CNN model and prevent overfitting. We apply the proposed regularizer on various datasets and network architectures and compare it to the traditional L2 regularizer. We also compare our method with some widely used methods for preventing overfitting, such as dropout and early stopping. Experimental results demonstrate that kernel redundancy is significantly removed and overfitting is substantially reduced, with even better performance achieved.

Bingzhe Wu, Zhichao Liu, Zhihang Yuan, Guangyu Sun, Charles Wu
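The abstract does not give the regularizer's exact form; one plausible redundancy penalty, assumed here purely for illustration, penalizes pairwise cosine similarity between the flattened kernels of a convolutional layer:

    import torch

    def kernel_redundancy_penalty(weight):
        """Hypothetical redundancy penalty (not necessarily the authors' formulation):
        sum of squared off-diagonal cosine similarities between flattened kernels."""
        k = weight.flatten(1)                               # (out_channels, in_ch*kh*kw)
        k = torch.nn.functional.normalize(k, dim=1)
        gram = k @ k.t()                                    # pairwise cosine similarities
        off_diag = gram - torch.diag(torch.diag(gram))
        return (off_diag ** 2).sum()

    conv = torch.nn.Conv2d(3, 64, kernel_size=3)
    penalty = kernel_redundancy_penalty(conv.weight)        # add lambda * penalty to the task loss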
An Improved Convolutional Neural Network for Sentence Classification Based on Term Frequency and Segmentation

Sentence classification is a ubiquitous Natural Language Processing (NLP) task, and deep learning has proved to be highly effective in this area. In this work, we propose an improved Convolutional Neural Network (CNN) for sentence classification, in which a word-representation model is introduced to capture semantic features by encoding term frequency and segmenting sentences into proposals. The experimental results show that our methods outperform the state-of-the-art methods.

Qi Wang, Jungang Xu, Ben He, Zhengcai Qin
Parallel Implementation of a Bug Report Assignment Recommender Using Deep Learning

For large software projects which receive many reports daily, assigning the most appropriate developer to fix a bug from a large pool of potential developers is both technically difficult and time-consuming. We introduce a parallel, highly scalable recommender system for bug report assignment. From a machine learning perspective, the core of such a system consists of a multi-class classification process using characteristics of a bug, like textual information and other categorical attributes, as features and the most appropriate developer as the predicted class. We alternatively use two Deep Learning classifiers: Convolutional and Recurrent Neural Networks. The implementation is realized on an Apache Spark engine, running on IBM Power8 servers. The experiments use real-world data from the Netbeans, Eclipse and Mozilla projects.

Adrian-Cătălin Florea, John Anvik, Răzvan Andonie
A Deep Learning Approach to Detect Distracted Drivers Using a Mobile Phone

Detecting distracted drivers is essential for maintaining road safety and avoiding the risk of accidents and deaths. Studies by the World Health Organization show that the distraction caused by mobile phones can increase the crash risk by up to 400%. This paper proposes a convolutional neural network that is able to monitor drivers through video surveillance, more specifically to detect and classify when the driver is using a cell phone. The experiments show an impressive accuracy of up to 99% in detecting distracted drivers.

Renato Torres, Orlando Ohashi, Eduardo Carvalho, Gustavo Pessin
A Multi-level Weighted Representation for Person Re-identification

The introduction of deep neural networks (DNN) into person re-identification tasks has significantly improved re-identification accuracy. However, the substantial characteristics of features extracted from different layers of convolutional neural networks (CNN) are infrequently considered in existing methods. In this paper, we propose a multi-level weighted representation for person re-identification, in which features containing strong discriminative power or rich semantic meaning are extracted from different layers of a deep CNN, and an estimation subnet evaluates the quality of each feature and generates quality scores used as concatenation weights for all multi-level features. The features, multiplied by their weights, are concatenated into the final representation, which is further improved by a triplet loss to increase the inter-class distance. The representation therefore jointly exploits the benefits of the different level features. Experiments on the iLIDS-VID and PRID 2011 datasets show that our proposed representation significantly outperforms the baseline and state-of-the-art methods.

Xianglai Meng, Biao Leng, Guanglu Song

Games and Strategy

Frontmatter
DeepAPT: Nation-State APT Attribution Using End-to-End Deep Neural Networks

In recent years, numerous advanced malware, a.k.a. advanced persistent threats (APTs), have allegedly been developed by nation-states. The task of attributing an APT to a specific nation-state is extremely challenging for several reasons. Each nation-state usually has more than a single cyber unit that develops such advanced malware, rendering traditional authorship attribution algorithms useless. Furthermore, those APTs use state-of-the-art evasion techniques, making feature extraction challenging. Finally, the dataset of such available APTs is extremely small. In this paper we describe how deep neural networks (DNN) can be successfully employed for nation-state APT attribution. We use sandbox reports (recording the behavior of the APT when run dynamically) as raw input for the neural network, allowing the DNN to learn high-level feature abstractions of the APTs themselves. Using a test set of 1,000 Chinese and Russian developed APTs, we achieved an accuracy rate of 94.6%.

Ishai Rosenberg, Guillaume Sicard, Eli (Omid) David
Estimation of the Change of Agents Behavior Strategy Using State-Action History

Reinforcement learning (RL) provides a computational model of an animal's autonomous acquisition of behaviors even in an uncertain environment. Inverse reinforcement learning (IRL) is its opposite; given a history of behaviors of an agent, IRL attempts to determine the unknown characteristics of the agent, such as its reward function. Conventional IRL methods usually assume the agent has taken a stationary policy that is optimal in the environment. However, real RL agents do not necessarily follow a stationary policy, because they are often in the process of adapting to their own environments. Especially when facing an uncertain environment, an intelligent agent should take a mixed (or switching) strategy consisting of exploitation that is best in the current situation and exploration to resolve the environmental uncertainty. In this study, we propose a new IRL method that can identify both a non-stationary policy and a fixed but unknown reward function, based on the behavioral history of a learning agent; in particular, we estimate a change point of the behavior policy from an exploratory one in the agent's early stage of learning to an exploitative one in its later learning stage. When applied to computer simulations of an agent performing a simple maze task, our method could identify the change point of the behavior policy and the fixed reward function only from the agent's history of behaviors.

Shihori Uchida, Sigeyuki Oba, Shin Ishii

Boltzmann Machines and Phase Transitions

Frontmatter
Generalising the Discriminative Restricted Boltzmann Machines

We present a novel theoretical result that generalises the Discriminative Restricted Boltzmann Machine (DRBM). While originally the DRBM was defined assuming the $$\{0, 1\}$$-Bernoulli distribution in each of its hidden units, this result makes it possible to derive cost functions for variants of the DRBM that utilise other distributions, including some that are often encountered in the literature. This paper shows that this function can be extended to the Binomial and $$\{-1,+1\}$$-Bernoulli hidden units.

Srikanth Cherla, Son N. Tran, Artur d’Avila Garcez, Tillman Weyde
Extracting M of N Rules from Restricted Boltzmann Machines

Rule extraction is an important method seeking to understand how neural networks are able to solve problems. In order for rule extraction to be comprehensible, good knowledge representations should be used. So-called M of N rules are a compact way of representing knowledge that has a strong intuitive connection to the structure of neural networks. M of N rules have been used in the past in the context of supervised models but not unsupervised models. Here we present a novel extension of a previous rule extraction algorithm for RBMs that allows us to quickly extract accurate M of N rules. The results are compared on simple datasets, showing that M of N extraction has the potential to be an effective method for the knowledge representation of RBMs.

Simon Odense, Artur d’Avila Garcez
Generalized Entropy Cost Function in Neural Networks

Artificial neural networks are capable of constructing complex decision boundaries, and over recent years they have been widely used in many practical applications ranging from business to medical diagnosis and technical problems. A large number of error functions have been proposed in the literature to achieve better predictive power. However, only a few works employ Tsallis statistics, which has successfully been applied in other fields. This paper undertakes the effort to examine the $$ q $$-generalized function based on Tsallis statistics as an alternative error measure in neural networks. The results indicate that the Tsallis entropy error function can be successfully applied in neural networks, yielding satisfactory results.

Krzysztof Gajowniczek, Leszek J. Chmielewski, Arkadiusz Orłowski, Tomasz Ząbkowski
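For reference, the Tsallis ($$ q $$-generalized) entropy underlying such error measures is $$ S_q(p) = \frac{1}{q-1}\left(1 - \sum_i p_i^{\,q}\right) $$, which recovers the Shannon entropy in the limit $$ q \rightarrow 1 $$; the exact cost function examined in the paper may differ in form.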
Learning from Noisy Label Distributions

In this paper, we consider a novel machine learning problem, that is, learning a classifier from noisy label distributions. In this problem, each instance with a feature vector belongs to at least one group. Then, instead of the true label of each instance, we observe the label distribution of the instances associated with a group, where the label distribution is distorted by an unknown noise. Our goals are to (1) estimate the true label of each instance, and (2) learn a classifier that predicts the true label of a new instance. We propose a probabilistic model that considers true label distributions of groups and parameters that represent the noise as hidden variables. The model can be learned based on a variational Bayesian method. In numerical experiments, we show that the proposed model outperforms existing methods in terms of the estimation of the true labels of instances.

Yuya Yoshikawa
Phase Transition Structure of Variational Bayesian Nonnegative Matrix Factorization

In this paper, we theoretically clarify the phase transition structure of the variational Bayesian nonnegative matrix factorization (VBNMF). By asymptotic analysis of the objective functional in variational inference, we find that the variational posterior distribution of the VBNMF is drastically changed by hyperparameters; we call this phenomenon phase transition of the VBNMF. We also discuss a numerical experiment demonstrating our theoretical results.

Masahiro Kohjima, Sumio Watanabe
Link Enrichment for Diffusion-Based Graph Node Kernels

The notion of node similarity is key in many graph processing techniques and it is especially important in diffusion graph kernels. However, when the graph structure is affected by noise in the form of missing links, similarities are distorted proportionally to the sparsity of the graph and to the fraction of missing links. Here, we introduce the notion of link enrichment, that is, performing link prediction in order to improve the performance of diffusion-based kernels. We empirically show a robust and large effect for the combination of a number of link prediction techniques and a number of diffusion kernel techniques on several gene-disease association problems.

Dinh Tran-Van, Alessandro Sperduti, Fabrizio Costa
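A typical diffusion-based graph node kernel of the kind referred to above is the Laplacian exponential diffusion kernel $$ K = e^{-\beta L} = \sum_{k=0}^{\infty} \frac{(-\beta)^k}{k!} L^k $$, where $$ L $$ is the graph Laplacian and $$ \beta > 0 $$ controls the extent of diffusion; missing links distort $$ L $$ and hence the similarities, which is what motivates the link enrichment step.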

Context Information Learning and Self-assessment in Advanced Machine Learning Models

Frontmatter
Classless Association Using Neural Networks

The goal of this paper is to train a model based on the relation between two instances that represent the same unknown class. This task is inspired by the Symbol Grounding Problem and the association learning between modalities in infants. We propose a novel model called Classless Association that has two parallel Multilayer Perceptrons (MLPs) with an EM-training rule. Moreover, because the data are unlabeled, training relies on matching the output vectors of the MLPs against a statistical distribution as an alternative loss function. In addition, the output classification of one network is used as the target of the other network, and vice versa, for learning the agreement between both unlabeled samples. We generate four classless datasets based on MNIST, where the input is two different instances of the same digit. Furthermore, our classless association model is evaluated against two scenarios: totally supervised and totally unsupervised. In the first scenario, our model reaches good performance in terms of accuracy and the classless constraint. In the second scenario, our model achieves better results than two clustering algorithms.

Federico Raue, Sebastian Palacio, Andreas Dengel, Marcus Liwicki
Shape from Shading by Model Inclusive Learning Method with Simultaneous Estimation of Parameters

The problem of recovering shape from shading is important in computer vision and robotics. It is essentially an ill-posed problem and several studies have been done. In this paper, we present a versatile method of solving the problem by neural networks. The proposed method introduces the concept of model inclusive learning with simultaneous estimation of unknown parameters. In the method, a mathematical model, which we call an 'image-formation model', expressing the process by which the image is formed from an object surface, is introduced and included in the learning loop of a neural network. The neural network is trained so as to recover the shape while simultaneously estimating the unknown parameters of the image-formation model. The performance of the proposed method is demonstrated through experiments.

Yasuaki Kuroe, Hajimu Kawakami
Radius-Margin Ratio Optimization for Dot-Product Boolean Kernel Learning

It is known that any dot-product kernel can be seen as a linear non-negative combination of homogeneous polynomial kernels. In this paper, we demonstrate that, under mild conditions, any dot-product kernel defined on binary valued data can be seen as a linear non-negative combination of boolean kernels, specifically, monotone conjunctive kernels (mC-kernels) with different degrees. We also propose a new radius-margin based multiple kernel learning (MKL) algorithm to learn the parameters of the combination. An empirical analysis of the MKL weights distribution shows that our method is able to give solutions which are sparser and more effective than those of state-of-the-art margin-based MKL methods. The empirical analysis has been performed on eleven UCI categorical datasets.

Ivano Lauriola, Mirko Polato, Fabio Aiolli
Learning a Compositional Hierarchy of Disparity Descriptors for 3D Orientation Estimation in an Active Fixation Setting

Interaction with everyday objects requires the active visual system to perform a fast and invariant reconstruction of their local shape layout, through a series of fast binocular fixation movements that change the gaze direction on the 3-dimensional surface of the object. Active binocular viewing results in complex disparity fields that, although informative about the orientation in depth (e.g., the slant and tilt), highly depend on the relative position of the eyes. By learning the statistical relationships between the differential properties of the disparity vector fields and the gaze directions, we expect to obtain more convenient, gaze-invariant visual descriptors. In this work, local approximations of disparity vector field differentials are combined in a hierarchical neural network that is trained to represent the slant and tilt from the disparity vector fields. Each gaze-related cell's activation in the intermediate representation is recurrently merged with the other cells' activations to gain the desired gaze-invariant selectivity. Although the representation has been tested on a limited set of combinations of slant and tilt, the resulting high classification rate validates the generalization capability of the approach.

Katerina Kalou, Agostino Gibaldi, Andrea Canessa, Silvio P. Sabatini
A Priori Reliability Prediction with Meta-Learning Based on Context Information

Machine learning systems are used in a wide variety of tasks where reliability is very important. Their reliability often cannot be deduced directly from their output. We propose an approach to predict the reliability of a machine learning system externally. We tackle this by using an additional machine learning component we call a meta-learner. This meta-learner can use the original input as well as supplementary context information for its judgment. With this approach, the meta-learner can predict the performance of the machine learner before the latter is actually executed. Based on this prediction, unreliable decisions can be rejected and the system's reliability is retained. We show that our method outperforms certainty-based approaches on the example of road terrain detection.

Jennifer Kreger, Lydia Fischer, Stephan Hasler, Thomas H. Weisswange, Ute Bauer-Wersing
Attention Aware Semi-supervised Framework for Sentiment Analysis

Using sentiment analysis methods to retrieve useful information from the documents accumulated on the Internet has become an important research subject. In this paper, we propose a semi-supervised framework which uses unlabeled data to promote the learning ability of the long short-term memory (LSTM) network. It is composed of an unsupervised attention-aware LSTM encoder-decoder and a single LSTM model used for feature extraction and classification. An experimental study on commonly used datasets has demonstrated our framework's good potential for sentiment classification tasks. It also shows that the unsupervised learning part can improve the LSTM network's learning ability.

Jingshuang Liu, Wenge Rong, Chuan Tian, Min Gao, Zhang Xiong
Chinese Lexical Normalization Based on Information Extraction: An Experimental Study

In this work, we describe a novel method for normalizing Chinese informal words to their standard equivalents. We formulate the task as an information extraction problem, using Q&A community answers as the source corpus. We propose several LSTM-based models for the extraction task. To evaluate and compare the performance of the proposed models, we developed a standard dataset containing factoids generated by real-world users in daily life. Since our method does not use any linguistic features, it is also applicable to other languages.

Tian Tian, WeiRan Xu
Analysing Event Transitions to Discover Student Roles and Predict Grades in MOOCs

When interacting with a MOOC, students can perform different kinds of actions such as watching videos, answering exercises, participating in the course forum, submitting a project or reviewing a document. These actions represent the dynamism of student learning paths, and their preferences when learning in an autonomous mode. In this paper we propose to analyse these learning paths with two goals in mind. The first one is to try to discover the different roles that students may adopt when interacting with an online course. By applying k-means, six of these roles are discovered and we give a qualitative interpretation of them based on the student information associated with each cluster. The other goal is to predict academic performance. In this sense, we present the results obtained with Random Forest and Neural Networks, which allow us to predict the final grade with a mean absolute error of around 10%.

Ángel Pérez-Lemonche, Gonzalo Martínez-Muñoz, Estrella Pulido-Cañabate
Applying Artificial Neural Networks on Two-Layer Semantic Trajectories for Predicting the Next Semantic Location

Location-awareness and prediction play a steadily increasing role as systems and services become more intelligent. At the same time, semantics gain in importance in geolocation applications. In this work, we investigate the use of artificial neural networks (ANNs) in the field of semantic location prediction. We evaluate three different ANN types: FFNN, RNN and LSTM, on two different data sets, each on two different semantic levels. In addition, we compare each of them to a Markov model predictor. We show that neural networks perform well overall, with LSTM achieving the highest average score of 76.1%.

Antonios Karatzoglou, Harun Sentürk, Adrian Jablonski, Michael Beigl
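A first-order Markov predictor of the kind used as a baseline above can be sketched as follows (illustrative only, not the authors' implementation):

    from collections import Counter, defaultdict

    def train_markov(location_sequences):
        """Count first-order transitions between semantic locations."""
        transitions = defaultdict(Counter)
        for seq in location_sequences:
            for current, nxt in zip(seq, seq[1:]):
                transitions[current][nxt] += 1
        return transitions

    def predict_next(transitions, current):
        """Predict the most frequent successor of the current location."""
        if current not in transitions:
            return None
        return transitions[current].most_common(1)[0][0]

    model = train_markov([["home", "work", "gym", "home"],
                          ["home", "work", "gym", "restaurant"]])
    print(predict_next(model, "work"))   # -> "gym"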
Model-Aware Representation Learning for Categorical Data with Hierarchical Couplings

Learning an appropriate representation for categorical data is a critical yet challenging task. Current research makes efforts to embed categorical data into vector or dis/similarity spaces; however, it either ignores the complex interactions within the data or overlooks the relationship between the representation and the learning model it feeds. In this paper, we propose a model-aware representation learning framework for categorical data with hierarchical couplings, which simultaneously reveals the couplings from value to object and optimizes the fitness of the represented data for the follow-up learning model. An SVM-aware representation learning method has been instantiated for this framework. Extensive experiments on ten UCI categorical datasets with diverse characteristics demonstrate that the representation produced by our proposed method can significantly improve learning performance (by up to 18.64%) compared with three other competitors.

Jianglong Song, Chengzhang Zhu, Wentao Zhao, Wenjie Liu, Qiang Liu
Perceptron-Based Ensembles and Binary Decision Trees for Malware Detection

Nowadays, security researchers witness an exponential growth of the number of malware variants in the wild. On top of this, various advanced techniques like metamorphism, server-side polymorphism, anti-emulation, commercial or custom packing, and so on, are being used in order to evade detection. It is clear that standard detection techniques no longer cope with the ongoing anti-malware fight. This is why machine learning techniques for malware detection are continually being developed and improved. These, however, operate on huge amounts of data and face challenges like finding an equilibrium between the three most desired requirements: low false positive rate, high detection rate, acceptable performance impact. This paper aims to reach this equilibrium by starting with an algorithm which has a zero false positive rate during the training phase and continuing by further improving it, in order to increase the detection rate without significantly altering the low false positive property.

Cristina Vatamanu, Doina Cosovan, Dragoş Gavriluţ, Henri Luchian
Multi-column Deep Neural Network for Offline Arabic Handwriting Recognition

In recent years, Deep Neural Networks (DNNs) have been successfully applied to several pattern recognition fields. For example, Multi-Column Deep Neural Networks (MCDNN) achieve state-of-the-art recognition rates on a Chinese character database. In this paper, we utilize MCDNN for Offline Arabic Handwriting Recognition (OAHR). Through several experimental settings using the benchmark IFN/ENIT database, we show incremental improvements in word recognition comparable to approaches using Deep Belief Networks (DBN) or Recurrent Neural Networks (RNN). Lastly, we compare our best result to previous state-of-the-art results.

Rolla Almodfer, Shengwu Xiong, Mohammed Mudhsh, Pengfei Duan
Using LSTMs to Model the Java Programming Language

Recurrent neural networks (RNNs), specifically long short-term memory networks (LSTMs), can model natural language effectively. This research investigates the ability of these same LSTMs to perform next "word" prediction on the Java programming language. Java source code from four different repositories undergoes a transformation that preserves the logical structure of the source code and removes the code's various specificities such as variable names and literal values. These datasets and an additional English language corpus are used to train and test standard LSTMs' ability to predict the next element in a sequence. Results suggest that LSTMs can effectively model Java code, achieving perplexities under 22 and accuracies above 0.47, an improvement over the LSTMs' performance on English, which showed a perplexity of 85 and an accuracy of 0.27. This research can have applicability in other areas such as syntactic template suggestion and automated bug patching.

Brendon Boldt
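Perplexity, the figure of merit reported above, is the exponentiated average negative log-likelihood over a held-out sequence of $$ N $$ tokens: $$ \mathrm{PP} = \exp\left(-\frac{1}{N}\sum_{i=1}^{N} \ln p(w_i \mid w_{<i})\right) $$. A perplexity of 22 on Java versus 85 on English therefore indicates that the next token in (abstracted) source code is far more predictable than the next word in natural language.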

Representation and Classification

Frontmatter
Classification of Categorical Data in the Feature Space of Monotone DNFs

Nowadays, kernel-based classifiers, such as SVM, are widely used on many different classification tasks. One of the drawbacks of these kinds of approaches is their poor interpretability. In the past, some effort has been devoted to designing kernels able to construct a more understandable feature space, e.g., boolean kernels, but only combinations of simple conjunctive clauses have been proposed. In this paper, we present a family of boolean kernels, specifically, the Conjunctive kernel, the Disjunctive kernel and the DNF-kernel. These kernels are able to construct feature spaces with a wide spectrum of logical formulae. For all of these kernels, we provide a description of their corresponding feature spaces and efficient ways to calculate their values implicitly. Experiments on several categorical datasets show the effectiveness of the proposed kernels.

Mirko Polato, Ivano Lauriola, Fabio Aiolli
DeepBrain: Functional Representation of Neural In-Situ Hybridization Images for Gene Ontology Classification Using Deep Convolutional Autoencoders

This paper presents a novel deep learning-based method for learning a functional representation of mammalian neural images. The method uses a deep convolutional denoising autoencoder (CDAE) for generating an invariant, compact representation of in situ hybridization (ISH) images. While most existing methods for bio-imaging analysis were not developed to handle images with highly complex anatomical structures, the results presented in this paper show that functional representation extracted by CDAE can help learn features of functional gene ontology categories for their classification in a highly accurate manner. Using this CDAE representation, our method outperforms the previous state-of-the-art classification rate, by improving the average AUC from 0.92 to 0.98, i.e., achieving 75% reduction in error. The method operates on input images that were downsampled significantly with respect to the original ones to make it computationally feasible.

Ido Cohen, Eli (Omid) David, Nathan S. Netanyahu, Noa Liscovitch, Gal Chechik
Mental Workload Classification Based on Semi-Supervised Extreme Learning Machine

The real-time monitoring of an operator's mental workload (MWL) is crucial for the design and development of adaptive operator-aiding/assistance systems. Although the data-driven approach has shown promising performance for MWL recognition, its major challenge lies in the difficulty of acquiring extensive labeled data. This paper attempts to apply the semi-supervised extreme learning machine (ELM) to the challenging problem of operator mental workload classification based only on a small amount of labeled physiological data. The real data analysis results show that the semi-supervised ELM method can effectively improve the accuracy and computational efficiency of MWL pattern classification.

Jianrong Li, Jianhua Zhang
View-Weighted Multi-view K-means Clustering

In many clustering problems, the data are represented by multiple views. Different views describe different aspects of the same set of instances and provide complementary information. Considering that blindly combining the information from different views will degrade the multi-view clustering result, this paper proposes a novel view-weighted multi-view k-means method. Meanwhile, to reduce the adverse effect of outliers, the $$l_{2,1}$$ norm is employed to calculate the distance between data points and cluster centroids. An alternating iterative update scheme is developed to find the optimal value. Comparative experiments on real-world datasets reveal that the proposed method has better performance.

Hong Yu, Yahong Lian, Shu Li, JiaXin Chen
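The $$l_{2,1}$$ norm mentioned above, applied to a residual matrix $$ E $$ with rows $$ e_i $$ (one per data point), is $$ \Vert E \Vert_{2,1} = \sum_i \Vert e_i \Vert_2 = \sum_i \sqrt{\textstyle\sum_j e_{ij}^2} $$; unlike the squared Frobenius norm it does not square per-point errors, which is why it down-weights outliers.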
Indefinite Support Vector Regression

Non-metric proximity measures have gained wide interest in various domains such as life sciences, robotics and image processing. The majority of learning algorithms for such data focus on classification problems. Here we derive a regression algorithm for indefinite data representations based on the support vector machine. The approach avoids heuristic eigenspectrum modifications or costly proxy matrix approximations, as used in general. We evaluate the method on a number of benchmark datasets using an indefinite measure.

Frank-Michael Schleif
Instance-Adaptive Attention Mechanism for Relation Classification

Recently, the attention mechanism has been transferred to the relation classification task. Since relation classification is a sequence-to-label task, the challenge is how to generate the deciding factor used to calculate attention weights. The previous solution randomly initializes a global deciding factor, which is prone to over-fitting. To solve this problem, we propose an instance-adaptive attention mechanism, which generates a specially designed deciding factor for each sentence. The experimental results on the SemEval-2010 Task 8 dataset show that our method can outperform most state-of-the-art systems without external linguistic features.

Yao Lu, Chunyun Zhang, Weiran Xu
ReForeSt: Random Forests in Apache Spark

Random Forests (RF) of tree classifiers are a popular ensemble method for classification. RF are usually preferred with respect to other classification techniques because of their limited hyperparameter sensitivity, high numerical robustness, native capacity of dealing with numerical and categorical features, and effectiveness in many real world classification problems. In this work we present ReForeSt, a Random Forests Apache Spark implementation which is easier to tune, faster, and less memory consuming with respect to MLlib, the de facto standard Apache Spark machine learning library. We perform an extensive comparison between ReForeSt and MLlib by taking advantage of the Google Cloud Platform (https://cloud.google.com). In particular, we test ReForeSt and MLlib with different library settings, on different real-world datasets, and with different numbers of machines equipped with different numbers of cores. Results confirm that ReForeSt outperforms MLlib in all the above mentioned aspects. ReForeSt is made publicly available via GitHub (https://github.com/alessandrolulli/reforest).

Alessandro Lulli, Luca Oneto, Davide Anguita
Semi-Supervised Multi-view Multi-label Classification Based on Nonnegative Matrix Factorization

Many real-world applications involve multi-label classification where each sample is usually associated with a set of labels. Although many methods have been proposed, most of them are just applicable to single-view data neglecting the complementary information among multiple views. Besides, most existing methods are supervised, hence they cannot handle the case where only a few labeled data are available. To address these issues, we propose a novel semi-supervised multi-view multi-label classification method based on nonnegative matrix factorization (NMF). Specifically, it explores the complementary information by adopting multi-view NMF, regularizes the learned labels of each view towards a common consensus labeling, and obtains the labels of the unlabeled data guided by supervised information. Experimental results on real-world benchmark datasets demonstrate the superior performance of our method over the state-of-the-art methods.

Guangxia Wang, Changqing Zhang, Pengfei Zhu, Qinghua Hu
Masked Conditional Neural Networks for Audio Classification

We present the ConditionaL Neural Network (CLNN) and the Masked ConditionaL Neural Network (MCLNN), designed for temporal signal recognition. The CLNN takes into consideration the temporal nature of the sound signal, and the MCLNN extends the CLNN through a binary mask that preserves the spatial locality of the features and allows automated exploration of feature combinations, analogous to hand-crafting the most relevant features for the recognition task. The MCLNN has achieved competitive recognition accuracies on the GTZAN and ISMIR2004 music datasets, surpassing several state-of-the-art neural network based architectures and hand-crafted methods applied to both datasets.

Fady Medhat, David Chesmore, John Robinson
A Feature Selection Approach Based on Information Theory for Classification Tasks

This paper proposes the use of an Information Theory measure in a dynamic feature selection approach. We tested this approach, which includes elements of Information Theory such as Mutual Information in the process, and compared it with classical methods like PCA and LDA as well as with Mutual Information based algorithms. Results showed that the proposed method achieved better performance in most cases when compared with the other methods. Based on this, we conclude that the proposed approach is very promising, since it achieved better performance than well-established dimensionality reduction methods.

Jhoseph Jesus, Anne Canuto, Daniel Araújo
Two-Level Neural Network for Multi-label Document Classification

This paper deals with multi-label document classification using neural networks. We propose a novel neural network which is composed of two sub-nets: the first one estimates the scores for all classes, while the second one determines the number of classes assigned to the document. The proposed approach is evaluated on Czech and English standard corpora. The experimental results show that the proposed method is competitive with the state of the art on both languages.

Ladislav Lenc, Pavel Král
Ontology Alignment with Weightless Neural Networks

In this paper, we present an ontology matching process based on the usage of Weightless Neural Networks (WNN). The alignment of ontologies for specific domains provides several benefits, such as interoperability among different systems and the improvement of the domain knowledge derived from the insights inferred from the combined information contained in the various ontologies. A WiSARD classifier is built to estimate a distribution-based similarity measure among the concepts of the several ontologies being matched. To validate our approach, we apply the proposed matching process to the knowledge domain of algorithms, software and computational problems, having some promising results.

Thais Viana, Carla Delgado, João C. P. da Silva, Priscila Lima
Marine Safety and Data Analytics: Vessel Crash Stop Maneuvering Performance Prediction

Crash stop maneuvering performance is one of the key indicators of vessel safety properties for a shipbuilding company. Many different factors affect this performance, from the vessel design to the environmental conditions, hence it is not trivial to assess it accurately during the preliminary design stages. Several first-principle equation methods are available to estimate the crash stop maneuvering performance, but unfortunately, these methods are usually either too costly or not accurate enough. To overcome these limitations, the authors propose a new data-driven method, based on the popular Random Forests learning algorithm, for predicting crash stop maneuvering performance. Results on real-world data provided by the DAMEN Shipyards show the effectiveness of the proposal.

Luca Oneto, Andrea Coraddu, Paolo Sanetti, Olena Karpenko, Francesca Cipollini, Toine Cleophas, Davide Anguita
Combining Character-Level Representation for Relation Classification

Word representation models have achieved great success in natural language processing tasks, such as relation classification. However, they do not always work on informal text, and the morphemes of some misspelled words may carry important short-distance semantic information. We propose a hybrid model, combining the merits of word-level and character-level representations to learn better representations on informal text. Experiments on the SemEval-2010 Task 8 dataset for relation classification show that our model achieves a competitive result.

Dongyun Liang, Weiran Xu, Yinge Zhao
On Combining Clusterwise Linear Regression and K-Means with Automatic Weighting of the Explanatory Variables

This paper presents a clusterwise linear regression method aiming to provide linear regression models that are based on homogeneous clusters of observations w.r.t. the explanatory variables. To achieve this aim, the method combines standard clusterwise linear regression and K-Means with automatic computation of relevance weights for the explanatory variables. Experiments with benchmark datasets corroborate the usefulness of the proposed method.

Ricardo A. M. da Silva, Francisco de A. T. de Carvalho
PSO-RBFNN: A PSO-Based Clustering Approach for RBFNN Design to Classify Disease Data

Radial Basis Function Neural Networks (RBFNNs) are non-iterative in nature, so they are attractive for disease classification. They are four-layer networks with input, hidden, output and decision layers. RBFNNs require a single iteration for training the network. On the other hand, they suffer from a hidden layer that grows on par with the training dataset. Various attempts have been made to solve this issue by clustering the input data, but the optimal number of clusters in a given dataset is unknown and estimating it involves more computational time. Hence, to address this problem, in this paper a Particle Swarm Optimization (PSO)-based clustering methodology is proposed. In this context, we introduce a measure in the objective function of PSO which allows us to measure the quality of a wide range of clusters without prior information. Next, this PSO-based clustering methodology yields a set of High-Performance Cluster Centers (HPCCs). The proposed method was evaluated on three medical datasets. The experimental results indicate that the proposed method outperforms the competing approaches.

Ramalingaswamy Cheruku, Damodar Reddy Edla, Venkatanareshbabu Kuppili, Ramesh Dharavath

Clustering

Frontmatter
Modularity-Driven Kernel k-means for Community Detection

The k-means algorithm is probably the most well-known and most popular clustering method in existence today. This work evaluates if a new, autonomous, kernel k-means approach for graph node clustering coupled with the modularity criterion can rival, e.g., the well-established Louvain method. We test the algorithm on social network datasets of various sizes and types. The new method estimates the optimal kernel or distance parameters as well as the natural number of clusters in the dataset based on modularity. Results indicate that this simple black-box algorithm manages to perform on par with the Louvain method given the same input.

Felix Sommer, François Fouss, Marco Saerens
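Modularity, the criterion driving the method above, is $$ Q = \frac{1}{2m}\sum_{ij}\left(A_{ij} - \frac{k_i k_j}{2m}\right)\delta(c_i, c_j) $$, where $$ A $$ is the adjacency matrix, $$ k_i $$ the degree of node $$ i $$, $$ m $$ the number of edges, and $$ \delta(c_i, c_j) = 1 $$ when nodes $$ i $$ and $$ j $$ are assigned to the same cluster.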
Measuring Clustering Model Complexity

The capacity of a clustering model can be defined as the ability to represent complex spatial data distributions. We introduce a method to quantify the capacity of an approximate spectral clustering model based on the eigenspectrum of the similarity matrix, providing the ability to measure capacity in a direct way and to estimate the most suitable model parameters. The method is tested on simple datasets and applied to a forged banknote classification problem.

Stefano Rovetta, Francesco Masulli, Alberto Cabri
GNMF Revisited: Joint Robust k-NN Graph and Reconstruction-Based Graph Regularization for Image Clustering

Clustering has long been a popular topic in machine learning and is the basic task of many vision applications. Graph regularized NMF (GNMF) and its variants, as extensions of NMF, decompose the whole dataset as the product of two low-rank matrices which respectively indicate the centroids of clusters and the cluster memberships of each sample. Although they utilize graph structure to reveal the geometrical structure within datasets, these methods completely ignore the robustness of the graph structure. To address this issue, this paper jointly incorporates a novel Robust Graph and a Reconstruction-based Graph regularization into NMF (RG$$^2$$NMF) to promote the gain in clustering performance. In particular, RG$$^2$$NMF stabilizes the objective of GNMF through the reconstruction regularization, and meanwhile exploits a learning procedure to derive the robust graph. Image clustering experiments on two popular datasets quantitatively illustrate the effectiveness of RG$$^2$$NMF compared with the baseline methods.

Feng Gu, Wenju Zhang, Xiang Zhang, Chenxu Wang, Xuhui Huang, Zhigang Luo
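For context, the standard GNMF objective that is extended here is $$ \min_{U, V \ge 0} \; \Vert X - U V^{\top} \Vert_F^2 + \lambda\, \mathrm{Tr}(V^{\top} L V) $$, where $$ L $$ is the Laplacian of a k-NN graph over the samples; as described above, RG$$^2$$NMF additionally learns a robust graph and adds a reconstruction-based regularization instead of relying on a fixed, possibly noisy k-NN graph.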
Two Staged Fuzzy SVM Algorithm and Beta-Elliptic Model for Online Arabic Handwriting Recognition

Online handwriting recognition has been gaining more interest in the field of document analysis due to the growth of data entry technology. In this context, we propose a new architecture for online Arabic word recognition based on a pre-classification of handwriting trajectory segments delimited by pen-down and pen-up actions. To characterize these segments, we extract their kinematic and geometric profile characteristics according to the overlapped beta-elliptic approach. The main contribution of this work consists in combining two stages of Support Vector Machines (SVM). The first one is developed in fuzzy logic (Fuzzy SVM) and computes the membership probabilities of pseudo-words in different sub-groups. The second stage gathers the membership probability vectors of pseudo-words belonging to the same word in order to predict the word label. The tests are performed on 937 classes representing Tunisian town names from the ADAB database. The obtained results show the effectiveness of the proposed architecture, which reached a rate of 99.89%.

Ramzi Zouari, Houcine Boubaker, Monji Kherallah
Evaluating the Compression Efficiency of the Filters in Convolutional Neural Networks

Along with the recent development of Convolutional Neural Networks (CNN) and their increasing depth, it is important to reduce the amount of computation and the amount of data associated with convolution processing. Several compression methods for convolutional filters using low-rank approximation have been studied. The common goal of these studies is to accelerate the computation wherever possible while maintaining the accuracy of image recognition. In this paper, we investigate the trade-off between the compression error of low-rank approximation and the computational complexity for state-of-the-art CNN models.

Kazuki Osawa, Rio Yokota
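The low-rank filter compression studied above can be illustrated with a plain SVD of the flattened filter bank; the sketch below shows the generic idea and the resulting compression error, not the specific decomposition evaluated in the paper:

    import numpy as np

    def low_rank_filters(weights, rank):
        """Approximate a filter bank (out_ch, in_ch, kh, kw) by a rank-r factorization
        of its flattened (out_ch, in_ch*kh*kw) matrix and report the relative error."""
        out_ch = weights.shape[0]
        W = weights.reshape(out_ch, -1)
        U, s, Vt = np.linalg.svd(W, full_matrices=False)
        W_approx = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        rel_err = np.linalg.norm(W - W_approx) / np.linalg.norm(W)
        return W_approx.reshape(weights.shape), rel_err

    filters = np.random.randn(64, 32, 3, 3)
    _, err = low_rank_filters(filters, rank=16)
    print(f"relative compression error at rank 16: {err:.3f}")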
Dynamic Feature Selection Based on Clustering Algorithm and Individual Similarity

This paper introduces a new dynamic feature selection method for classification algorithms, which is based on individual similarity and uses a clustering algorithm to select the best features for each instance individually. In addition, an empirical analysis is performed to evaluate the performance of the proposed method and to compare it with existing feature selection methods applied to classification problems. The results shown in this paper indicate that the proposed method performs better than the existing methods in most cases.

Carine A. Dantas, Rômulo O. Nunes, Anne M. P. Canuto, João C. Xavier-Júnior

Learning from Data Streams and Time Series

Frontmatter
Dialogue-Based Neural Learning to Estimate the Sentiment of a Next Upcoming Utterance

In a conversation, humans use changes in a dialogue to predict safety-critical situations and use them to react accordingly. We propose to use the same cues for safer human-robot interaction for early verbal detection of dangerous situations. Due to the limited availability of sentiment-annotated dialogue corpora, we use a simple sentiment classification for utterances to neurally learn sentiment changes within dialogues and ultimately predict the sentiment of upcoming utterances. We train a recurrent neural network on context sequences of words, defined as two utterances of each speaker, to predict the sentiment class of the next utterance. Our results show that this leads to useful predictions of the sentiment class of the upcoming utterance. Results for two challenging dialogue datasets are reported to show that predictions are similar independent of the dataset used for training. The prediction accuracy is about 63% for binary and 58% for multi-class classification.

Chandrakant Bothe, Sven Magg, Cornelius Weber, Stefan Wermter
Solar Power Forecasting Using Pattern Sequences

We consider the task of predicting the solar power output for the next day from solar power output and weather data for previous days, and weather forecast for the next day. We study the performance of pattern sequence methods which combine clustering and sequence similarity. We show how the standard PSF algorithm can be extended to utilize data from more than one data source by proposing two extensions, PSF1 and PSF2. The performance of the three PSF methods is evaluated on Australian data for two years and compared with three neural network models and a baseline. Our results show that the extensions were beneficial, especially PSF2 which uses a 2-tier clustering and sequence matching. We also investigate the robustness of all methods for different levels of noise in the weather forecast.

Zheng Wang, Irena Koprinska, Mashud Rana
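The pattern-sequence idea behind PSF can be sketched as follows: cluster the daily profiles into labels, find past days whose label sequence matches the most recent window, and average the days that followed. The sketch below illustrates only the generic scheme, not the PSF1/PSF2 extensions that fuse several data sources:

    import numpy as np
    from sklearn.cluster import KMeans

    def psf_forecast(daily_profiles, window=3, n_clusters=4):
        """Generic pattern-sequence forecast of the next daily profile."""
        labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(daily_profiles)
        pattern = tuple(labels[-window:])                  # most recent label sequence
        matches = [i + window for i in range(len(labels) - window)
                   if tuple(labels[i:i + window]) == pattern]
        if not matches:                                    # no match: fall back to persistence
            return daily_profiles[-1]
        return daily_profiles[matches].mean(axis=0)        # average of the days that followed

    days = np.random.rand(365, 24)                         # a year of hourly solar output
    next_day = psf_forecast(days)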
A New Methodology to Exploit Predictive Power in (Open, High, Low, Close) Data

Prediction of financial markets using neural networks and other techniques has predominately focused on the close price. Here, in contrast, the concept of a mid-price based on an Open, High, Low, Close (OHLC) data structure is proposed as a prediction target and shown to be a significantly easier target to forecast, suggesting previous works have attempted to extract predictive power from OHLC data in the wrong context. A prediction framework incorporating a factor discovery and mining process is developed using Randomised Decision Trees, with Long Short Term Memory Recurrent Neural Networks subsequently demonstrating remarkable predictive capabilities of up to 50.73% better than random (75.42% accuracy) on hourly data based on the FGBL German Bund futures contract, and 42.5% better than random (72.04% accuracy) on a comparison Bitcoin dataset.

Andrew D. Mann, Denise Gorse
Recurrent Dynamical Projection for Time Series-Based Fraud Detection

A Reservoir Computing approach is used in this work for generating a rich nonlinear spatial feature from the dynamical projection of a limited-size input time series. The final state of the recurrent neural network (RNN) forms the feature subsequently used as input to a regressor or classifier (such as Random Forest or Least Squares). This proposed method is used for fraud detection in the energy distribution domain, namely, detection of non-technical losses (NTL) using a real-world dataset containing only the monthly energy consumption time series of (more than 300 K) users. The heterogeneity of user profiles is dealt with using a clustering approach, where the cluster id is also input to the classifier. Experimental results show that the proposed recurrent feature generator is able to extract relevant nonlinear transformations of the raw time series without a priori knowledge and performs as well as (and sometimes better than) baseline models with handcrafted features.

Eric A. Antonelo, Radu State
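A minimal sketch of the reservoir-based feature described above: drive a fixed random recurrent network with a consumption time series and keep only its final state as the nonlinear feature vector. The reservoir size, input scaling and spectral radius below are illustrative assumptions:

    import numpy as np

    def reservoir_final_state(series, n_reservoir=100, spectral_radius=0.9, seed=0):
        """Project a 1-D time series through a fixed random reservoir and
        return the final state as a nonlinear feature vector."""
        rng = np.random.default_rng(seed)
        W_in = rng.uniform(-0.5, 0.5, size=n_reservoir)
        W = rng.uniform(-0.5, 0.5, size=(n_reservoir, n_reservoir))
        W *= spectral_radius / max(abs(np.linalg.eigvals(W)))   # scale into the echo-state regime
        x = np.zeros(n_reservoir)
        for u in series:
            x = np.tanh(W_in * u + W @ x)
        return x                                                # feed this to a classifier/regressor

    monthly_consumption = np.random.rand(36)                    # three years of monthly readings
    feature = reservoir_final_state(monthly_consumption)        # shape (100,)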
Transfer Information Energy: A Quantitative Causality Indicator Between Time Series

We introduce an information-theoretical approach for analyzing cause-effect relationships between time series. Rather than using the Transfer Entropy (TE), we define and apply the Transfer Information Energy (TIE), which is based on Onicescu’s Information Energy. The TIE can substitute the TE for detecting cause-effect relationships between time series. The advantage of using the TIE is computational: we can obtain similar results, but faster. To illustrate, we compare the TIE and the TE in a machine learning application. We analyze time series of stock market indexes, with the goal to infer causal relationships between them (i.e., how they influence each other).

Angel Caţaron, Răzvan Andonie
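For reference, Onicescu's Information Energy of a discrete distribution, on which the TIE is built, is $$ E(X) = \sum_i p(x_i)^2 $$, an entropy-like quantity in which the squared probability replaces $$ -p \log p $$; the TIE is then presumably constructed analogously to the Transfer Entropy, with information energies replacing the entropy terms.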
Improving Our Understanding of the Behavior of Bees Through Anomaly Detection Techniques

Bees are one of the most important pollinators since they assist in plant reproduction and ensure seed and fruit production. They are important both for pollination and honey production, which benefits small and large-scale agriculturists. However, in recent years, the bee populations have declined significantly in alarming ways on a global scale. In this scenario, understanding the behavior of bees has become a matter of great concern in an attempt to find the possible causes of this situation. In this study, an anomaly detection algorithm is created for data labeling, as well as to evaluate the classification models of anomalous events in a time series obtained from RFID sensors installed in bee hives.

Fernando Gama, Helder M. Arruda, Hanna V. Carvalho, Paulo de Souza, Gustavo Pessin
Applying Bidirectional Long Short-Term Memories (BLSTM) to Performance Data in Air Traffic Management for System Identification

The performance analysis of complex systems like Air Traffic Management (ATM) is a challenging task. To overcome the statistical complexities of analyzing non-linear time series, we approach the problem with machine learning methods. We therefore understand ATM (and its identified system model) as a system of coupled and interdependent sub-systems working in time-continuous processes, measurable through time-discrete time series. In this paper we discuss the requirements of a system identification process and the attached statistical analysis of performance data emitted by ATM, based on the discussed benchmarking frameworks. The overarching aim is to show that neural networks are able to handle complex non-linear time series, to learn how to rebuild them from multidimensional inputs, and to store knowledge about the behavior of the observed data set.

Stefan Reitmann, Karl Nachtigall
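A minimal Keras sketch of the kind of bidirectional LSTM regressor described above, assuming TensorFlow is available; the window length, feature count, layer sizes and toy data are placeholders rather than the paper's configuration.

```python
import numpy as np
from tensorflow.keras import layers, models

window, n_features = 24, 8          # hypothetical window length and KPI count
model = models.Sequential([
    layers.Input(shape=(window, n_features)),
    layers.Bidirectional(layers.LSTM(32)),    # reads the window in both directions
    layers.Dense(16, activation="relu"),
    layers.Dense(1),                          # regress one performance indicator
])
model.compile(optimizer="adam", loss="mse")

# Toy data standing in for multidimensional ATM performance time series.
x = np.random.randn(256, window, n_features).astype("float32")
y = np.random.randn(256, 1).astype("float32")
model.fit(x, y, epochs=2, batch_size=32, verbose=0)
```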

Image Processing and Medical Applications

Frontmatter
A Novel Image Tag Completion Method Based on Convolutional Neural Transformation

In the problems of image retrieval and annotation, complete textual tag lists of images play critical roles. However, in real-world applications image tags are usually incomplete, so it is important to learn the complete tags for images. In this paper, we study the problem of image tag completion and propose a novel method for it based on a popular image representation method, the convolutional neural network (CNN). The method estimates the complete tags from the convolutional filtering outputs of images using a linear predictor. The CNN parameters, the linear predictor, and the complete tags are learned jointly by our method. We build a minimization problem to encourage consistency between the complete tags and the available incomplete tags, to reduce the estimation error, and to reduce the model complexity. An iterative algorithm is developed to solve the minimization problem. Experiments over benchmark image data sets show its effectiveness.

Yanyan Geng, Guohui Zhang, Weizhi Li, Yi Gu, Ru-Ze Liang, Gaoyuan Liang, Jingbin Wang, Yanbin Wu, Nitin Patil, Jing-Yan Wang
Reducing Unknown Unknowns with Guidance in Image Caption

Deep recurrent models applied to image captioning, which links computer vision and natural language processing, have achieved excellent results, enabling the automatic generation of natural sentences describing an image. However, the mismatch of sample distributions between the training data and the open world may lead to a large number of hidden Unknown Unknowns (UUs), and such errors may greatly harm the correctness of the generated captions. In this paper, we present a framework targeting UU reduction and model optimization, based on recurrent training with small amounts of external data detected with the assistance of crowd commonsense. We demonstrate and analyze our method with a current state-of-the-art image-to-text model. Aiming at reducing the number of UUs in generated captions, we obtain a reduction of over 12% in UUs and reinforce the model's cognition of these scenes.

Mengjun Ni, Jing Yang, Xin Lin, Liang He
A Novel Method for Ship Detection and Classification on Remote Sensing Images

Ship detection and classification is critical for national maritime security and national defense. As massive optical remote sensing images of high resolution become available, ship detection and classification on optical remote sensing images is becoming a promising technique and has attracted great attention in applications including maritime security and traffic control. Some image processing-based methods have been proposed to detect ships in optical remote sensing images, but most of them face difficulties in terms of accuracy, performance and complexity. Therefore, in this paper, we propose a novel ship detection and classification approach which utilizes a deep convolutional neural network (CNN) as the ship classifier. To overcome the divergence problem of the deep CNN-based classifier, a residual network-based ship classifier is proposed. In order to deepen the network without excessive growth of network complexity, inception layers are used. In addition, batch normalization is used in each convolution layer to accelerate convergence. The performance of our proposed ship detection and classification approach is evaluated on a set of ship images downloaded from Google Earth, each of 256 × 64 pixels at a resolution of 0.5 m. A classification accuracy of ninety-five percent is achieved. A CUDA-enabled residual network is implemented for model training, achieving a 75× speedup on one Nvidia Titan X GPU.

Ying Liu, Hongyuan Cui, Guoqing Li
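The sketch below shows one way to write the residual-plus-batch-normalization building block mentioned in the abstract, using Keras. The 256 × 64 input size follows the abstract, while the filter counts, the two-class head and the omission of inception branches are simplifying assumptions, not the authors' architecture.

```python
import tensorflow as tf
from tensorflow.keras import layers

def residual_block(x, filters):
    """Two 3x3 convolutions with batch normalization and a skip connection."""
    shortcut = x
    y = layers.Conv2D(filters, 3, padding="same")(x)
    y = layers.BatchNormalization()(y)
    y = layers.ReLU()(y)
    y = layers.Conv2D(filters, 3, padding="same")(y)
    y = layers.BatchNormalization()(y)
    if shortcut.shape[-1] != filters:                 # match channel count if needed
        shortcut = layers.Conv2D(filters, 1, padding="same")(shortcut)
    return layers.ReLU()(layers.Add()([y, shortcut]))

inputs = layers.Input(shape=(256, 64, 3))             # ship chips, as in the abstract
x = residual_block(inputs, 32)
x = layers.MaxPooling2D()(x)
x = residual_block(x, 64)
x = layers.GlobalAveragePooling2D()(x)
outputs = layers.Dense(2, activation="softmax")(x)    # e.g. ship vs. non-ship
model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```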
Single Image Super-Resolution by Learned Double Sparsity Dictionaries Combining Bootstrapping Method

A novel single image super-resolution (SISR) method using learned double sparsity dictionaries combined with a bootstrapping method is proposed in this paper. The bootstrapping method we use was proposed by Zeyde et al. [1]; it uses the input low-resolution (LR) image (as the high-resolution image) and its own scaled-down version (as the LR image) as the training images. In our previous work [15], two difference images are computed from the output image obtained by the bootstrapping method and used to learn a pair of dictionaries as proposed in [1]. In this paper, we further improve the SISR method by using four wavelet sub-bands of the two difference images as extra information when learning the sparse representation model. We use the K-singular value decomposition (K-SVD) method to obtain the dictionary and the orthogonal matching pursuit (OMP) method to derive the sparse representation coefficients. Comparative experimental results show that our proposed method performs better in terms of both visual effect and Peak Signal to Noise Ratio (PSNR) improvement.

Na Ai, Jinye Peng, Jun Wang, Lin Wang, Jin Qi
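As a hedged sketch of the dictionary-learning and sparse-coding pipeline (not the authors' code), scikit-learn's MiniBatchDictionaryLearning stands in for K-SVD and sparse_encode with the OMP option recovers the coefficients; the patch size, dictionary size and sparsity level are arbitrary assumptions.

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning, sparse_encode

rng = np.random.default_rng(0)
patches = rng.normal(size=(500, 64))        # toy 8x8 image patches, flattened

# Learn an overcomplete dictionary (128 atoms for 64-dimensional patches).
dico = MiniBatchDictionaryLearning(n_components=128, alpha=1.0, random_state=0)
dictionary = dico.fit(patches).components_

# Sparse representation of new patches via Orthogonal Matching Pursuit.
codes = sparse_encode(patches[:10], dictionary, algorithm="omp", n_nonzero_coefs=5)
reconstruction = codes @ dictionary
print("mean reconstruction error:", np.mean((patches[:10] - reconstruction) ** 2))
```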
Attention Focused Spatial Pyramid Pooling for Boxless Action Recognition in Still Images

Existing approaches for still-image-based action recognition rely heavily on bounding boxes and may be restricted to specific applications where bounding boxes are available. Boxless action recognition in still images is therefore very challenging due to the lack of such supervised knowledge. To address this issue, we propose an attention-focused spatial pyramid pooling (SPP) network (AttSPP-net), free from bounding boxes, that jointly integrates a soft attention mechanism and SPP into a convolutional neural network. In particular, the soft attention mechanism automatically indicates the image regions relevant to an action. In addition, AttSPP-net exploits SPP to boost robustness to action deformation by capturing spatial structures among image pixels. Experiments on two public action recognition benchmark datasets, PASCAL VOC 2012 and Stanford-40, demonstrate that AttSPP-net achieves promising results, even outperforming some methods based on ground-truth bounding boxes, and provides an alternative way towards practical applications.

Weijiang Feng, Xiang Zhang, Xuhui Huang, Zhigang Luo
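A minimal NumPy sketch of spatial pyramid pooling as used in AttSPP-net-style models: each pyramid level partitions the feature map into a grid and max-pools every cell, yielding a fixed-length descriptor regardless of the spatial size of the input. The pyramid levels chosen here are assumptions, not the paper's configuration.

```python
import numpy as np

def spatial_pyramid_pool(fmap, levels=(1, 2, 4)):
    """fmap: array of shape (C, H, W) -> vector of length C * sum(l*l).
    Assumes H and W are at least max(levels)."""
    c, h, w = fmap.shape
    features = []
    for l in levels:
        h_edges = np.linspace(0, h, l + 1, dtype=int)
        w_edges = np.linspace(0, w, l + 1, dtype=int)
        for i in range(l):
            for j in range(l):
                cell = fmap[:, h_edges[i]:h_edges[i + 1], w_edges[j]:w_edges[j + 1]]
                features.append(cell.max(axis=(1, 2)))   # max-pool each grid cell
    return np.concatenate(features)

fmap = np.random.randn(256, 13, 17)          # e.g. a conv feature map
print(spatial_pyramid_pool(fmap).shape)      # (256 * (1 + 4 + 16),) = (5376,)
```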
The Impact of Dataset Complexity on Transfer Learning over Convolutional Neural Networks

This paper makes use of datasets from diverse domains to analyze the impact of image complexity and diversity on the task of transfer learning with deep neural networks. As labels and high-quality instances are still scarce in several domains, it is imperative to use the knowledge acquired from similar problems to improve classifier performance by transferring the learned parameters. We performed a statistical analysis through several experiments in which convolutional neural networks (LeNet-5, AlexNet, VGG-11 and VGG-16) were trained and transferred to different target tasks layer by layer. We show that when working with complex low-quality images and small datasets, fine-tuning the features transferred from a low-complexity source dataset gives the best results.

Miguel D. de S. Wanderley, Leonardo de A. e Bueno, Cleber Zanchettin, Adriano L. I. Oliveira
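A minimal sketch of the transfer-learning setup discussed above, assuming Keras and an ImageNet-pretrained VGG-16 as the source model: the convolutional layers are transferred frozen and only a new head is trained on the small target task. The paper transfers and fine-tunes layer by layer; freezing everything at once, and the 5-class head, are simplifications for illustration.

```python
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
for layer in base.layers:
    layer.trainable = False          # transfer the learned features unchanged

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(256, activation="relu"),
    layers.Dense(5, activation="softmax"),   # hypothetical 5-class target task
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])
# model.fit(target_images, target_labels, epochs=10)  # small target dataset
```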
Real-Time Face Detection Using Artificial Neural Networks

In this paper, we propose a model for face detection that works in real time and in unstructured environments. For feature extraction, we applied the HOG (Histograms of Oriented Gradients) technique in a canonical window. For classification, we used a feed-forward neural network. We tested the performance of the proposed model at detecting faces in sequences of color images. For this task, we created a database containing color image patches of faces and background to train the neural network, and color images of 320 × 240 pixels to test the model. The database is available at http://electronica-el.espe.edu.ec/actividad-estudiantil/face-database/. To achieve real-time performance, we split the model into several modules that run in parallel. The proposed model exhibited an accuracy of 91.4% and demonstrated robustness to changes in illumination, pose and occlusion. For the tests, we used a 2-core 2.5 GHz PC with 6 GB of RAM, where input frames of 320 × 240 pixels were processed in an average time of 81 ms.

Pablo S. Aulestia, Jonathan S. Talahua, Víctor H. Andaluz, Marco E. Benalcázar
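As an illustrative sketch (not the authors' pipeline), HOG features from fixed-size patches are fed to a small feed-forward network via scikit-image and scikit-learn; the 32 × 32 patch size, network size and random stand-in data are assumptions.

```python
import numpy as np
from skimage.feature import hog
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

def hog_features(patch):
    """patch: 2-D grayscale array of a fixed canonical size (here 32x32)."""
    return hog(patch, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2), feature_vector=True)

# Toy stand-in data: random 32x32 patches labelled face (1) / background (0).
patches = rng.random((200, 32, 32))
labels = rng.integers(0, 2, size=200)
features = np.array([hog_features(p) for p in patches])

clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=300, random_state=0)
clf.fit(features, labels)
print("training accuracy:", clf.score(features, labels))
```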
On the Performance of Classic and Deep Neural Models in Image Recognition

Deep learning has arisen in recent years as a powerful tool for machine learning problems. This article analyses the performance of classic and deep neural network models in a challenging problem like face recognition. The aim of this article is to study the main advantages and disadvantages that deep neural networks provide and when they are more suitable than classic models, which have also obtained very good results in some complex problems. Is it worth using deep learning? The results show that deep models increase the learning capabilities of classic neural networks in problems with highly non-linear features.

Ricardo García-Ródenas, Luis Jiménez Linares, Julio Alberto López-Gómez
Winograd Algorithm for 3D Convolution Neural Networks

Three-dimensional convolution neural networks (3D CNNs) have achieved great success in many computer vision applications, such as video analysis, medical image classification, and human action recognition. However, the model is computationally intensive. In this work, we reduce the algorithmic complexity of the 3D CNN to accelerate the model with Winograd's minimal filtering algorithm. We benchmark a network model on a GPU platform, obtaining a speed-up by a factor of 1.2× compared with cuDNN, which is commonly used in many current machine learning frameworks.

Zelong Wang, Qiang Lan, Hongjun He, Chunyuan Zhang
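For concreteness, the sketch below implements the classical 1-D Winograd minimal filtering algorithm F(2,3), the building block that, nested across dimensions, underlies Winograd-accelerated 3D convolution: it produces two outputs of a 3-tap correlation with four multiplications instead of six. This is the textbook transform, not the paper's GPU kernel.

```python
import numpy as np

def winograd_f23(d, g):
    """d: 4 input values, g: 3 filter taps -> 2 outputs of the valid correlation."""
    m1 = (d[0] - d[2]) * g[0]
    m2 = (d[1] + d[2]) * (g[0] + g[1] + g[2]) / 2
    m3 = (d[2] - d[1]) * (g[0] - g[1] + g[2]) / 2
    m4 = (d[1] - d[3]) * g[2]
    return np.array([m1 + m2 + m3, m2 - m3 - m4])

d = np.array([1.0, 2.0, 3.0, 4.0])
g = np.array([0.5, -1.0, 2.0])
direct = np.array([d[0:3] @ g, d[1:4] @ g])        # naive correlation for comparison
print(np.allclose(winograd_f23(d, g), direct))     # True
```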
Core Sampling Framework for Pixel Classification

The intermediate map responses of a Convolutional Neural Network (CNN) contain contextual knowledge about its input. In this paper, we present a framework that uses these activation maps from several layers of a CNN as features for a Deep Belief Network (DBN), using transfer learning to provide an understanding of an input image. We create a representation of these features and the training data and use them to extract more information from an image at the pixel level, hence gaining understanding of the whole image. We experimentally demonstrate the usefulness of our framework using a pretrained model and a DBN to perform segmentation on the BAERI dataset of Synthetic Aperture Radar (SAR) imagery and on the CAMVID dataset, with a relatively small training dataset.

Manohar Karki, Robert DiBiano, Saikat Basu, Supratik Mukhopadhyay
Biomedical Data Augmentation Using Generative Adversarial Neural Networks

Synthesizing photo-realistic images is a challenging problem with many practical applications [15]. In many cases, the availability of a significant number of images is crucial, yet obtaining them may not be trivial. In the biomedical domain, for instance, obtaining huge databases of images is hard, yet they are strictly needed in order to improve both algorithms and physicians' skills. In recent years, new deep learning models called Generative Adversarial Neural Networks (GANNs) [7] have been proposed in the literature and have turned out to be effective at synthesizing high-quality images in several domains. In this work we propose a new application of GANNs to the automatic generation of artificial Magnetic Resonance Images (MRI) of slices of the human brain; both quantitative and human-based evaluations of the generated images have been carried out in order to assess the effectiveness of the method.

Francesco Calimeri, Aldo Marzullo, Claudio Stamile, Giorgio Terracina
Detection of Diabetic Retinopathy Based on a Convolutional Neural Network Using Retinal Fundus Images

Diabetic retinopathy is one of the leading causes of blindness. Its damage is associated with the deterioration of blood vessels in the retina. Progression of visual impairment may be cushioned or prevented if detected early, but diabetic retinopathy does not present symptoms prior to progressive loss of vision, and its late detection results in irreversible damage. Manual diagnosis is performed on retinal fundus images and requires experienced clinicians to detect and quantify the importance of several small details, which makes this an exhaustive and time-consuming task. In this work, we attempt to develop a computer-assisted tool to classify medical images of the retina in order to diagnose diabetic retinopathy quickly and accurately. A neural network with a CNN architecture identifies exudates, micro-aneurysms and hemorrhages in the retina image, trained with labeled samples provided by EyePACS, a free platform for retinopathy detection. The database consists of 35126 high-resolution retinal images taken under a variety of conditions. After training, the network shows a specificity of 93.65% and an accuracy of 83.68% on validation.

Gabriel García, Jhair Gallardo, Antoni Mauricio, Jorge López, Christian Del Carpio
A Comparison of Machine Learning Approaches for Classifying Multiple Sclerosis Courses Using MRSI and Brain Segmentations

The objective of this paper is to classify Multiple Sclerosis courses using features extracted from Magnetic Resonance Spectroscopic Imaging (MRSI) combined with brain tissue segmentations of gray matter, white matter, and lesions. For this purpose we trained several classifiers, ranging from simple ones (i.e. Linear Discriminant Analysis) to the state of the art (i.e. Convolutional Neural Networks). We investigate four binary classification tasks and report maximum values of the Area Under the receiver operating characteristic Curve between 68% and 95%. Our best results were obtained by training Support Vector Machines with a Gaussian kernel on MRSI features combined with brain tissue segmentation features.

Adrian Ion-Mărgineanu, Gabriel Kocevar, Claudio Stamile, Diana M. Sima, Françoise Durand-Dubief, Sabine Van Huffel, Dominique Sappey-Marinier

Advances in Machine Learning

Frontmatter
Parallel-Pathway Generator for Generative Adversarial Networks to Generate High-Resolution Natural Images

Generative Adversarial Networks (GANs) can learn various generative models, such as probability distributions and images, although their training is difficult to make converge, and there are few successful methods for generating high-resolution images. In this paper, we propose a parallel-pathway generator network to generate high-resolution natural images. Our parallel network is constructed from generators with different structures stacked in parallel. To investigate the effect of our structure, we apply it to two image generation tasks: human-face images and road images, which do not have a square resolution. Results indicate that our method can generate high-resolution natural images with little parameter tuning.

Yuya Okadome, Wenpeng Wei, Toshiko Aizono
Using Echo State Networks for Cryptography

Echo state networks are simple recurrent neural networks that are easy to implement and train. Despite their simplicity, they show a form of memory and can predict or regenerate sequences of data. We make use of this property to realize a novel neural cryptography scheme. The key idea is to assume that Alice and Bob share a copy of an echo state network. If Alice trains her copy to memorize a message, she can communicate the trained part of the network to Bob, who plugs it into his copy to regenerate the message. Considering a byte-level representation of input and output, the technique applies to arbitrary types of data (texts, images, audio files, etc.), and practical experiments reveal that it satisfies the fundamental cryptographic properties of diffusion and confusion.

Rajkumar Ramamurthy, Christian Bauckhage, Krisztian Buza, Stefan Wrobel
Two Alternative Criteria for a Split-Merge MCMC on Dirichlet Process Mixture Models

The free energy and the generalization error are two major model selection criteria. However, in general, they are not equivalent. In previous studies of the split-merge algorithm on conjugate Dirichlet process mixture models, the complete free energy was mainly used. In this work, we propose a new criterion, the complete leave-one-out cross-validation, which is based on an approximation of the generalization error. In numerical experiments, our proposal outperforms the previous methods in terms of test set perplexity. Finally, we discuss the appropriate usage of these two criteria, taking the experimental results into account.

Tikara Hosino
FP-MRBP: Fine-grained Parallel MapReduce Back Propagation Algorithm

The MRBP algorithm is a training algorithm for Back Propagation Neural Networks (BPNNs) based on the MapReduce model; it employs the data-parallel capability of the MapReduce model to improve training efficiency and has shown good performance for training BPNNs with massive numbers of training patterns. However, it is a coarse-grained, pattern-parallel algorithm and lacks the capability of fine-grained structure parallelism. As a result, when training a large-scale BPNN, its training efficiency is still insufficient. To solve this issue, this paper proposes a novel MRBP algorithm, the Fine-grained Parallel MRBP (FP-MRBP) algorithm, which has the capability of fine-grained structure parallelism. To the best of the authors' knowledge, this is the first time fine-grained parallelism has been introduced into the classic MRBP algorithm. The experimental results show that our algorithm has better training efficiency when training a large-scale BPNN.

Gang Ren, Qingsong Hua, Pan Deng, Chao Yang
IQNN: Training Quantized Neural Networks with Iterative Optimizations

Quantized Neural Networks (QNNs) use low-bitwidth numbers for representing parameters and intermediate results. The lowering of bitwidths saves storage space and allows bitwise operations to be exploited to speed up computations. However, QNNs often have lower prediction accuracies than their floating point counterparts, due to the extra quantization errors. In this paper, we propose a quantization algorithm that iteratively solves for the optimal scaling factor during every forward pass, which significantly reduces quantization errors. Moreover, we propose a novel initialization method for the iterative quantization, which speeds up convergence and further reduces quantization errors. Overall, our method improves the prediction accuracies of QNNs at no extra cost for inference. Experiments confirm the efficacy of our method in the quantization of AlexNet, GoogLeNet and ResNet. In particular, we are able to train a GoogLeNet with 4-bit weights and activations that reaches 11.4% top-5 single-crop error on the ImageNet dataset, outperforming state-of-the-art QNNs. The code will be available online.

Shuchang Zhou, He Wen, Taihong Xiao, Xinyu Zhou
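A hedged sketch of the kind of iterative scaling-factor optimisation the abstract describes, not the paper's exact algorithm: alternating updates solve for a per-tensor scale s and integer codes q minimising the squared error between w and s*q. The bitwidth, initialisation and toy weights are assumptions.

```python
import numpy as np

def iterative_quantize(w, bits=4, iters=10):
    """Alternate between solving for integer codes q and the scale s."""
    qmax = 2 ** (bits - 1) - 1
    s = max(np.abs(w).max() / qmax, 1e-8)               # simple initialisation
    for _ in range(iters):
        q = np.clip(np.round(w / s), -qmax - 1, qmax)   # fix s, solve for codes
        s = float(w @ q) / float(q @ q + 1e-12)         # fix codes, solve for s
    return s, q

rng = np.random.default_rng(0)
w = rng.normal(scale=0.05, size=1024)                   # toy weight vector
s, q = iterative_quantize(w, bits=4)
print("quantization MSE:", np.mean((w - s * q) ** 2))
```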
Compressing Neural Networks by Applying Frequent Item-Set Mining

Deep neural networks are now widely used. To achieve better performance, people tend to build larger and deeper neural networks with millions or even billions of parameters. A natural question to ask is whether we can simplify the architecture of neural networks so that storage and computational costs are reduced. This paper presents a novel approach to pruning neural networks by frequent item-set mining. We propose a way to measure the importance of each item-set and then prune the networks. Compared with existing state-of-the-art pruning algorithms, our proposed algorithm can obtain a higher compression rate in one iteration with almost no loss of accuracy. To prove the effectiveness of our algorithm, we conducted several experiments on various types of neural networks. The results show that we can dramatically reduce the complexity of the model while also enhancing its performance.

Zi-Yi Dou, Shu-Jian Huang, Yi-Fan Su
Applying the Heavy-Tailed Kernel to the Gaussian Process Regression for Modeling Point of Sale Data

Heavy-tailed distributions such as Student's t distribution have a special position in statistical machine learning research due to their robustness when handling the Gaussian noise model or models with unknown types of noise. In this paper, we focus on using a robust kernel as an alternative to the widely used squared exponential kernel in order to promote the model's robustness. Furthermore, we apply the heavy-tailed kernel to Gaussian process regression for predicting the daily turnover of merchandise by learning from Point of Sale (PoS) data. The experimental results show better and more robust performance compared with other kernels.

Rui Yang, Yukio Ohsawa
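A minimal scikit-learn sketch in the spirit of the abstract above: a RationalQuadratic kernel (a scale mixture of squared-exponential kernels, and therefore heavier-tailed) stands in for the paper's heavy-tailed kernel, which may be defined differently; the toy turnover series is fabricated for illustration only.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RationalQuadratic, WhiteKernel

rng = np.random.default_rng(0)
days = np.arange(120).reshape(-1, 1)                    # toy daily time index
turnover = 100 + 10 * np.sin(days.ravel() / 7) + rng.normal(scale=3, size=120)

kernel = RationalQuadratic(length_scale=10.0, alpha=0.5) + WhiteKernel(1.0)
gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(days, turnover)

mean, std = gpr.predict(np.array([[121], [122]]), return_std=True)
print("forecast:", mean, "uncertainty:", std)
```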
Chaotic Associative Memory with Adaptive Scaling Factor

In this paper, we propose a Chaotic Associative Memory with an Adaptive Scaling Factor. In the proposed model, the scaling factor of the refractoriness is adjusted according to the maximum absolute value of the internal state up to that time, similarly to the conventional Chaotic Multidirectional Associative Memory with Adaptive Scaling Factor. Computer experiments confirm that the proposed model has the same dynamic association ability as the conventional model and retains a similar recall capability even for numbers of neurons not used in the automatic adjustment of the parameters.

Tatsuuya Okada, Yuko Osana
Backmatter
Metadata
Title
Artificial Neural Networks and Machine Learning – ICANN 2017
Edited by
Dr. Alessandra Lintas
Stefano Rovetta
Paul F.M.J. Verschure
Alessandro E.P. Villa
Copyright Year
2017
Electronic ISBN
978-3-319-68612-7
Print ISBN
978-3-319-68611-0
DOI
https://doi.org/10.1007/978-3-319-68612-7