
2018 | Book

Hybrid Artificial Intelligent Systems

13th International Conference, HAIS 2018, Oviedo, Spain, June 20-22, 2018, Proceedings

Edited by: Francisco Javier de Cos Juez, José Ramón Villar, Enrique A. de la Cal, Álvaro Herrero, Héctor Quintián, José António Sáez, Emilio Corchado

Publisher: Springer International Publishing

Book series: Lecture Notes in Computer Science


About this book

This volume constitutes the refereed proceedings of the 13th International Conference on Hybrid Artificial Intelligent Systems, HAIS 2018, held in Oviedo, Spain, in June 2018.
The 62 full papers published in this volume were carefully reviewed and selected from 104 submissions. They are organized in the following topical sections: neurocomputing, fuzzy systems, rough sets, evolutionary algorithms, agents and multi-agent systems, and the like.

Table of contents

Frontmatter

Data Mining, Knowledge Discovery and Big Data

Frontmatter
A Deep Learning-Based Recommendation System to Enable End User Access to Financial Linked Knowledge

Motivated by the assumption that Semantic Web technologies, especially those underlying the Linked Data paradigm, are not sufficiently exploited in the field of financial information management towards the automatic discovery and synthesis of knowledge, an architecture for a knowledge base for the financial domain in the Linked Open Data (LOD) cloud is presented in this paper. Furthermore, from the assumption that recommendation systems can be used to make consumption of the huge amounts of financial data in the LOD cloud more efficient and effective, we propose a deep learning-based hybrid recommendation system to enable end user access to the knowledge base. We implemented a prototype of a knowledge base for financial news as a proof of concept. Results from an Information Systems-oriented validation confirm our assumptions.

Luis Omar Colombo-Mendoza, José Antonio García-Díaz, Juan Miguel Gómez-Berbís, Rafael Valencia-García
On the Use of Random Discretization and Dimensionality Reduction in Ensembles for Big Data

Massive data growth in recent years has made data reduction techniques gain special popularity due to their ability to reduce this enormous amount of data, also called Big Data. Random Projection Random Discretization is an innovative ensemble method. It uses two data reduction techniques to create more informative data: its proposed Random Discretization, and Random Projections (RP). However, RP has some shortcomings that can be solved by more powerful methods such as Principal Components Analysis (PCA). Aiming to tackle this problem, we propose a new ensemble method using the Apache Spark framework and PCA for dimensionality reduction, named Random Discretization Dimensionality Reduction Ensemble. In our experiments on five Big Data datasets, we show that our proposal achieves better prediction performance than the original algorithm and Random Forest.

Diego García-Gil, Sergio Ramírez-Gallego, Salvador García, Francisco Herrera
Hybrid Deep Learning Based on GAN for Classifying BSR Noises from In-vehicle Sensors

BSR (buzz, squeak, and rattle) noises are an essential criterion for the quality of a vehicle. It is necessary to classify them so as to handle them appropriately. Although many studies have been conducted to classify noise, they suffered from some problems: the difficulty of extracting features, a small amount of data to train a classifier, and low robustness to background noise. This paper proposes a method called transferred encoder-decoder generative adversarial networks (tedGAN) which solves these problems. A deep auto-encoder (DAE) compresses and reconstructs the audio data to capture its features. The decoder network is transferred to the generator of the GAN so as to make the training of the generator more stable. Because the generator and the discriminator of the GAN are trained at the same time, the capacity for extracting features is enhanced, and the knowledge space of the data is expanded even with a small amount of data. The discriminator, which classifies whether the input is a real or a fake BSR noise, is transferred in turn to the classifier, which is finally trained to classify the BSR noises. The classifier yields an accuracy of 95.15%, which outperforms other machine learning models. We analyze the model with the t-SNE algorithm to investigate the misclassified data. The proposed model achieves an accuracy of 92.05% for data including background noise.

Jin-Young Kim, Seok-Jun Bu, Sung-Bae Cho
Inferring User Expertise from Social Tagging in Music Recommender Systems for Streaming Services

Suppliers of music streaming services are showing an increasing interest in providing users with reliable personalized recommendations, since their practically unlimited offerings make it difficult for users to find the music they like. In this work, we take advantage of the social tags that users give to music through streaming platforms to improve recommendations. Most works in the literature use the tags in the context of content-based methods for finding similarities between songs and artists, but we use them for characterizing users instead of characterizing music, aiming at improving user-based collaborative filtering algorithms. The expertise level of users is inferred from the frequency analysis of their tags by using TF-IDF (Term Frequency-Inverse Document Frequency), which is an indicator of the quantity and relevance of the tags that users provide to items. User expertise has been studied in the context of recommender systems and other domains, but, as far as we know, it has not been studied in the context of music recommendations.

Diego Sánchez-Moreno, María N. Moreno-García, Nasim Sonboli, Bamshad Mobasher, Robin Burke
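As a concrete illustration of the TF-IDF weighting described in the abstract, the sketch below treats each user's tag list as a "document" and the tags as "terms"; the function name and the tag data are hypothetical, and a production system would typically rely on a library implementation:

```python
import math

def tf_idf(user_tags):
    """Compute TF-IDF weights for each (user, tag) pair.

    user_tags: dict mapping user -> list of tags they assigned.
    Each user plays the role of a 'document'; tags are the 'terms'.
    """
    n_users = len(user_tags)
    # Document frequency: in how many users' tag sets each tag appears.
    df = {}
    for tags in user_tags.values():
        for tag in set(tags):
            df[tag] = df.get(tag, 0) + 1
    weights = {}
    for user, tags in user_tags.items():
        total = len(tags)
        for tag in set(tags):
            tf = tags.count(tag) / total
            idf = math.log(n_users / df[tag])
            weights[(user, tag)] = tf * idf
    return weights

tags = {
    "alice": ["jazz", "jazz", "blues"],
    "bob": ["jazz", "pop"],
    "carol": ["opera"],
}
w = tf_idf(tags)
```

A tag used intensively by one user but rarely across the community (such as "opera" above) receives a high weight, which is what makes it a plausible signal of focused expertise.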
Learning Logical Definitions of n-Ary Relations in Graph Databases

Given a set of facts and related background knowledge, it has always been a challenging task to learn theories that define the facts in terms of the background knowledge. In this study, we focus on graph databases and propose a method to learn definitions of n-ary relations stored in such mediums. The proposed method distinguishes itself from state-of-the-art methods as it employs hypergraphs to represent relational data and follows a substructure matching approach to discover concept descriptors. Moreover, the proposed method provides mechanisms to handle inexact substructure matching, incorporate numerical attributes into the concept discovery process, avoid the target instance ordering problem, and prevent concept descriptors from suppressing each other. Experiments conducted on two benchmark biochemical datasets show that the proposed method is capable of inducing concept descriptors that cover all the target instances and are similar to those induced by state-of-the-art methods.

Furkan Goz, Alev Mutlu
GAparsimony: An R Package for Searching Parsimonious Models by Combining Hyperparameter Optimization and Feature Selection

Nowadays, there is an increasing interest in automating KDD processes. Thanks to the increasing power and falling cost of computation devices, the search for the best features and model parameters can be solved with different meta-heuristics, so researchers can focus on other important tasks like data wrangling or feature engineering. In this contribution, the GAparsimony R package is presented. This library implements the GA-PARSIMONY methodology that has been published in previous journals and HAIS conferences. The objective of this paper is to show how to use GAparsimony to search for accurate parsimonious models by combining feature selection, hyperparameter optimization, and parsimonious model search. Therefore, this paper covers the cautions and considerations required for finding a robust parsimonious model with this package, using a regression example that can easily be adapted to another problem, database or algorithm.

F. J. Martinez-de-Pison, R. Gonzalez-Sendino, J. Ferreiro, E. Fraile, A. Pernia-Espinoza
Improving Adaptive Optics Reconstructions with a Deep Learning Approach

The use of techniques such as adaptive optics is mandatory when performing astronomical observations from ground-based telescopes, due to atmospheric turbulence effects. In recent years, artificial intelligence methods have been applied to this topic, with artificial neural networks becoming one of the reconstruction algorithms with the best performance. These algorithms are developed to work with Shack-Hartmann wavefront sensors, which measure the turbulent profiles in terms of the centroid coordinates of their subapertures, over which the algorithms calculate the correction. In this work, a Convolutional Neural Network (CNN) is presented as an alternative, based on the idea of calculating the correction with all the information recorded by the Shack-Hartmann, to avoid any possible loss of information. With the support of the Durham Adaptive optics Simulation Platform (DASP), simulations were performed for the training and subsequent testing of the networks. This new CNN reconstructor is compared with previous neural network models in tests varying the altitude of the turbulence layer and the strength of the turbulent profiles. The CNN reconstructor shows promising improvements in all the tested scenarios.

Sergio Luis Suárez Gómez, Carlos González-Gutiérrez, Enrique Díez Alonso, Jesús Daniel Santos Rodríguez, Maria Luisa Sánchez Rodríguez, Jorge Carballido Landeira, Alastair Basden, James Osborn
Complexity of Rule Sets in Mining Incomplete Data Using Characteristic Sets and Generalized Maximal Consistent Blocks

In this paper, missing attribute values in incomplete data sets have two possible interpretations: lost values and “do not care” conditions. For rule induction we use characteristic sets and generalized maximal consistent blocks; therefore, we apply four different approaches to data mining. As follows from our previous experiments, where we used the error rate evaluated by ten-fold cross validation as the main criterion of quality, no approach is universally the best. Therefore we decided to compare our four approaches using the complexity of rule sets induced from incomplete data sets. We show that the cardinality of rule sets is always smaller for incomplete data sets with “do not care” conditions. Thus the choice between interpretations of missing attribute values is more important than the choice between characteristic sets and generalized maximal consistent blocks.

Patrick G. Clark, Cheng Gao, Jerzy W. Grzymala-Busse, Teresa Mroczek, Rafal Niemiec
Optimization of the University Transportation by Contraction Hierarchies Method and Clustering Algorithms

This research work focuses on the study of different solution models from the literature that treat the optimization of vehicle routing by nodes and the optimal route for a university transport service. With the recent expansion of the facilities of a university institution, the allocation of routes for transporting its students became more complex. As a result, geographic information systems (GIS) tools and operations research methodologies, such as graph theory and vehicle routing problems, are applied to facilitate mobilization and improve the student transport service, as well as to optimize transfer time and the utilization of the available transport units. An optimal route management procedure has been implemented to maximize the level of service of student transport using the K-means clustering algorithm and the contraction hierarchies node method, at low cost due to the use of free software.

Israel D. Herrera-Granda, Leandro L. Lorente-Leyva, Diego H. Peluffo-Ordóñez, Robert M. Valencia-Chapi, Yakcleem Montero-Santos, Jorge L. Chicaiza-Vaca, Andrés E. Castro-Ospina
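The K-means step mentioned above can be sketched in a few lines; the pickup coordinates below are invented, and the deterministic first-k initialization is a simplification of the usual random seeding:

```python
def kmeans(points, k, iters=100):
    """Plain k-means on 2-D points (e.g. student pickup coordinates).
    Deterministic initialization: the first k points are the seeds."""
    centroids = list(points[:k])
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            dists = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2
                     for c in centroids]
            clusters[dists.index(min(dists))].append(p)
        # Update step: move each centroid to its cluster's mean.
        new = [(sum(p[0] for p in cl) / len(cl),
                sum(p[1] for p in cl) / len(cl)) if cl else centroids[i]
               for i, cl in enumerate(clusters)]
        if new == centroids:
            break  # converged
        centroids = new
    return centroids, clusters

pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
cents, groups = kmeans(pts, 2)
```

Each resulting cluster of stops could then be served by one transport unit, with the route inside the cluster computed via contraction hierarchies.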
Identification of Patterns in Blogosphere Considering Social Positions of Users and Reciprocity of Relations

The aim of this paper is to identify and categorize frequent patterns describing interactions between users in social networks. We consider a social network with already identified relationships between users which evolves in time. The social network is based on salon24.pl, a Polish blog website devoted to socio-political issues. It consists of bloggers and links between them, which result from the intensity and characteristic features of posting comments. In our research, we discover patterns based on frequent and fast interactions between pairs of users. The patterns are described by the characteristics of these interactions, such as their reciprocity, the relative difference between estimates of global influence in the pairs of users participating in the discussions, and the time of day of the conversation. In addition, we consider the roles of system users, determined by the number of interactions initiating discussions, their frequency and the number of strong interactions in which users are involved. We take into account how many such intense conversations individual users participate in.

Krzysztof Rudek, Jarosław Koźlak
SmartFD: A Real Big Data Application for Electrical Fraud Detection

The main objective of this paper is the application of big data analytics to a real case in the field of smart electric networks. Smart meters are not only elements to measure consumption; they also constitute a network of millions of sensors in the electricity network. These sensors provide a huge amount of data that, once analyzed, can lead to significant advances for society. In this way, tools are being developed in order to reach certain goals, such as obtaining a better consumption estimation (which would imply better production planning), finding better rates based on time discrimination or the contracted power, or minimizing the non-technical losses in the network, whose actual costs are eventually paid by end consumers, among others. In this work, real data from Spanish consumers have been analyzed to detect fraud in consumption. First, 1 TB of raw data was preprocessed in an HDFS-Spark infrastructure. Second, duplicated data and outliers were removed, and missing values were handled with specific big data algorithms. Third, customers were characterized by means of clustering techniques in different scenarios. Finally, several key factors in fraud consumption were found. Very promising results were achieved, verging on 80% accuracy.

D. Gutiérrez-Avilés, J. A. Fábregas, J. Tejedor, F. Martínez-Álvarez, A. Troncoso, A. Arcos, J. C. Riquelme
Multi-class Imbalanced Data Oversampling for Vertebral Column Pathologies Classification

Medical data mining problems are usually characterized by examples of some of the classes appearing more frequently than others. Such a learning difficulty is known as an imbalanced classification problem. This contribution analyzes the application of algorithms for tackling multi-class imbalanced classification in the field of vertebral column disease classification. Particularly, we study the effectiveness of applying a recent approach, known as Selective Oversampling for Multi-class Imbalanced Datasets (SOMCID), which is based on analyzing the structure of the classes to detect those examples in minority classes that are more interesting to oversample. Even though SOMCID has been previously applied to data belonging to different domains, its suitability for the difficult vertebral column medical data has not been analyzed until now. The results obtained show that the application of SOMCID for the detection of pathologies in the vertebral column may lead to a significant improvement over state-of-the-art approaches that do not consider the importance of the types of examples.

José A. Sáez, Héctor Quintián, Bartosz Krawczyk, Michał Woźniak, Emilio Corchado
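SOMCID selects *which* minority examples to oversample by analysing class structure; the usual baseline it improves upon is plain random oversampling, which simply duplicates minority examples until all classes reach the majority size. A minimal sketch of that baseline, with invented class labels:

```python
import random

def random_oversample(X, y, seed=0):
    """Balance a multi-class dataset by duplicating minority-class
    examples until every class matches the majority class size.
    (A generic baseline, not SOMCID itself.)"""
    rng = random.Random(seed)
    by_class = {}
    for xi, yi in zip(X, y):
        by_class.setdefault(yi, []).append(xi)
    target = max(len(rows) for rows in by_class.values())
    X_out, y_out = list(X), list(y)
    for label, rows in by_class.items():
        for _ in range(target - len(rows)):
            # Duplicate a randomly chosen example of the minority class.
            X_out.append(rng.choice(rows))
            y_out.append(label)
    return X_out, y_out

X = [[0], [1], [2], [3], [4], [5]]
y = ["normal", "normal", "normal", "normal", "hernia", "listhesis"]
Xb, yb = random_oversample(X, y)
```

After balancing, every class contributes the same number of examples to the training set, so a standard classifier no longer favors the majority class by construction.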

Bio-inspired Models and Evolutionary Computation

Frontmatter
A Hybrid Genetic-Bootstrapping Approach to Link Resources in the Web of Data

In the Web of Data, real-world entities are represented by means of resources, for instance the southern Spanish city “Seville”, which is represented by means of the resource that is available at http://es.dbpedia.org/page/Sevilla in the DBpedia dataset. Link rules are intended to link resources that are different, but represent the same real-world entities; for instance, the resource that is available at https://www.wikidata.org/wiki/Q8717 represents exactly the same real-world entity as the resource aforementioned. A link rule may establish that two resources that represent cities should be linked as long as their GPS coordinates are the same. Such rules are then paramount to integrating web data, because otherwise programs would deal with every resource independently of the others. Knowing that the previous resources represent the same real-world entity allows them to merge the information that they provide independently (which is commonly known as linked data integration). State-of-the-art link rules are learnt by genetic programming systems and build on comparing the values of the attributes of the resources. Unfortunately, this approach falls short in cases in which resources have similar values for their attributes, but represent different real-world entities. In this paper, we present a proposal that hybridises a genetic programming system that learns link rules with an ad-hoc filtering technique that bootstraps them to decide whether the links that they produce must be selected or not. Our analysis of the literature reveals that our approach is novel, and our experimental analysis confirms that it helps improve the $$F_1$$ score, which is defined in the literature as the harmonic mean of precision and recall, by increasing precision without a significant penalty on recall.

Andrea Cimmino, Rafael Corchuelo
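The $$F_1$$ score mentioned in the abstract is easy to compute from link-level counts; the figures below are invented for illustration:

```python
def f1_score(tp, fp, fn):
    """F1: the harmonic mean of precision and recall.
    In the linking setting, tp = correct links produced,
    fp = spurious links produced, fn = true links missed."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Say a link rule produced 80 correct links, 20 spurious ones,
# and missed 40 true links:
f1 = f1_score(tp=80, fp=20, fn=40)
```

Because the harmonic mean is dominated by the smaller of the two, raising precision (the paper's filtering step) lifts $$F_1$$ as long as recall does not drop sharply.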
Modelling and Forecasting of the Radiation Level Time Series at the Canfranc Underground Laboratory

The $$^{222}Rn$$ level at underground laboratories, where low-background Physics experiments are installed, is the largest source of background, and it is the main distortion for obtaining high-accuracy results. In Spain, the Canfranc Underground Laboratory hosts ground-breaking experiments, such as Argon Dark Matter-1t, aimed at direct dark matter searches. For the collaborations exploiting these experiments, the modelling and forecasting of the $$^{222}Rn$$ level are very relevant tasks for efficiently planning installation and maintenance activities. In this paper, four years of values of the $$^{222}Rn$$ level from the Canfranc Underground Laboratory are analysed using methods such as Holt-Winters, AutoRegressive Integrated Moving Averages, Seasonal and Trend Decomposition using Loess, Feed-Forward Neural Networks, and Convolutional Neural Networks. In order to evaluate the performance of these methods, both the Mean Squared Error and the Mean Absolute Error are used. Both metrics determine that the non-periodic variant of Seasonal and Trend Decomposition using Loess, and the Convolutional Neural Networks, are the techniques which obtain the best predictive results. This is the first time that the mentioned data are investigated, and they constitute an excellent example of a scientific time series with relevant implications for the quality of the scientific results of the experiments.

Iván Méndez-Jiménez, Miguel Cárdenas-Montes
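The two evaluation metrics used in the abstract are straightforward; the sketch below computes both over an invented series of radon levels and forecasts:

```python
def mse(actual, predicted):
    """Mean Squared Error: penalizes large deviations quadratically."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def mae(actual, predicted):
    """Mean Absolute Error: the average magnitude of the errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical daily 222Rn levels (Bq/m3) versus a model's forecast:
observed = [80.0, 95.0, 110.0, 90.0]
forecast = [85.0, 90.0, 100.0, 95.0]
err_mse = mse(observed, forecast)
err_mae = mae(observed, forecast)
```

Reporting both is common practice: MSE emphasizes occasional large forecast misses, while MAE reflects typical error size.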
Sensor Fault Detection and Recovery Methodology for a Geothermal Heat Exchanger

This research addresses a sensor fault detection and recovery methodology oriented to a real system, such as a geothermal heat exchanger installed as part of the heat pump installation in a bioclimatic house. The main aim is to establish a procedure to detect an anomaly in a sensor and recover the value when it occurs. To that end, some experiments applying a Multi-layer Perceptron (MLP) regressor as the modelling technique have been made, with satisfactory results in general terms. The correct choice of the input variables is critical to obtain a robust model, especially those features based on the sensor values of the previous state.

Héctor Alaiz-Moretón, José Luis Casteleiro-Roca, Laura Fernández Robles, Esteban Jove, Manuel Castejón-Limas, José Luis Calvo-Rolle
Distinctive Features of Asymmetric Neural Networks with Gabor Filters

Clarifying the mechanism of visual motion detection in the visual system is important, and it is also useful for robotic systems. The prominent features are nonlinear characteristics, such as squaring and rectification functions, which are observed in the retinal and visual cortex networks. Conventional models for motion processing use symmetric quadrature functions with Gabor filters. This paper proposes a new motion-processing model based on asymmetric networks. To analyze the behavior of the asymmetric nonlinear network, white noise analysis and Wiener kernels are applied. It is shown that the biological asymmetric network with nonlinearities is effective for generating directional movement from the network computations. Further, responses to complex stimuli and the frequency characteristics are computed in the asymmetric networks, which are not derived for the conventional energy model.

Naohiro Ishii, Toshinori Deguchi, Masashi Kawaguchi, Hiroshi Sasaki
Tuning CNN Input Layout for IDS with Genetic Algorithms

Intrusion Detection Systems (IDS) are implemented by service providers and network operators to monitor and detect attacks. Many machine learning algorithms, stand-alone or combined, have been proposed, including different types of Artificial Neural Networks (ANN). This work evaluates a Convolutional Neural Network (CNN), created for image classification, as an IDS that can be deployed in a router, which has not been evaluated previously. The layout of the features in the input matrix of the CNN is relevant. A Genetic Algorithm (GA) is used to find a high-quality solution by rearranging the layout of the input features, reducing the features if required. The GA improves the capacity of intrusion detection from 0.71 to 0.77 for normalized input features, similar to existing algorithms. For scenarios where data normalization is not possible, many input layouts are useless. The GA finds a solution with an intrusion detection capacity of 0.73.

Roberto Blanco, Juan J. Cilla, Pedro Malagón, Ignacio Penas, José M. Moya
Improving the Accuracy of Prediction Applications by Efficient Tuning of Gradient Descent Using Genetic Algorithms

Gradient Descent is an algorithm widely used by Machine Learning methods, such as Recommender Systems based on Collaborative Filtering. It tries to find the optimal values of some parameters in order to minimize a particular cost function. In our research case, we consider Matrix Factorization as an application of Gradient Descent, where the optimal values of two matrices must be calculated to minimize the Root Mean Squared Error criterion, given a particular training dataset. However, there are two important parameters in Gradient Descent, both constant real numbers, whose values are set without any strict rule and have a certain influence on the algorithm's accuracy: the learning rate and the regularization factor. In this work we apply an evolutionary metaheuristic to find the optimal values of these two parameters. To that end, we consider as experimental framework the Student Performance Prediction problem, tackled as a Recommender System with training and test datasets extracted from real cases. After performing a direct search for the optimal values, we apply a Genetic Algorithm, obtaining better Gradient Descent accuracy with less computational effort.

Arturo Duran-Dominguez, Juan A. Gomez-Pulido, David Rodriguez-Lozano
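To make the two tuned constants concrete, the sketch below trains a tiny matrix factorization with stochastic gradient descent; the learning rate (`lr`), regularization factor (`reg`), rating triples and dimensions are all invented for illustration:

```python
import random

def mf_sgd(ratings, n_users, n_items, k=2, lr=0.02, reg=0.02,
           epochs=1000, seed=0):
    """Matrix factorization fitted with stochastic gradient descent.
    `lr` (learning rate) and `reg` (regularization factor) are the two
    constants the paper tunes with a genetic algorithm."""
    rng = random.Random(seed)
    P = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(P[u][f] * Q[i][f] for f in range(k))
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step with L2 regularization on both factors.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

def rmse(ratings, P, Q):
    """Root Mean Squared Error over (user, item, rating) triples."""
    se = sum((r - sum(pu * qi for pu, qi in zip(P[u], Q[i]))) ** 2
             for u, i, r in ratings)
    return (se / len(ratings)) ** 0.5

# Invented (user, item, rating) triples:
data = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (1, 2, 1.0), (2, 1, 4.0)]
P, Q = mf_sgd(data, n_users=3, n_items=3)
```

An overly large `lr` makes training diverge and an overly large `reg` over-shrinks the factors, which is exactly why the paper searches for both values rather than fixing them by rule of thumb.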
Mono-modal Medical Image Registration with Coral Reef Optimization

Image registration (IR) involves the transformation of different sets of image data having a shared content into a common coordinate system. To achieve this goal, the search for the optimal correspondence is usually treated as an optimization problem. The limitations of traditional IR methods have boosted the application of metaheuristic-based approaches to solve the problem while improving performance. In this contribution, we consider a recent bio-inspired method: the Coral Reef Optimization Algorithm (CRO). This novel algorithm simulates the natural phenomena underlying a coral reef. We adapt the algorithm following two different approaches, feature-based and intensity-based designs, and perform a thorough experimental study on a medical IR problem considering similarity transformations. The results show that CRO outperforms the state-of-the-art in terms of robustness, accuracy, and efficiency considering both approaches.

E. Bermejo, M. Chica, S. Damas, S. Salcedo-Sanz, O. Cordón
Evaluating Feature Selection Robustness on High-Dimensional Data

With the explosive growth of high-dimensional data, feature selection has become a crucial step of machine learning tasks. Though most of the available works focus on devising selection strategies that are effective in identifying small subsets of predictive features, recent research has also highlighted the importance of investigating the robustness of the selection process with respect to sample variation. In presence of a high number of features, indeed, the selection outcome can be very sensitive to any perturbations in the set of training records, which limits the interpretability of the results and their subsequent exploitation in real-world applications. This study aims to provide more insight about this critical issue by analysing the robustness of some state-of-the-art selection methods, for different levels of data perturbation and different cardinalities of the selected feature subsets. Furthermore, we explore the extent to which the adoption of an ensemble selection strategy can make these algorithms more robust, without compromising their predictive performance. The results on five high-dimensional datasets, which are representatives of different domains, are presented and discussed.

Barbara Pes
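A common way to quantify the robustness discussed above is the average pairwise Jaccard similarity between the feature subsets selected on perturbed versions of the training data; the subsets below are invented:

```python
from itertools import combinations

def stability(subsets):
    """Robustness of a feature selector: average pairwise Jaccard
    similarity between the feature subsets selected on perturbed
    training samples (1.0 = identical subsets every time)."""
    pairs = list(combinations(subsets, 2))
    total = 0.0
    for a, b in pairs:
        a, b = set(a), set(b)
        total += len(a & b) / len(a | b)
    return total / len(pairs)

# Subsets selected on three perturbed samples of the same data:
runs = [{"f1", "f2", "f3"}, {"f1", "f2", "f4"}, {"f1", "f2", "f3"}]
s = stability(runs)
```

Values close to 1 indicate a selector whose output can be interpreted with confidence; in high-dimensional settings this score often degrades sharply even when predictive performance does not.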

Learning Algorithms

Frontmatter
Generalized Probability Distribution Mixture Model for Clustering

The Gaussian Mixture Model is a popular clustering method based on modelling the data through a set of Gaussian probability distributions. This method is able to correctly identify non-spheroidal, overlapping or non-overlapping clusters with different sizes and densities. Due to its performance, it has achieved high popularity among practitioners. In this work, the first efforts toward extending Gaussian Mixture Models to mixtures of other probability distributions are presented. At this point, this includes the use of diverse probability distributions, such as the Gaussian, Exponential, Weibull, and Student's t distributions. Instead of the Expectation-Maximization algorithm used to optimize the parameters of the Gaussian mixture, the parameters of the mixture of diverse probability distributions are optimized using the Coral Reef Optimizer meta-heuristic.

David Crespo-Roces, Iván Méndez-Jiménez, Sancho Salcedo-Sanz, Miguel Cárdenas-Montes
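For contrast with the Coral Reef Optimizer approach, the classical Expectation-Maximization loop for a two-component 1-D Gaussian mixture looks as follows; the data are invented and the initialization is deliberately crude:

```python
import math

def em_gmm_1d(xs, iters=50):
    """EM for a two-component 1-D Gaussian mixture: the classical
    fitting procedure that the paper replaces with a meta-heuristic
    when the component distributions are not all Gaussian."""
    xs = sorted(xs)
    half = len(xs) // 2
    # Crude initialization: split the sorted data in half.
    mu = [sum(xs[:half]) / half, sum(xs[half:]) / (len(xs) - half)]
    var = [1.0, 1.0]
    pi = [0.5, 0.5]
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            p = [pi[k] / math.sqrt(2 * math.pi * var[k])
                 * math.exp(-(x - mu[k]) ** 2 / (2 * var[k]))
                 for k in range(2)]
            s = sum(p)
            resp.append([pk / s for pk in p])
        # M-step: re-estimate weights, means and variances.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            pi[k] = nk / len(xs)
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var[k] = sum(r[k] * (x - mu[k]) ** 2
                         for r, x in zip(resp, xs)) / nk
            var[k] = max(var[k], 1e-6)  # guard against variance collapse
    return pi, mu, var

data = [0.9, 1.0, 1.1, 4.9, 5.0, 5.1]
pi, mu, var = em_gmm_1d(data)
```

EM relies on closed-form M-step updates, which exist for the Gaussian case; for Weibull or Student's t components no such closed forms are available in general, which is one motivation for switching to a general-purpose optimizer.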
A Hybrid Approach to Mining Conditions

Text mining pursues producing valuable information from natural-language text. Conditions cannot be neglected because ignoring them may easily lead to misinterpretations. There are naive proposals to mine conditions that rely on user-defined patterns, which fall short; there is only one machine-learning proposal, but it requires specific-purpose dictionaries, taxonomies, and heuristics, it works on opinion sentences only, and it was evaluated very shallowly. We present a novel hybrid approach that relies on computational linguistics and deep learning; our experiments prove that it is more effective than current proposals in terms of $$F_1$$ score and does not have their drawbacks.

Fernando O. Gallego, Rafael Corchuelo
A First Attempt on Monotonic Training Set Selection

Monotonicity constraints frequently appear in real-life problems. Many of the monotonic classifiers used in these cases require that the input data satisfy the monotonicity restrictions. This contribution proposes the use of training set selection to choose the most representative instances, which improves monotonic classifiers' performance while fulfilling the monotonicity constraints. We have conducted an experiment on 30 data sets to demonstrate the benefits of our proposal.

J.-R. Cano, S. García
Dealing with Missing Data and Uncertainty in the Context of Data Mining

Missing data is an issue in many real-world datasets, yet robust methods for dealing with it appropriately still need development. In this paper we investigate how some methods for handling missing data perform when uncertainty increases. Using benchmark datasets from the UCI Machine Learning repository, we generate datasets for our experimentation with increasing amounts of data Missing Completely At Random (MCAR), both at the attribute level and at the record level. We then apply four classification algorithms: C4.5, Random Forest, Naïve Bayes and Support Vector Machines (SVMs). We measure the performance of each classifier on the basis of complete case analysis and simple imputation, and then we study the performance of the algorithms that can handle missing data. We find that complete case analysis has a detrimental effect because it renders many datasets infeasible when missing data increases, particularly for high-dimensional data. We find that increasing missing data does have a negative effect on the performance of all the algorithms tested, but the algorithms tested, whether using preprocessing in the form of simple imputation or handling the missing data directly, do not show a significant difference in performance.

Aliya Aleryani, Wenjia Wang, Beatriz De La Iglesia
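Simple (mean) imputation, one of the preprocessing strategies compared in the abstract, can be sketched as follows; the data and the use of `None` as the missing-value marker are illustrative:

```python
def mean_impute(rows, missing=None):
    """Simple imputation: replace each missing value with the mean of
    the observed values in that column. A baseline preprocessing step
    for classifiers that cannot handle missing data natively."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not missing]
        means.append(sum(observed) / len(observed))
    return [[means[j] if r[j] is missing else r[j]
             for j in range(n_cols)] for r in rows]

# A toy numeric table with two missing cells:
data = [[1.0, 2.0], [None, 4.0], [3.0, None]]
filled = mean_impute(data)
```

Unlike complete case analysis, which would discard two of the three rows here, imputation keeps every record, which is precisely why it degrades more gracefully as the missing rate grows.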
A Preliminary Study of Diversity in Extreme Learning Machines Ensembles

In this paper, the neural-network version of the Extreme Learning Machine (ELM) is used as a base learner for an ensemble meta-algorithm which promotes diversity explicitly in the ELM loss function. The proposed cost function encourages orthogonality (scalar product) in the parameter space. Other ensemble-based meta-algorithms from the AdaBoost family are used for comparison purposes. Both the accuracy and the diversity of our proposal are competitive, thus reinforcing the idea of introducing diversity explicitly.

Carlos Perales-González, Mariano Carbonero-Ruz, David Becerra-Alonso, Francisco Fernández-Navarro
Orthogonal Learning Firefly Algorithm

In this paper, a proven technique, orthogonal learning, is combined with the popular swarm metaheuristic Firefly Algorithm (FA), more precisely with its hybrid modification, Firefly Particle Swarm Optimization (FFPSO). The performance of the developed algorithm is tested and compared with the canonical FA and the above-mentioned FFPSO. Comparisons have been conducted on the well-known CEC 2017 benchmark functions, and the results have been evaluated for statistical significance using the Friedman rank test.

Kadavy Tomas, Pluhacek Michal, Viktorin Adam, Senkerik Roman
Multi-label Learning by Hyperparameters Calibration for Treating Class Imbalance

Multi-label learning has become an increasingly active area in the machine learning community due to a wide variety of real-world problems. However, only over the past few years has class balancing for this kind of problem become a topic of interest. In this paper, we present a novel method named hyperparameter calibration to treat class imbalance in multi-label problems. To this aim, we develop an extensive analysis over four real-world databases and two of our own synthetic databases exhibiting different imbalance ratios. The empirical analysis shows that the proposed method is able to improve classification performance when it is combined with three of the most widely used strategies for treating multi-label classification problems.

Andrés Felipe Giraldo-Forero, Andrés Felipe Cardona-Escobar, Andrés Eduardo Castro-Ospina
Drifted Data Stream Clustering Based on ClusTree Algorithm

Correct recognition of possible changes in data streams, called concept drifts, plays a crucial role in constructing an appropriate model learning strategy. This paper focuses on the unsupervised learning model for non-stationary data streams, where two significant modifications of the ClusTree algorithm are presented. They allow the clustering model to be adapted to the changes caused by a concept drift. An experimental study conducted on a set of benchmark data streams proves the usefulness of the proposed solutions.

Jakub Zgraja, Michał Woźniak
Featuring the Attributes in Supervised Machine Learning

This paper introduces an approach to feature subset selection which is able to characterise the attributes of a supervised machine learning problem into two categories: essential and important features. Additionally, the fusion of both kinds of features improves the prediction task, for which measures such as accuracy and the Receiver Operating Characteristic (ROC) curve are reported. The test-bed is composed of eight binary and multi-class classification problems with up to five hundred attributes. Several classification algorithms, such as Ridor, PART, C4.5 and NBTree, have been tested to assess the proposal.

Antonio J. Tallón-Ballesteros, Luís Correia, Bing Xue
Applying VorEAl for IoT Intrusion Detection

Smart connected devices create what has been denominated the Internet of Things (IoT). The combined and cohesive use of these devices prompts the emergence of Ambient Intelligence (AmI). One of the current key issues in the IoT domain has to do with the detection and prevention of security breaches and intrusions. In this paper, we introduce the use of the Voronoi diagram-based Evolutionary Algorithm (VorEAl) in the context of IoT intrusion detection. In order to cope with the dimensions of the problem, we propose a modification of VorEAl that approximates the volume computation with a heuristic surrogate. The proxy has linear complexity and is, therefore, highly scalable. The experimental studies carried out as part of the paper show that our approach is able to outperform other approaches that have previously been used to address the problem of interest.

Nayat Sanchez-Pi, Luis Martí, José M. Molina

Visual Analysis and Advanced Data Processing Techniques

Frontmatter
Evaluation of a Wrist-Based Wearable Fall Detection Method

Fall detection represents an important issue when dealing with Ambient Assisted Living for the elderly. The vast majority of fall detection approaches have been developed for healthy and relatively young people. Moreover, plenty of these approaches make use of sensors placed on the hip. Considering the target population of elderly people, there are clear differences and constraints. On the one hand, the patterns and times of normal activities -and also of falls- are different from those of younger people: elders move more slowly. On the other hand, solutions using uncomfortable sensory systems would be rejected by many candidates. In this research, one of the solutions proposed in the literature has been adapted to use a smartwatch on the wrist, solving some problems and modifying part of the algorithm. The experimentation includes a publicly available dataset. Results point to several enhancements needed to adapt the method to the target population.

Samad Barri Khojasteh, José R. Villar, Enrique de la Cal, Víctor M. González, Javier Sedano, Harun Reşit Yazgan
EnerVMAS: Virtual Agent Organizations to Optimize Energy Consumption Using Intelligent Temperature Calibration

One of the problems encountered when optimizing household energy consumption is how to reduce the consumption of air conditioning systems without reducing the comfort level of the residents. The systems proposed so far do not succeed at optimizing the electricity consumed by heating and air conditioning systems because they do not monitor all the variables involved in this process, often leaving users’ comfort aside. It is therefore necessary to develop a solution that monitors the factors which contribute to greater energy consumption. Such a solution must have a self-adaptive architecture with the capacity for self-organization, which allows it to adapt to changes in user temperature preferences. The most suitable methodology for the development of such a solution is virtual agent organizations, which allow for the management of wireless sensor networks (WSN) and the use of Case-Based Reasoning (CBR) for predicting the presence of people at home. This work presents an energy optimization system based on virtual agent organizations (VO-MAS) that obtains the characteristics of the environment through sensors and user behavior patterns through a CBR system. A case study was carried out in order to evaluate the performance of the proposed system; the results show that 22.8% energy savings were achieved.

Alfonso González-Briones, Javier Prieto, Juan M. Corchado, Yves Demazeau
Tool Wear Estimation and Visualization Using Image Sensors in Micro Milling Manufacturing

This paper presents a reliable machine vision system to automatically estimate and visualize tool wear in micro milling manufacturing. The estimation of tool wear is very important for tool monitoring systems, and image sensors constitute a cheap and reliable solution. This system provides information to decide whether a tool should be replaced, so the quality of the machined piece is ensured and the tool does not collapse. In the method that we propose, we first delimit the area of interest of the micro milling tool and then delimit the worn area. The worn area is visualized and estimated, while errors are computed against the ground truth proposed by experts. The method is mainly based on morphological operations and the k-means algorithm. Other approaches, based on pure morphological operations and on Otsu multi-threshold algorithms, were also tested. The obtained result (a harmonic mean of precision and recall of 90.24 (±2.78)%) shows that the machine vision system we present is effective and suitable for the estimation and visualization of tool wear in micro milling machines and ready to be installed in an on-line system.
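
Intensity-based k-means segmentation, one of the building blocks mentioned above, can be illustrated on a toy image. This is a generic sketch (synthetic data, hypothetical intensity values), not the authors' pipeline: pixel intensities are clustered with k-means and the brightest cluster is taken as the worn region.

```python
import numpy as np
from sklearn.cluster import KMeans

def segment_intensity(img, k=3, seed=0):
    """Cluster pixel intensities with k-means and return a label image.
    Clusters are reordered by centroid so label 0 is the darkest and
    label k-1 the brightest."""
    pix = img.reshape(-1, 1).astype(float)
    km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(pix)
    order = np.argsort(km.cluster_centers_.ravel())
    remap = np.empty(k, dtype=int)
    remap[order] = np.arange(k)
    return remap[km.labels_].reshape(img.shape)

# synthetic "tool" image: dark background, grey tool body, bright worn spot
img = np.full((40, 40), 20.0)
img[10:30, 10:30] = 120.0        # tool body
img[15:20, 15:20] = 230.0        # worn area (5 x 5 pixels)
labels = segment_intensity(img, k=3)
worn_pixels = int((labels == 2).sum())   # area of the brightest cluster
```

On real micrographs the clustering would be preceded by the morphological delimitation of the region of interest described in the abstract.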

Laura Fernández-Robles, Noelia Charro, Lidia Sánchez-González, Hilde Pérez, Manuel Castejón-Limas, Javier Alfonso-Cendón
Compensating Atmospheric Turbulence with Convolutional Neural Networks for Defocused Pupil Image Wave-Front Sensors

Adaptive optics comprises techniques used for improving the spatial resolution of astronomical images taken from large ground-based telescopes. In this work, computational results are presented for a modified curvature sensor, the Tomographic Pupil Image Wave-front Sensor (TPI-WFS), which measures the turbulence of the atmosphere, expressed in terms of an expansion over Zernike polynomials. Convolutional Neural Networks (CNN) are presented as an alternative to the TPI-WFS reconstruction. This technique is a machine learning model of the family of artificial neural networks, which are widely known for their performance as modeling and prediction techniques in complex systems. Results obtained from the reconstruction of the networks are compared with the TPI-WFS reconstruction by estimating errors and optical measurements (root mean square error, mean structural similarity and Strehl ratio). Two different scenarios are set, attending to different resolutions for the reconstruction. The reconstructed wave-fronts from both techniques are compared for wave-fronts of 25 Zernike modes and 153 Zernike modes. In general, the CNN trained as reconstructor showed better performance than the TPI-WFS reconstruction for most of the turbulence profiles, but the most significant improvements were found for the stronger turbulence profiles, which have the lowest r0 values.

Sergio Luis Suárez Gómez, Carlos González-Gutiérrez, Enrique Díez Alonso, Jesús Daniel Santos Rodríguez, Laura Bonavera, Juan José Fernández Valdivia, José Manuel Rodríguez Ramos, Luis Fernando Rodríguez Ramos
Using Nonlinear Quantile Regression for the Estimation of Software Cost

Estimation of effort costs is an important task for the management of software development projects. Researchers have followed two approaches –namely, statistical/machine-learning and theory-based– which explicitly rely on mean/median regression lines in order to model the relationship between software size and effort. Those approaches share a common drawback deriving from their inability to properly incorporate risk attitudes in the presence of heteroskedasticity. We propose a more flexible quantile regression approach that enables risk aversion to be incorporated in a systematic way, with the higher order conditional quantiles of the relationship between project size and effort being used to represent more risk-averse decision makers. A cubic quantile regression model allows consideration of economies/diseconomies of scale. The method is illustrated with an empirical application to a database of real projects. Results suggest that the shapes of higher order regression quantiles may sharply differ from that of the conditional median, revealing that the naive expedient of translating or multiplying some average norm (adding a safety margin to median estimates or including a multiplicative correction factor) is a potentially biased way to consider risk aversion. The proposed approach enables a more realistic analysis, adapted to the specificities of software development databases.
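
The key ingredient of quantile regression is the pinball (quantile) loss, which is minimised by the q-th conditional quantile rather than the mean. The sketch below is a minimal numpy stand-in for the paper's cubic quantile regression (not the authors' implementation): it fits a polynomial quantile by subgradient descent on the pinball loss; all function names and hyperparameters are hypothetical.

```python
import numpy as np

def pinball_loss(y_true, y_pred, q):
    """Quantile (pinball) loss: asymmetric absolute error whose
    minimiser over a constant is the empirical q-th quantile."""
    e = y_true - y_pred
    return np.mean(np.maximum(q * e, (q - 1) * e))

def fit_quantile_poly(x, y, q, degree=3, iters=5000, lr=0.05):
    """Fit a degree-`degree` polynomial conditional quantile by
    subgradient descent on the pinball loss (features standardised
    for numerical stability)."""
    X = np.vander((x - x.mean()) / x.std(), degree + 1)
    w = np.zeros(degree + 1)
    for _ in range(iters):
        e = y - X @ w
        grad = -X.T @ np.where(e > 0, q, q - 1) / len(y)
        w -= lr * grad
    return X, w
```

With `degree=3` and `q` near 0.9, the fitted curve represents a risk-averse planner; with `q=0.5` it reduces to median regression.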

J. De Andrés, M. Landajo, P. Lorca
A Distributed Drone-Oriented Architecture for In-Flight Object Detection

Drones are increasingly being used to support inspection tasks in many industrial sectors and civil applications. The procedure is usually completed off-line by the final user, once the flight mission has terminated and the video stream and accompanying data gathered by the drone have been examined. The procedure can be improved with real-time operation and automated object detection features. With this purpose, this paper describes a cloud-based architecture which enables real-time video streaming and bundled object detection in a remote control center, taking advantage of the availability of high-speed cellular networks for communications. The architecture, which is ready to handle different types of drones, is instantiated for a specific use case, the inspection of a telecommunication tower. For this use case, the specific object detection strategy is detailed. Results show that the approach is viable and makes it possible to redesign traditional drone inspection procedures, as a step forward between manual operation and full automation.

Diego Vaquero-Melchor, Iván Campaña, Ana M. Bernardos, Luca Bergesio, Juan A. Besada
3D Gabor Filters for Chest Segmentation in DCE-MRI

Computer aided applications in Dynamic contrast enhanced magnetic resonance imaging (DCE-MRI) are increasingly gaining attention as important tools to assess the risk of breast cancer. Chest wall detection and whole breast segmentation require effective solutions to increase the potential benefits of computer aided tools for tumor detection. Here we propose a 3D extension of Gabor filtering for the detection of wall-like regions in medical imaging, and prove its effectiveness in chest-wall detection.
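
A 3D Gabor kernel is a Gaussian envelope modulated by a plane wave, so planar ("wall-like") structures orthogonal to the wave direction respond strongly. The snippet below is a generic illustrative construction of the real part of such a kernel (parameter values and the function name are hypothetical), not the filter bank used in the paper.

```python
import numpy as np

def gabor_3d(size=9, sigma=2.0, freq=0.25, direction=(1.0, 0.0, 0.0)):
    """Real part of a 3-D Gabor kernel: an isotropic Gaussian envelope
    modulated by a cosine plane wave of spatial frequency `freq`
    propagating along `direction`."""
    r = np.arange(size) - size // 2
    z, y, x = np.meshgrid(r, r, r, indexing="ij")
    d = np.asarray(direction, float)
    d /= np.linalg.norm(d)
    phase = 2 * np.pi * freq * (d[0] * z + d[1] * y + d[2] * x)
    envelope = np.exp(-(x**2 + y**2 + z**2) / (2 * sigma**2))
    return envelope * np.cos(phase)

kernel = gabor_3d()
```

Convolving a DCE-MRI volume with a bank of such kernels at several orientations would highlight candidate chest-wall voxels along each direction.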

I. A. Illan, J. Perez Matos, J. Ramirez, J. M. Gorriz, S. Foo, A. Meyer-Baese
Fingertips Segmentation of Thermal Images and Its Potential Use in Hand Thermoregulation Analysis

Thermoregulation refers to the physiological processes that keep body temperatures stable. Infrared thermography is a non-invasive technique useful for visualizing these temperatures. Previous works suggest it is important to analyze thermoregulation in peripheral regions, such as the fingertips, because some disabling pathologies particularly affect the thermoregulation of these regions. This work proposes an algorithm for fingertip segmentation in thermal images of the hand. Using a supervised index, the results are compared against segmentations provided by humans. The results are outstanding even when the analyzed images are highly resized.

A. E. Castro-Ospina, A. M. Correa-Mira, I. D. Herrera-Granda, D. H. Peluffo-Ordóñez, H. A. Fandiño-Toro

Data Mining Applications

Frontmatter
Listen to This: Music Recommendation Based on One-Class Support Vector Machine

Streaming services are here to stay. In recent years we have witnessed their consolidation and success, manifested in their exponential growth, while the sale of songs/albums in physical or digital format has declined. An important part of these services are recommendation systems, which facilitate the exploration of content by users. This article proposes a content-based approach, using the One-Class Support Vector Machine classification algorithm as an anomaly detector. The aim is to generate a playlist that adapts to the user’s tastes, incorporating new releases. The model is capable of detecting elements that belong to the profile of the user’s tastes with great accuracy, facilitating the implementation of an Android mobile application that scans and detects changes in user preferences. This makes it possible not only to manage the recommended playlist, but also to periodically incorporate new songs into the profile from the list of new music.
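
The anomaly-detector framing can be sketched with scikit-learn's `OneClassSVM`: train only on the user's library and treat new releases that fall inside the learned region as matches. This is a minimal sketch with made-up two-dimensional "audio features" (e.g. tempo and energy), not the paper's feature set or model configuration.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
# hypothetical audio features (tempo in BPM, energy in [0, 1]) of liked songs
liked = rng.normal(loc=[120.0, 0.8], scale=[10.0, 0.05], size=(300, 2))

# standardise features, then learn the "profile" of the user's taste
mu, sd = liked.mean(axis=0), liked.std(axis=0)
model = OneClassSVM(nu=0.1, kernel="rbf", gamma="scale").fit((liked - mu) / sd)

new_releases = np.array([[118.0, 0.78],    # close to the taste profile
                         [60.0, 0.20]])    # far from it
pred = model.predict((new_releases - mu) / sd)   # +1 = fits taste, -1 = anomaly
```

Songs predicted `+1` would be candidates for the generated playlist; `nu` roughly bounds the fraction of the user's own library treated as outliers.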

Fabio A. Yepes, Vivian F. López, Javier Pérez-Marcos, Ana B. Gil, Gabriel Villarrubia
Improving Forecasting Using Information Fusion in Local Agricultural Markets

This research explores the capacity of Information Fusion to extract knowledge about associations among agricultural products, which allows predicting future consumption in local markets in the Andean region of Ecuador. This commercial activity is performed using Alternative Marketing Circuits (CIALCO), seeking to establish a direct relationship between producer and consumer prices and to promote buying and selling among family groups. The results show that fusing information from heterogeneous, spatially located data sources makes it possible to establish the best association rules among them (several products on several local markets) and to achieve significant improvements in time-forecasting and spatial prediction accuracy for future sales of agricultural products.
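
The association rules mentioned above are typically scored by support and confidence. The toy miner below illustrates that computation for pairwise rules over market baskets; the products and thresholds are invented for the example, and this is not the paper's mining procedure.

```python
from collections import Counter
from itertools import combinations

def association_rules(baskets, min_support=0.3, min_conf=0.6):
    """Tiny pairwise association miner: returns rules (a, b, support, conf)
    with support(a, b) >= min_support and confidence(a -> b) >= min_conf."""
    n = len(baskets)
    item_cnt, pair_cnt = Counter(), Counter()
    for basket in baskets:
        items = set(basket)
        item_cnt.update(items)
        pair_cnt.update(frozenset(p) for p in combinations(sorted(items), 2))
    rules = []
    for pair, cnt in pair_cnt.items():
        if cnt / n < min_support:
            continue
        for a in pair:
            (b,) = pair - {a}
            conf = cnt / item_cnt[a]      # P(b in basket | a in basket)
            if conf >= min_conf:
                rules.append((a, b, cnt / n, conf))
    return rules

# hypothetical purchase baskets from several local markets
baskets = [{"potato", "maize"}, {"potato", "maize", "quinoa"},
           {"potato", "quinoa"}, {"maize"}, {"potato", "maize"}]
rules = association_rules(baskets)
```

Rules such as "quinoa implies potato" with high confidence are the kind of fused knowledge that can then feed the forecasting stage.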

Washington R. Padilla, Jesús García, José M. Molina
Chord Progressions Selection Based on Song Audio Features

A chord progression is an essential building block in music. In the field of music theory, it is usually assumed that these progressions influence the mood, emotion, genre or other critical aspects of a song, and also the perception they will cause in the listener. Therefore, it is natural to think that the musical and audio features of a track should be related to its chord progressions. Choosing these progressions carefully when creating a new song is a fundamental aspect, depending on the feelings we want to evoke in the listener. Also, two songs can be considered alike, or classified into the same emotions or genres, if they use the same chord progressions. Many music classification studies are presented nowadays, but none of them take chord progressions into account, probably due to the lack of this kind of data. In this paper, classification algorithms are used to illustrate the influence of a song’s features when picking chord progressions to create a new song.

Noelia Rico, Irene Díaz
LearnSec: A Framework for Full Text Analysis

Large corpora of scientific research papers have been available for a long time. However, most of those corpora store only the title and the abstract of each paper. For some domains this information may not be enough to achieve high performance in text mining tasks. This problem has recently been reduced by the growing availability of full text scientific research papers. A full text version provides more detailed information but, on the other hand, a large amount of data needs to be processed. A priori, it is difficult to know if the extra work of full text analysis has a significant impact on the performance of text mining tasks, or if the effect depends on the scientific domain or the specific corpus under analysis. The goal of this paper is to present a framework for full text analysis, called LearnSec, which incorporates domain-specific knowledge and information about the content of the document sections to improve the classification process with propositional and relational learning. To demonstrate the usefulness of the tool, we process a scientific corpus based on OHSUMED, generating an attribute/value dataset in Weka format and a First Order Logic dataset in Inductive Logic Programming (ILP) format. Results show a successful assessment of the framework.

Carlos Gonçalves, E. L. Iglesias, L. Borrajo, Rui Camacho, A. Seara Vieira, Célia Talma Gonçalves
A Mood Analysis on Youtube Comments and a Method for Improved Social Spam Detection

As Online Social Network (OSN) usage increases, so do non-legitimate campaigns over these types of web services. This is the reason why a significant number of users are affected by social spam every day and, therefore, their privacy is threatened. To deal with this issue, in this study we focus on mood analysis among all content-based analysis techniques. We demonstrate that using this technique improves social spam filtering results. First, the best spam filtering classifiers are identified using a labeled dataset consisting of Youtube comments, including spam. Then, a new dataset is created by adding the mood feature to each comment, and the best classifiers are applied to it. A comparison between the results obtained with and without mood information shows that this feature can help to improve social spam filtering results: the best accuracy is improved in two different datasets, and the number of false positives is reduced by 13.76% and 11.41% on average. Moreover, the results are validated by carrying out the same experiment on a different dataset.

Enaitz Ezpeleta, Mikel Iturbe, Iñaki Garitano, Iñaki Velez de Mendizabal, Urko Zurutuza
An Improved Comfort Biased Smart Home Load Manager for Grid Connected Homes Under Direct Load Control

This paper presents an improved comfort biased smart home load manager (iCBSHLM) for grid connected residential houses. The proposed algorithm discriminates household loads into class 1 (air-conditioner, heating) and class 2 loads (dishwasher, clothes washer and clothes dryer) and achieves electricity consumption and electricity cost reductions of up to 2.9% and 7.5% respectively using dynamic pricing (Price1) over time-of-use pricing (Price0), while ensuring that the indoor temperature is kept within the user-prescribed range without any violation. iCBSHLM advances existing home energy management systems (HEMs) by ensuring that vulnerable household residents (especially the elderly) can still benefit from smart grid initiatives like HEMs without any discomfort. Furthermore, this research presents a simplified model for heating, ventilation and air conditioning (HVAC) loads using capacitor charging/discharging behaviour.
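
A capacitor-style HVAC model means the indoor temperature relaxes exponentially towards an equilibrium, like an RC circuit charging or discharging. The sketch below is a generic first-order illustration of that idea (all time constants and gains are invented), not the paper's calibrated model.

```python
def simulate_indoor_temp(t_in, t_out, hvac_on, dt=60.0,
                         tau=3600.0, k_hvac=-4.0 / 3600.0):
    """First-order RC ("capacitor charging/discharging") thermal model:
        dT/dt = (T_out - T_in) / tau + k_hvac * hvac_on
    where tau is the building's thermal time constant and k_hvac the
    cooling rate of the air conditioner. Returns the temperature trace
    for a given on/off schedule, integrated with explicit Euler steps."""
    trace = [t_in]
    for on in hvac_on:
        t_in += dt * ((t_out - t_in) / tau + k_hvac * on)
        trace.append(t_in)
    return trace

# cooling scenario: outdoor 30 C, start at 28 C, AC on for one hour (60 x 60 s)
trace = simulate_indoor_temp(28.0, 30.0, [1] * 60)
```

A load manager can run such a model forward to check that a proposed on/off schedule keeps the temperature inside the user-prescribed comfort band before committing to it.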

Chukwuka G. Monyei, Serestina Viriri
Remifentanil Dose Prediction for Patients During General Anesthesia

In the anesthesia field there are challenges such as developing new control methods and, of course, reducing the pain suffered by patients during surgeries. The first steps in this field were focused on obtaining representative measurements of pain. Nowadays, one of the most promising indices is the ANI (Antinociception Index). This research work deals with a model for remifentanil dose prediction for patients undergoing general anesthesia. To do so, a hybrid model based on intelligent techniques is implemented. The model was trained using Support Vector Regression (SVR) and Artificial Neural Network (ANN) algorithms. Results were validated with a real dataset of patients, confirming the successful performance of the model.

Esteban Jove, Jose M. Gonzalez-Cava, José-Luis Casteleiro-Roca, Héctor Quintián, Juan Albino Méndez-Pérez, José Luis Calvo-Rolle, Francisco Javier de Cos Juez, Ana León, María Martín, José Reboso
Classification of Prostate Cancer Patients and Healthy Individuals by Means of a Hybrid Algorithm Combining SVM and Evolutionary Algorithms

This research presents a new hybrid algorithm able to select a set of features that makes it possible to classify healthy individuals and those affected by prostate cancer. In this research, feature selection is performed with the help of evolutionary algorithms. These kinds of algorithms have proven, in previous research, their ability to obtain solutions for optimization problems in very different fields. In this study, a hybrid algorithm based on evolutionary methods and support vector machines is developed for the selection of optimal feature subsets for the classification of data sets. The results of the algorithm on a reduced data set demonstrate the performance of the method when compared with non-hybrid methodologies.
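
The general shape of such a hybrid, an evolutionary search over feature masks with SVM accuracy as fitness, can be sketched in miniature. This is a generic wrapper-style illustration on synthetic data (population size, operators and rates are all invented), not the authors' algorithm or clinical data.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 150
y = rng.integers(0, 2, n)
X = rng.normal(size=(n, 8))
X[:, 0] += 2.0 * y          # features 0 and 1 carry the class signal,
X[:, 1] -= 2.0 * y          # the remaining six are pure noise

def fitness(mask):
    """Cross-validated SVM accuracy on the selected feature subset."""
    if not mask.any():
        return 0.0
    return cross_val_score(SVC(), X[:, mask], y, cv=3).mean()

pop = rng.integers(0, 2, size=(10, 8)).astype(bool)   # bitmask chromosomes
best_mask, best_fit = pop[0].copy(), fitness(pop[0])
for _ in range(10):                                   # a tiny GA loop
    scores = np.array([fitness(m) for m in pop])
    i = int(scores.argmax())
    if scores[i] > best_fit:
        best_mask, best_fit = pop[i].copy(), scores[i]
    parents = pop[np.argsort(scores)[::-1][:4]]       # truncation selection
    children = []
    for _ in range(len(pop)):
        a, b = parents[rng.integers(4)], parents[rng.integers(4)]
        cut = int(rng.integers(1, 8))                 # one-point crossover
        child = np.concatenate([a[:cut], b[cut:]])
        children.append(child ^ (rng.random(8) < 0.1))  # bit-flip mutation
    pop = np.array(children)
```

The evolved `best_mask` should concentrate on the informative features, which is the behaviour the paper exploits at a much larger scale.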

Juan Enrique Sánchez Lasheras, Fernando Sánchez Lasheras, Carmen González Donquiles, Adonina Tardón, Gemma Castaño Vynals, Beatriz Pérez Gómez, Camilo Palazuelos, Dolors Sala, Francisco Javier de Cos Juez

Hybrid Intelligent Applications

Frontmatter
A Hybrid Deep Learning System of CNN and LRCN to Detect Cyberbullying from SNS Comments

Cyberbullying is becoming a significant social issue, in proportion to the proliferation of Social Network Services (SNS). Cyberbullying comments can be categorized into syntactic and semantic subsets. In this paper, we propose an ensemble method of two deep learning models: one is a character-level CNN which captures low-level syntactic information from the sequence of characters and is made robust to noise using transfer learning; the other is a word-level LRCN which captures high-level semantic information from the sequence of words, complementing the CNN model. Empirical results show that the performance of the ensemble method is significantly enhanced, outperforming the state-of-the-art methods for detecting cyberbullying comments. The model is analyzed with the t-SNE algorithm to investigate the mutually cooperative relations between the syntactic and semantic models.

Seok-Jun Bu, Sung-Bae Cho
Taxonomy-Based Detection of User Emotions for Advanced Artificial Intelligent Applications

Catching the attention of a new acquaintance and empathizing with them can improve the social skills of a robot. For this reason, we illustrate here the first step towards a system which can be used by a social robot in order to “break the ice” with a new acquaintance. After a training phase, the robot acquires a sub-symbolic coding of the main concepts expressed in tweets about the IAB Tier-1 categories. This knowledge is then used to catch the new acquaintance’s interests, arousing a joyful sentiment in them. The analysis process is done alongside a general small talk; once the process is finished, the robot can propose to talk about something that catches the attention of the user, hopefully giving rise to a mix of feelings involving surprise and joy, and thus triggering an engagement between the user and the social robot.

Alfredo Cuzzocrea, Giovanni Pilato
Prediction of the Energy Demand of a Hotel Using an Artificial Intelligence-Based Model

The worldwide growth of the hotel industry is a reality that increasingly requires greater use of energy resources and their optimal management. Of all the available energy resources, renewable energies can give greater economic efficiency and lower environmental impact. To manage these resources, the availability of energy prediction models is important. This allows managing the demand for power and the available energy resources to obtain maximum efficiency and stability, with the consequent economic savings. This paper focuses on the use of Artificial Intelligence methods for energy prediction in luxury hotels. As a case study, the energy performance data used were taken from the hotel complex The Ritz-Carlton, Abama, located in the south of the island of Tenerife, in the Canary Islands, Spain. This is a highly complex infrastructure with many services that require a lot of energy, such as restaurants, kitchens, swimming pools, a vehicle fleet, etc., which make the hotel a good study model for other resorts. The model developed for the artificial intelligence system is based on a hybrid topology with artificial neural networks. In this paper, the prediction of daily power demand using information from the last 24 h is presented. This prediction allows the development of appropriate actions to optimize energy management.

José-Luis Casteleiro-Roca, José Francisco Gómez-González, José Luis Calvo-Rolle, Esteban Jove, Héctor Quintián, Juan Francisco Acosta Martín, Sara Gonzalez Perez, Benjamin Gonzalez Diaz, Francisco Calero-Garcia, Juan Albino Méndez-Perez
A Hybrid Algorithm for the Prediction of Computer Vision Syndrome in Health Personnel Based on Trees and Evolutionary Algorithms

In recent decades, the use of video display terminals in workplaces has become more and more common. Despite their remarkable advantages, they imply a series of risks for the health of workers, as they can be responsible for ocular and visual disorders. In this research, certain problems associated with prolonged computer use, classified under the name of Computer Vision Syndrome, are studied with the help of a hybrid algorithm based on regression trees and genetic algorithms. The importance of the different symptoms of Computer Vision Syndrome is evaluated. Also, the proposed algorithm is tested in order to assess its performance as a prediction model that can determine how prone an individual is to suffering from Computer Vision Syndrome.

Eva María Artime Ríos, Fernando Sánchez Lasheras, Ana Suárez Sánchez, Francisco J. Iglesias-Rodríguez, María del Mar Seguí Crespo
An Algorithm Based on Satellite Observations to Quality Control Ground Solar Sensors: Analysis of Spanish Meteorological Networks

We present a hybrid quality control (QC) method for identifying defects in ground sensors of solar radiation. The method combines a window function that flags potential defects in radiation time series with a visual decision support system that eases the detection of false alarms and the identification of the causes of the defects. The core of the algorithm is the window function, which filters out groups of daily records where the errors of several radiation products, mainly satellite-based models, are greater than the typical values for that product, region and time of the year. The QC method was tested on 748 Spanish ground stations, finding different operational errors such as shading or soiling, and some equipment errors related to the deficiencies of silicon-based photodiode pyranometers. The majority of these errors cannot be detected by traditional QC methods based on physical or statistical limits, and hence cause problems in most of the applications that require solar radiation data. Besides, these results manifest the low quality of Spanish networks such as SIAR, Meteocat, Euskalmet and SOS Rioja, which show defects in more than 50% of their stations and should consequently be avoided.
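
The window-function idea, flagging groups of daily records whose model-vs-ground error is persistently larger than a typical level, can be sketched as follows. This is a simplified illustration with synthetic data and an invented threshold rule (k times the median daily error), not the paper's calibrated per-product thresholds.

```python
import numpy as np

def flag_windows(measured, model, window=10, k=3.0):
    """Flag groups of daily records whose mean absolute model-vs-ground
    error exceeds k times the series' typical (median) daily error.
    Every day covered by at least one anomalous window is flagged."""
    err = np.abs(measured - model)
    typical = np.median(err)
    flags = np.zeros(len(err), dtype=bool)
    for start in range(0, len(err) - window + 1):
        if err[start:start + window].mean() > k * typical:
            flags[start:start + window] = True
    return flags

rng = np.random.default_rng(0)
model = np.full(120, 5.0)                      # satellite estimate (kWh/m2/day)
measured = model + rng.normal(0, 0.2, 120)     # healthy pyranometer
measured[60:80] -= 2.5                         # simulated shading defect
flags = flag_windows(measured, model)
```

In the full method, flagged windows would then go to the visual decision support system so an operator can separate real defects (shading, soiling) from false alarms.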

Ruben Urraca, Javier Antonanzas, Andres Sanz-Garcia, Alvaro Aldama, Francisco Javier Martinez-de-Pison
Predicting Global Irradiance Combining Forecasting Models Through Machine Learning

Predicting solar irradiance is an active research problem, with many physical models having been designed to accurately predict Global Horizontal Irradiance. However, some of the models are better at short time horizons, while others are more accurate for medium and long horizons. The aim of this research is to automatically combine the predictions of four different models (Smart Persistence, Satellite, Cloud Index Advection and Diffusion, and Solar Weather Research and Forecasting) by means of a state-of-the-art machine learning method (Extreme Gradient Boosting). With this purpose, the four models are used as inputs to the machine learning model, so that the output is an improved Global Irradiance forecast. A 2-year dataset of predictions and measurements at one radiometric station in Seville has been gathered to validate the method proposed. Three approaches are studied: a general model, a model for each horizon, and models for groups of horizons. Experimental results show that the machine learning combination of predictors is, on average, more accurate than the predictors themselves.
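
Using base forecasts as input features to a boosted-tree combiner can be sketched on synthetic data. The example below uses scikit-learn's `GradientBoostingRegressor` as a stand-in for XGBoost, and the four "forecasts" are invented with different bias/noise profiles; it only illustrates the stacking idea, not the paper's dataset or model configuration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
n = 800
ghi = rng.uniform(100, 900, n)                 # "true" irradiance (W/m2)
# four hypothetical base forecasts, each with its own bias and noise
persistence = ghi + rng.normal(0, 80, n)
satellite = 0.9 * ghi + rng.normal(0, 40, n)
advection = ghi + 30 + rng.normal(0, 60, n)
nwp = 1.1 * ghi + rng.normal(0, 50, n)
X = np.column_stack([persistence, satellite, advection, nwp])

# train the combiner on the first part, evaluate on the held-out tail
train, test = slice(0, 600), slice(600, None)
combiner = GradientBoostingRegressor(random_state=0).fit(X[train], ghi[train])
blend_mae = np.abs(combiner.predict(X[test]) - ghi[test]).mean()
best_single_mae = min(np.abs(X[test, j] - ghi[test]).mean() for j in range(4))
```

The combiner learns to debias and weight the base models, so the blended forecast beats every individual predictor, mirroring the paper's average result.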

J. Huertas-Tato, R. Aler, F. J. Rodríguez-Benítez, C. Arbizu-Barrena, D. Pozo-Vázquez, I. M. Galván
A Hybrid Algorithm for the Assessment of the Influence of Risk Factors in the Development of Upper Limb Musculoskeletal Disorders

A hybrid model based on genetic algorithms, classification trees and multivariate adaptive regression splines is applied to identify the risk factors that have the strongest influence on the development of upper limb musculoskeletal disorders, using data from the Spanish Seventh National Survey on Working Conditions. The study is performed on a sample of workers from the extractive and manufacturing industry sectors, where upper limb disorders were the most frequently reported during 2016. The considered variables relate to employment conditions, physical conditions at the workplace, safety conditions, workstation design and ergonomics, psychosocial and organizational factors, Health and Safety management, and health damage. These variables are either continuous, Likert-scale or binary. The chosen output variable is built taking into consideration the presence or absence of three conditions: the existence of upper limb pain, the perception of its work-related nature, and the requirement of medical care in relation to it. The results show that WMSDs have a multifactorial origin and that the categories including the most relevant variables are: ergonomics and psychosocial factors, workplace conditions and workers’ individual characteristics.

Nélida M. Busto Serrano, Paulino J. García Nieto, Ana Suárez Sánchez, Fernando Sánchez Lasheras, Pedro Riesgo Fernández
Evolutionary Computation on Road Safety

This study examines psychological research that focuses on road safety in Smart Cities from the perspective of Vulnerable Road Users (VRUs). It takes into account qualities such as VRUs’ personal information, their habits, environmental measurements and things data. With the goal of seeing VRUs as active and proactive actors with differentiated feelings and behaviours, we are committed to integrating the social factors that characterize each VRU into our social machinery. As a result, we focus on the development of a VRU Social Machine to assess VRUs’ behaviour in order to improve road safety. The formal background is to use Logic Programming to define its architecture, based on a Deep Learning approach to Knowledge Representation and Reasoning, complemented with an Evolutionary approach to Computing.

Bruno Fernandes, Henrique Vicente, Jorge Ribeiro, Cesar Analide, José Neves
Memetic Modified Cuckoo Search Algorithm with ASSRS for the SSCF Problem in Self-Similar Fractal Image Reconstruction

This paper proposes a new memetic approach to address the problem of obtaining the optimal set of individual Self-Similar Contractive Functions (SSCF) for the reconstruction of self-similar binary IFS fractal images, the so-called SSCF problem. This memetic approach is based on the hybridization of the modified cuckoo search method for global optimization with a new strategy for the Lévy flight step size (MMCS) and the adaptive step size random search (ASSRS) heuristics for local search. This new method is applied to some illustrative examples of self-similar fractal images with satisfactory graphical and numerical results. Our approach represents a substantial improvement with respect to a previous method based on the original cuckoo search algorithm for all contractive functions of the examples in this paper.
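
Cuckoo-search variants rely on Lévy flights, heavy-tailed random steps that mix many small moves with occasional large jumps. The snippet below is a generic sketch of one standard way to draw such steps (Mantegna's algorithm), not the paper's modified step-size strategy; `beta` and the function name are illustrative.

```python
import numpy as np
from math import gamma as G

def levy_step(dim, beta=1.5, rng=None):
    """Draw one Lévy-flight step via Mantegna's algorithm:
    step = u / |v|^(1/beta), with u ~ N(0, sigma_u^2) and v ~ N(0, 1).
    The result is heavy-tailed: mostly small moves, occasionally huge
    jumps that help a cuckoo-search variant escape local optima."""
    rng = rng or np.random.default_rng()
    sigma_u = (G(1 + beta) * np.sin(np.pi * beta / 2)
               / (G((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.normal(0, sigma_u, dim)
    v = rng.normal(0, 1, dim)
    return u / np.abs(v) ** (1 / beta)

steps = levy_step(10000, rng=np.random.default_rng(0))
```

Modified cuckoo-search methods such as the one in the paper typically adapt the scale applied to these steps as the search progresses.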

Akemi Gálvez, Andrés Iglesias, Iztok Fister, Iztok Fister Jr., Eneko Osaba, Javier Del Ser
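For readers unfamiliar with the underlying metaheuristic, the Lévy-flight step at the heart of cuckoo search is commonly generated with Mantegna's algorithm. The sketch below is a generic illustration under that assumption, not the authors' MMCS implementation; the function name `levy_step` is hypothetical.

```python
import math
import random

def levy_step(beta=1.5):
    """Draw one Levy-distributed step length via Mantegna's algorithm."""
    # The scale factor sigma depends only on the stability index beta
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
             / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = random.gauss(0, sigma)   # numerator sample
    v = random.gauss(0, 1)       # denominator sample
    return u / abs(v) ** (1 / beta)

# A candidate nest is then displaced by a Levy step scaled by a step size alpha;
# MMCS's contribution is precisely a modified strategy for this step size.
random.seed(7)
alpha = 0.01
new_position = 1.0 + alpha * levy_step()
```

The heavy-tailed distribution produces occasional long jumps among many short ones, which is what lets cuckoo search escape local optima while still refining good regions.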
A Re-description Based Developmental Approach to the Generation of Value Functions for Cognitive Robots

Motivation is a fundamental topic when implementing cognitive architectures aimed at lifelong open-ended learning in autonomous robots. In particular, it is of paramount importance for these types of architectures to be able to establish goals that provide purpose to the robot’s interaction with the world, as well as to progressively learn value functions within its state space that allow reaching those goals whatever the starting point. This paper explores a developmental approach to the generation of high-level neural network based value functions in complex continuous state spaces through a re-description process. This process starts by obtaining relatively simple Separable Utility Regions (SURs) which allow the system to consistently achieve goals, although not necessarily in the most efficient manner. The traces obtained from these SURs then provide training data for a neural network based value function. Through a simple experiment with the Robobo robot, we show that this procedure generalizes better than attempting to obtain the value function directly through more traditional means.

A. Romero, F. Bellas, A. Prieto, R. J. Duro
A Hybrid Iterated Local Search for Solving a Particular Two-Stage Fixed-Charge Transportation Problem

In the current paper we take a different approach to a particular capacitated two-stage fixed-charge transportation problem, proposing an efficient hybrid Iterated Local Search (HILS) procedure as a means of solving it. Our approach is heuristic: it constructs an initial solution, improves it with a local search procedure, and applies a perturbation mechanism to increase exploration. To diversify the search further, the method is hybridized with an additional neighborhood structure. The preliminary computational results we achieved stand as proof that the proposed procedure yields high-quality solutions within reasonable running times.

Ovidiu Cosma, Petrica Pop, Matei Oliviu, Ioana Zelina
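As background, the general Iterated Local Search template alternates local search with a perturbation that kicks the incumbent out of its local optimum. The following minimal sketch (toy objective, hypothetical function names) illustrates that loop, not the authors' HILS for the transportation problem.

```python
import random

def iterated_local_search(x0, local_search, perturb, cost, iters=50):
    """Generic ILS skeleton: alternate local search and perturbation, keep the best."""
    best = local_search(x0)
    for _ in range(iters):
        cand = local_search(perturb(best))  # escape the local optimum, then re-descend
        if cost(cand) < cost(best):
            best = cand
    return best

# Toy usage: minimise f(x) = x**2 over the integers.
cost = lambda x: x * x

def local_search(x):
    # Simple hill climbing over the +/-1 neighborhood
    while True:
        nbr = min((x - 1, x + 1), key=cost)
        if cost(nbr) >= cost(x):
            return x
        x = nbr

perturb = lambda x: x + random.randint(-10, 10)
random.seed(1)
print(iterated_local_search(40, local_search, perturb, cost))  # -> 0
```

The design choice in any ILS variant is how strong the perturbation is: too weak and the search falls back into the same optimum, too strong and it degenerates into random restarts.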
Change Detection in Multidimensional Data Streams with Efficient Tensor Subspace Model

The paper presents a method for change detection in multidimensional data streams based on a tensor model constructed from the Higher-Order Singular Value Decomposition of raw data tensors. The method was applied to the problem of video shot detection, showing good accuracy and high speed of execution compared with other, more time-demanding tensor models. In this paper we present two efficient algorithms for constructing the tensor model and updating it from the data stream.

Bogusław Cyganek
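For context, the Higher-Order SVD on which such tensor models are built computes, for each mode, the left singular vectors of the corresponding unfolding, then projects the tensor onto those bases to obtain the core. A minimal numpy sketch follows; it illustrates the plain HOSVD, not the paper's efficient construction and update algorithms, and `unfold`/`hosvd` are illustrative names.

```python
import numpy as np

def unfold(T, mode):
    """Mode-n unfolding: move the given mode to the front and flatten the rest."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def hosvd(T):
    """Higher-Order SVD: factor matrices U_n and core tensor S, with T = S x_n U_n."""
    U = [np.linalg.svd(unfold(T, n), full_matrices=False)[0] for n in range(T.ndim)]
    S = T
    for n, Un in enumerate(U):
        # Multiply mode n by U_n^T to project onto the mode-n basis
        S = np.moveaxis(np.tensordot(Un.T, np.moveaxis(S, n, 0), axes=1), 0, n)
    return S, U

# Reconstruction is exact when no modes are truncated:
T = np.random.default_rng(0).normal(size=(3, 4, 5))
S, U = hosvd(T)
R = S
for n, Un in enumerate(U):
    R = np.moveaxis(np.tensordot(Un, np.moveaxis(R, n, 0), axes=1), 0, n)
print(np.allclose(R, T))  # -> True
```

Change detection schemes of this kind typically keep a truncated set of basis vectors per mode and flag a change when new data projects poorly onto that subspace.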
A View of the State of the Art of Dialogue Systems

Dialogue systems are becoming central tools in human-computer interaction. New interaction systems, e.g. Siri, Echo and others, appear by the day, and new features are added to them at a breathtaking pace. The conventional approaches based on traditional artificial intelligence techniques, such as ontologies and tree-based search, have been superseded by machine learning approaches and, more recently, by deep learning. In this paper we give a view of the current state of dialogue systems, describing their areas of application as well as the current technical approaches and challenges. We identify two emerging application domains of dialogue systems that may be highly influential in the near future: storytelling and therapeutic systems.

Leire Ozaeta, Manuel Graña
An Adaptive Approach for Index Tuning with Learning Classifier Systems on Hybrid Storage Environments

Index tuning is an activity typically performed by database administrators (DBAs) and advisor tools to decrease the response times of commands submitted to a database management system (DBMS). The introduction of solid-state drive (SSD) storage has raised a new challenge for DBAs and tools: SSDs provide fast read operations and low random-access costs, and these features must be considered when tuning the indexes of a database. In this paper, we use a learning classifier system (LCS), a machine learning approach that combines reinforcement learning and genetic algorithms and allows existing rules to be updated and new rules to be discovered, to provide an efficient and flexible index tuning mechanism for hybrid storage environments (HDD/SSD). The proposed approach, termed Index Tuning with Learning Classifier System (ITLCS), builds a rule-based mechanism designed to represent the knowledge of the system. Experimental results with the TPC-H benchmark showed that ITLCS performed better than well-known advisor tools, indicating the feasibility of the proposed approach.

Wendel Góes Pedrozo, Júlio Cesar Nievola, Deborah Carvalho Ribeiro
Electrical Behavior Modeling of Solar Panels Using Extreme Learning Machines

Predicting the response of solar panels has a potentially large impact on the economic viability of introducing alternative energy sources into our societies, diminishing the dependence on polluting fossil fuels. In this paper we model the electrical behavior of a commercial photovoltaic module, the Atersa A-55, using Extreme Learning Machines (ELMs). The training and validation data were extracted from the response of a real photovoltaic module installed at the Faculty of Engineering of Vitoria-Gasteiz (Basque Country University, Spain). The resulting predictive model has one input variable ($$V_{PV}$$) and one output variable ($$I_{PV}$$). We achieve a Root Mean Squared Error (RMSE) of 0.026 A in the predicted electrical current.

Jose Manuel Lopez-Guede, Jose Antonio Ramos-Hernanz, Julian Estevez, Asier Garmendia, Leyre Torre, Manuel Graña
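An ELM in its basic form fixes a random hidden layer and solves only the output weights by least squares, which is what makes training fast. The sketch below shows this on a synthetic one-input/one-output regression, analogous in shape to the $$V_{PV} \rightarrow I_{PV}$$ mapping; the function names and data are illustrative, not the paper's model.

```python
import numpy as np

def elm_train(X, y, n_hidden=50, seed=0):
    """Extreme Learning Machine: random hidden weights, least-squares output weights."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))  # input weights, never trained
    b = rng.normal(size=n_hidden)                # hidden biases, never trained
    H = np.tanh(X @ W + b)                       # hidden-layer activations
    beta = np.linalg.lstsq(H, y, rcond=None)[0]  # closed-form output weights
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

# Toy one-input/one-output regression (a synthetic stand-in for a V-I curve)
X = np.linspace(-1.0, 1.0, 200).reshape(-1, 1)
y = np.sin(3.0 * X[:, 0])
W, b, beta = elm_train(X, y)
rmse = np.sqrt(np.mean((elm_predict(X, W, b, beta) - y) ** 2))
```

Because only `beta` is fitted, training reduces to a single linear least-squares solve, in contrast to the iterative weight updates of backpropagation-trained networks.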
A Hybrid Clustering Approach for Diagnosing Medical Diseases

Clustering is one of the most fundamental and essential data analysis tasks, with broad applications. It has been studied extensively in various research fields, including data mining, machine learning, pattern recognition, and scientific, engineering, social, economic, and biomedical data analysis. This paper focuses on a new strategy based on a hybrid model that combines a fuzzy partition method with a maximum likelihood estimates clustering algorithm for diagnosing medical diseases. The proposed hybrid system is first tested on the well-known Iris data set and then on three medical diagnosis data sets from the UCI data repository.

Svetlana Simić, Zorana Banković, Dragan Simić, Svetislav D. Simić
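As a point of reference for the fuzzy-partition half of such a hybrid, the classic fuzzy c-means update alternates between weighted centroid computation and membership re-estimation. This is a generic textbook sketch, not the authors' combined fuzzy/maximum-likelihood system; `fuzzy_c_means` is an illustrative name.

```python
import numpy as np

def fuzzy_c_means(X, c=3, m=2.0, iters=100, seed=0):
    """Standard fuzzy c-means: returns cluster centers and the membership matrix U."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)            # random initial fuzzy memberships
    for _ in range(iters):
        Um = U ** m                              # fuzzified memberships
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        # Distances from every point to every center (epsilon avoids divide-by-zero)
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        # Membership update: u_ik proportional to d_ik^(-2/(m-1)), rows normalized
        U = 1.0 / d ** (2.0 / (m - 1.0))
        U /= U.sum(axis=1, keepdims=True)
    return centers, U

# Toy usage on two well-separated blobs
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.1, (20, 2)), rng.normal(5.0, 0.1, (20, 2))])
centers, U = fuzzy_c_means(X, c=2)
labels = U.argmax(axis=1)
```

Unlike hard k-means, each sample retains a graded membership in every cluster, which is what a downstream maximum-likelihood stage can then refine.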
Backmatter
Metadata
Title
Hybrid Artificial Intelligent Systems
Edited by
Francisco Javier de Cos Juez
José Ramón Villar
Enrique A. de la Cal
Álvaro Herrero
Héctor Quintián
José António Sáez
Emilio Corchado
Copyright Year
2018
Electronic ISBN
978-3-319-92639-1
Print ISBN
978-3-319-92638-4
DOI
https://doi.org/10.1007/978-3-319-92639-1
