## About this book

This book constitutes the refereed proceedings of the 18th Portuguese Conference on Artificial Intelligence, EPIA 2017, held in Porto, Portugal, in September 2017.
The 69 revised full papers and 2 short papers presented were carefully reviewed and selected from a total of 177 submissions. The papers are organized in 16 tracks devoted to the following topics: agent-based modelling for criminological research (ABM4Crime), artificial intelligence in cyber-physical and distributed embedded systems (AICPDES), artificial intelligence in games (AIG), artificial intelligence in medicine (AIM), artificial intelligence in power and energy systems (AIPES), artificial intelligence in transportation systems (AITS), artificial life and evolutionary algorithms (ALEA), ambient intelligence and affective environments (AmIA), business applications of artificial intelligence (BAAI), intelligent robotics (IROBOT), knowledge discovery and business intelligence (KDBI), knowledge representation and reasoning (KRR), multi-agent systems: theory and applications (MASTA), software engineering for autonomous and intelligent systems (SE4AIS), social simulation and modelling (SSM), and text mining and applications (TeMA).

## Table of contents

### Correction to: Flower Pollination Algorithm Applied to the Economic Dispatch Problem with Multiple Fuels and Valve Point Effect

In the originally published version of this paper the name of the fifth author was inadvertently published with a spelling error. The name “Marcos T.B. de Olveira” was corrected to “Marcos T.B. de Oliveira”.

Rafael Ochsendorf G. Souza, Ezequiel Silva Oliveira, Ivo Chaves Silva Junior, André Luís Marques Marcato, Marcos T. B. de Oliveira

### An Agent-Based Aggression De-escalation Training Application for Football Referees

An ongoing problem associated with sports such as football is the regular occurrence of aggressive behavior against referees. Campaigns and meetings for players and football clubs organized by the Dutch football federation are intended to reduce aggression on the pitch. To support referees, this paper introduces a mobile application that simulates a football-related environment in which referees can train aggression de-escalation. It replicates an aggressive scenario between a football player (agent) and a referee (user), in which the referee must approach the player appropriately to decrease the aggression level of the agent. A preliminary evaluation indicated that the application has potential as a training instrument for football referees.

Tibor Bosse, Ward van Breda, Nousha van Dijk, Jelmer Scholte

### Agents Shaping Networks Shaping Agents: Integrating Social Network Analysis and Agent-Based Modeling in Computational Crime Research

The paper presents a recent development in interdisciplinary research exploring innovative computational approaches to the scientific study of criminal behavior. It focuses on an attempt to combine social network analysis (SNA) and agent-based modelling (ABM) in CrimeMiner, an experimental framework that seamlessly integrates document-enhancement, visualization and network analysis techniques to support the study of criminal organizations. Our goal is both methodological and scientific. We are exploring how the synergy between ABM and SNA can support a deeper and more empirically grounded understanding of the complex dynamics taking place within criminal organizations between the individual/behavioral and social/structural levels.

Nicola Lettieri, Antonio Altamura, Delfina Malandrino, Valentina Punzo

### An Agent-Based Model Predicting Group Emotion and Misbehaviours in Stranded Passengers

Airline passengers can get stranded in an airport due to a number of reasons. As a consequence, they might get frustrated. Frustration leads to misbehaving if a given individual is frustrated enough, according to the literature. In this work, an agent-based model of stranded passengers in an airport departure area is presented. Structured simulations show how personal and environmental characteristics such as age, gender and emotional contagion, among others, influence the frustration dynamics, number and type of misbehaviours in such a scenario. We also present simulation results with two implemented support models (a chatbot and multilingual staff) aiming to reduce the overall frustration level of passengers facing this type of situation. Important findings are that: men are more likely to use force than women, the crowd composition plays an important role in terms of misbehaviours, the effect of emotional contagion leads to more misbehaviours and a chatbot might be considered as an alternative for supporting stranded passengers.

Lenin Medeiros, C. Natalie van der Wal

### Towards Understanding the Impact of Crime on the Choice of Route by a Bus Passenger

In this paper we describe a simulation platform that supports studies on the impact of crime on urban mobility. We present an example of how this can be achieved by seeking to understand the effect on the transport system if its users chose time-optimal routes between the origins and destinations that they normally travel. Based on real data from a large Brazilian metropolis, we found that the percentage of users who follow this policy is small. Most prefer to follow less efficient routes by changing buses at terminals. This can be understood as an indication that the users of the transport system favor the security factor.

Daniel Sullivan, Carlos Caminha, Hygor P. M. Melo, Vasco Furtado

### Exploring Anti-poaching Strategies for Wildlife Crime with a Simple and General Agent-Based Model

Understanding and preventing wildlife crime is challenging because of the complex interdependencies between animals, poachers, and rangers. To tackle this complexity, this study introduces a simple, general agent-based model of wildlife crime. The model is abstract and can be used to derive general conclusions about the emergence and prevention of wildlife crime. It can also be tailored to create scenarios that allow researchers and practitioners to better understand the dynamics in specific cases. This was illustrated by applying the model to the context of rhino poaching in South Africa. A virtual park populated by rhinos, poachers and rangers was created to study how an increase in patrol effort for two different anti-poaching strategies affects the number of poached rhinos. The results show that fence patrols are more effective in preventing wildlife crime than standard patrols. Strikingly, even increasing the number of ranger teams does not increase the effectiveness of standard patrols compared to fence patrols.

Nick van Doormaal

### A SOA Web-Based Group Decision Support System Considering Affective Aspects

The topic of Group Decision Support Systems (GDSS) has been studied over the last decades. Supporting decision-makers who participate in group decision-making processes is a complex task, especially when decision-makers have no opportunity to gather at the same place and at the same time. In this work, we propose a Web-based Group Decision Support System (WebGDSS) which is intended to support decision-makers anywhere, anytime and through almost any kind of device. Our system was developed under a SOA architecture and we used a multi-criteria algorithm that features decision-makers’ cognitive aspects, as well as a component that generates intelligent reports to give decision-makers feedback on decision-making processes.

Luís Conceição, João Carneiro, Goreti Marreiros, Paulo Novais

### Monitoring the Progress of Programming Students Supported by a Digital Teaching Assistant

Several studies have shown that there is an important link between continual monitoring by teachers and students’ performance. Unfortunately, teachers cannot continuously watch what the students are doing. To overcome this situation, we propose the use of CodeInsights, a tool capable of capturing, in an autonomous, transparent and unobtrusive manner, information about the students’ performance and then, based on the teacher’s expectations, notifying the teacher about possible deviations in the specific context of programming courses. The decision on whether the system should or should not notify the teacher is supported by an artificial cognitive selective attention mechanism. Although CodeInsights, provided with the described mechanism, hasn’t been fully tested in a real case scenario, we present some specific examples of how it can be used to assist teachers.

Nuno Gil Fonseca, Luís Macedo, António José Mendes

### Image Matching Algorithm Based on Hashes Extraction

Nowadays, the rise of social networks and the continuous storage of large amounts of information are topical issues. The main problem, however, is not the storage itself but the ability to process most of this information, so that it is not stored in vain. Using the images shared within social networks, possible relationships between users could be identified. From this idea arises the present work, which focuses on identifying similar images even if they have been modified (by applying color filters, rotations or even watermarks). The solution involves preprocessing to eliminate possible filters and then applying hashing techniques to obtain hashes that are unique for each image and allow images to be compared in a way that is abstract but effective for the user.

Alberto Rivas, Pablo Chamoso, Javier J. Martín-Limorti, Sara Rodríguez, Fernando de la Prieta, Javier Bajo
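
The abstract does not say which hashing technique is used. As a rough illustration of hash-based image comparison in general, the sketch below uses a simple average hash, a swapped-in technique that is not taken from the paper. The function names and the toy grayscale matrices are hypothetical.

```python
def average_hash(pixels):
    """Return a tuple of bits: 1 where a pixel is brighter than the image mean.

    `pixels` is a grayscale image given as a list of rows of numbers.
    This is a generic perceptual hash, not the method of the paper.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if value > mean else 0 for value in flat)


def hamming_distance(hash_a, hash_b):
    """Count the positions at which two equal-length hashes differ."""
    return sum(x != y for x, y in zip(hash_a, hash_b))


# Toy 2x2 "images" (hypothetical data).
original = [[10, 200], [30, 220]]
brightened = [[value + 20 for value in row] for row in original]  # uniform brightness filter
different = [[200, 10], [220, 30]]  # mirrored content

same_score = hamming_distance(average_hash(original), average_hash(brightened))
other_score = hamming_distance(average_hash(original), average_hash(different))
```

A small Hamming distance suggests that two images are likely variants of each other. A uniform brightness shift leaves this particular hash unchanged, whereas rotations or watermarks would need the kind of preprocessing the abstract mentions.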

### Unsupervised Stress Detection Algorithm and Experiments with Real Life Data

Stress is a major problem in modern society and a reason for at least half of lost working days in European enterprises, but existing stress detectors are not sufficiently convenient for everyday use. One reason is that stress perception and stress manifestation vary a lot between individuals; hence, “one-fits-all-persons” stress detectors usually achieve notably lower accuracies than person-specific methods. The majority of existing approaches to person-specific stress recognition, however, employ fully supervised training, which requires collecting fairly large sets of labelled data from each end user. These sets should contain examples of stress and of normal conditions, and such a data collection effort may be tiring for end users. Therefore this work proposes an algorithm to train person-specific stress detectors using only unlabelled data, not necessarily containing examples of stress. The proposed method, based on Hidden Markov Models with a maximum posterior marginal decision rule, was tested using real life data of 28 persons and achieved an average stress detection accuracy of 75%, which is similar to the accuracies of state-of-the-art supervised algorithms for real life data.

Elena Vildjiounaite, Johanna Kallio, Jani Mäntyjärvi, Vesa Kyllönen, Mikko Lindholm, Georgy Gimel’farb

### Iterative Parallel Sampling RRT for Racing Car Simulation

Graphics Processing Units have evolved at a rapid pace, maintaining a processing power orders of magnitude higher than that of Central Processing Units. As a result, interest in the General-Purpose computing on Graphics Processing Units paradigm has grown. Nowadays, considerable effort is devoted to studying probabilistic search algorithms such as the Randomized Search Algorithms family, which have good time complexity and thus can be adapted to massive search spaces. One of those algorithms is the Rapidly Exploring Random Tree (RRT), which yields good results when applied to high-dimensional dynamical search spaces. This paper proposes a new variant of the RRT algorithm called Iterative Parallel Sampling RRT, which explores the use of parallel computation on the GPU to generate solutions faster. The algorithm was used to construct a CUDA-accelerated bot for the TORCS open source racing game and tested against the plain RRT. Preliminary tests show lap time reductions of around 17% and the potential for reducing search times.

Samuel Gomes, João Dias, Carlos Martinho

### Multi-agent Double Deep Q-Networks

There are many open issues and challenges in the multi-agent reward-based learning field. Theoretical convergence guarantees are lost, and the complexity of the action-space is exponential in the number of agents calculating their optimal joint-action. Function approximators, such as deep neural networks, have successfully been used in single-agent environments with high dimensional state-spaces. We propose the Multi-agent Double Deep Q-Networks algorithm, an extension of Deep Q-Networks to the multi-agent paradigm. Two common techniques of multi-agent Q-learning are used to formally describe our proposal, and are tested in a Foraging Task and a Pursuit Game. We also demonstrate how they can generalize to similar tasks and to larger teams, due to the strength of deep-learning techniques, and their viability for transfer learning approaches. With only a small fraction of the initial task’s training, we adapt to longer tasks, and we accelerate the task completion by increasing the team size, thus empirically demonstrating a solution to the complexity issues of the multi-agent field.

David Simões, Nuno Lau, Luís Paulo Reis
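
For readers unfamiliar with the "Double" part of the name, the sketch below shows the standard single-agent Double DQN learning target: the online network selects the next action and a separate target network evaluates it. It does not reproduce the paper's multi-agent extension. The function name and the plain-list Q-values are assumptions made for illustration.

```python
def double_q_target(reward, done, gamma, q_online_next, q_target_next):
    """Single-agent Double DQN target for one transition.

    q_online_next and q_target_next hold the Q-values of every action in the
    next state, as estimated by the online and target networks respectively
    (plain lists here; a real agent would obtain them from neural networks).
    """
    if done:
        # Terminal transition: no bootstrapping from the next state.
        return reward
    # Action selection uses the online estimates...
    best_action = max(range(len(q_online_next)), key=q_online_next.__getitem__)
    # ...while action evaluation uses the target estimates.
    return reward + gamma * q_target_next[best_action]


# The online network prefers action 1, so the target network's value for
# action 1 is used, even though the target network rates action 0 higher.
target = double_q_target(1.0, False, 0.5, [0.1, 0.9], [4.0, 2.0])
```

Decoupling selection from evaluation in this way is what reduces the overestimation bias of ordinary Q-learning targets.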

### A Deep Learning Method for ICD-10 Coding of Free-Text Death Certificates

The assignment of disease codes to clinical texts has a wide range of applications, including epidemiological studies or disease surveillance. We address the task of automatically assigning the ICD-10 codes for the underlying cause of death, from the free-text descriptions included in death certificates obtained from the Portuguese Ministry of Health. We specifically propose to leverage a deep neural network based on a two-level hierarchy of recurrent nodes together with attention mechanisms. The first level uses recurrent nodes for modeling the sequences of words given in individual fields of the death certificates, together with attention to weight the contribution of each word, producing intermediate representations for the contents of each field. The second level uses recurrent nodes to model a sequence of fields, using the representations produced by the first level and also leveraging attention in order to weight the contributions of the different fields. The paper reports on experiments with a dataset of 115,406 death certificates, presenting the results of an evaluation of the predictive accuracy of the proposed method, for different ICD-10 levels (i.e., chapter, block, or full code) and for particular causes of death. We also discuss how the neural attention mechanisms can help in interpreting the classification results.

Francisco Duarte, Bruno Martins, Cátia Sousa Pinto, Mário J. Silva

### Robot Programming Through Whole-Body Interaction

Programmable and non-programmable educational robots are, in most cases, associated with sedentary behavior in children. Children interact with educational robots mostly in indoor environments. Whole-body interaction and natural environments seem to potentiate children’s physical and mental health. In order to potentiate children’s physical and mental health we have developed a new set of robotic devices - Biosymtic Robotic devices. We describe the main computational models of Biosymtic Robotic devices: a computational model demonstrating how to increase children’s physical activity levels and contact with natural environments through automatic feedback control mechanisms; a theoretical cognitive model on how to program robotic devices through whole-body interaction in natural environments.

Marta Ferraz

### Wheeze Detection Using Convolutional Neural Networks

In this paper, we propose to use convolutional neural networks for automatic wheeze detection in lung sounds. We present a convolutional neural network based approach that has several advantages compared to the previous approaches described in the literature. Our method surpasses the standard machine learning models on this task. It is robust to lung sound shifting and requires minimal feature preprocessing steps. Our approach achieves 99% accuracy and 0.96 AUC on our datasets.

Kirill Kochetov, Evgeny Putin, Svyatoslav Azizov, Ilya Skorobogatov, Andrey Filchenkov

### Multiclassifier System Using Class and Interclass Competence of Base Classifiers Applied to the Recognition of Grasping Movements in the Control of Bioprosthetic Hand

In this paper the problem of recognizing a patient’s intent to move a hand prosthesis is addressed. The proposed method is based on recognition of electromyographic (EMG) and mechanomyographic (MMG) biosignals using a multiclassifier (MC) system working with a dynamic ensemble selection scheme and an original concept of competence measure. The concept focuses on developing competence and interclass cross-competence measures which can be applied as a method for combining classifiers. The cross-competence measure allows an ensemble to harness information obtained from incompetent classifiers instead of removing them from the ensemble. The performance of the MC system with the proposed competence measure was experimentally compared against six state-of-the-art classification methods using real data concerning the recognition of six types of grasping movements. The system developed achieved the highest classification accuracies, demonstrating the potential of the MC system for the control of a bioprosthetic hand.

Marek Kurzynski, Pawel Trajdos, Andrzej Wolczowski

### Classifying Heart Sounds Using Images of MFCC and Temporal Features

Phonocardiogram signals contain very useful information about the condition of the heart. Phonocardiography is a method of recording heart sounds, which can be visually represented on a chart. By analyzing these signals, early detection and diagnosis of heart diseases can be achieved. Intelligent and automated analysis of the phonocardiogram is therefore very important to determine whether the patient’s heart works properly or whether the patient should be referred to an expert for further evaluation. In this work, we use electrocardiograms and phonocardiograms collected simultaneously, from the Physionet challenge database, and we aim to determine whether a phonocardiogram corresponds to a “normal” or “abnormal” physiological state. The main idea is to translate a 1D phonocardiogram signal into a 2D image that represents temporal and Mel-frequency cepstral coefficients features. To do that, we develop a novel approach that uses both kinds of features. First we segment the phonocardiogram signals with an algorithm based on a logistic regression hidden semi-Markov model, which uses the electrocardiogram signals as reference. After that, we extract a group of features from the time and frequency domain (Mel-frequency cepstral coefficients) of the phonocardiogram. Then, we combine these features into a two-dimensional time-frequency heat map representation. Lastly, we run a binary classifier to learn a model that discriminates between normal and abnormal phonocardiogram signals. In the experiments, we study the contribution of temporal and Mel-frequency cepstral coefficients features and evaluate three classification algorithms: Support Vector Machines, Convolutional Neural Network, and Random Forest. The best results are achieved when we map both temporal and Mel-frequency cepstral coefficients features into a 2D image and use the Support Vector Machines with a radial basis function kernel. Indeed, by including both temporal and Mel-frequency cepstral coefficients features, we obtain slightly better results than the ones reported by the challenge participants, which use large amounts of data and high computational power.

Diogo Marcelo Nogueira, Carlos Abreu Ferreira, Alípio M. Jorge

### Discovering Interesting Associations in Gestation Course Data

Finding risk factors in pregnancy related to neonatal hypoxia is a challenging task due to the informal nature and a wide scatter of the data. In this work, we propose a methodology for sequential estimation of interestingness of association rules with two sets of criteria. The rules suggest that a strong relationship exists between the specific sets of attributes and the diagnosis. We set up a profile of the pregnant woman with a high likelihood of hypoxia of the newborn that would be beneficial to medical professionals.

Inna Skarga-Bandurova, Tetiana Biloborodova, Maksym Nesterov

### Severity Estimation of Stator Winding Short-Circuit Faults Using Cubist

In this paper, an approach to estimate the severity of stator winding short-circuit faults in squirrel-cage induction motors based on the Cubist model is proposed. This is accomplished by scoring the unbalance in the current and voltage waveforms as well as in Park’s Vector, both for current and voltage. The proposed method presents a systematic comparison between models, as well as an analysis regarding hyper-parameter tuning. The novelty of the presented work is mainly associated with the application of data-based analysis techniques to estimate the stator winding short-circuit severity in three-phase squirrel-cage induction motors. The developed solution may be used for tele-monitoring of the motor condition and to implement advanced predictive maintenance strategies.

Tiago dos Santos, Fernando J. T. E. Ferreira, João Moura Pires, Carlos Viegas Damásio

### Estimating Energy Consumption in Evolutionary Algorithms by Means of FRBS

Towards Energy-Aware Bioinspired Algorithms

During the last decades, energy consumption has become a topic of interest for algorithm designers, particularly for networked devices and especially when handheld ones are involved. Moreover, energy consumption has become a matter of paramount importance in today’s environmentally conscious society. Although a number of studies are already available, not many have focused on Evolutionary Algorithms (EAs). Moreover, no previous attempt has been made to model the energy consumption behavior of EAs across different hardware platforms. This paper thus aims not only to analyze the influence of the main EA parameters on their energy-related behavior, but also to develop, for the first time, a model that allows researchers to know how the algorithm will behave on a number of hardware devices. We focus on a specific member of the EA family, namely Genetic Programming (GP), and consider several devices as the underlying hardware platform. We apply a Fuzzy Rule-Based System (FRBS) to build the model, which then allows the energy required to find a solution to be predicted, given a previously chosen hardware device and a set of parameters for the algorithm.

Josefa Díaz Álvarez, Francisco Chávez de La O, Juan Ángel García Martínez, Pedro Ángel Castillo Valdivieso, Francisco Fernández de Vega

### Application of Robust Optimization Technique to the Energy Planning Problem

The present work proposes an approach based on the robust optimization technique named column-and-constraint generation (C&CG) for solving an energy planning problem: minimizing the thermoelectric dispatch cost during the daily operation of a system with wind and hydraulic generation. In order to define the hourly dispatch of thermoelectric generation, the approach considers a history of flow for the hydraulic generation, as well as uncertainties in the wind behavior at the wind power plant. Thus, the short-term energy planning is defined by taking into account the stochastic nature of the wind through the concept of uncertainties. As a solution approach, linear programming is combined with a robust optimization (RO) technique through the C&CG algorithm. This method is applied to divide the global problem into wind speed uncertainty scenarios.

Saulo C. de A. Ferreira, Jerson dos S. Carvalho, Leonardo W. de Oliveira, Taís L. O. Araújo, Edimar J. de Oliveira, Marina B. A. Souza

### EnAPlug – An Environmental Awareness Plug to Test Energy Management Solutions for Households

This paper presents a new kind of Smart Plug that covers the needs of power systems R&D centers. EnAPlug enables the monitoring and control of loads, like a normal Smart Plug. However, it has a great benefit over a normal Smart Plug: it allows the integration of a variety of sensors, so the user can understand the load and the surrounding environment (using the set of sensors that best fits the load). The sensors are installed in the load itself and must have a clear fit to the load. The paper presents a demonstration of an EnAPlug used in a refrigerator participating in a demand response event, using the sensor capability to measure important values such as the inside temperature.

Luis Gomes, Filipe Sousa, Zita Vale

### Flower Pollination Algorithm Applied to the Economic Dispatch Problem with Multiple Fuels and Valve Point Effect

Due to the high importance of economic dispatch in planning and operating electric power systems, new methods have been researched to minimize the costs of power generation. To calculate these costs, the power generation of each thermal unit must be evaluated. When a thermal unit is modelled considering real world constraints, such as multiple fuels and valve point effect, traditional optimization methods are inefficient due to the nature of the cost function. This paper shows a study of a metaheuristic method, based on flower pollination to search for satisfactory results for economic dispatch. The results obtained are compared with results from other authors, with the purpose of evaluating how efficient the technique presented here is.

Rafael Ochsendorf G. Souza, Ezequiel Silva Oliveira, Ivo Chaves Silva Junior, André Luís Marques Marcato, Marcos T. B. de Oliveira

### Dynamic and Static Transmission Network Expansion Planning via Harmony Search and Branch & Bound on a Hybrid Algorithm

This work presents a method based on metaheuristics to solve the problem of Static (STNEP) and Dynamic (DTNEP) Transmission Network Expansion Planning in electrical power systems. The result of this formulation is a mixed-integer nonlinear programming (MINLP) problem, where the difficulties are intensified in the DTNEP by the temporal coupling. Therefore, a methodology was developed to reach the solution in three different stages: the first one is responsible for obtaining an efficient set of best candidate routes for the expansion; in the second, the metaheuristic optimization process Harmony Search (HS) is used to find STNEP’s optimal solution and its neighborhood, which provides a DTNEP candidate zone; lastly, a hybrid algorithm that mixes the HS and Branch & Bound (B&B) concepts is adapted to provide the optimal DTNEP. In this study, the lossless linearized modeling for load flow is used as a representation of the transmission network. Tests with the Garver and southern Brazilian systems were carried out to verify the performance of the method. The computational time savings for the STNEP and DTNEP demonstrate the efficacy of the proposed method.

Luiz E. de Oliveira, Francisco D. Freitas, Ivo C. da Silva, Phillipe V. Gomes

### Nord Pool Ontology to Enhance Electricity Markets Simulation in MASCEM

This paper proposes the use of ontologies to enable information and knowledge exchange, to test different electricity market models and to allow players from different systems to interact in common market environments. Multi-agent based software is particularly well fitted to analyse dynamic and adaptive systems with complex interactions among its constituents, such as the complex and dynamic electricity markets. The main drivers are the markets’ restructuring and evolution into regional and continental scales, along with the constant changes brought by the increasing necessity for an adequate integration of renewable energy sources. An ontology to represent the concepts related to the Nord Pool Elspot market is proposed. It is validated through a case study considering the simulation of Elspot market. Results show that heterogeneous agents are able to effectively participate in the simulation by using the proposed ontologies to support their communications with the Nord Pool market operator.

Gabriel Santos, Tiago Pinto, Isabel Praça, Zita Vale

### Electricity Rate Planning for the Current Consumer Market Scenario Through Segmentation of Consumption Time Series

Current European legislation requires the installation of smart metering systems in households. These will eventually allow electric utilities to gather richly detailed consumption data. In this scenario, the implementation of data mining procedures for actionable knowledge extraction could be the key to competitive advantage. These may take the form of market segmentation using clustering techniques for the identification of customer behaviour patterns of electricity consumption that could justify the definition of tailored tariffs. In this brief paper, we show that the combination of a standard clustering algorithm with a similarity measure specifically defined for non-i.i.d. data, namely Dynamic Time Warping, can reveal an actionable segmentation of a real consumer market, combining business criteria and quantitative evaluation.

Alfredo Vellido, David L. García
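
To make the similarity measure named in the abstract concrete, here is a minimal sketch of the textbook Dynamic Time Warping (DTW) distance between two numeric sequences, using absolute difference as the local cost and no warping window. The abstract does not describe the authors' configuration, so these choices, the function name, and the toy load profiles are assumptions.

```python
def dtw_distance(series_a, series_b):
    """Classic O(n*m) dynamic-programming DTW distance.

    Local cost is the absolute difference between aligned samples; no
    warping-window constraint is applied (both are assumptions).
    """
    n, m = len(series_a), len(series_b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of the first i samples of series_a
    # with the first j samples of series_b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(series_a[i - 1] - series_b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # series_b sample aligned again
                cost[i][j - 1],      # series_a sample aligned again
                cost[i - 1][j - 1],  # both series advance
            )
    return cost[n][m]


# Two hypothetical load profiles with the same shape, one delayed by a step.
early_peak = [1, 2, 3]
late_peak = [1, 1, 2, 3]
shifted_distance = dtw_distance(early_peak, late_peak)
```

Because DTW lets one sequence stretch against the other, the delayed profile is at distance zero from the original, which is why a measure like this can group households with similar but time-shifted consumption patterns.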

### Towards Dynamic Rebalancing of Bike Sharing Systems: An Event-Driven Agents Approach

Operating a Bicycle Sharing System over some time without the operator’s intervention causes serious imbalances, which prevent the rental of bikes at some stations and their return at others. To cope with such problems, user-based bicycle rebalancing approaches offer incentives to influence the users’ behavior in an appropriate way. In this paper, an event-driven agent architecture is proposed, which uses Complex Event Processing to predict the future demand at the bike stations using live data about the users. The predicted demands are used to derive situation-aware incentives that are offered by the affected stations. Furthermore, it is shown how bike stations cooperate to avoid outbidding each other.

Jeremias Dötterl, Ralf Bruns, Jürgen Dunkel, Sascha Ossowski

### Mobility Mining Using Nonnegative Tensor Factorization

Mobility mining has many applications in urban planning and transportation systems. In particular, extracting mobility patterns gives service providers a global insight into mobility behaviors, which in turn leads to better services for citizens. In recent years several data mining techniques have been presented to tackle this problem. These methods are usually either spatial extensions of temporal methods or temporal extensions of spatial methods. However, a framework that preserves the natural structure of mobility data has not yet been considered. Non-negative tensor factorizations (NNTF) have shown great applications in topic modelling and pattern recognition, but their usefulness in mobility mining is less explored. In this paper we propose a new mobility pattern mining framework based on a recent non-negative tensor model called BetaNTF. We also present a new approach, based on the concept of interpretability, for determining the number of components in the tensor rank selection process. We then demonstrate some meaningful mobility patterns extracted with the proposed method from bike sharing network mobility data in Boston, USA.

### Machine Learning for Pavement Friction Prediction Using Scikit-Learn

During the last decades, Artificial Intelligence (AI) has made its way into several technical and scientific areas. Despite its success, AI applications that solve real-life problems in pavement engineering are far from reaching their potential. In this paper, a Python machine learning library, scikit-learn, is used to predict asphalt pavement friction. Using data from the Long-Term Pavement Performance (LTPP) database, 113 different sections of asphalt concrete pavement, spread all over the United States, were selected. Two machine learning models were built from these data to predict friction, one based on linear regression and the other on regularized regression with lasso. Both models proved feasible and performed similarly. According to the results, initial friction plays an essential role in the way friction evolves over time. The results of this study also showed that scikit-learn can be a versatile tool for solving pavement engineering problems. By applying machine learning methods to predict asphalt pavement friction, this paper emphasizes how theory and practice can be effectively coupled to solve real-life problems in contemporary transportation.

Pedro Marcelino, Maria de Lurdes Antunes, Eduardo Fortunato, Marta Castilho Gomes

### Optimising Cyclic Timetables with a SAT Approach

This paper describes the preliminary results of ongoing research on cyclic railway timetabling, namely on optimising timetables with respect to travel time using Boolean Satisfiability Problem (SAT) approaches. Some existing work in the field of railway timetabling proposes solutions to the optimisation problem using Mixed Integer Linear Programming (MILP) and SAT. In this work, we propose a binary search procedure which uses a SAT solver to get global minimum solutions with respect to travel time, and a procedure, still under development, to compute a better upper bound for the solution value and speed up the search process. Finally, we present some promising preliminary results which show that our approach, applied to real-world data, performs better than existing SAT approaches and a state-of-the-art MILP approach.

Gonçalo P. Matos, Luís Albino, Ricardo L. Saldanha, Ernesto M. Morgado
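
The binary search idea described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `is_satisfiable` is a hypothetical stand-in for a SAT solver call that answers whether a timetable with travel time at most `bound` exists, and `toy_oracle` brute-forces a tiny made-up instance instead of encoding a real timetable.

```python
from itertools import product
from typing import Callable, Optional


def minimise_bound(is_satisfiable: Callable[[int], bool], lo: int, hi: int) -> Optional[int]:
    """Return the smallest bound in [lo, hi] accepted by the oracle, or None.

    Assumes monotonicity: if a bound is satisfiable, every larger bound is too.
    """
    if not is_satisfiable(hi):
        return None  # even the loosest bound admits no solution
    while lo < hi:
        mid = (lo + hi) // 2
        if is_satisfiable(mid):
            hi = mid  # a solution exists at mid, so the optimum is <= mid
        else:
            lo = mid + 1  # no solution at mid, so the optimum is > mid
    return lo


def toy_oracle(bound: int) -> bool:
    """Hypothetical stand-in for a SAT call (brute force over a toy instance).

    Two durations a and b in 1..9 must satisfy a >= 2 and b >= a + 3;
    the question is whether their sum can stay within `bound`.
    """
    return any(
        a + b <= bound
        for a, b in product(range(1, 10), repeat=2)
        if a >= 2 and b >= a + 3
    )


best = minimise_bound(toy_oracle, lo=0, hi=18)
print(best)  # 7 (a = 2, b = 5)
```

Each iteration halves the interval, so the number of oracle calls grows with the logarithm of `hi - lo`. That is why a tighter initial upper bound, as the abstract mentions, can shorten the search.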

### Transportation in Social Media: An Automatic Classifier for Travel-Related Tweets

In recent years, researchers in the field of intelligent transportation systems have made several efforts to extract valuable information from social media streams. However, collecting domain-specific data from any social media is a challenging task that demands appropriate and robust classification methods. In this work we focus on exploring geo-located tweets in order to create a travel-related tweet classifier using a combination of bag-of-words and word embeddings. The resulting classification makes it possible to identify interesting spatio-temporal relations in São Paulo and Rio de Janeiro.

João Pereira, Arian Pasquali, Pedro Saleiro, Rosaldo Rossetti

### A Meta-Genetic Algorithm for Hybridizing Metaheuristics

The research presented in this paper forms part of an initiative aimed at automating the design of intelligent techniques to make them more accessible to non-experts. This study focuses on automating the hybridization of metaheuristics and the parameter tuning of the individual metaheuristics. It is an initial attempt at testing the feasibility of automating this design process. A genetic algorithm is used for this purpose. Each hybrid metaheuristic is a combination of metaheuristics and corresponding parameter values. The genetic algorithm explores the space of these combinations. The genetic algorithm is evaluated by applying it to solve the symmetric travelling salesman problem. The evolved hybrid metaheuristics are found to perform competitively with the manually designed hybrid approaches from previous studies and to outperform the metaheuristics applied individually. The study has also revealed the potential reusability of the evolved hybrids. Based on the success of this initial study, different problem domains shall be used to verify the automation approach to the design of hybrid metaheuristics.

Ahmed Hassan, Nelishia Pillay

### Econometric Genetic Programming in Binary Classification: Evolving Logistic Regressions Through Genetic Programming

Logistic Regression and Genetic Programming (GP) have already been compared to each other in classification tasks. In this paper, Econometric Genetic Programming (EGP), first introduced as a regression methodology, is extended to binary classification tasks and evolves logistic regressions through GP, aiming to generate high-accuracy classifications with potential interpretability of parameters, while using statistical significance as a feature-selection tool and GP for model selection. EGP-Classification (or EGP-C), the name of this proposed extension of EGP, was tested against a large group of algorithms on three cross-sectional datasets, showing competitive results in most of them. EGP-C successfully competed against highly non-linear algorithms, like Support Vector Machines and Multilayer Perceptron with Back Propagation, and still allows interpretability of the parameters and models generated.

André Luiz Farias Novaes, Ricardo Tanscheit, Douglas Mota Dias

### GAVGA: A Genetic Algorithm for Viral Genome Assembly

Bioinformatics has grown considerably since the development of the first sequencing machine, and today it is used intensively with next-generation DNA sequencers. Viral genomes represent a great challenge to bioinformatics due to their high mutation rate, forming quasispecies in the same infected host. In this paper, we implement a genetic algorithm, named GAVGA, and evaluate its performance through the quality of a viral genome assembly. The assembly process works by first clustering the reads that share a common substring, called a seed, and then, for each cluster, using a genetic algorithm to check whether there are overlapping reads with a given similarity percentage. The assembled data are then compared to those of the Newbler, SPAdes and ABySS assemblers, and also to those of a viral assembler, VICUNA, which confirms the feasibility of our approach. GAVGA was implemented in Python 2.7+ and can be downloaded at https://sourceforge.net/projects/gavga-assembler/.

Renato R. M. Oliveira, Filipe Damasceno, Ronald Souza, Reginaldo Santos, Manoel Lima, Regiane Kawasaki, Claudomiro Sales

### Cartesian Genetic Programming in an Open-Ended Evolution Environment

In this paper we describe and analyze the use of the Cartesian Genetic Programming method to evolve Artificial Neural Networks (CGPANN) in an open-ended evolution scenario. The issue of open-ended evolution has for some time been considered one of the open problems in the field of Artificial Life. In this paper we analyze the capabilities of CGPANN to evolve behaviors in a scenario without artificial selection, more specifically, without the use of explicit fitness functions. We use the BitBang framework and one of its example scenarios as a proof of concept. The results obtained in these first experiments show that it is indeed possible to evolve CGPANN brains, in an open-ended environment, without any explicit fitness function. We also present an analysis of different parameter configurations for the CGPANN when used in this type of scenario.

António Simões, Tiago Baptista, Ernesto Costa

### A Genetic Algorithm Approach for Static Routing and Wavelength Assignment in All-Optical WDM Networks

In order to transmit data efficiently over an optical network, many routing and wavelength assignment (RWA) algorithms have been proposed. This work presents a genetic algorithm that aims at solving the RWA problem, which consists of choosing the most suitable lightpath (i.e., a combination of a route and a wavelength channel) between a source-destination pair of nodes in all-optical networks. A comparison to some already known approaches in terms of blocking probability was made. Results show a reasonable performance, since the average blocking probability achieved by the genetic algorithm was lower than or relatively equivalent to the standard approaches compared.

Diego Bento A. Teixeira, Cassio T. Batista, Afonso Jorge F. Cardoso, Josivaldo de S. Araújo

### A Recommender Model of Teaching-Learning Techniques

The creation of learning content supported by computer tools has engaged the scientific community for a couple of decades. However, teachers have been facing more and different challenges, namely the emergence of other learning delivery approaches besides the traditional educational settings, the diversification of the student target population, and the recognition of different ways of learning. In the education domain, diverse recommender systems have been developed so far for recommending learning activities and, more specifically, learning objects. This research work focuses on recommending teaching-learning techniques, assisting teachers with recommendations about which teaching-learning techniques should scaffold the teaching-learning activities to be carried out by students. This paper presents a recommender model built on diverse elements, namely a hybrid recommender system, an association rules mechanism to infer possible combinations of teaching-learning techniques, and collaborative work among several actors in education. An evaluation was carried out and the preliminary results are very encouraging, revealing that teachers seem very enthusiastic and motivated to rethink their teaching-learning techniques when designing teaching-learning activities.

Dulce Mota, Luis Paulo Reis, Carlos Vaz de Carvalho

### Credit Scoring in Microfinance Using Non-traditional Data

Emerging markets contain the vast majority of the world’s population. Despite the huge number of inhabitants, these markets still lack a proper finance infrastructure. One of the main difficulties felt by customers is access to loans. This limitation arises from the fact that most customers lack a verifiable credit history. As such, traditional banks are unable to provide loans. This paper proposes credit scoring modeling based on non-traditional data, acquired from smartphones, for loan classification processes. We use Logistic Regression (LR) and Support Vector Machine (SVM) models, which are the top performers in traditional banking. We then compare two transformations of the training datasets: creating boolean indicators and recoding using Weight of Evidence (WoE). Our models surpassed the performance of the manual loan application selection process: loans granted through the models’ criteria presented fewer overdues, and the approval criteria of the models substantially increased the amount of granted loans. Compared to the baseline, the loans approved by meeting the criteria of the SVM model presented −196.80% overdue rate. At the same time, the approval criteria of the SVM model generated 251.53% more loans. This paper shows that credit scoring can be useful in emerging markets. Non-traditional data can be used to build algorithms that identify good borrowers, as in traditional banking.

Saulo Ruiz, Pedro Gomes, Luís Rodrigues, João Gama
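
The Weight of Evidence recoding mentioned in this abstract can be illustrated with a short sketch. It assumes the common convention WoE = ln(share of good loans in a category / share of bad loans in that category); the opposite sign convention is also in use, and the abstract does not say which one the authors adopted. The function name, the label encoding (1 = good, 0 = bad) and the sample data are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, Hashable, Sequence


def weight_of_evidence(categories: Sequence[Hashable], labels: Sequence[int]) -> Dict[Hashable, float]:
    """Map each category to ln(P(category | good) / P(category | bad)).

    Assumed encoding: label 1 = good borrower, label 0 = bad borrower.
    """
    good = Counter(c for c, y in zip(categories, labels) if y == 1)
    bad = Counter(c for c, y in zip(categories, labels) if y == 0)
    total_good = sum(good.values())
    total_bad = sum(bad.values())
    woe = {}
    for category in set(categories):
        if good[category] == 0 or bad[category] == 0:
            # Unsmoothed WoE is undefined here; real pipelines usually
            # merge sparse categories or add a smoothing constant.
            raise ValueError(f"category {category!r} lacks good or bad examples")
        share_good = good[category] / total_good
        share_bad = bad[category] / total_bad
        woe[category] = math.log(share_good / share_bad)
    return woe


# Made-up data: category "a" is mostly good, category "b" is mostly bad.
cats = ["a"] * 4 + ["b"] * 4
ys = [1, 1, 1, 0, 1, 0, 0, 0]
table = weight_of_evidence(cats, ys)
recoded = [table[c] for c in cats]  # numeric feature that replaces the category
```

Unlike boolean indicators, which create one column per category, this recoding keeps a single numeric column whose sign shows whether a category leans towards good or bad borrowers.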

### Approach for Supervising Self-localization Processes in Mobile Robots

This paper proposes a supervisory approach to be applied to global localization algorithms in mobile robots. One of the objectives of this work is to increase the robustness of the estimation of the robot’s pose, favoring the early detection of the loss of spatial reference and avoiding faults like tracking derail. The proposed supervisory system is also intended to increase accuracy in localization and is based on two of the most commonly used global feature-based localization algorithms for pose tracking in robotics: Augmented Monte Carlo Localization (AMCL) and Perfect Match (PM). The experimental platform was a robotic wheelchair, and the navigation used sensory data from encoders and laser rangers. The software was developed using the ROS framework. The results showed the validity of the proposal, since the supervisor was able to coordinate the action of the AMCL and PM algorithms, benefiting the robot’s localization system with the advantages of each of the methods.

P. C. M. A. Farias, Ivo Sousa, Héber Sobreira, António Paulo Moreira

### Autonomous Interactive Object Manipulation and Navigation Capabilities for an Intelligent Wheelchair

This paper aims to develop grasping and manipulation capability along with autonomous navigation and localization in a wheelchair-mounted robotic arm to serve patients. Since the human daily environment varies dynamically, it is not possible for the robot to know in advance all the objects that it will have to grasp. We present an approach that enables the robot to detect, grasp and manipulate unknown objects. We propose an approach to construct the local reference frame that can estimate the object pose for detecting the grasp pose of an object. The main objective of this paper is to present the grasping and manipulation approach along with a navigation and localization method that can be performed in the human daily environment. A grid map and a match algorithm are used to enable the wheelchair to localize itself using a low-power computer. The experimental results show that the robot can manipulate multiple objects and can localize itself with great accuracy.

Nima Shafii, P. C. M. A. Farias, Ivo Sousa, Heber Sobreira, Luis Paulo Reis, Antonio Paulo Moreira

### Feedbot - A Robotic Arm for Autonomous Assisted Feeding

The act of assisted feeding is a challenging task that requires a good reactive planning strategy to cope with an unpredictable environment. It can be seen as a tracking task, where some end effector must travel to a moving goal. This work builds upon state of the art algorithms, such as Discriminative Optimization, making use of a Kinect camera and a modular robotic arm to implement a closed form system that performs assisted feeding. It presents two different approaches: the use of a variable rate function for updating the trajectory with information on the moving goal, and the definition of different risk regions that will shape a safer trajectory.

Catarina Silva, Jayakorn Vongkulbhisal, Manuel Marques, João Paulo Costeira, Manuela Veloso

### Improving and Benchmarking Motion Planning for a Mobile Manipulator Operating in Unstructured Environments

This paper presents the use, adaptation and benchmarking of motion planning tools that will be integrated with the KUKA KMR iiwa mobile robot. The motion planning tools are integrated in the robotic agent presented in [1]. The adaptation consists of algorithms developed to increase the robustness and efficiency of solving the motion planning problems. These algorithms combine existing motion planners with a trajectory filter developed in this work. Finally, the benchmarking of different motion planners is presented. Three motion planning tasks with a growing level of complexity are taken into consideration for the tests in a simulation environment. The motion planners that provided the best results were RRTConnect for the two less complex tasks and PRM* for the most difficult task.

Andrea Tudico, Nuno Lau, Eurico Pedrosa, Filipe Amaral, Claudio Mazzotti, Marco Carricato

### Exploring Resampling with Neighborhood Bias on Imbalanced Regression Problems

Imbalanced domains are an important problem that arises in predictive tasks, causing a loss in performance on the cases most relevant to the user. This problem has been intensively studied for classification problems. Recently it was recognized that imbalanced domains occur in several other contexts and for a diversity of types of tasks. This paper focuses on imbalanced regression tasks. Resampling strategies are among the most successful approaches to imbalanced domains. In this work we propose variants of existing resampling strategies that are able to take into account information regarding the neighborhood of the examples. Instead of performing sampling uniformly, our proposals bias the strategies towards reinforcing some regions of the data sets. In an extensive set of experiments we provide evidence of the advantage of introducing a neighborhood bias in the resampling strategies.

Paula Branco, Luís Torgo, Rita P. Ribeiro

### A Feature Selection Algorithm Based on Heuristic Decomposition

Feature selection is one of the most important concepts in data mining when dimensionality reduction is needed. The performance measures of feature selection encompass predictive accuracy and result comprehensibility. Consistency based feature selection is a significant category of feature selection research that substantially improves the comprehensibility of the result using the parsimony principle. In this work, the feature selection algorithm LAID, Logical Analysis of Inconsistent Data, is applied to large volumes of data. In order to deal with hundreds of thousands of attributes, a problem decomposition strategy associated with a set covering problem formulation is used. The algorithm is applied to artificial datasets with genome-like characteristics of patients with rare diseases.

Luís Cavique, Armando B. Mendes, Hugo F. M. C. Martiniano

### Mining Rational Team Concert Repositories: A Case Study on a Software Project

Software repositories are key to support the development of software. In this article, we present a Mining Software Repositories (MSR) approach that considered a two-year software project repository, set using the Rational Team Concert (RTC) tool. Such MSR was designed in terms of three main components: RTC data extraction, RTC data mining and design of RTC intelligence dashboard. In particular, we focus more on the data extraction component, although we also present mining and dashboard outcomes. Interesting results were achieved, revealing a potential of the proposed MSR to improve the software project planning/development agility and quality.

Pedro Cunha, André Ferreira, Paulo Cortez

### Predictive Teaching and Learning

In this paper, we present a study of students’ behavior based on activity logs in Moodle (an online Learning Management System, LMS), analyzing three characteristics: online time (separated by its location), tasks delivered and support material views. We relate these three characteristics to the students’ performance (i.e. success, fail and dropout) and provide a generalization into four groups of students (based on their behavior on the LMS). After analyzing these characteristics, we evaluate the correlation between each characteristic and the individual student performance, identifying a promising feature to enrich predictive algorithms. Finally, we generate a Naïve Bayes model to predict whether the student will succeed, fail or drop out. To evaluate the prediction, we compare the models generated with only the performance data against the models generated with the enriched data, according to the previously analyzed features. The results show that the enriched-data models are more accurate and may help the teacher to identify “at risk” students.

Cristiano Galafassi, Fabiane Flores Penteado Galafassi, Rosa Maria Vicari

### Multi-objective Learning of Neural Network Time Series Prediction Intervals

In this paper, we address multi-step ahead time series Prediction Intervals (PI). We extend two Neural Network (NN) methods, Lower Upper Bound Estimation (LUBE) and Multi-objective Evolutionary Algorithm (MOEA) LUBE (MLUBE), for multi-step PI. Furthermore, we propose two new MOEA methods based on a 2-phase gradient and MOEA based learning: M2LUBET1 and M2LUBET2. Also, we present a robust evaluation procedure to compare PI methods. Using four distinct seasonal time series, we compared all four PI methods. Overall, competitive results were achieved by the 2-phase learning methods, in terms of both predictive performance and computational effort.

Pedro José Pereira, Paulo Cortez, Rui Mendes

### Toward a Token-Based Approach to Concern Detection in MATLAB Sources

Matrix and data manipulation programming languages are an essential tool for data analysts. However, these languages are often unstructured and lack modularity mechanisms. This paper presents a business intelligence approach for studying the manifestations of the lack of modularity support in such languages. The study is focused on MATLAB as a well-established representative of those languages. We present a technique for the automatic detection and quantification of concerns in MATLAB, as well as their exploration in a code base. Ubiquitous Self Organizing Map (UbiSOM) is used, based on direct usage of indicators representing different sets of tokens in the code. UbiSOM is quite effective at detecting patterns of co-occurrence between multiple concerns. To illustrate, a repository comprising over 35,000 MATLAB files is analyzed using the technique and relevant conclusions are drawn.

Miguel P. Monteiro, Nuno C. Marques, Bruno Silva, Bruno Palma, João Cardoso

### Food Truck Recommendation Using Multi-label Classification

Food trucks are vehicles in which fast food, from various cuisines, is cooked and sold. They have become popular in several countries and usually offer food in different locations of a city. Frequently, several food trucks offer their dishes at music concerts, festivals and other events. When several food trucks are present in a place, the variety of possible cuisines and food dishes makes choosing among them a challenging task for the public. This paper describes the task of recommending food trucks using a multi-label classification approach, where more than one option can be suggested. The recommendation is made using customers’ personal information and preferences. Six multi-label transformation strategies were used to induce learning models from real data obtained via market research, in which hundreds of participants provided their food preferences. The experimental results show that the strategies outperformed the adopted baseline in almost all cases, with RAndom k-labELsets (RAkEL) and Binary Relevance (BR), in that order, achieving the best overall results. Nevertheless, further investigation is required to improve the predictive outcome of the task. From a machine learning perspective, a new way to analyze multi-label results, called confusion matrix plot, is discussed and the food truck dataset is released as a new multi-label benchmark.

Adriano Rivolli, Larissa C. Parker, Andre C. P. L. F. de Carvalho

### Improving Incremental Recommenders with Online Bagging

Online recommender systems often deal with continuous, potentially fast and unbounded flows of data. Ensemble methods for recommender systems have been used in the past in batch algorithms; however, they have never been studied with incremental algorithms that learn from data streams. We evaluate online bagging with an incremental matrix factorization algorithm for top-N recommendation with positive-only user feedback, often known as binary ratings. Our results show that online bagging is able to improve accuracy by up to 35% over the baseline, with small computational overhead.

João Vinagre, Alípio Mário Jorge, João Gama
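
Online bagging is commonly implemented following Oza and Russell's scheme, in which every incoming example is shown to each base learner k times, with k drawn from a Poisson(1) distribution. The sketch below illustrates that scheme only. Whether the paper uses exactly this variant is an assumption, and `PopularityModel` is a hypothetical, deliberately trivial stand-in for the incremental matrix factorization learner named in the abstract.

```python
import math
import random
from collections import Counter


def poisson(lam: float, rng: random.Random) -> int:
    """Draw from Poisson(lam) with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


class PopularityModel:
    """Hypothetical base learner: scores an item by how often it was seen."""

    def __init__(self):
        self.counts = Counter()

    def update(self, user, item):
        self.counts[item] += 1

    def score(self, user, item):
        return self.counts[item]


class OnlineBagging:
    """Ensemble in which each model sees each example Poisson(1) times."""

    def __init__(self, make_model, n_models, seed=0):
        self.rng = random.Random(seed)
        self.models = [make_model() for _ in range(n_models)]

    def update(self, user, item):
        for model in self.models:
            # k = 0 skips the example; k > 1 repeats it, which mimics
            # bootstrap sampling without storing the stream.
            for _ in range(poisson(1.0, self.rng)):
                model.update(user, item)

    def recommend(self, user, candidates, n):
        """Return the n candidates with the highest average score."""
        def average(item):
            return sum(m.score(user, item) for m in self.models) / len(self.models)
        return sorted(candidates, key=average, reverse=True)[:n]
```

Because every example is processed once on arrival and then discarded, the ensemble stays incremental. The extra cost is roughly proportional to the number of base models, since each model performs one update per example on average.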

### Tableaux for Hybrid XPath with Data

We provide a sound, complete and terminating tableau procedure to check satisfiability of downward XPath$$_=$$ formulas enriched with nominals and satisfaction operators. The calculus is inspired by ideas introduced to ensure termination of tableau calculi for certain Hybrid Logics. We prove that even though we increased the expressive power of XPath by introducing hybrid operators, the satisfiability problem for the obtained logic is still PSpace-complete.

Carlos Areces, Raul Fervari, Nahuel Seiler

### On the Properties of Atom Definability and Well-Supportedness in Logic Programming

We analyse alternative extensions of stable models for non-disjunctive logic programs with arbitrary Boolean formulas in the body, and examine two semantic properties. The first property, which we call atom definability, allows one to replace any expression in rule bodies by an auxiliary atom defined by a single rule. The second property, well-supportedness, was introduced by Fages and dictates that it must be possible to establish a derivation ordering for all true atoms in a stable model so that self-supportedness is not allowed. We start from a generic fixpoint definition for well-supportedness that deals with: (1) a monotonic basis, for which we consider the whole range of intermediate logics; and (2) an assumption function, which determines which type of negated formulas can be added as defaults. Assuming that we take the strongest underlying logic in such a case, we show that only Equilibrium Logic satisfies both atom definability and strict well-supportedness.

Pedro Cabalar, Jorge Fandinno, Luis Fariñas, David Pearce, Agustín Valverde

### haspie - A Musical Harmonisation Tool Based on ASP

In this paper we describe a musical harmonisation and composition assistant based on Answer Set Programming (ASP). The tool takes scores in MusicXML format and annotates them with a preferred harmonisation. If specified, it is also able to complete intentionally blank sections and create new parts of the score that fit with the proposed harmonisation. Both the harmonisation and the completion of blank parts can be seen as constraint satisfaction problems that are encoded in ASP. Although the tool is a preliminary prototype still being improved, its basic functionality already helps to illustrate the appropriateness of ASP for musical knowledge representation, which provides a high degree of flexibility thanks to its relational, declarative orientation and an efficient computation of preferred solutions.

Pedro Cabalar, Rodrigo Martín

### Iterative Variable Elimination in ASP

In recent years, a large variety of approaches for forgetting in Answer Set Programming (ASP) have been proposed, in the form of specific operators, or classes of operators, following different principles and obeying different properties. A recent comprehensive overview of existing operators and properties provides a uniform picture of the landscape, including many novel results on relations between properties and operators. In this paper, we introduce four new properties not considered previously and show that these are indeed succinct and relevant additions providing novel results and insights, further strengthening established relations between existing operators. Most notably among these, the invariance to permutations of the order of forgetting a set of atoms iteratively raises interesting questions with surprising results.

Ricardo Gonçalves, Matthias Knorr, João Leite

### Logic-Based Encodings for Ricochet Robots

Studying the performance of logic tools on solving a specific problem can bring new insights on the use of different paradigms. This paper provides an empirical evaluation of logic-based encodings for a well known board game: Ricochet Robots. Ricochet Robots is a board game where the goal is to find the smallest number of moves needed for one robot to move from the initial position to a target position, while taking into account the existing barriers and other robots. Finding a solution to the Ricochet Robots problem is NP-hard. In this work we develop logic-based encodings for the Ricochet Robots problem to feed into Boolean Satisfiability (SAT) solvers. When appropriate, advanced techniques are applied to further boost the performance of a solver. A comparison between the performance of SAT solvers and an existing ASP solution clearly shows that SAT is by far the more adequate technology to solve the Ricochet Robots problem.

Filipe Gouveia, Pedro T. Monteiro, Vasco Manquinho, Inês Lynce

### An Achilles’ Heel of Term-Resolution

Term-resolution provides an elegant mechanism to prove that a quantified Boolean formula (QBF) is true. It is a dual to Q-resolution and is highly important in practice, as it enables certifying answers of DPLL-based QBF solvers. While term-resolution and Q-resolution are very similar, they are not completely symmetrical. In particular, Q-resolution operates on clauses and term-resolution operates on models of the matrix. This paper investigates the impact of this asymmetry. We will see that there is a large class of formulas (formulas with “big models”) whose term-resolution proofs are exponential. As a possible remedy, the paper suggests proving true QBFs by refuting their negation (negate-refute), rather than proving them by term-resolution. The paper shows that from the theoretical perspective this is indeed a favorable approach. In particular, negation-refutation p-simulates term-resolution and there is an exponential separation between the two calculi. These observations further our understanding of proof systems for QBFs and provide a strong theoretical underpinning for the effort towards non-CNF QBF solvers.

Mikoláš Janota, Joao Marques-Silva

### Horn Maximum Satisfiability: Reductions, Algorithms and Applications

Recent years have witnessed remarkable performance improvements in maximum satisfiability (MaxSAT) solvers. In practice, MaxSAT algorithms often target the most generic MaxSAT formulation, whereas dedicated solvers, which address specific subclasses of MaxSAT, have not been investigated. This paper shows that a wide range of optimization and decision problems are either naturally formulated as MaxSAT over Horn formulas, or permit simple encodings using HornMaxSAT. Furthermore, the paper also shows how linear time decision procedures for Horn formulas can be used for developing novel algorithms for the HornMaxSAT problem.

Joao Marques-Silva, Alexey Ignatiev, Antonio Morgado

### Not Too Big, Not Too Small... Complexities of Fixed-Domain Reasoning in First-Order and Description Logics

We consider reasoning problems in description logics and variants of first-order logic under the fixed-domain semantics, where the model size is finite and explicitly given. It follows from previous results that standard reasoning is NP-complete for a very wide range of logics, if the domain size is given in unary encoding. In this paper, we complete the complexity overview for unary encoding and investigate the effects of binary encoding with partially surprising results. Most notably, fixed-domain standard reasoning becomes NExpTime for the rather low-level description logics $$\mathcal {ELI}$$ and $$\mathcal {ELF}$$ (as opposed to ExpTime when no domain size is given). On the other hand, fixed-domain reasoning remains NExpTime even for first-order logic, which is undecidable under the unconstrained semantics. For less expressive logics, we establish a generic criterion ensuring NP-completeness of fixed-domain reasoning. Amongst other logics, this criterion captures all the tractable profiles of OWL 2.

Sebastian Rudolph, Lukas Schweizer

### Reactive Maintenance Policies over Equalized States in Dynamic Environments

We address the problem of representing and verifying the behavior of an agent following a policy in dynamic environments. Our focus is on policies that yield sequences of actions, according to the present knowledge in the state, with the aim of reaching some main goal. We distinguish certain cases where the dynamic nature of the environment may require the agent to stop and revise its next actions. We employ the notion of maintenance to check whether a given policy can maintain the conditions of the main goal, given a respite from environment actions. Furthermore, we apply state clustering to mitigate the large state spaces caused by having irrelevant information in the states, and under some conditions this clustering might change the worst-case complexity. By preserving the behavior of the policy, it helps in checking for maintenance with a guarantee that the result also holds in the original system.

Zeynep G. Saribatur, Chitta Baral, Thomas Eiter

### Multi-agent Based File Replication and Consistency Management

Replication is a well-known technique in distributed systems. Replication can reduce data access latency, enhance and optimize the availability and reliability of the entire system. The existence of multiple instances of data however causes additional issues. The main issue is consistency of data. In this paper we present a multi-agent based approach for file replication and consistency management. We describe the design of a multi-agent system using the JADE platform. The system presents a multi-agent based communication framework that enables the replication and maintenance of files.

### Online Learning for Conversational Agents

Agents relying on large collections of interactions face the challenge of choosing an appropriate answer from such collections. Several works address this challenge by using offline learning approaches, which do not take advantage of how user-agent conversations unfold. In this work, we propose an alternative approach: incorporating user feedback at each interaction with the agent, in order to enhance its ability to choose an answer. We focus on the case of adjusting the weights of the features used by the agent to choose an answer, using an online learning algorithm (the Exponentially Weighted Average Forecaster) for that purpose. We validate our hypothesis with an experiment featuring a specific agent and simulating user feedback using a reference corpus. The results of our experiment suggest that adjusting the agent’s feature weights can improve its answers, provided that an appropriate reward function is designed, as this aspect is critical to the agent’s performance.

Vânia Mendonça, Francisco S. Melo, Luísa Coheur, Alberto Sardinha
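
The weight adjustment described in this abstract can be illustrated with a minimal sketch of an exponentially weighted update. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names `choose_answer` and `update_weights`, the learning rate `eta`, the toy scores, and in particular the reward function (each feature is rewarded with the score it gave to the answer that the simulated feedback marks as correct).

```python
import math

def choose_answer(weights, feature_scores):
    """Return the index of the candidate with the highest weighted score.

    feature_scores[i][j] is the score that feature i assigns to candidate j.
    """
    n_candidates = len(feature_scores[0])
    combined = [
        sum(w * scores[j] for w, scores in zip(weights, feature_scores))
        for j in range(n_candidates)
    ]
    return max(range(n_candidates), key=combined.__getitem__)

def update_weights(weights, rewards, eta=0.5):
    """Exponentially weighted update: w_i <- w_i * exp(eta * r_i), renormalized."""
    raw = [w * math.exp(eta * r) for w, r in zip(weights, rewards)]
    total = sum(raw)
    return [w / total for w in raw]

# Toy run with two features and three candidate answers (illustrative values).
weights = [0.5, 0.5]
feature_scores = [
    [0.9, 0.1, 0.0],  # feature 0 prefers candidate 0
    [0.2, 0.7, 0.1],  # feature 1 prefers candidate 1
]
first_choice = choose_answer(weights, feature_scores)

# Simulated feedback: candidate 1 is the reference answer in every round.
# Hypothetical reward: the score each feature gave to the reference answer.
correct = 1
for _ in range(20):
    rewards = [scores[correct] for scores in feature_scores]
    weights = update_weights(weights, rewards)

final_choice = choose_answer(weights, feature_scores)
print(first_choice, final_choice, [round(w, 4) for w in weights])
```

In this toy setting the feature that agrees with the feedback gradually dominates, so the selected answer changes. A different reward function would lead to different weights, which is consistent with the abstract's remark that the reward design is critical.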

### Simulating Behaviors of Children with Autism Spectrum Disorders Through Reversal of the Autism Diagnosis Process

Children affected by Autism Spectrum Disorders (ASD) exhibit behaviors that may vary drastically from child to child. The goal of achieving accurate computer simulations of behavioral responses to given stimuli for different ASD severities is a difficult one, but it could unlock interesting applications such as informing the algorithms of agents designed to interact with those individuals. This paper demonstrates a novel research direction for high-level simulation of behaviors of children with ASD by exploiting the structure of available ASD diagnosis tools. Building on the observation that the simulation process is in fact the reverse of the diagnosis process, we take advantage of the structure of the Autism Diagnostic Observation Schedule (ADOS), a state-of-the-art standardized tool used by therapists to diagnose ASD, in order to build our ADOS-Based Autism Simulator (ABASim). We first define the ADOS-Based Autism Space (ABAS), a feature space that captures individual behavioral differences. Using this space as a high-level behavioral model, the simulator is able to stochastically generate behavioral responses to given stimuli, consistent with provided child descriptors, namely ASD severity, age and language ability. Our method is informed by and generalizes from real ADOS data collected on 67 children with different ASD severities, whose correlational profile is used as our basis for the generation of the feature vectors used to select behaviors.

Kim Baraka, Francisco S. Melo, Manuela Veloso

### An Adaptive Simulation Tool for Evacuation Scenarios

Building useful and efficient models and tools for a varied audience, such as evacuation simulators for scientists, engineers and crisis managers, can be tricky. Even good models can fail to provide information when the tools available to the user lack resources. The aim of this work is to propose a new tool that covers the most required features in evacuation scenarios. This paper starts with a review of current software, prototypes and models simulating evacuation scenarios, discussing their required and desired features. Based on this overview, we propose our simulator and compare it with other models and commercial tools. Moreover, we discuss the importance of building simulators that cover the minimum requirements, to avoid the risk of building inefficient models or tools that do not provide enough insight for users to make the right decisions on security policies for crowded events. The contributions of this work are a new simulation tool and the start of a discussion in this research field on the mandatory features of evacuation simulation tools that provide valuable information to users, and on the criteria for defining these features.

Daniel Formolo, C. Natalie van der Wal

### A Stochastic Approach of SIRC Model Using Individual-Based Epidemiological Models

Mathematical models are important instruments in epidemiology, assisting in the analysis of epidemiological dynamics and of possible dissemination controls. Classical models use differential equations to describe the dynamics of a population over time. A widely used example is the susceptible-infected-recovered (SIR) compartmental model, which has been used to obtain optimum control policies in different scenarios. This model has been extended to include the dynamics of reinfection by adding a new compartment, yielding the susceptible-infected-recovered-cross-immune (SIRC) model. An alternative is to consider each individual as a string or vector of characteristic data and to simulate the contagion and recovery processes computationally. This type of model, referred to in the literature as an individual-based model (IBM), has the advantage of being flexible, since the characteristics of each individual can be quite complex, involving, for instance, age, sex, pre-existing health conditions, environmental factors, social factors, and habits. However, no IBM equivalent to the SIRC model was found in the literature. Some works have shown the possibility of equivalence between IBM and SIR models, in order to simulate similar scenarios with models of different natures, in the deterministic and stochastic cases respectively. In this context, this work proposes the implementation of a stochastic IBM equivalent to the SIRC model. Results show that equivalence is possible, but only with the proper configuration of the parameters of the IBM. The accuracy of the equivalent model improved with a reduction of the time step and an increase in the size of the population.

Arlindo Rodrigues Galvão Filho, Telma Woerle de Lima, Anderson da Silva Soares, Clarimar Jose Coelho
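
To make the idea of an individual-based counterpart of SIRC concrete, the following is a minimal sketch and not the authors' implementation. It assumes one common reading of the SIRC compartments: each individual carries a state in {S, I, R, C}, and in each time step of length `dt` it changes state with probability equal to a rate times `dt`. The parameter names (`beta`, `alpha`, `delta`, `gamma`, `sigma`, `mu`), their values, and the functions `step` and `simulate` are all illustrative assumptions.

```python
import random
from collections import Counter

# Illustrative rates (assumptions, not values from the paper).
PARAMS = {
    "beta": 2.0,   # contact/transmission rate
    "alpha": 1.0,  # recovery rate, I -> R
    "delta": 0.5,  # loss of full immunity, R -> C
    "gamma": 0.3,  # loss of cross-immunity, C -> S
    "sigma": 0.2,  # probability that an exposed C individual is reinfected
    "mu": 0.02,    # birth/death rate (population size kept constant)
}

def step(states, params, dt, rng):
    """Advance every individual by one time step of length dt."""
    n = len(states)
    # Per-capita infection pressure, computed from the state before the update.
    force = params["beta"] * states.count("I") / n
    new_states = []
    for s in states:
        if rng.random() < params["mu"] * dt:
            new_states.append("S")  # death, replaced by a susceptible newborn
            continue
        u = rng.random()
        if s == "S" and u < force * dt:
            s = "I"
        elif s == "I" and u < params["alpha"] * dt:
            s = "R"
        elif s == "R" and u < params["delta"] * dt:
            s = "C"
        elif s == "C":
            p_exposed = force * dt
            if u < p_exposed:
                # Exposure either reinfects or boosts immunity.
                s = "I" if rng.random() < params["sigma"] else "R"
            elif u < p_exposed + params["gamma"] * dt:
                s = "S"
        new_states.append(s)
    return new_states

def simulate(n=500, initial_infected=5, steps=100, dt=0.1, seed=0):
    """Run the sketch and return the compartment counts at every step."""
    rng = random.Random(seed)
    states = ["I"] * initial_infected + ["S"] * (n - initial_infected)
    history = [Counter(states)]
    for _ in range(steps):
        states = step(states, PARAMS, dt, rng)
        history.append(Counter(states))
    return history

history = simulate()
print(dict(history[-1]))
```

Because transition probabilities are approximated as rate times `dt`, the sketch is only meaningful when those products stay well below 1. This fits the abstract's observation that accuracy improves with a smaller time step and a larger population.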

### Incorporating Learning into Decision Making in Agent Based Models

Most of the current work in social simulation is focused on building multi-agent systems that cooperate and collaborate to exhibit some kind of collective or social behavior. These agents are coded as rule-based or state-transition based systems. Often, the choice function of the agents is either hard-wired or dependent on environmental changes, but there is no explicit learning based on historical performance. There has been some recent work exploring methods for incorporating machine learning into multi-agent systems to capture adaptive behaviors. The goal of this paper is to expand upon the discussion around designing adaptive behaviors in agent based models by comparing multiple techniques for modeling learning, including: (1) applying machine learning and symbolic regression to use historical patterns to design learning mechanisms, (2) modeling behavioral economic principles to capture “irrational” or non-optimal learning, and (3) simulating reinforcement learning techniques such as Q-learning to model a direct reward structure to improve learning outcomes at both the individual and group level. An example model has been built that applies these three learning techniques to simulate how people invest for retirement. The outcomes of each simulated learning technique are then used to identify lessons learned on when each technique should be applied.

Pia Ramchandani, Mark Paich, Anand Rao

### The Complementary Nature of Different NLP Toolkits for Named Entity Recognition in Social Media

In this paper we study the combined use of four different NLP toolkits (Stanford CoreNLP, GATE, OpenNLP and Twitter NLP tools) in the context of social media posts. Previous studies have shown performance comparisons between these tools, both on news and social media corpora. In this paper, we go further by trying to understand how differently these toolkits predict Named Entities, in terms of their precision and recall for three different entity types, and how they can complement each other in this task in order to achieve a combined performance superior to each individual one. Experiments on two publicly available datasets from the workshops WNUT-2015 and #MSM2013 show that using an ensemble of toolkits can improve the recognition of specific entity types: up to 10.62% for the entity type Person, 1.97% for the type Location and 1.31% for the type Organization, depending on the dataset and the criteria used for the voting. Our results also showed improvements of 3.76% and 1.69%, in each dataset respectively, on the average performance of the three entity types.

Filipe Batista, Álvaro Figueira
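
The voting idea mentioned in this abstract can be sketched as follows. This is a hedged illustration only: the abstract does not specify the voting criteria, so the `min_votes` threshold, the token-level label format, the function name `vote`, and the four toy prediction lists are assumptions. The sketch does not call any of the real toolkits.

```python
from collections import Counter

def vote(predictions, min_votes=2, outside="O"):
    """Combine token-level entity labels from several toolkits by voting.

    predictions is a list with one label sequence per toolkit; all sequences
    must cover the same tokens. A label is kept only if at least min_votes
    toolkits agree on it; otherwise the token is labeled as outside.
    """
    combined = []
    for token_labels in zip(*predictions):
        counts = Counter(label for label in token_labels if label != outside)
        if counts:
            label, n = counts.most_common(1)[0]
            combined.append(label if n >= min_votes else outside)
        else:
            combined.append(outside)
    return combined

# Hypothetical outputs of four toolkits for the same five tokens.
tokens = ["Ana", "visited", "Porto", "with", "Acme"]
toolkit_a = ["PER", "O", "LOC", "O", "ORG"]
toolkit_b = ["PER", "O", "LOC", "O", "O"]
toolkit_c = ["O", "O", "ORG", "O", "O"]
toolkit_d = ["PER", "O", "LOC", "O", "PER"]

ensemble = vote([toolkit_a, toolkit_b, toolkit_c, toolkit_d])
print(list(zip(tokens, ensemble)))
```

Raising `min_votes` trades recall for precision, which is one way in which the choice of voting criteria can change the results for each entity type.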

### Aspect-Based Opinion Mining in Drug Reviews

Aspect-based opinion mining can be applied to extract relevant information expressed by patients in drug reviews (e.g., adverse reactions, efficacy of a drug, symptoms and conditions of patients). This new domain of application presents challenges as well as opportunities for research in opinion mining. Nevertheless, the literature still offers few methods to extract the multiple relevant aspects present in drug reviews. In this paper we propose a method to extract and classify aspects in drug reviews. The proposed solution has two main steps. In the aspect extraction step, a method based on syntactic dependency paths is proposed to extract opinion pairs in drug reviews, each composed of an aspect term associated with a sentiment modifier. In the aspect classification step, a supervised classifier based on domain and linguistic resources is proposed to classify the opinion pairs by aspect type (e.g., condition, adverse reaction, dosage and effectiveness). To evaluate the proposed method, we conducted experiments with datasets related to three different conditions: ADHD, AIDS and Anxiety. Promising results were obtained in the experiments, and various issues were identified and discussed.

Diana Cavalcanti, Ricardo Prudêncio

### Unsupervised Approaches for Computing Word Similarity in Portuguese

This paper presents several approaches for computing word similarity in Portuguese and is motivated by the recent availability of state-of-the-art distributional models of Portuguese words, which add to several lexical knowledge bases (LKBs) for this language that have been available for longer. These resources were exploited to answer word similarity tests, which have also recently become available for Portuguese. We conclude that there are several valid approaches for this task, but not one that outperforms all the others in every single test. For instance, distributional models seem to capture relatedness better, but LKBs are better suited for computing genuine similarity.

Hugo Gonçalo Oliveira

### Gradually Improving the Computation of Semantic Textual Similarity in Portuguese

There is much research on Semantic Textual Similarity (STS) in English, especially since its inclusion in the SemEval evaluations. For other languages, it is not as common, mostly due to the unavailability of benchmarks. Recently, the ASSIN shared task targeted STS in Portuguese and released training and test collections. This paper describes an incremental approach to ASSIN, where the computed similarity is gradually improved by exploiting different features (e.g., token overlap, semantic relations, chunks, and negation) and approaches. The best reported results, obtained with a supervised approach, would get second place overall in ASSIN.

Hugo Gonçalo Oliveira, Ana Oliveira Alves, Ricardo Rodrigues

### Towards a Mention-Pair Model for Coreference Resolution in Portuguese

The aim of coreference resolution is to automatically determine all linguistic expressions included in a piece of text that refer to the same entity. Following the mention-pair model, we employ machine learning techniques to address coreference resolution from text written in Portuguese. Based on a modest annotated corpus, we highlight the impact that different training-set creation strategies have on the quality of the predictions made by the system. We conclude that enriching the system with semantic-based features significantly improves the overall performance of the system.

Gil Rocha, Henrique Lopes Cardoso

### Recognizing Textual Entailment and Paraphrases in Portuguese

The aim of textual entailment and paraphrase recognition is to determine whether the meaning of a text fragment can be inferred (is entailed) from the meaning of another text fragment. In this paper, we address the task of automatically recognizing textual entailment (RTE) and paraphrases in text written in Portuguese, employing supervised machine learning techniques. First, we formulate the task as a multi-class classification problem. We conclude that semantic-based approaches are very promising for recognizing textual entailment and that combining data from European and Brazilian Portuguese brings several challenges typical of cross-language learning. Then, we formulate the task as a binary classification problem and demonstrate the capability of the proposed classifier for RTE and paraphrases. The results reported in this work are promising, achieving an accuracy of 0.83 on the test data.

Gil Rocha, Henrique Lopes Cardoso

### Learning Word Embeddings from the Portuguese Twitter Stream: A Study of Some Practical Aspects

This paper describes a preliminary study for producing and distributing a large-scale database of embeddings from the Portuguese Twitter stream. We start by experimenting with a relatively small sample and focusing on three challenges: volume of training data, vocabulary size and intrinsic evaluation metrics. Using a single GPU, we were able to scale up the vocabulary size from 2048 embedded words and 500K training examples to 32768 words over 10M training examples, while keeping a stable validation loss and an approximately linear trend in training time per epoch. We also observed that using less than 50% of the available training examples for each vocabulary size might result in overfitting. Results on intrinsic evaluation show promising performance for a vocabulary size of 32768 words. Nevertheless, intrinsic evaluation metrics suffer from over-sensitivity to their corresponding cosine similarity thresholds, indicating that a wider range of metrics needs to be developed to track progress.

Pedro Saleiro, Luís Sarmento, Eduarda Mendes Rodrigues, Carlos Soares, Eugénio Oliveira

### Backmatter
