main-content

Über dieses Buch

This book comprises selected research papers from the 2015 edition of the EVOLVE conference, which was held on June 18–June 24, 2015 in Iași, Romania. It presents the latest research on Probability, Set Oriented Numerics, and Evolutionary Computation. The aim of the EVOLVE conference was to provide a bridge between probability, set oriented numerics and evolutionary computation and to bring together experts from these disciplines. The broad focus of the EVOLVE conference made it possible to discuss the connection between these related fields of study computational science. The selected papers published in the proceedings book were peer reviewed by an international committee of reviewers (at least three reviews per paper) and were revised and enhanced by the authors after the conference. The contributions are categorized into five major parts, which are:

Multicriteria and Set-Oriented Optimization; Evolution in ICT Security; Computational Game Theory; Theory on Evolutionary Computation; Applications of Evolutionary Algorithms.

The 2015 edition shows a major progress in the aim to bring disciplines together and the research on a number of topics that have been discussed in previous editions of the conference matured over time and methods have found their ways in applications. In this sense the book can be considered an important milestone in bridging and thereby advancing state-of-the-art computational methods.

Inhaltsverzeichnis

Aggregate Selection in Multi-objective Biochemical Optimization via the Average Cuboid Volume Indicator

Abstract
The identification of peptides that optimize several physiochemical properties is an important task in the drug design process. Multi-objective genetic algorithms are efficient and cost-effective methods to scan the highly complex search space for optimal candidate peptides. A multi-objective genetic algorithm called NGSA-II has been proposed in previous work with the aim of producing diverse high-quality peptides in a low number of generations. An important component of NSGA-II is the selection process that determines the high-quality individuals for the succeeding generation. This paper presents two kinds of selection strategies for NSGA-II to guide the search process towards high-quality peptides while maintaining diversity within the genetic material. The proposed selection strategies rely both on tournaments, and use a combination of fitness-proportionate selection and a discerning selection criterion, which is front-based in one case and indicator-based in the other case. The two strategies are compared to each other with respect to the search behavior on a generic three-dimensional molecular minimization problem.
Susanne Rosenthal, Bernd Freisleben, Markus Borschbach

On Gradient-Based and Swarm-Based Algorithms for Set-Oriented Bicriteria Optimization

Abstract
This paper is about the numerical solution of multiobjective optimization problems in continuous spaces. The problem is to define a search direction and a dynamical adaptation scheme for sets of vectors that serve as approximation sets. Two algorithmic concepts are compared: These are stochastic optimization algorithms based on cooperative particle swarms, and a deterministic optimization algorithm based on set-oriented gradients of the hypervolume indicator. Both concepts are instantiated as algorithms, which are deliberately kept simple in order to not obfuscate their discussion. It is shown that these algorithms are capable of approximating Pareto fronts iteratively. The numerical studies of the paper are restricted to relatively simple and low dimensional problems. For these problems a visualization of the convergence dynamics was implemented that shows how the approximation set converges to a diverse cover of the Pareto front and efficient set. The demonstration of the algorithms is implemented in Java Script and can therefore run from a website in any conventional browser. Besides using it to reproduce the findings of the paper, it is also suitable as an educational tool in order to demonstrate the idea of set-based convergence in Pareto optimization using stochastic and deterministic search.
Wilco Verhoef, André H. Deutz, Michael T. M. Emmerich

Quadcriteria Optimization of Binary Classifiers: Error Rates, Coverage, and Complexity

Abstract
This paper presents a 4-objective evolutionary multiobjective optimization study for optimizing the error rates (false positives, false negatives), reliability, and complexity of binary classifiers. The example taken is the email anti-spam filtering problem.
The two major goals of the optimization is to minimize the error rates that is the false negative rate and the false positive rate. Our approach discusses three-way classification, that is the binary classifier can also not classify an instance in cases where there is not enough evidence to assign the instance to one of the two classes. In this case the instance is marked as suspicious but still presented to the user. The number of unclassified (suspicious) instances should be minimized, as long as this does not lead to errors. This will be termed the coverage objective. The set (ensemble) of rules needed for the anti-spam filter to operate in optimal conditions is addressed as a fourth objective. All objectives stated above are in general conflicting with each other and that is why we address the problem as a 4-objective (quadcriteria) optimization problem. We assess the performance of a set of state-of-the-art evolutionary multiobjective optimization algorithms. These are NSGA-II, SPEA2, and the hypervolume indicator-based SMS-EMOA. Focusing on the anti-spam filter optimization, statistical comparisons on algorithm performance are provided on several benchmarks and a range of performance indicators. Moreover, the resulting 4-D Pareto hyper-surface is discussed in the context of binary classifier optimization.
Vitor Basto-Fernandes, Iryna Yevseyeva, David Ruano-Ordás, Jiaqi Zhao, Florentino Fdez-Riverola, José Ramón Méndez, Michael T. M. Emmerich

Parameter Identification of Stochastic Gene Regulation Models by Indicator-Based Evolutionary Level Set Approximation

Abstract
Continuous level set approximation seeks to find a set of points (parameter vectors) that approximates the set of sets of parameters that for a given function map to a given output value. In this work we will look at a class of difficult to solve level set problems with innumerably many solutions and show how their solution sets can be approximated by robust evolutionary search methods.
In particular, this paper seeks to solve noisy parameter identification problem from biology where the task is to find the set of parameter settings of a stochastic gene regulatory network, simulated by Gillespie’s algorithm with delays, that match existing observations. In this context the necessity of active diversity maintenance and adaptation of search operators to find all feasible subspaces is studied. As a result a robust implementation and default setting of evolutionary level set approximation (ELSA) for noisy parameter identification problems will be developed and validated. The validation uses two classical gene regulatory networks and it is demonstrated that a larger set of reaction parameters can be found that potentially could explain the observed stochastic dynamics.
Alexander Nezhinsky, Michael T. M. Emmerich

On Using Cognition for Anomaly Detection in SDN

Abstract
Through this position paper we aim at providing a prototype cognitive security service for anomaly detection in Software Defined Networks (SDNs). We equally look at strengthening attack detection capabilities in SDNs, through the addition of predictive analytics capabilities. For this purpose, we build a learning-based anomaly detection service called Learn2Defend, based on functionalities provided by Opendaylight. A potential path to cognition is detailed, by means of a Gaussian Processes driven engine that makes use of traffic characteristics/behavior profiles e.g. smoothness of the frequency of flows traversing a given node. Learn2Defend follows a two-fold approach, with unsupervised learning and prediction mechanisms, all in an on-line dynamic SDN context. The prototype does not target to provide an universally valid predictive analytics framework for security, but rather to offer a tool that supports the integration of cognitive techniques in the SDN security services.
Emilia Tantar, Alexandru-Adrian Tantar, Miroslaw Kantor, Thomas Engel

Feature Creation Using Genetic Algorithms for Zero False Positive Malware Classification

Abstract
This paper presents a Genetic Programming approach to feature extraction in the frame of the perceptron algorithm described in [1]. While feature extraction has the potential of increasing the accuracy of classification, fully exploring the huge space of possible combinations of the initial 45150 features would make the approach infeasible; Genetic Programming provides a proper way of tackling the search for relevant features. In turn, the extracted features are used to train an algorithm - One Side Class Perceptron - designed to minimize the number of false positives; accuracy is increased. In the experiments, the classifier using the extracted features was run on a dataset consisting of 358,144 files. The results show that our overall approach and implementation is fit for real-world malware detection.
Razvan Benchea, Dragos Gavrilut, Henri Luchian

Multi-centroid Cluster Analysis in Malware Research

Abstract
Verdicts assignment is a recurring problem in malware research and it involves deciding if a given program is clean or infected (if it contains malicious logic). Since the general problem of identifying malicious logic is undecidable, a certain amount of manual analysis is required. As the collections of both clean and malicious samples are continuously increasing, we would like to reduce the manual work to a minimum, by using information extracted by automated analysis systems and the similarity between some programs in the collection.
Based on the assumption that similar programs are likely to share the same verdict, we have designed a system that selects a subset from a given collection of program samples for manual analysis. The selected subset should be as small as possible, given the constraint that the other verdicts must be inferable from the manually-assigned ones. The system was tested on a collection of more than 200000 clusters built using the single linkage approach on a collection of over 20 million samples.
Ciprian Oprişa, George Cabău, Gheorghe Sebestyen Pal

Cooperation in Multicriteria Repeated Games

Abstract
In classical Game Theory a rational player’s goal is to maximize its payoff by choosing a strategy that is best response to the opponent’s strategy. This theoretical presumption leads to mutual defection in dilemma games. Despite theoretical prediction in real life situations players tend to cooperate. Our goal is to develop a model that overcomes some of the limitations of classical models. We investigate the emergence of cooperation for the Prisoner’s Dilemma game in a spatial framework with multicriteria payoffs. We propose a multicriteria model where a second criterion, that reflects the identity of a player, is introduced. Numerical experiments show that the second criterion promotes cooperation without any external interactions. The proposed model allows the interaction of different type of players which leads to more realistic outcomes.
Réka Nagy, Mihai Suciu, Dan Dumitrescu

Evolving Game Strategies in a Dynamic Cournot Oligopoly Setting

Abstract
A Cournot oligopoly is used as a benchmark for Nash equilibria tracking and detection in a dynamic setting. Several dynamics that induce different trajectories for the equilibria are considered: random, linear, cosine, and spiral. A new Extremal Optimization based method is tested on the proposed dynamic setting.
Mihai Alexandru Suciu, Rodica-Ioana Lung, Noémi Gaskó, Tudor-Dan Mihoc, Dan Dumitrescu

Efficient Real-Parameter Single Objective Optimizer Using Hierarchical CMA-ES Solvers

Abstract
Monte Carlo Tree Search (MCTS) is a novel machine learning paradigm that is used to find good solutions for complex optimization problems with very large search spaces (like playing GO). We combine MCTS with Covariance Matrix Adaptation Evolution Strategies (CMA-ES) to efficiently optimize real-parameter single objective problems by balancing the exploitation of promising areas with the exploration of new regions of the search space. The novel algorithm is called hierarchical CMA-ES and it is influenced by both machine learning and evolutionary computation research areas. Like in evolutionary computation, we use a population of individuals to explore the commonalities of CMA-ES solvers. These CMA-ES solvers are structured using a MCTS tree like structure. Our experiments compare the performance of hierarchical CMA-ES solvers with two other algorithms: the standard CMA-ES optimizer, and an adaptation of MCTS to solve real-parameter problems. The hierarchical CMA-ES optimizer has the best empirical performance on several benchmark problems.

Multi-point Efficient Global Optimization Using Niching Evolution Strategy

Abstract
The Efficient Global Optimization (EGO) is capable of using limited function evaluation budget to find the global optimum. However, EGO is by design not built for parallelization, which is an important technique to speed up the costly computer codes. Some approaches have been developed to fix this issue. e.g. Constant Liar Strategy. In this article we propose an alternative way to obtain multiple points in the Efficient Global Optimization cycle, where a niching evolution strategy is combined into the classic EGO framework. The new approach is discussed and compared to other methods which aim at the same goal. The proposed approach is also experimented on the selected test functions.
Hao Wang, Thomas Bäck, Michael T. M. Emmerich

Community Detection in NK Landscapes - An Empirical Study of Complexity Transitions in Interactive Networks

Abstract
NK-landscapes have been used as models of gene interaction in biology, but also to understand the influence of interaction between variables on the controllability and optimality organizations as a whole. As such, instead of gene networks they could also be models of economical systems or social networks. In NK-landscapes a fitness function is computed as a sum of N trait values. Each trait depends on the variable directly associated with the trait and K other variables – so-called epistatic variables. With increasing number of interactions the ruggedness of the function increases (local optima). The transition in complexity resembles phase transitions, and for $$K \ge 2$$ the problem to find global optima becomes NP hard. This is not the case if epistatic variables are local, meaning that all interactions are local.
In this research we study NK-landscapes from the perspective of communities and community detection. We will not look at communities of epistatic links but instead focus on links due to correlation between phenotypic traits. To this end we view a single trait as an individual agent which strives to maximize its contributed value to the net value of a community. If the value of a single trait is high whenever that of another trait is low we regard these traits as being conflicting. If high values of one trait coincide with high values of other traits we regard the traits as supporting each other. Finally, if the value of two traits is uncorrelated, we view their relationship as being neutral. We study what happens to the system of traits when the NK-landscape undergoes a critical transition to a more complex model via the increment of k. In particular, we study the effect of locality of interaction on the shape and number of the emerging communities of traits and show that the number of communities reaches its lowest point for medium values of k and not, as might be expected, for a fully connected epistatic matrix (case $$k=N-1$$).
Asep Maulana, André H. Deutz, Erik Schultes, Michael T. M. Emmerich

River Flow Forecasting Using an Improved Artificial Neural Network

Abstract
Artificial neural network (ANN) is a popular data-driven modelling technique that has found application in river flow forecasting over the last two decades. This can be attributed to its ability to assimilate complex and nonlinear input-output relationships inherent in hydrological processes within a river catchment. However despite its prominence, ANNs are still prone to certain problems such as overfitting and over-parameterization, especially when used under limited availability of datasets. These problems often influence the predictive ability of ANN-derived models, with inaccurate and unreliable results as resultant effects. This paper presents a study aimed at finding a solution to the aforementioned problems. Two evolutionary computational techniques namely differential evolution (DE) and genetic programming (GP) were applied to forecast monthly flow in the upper Mkomazi River, South Africa using a 19-year baseline record. Two case studies were considered. Case study 1 involved the use of correlation analysis in selecting input variables during model development while using DE algorithm for optimization purposes. However in the second case study, GP was incorporated as a screening tool for determining the dimensionality of the ANN models, while the learning process was subjected to sensitivity analysis using the DE-algorithm. Results from the two case studies were evaluated comparatively using three standard model evaluation criteria. It was found that results from case study 1 were considerably plagued by the problems of overfitting and over-parameterization, as significant differences were observed in the error estimates and R2 values between the training and validation phases. However, results from case study 2 showed great improvement, as the overfitting and memorization problems were significantly minimized, thus leading to improved forecast accuracy of the ANN models. It was concluded that the conjunctive use of GP and DE can be used to improve the performance of ANNs, especially when availability of datasets is limited.
Josiah Adeyemo, Oluwaseun Oyebode, Derek Stretch

Evolutionary Cost-Sensitive Balancing: A Generic Method for Imbalanced Classification Problems

Abstract
Efficient classification under imbalanced class distributions is currently of interest in data mining research, considering that traditional learning methods often fail to achieve satisfying results in such domains. Also, the correct choice of the metric is essential for the recognition effort. This paper presents a new general methodology for improving the performance of classifiers in imbalanced problems. The method, Evolutionary Cost-Sensitive Balancing (ECSB), is a meta-approach, which can be employed with any error-reduction classifier. It utilizes genetic search and cost-sensitive mechanisms to boost the performance of the base classifier. We present evaluations on benchmark data, comparing the results obtained by ECSB with those of similar recent methods in the literature: SMOTE and EUS. We found that ECSB boosts the performance of traditional classifiers in imbalanced problems, achieving ~45% relative improvement in true positive rate ($$\text {TP}_{\text {rate}}$$) and around 16% in F-measure (FM) on the average; also, it performs better than sampling strategies, with ~35% relative improvement in $$\text {TP}_{\text {rate}}$$ and ~12% in FM over SMOTE (on the average), similar $$text{TP}_{\text {rate}}$$ and geometric mean (GM) values and slightly higher area under de curve (AUC) values than EUS (up to ~9% relative improvement).
Camelia Lemnaru, Rodica Potolea

Balancing the Subtours for Multiple TSP Approached with ACS: Clustering-Based Approaches Vs. MinMax Formulation

Abstract
The algorithms belonging to the Ant Colony Optimization metaheuristic can be naturally applied to shortest path related problems. Although not so intensively studied as TSP, the multiple traveling salesman problem (multiple-TSP) is a straightforward extension of TSP of practical importance, requiring more than one salesman to be used for covering the whole set of cities. This work tackles the MinMax formulation of multiple-TSP, which aims at obtaining balanced subtours for the salesmen. An Ant Colony System that follows the MinMax formulation is proposed. Two other algorithms combining K-Means and Fuzzy C-Means clustering with Ant Colony Systems are described. The experimental analysis investigates the performance of the proposed algorithms from a bi-objective perspective, counting for the total length/cost of a solution and its balancing degree measured as the amplitude of its subtours.
Raluca Necula, Madalina Raschip, Mihaela Breaban

Backmatter

Weitere Informationen