
2020 | Book

Deep Neural Evolution

Deep Learning with Evolutionary Computation

Editors: Hitoshi Iba, Nasimul Noman

Publisher: Springer Singapore

Book Series: Natural Computing Series


About this book

This book delivers the state of the art in deep learning (DL) methods hybridized with evolutionary computation (EC). Over the last decade, DL has dramatically transformed many domains: computer vision, speech recognition, healthcare, and automatic game playing, to mention only a few. All DL models, whatever their architectures and algorithms, use multiple processing layers to extract a hierarchy of abstractions from data. Their remarkable successes notwithstanding, these powerful models face many challenges, and this book presents the collaborative efforts of EC researchers to solve some of the problems in DL.

EC comprises optimization techniques that are useful when a problem is complex or poorly understood, or when insufficient information about the problem domain is available. This family of algorithms has proven effective on problems with challenging characteristics such as non-convexity, non-linearity, noise, and irregularity, which hamper most classic optimization schemes. Furthermore, EC has been extensively and successfully applied in artificial neural network (ANN) research, from parameter estimation to structure optimization. Consequently, EC researchers are enthusiastic about applying their arsenal to the design and optimization of deep neural networks (DNNs).

This book brings together recent progress in DL research, with a particular focus on three sub-domains that integrate EC with DL: (1) EC for hyper-parameter optimization in DNNs; (2) EC for DNN architecture design; and (3) deep neuroevolution. The book also presents interesting applications of DL with EC to real-world problems, e.g., malware classification and object detection, and covers recent applications of EC in DL, e.g., generative adversarial network (GAN) training and adversarial attacks. The book aims to promote and facilitate research in DL with EC, both in theory and in practice.

Table of Contents

Frontmatter

Preliminaries

Frontmatter
Chapter 1. Evolutionary Computation and Meta-heuristics
Abstract
This chapter presents several methods of evolutionary computation and meta-heuristics. Evolutionary computation is a computational technique that mimics the evolutionary mechanisms of life to select, deform, and combine data structures. Because of its high versatility, its applications are found in various fields. The meta-heuristics described in this chapter are representatives of swarm intelligence, such as particle swarm optimization (PSO), artificial bee colony optimization (ABC), ant colony optimization (ACO), firefly algorithms, and cuckoo search. A benefit of these methods is that they perform global search as well as local search. Local minima or saddle points can trap gradient methods, such as steepest descent, in locally optimal solutions. By contrast, the methods described in this chapter can escape from such local solutions by means of various kinds of operations. Methods of evolutionary computation and meta-heuristics are used in combination with deep learning to establish a framework of deep neural evolution, which is described in later chapters.
Hitoshi Iba
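To make the contrast with gradient descent concrete, here is a minimal global-best PSO sketch (an illustration under assumed settings, not code from the chapter). It minimizes the Rastrigin function, a standard multimodal benchmark whose many local minima trap purely local methods:

```python
# Minimal global-best PSO sketch: particles share the best position found
# so far, which lets the swarm escape local minima that would trap a pure
# gradient descent.
import numpy as np

def rastrigin(x):
    # Standard multimodal test function; global minimum 0 at the origin.
    return 10 * x.size + np.sum(x**2 - 10 * np.cos(2 * np.pi * x))

def pso(dim=2, n_particles=30, iters=200, w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = np.random.default_rng(seed)
    pos = rng.uniform(-5.12, 5.12, (n_particles, dim))
    vel = np.zeros((n_particles, dim))
    pbest = pos.copy()
    pbest_val = np.array([rastrigin(p) for p in pos])
    gbest = pbest[np.argmin(pbest_val)].copy()
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, dim))
        # Velocity update: inertia + pull toward personal and global bests.
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = pos + vel
        vals = np.array([rastrigin(p) for p in pos])
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
        gbest = pbest[np.argmin(pbest_val)].copy()
    return gbest, pbest_val.min()

print(pso())  # converges near the global optimum at the origin
```

The inertia weight w and the attraction coefficients c1, c2 trade off exploration against exploitation; the shared global best is what lets particles jump out of basins that a single gradient trajectory could not leave.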
Chapter 2. A Shallow Introduction to Deep Neural Networks
Abstract
Deep learning is one of the two branches of artificial intelligence that merged to give rise to the field of deep neural evolution; the other is evolutionary computation, introduced in the previous chapter. Deep learning, the most active research area in machine learning, is a powerful family of computational models that learns and processes data using multiple levels of abstraction. Over recent years, deep learning methods have shown remarkable performance in a diverse range of applications. This chapter familiarizes readers with the major classes of deep neural networks in frequent use, namely the CNN (Convolutional Neural Network), RNN (Recurrent Neural Network), DBN (Deep Belief Network), deep autoencoder, GAN (Generative Adversarial Network), and deep recursive network. For each class of network, we introduce the architecture, types of layers, processing units, learning algorithms, and other relevant information. This chapter aims to provide readers with the background in deep learning necessary for understanding the contemporary research in deep neural evolution presented in the subsequent chapters of the book.
Nasimul Noman
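As a concrete illustration of the "multiple levels of abstraction" idea, the following is a minimal CNN sketch in PyTorch (an assumed, illustrative architecture, not one from the chapter), where each convolution-pooling stage builds features on top of the previous one:

```python
# Illustrative only: a tiny CNN where stacked layers build a hierarchy of
# abstractions (low-level edges -> higher-level motifs -> class scores).
import torch
import torch.nn as nn

class TinyCNN(nn.Module):
    def __init__(self, n_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),   # low-level features
            nn.ReLU(),
            nn.MaxPool2d(2),                              # 28x28 -> 14x14
            nn.Conv2d(16, 32, kernel_size=3, padding=1),  # higher-level features
            nn.ReLU(),
            nn.MaxPool2d(2),                              # 14x14 -> 7x7
        )
        self.classifier = nn.Linear(32 * 7 * 7, n_classes)

    def forward(self, x):
        h = self.features(x)
        return self.classifier(h.flatten(1))

logits = TinyCNN()(torch.randn(4, 1, 28, 28))  # batch of 4 MNIST-sized images
print(logits.shape)  # torch.Size([4, 10])
```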

Hyper-Parameter Optimization

Frontmatter
Chapter 3. On the Assessment of Nature-Inspired Meta-Heuristic Optimization Techniques to Fine-Tune Deep Belief Networks
Abstract
Machine learning techniques are capable of talking, interpreting, creating, and even reasoning about virtually any subject, and their learning power has grown substantially over recent years thanks to advances in hardware architecture. Nevertheless, most of these models still struggle in practical usage because they require a proper selection of hyper-parameters, which are often chosen empirically. This requirement is even more pronounced for deep learning models, which commonly have a larger number of hyper-parameters. A collection of nature-inspired optimization techniques, known as meta-heuristics, offers a straightforward way to tackle such problems, since these methods do not employ derivatives and thus carry a lighter computational burden. This work therefore compares several meta-heuristic optimization techniques in the context of Deep Belief Network hyper-parameter fine-tuning. An experimental setup over three public datasets in the task of binary image reconstruction demonstrated consistent results, posing meta-heuristic techniques as a suitable alternative for the problem.
Leandro Aparecido Passos, Gustavo Henrique de Rosa, Douglas Rodrigues, Mateus Roder, João Paulo Papa
Chapter 4. Automated Development of DNN Based Spoken Language Systems Using Evolutionary Algorithms
Abstract
Spoken language processing is one of the research areas that has contributed significantly to the recent revival of neural network research. Speech recognition, for example, has been at the forefront of deep learning research, spawning a variety of novel models. Their dramatic performance improvements over previous state-of-the-art implementations have led to spoken language systems being deployed in a wide range of applications today. However, these systems require intensive tuning of their network designs and training setups in order to achieve maximal performance, and the laborious effort required of human experts is becoming a prominent obstacle in system development. In this chapter, we first explain the basic concepts and the neural network-based implementations of spoken language processing systems, describing several types of neural network models. We then introduce our effort to automate the tuning of system meta-parameters using evolutionary algorithms.
Takahiro Shinozaki, Shinji Watanabe, Kevin Duh
Chapter 5. Search Heuristics for the Optimization of DBN for Time Series Forecasting
Abstract
A deep belief net (DBN) with multi-stacked restricted Boltzmann machines (RBMs) was proposed by Hinton and Salakhutdinov in 2006 for reducing the dimensionality of data. Compared to conventional methods such as principal component analysis (PCA), the superior performance of the DBN attracted great attention from pattern recognition researchers and even ushered in a new era of artificial intelligence (AI) under the keyword "deep learning" (DL). Deep neural networks (DNNs) such as DBNs, deep auto-encoders (DAEs), and convolutional neural networks (CNNs) have been successfully applied to dimensionality reduction, image processing, pattern recognition, etc. Nevertheless, there are more AI disciplines to which they could be applied, such as computational cognition, behavior decision, and forecasting. Furthermore, the architectures of conventional deep models are usually handcrafted, i.e., the optimization of DNN structure remains an open problem. In this chapter, we introduce how DBNs were first adopted for time series forecasting systems in our original studies, and we discuss two kinds of heuristic optimization methods for structuring DBNs: particle swarm optimization (PSO), a well-known method in swarm intelligence, and random search (RS), a simpler yet useful algorithm for high-dimensional hyper-parameter exploration.
Takashi Kuremoto, Takaomi Hirata, Masanao Obayashi, Kunikazu Kobayashi, Shingo Mabu
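A hedged sketch of the random-search side of this idea follows. Here `train_and_score` is a hypothetical stand-in for training a DBN forecaster and returning its validation error, and the hyper-parameter ranges are illustrative assumptions:

```python
# Minimal random search over DBN-style hyper-parameters (illustrative).
import random

def train_and_score(hidden_units, learning_rate, cd_steps):
    # Placeholder: a real version would train the DBN with these settings
    # and return its validation error.
    return (abs(hidden_units[0] - 64) / 64
            + abs(learning_rate - 0.01)
            + 0.01 * (cd_steps - 1))

def random_search(trials=50, seed=1):
    rng = random.Random(seed)
    best = (float("inf"), None)
    for _ in range(trials):
        cfg = {
            # 1-3 stacked RBMs, each with a sampled hidden-layer width.
            "hidden_units": [rng.choice([16, 32, 64, 128])
                             for _ in range(rng.randint(1, 3))],
            "learning_rate": 10 ** rng.uniform(-4, -1),  # log-uniform
            "cd_steps": rng.randint(1, 5),  # contrastive divergence steps
        }
        err = train_and_score(**cfg)
        if err < best[0]:
            best = (err, cfg)
    return best

print(random_search())
```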

Structure Optimization

Frontmatter
Chapter 6. Particle Swarm Optimization for Evolving Deep Convolutional Neural Networks for Image Classification: Single- and Multi-Objective Approaches
Abstract
Convolutional neural networks (CNNs) are among the most effective deep learning methods for image classification, but the design of CNN architectures is mainly done manually, which is very time-consuming and requires expertise in both the problem domain and CNNs. In this chapter, we describe an approach that uses particle swarm optimization (PSO) to automatically search for and learn optimal CNN architectures. We provide an encoding strategy inspired by computer networks to encode CNN layers, allowing the proposed method to learn variable-length CNN architectures while focusing on the single objective of maximizing classification accuracy. A surrogate dataset is used to speed up the evolutionary learning process. We also present a multi-objective PSO approach to evolving CNN architectures. The PSO-based algorithms are examined and compared with state-of-the-art algorithms on a number of widely used image classification benchmark datasets. The experimental results show that the proposed algorithms are strong competitors to the state of the art in terms of classification error. A major advantage of the proposed methods is the automated design of CNN architectures without human intervention, together with the good performance of the learned CNNs.
Bin Wang, Bing Xue, Mengjie Zhang
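To illustrate variable-length architecture encoding, here is a simplified sketch (the chapter's actual scheme is inspired by IP addresses in computer networks; this list-of-layer-specs form is an assumption for illustration). Each particle is a sequence of layer specifications whose length can differ across the swarm, and fitness would be classification accuracy on a surrogate dataset:

```python
# Illustrative variable-length CNN encoding for a PSO-style swarm.
import random

LAYER_CHOICES = ["conv", "pool", "fc"]

def random_layer(rng):
    kind = rng.choice(LAYER_CHOICES)
    if kind == "conv":
        return {"type": "conv", "filters": rng.choice([16, 32, 64]),
                "kernel": rng.choice([3, 5])}
    if kind == "pool":
        return {"type": "pool", "kernel": 2}
    return {"type": "fc", "units": rng.choice([64, 128, 256])}

def random_architecture(rng, min_len=3, max_len=8):
    # Variable-length particle: each position encodes one layer.
    return [random_layer(rng) for _ in range(rng.randint(min_len, max_len))]

rng = random.Random(0)
swarm = [random_architecture(rng) for _ in range(5)]
for particle in swarm:
    print([layer["type"] for layer in particle])
```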
Chapter 7. Designing Convolutional Neural Network Architectures Using Cartesian Genetic Programming
Abstract
Convolutional neural networks (CNNs) are among the deep learning models making remarkable progress in a variety of computer vision tasks, such as image recognition, restoration, and generation. The network architecture of a CNN must be designed manually in advance, and researchers and practitioners have developed various network structures to improve performance. Although the network architecture considerably affects performance, its selection and design remain tedious trial-and-error processes, because the best architecture depends on the target task and the amount of data. Evolutionary algorithms have been successfully applied to automate the design process of CNN architectures. This chapter explains how evolutionary algorithms can support the automatic design of CNN architectures. We introduce a method based on Cartesian genetic programming (CGP), a form of genetic programming that searches over network-structured programs. We represent the CNN architecture as a combination of pre-defined modules and search for high-performing architectures with CGP. The method attempts to find better architectures by repeatedly generating, training, and evaluating architectures. The effectiveness of the CGP-based CNN architecture search is demonstrated on two types of computer vision tasks: image classification and image restoration. The experimental results for image classification show that the method can find well-performing CNN architectures. For image restoration, we show that the method can find a simple yet high-performing architecture for a convolutional autoencoder, a type of CNN.
Masanori Suganuma, Shinichi Shirakawa, Tomoharu Nagao
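The following is a minimal sketch of the CGP idea (a simplified assumption: each node takes a single input and one module from a small pre-defined set). Because only nodes reachable from the output are "active", a fixed-length genotype decodes to variable-size architectures:

```python
# Minimal CGP decoding sketch: genes pick a pre-defined module and one
# input node; only nodes reachable from the output gene are active.
MODULES = ["conv3x3", "conv5x5", "pool", "sum"]

def decode(genotype, n_inputs=1):
    # genotype: list of (module_index, input_node_index) pairs plus an
    # integer output gene pointing at the selected final node.
    *nodes, out = genotype
    active, stack = set(), [out]
    while stack:
        i = stack.pop()
        if i < n_inputs or i in active:
            continue  # input nodes and visited nodes need no expansion
        active.add(i)
        module, src = nodes[i - n_inputs]
        stack.append(src)
    return [(i, MODULES[nodes[i - n_inputs][0]]) for i in sorted(active)]

# Node 0 is the input; genes reference earlier nodes only (feed-forward).
genotype = [(0, 0), (2, 1), (1, 0), 2]  # node 3 (conv5x5) is inactive
print(decode(genotype))  # [(1, 'conv3x3'), (2, 'pool')]
```

Inactive genes are carried along neutrally during search, a characteristic feature of CGP.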
Chapter 8. Fast Evolution of CNN Architecture for Image Classification
Abstract
Performance improvements of Convolutional Neural Networks (CNNs) in image classification and other applications have become a yearly event. Generally, two factors contribute to this enviable success: stacking more layers, resulting in gigantic networks, and using more sophisticated network architectures, e.g., modules, skip connections, etc. Since these state-of-the-art CNN models are designed manually, finding the most optimized model is not easy. In recent years, evolutionary and other nature-inspired algorithms have become competitive with humans in designing CNNs and other deep networks automatically. However, one challenge for these methods is their very high computational cost. In this chapter, we investigate whether we can find an optimized CNN model within the classic CNN architecture, and whether we can do so automatically at a lower cost. Towards this aim, we present a genetic algorithm for optimizing the number of blocks and layers, along with some other network hyperparameters, in the classic CNN architecture. In experiments with the CIFAR10, CIFAR100, and SVHN datasets, the proposed GA evolved CNN models that are competitive with the best available models.
Ali Bakhshi, Stephan Chalup, Nasimul Noman
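A hedged sketch of the kind of GA loop described here: a short integer genome fixes the block/layer counts and a few hyperparameters of a classic CNN, with one-point crossover and per-gene mutation. `evaluate` is a hypothetical stub standing in for training the decoded CNN and returning its validation accuracy:

```python
# Illustrative GA over a fixed-length integer genome of CNN hyperparameters.
import random

GENE_SPACE = {
    "n_blocks": [2, 3, 4],
    "layers_per_block": [1, 2, 3],
    "base_filters": [16, 32, 64],
    "dropout": [0.0, 0.25, 0.5],
}
KEYS = list(GENE_SPACE)

def evaluate(genome):
    # Placeholder fitness standing in for validation accuracy of the CNN
    # built from this genome (each gene is an index into GENE_SPACE).
    return -sum(abs(g - len(GENE_SPACE[k]) // 2) for k, g in zip(KEYS, genome))

def evolve(pop_size=10, generations=20, p_mut=0.2, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randrange(len(GENE_SPACE[k])) for k in KEYS]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=evaluate, reverse=True)
        parents = pop[: pop_size // 2]          # truncation selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(KEYS))   # one-point crossover
            child = a[:cut] + b[cut:]
            for i in range(len(child)):         # per-gene mutation
                if rng.random() < p_mut:
                    child[i] = rng.randrange(len(GENE_SPACE[KEYS[i]]))
            children.append(child)
        pop = parents + children
    best = max(pop, key=evaluate)
    return {k: GENE_SPACE[k][g] for k, g in zip(KEYS, best)}

print(evolve())
```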

Deep Neuroevolution

Frontmatter
Chapter 9. Discovering Gated Recurrent Neural Network Architectures
Abstract
Gated recurrent networks, such as those composed of Long Short-Term Memory (LSTM) nodes, have recently been used to improve the state of the art in many sequential processing tasks such as speech recognition and machine translation. However, the basic structure of the LSTM node is essentially the same as when it was first conceived 25 years ago. Recently, evolutionary and reinforcement-learning mechanisms have been employed to create new variations of this structure. This chapter proposes a new method, evolution of a tree-based encoding of the gated memory nodes, and shows that it makes it possible to explore new variations more effectively than other methods. The method discovers nodes with multiple recurrent paths and multiple memory cells, which lead to significant improvements on the standard language-modeling benchmark task. The chapter also shows how the search process can be sped up by training an LSTM network to estimate the performance of candidate structures, and by encouraging the exploration of novel solutions. Thus, evolutionary design of complex neural network structures promises to improve the performance of deep learning architectures beyond human ability to do so.
Aditya Rawal, Jason Liang, Risto Miikkulainen
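To give a flavor of tree-based encoding of memory nodes (an illustrative sketch, not the chapter's exact grammar or primitives): a candidate cell-update equation is an expression tree over a few primitives, and the search mutates random subtrees:

```python
# Illustrative genetic-programming sketch: a memory-cell update is an
# expression tree over inputs {x, h_prev, c_prev}; search mutates subtrees.
import math, random

TERMINALS = ["x", "h_prev", "c_prev"]
UNARY = {"tanh": math.tanh, "sigmoid": lambda v: 1 / (1 + math.exp(-v))}
BINARY = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}

def random_tree(rng, depth=3):
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    if rng.random() < 0.5:
        return (rng.choice(list(UNARY)), random_tree(rng, depth - 1))
    op = rng.choice(list(BINARY))
    return (op, random_tree(rng, depth - 1), random_tree(rng, depth - 1))

def evaluate(tree, env):
    if isinstance(tree, str):
        return env[tree]
    op, *args = tree
    fn = UNARY.get(op) or BINARY[op]
    return fn(*(evaluate(a, env) for a in args))

def mutate(tree, rng, p=0.2):
    if rng.random() < p:
        return random_tree(rng, depth=2)   # replace a random subtree
    if isinstance(tree, str):
        return tree
    op, *args = tree
    return (op, *(mutate(a, rng, p) for a in args))

rng = random.Random(3)
cell = random_tree(rng)
print(cell, "->", evaluate(cell, {"x": 0.5, "h_prev": -0.2, "c_prev": 0.1}))
print(mutate(cell, rng))
```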
Chapter 10. Investigating Deep Recurrent Connections and Recurrent Memory Cells Using Neuro-Evolution
Abstract
Neural architecture search poses one of the most difficult problems in statistical learning, given the incredibly vast architectural search space. The problem is further compounded for recurrent neural networks (RNNs), where every node in an architecture can be connected to any other node via weighted recurrent connections that carry information from earlier passes through the network. Most modern-day RNNs focus on recurrent connections that pass information from the immediately preceding pass, utilizing gated constructs known as memory cells; however, connections farther back in time, or deep recurrent connections, are also possible. A novel neuro-evolutionary metaheuristic called EXAMM is utilized to conduct extensive experiments evolving RNNs consisting of a suite of memory cells and simple neurons, with and without deep recurrent connections. These experiments evolved and trained 10.56 million RNNs, with results showing that networks with deep recurrent connections perform significantly better than those without; in some cases the best evolved RNNs consist of only simple neurons and deep recurrent connections. These results strongly suggest that complex recurrent connectivity patterns in RNNs deserve further study, and they showcase the strong potential of neuro-evolutionary metaheuristic algorithms as tools for understanding and training effective RNNs.
Travis Desell, AbdElRahman A. ElSaid, Alexander G. Ororbia
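A minimal sketch of what a "deep" recurrent connection means (the weights, dimensions, and update rule here are illustrative assumptions): the hidden state is updated not only from the input and the state at t-1 but also from the state k steps back:

```python
# Illustrative RNN with a deep recurrent connection reaching k steps back.
import numpy as np

def run_rnn(xs, hidden=8, k=3, seed=0):
    rng = np.random.default_rng(seed)
    W = rng.normal(0, 0.3, (hidden, xs.shape[1]))  # input -> hidden
    U = rng.normal(0, 0.3, (hidden, hidden))       # h_{t-1} -> hidden
    V = rng.normal(0, 0.3, (hidden, hidden))       # h_{t-k} -> hidden (deep)
    hs = [np.zeros(hidden)]
    for x in xs:
        # Zero state when fewer than k past states exist yet.
        h_deep = hs[-k] if len(hs) >= k else np.zeros(hidden)
        hs.append(np.tanh(W @ x + U @ hs[-1] + V @ h_deep))
    return np.stack(hs[1:])

states = run_rnn(np.random.default_rng(1).normal(size=(10, 4)))
print(states.shape)  # (10, 8)
```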
Chapter 11. Neuroevolution of Generative Adversarial Networks
Abstract
The Generative Adversarial Network (GAN) is an adversarial model that has gained prominence in recent years, displaying impressive results on generative tasks. A GAN combines two neural networks, a discriminator and a generator, trained in an adversarial way: the discriminator learns to distinguish between real samples of an input dataset and fake samples, while the generator creates fake samples aiming to fool the discriminator. Training progresses iteratively, leading to the production of realistic samples that can mislead the discriminator. Despite the impressive results, GANs are hard to train, and a trial-and-error approach is generally needed to obtain consistent results. Since the original GAN proposal, research has been conducted not only to improve the quality of the generated results but also to overcome the training issues and provide a robust training process. Even so, stability issues remain present in GAN training. Neuroevolution, the application of evolutionary algorithms to neural networks, was recently proposed as a strategy to train and evolve GANs. These proposals use evolutionary pressure to guide GAN training towards robust models, improving the quality of the results and providing more stable training. Furthermore, they can automatically provide useful architectural definitions, avoiding the manual discovery of suitable GAN models. We survey the current advances in the combined use of evolutionary algorithms and GANs, presenting the state-of-the-art proposals in this context. Finally, we discuss perspectives and possible directions for further advances.
Victor Costa, Nuno Lourenço, João Correia, Penousal Machado
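The adversarial game described above can be made concrete with a minimal GAN training loop in PyTorch on toy 1-D data (an illustrative sketch, unrelated to the evolutionary methods surveyed in the chapter): the discriminator is trained to separate real from generated samples, and the generator is trained to fool it:

```python
# Minimal GAN loop: D learns real-vs-fake, G learns to fool D.
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))
D = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(2000):
    real = torch.randn(64, 1) * 0.5 + 2.0   # target distribution: N(2, 0.5)
    fake = G(torch.randn(64, 8))            # generator samples from noise

    # Discriminator step: push real -> 1, fake -> 0 (fake detached so the
    # generator is not updated here).
    d_loss = (bce(D(real), torch.ones(64, 1))
              + bce(D(fake.detach()), torch.zeros(64, 1)))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator step: make the discriminator label fakes as real.
    g_loss = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()

with torch.no_grad():
    samples = G(torch.randn(1000, 8))
print(samples.mean().item(), samples.std().item())  # approaches 2.0, 0.5
```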

Applications and Others

Frontmatter
Chapter 12. Evolving Deep Neural Networks for X-ray Based Detection of Dangerous Objects
Abstract
In recent years, neural networks with convolutional layers, referred to as convolutional neural networks (CNNs), have been widely recognized as effective in the field of image recognition. In the majority of previous research, network structures were designed by hand based on experience, and there is no established theory explaining how to build networks with higher learning ability. In this chapter, we propose a framework for automatically obtaining network structures with the highest learning ability for image recognition, through a combination of several core technologies. We employ evolutionary computation (EC) for the automatic extraction and synthesis of network structures. Additionally, we attempt an effective search of a larger parameter space by gradually increasing the number of training epochs during the generational change process. To show the effectiveness of our approach, we apply the proposed method to the task of detecting dangerous objects in an X-ray image dataset. Compared with previous results, we achieved an improvement in the mAP value. We also found several bypass connections in the structures that were actually obtained.
Ryotaro Tsukada, Lekang Zou, Hitoshi Iba
Chapter 13. Evolving the Architecture and Hyperparameters of DNNs for Malware Detection
Abstract
Deep Learning models have consistently provided excellent results in highly complex domains. Their deep, layered architectures allow them to face problems where classical machine learning approaches fail or simply cannot provide good enough solutions. However, these deep models usually involve a complex topology and hyperparameters that have to be carefully defined, typically via grid search, in order to reach the most profitable configuration. Neuroevolution is an ideal instrument for performing an evolutionary search for this configuration. Through the evolution of the hyperparameters (activation functions, initialisation methods, and optimiser) and the topology of the network (the number and type of layers and the number of units), it is possible to explore the space of solutions deeply in order to find the most appropriate architecture. Among the multiple applications of this approach, this chapter focuses on the Android malware detection problem. This domain, which has generated a large amount of research over the last decade, presents characteristics that make Neuroevolution a logical approach for determining the architecture that best discerns between malicious and benign applications. In this research, we leverage a modification of EvoDeep, a framework for the evolution of valid sequences of deep layers, to implement this evolutionary search using a genetic algorithm. To assess the approach, we use the OmniDroid dataset, a large set of static and dynamic features extracted from 22,000 malicious and benign Android applications. The results show that a Neuroevolution-based strategy yields Deep Learning models with high accuracy rates, greater than those obtained with classical machine learning approaches.
Alejandro Martín, David Camacho
Chapter 14. Data Dieting in GAN Training
Abstract
We investigate training Generative Adversarial Networks, GANs, with less data. Subsets of the training dataset can express empirical sample diversity while reducing training resource requirements, e.g., time and memory. We ask how much data reduction impacts generator performance and gauge the additive value of generator ensembles. In addition to considering stand-alone GAN training and ensembles of generator models, we also consider reduced data training on an evolutionary GAN training framework named Redux-Lipizzaner. Redux-Lipizzaner makes GAN training more robust and accurate by exploiting overlapping neighborhood-based training on a spatial 2D grid. We conduct empirical experiments on Redux-Lipizzaner using the MNIST and CelebA data sets.
Jamal Toutouh, Erik Hemberg, Una-May O’Reilly
Chapter 15. One-Pixel Attack: Understanding and Improving Deep Neural Networks with Evolutionary Computation
Abstract
Recently, the one-pixel attack showed that deep neural networks (DNNs) can be made to misclassify by changing only one pixel of the input. Beyond exposing a vulnerability, by demonstrating how easily a change of class can be induced, it revealed that DNNs are not learning the expected high-level features but rather less robust ones. In this chapter, recent findings further confirming these claims are presented, together with an overview of current attacks and defenses. Moreover, we show the promise of evolutionary computation both as a way to investigate the robustness of DNNs and as a way to improve their robustness through hybrid systems, the evolution of architectures, and other approaches.
Danilo Vasconcellos Vargas
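Mechanically, the attack can be sketched as a tiny derivative-free search (the original attack used differential evolution; here SciPy's implementation is assumed, and `predict_proba` is a hypothetical stand-in for the target DNN): the optimizer searches over a pixel coordinate and value that minimize the model's confidence in the true class:

```python
# Illustrative one-pixel attack via differential evolution.
import numpy as np
from scipy.optimize import differential_evolution

def predict_proba(image):
    # Placeholder classifier: true-class confidence tied to the image mean;
    # a real attack would query the target DNN here.
    return float(np.clip(image.mean(), 0.0, 1.0))

def one_pixel_attack(image, size=28):
    def objective(z):
        x, y, v = int(z[0]), int(z[1]), z[2]
        perturbed = image.copy()
        perturbed[x, y] = v                 # change exactly one pixel
        return predict_proba(perturbed)     # minimize true-class confidence

    bounds = [(0, size - 1), (0, size - 1), (0.0, 1.0)]  # (row, col, value)
    result = differential_evolution(objective, bounds,
                                    maxiter=30, popsize=20, seed=0)
    return result.x, result.fun

image = np.full((28, 28), 0.9)
pixel, confidence = one_pixel_attack(image)
print(pixel, confidence)
```

Because the search is black-box, it needs only the model's output probabilities, not gradients, which is exactly what makes evolutionary methods suitable for probing DNN robustness.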
Backmatter
Metadata
Title
Deep Neural Evolution
Editors
Hitoshi Iba
Nasimul Noman
Copyright Year
2020
Publisher
Springer Singapore
Electronic ISBN
978-981-15-3685-4
Print ISBN
978-981-15-3684-7
DOI
https://doi.org/10.1007/978-981-15-3685-4
