Performance analysis of synchronous and asynchronous distributed genetic algorithms on multiprocessors
Introduction
Genetic Algorithms (GAs) are well-known population-based metaheuristics for finding accurate solutions to hard search problems. A GA explores many different regions of the search space simultaneously by operating on a set of tentative solutions called a population [28]. However, maintaining a population of solutions and applying genetic operators makes GAs time-consuming in reaching an acceptable solution.
Due to the implicitly decentralized nature of GAs, parallelism represents a promising approach for overcoming the drawbacks of sequential algorithms. Parallel execution over a set of physically independent processing units reduces the execution time and can even improve the quality of the solutions by sharing search knowledge between different sub-populations. Most modern computing devices, such as laptops, mobile phones, and workstations, are equipped with multi-core processing units. Using these parallel architectures has several benefits for the design and implementation of parallel metaheuristics. First, a parallel technique usually has a different design from the sequential algorithm, which often yields different search behaviors. Second, running on many processing units to solve one problem can mitigate the expensive computation time of GAs, solving the problem faster while still providing high-quality numerical solutions. Third, the combination of parallel metaheuristics and powerful processing units allows us to solve large-scale, real-world problems involving big data faster and more accurately [42].
A GA can be parallelized by dividing the tasks of the algorithm. The master-slave model is the most common task-division parallel approach. In the master-slave model, several processing units called “slaves” collaborate in computing the fitness of the individuals. With no direct communication between the slaves, another processing unit called the “master” performs the remaining operations of the algorithm. This type of parallel execution is governed by the master process, and the communication cost depends on the number of slaves involved, making this model very hardware-dependent [11]. In this research, we investigate the numerical behavior and speed-up of this model on a multiprocessor shared-memory system.
Another common type of Parallel GA (PGA) is the island model, also known as the distributed model [28]. In this model, the global population is divided into sub-populations (islands) distributed over different processors. Each island runs a basic GA, and after a predefined interval the islands communicate to exchange search knowledge, e.g., individuals or parameters. This communication between the islands is called “migration”. The migration process usually involves selecting some individuals from one sub-population and sending them to other islands. The migrated individuals are integrated into the local sub-population according to predefined acceptance criteria. The decoupled evolution provided by the island model is very beneficial for a large number of real-world problems [10].
The communication type and the acceptance criteria for the migrated individuals are very important in determining the convergence behavior of the algorithm [7]. In the literature, there are two main communication schemes: sync and async. The sync algorithm maintains a synchronization point, meaning that all the islands must move from one communication stage to the next together. The async algorithm has no such point; islands proceed on their own with sparse, non-synchronized exchanges of information. In this work, we focus on analyzing the characteristics and differences of sync/async island distributed GAs (dGAs) on shared-memory multiprocessor architectures.
This work is an extension of initial research published in Ref. [1], where preliminary results of the island distributed GA (dGA) were obtained for one problem (ONEMAX) on eight cores only. ONEMAX is known for its simple landscape, so the results obtained there were not enough to make a fair comparison between the sync and async algorithms. In this research, we extend our preliminary study in several ways. First, we analyze new problems (ONEMAX plus three additional problems). These problems possess different search spaces and different dimension sizes, and we analyze one of them with several dimension sizes. Second, our experiments consider a higher number of cores, from one to 32. Thus, we can show the effect of synchronism on the algorithms using several problems and dimensions over a varying number of cores. Third, we present the numerical results and speed-up of the master-slave model. Finally, we present the results of the sync and async island dGAs with a detailed comparison between the two implementations.
In short, we address three relevant and interesting research questions:
RQ1: What is the performance of different implementations of parallel GAs over shared-memory multiprocessor systems?
RQ2: What is the effect of synchronism on the numerical accuracy and execution time of dGAs over modern multiprocessor systems?
RQ3: What is the speed-up behavior of the different implementations of a parallel GA on multiprocessor shared-memory platforms over different types of functions and dimensions?
For RQ1, we focus on the execution time and numerical efficiency of the different dGA models over varying numbers of cores, using a set of problems with different features and sizes. For RQ2, we analyze the behavior of a dGA with two different communication schemes (sync and async), and we perform a statistical comparison between these two implementations. For RQ3, we present a detailed speed-up analysis of the PGA models under study as the number of cores changes. We investigate the speed-up behavior of the different PGA implementations on functions that vary in complexity and dimensionality, from which a novel conclusion emerges.
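For reference, speed-up on $m$ cores is conventionally defined as the ratio of the (mean) run time on one core to the run time on $m$ cores; the notation below is the standard one, not taken from the paper:

$$S_m = \frac{T_1}{T_m}, \qquad E_m = \frac{S_m}{m},$$

where $T_m$ denotes the wall-clock time using $m$ cores and $E_m$ the parallel efficiency; $S_m = m$ (i.e., $E_m = 1$) corresponds to linear speed-up.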
The scientific contribution of this paper lies in the combined study of the different parallel models for GAs from both a run-time and a numerical point of view. Contrary to most previous studies, we run our algorithms on a multiprocessor system, which may or may not erase the numerical and run-time differences between the models. In addition, we present a detailed statistical comparison between the sync and async island dGAs. We hope to increase the body of knowledge in this domain, allowing researchers and end users to build new, efficient search techniques that fit their goals.
The rest of this paper is organized as follows. Section 2 discusses related works, presenting the recent state of the art most relevant to this study. Section 3 overviews the fundamentals of the parallel models used in our experiments, with detailed explanations of the differences between the sync/async models, the migration operation, and the master-slave design. Section 4 provides the experimental settings, the benchmark problems, and the parameters used in the studies. Section 5 presents and discusses our results in detail. Section 6 summarizes our findings and conclusions.
Section snippets
Related works
Evolutionary Computation (EC) has gained much attention from researchers in the past decades, due to its importance in solving real-life numerical optimization problems. Del Ser et al. [15] presented a recent survey on numerical optimization using bio-inspired and evolutionary computing methods. The significance of their research lies in reviewing the recent history of bio-inspired computation and EC with a prospective look at the future of these methods. The design of the
Characterizing parallel GA models
In this section, we overview the PGA models investigated in our experiments. We present the basics of the master-slave and island models, along with the definition of the migration operator and the differences between the sync and async implementations.
Experimental settings
In this section, we present the system specifications and parameters used in our experiments. First, we present the benchmark problems used to analyze the parallel models. Second, we describe the parameters used in both models and the system specifications.
Experimental analysis
In this section, we explain our experiments and their analysis. Our algorithms were implemented in C++/MPI; we made this choice to promote the universal utility of our findings. C++ compilers are available for virtually every platform and operating environment, and in fact, C++ helped us break the problem into a collection of data structures and operations that can be efficiently parallelized [22]. Our choice of MPI as a message-passing library is because it allows a natural and easy
Conclusions
In this work, we have presented a comprehensive study of the performance and the effect of synchronism on parallel GAs over multiprocessor systems, covering three vital parallel models. Our focus was on numerical efficiency and time consumption, providing a common study of the behavior of parallel GAs over homogeneous shared-memory computing systems.
Our computational experiments vary in the type of parallel GA and in the scalability in problem size, demonstrating that the different PGAs are
Acknowledgments
This research has been partially funded by the Spanish MINECO and FEDER projects TIN2016-81766-REDT (CI-RTI), TIN2017-88213-R (6city), and by Andalucía Tech, Universidad de Málaga. The first author was supported by the Egyptian Ministry of Higher Education under a grant from the missions sector, Egypt.
References (47)
Parallel evolutionary algorithms can achieve super-linear performance, Inf. Process. Lett. (2002)
Parallel heterogeneous genetic algorithms for continuous optimization, Parallel Comput. (2004)
Analyzing synchronous and asynchronous parallel distributed genetic algorithms, Future Gener. Comput. Syst. (2001)
Efficient parallel genetic algorithms: theory and practice, Comput. Methods Appl. Mech. Eng. (2000)
A parallel ant colony algorithm on massively parallel processors and its convergence analysis for the travelling salesman problem, Inf. Sci. (2012)
Optimizing shared-memory hyperheuristics on top of parameterized metaheuristics, Procedia Comput. Sci. (2014)
Bio-inspired computation: where we stand and what's next, Swarm Evol. Comput. (2019)
Distributed evolutionary algorithms and their models: a survey of the state-of-the-art, Appl. Soft Comput. (2015)
A comparison study of harmony search and genetic algorithm for the max-cut problem, Swarm Evol. Comput. (2019)
A scalable parallel genetic algorithm for the Generalized Assignment Problem, Parallel Comput. (2015)