Abstract
We give sufficient conditions which the mutation rate must satisfy for the convergence of the genetic algorithm when that rate is allowed to change throughout iterations. The empirical performance of the algorithm with regards to changes in the mutation parameter is explored via test functions, ARIMA model selection and maximum likelihood estimation illustrating the advantages of letting the mutation rate decrease from rather unusual high values to the commonly used low ones.
Similar content being viewed by others
References
Allufi-Pentini F, Parisi V, Zirilli F (1985) Global optimization and stochastic differential equations. J Optim Theory Appl 47:1–16
Bolfarine H, Martínez-Flórez G, Salinas HS (2013) Bimodal symmetric-asymmetric power-normal families. Commun Stat Theory. doi:10.1080/03610926.2013.765475
Box GEP, Jenkins G, Reinsel GC (1994) Time series analysis: forecasting and control, 3rd edn. Prentice-Hall, New Jersey
Campos VEM, Pereira AGC (2012) Modeling the genetic algorithm by a nonhomogeneous Markov chain: weak and strong ergodicity. Theory Probab Appl 57:85–192
Chen T, Tang K, Chen G, Yao X (2012) A large population size can be unhelpful in evolutionary algorithms. Theor Comput Sci 436:54–70
Cox C (2008) The generalized F distribution: an umbrella for parametric survival analysis. Stat Med 27:4301–4312
Douc R, Moulines E, Rosenthal JS (2004) Quantitative bounds on convergence of time-inhomogeneous Markov chains. Ann Appl Probab 14:1605–2201
Eiben AE, Hinterding R, Michalewicz Z (1999) Parameter control in evolutionary algorithms. IEEE Trans Evol Comput 3:124–141
Fogarty TC (1989) Varying the probability of mutation in the genetic algorithm. In: Proceedings of the 3rd ICGA. Morgan Kaufmann Pub. Inc., San Francisco, USA, pp 104–109
Givens GH, Hoeting JA (2012) Computational statistics. Wiley, New York
Goldberg DE (1989) Genetic algorithms in search, optimization and machine learning. Addison Wesley, New York
Hallam J, Akman O, Akman F (2010) Genetic algorithms with shrinking population size. Comput Stat 25:691–705
He J, Yu X (2001) Conditions for the convergence of evolutionary algorithms. J Syst Archit 47:601–612
He J, Kang L (1999) On the convergence rates of genetic algorithms. Theor Comput Sci 229:23–39
Hyndman RJ (2013) Time series data library, http://data.is/TSDLdemo. Accessed on January, 2013
Hokstad P (1983) A method for diagnostic checking of time series models. J Time Ser Anal 4:177–183
Holland JH (1975) Adaptation in natural and artificial systems. The University of Michigan Press, Ann Arbor
Isaacson RL, Madsen RW (1976) Markov chains: theory and applications. Wiley, New York
Michalewicz Z (1992) Genetic algorithms + data structures = evolution programs. Springer, Berlin
Michalewicz Z, Fogel DB (2004) How to solve it: modern heuristics, 2nd edn. Springer, Berlin
Morris MI (1977) Forecasting the sunspot cycle. J R Stat Soc B 140:437–468
Noufaily A, Jones MC (2013) On maximization of the likelihood for the generalized gamma distribution. Comput Stat 28:505–517
Pereira AGC, Andrade BB, Colosimo E (2014) Is it worth using a fuzzy controller to adjust the mutation probability in a genetic algorithm when the input variable is the number of iterations? In: Proceedings of SMTDA2014, Lisbon, Portugal (in press)
Rojas Cruz JA (1998) Convergência de Cadeias de Markov Não-Homogêneas: Ergodicidade Fraca e Forte, Doctoral Thesis. Dep. of Mathematics, Universidade de Brasília, Brazil
Rojas Cruz JA, Pereira AGC (2013) The elitist nonhomogeneous genetic algorithm: almost sure convergence. Stat Probab Lett 83:2179–2185
Rudolph G (1994) Convergence analysis of canonical genetic algorithms. IEEE Trans Neural Netw 5:96–101
Törn A, Zilinskas A (1989) Global optimization, lecture notes in computer science, vol 350. Springer, Berlin
Varnamkhasti MJ, Lee LS, Abu Bakar MR, Leong WJ (2012) A genetic algorithm with fuzzy crossover operator and probability. Adv Oper Res. doi:10.1155/2012/956498
Yu Y, Zhou Z-H (2008) A new approach to estimating the expected first hitting time of evolutionary algorithms. Artif Intell 172:1809–1832
Acknowledgments
We sincerely thank two anonymous referees for their scrutiny of the manuscript which led to many improvements. The authors were partially supported by PROCAD/CAPES.
Author information
Authors and Affiliations
Corresponding author
Appendix: Results from the literature
Appendix: Results from the literature
The definitions and results below are used in Sect. 2.
Throughout this paper we have adopted the \(\infty \)-norm \(\Vert a \Vert = \max _{i \in E} a(i)\) with induced matrix norm \(\Vert A \Vert = \max _i\sum _j A(i,j)\).
Definition 1
A nonhomogeneous Markov chain \(\{P_t\}_{t\in \mathbb {N}}\) is said to be weakly ergodic if it satisfies
where \(\mu _0\) and \(\mu _1\) are any two finite probability distributions and the \(k\)-step transition from \(t\) is given by the product of the transition matrices, \(P^{(t,k)} = P_t P_{t+1}\ldots P_{t+k-1}\) (Markov property).
Strong ergodicity is characterized by the existence of a (row-constant) limit transition matrix \( P_{\infty }\) such that \( \lim _{k\rightarrow \infty } \Vert P^{(t,k)}-P_{\infty } \Vert =0\) \(\forall t\ge 0\) but it will not be used here. We just note that Campos and Pereira (2012) show that weak ergodicity along with \( \lim _{t\rightarrow \infty } \Vert P_t - P_{\infty } \Vert =0 \) yield strong ergodicity.
Definition 2
Given a stochastic matrix \(Q= \left[ q_{ij}\right] _{i,j\in E}\), the Dobrushin ergodic coefficient is defined by
and \(\delta (Q)=1-\alpha (Q)\) is called the \(delta\)-coefficient. Note that (12) is equivalent to \( \lim _{k\rightarrow \infty } \alpha (P^{(t,k)}) = 1\) or \(\lim _{k\rightarrow \infty } \delta (P^{(t,k)})=0,\,\forall t\ge 0.\)
We summarize Theorems 1, 2 and Corollary 3 from Rojas Cruz and Pereira (2013) in the following lemma.
Lemma 1
Let \(\{X_t\}_{t \in \mathbb {N}}\) be the Markov chain which models the elitist NGA with stationary selection and crossover operators but time-dependent mutation rate \(p_m^{(t)}\). Denote the transition matrices by \(P_t= SCM _t\). Let \(E^*\) be a subset of the state space containing an optimizer in it. Suppose that \(0 < p_m^{(t)} < 1\) and that there exists a strictly positive sequence \(\{\xi _t\}\) such that
and also
Then,
The next lemma is taken from Rojas Cruz (1998).
Lemma 2
Let \(\{P_t\}_{t\in \mathbb {N}}\) be a nonhomogeneous Markov chain. If
then there exists a stochastic matrix \(P_{\infty }\) such that
The next lemma bundles some results taken from the textbook of Isaacson and Madsen (1976).
Lemma 3
Let \(P\), \(Q\) and \(R\) be transition matrices. Then,
Rights and permissions
About this article
Cite this article
Pereira, A.G.C., de Andrade, B.B. On the genetic algorithm with adaptive mutation rate and selected statistical applications. Comput Stat 30, 131–150 (2015). https://doi.org/10.1007/s00180-014-0526-x
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00180-014-0526-x