Skip to main content

Advertisement

Log in

On the genetic algorithm with adaptive mutation rate and selected statistical applications

  • Original Paper
  • Published:
Computational Statistics Aims and scope Submit manuscript

Abstract

We give sufficient conditions which the mutation rate must satisfy for the convergence of the genetic algorithm when that rate is allowed to change throughout iterations. The empirical performance of the algorithm with regards to changes in the mutation parameter is explored via test functions, ARIMA model selection and maximum likelihood estimation illustrating the advantages of letting the mutation rate decrease from rather unusual high values to the commonly used low ones.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8

Similar content being viewed by others

References

  • Allufi-Pentini F, Parisi V, Zirilli F (1985) Global optimization and stochastic differential equations. J Optim Theory Appl 47:1–16

    Article  MathSciNet  Google Scholar 

  • Bolfarine H, Martínez-Flórez G, Salinas HS (2013) Bimodal symmetric-asymmetric power-normal families. Commun Stat Theory. doi:10.1080/03610926.2013.765475

  • Box GEP, Jenkins G, Reinsel GC (1994) Time series analysis: forecasting and control, 3rd edn. Prentice-Hall, New Jersey

    MATH  Google Scholar 

  • Campos VEM, Pereira AGC (2012) Modeling the genetic algorithm by a nonhomogeneous Markov chain: weak and strong ergodicity. Theory Probab Appl 57:85–192

    MathSciNet  Google Scholar 

  • Chen T, Tang K, Chen G, Yao X (2012) A large population size can be unhelpful in evolutionary algorithms. Theor Comput Sci 436:54–70

    Article  MATH  MathSciNet  Google Scholar 

  • Cox C (2008) The generalized F distribution: an umbrella for parametric survival analysis. Stat Med 27:4301–4312

    Article  MathSciNet  Google Scholar 

  • Douc R, Moulines E, Rosenthal JS (2004) Quantitative bounds on convergence of time-inhomogeneous Markov chains. Ann Appl Probab 14:1605–2201

    Article  MathSciNet  Google Scholar 

  • Eiben AE, Hinterding R, Michalewicz Z (1999) Parameter control in evolutionary algorithms. IEEE Trans Evol Comput 3:124–141

    Article  Google Scholar 

  • Fogarty TC (1989) Varying the probability of mutation in the genetic algorithm. In: Proceedings of the 3rd ICGA. Morgan Kaufmann Pub. Inc., San Francisco, USA, pp 104–109

  • Givens GH, Hoeting JA (2012) Computational statistics. Wiley, New York

    Book  Google Scholar 

  • Goldberg DE (1989) Genetic algorithms in search, optimization and machine learning. Addison Wesley, New York

    MATH  Google Scholar 

  • Hallam J, Akman O, Akman F (2010) Genetic algorithms with shrinking population size. Comput Stat 25:691–705

    Article  MATH  MathSciNet  Google Scholar 

  • He J, Yu X (2001) Conditions for the convergence of evolutionary algorithms. J Syst Archit 47:601–612

    Article  Google Scholar 

  • He J, Kang L (1999) On the convergence rates of genetic algorithms. Theor Comput Sci 229:23–39

    Article  MATH  MathSciNet  Google Scholar 

  • Hyndman RJ (2013) Time series data library, http://data.is/TSDLdemo. Accessed on January, 2013

  • Hokstad P (1983) A method for diagnostic checking of time series models. J Time Ser Anal 4:177–183

    Article  MathSciNet  Google Scholar 

  • Holland JH (1975) Adaptation in natural and artificial systems. The University of Michigan Press, Ann Arbor

    Google Scholar 

  • Isaacson RL, Madsen RW (1976) Markov chains: theory and applications. Wiley, New York

    MATH  Google Scholar 

  • Michalewicz Z (1992) Genetic algorithms + data structures = evolution programs. Springer, Berlin

    Book  MATH  Google Scholar 

  • Michalewicz Z, Fogel DB (2004) How to solve it: modern heuristics, 2nd edn. Springer, Berlin

    Book  MATH  Google Scholar 

  • Morris MI (1977) Forecasting the sunspot cycle. J R Stat Soc B 140:437–468

    Google Scholar 

  • Noufaily A, Jones MC (2013) On maximization of the likelihood for the generalized gamma distribution. Comput Stat 28:505–517

    Article  MATH  MathSciNet  Google Scholar 

  • Pereira AGC, Andrade BB, Colosimo E (2014) Is it worth using a fuzzy controller to adjust the mutation probability in a genetic algorithm when the input variable is the number of iterations? In: Proceedings of SMTDA2014, Lisbon, Portugal (in press)

  • Rojas Cruz JA (1998) Convergência de Cadeias de Markov Não-Homogêneas: Ergodicidade Fraca e Forte, Doctoral Thesis. Dep. of Mathematics, Universidade de Brasília, Brazil

  • Rojas Cruz JA, Pereira AGC (2013) The elitist nonhomogeneous genetic algorithm: almost sure convergence. Stat Probab Lett 83:2179–2185

    Article  MATH  MathSciNet  Google Scholar 

  • Rudolph G (1994) Convergence analysis of canonical genetic algorithms. IEEE Trans Neural Netw 5:96–101

    Article  Google Scholar 

  • Törn A, Zilinskas A (1989) Global optimization, lecture notes in computer science, vol 350. Springer, Berlin

    Google Scholar 

  • Varnamkhasti MJ, Lee LS, Abu Bakar MR, Leong WJ (2012) A genetic algorithm with fuzzy crossover operator and probability. Adv Oper Res. doi:10.1155/2012/956498

  • Yu Y, Zhou Z-H (2008) A new approach to estimating the expected first hitting time of evolutionary algorithms. Artif Intell 172:1809–1832

    Article  MATH  MathSciNet  Google Scholar 

Download references

Acknowledgments

We sincerely thank two anonymous referees for their scrutiny of the manuscript which led to many improvements. The authors were partially supported by PROCAD/CAPES.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Bernardo B. de Andrade.

Appendix: Results from the literature

Appendix: Results from the literature

The definitions and results below are used in Sect. 2.

Throughout this paper we have adopted the \(\infty \)-norm \(\Vert a \Vert = \max _{i \in E} a(i)\) with induced matrix norm \(\Vert A \Vert = \max _i\sum _j A(i,j)\).

Definition 1

A nonhomogeneous Markov chain \(\{P_t\}_{t\in \mathbb {N}}\) is said to be weakly ergodic if it satisfies

$$\begin{aligned} \lim _{k\rightarrow \infty } \Vert \mu _0P^{(t,k)}-\mu _1P^{(t,k)} \Vert = 0, \quad \forall t\ge 0, \end{aligned}$$
(12)

where \(\mu _0\) and \(\mu _1\) are any two finite probability distributions and the \(k\)-step transition from \(t\) is given by the product of the transition matrices, \(P^{(t,k)} = P_t P_{t+1}\ldots P_{t+k-1}\) (Markov property).

Strong ergodicity is characterized by the existence of a (row-constant) limit transition matrix \( P_{\infty }\) such that \( \lim _{k\rightarrow \infty } \Vert P^{(t,k)}-P_{\infty } \Vert =0\) \(\forall t\ge 0\) but it will not be used here. We just note that Campos and Pereira (2012) show that weak ergodicity along with \( \lim _{t\rightarrow \infty } \Vert P_t - P_{\infty } \Vert =0 \) yield strong ergodicity.

Definition 2

Given a stochastic matrix \(Q= \left[ q_{ij}\right] _{i,j\in E}\), the Dobrushin ergodic coefficient is defined by

$$\begin{aligned} \alpha (Q)=1-\sup _{i,k \in E}\sum _{j \in E} \left( q_{ij}-q_{kj}\right) ^{+} \quad \text{ with } \quad \left( q_{ij}-q_{kj}\right) ^{+}= \max \left\{ 0,q_{ij}-q_{kj}\right\} , \end{aligned}$$

and \(\delta (Q)=1-\alpha (Q)\) is called the \(delta\)-coefficient. Note that (12) is equivalent to \( \lim _{k\rightarrow \infty } \alpha (P^{(t,k)}) = 1\) or \(\lim _{k\rightarrow \infty } \delta (P^{(t,k)})=0,\,\forall t\ge 0.\)

We summarize Theorems 1, 2 and Corollary 3 from Rojas Cruz and Pereira (2013) in the following lemma.

Lemma 1

Let \(\{X_t\}_{t \in \mathbb {N}}\) be the Markov chain which models the elitist NGA with stationary selection and crossover operators but time-dependent mutation rate \(p_m^{(t)}\). Denote the transition matrices by \(P_t= SCM _t\). Let \(E^*\) be a subset of the state space containing an optimizer in it. Suppose that \(0 < p_m^{(t)} < 1\) and that there exists a strictly positive sequence \(\{\xi _t\}\) such that

$$\begin{aligned} \inf _{i\in E}P_t(i,j)\ge \xi _t,\; \forall j\in E \end{aligned}$$

and also

$$\begin{aligned} \sum _{t \ge 1}\xi _t=\infty . \end{aligned}$$

Then,

$$\begin{aligned} P\left( \lim _{t\rightarrow \infty }X_t\in E^*\right) =1. \end{aligned}$$
(13)

The next lemma is taken from Rojas Cruz (1998).

Lemma 2

Let \(\{P_t\}_{t\in \mathbb {N}}\) be a nonhomogeneous Markov chain. If

$$\begin{aligned} \sum _{n=1}^{\infty } \Vert P_{t+1}-P_t \Vert < \infty \end{aligned}$$

then there exists a stochastic matrix \(P_{\infty }\) such that

$$\begin{aligned} \lim _{n\rightarrow \infty } \Vert P_t - P_{\infty } \Vert =0. \end{aligned}$$

The next lemma bundles some results taken from the textbook of Isaacson and Madsen (1976).

Lemma 3

Let \(P\), \(Q\) and \(R\) be transition matrices. Then,

$$\begin{aligned}&\Vert P(Q-R)\Vert \le \Vert Q-R \Vert ,\end{aligned}$$
(14)
$$\begin{aligned}&\Vert (P-Q)R \Vert \le \Vert P-Q \Vert \delta (R) \le \Vert P- Q \Vert ,\end{aligned}$$
(15)
$$\begin{aligned}&\vert \delta (P) - \delta (Q) \vert \le \Vert P-Q \Vert . \end{aligned}$$
(16)

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Pereira, A.G.C., de Andrade, B.B. On the genetic algorithm with adaptive mutation rate and selected statistical applications. Comput Stat 30, 131–150 (2015). https://doi.org/10.1007/s00180-014-0526-x

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00180-014-0526-x

Keywords

Navigation