Abstract
Randomized direct search algorithms for continuous domains, such as evolution strategies, are basic tools in machine learning. They are especially needed when the gradient of an objective function (e.g., loss, energy, or reward function) cannot be computed or estimated efficiently. Application areas include supervised and reinforcement learning as well as model selection. These randomized search strategies often rely on normally distributed additive variations of candidate solutions. To search efficiently in non-separable and ill-conditioned landscapes, the covariance matrix of the normal distribution must be adapted, amounting to a variable metric method. Consequently, covariance matrix adaptation (CMA) is considered state-of-the-art in evolution strategies. In order to sample from the normal distribution, the adapted covariance matrix needs to be decomposed, requiring in general Θ(n³) operations, where n is the search space dimension. We propose a new update mechanism which can replace a rank-one covariance matrix update and the computationally expensive decomposition of the covariance matrix. The newly developed update rule reduces the computational complexity of the rank-one covariance matrix adaptation to Θ(n²) without resorting to outdated distributions. We derive new versions of the elitist covariance matrix adaptation evolution strategy (CMA-ES) and the multi-objective CMA-ES. These algorithms are equivalent to the original procedures except that the update step for the variable metric distribution scales better in the problem dimension. We also introduce a simplified variant of the non-elitist CMA-ES with the incremental covariance matrix update and investigate its performance. Apart from the reduced time complexity of the distribution update, the algebraic computations involved in all new algorithms are simpler compared to the original versions. The new update rule improves the performance of the CMA-ES for large-scale machine learning problems in which the objective function can be evaluated quickly.
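The core algebraic idea behind such an incremental update can be sketched as follows: instead of storing the covariance matrix C and re-decomposing it in every generation, one maintains a factor A with C = A Aᵀ and updates A directly whenever C receives a rank-one update of the form C ← αC + βvvᵀ with v = Az. The NumPy sketch below is illustrative, with our own variable names rather than the paper's notation; its correctness (A′A′ᵀ = αC + βvvᵀ) can be checked by expanding the product.

```python
import numpy as np

def rank_one_factor_update(A, z, alpha, beta):
    """Given C = A @ A.T, return A2 such that
    A2 @ A2.T = alpha * C + beta * np.outer(v, v) with v = A @ z,
    using only O(n^2) operations (no decomposition of C)."""
    v = A @ z
    z2 = z @ z  # squared Euclidean norm of z
    if z2 == 0.0:
        # Degenerate rank-one direction: only the scaling survives.
        return np.sqrt(alpha) * A
    # Scalar chosen so that the cross and quadratic terms of the
    # expanded product A2 @ A2.T sum exactly to beta * v v^T.
    factor = (np.sqrt(alpha) / z2) * (np.sqrt(1.0 + (beta / alpha) * z2) - 1.0)
    return np.sqrt(alpha) * A + factor * np.outer(v, z)

# Check the update against the explicit rank-one covariance update.
rng = np.random.default_rng(0)
n = 5
A = np.tril(rng.standard_normal((n, n))) + n * np.eye(n)
z = rng.standard_normal(n)
alpha, beta = 0.9, 0.1
A2 = rank_one_factor_update(A, z, alpha, beta)
v = A @ z
C_direct = alpha * (A @ A.T) + beta * np.outer(v, v)
assert np.allclose(A2 @ A2.T, C_direct)
```

Note that A2 is in general no longer triangular after the update, but sampling x = m + σAy with y ~ N(0, I) only requires some matrix square root of C, so triangularity is not needed.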
Editor: Risto Miikkulainen.
Suttorp, T., Hansen, N. & Igel, C. Efficient covariance matrix update for variable metric evolution strategies. Mach Learn 75, 167–197 (2009). https://doi.org/10.1007/s10994-009-5102-1