2013 | Original Paper | Book chapter

9. A Baseline Symbolic Regression Algorithm

Written by: Michael F. Korns

Published in: Genetic Programming Theory and Practice X

Publisher: Springer New York

Abstract

Recent advances in symbolic regression (SR) have promoted the field into the early stages of commercial exploitation. This is the expected maturation history for an academic field which is progressing rapidly. The original published symbolic regression algorithms in (Koza 1994) have long since been replaced by techniques such as Pareto front optimization, age-layered population structures, and even age-fitness Pareto optimization. The lack of specific techniques for optimizing embedded real numbers in the original algorithms has been remedied by sophisticated techniques for optimizing embedded constants. Symbolic regression is coming of age as a technology.
As the discipline of symbolic regression has matured, the first commercial SR packages have appeared. At least one commercial package has been on the market for several years (http://www.rmltech.com/). There is now at least one well-documented commercial symbolic regression package available for Mathematica (www.evolved-analytics.com). There is also at least one very well done open source symbolic regression package available for free download (http://ccsl.mae.cornell.edu/eureqa). Yet, even as the sophistication of commercial SR packages increases, there have been glaring issues with SR accuracy even on simple problems (Korns 2011). The depth and breadth of SR adoption in industry and academia will be greatly affected by the demonstrable accuracy of available SR algorithms and tools.
In this chapter we develop a complete public domain algorithm for modern symbolic regression which is reasonably competitive with current commercial SR packages, and we calibrate its accuracy on a set of previously published sample problems. This algorithm is designed as a baseline for further public domain research on SR algorithm simplicity and accuracy. No claim is made placing this baseline algorithm on a par with commercial packages, especially as the commercial offerings can be expected to improve relentlessly in the future. However, this baseline is a great improvement over the original published algorithms, and is an attempt to consolidate the latest published research into a simplified baseline algorithm of similar speed and accuracy.
The baseline algorithm presented herein is called Age Weighted Pareto Optimization. It is an amalgamation of recently published techniques in Pareto front optimization (Kotanchek et al. 2007), age-layered population structures (Hornby 2006), age-fitness Pareto optimization (Schmidt and Lipson 2010), and specialized embedded abstract constant optimization (Korns 2010). The complete pseudocode for the baseline algorithm is presented in this chapter. It is developed step by step as enhancements to the original published SR algorithm (Koza 1992), with justifications for each enhancement. Before-and-after speed and accuracy comparisons are made for each enhancement on a series of previously published sample problems.
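The abstract does not reproduce the chapter's pseudocode, so the following is only a minimal sketch of the general idea behind the age-fitness Pareto techniques cited above, not the Age Weighted Pareto Optimization algorithm itself. It assumes each candidate formula carries two objectives to minimize, an age and a fitness error, and keeps only the non-dominated candidates. The `Candidate` type, its field names, and the sample population are hypothetical and chosen purely for illustration.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    expr: str     # hypothetical string form of a candidate formula
    age: int      # generations since the candidate's lineage was introduced
    error: float  # fitness error on the training data (lower is better)


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a.age <= b.age and a.error <= b.error
    strictly_better = a.age < b.age or a.error < b.error
    return no_worse and strictly_better


def pareto_front(population: List[Candidate]) -> List[Candidate]:
    """Return the candidates that no other candidate dominates, in input order."""
    return [c for c in population
            if not any(dominates(other, c) for other in population)]


# Illustrative (made-up) population of candidate formulas.
population = [
    Candidate("x0 + 1.5", age=1, error=0.90),
    Candidate("3.1*x0*x1", age=4, error=0.20),
    Candidate("sin(x0)", age=4, error=0.35),     # dominated by 3.1*x0*x1
    Candidate("x0*x1 + x2", age=9, error=0.05),
    Candidate("cos(x1)", age=9, error=0.50),     # dominated by 3.1*x0*x1
]

front = pareto_front(population)
for c in front:
    print(c.expr, c.age, c.error)
```

In this sketch a young candidate with a high error survives alongside an old candidate with a low error, which illustrates how treating age as a second objective can protect newly introduced candidates from being displaced too early.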

References
Hornby GS (2006) ALPS: the age-layered population structure for reducing the problem of premature convergence. In: Keijzer M, Cattolico M, Arnold D, Babovic V, Blum C, Bosman P, Butz MV, Coello Coello C, Dasgupta D, Ficici SG, Foster J, Hernandez-Aguirre A, Hornby G, Lipson H, McMinn P, Moore J, Raidl G, Rothlauf F, Ryan C, Thierens D (eds) GECCO 2006: Proceedings of the 8th annual conference on Genetic and evolutionary computation, ACM Press, Seattle, Washington, USA, vol 1, pp 815–822, doi:10.1145/1143997.1144142, URL http://www.cs.bham.ac.uk/wbl/biblio/gecco2006/docs/p815.pdf
Korns MF (2011) Accuracy in symbolic regression. In: Riolo R, Vladislavleva E, Moore JH (eds) Genetic Programming Theory and Practice IX, Genetic and Evolutionary Computation, Springer, Ann Arbor, USA, chap 8, pp 129–151, doi:10.1007/978-1-4614-1770-5-8
Kotanchek M, Smits G, Vladislavleva E (2007) Trustable symbolic regression models: using ensembles, interval arithmetic and pareto fronts to develop robust and trust-aware models. In: Riolo RL, Soule T, Worzel B (eds) Genetic Programming Theory and Practice V, Genetic and Evolutionary Computation, Springer, Ann Arbor, chap 12, pp 201–220, doi:10.1007/978-0-387-76308-8-12
Koza JR (1992) Genetic Programming: On the Programming of Computers by Means of Natural Selection. MIT Press, Cambridge, MA, USA
Koza JR (1994) Genetic programming: On the programming of computers by means of natural selection. Statistics and Computing 4(2):87–112, doi:10.1007/BF00175355
McConaghy T (2011) FFX: Fast, scalable, deterministic symbolic regression technology. In: Riolo R, Vladislavleva E, Moore JH (eds) Genetic Programming Theory and Practice IX, Genetic and Evolutionary Computation, Springer, Ann Arbor, USA, chap 13, pp 235–260, doi:10.1007/978-1-4614-1770-5-13
Nelder J, Wedderburn R (1972) Generalized linear models. Journal of the Royal Statistical Society 135:370–384
Poli R, McPhee N, Vanneschi L (2009) Analysis of the Effects of Elitism on Bloat in Linear and Tree-based Genetic Programming. Springer, New York
Schmidt M, Lipson H (2010) Age-Fitness Pareto Optimization. Springer, New York
Smits G, Kotanchek M (2005) Pareto-Front Exploitation in Symbolic Regression
Metadata
Title
A Baseline Symbolic Regression Algorithm
Written by
Michael F. Korns
Copyright year
2013
Publisher
Springer New York
DOI
https://doi.org/10.1007/978-1-4614-6846-2_9