
2020 | OriginalPaper | Book chapter

11. Genetic Programming Symbolic Regression: What Is the Prior on the Prediction?

Authors: Miguel Nicolau, James McDermott

Published in: Genetic Programming Theory and Practice XVII

Publisher: Springer International Publishing


Abstract

In the context of Genetic Programming Symbolic Regression, we empirically investigate the prior on the output prediction, that is, the distribution of the output prior to observing data. We distinguish between the prior due to initialisation and due to evolutionary search. We also investigate the effect on the prior of maximum tree depth and the effect of different function sets and different independent variable distributions. We find that priors are highly diffuse and sometimes include support for extreme values. We compare priors to values for dependent variables observed in benchmarks and real-world problems, finding that mismatches occur and can affect algorithm behaviour and performance. As a further application of our results, we investigate the behaviour of mutation operators in semantic space.
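The prior due to initialisation can be illustrated with a minimal sketch: sample random expression trees, evaluate each at a randomly drawn input, and summarise the resulting outputs. Everything specific here is an assumption for illustration and not the chapter's experimental setup: the function set (`add`, `sub`, `mul`), the 0.3 terminal probability in the grow method, the constant range, the single input variable drawn uniformly from [-1, 1], and the names `FUNCTIONS`, `grow`, `evaluate`, `sample_prior` and `summarise`.

```python
import random
import statistics

# Hypothetical function set: name -> (arity, implementation).
# Division is deliberately left out so every output stays finite.
FUNCTIONS = {
    "add": (2, lambda a, b: a + b),
    "sub": (2, lambda a, b: a - b),
    "mul": (2, lambda a, b: a * b),
}


def grow(rng, max_depth):
    """Build a random expression tree with a simple 'grow' initialisation."""
    # Terminals are forced at depth 0 and otherwise chosen with an
    # assumed probability of 0.3.
    if max_depth == 0 or rng.random() < 0.3:
        # Terminal: the variable "x" or a constant drawn from [-1, 1].
        return "x" if rng.random() < 0.5 else rng.uniform(-1.0, 1.0)
    name = rng.choice(sorted(FUNCTIONS))
    arity, _ = FUNCTIONS[name]
    return (name, [grow(rng, max_depth - 1) for _ in range(arity)])


def evaluate(tree, x):
    """Evaluate a tree at the input value x."""
    if tree == "x":
        return x
    if isinstance(tree, float):
        return tree
    name, children = tree
    _, fn = FUNCTIONS[name]
    return fn(*(evaluate(child, x) for child in children))


def sample_prior(n_trees, max_depth, seed=0):
    """Sample outputs of random trees before any data is observed."""
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_trees):
        tree = grow(rng, max_depth)
        x = rng.uniform(-1.0, 1.0)  # assumed independent-variable distribution
        outputs.append(evaluate(tree, x))
    return outputs


def summarise(outputs):
    """Return the five-number summary of the sampled outputs."""
    q1, median, q3 = statistics.quantiles(outputs, n=4)
    return {"min": min(outputs), "q1": q1, "median": median,
            "q3": q3, "max": max(outputs)}
```

Calling `summarise(sample_prior(1000, max_depth=4))` and repeating with a different `max_depth`, function set or input range gives a rough sense of how those choices shift and widen the output distribution in this toy setting.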

Footnotes
1
In many GP configurations, there are no true local optima, since the mutation operator can jump anywhere in the search space in a single step. We can informally define a local pseudo-optimum as a point where improving steps are not impossible but highly unlikely.
 
2
We adopt the whisker definition of third quartile + 1.5 × IQR for the upper whisker, and correspondingly first quartile − 1.5 × IQR for the lower whisker.
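The whisker definition in footnote 2 can be written out as a short sketch. The function name `whisker_bounds` is hypothetical, and the quartile estimation method (`method="inclusive"` in Python's `statistics.quantiles`) is an assumption, since the footnote does not say how the quartiles are computed.

```python
import statistics


def whisker_bounds(values):
    """Return (lower, upper) whisker positions from the 1.5 * IQR rule."""
    # Quartiles estimated with the 'inclusive' method (an assumption).
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr
```

For the values 1 to 5 the quartiles are 2 and 4, so the IQR is 2 and the bounds are -1 and 7. Some plotting libraries additionally clip each whisker to the most extreme data point inside these bounds; this sketch returns the bounds themselves.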
 
Metadata
Title
Genetic Programming Symbolic Regression: What Is the Prior on the Prediction?
Authors
Miguel Nicolau
James McDermott
Copyright year
2020
DOI
https://doi.org/10.1007/978-3-030-39958-0_11
