2019 | Original Paper | Book Chapter

Trading-off Data Fit and Complexity in Training Gaussian Processes with Multiple Kernels

Written by: Tinkle Chugh, Alma Rahat, Pramudita Satria Palar

Published in: Machine Learning, Optimization, and Data Science

Publisher: Springer International Publishing

Abstract

Gaussian processes (GPs) belong to a class of probabilistic techniques that have been successfully used in different domains of machine learning and optimization. They are popular because they provide uncertainties in predictions, which sets them apart from other modelling methods that provide only point predictions. The uncertainty is particularly useful for decision making, as we can gauge how reliable a prediction is. One of the fundamental challenges in using GPs is that the efficacy of a model depends on selecting an appropriate kernel and the associated hyperparameter values for a given problem. Furthermore, the training of GPs, that is, optimizing the hyperparameters using a data set, is traditionally performed using a cost function that is a weighted sum of data fit and model complexity, and the underlying trade-off is completely ignored. To address these challenges and shortcomings, in this article we propose the following automated training scheme. Firstly, we use a weighted product of multiple kernels, with a view to relieving users without domain-specific knowledge from having to choose an appropriate kernel for the problem at hand. Secondly, for the first time, we modify GP training by using a multi-objective optimizer to tune the hyperparameters and weights of multiple kernels and to extract an approximation of the complete trade-off front between data fit and model complexity. We then propose a novel solution selection strategy based on the mean standardized log loss (MSLL) to select a solution from the estimated trade-off front and finalise the training of a GP model. The results on three data sets, and a comparison with the standard approach, clearly show the potential benefit of the proposed approach of using multi-objective optimization with multiple kernels.
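The three steps described in the abstract can be illustrated with a small sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not confirm. The kernel is a plain unweighted product of a squared-exponential and a Matérn 3/2 kernel, because the abstract does not say how the kernel weights enter. Random sampling followed by a non-dominated filter stands in for the multi-objective optimizer. MSLL is computed on a held-out validation split. All function names, the toy data and the noise level are hypothetical. The two objectives are the standard data-fit and complexity terms of the negative log marginal likelihood of a zero-mean GP, 0.5 yᵀK⁻¹y and 0.5 log|K|.

```python
import numpy as np

def rbf(x1, x2, ls):
    d = x1[:, None] - x2[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

def matern32(x1, x2, ls):
    r = np.abs(x1[:, None] - x2[None, :]) / ls
    return (1.0 + np.sqrt(3.0) * r) * np.exp(-np.sqrt(3.0) * r)

def product_kernel(x1, x2, theta):
    # Assumption: unweighted product; theta holds the two length scales.
    return rbf(x1, x2, theta[0]) * matern32(x1, x2, theta[1])

def objectives(x, y, theta, noise=1e-2):
    """Return (data fit, complexity); both are to be minimized."""
    K = product_kernel(x, x, theta) + noise * np.eye(len(x))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))  # K^{-1} y
    data_fit = 0.5 * y @ alpha                  # 0.5 * y^T K^{-1} y
    complexity = np.sum(np.log(np.diag(L)))     # 0.5 * log|K|
    return data_fit, complexity

def negative_log_marginal_likelihood(x, y, theta, noise=1e-2):
    # The conventional single-objective cost: an equally weighted sum.
    data_fit, complexity = objectives(x, y, theta, noise)
    return data_fit + complexity + 0.5 * len(x) * np.log(2 * np.pi)

def non_dominated(points):
    """Indices of points that no other point dominates (minimization)."""
    points = np.asarray(points)
    keep = []
    for i, p in enumerate(points):
        better_or_equal = np.all(points <= p, axis=1)
        strictly_better = np.any(points < p, axis=1)
        if not np.any(better_or_equal & strictly_better):
            keep.append(i)
    return keep

def predict(x_tr, y_tr, x_new, theta, noise=1e-2):
    K = product_kernel(x_tr, x_tr, theta) + noise * np.eye(len(x_tr))
    Ks = product_kernel(x_new, x_tr, theta)
    L = np.linalg.cholesky(K)
    mean = Ks @ np.linalg.solve(L.T, np.linalg.solve(L, y_tr))
    v = np.linalg.solve(L, Ks.T)
    # Both kernels have unit amplitude, so k(x, x) = 1.
    var = 1.0 + noise - np.sum(v ** 2, axis=0)
    return mean, var

def msll(y_true, mean, var, y_train):
    """Mean standardized log loss: model loss minus trivial-model loss."""
    nlpd = 0.5 * np.log(2 * np.pi * var) + (y_true - mean) ** 2 / (2 * var)
    m0, v0 = y_train.mean(), y_train.var()
    nlpd0 = 0.5 * np.log(2 * np.pi * v0) + (y_true - m0) ** 2 / (2 * v0)
    return np.mean(nlpd - nlpd0)

# Toy data (hypothetical): noisy sine, split into training and validation.
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 10, 40))
y = np.sin(x) + 0.1 * rng.standard_normal(40)
x_tr, y_tr, x_val, y_val = x[::2], y[::2], x[1::2], y[1::2]

# Stand-in for the multi-objective optimizer: sample, then filter.
candidates = rng.uniform(0.1, 5.0, size=(50, 2))
objs = np.array([objectives(x_tr, y_tr, t) for t in candidates])
front = non_dominated(objs)

# Select the front member with the lowest validation MSLL.
scores = [msll(y_val, *predict(x_tr, y_tr, x_val, candidates[i]), y_tr)
          for i in front]
best = candidates[front[int(np.argmin(scores))]]
```

Minimizing `negative_log_marginal_likelihood` alone returns a single point of the trade-off, whereas `front` keeps every sampled hyperparameter setting that is not beaten on both objectives at once, so the final choice can be made with a separate criterion such as MSLL.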

Metadata
Title
Trading-off Data Fit and Complexity in Training Gaussian Processes with Multiple Kernels
Written by
Tinkle Chugh
Alma Rahat
Pramudita Satria Palar
Copyright Year
2019
DOI
https://doi.org/10.1007/978-3-030-37599-7_48