
2019 | OriginalPaper | Chapter

Trading-off Data Fit and Complexity in Training Gaussian Processes with Multiple Kernels

Authors: Tinkle Chugh, Alma Rahat, Pramudita Satria Palar

Published in: Machine Learning, Optimization, and Data Science

Publisher: Springer International Publishing


Abstract

Gaussian processes (GPs) belong to a class of probabilistic techniques that have been successfully used in different domains of machine learning and optimization. They are popular because they provide uncertainties in predictions, which sets them apart from other modelling methods that provide only point predictions. This uncertainty is particularly useful for decision making because it lets us gauge how reliable a prediction is. One of the fundamental challenges in using GPs is that the efficacy of a model depends on selecting an appropriate kernel and the associated hyperparameter values for a given problem. Furthermore, the training of GPs, that is, optimizing the hyperparameters using a data set, is traditionally performed using a cost function that is a weighted sum of data fit and model complexity, and the underlying trade-off is completely ignored. To address these challenges and shortcomings, in this article we propose the following automated training scheme. Firstly, we use a weighted product of multiple kernels to relieve users of having to choose an appropriate kernel for the problem at hand without any domain-specific knowledge. Secondly, for the first time, we modify GP training by using a multi-objective optimizer to tune the hyperparameters and weights of multiple kernels and extract an approximation of the complete trade-off front between data fit and model complexity. We then propose a novel solution selection strategy based on mean standardized log loss (MSLL) to select a solution from the estimated trade-off front and finalise the training of a GP model. The results on three data sets and a comparison with the standard approach clearly show the potential benefit of the proposed approach of using multi-objective optimization with multiple kernels.
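To make the two competing quantities concrete, the following is a minimal sketch and not the authors' implementation. It assumes that "data fit" and "model complexity" are the two standard terms of the GP negative log marginal likelihood, and that MSLL is the usual definition relative to a trivial Gaussian predictor built from the training targets. The function names, the noise value, and the single squared-exponential kernel (a stand-in for the chapter's weighted product of kernels, whose exact form the abstract does not give) are all illustrative assumptions.

```python
import numpy as np


def rbf_kernel(X1, X2, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel (assumed stand-in for the multi-kernel covariance)."""
    sq = (
        np.sum(X1**2, axis=1)[:, None]
        + np.sum(X2**2, axis=1)[None, :]
        - 2.0 * X1 @ X2.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return variance * np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)


def fit_and_complexity(K, y, noise=1e-2):
    """Return (data_fit, complexity), both to be minimised.

    With Ky = K + noise * I, the negative log marginal likelihood is
    data_fit + complexity + (n / 2) * log(2 * pi), where
    data_fit = 0.5 * y^T Ky^{-1} y and complexity = 0.5 * log|Ky|.
    """
    n = y.shape[0]
    L = np.linalg.cholesky(K + noise * np.eye(n))
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))  # Ky^{-1} y
    data_fit = 0.5 * float(y @ alpha)
    complexity = float(np.sum(np.log(np.diag(L))))  # equals 0.5 * log|Ky|
    return data_fit, complexity


def msll(y_test, mu, var, y_train):
    """Mean standardized log loss; lower is better.

    Each test point's negative log predictive density is compared with that
    of a trivial Gaussian using the training mean and variance.
    """
    nlpd_model = 0.5 * np.log(2 * np.pi * var) + (y_test - mu) ** 2 / (2 * var)
    m0, v0 = np.mean(y_train), np.var(y_train)
    nlpd_trivial = 0.5 * np.log(2 * np.pi * v0) + (y_test - m0) ** 2 / (2 * v0)
    return float(np.mean(nlpd_model - nlpd_trivial))
```

Under these assumptions, a multi-objective optimizer would treat the pair returned by `fit_and_complexity` as two separate objectives instead of summing them, and `msll` evaluated on held-out data could then rank the resulting trade-off solutions.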


Metadata
Title
Trading-off Data Fit and Complexity in Training Gaussian Processes with Multiple Kernels
Authors
Tinkle Chugh
Alma Rahat
Pramudita Satria Palar
Copyright Year
2019
DOI
https://doi.org/10.1007/978-3-030-37599-7_48
