2024 | OriginalPaper | Book chapter

Hybrid Surrogate Assisted Evolutionary Multiobjective Reinforcement Learning for Continuous Robot Control

Authors: Atanu Mazumdar, Ville Kyrki

Published in: Applications of Evolutionary Computation

Publisher: Springer Nature Switzerland

Abstract

Many real-world reinforcement learning (RL) problems involve multiple conflicting objective functions that must be optimized simultaneously. Finding the optimal policies for different preferences over the objectives (known as Pareto optimal policies) requires extensive state space exploration. Obtaining a dense set of Pareto optimal policies is therefore challenging and often reduces sample efficiency. In this paper, we propose a hybrid multiobjective policy optimization approach for solving multiobjective reinforcement learning (MORL) problems with continuous actions. Our approach combines the faster convergence of multiobjective policy gradient (MOPG) with a surrogate-assisted multiobjective evolutionary algorithm (MOEA) to produce a dense set of Pareto optimal policies. The solutions found by the MOPG algorithm are used to build computationally inexpensive surrogate models, defined in the parameter space of the policies, that approximate the return of each policy. An MOEA is then executed that uses the surrogates’ mean predictions and the uncertainty in those predictions to find approximately optimal policies. The final solution policies are subsequently evaluated using the simulator and stored in an archive. Tests on multiobjective continuous-action RL benchmarks show that a hybrid surrogate-assisted multiobjective evolutionary optimizer with a robust selection criterion produces a dense set of Pareto optimal policies without extensively exploring the state space. We also apply the proposed approach to train Pareto optimal agents for autonomous driving, where the hybrid approach produced superior results compared to a state-of-the-art MOPG algorithm.
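To make the role of prediction uncertainty concrete, the following minimal Python sketch filters candidate policies by Pareto dominance, once on the surrogate means alone and once on a pessimistic estimate (mean minus k times the standard deviation). This is an illustration only: the abstract does not specify the robust selection criterion, so the lower-confidence-bound rule, the function names, and the numbers below are all assumptions, not the authors' method.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def dominates(a: Vector, b: Vector) -> bool:
    """True if return vector a Pareto-dominates b (all objectives maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def nondominated(points: List[Vector]) -> List[int]:
    """Indices of the points that no other point dominates."""
    return [
        i
        for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


def lower_confidence_bound(mean: Vector, std: Vector, k: float = 1.0) -> Tuple[float, ...]:
    """Pessimistic per-objective estimate, mean - k * std (assumed criterion)."""
    return tuple(m - k * s for m, s in zip(mean, std))


# Hypothetical surrogate outputs for four candidate policies and two objectives.
means = [(1.0, 5.0), (3.0, 3.0), (5.0, 1.0), (4.0, 4.0)]
stds = [(0.1, 0.1), (0.1, 0.1), (0.1, 0.1), (2.0, 2.0)]

# Selection on the mean prediction only.
front_mean = nondominated(means)

# Selection on the pessimistic estimate, which penalizes uncertain predictions.
front_robust = nondominated(
    [lower_confidence_bound(m, s) for m, s in zip(means, stds)]
)

print(front_mean)
print(front_robust)
```

In this toy data the fourth candidate has the best mean trade-off but a large predicted uncertainty, so it is kept when selecting on the mean and discarded under the pessimistic estimate, while the second candidate it previously dominated re-enters the front.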

The code is available at https://github.com/amrzr/SA-MOEAMOPG.

Title: Hybrid Surrogate Assisted Evolutionary Multiobjective Reinforcement Learning for Continuous Robot Control
Authors: Atanu Mazumdar, Ville Kyrki
Publisher: Springer Nature Switzerland
Book: Applications of Evolutionary Computation
Print ISBN: 978-3-031-56854-1
Electronic ISBN: 978-3-031-56855-8
Copyright year: 2024
DOI: https://doi.org/10.1007/978-3-031-56855-8_4
