
2020 | OriginalPaper | Chapter

Quantile Regression Hindsight Experience Replay

Authors : Qiwei He, Liansheng Zhuang, Wei Zhang, Houqiang Li

Published in: Neural Information Processing

Publisher: Springer International Publishing


Abstract

Efficient learning in environments with sparse rewards is one of the most important challenges in Deep Reinforcement Learning (DRL). In continuous DRL environments such as robotic manipulation tasks, multi-goal RL together with its accompanying algorithm, Hindsight Experience Replay (HER), has been shown to be an effective solution. However, HER and its variants typically suffer from a major challenge: the agent may perform well on some goals while performing poorly on others. The main reason for this phenomenon is a concept popular in recent DRL work called intrinsic stochasticity. In multi-goal RL, intrinsic stochasticity arises because the different initial goals of the environment induce different value distributions that interfere with one another, so that computing the expected return is unsuitable in principle and cannot perform as well as usual. To tackle this challenge, in this paper we propose Quantile Regression Hindsight Experience Replay (QR-HER), a novel approach based on quantile regression. The key idea is to select, from the replay buffer and without additional data, the returns that are most closely related to the current goal. In this way, the interference between different initial goals is significantly reduced. We evaluate QR-HER on OpenAI Robotics manipulation tasks with sparse rewards. Experimental results show that, in contrast to HER and its variants, QR-HER achieves better performance by improving the performance on each individual goal, as expected.
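The abstract's quantile regression component follows the distributional RL line of work it cites (QR-DQN, Dabney et al. 2017), where the value network predicts a set of quantiles of the return distribution rather than a single expected return. A minimal NumPy sketch of the quantile Huber loss from that paper is shown below; the function name and shapes are illustrative, not taken from the authors' implementation:

```python
import numpy as np

def quantile_huber_loss(theta, target, kappa=1.0):
    """Quantile regression Huber loss (QR-DQN, Dabney et al., 2017).

    theta  : (N,) predicted quantile values of the return distribution
    target : (M,) sampled target returns (e.g. Bellman backup samples)
    kappa  : Huber threshold
    """
    N = len(theta)
    # Midpoint quantile fractions tau_hat_i = (2i + 1) / (2N)
    tau_hat = (np.arange(N) + 0.5) / N
    # Pairwise TD errors u_ij = target_j - theta_i, shape (N, M)
    u = target[None, :] - theta[:, None]
    # Elementwise Huber loss: quadratic near zero, linear beyond kappa
    huber = np.where(np.abs(u) <= kappa,
                     0.5 * u ** 2,
                     kappa * (np.abs(u) - 0.5 * kappa))
    # Asymmetric quantile weighting |tau_hat - 1{u < 0}| makes each
    # theta_i converge to the tau_hat_i-quantile of the targets
    weight = np.abs(tau_hat[:, None] - (u < 0).astype(float))
    return (weight * huber / kappa).mean()
```

Minimizing this loss drives each predicted value toward a distinct quantile of the return distribution, which is what lets a method like QR-HER reason about per-goal return distributions instead of a single interfering expectation.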


Literature
1. Andrychowicz, M., et al.: Hindsight experience replay. In: Advances in Neural Information Processing Systems, pp. 5048–5058 (2017)
2. Bellemare, M.G., Dabney, W., Munos, R.: A distributional perspective on reinforcement learning. In: Proceedings of the 34th International Conference on Machine Learning, vol. 70, pp. 449–458. JMLR.org (2017)
3. Dabney, W., Rowland, M., Bellemare, M.G., Munos, R.: Distributional reinforcement learning with quantile regression (2017)
4. Fang, M., Zhou, T., Du, Y., Han, L., Zhang, Z.: Curriculum-guided hindsight experience replay. In: Advances in Neural Information Processing Systems, pp. 12602–12613 (2019)
5. LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521(7553), 436 (2015)
7. Lin, L.J.: Self-improving reactive agents based on reinforcement learning, planning and teaching. Mach. Learn. 8(3–4), 293–321 (1992)
8. Plappert, M., et al.: Multi-goal reinforcement learning: challenging robotics environments and request for research. arXiv preprint arXiv:1802.09464 (2018)
9. Schaul, T., Horgan, D., Gregor, K., Silver, D.: Universal value function approximators. In: International Conference on Machine Learning, pp. 1312–1320 (2015)
10. Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. MIT Press (2018)
Metadata
Title
Quantile Regression Hindsight Experience Replay
Authors
Qiwei He
Liansheng Zhuang
Wei Zhang
Houqiang Li
Copyright Year
2020
DOI
https://doi.org/10.1007/978-3-030-63820-7_94
