Top

Published in:

2013 | OriginalPaper | Chapter

6. Up the Down Staircase: Hierarchical Reinforcement Learning

Authors : Alexander Paprotny, Michael Thess

Published in: Realtime Data Mining

Publisher: Springer International Publishing

Activate our intelligent search to find suitable subject content or patents.

search-config

AI-assisted search

Off

Abstract

We address the question of how hierarchical, or multigrid, methods may figure in dynamic programming and reinforcement learning for recommendation engines.

After providing a general introduction, we approach the framework of hierarchical methods from both the historical analytical and algebraic viewpoints; we proceed to devising and justifying approaches to apply hierarchical methods to both the model-based as well as the model-free case. In regard to the latter, we set out from the multigrid reinforcement learning algorithms introduced by Ziv in [Ziv04] and extend these methods to finite-horizon problems.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

über 102.000 Bücher
über 537 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Finance + Banking
Management + Führung
Marketing + Vertrieb
Maschinenbau + Werkstoffe
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

inform now

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 390 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Maschinenbau + Werkstoffe

Jetzt Wissensvorsprung sichern!

inform now

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 340 Zeitschriften

aus folgenden Fachgebieten:

Bauwesen + Immobilien
Business IT + Informatik
Finance + Banking
Management + Führung
Marketing + Vertrieb
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

inform now

previous chapter How Engines Learn to Generate Recommendations: Adaptive Learning Algorithms

next chapter Breaking Dimensions: Adaptive Scoring with Sparse Grids

[AR02]

Andre, D., Russel, S.J.: State abstraction for programmable reinforcement learning agents. In: Proceedings of the National Conference on Artificial Intelligence, pp. 119–125 (2002)

[Bakh66]

Bakhvalov, N.S.: On the convergence of a relaxation method with natural constraints on the elliptic operator. USSR Comp. Math. Math. Phys. 6, 101–113 (1966)CrossRef

[BC89]

Bertsekas, D.P., Castanon, D.A.: Adaptive aggregation methods for infinite horizon dynamic programming. IEEE Trans. Automat. Contr. 34(6), 589–598 (1998)MathSciNet

[BMR82]

Brandt, A., McCormick S.F., Ruge J.W.: Algebraic multigrid (AMG) for automatic multigrid solution with application in geodetic computations. Technical Report CO POB 1852, Institute Computational Studies State University (1982)

[BPX90]

Bramble, J., Pasciak, J., Xu, J.: Parallel multilevel preconditioners. Math. Comput. 55, 1–22 (1990)CrossRefMATHMathSciNet

[Bra77]

Brandt, A.: Multi-level adaptive solutions to boundary-value problems. Math. Comput. 31, 333–390 (1977)CrossRefMATH

[Dau92]

Daubechies, I.: Ten Lectures on Wavelets. Society for Industrial and Applied Mathematics, Philadelphia (1992)CrossRefMATH

[Diet00]

Dietterich, T.G.: Hierarchical reinforcement learning with the MAXQ value function decomposition. J. Artif. Intell. Res. 13, 227–303 (2000)MATHMathSciNet

[Fed64]

Fedorenko, R.P.: The speed of convergence of one iterative process. USSR Comput. Math. Math. Phys. 4, 227–235 (1964)CrossRef

[GVL96]

Goloub, G.H., Van Loan, C.F.: Matrix Computations, 3rd edn. The Johns Hopkins University Press, Baltimore/London (1996)

[Ha85]

Hackbusch, W.: Multigrid Methods and Applications. Springer, New York (1985)CrossRef

[Ma99]

Mallat, S.: A Wavelet Tour of Signal Processing, 2nd edn. Academic Press, San Diego (1999)MATH

[MRLG05]

Marthi, B., Russell, S.J., Latham, D., Guestrin, C.: Concurrent hierarchical reinforcement learning. In: Proceedings of the 19th International Conference on Artificial Intelligence (IJCAI), pp. 779–785. (2005)

[Os94]

Oswald, P.: Multilevel Finite Element Approximation. B.G. Teubner, Stuttgart (1994)CrossRefMATH

[Pap10]

Paprotny A.: Hierarchical methods for the solution of dynamic programming equations arising from optimal control problems related to recommendation. Diploma Thesis, TU Hamburg-Harburg (2010)

[Pap11]

Paprotny, A.: Multilevel Methods for Dynamic Programming: Deterministic and Stochastic Iterative Methods with Application to Recommendation Engines. AVM – Akademische Verlagsgemeinschaft, München (2011)

[PR98]

Parr, R., Russel, S.J.: Reinforcement learning with hierarchies of machines. Adv. Neural Inf. Process. Syst., 1088–1095 (1998)

[SPS99]

Sutton, R., Precup, D., Singh, S.: Between MDPs and semi-MDPs: a framework for temporal abstraction in reinforcement learning. Artif. Intell. 112(1), 181–211 (1999)CrossRefMATHMathSciNet

[The12]

Thess, M.: Multilevel preconditioners for temporal-difference learning methods related to recommendation engines. In: Apel, T., Steinbach, O. (eds.) Advanced Finite Element Methods and Applications. Springer, Berlin (2012)

[TOS01]

Stüben, K.: An introduction to algebraic multigrid. In: Trottenberg, U., Oosterlee, C., Schüller, A. (eds.) Multigrid, pp. 413–532. Academic Press, San Diego (2001)

[Ziv04]

Ziv, O.: Algebraic multigrid for reinforcement learning. Master’s Thesis, Technion (2004)

[Zu00]

Zumbusch, G.: A sparse grid PDE solver. In: Langtangen, H.P., Bruaset A.M., Quak E. (eds.) Advances in Software Tools for Scientific Computing, Proceedings SciTools '98. Lecture Notes in Computational Science and Engineering, vol. 10, chapter 4, pp. 133–178. Springer (2000)

Title: Up the Down Staircase: Hierarchical Reinforcement Learning
Authors: Alexander Paprotny
Michael Thess
Publisher: Springer International Publishing
Book: Realtime Data Mining
Print ISBN: 978-3-319-01320-6

Electronic ISBN: 978-3-319-01321-3

Copyright Year: 2013
DOI: https://doi.org/10.1007/978-3-319-01321-3_6

Springer Professional

Abstract

Please log in to get access to your license.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Springer Professional "Technik"

Springer Professional "Wirtschaft"

Premium Partner