Published in: International Journal of Machine Learning and Cybernetics 10/2020

18.05.2020 | Original Article

Evaluating skills in hierarchical reinforcement learning

Authors: Marzieh Davoodabadi Farahani, Nasser Mozayani


Abstract

Previous work on automatically acquiring skills for hierarchical reinforcement learning cites benefits such as mitigating the curse of dimensionality, improving exploration, and speeding up value propagation, but pays little attention to evaluating the effect of each individual skill on these factors. In this paper, we show that, depending on the given task, a skill may or may not be useful for learning it. Moreover, related work on automatic skill acquisition focuses on detecting subgoals, i.e., the termination condition of a skill, while no precise method exists for extracting the initiation set of a skill. We therefore propose two methods for evaluating skills and two further methods for pruning their initiation sets. Experimental results show significant improvements in learning across different test domains after skills are evaluated and pruned.
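
For readers unfamiliar with the underlying formalism: the skills discussed here are options in the sense of Sutton, Precup, and Singh (1999), each defined by an initiation set, an internal policy, and a termination condition (the subgoal). The sketch below only illustrates that structure together with a hypothetical initiation-set pruning step; the Option class, the usefulness signal, and the threshold rule are assumptions for exposition, not the two evaluation methods or two pruning methods proposed in the paper.

```python
# Minimal sketch of a skill ("option") in the sense of Sutton, Precup
# and Singh (1999): an initiation set I, an internal policy pi, and a
# termination condition beta. The pruning helper below is an
# illustrative assumption of what trimming an initiation set could
# look like; it is NOT the paper's actual method.

from dataclasses import dataclass
from typing import Callable, Hashable, Set

State = Hashable
Action = int

@dataclass
class Option:
    name: str
    initiation_set: Set[State]             # states where the skill may be invoked
    policy: Callable[[State], Action]      # maps a state to a primitive action
    termination: Callable[[State], float]  # beta(s): probability of terminating in s

    def can_start(self, s: State) -> bool:
        return s in self.initiation_set


def prune_initiation_set(option: Option,
                         usefulness: Callable[[State], float],
                         threshold: float = 0.0) -> Option:
    """Keep only initiation states whose estimated usefulness clears a
    threshold. A plausible (here hypothetical) usefulness signal is the
    advantage of invoking the skill in s over acting primitively,
    e.g. Q(s, option) - max_a Q(s, a)."""
    kept = {s for s in option.initiation_set if usefulness(s) > threshold}
    return Option(option.name, kept, option.policy, option.termination)


# Example: a toy corridor skill that walks right toward a doorway at
# state 5 and may be started anywhere in states 0..4.
doorway = Option(
    name="reach-doorway",
    initiation_set=set(range(5)),
    policy=lambda s: +1,                   # always step right
    termination=lambda s: 1.0 if s == 5 else 0.0,
)

# Prune with a made-up usefulness estimate that penalizes states far
# from the doorway.
pruned = prune_initiation_set(doorway, usefulness=lambda s: s - 2)
print(sorted(pruned.initiation_set))       # -> [3, 4]
```

In this toy run, the pruning step shrinks the initiation set from {0, ..., 4} to {3, 4}, mirroring the abstract's point that a skill need not be worth invoking from every state in which it can technically start.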

Metadata
Title
Evaluating skills in hierarchical reinforcement learning
Authors
Marzieh Davoodabadi Farahani
Nasser Mozayani
Publication date
18.05.2020
Publisher
Springer Berlin Heidelberg
Published in
International Journal of Machine Learning and Cybernetics / Issue 10/2020
Print ISSN: 1868-8071
Electronic ISSN: 1868-808X
DOI
https://doi.org/10.1007/s13042-020-01141-3
