Autonomous Robots

05 January 2021 | Original Research

A hierarchical representation of behaviour supporting open ended development and progressive learning for artificial agents

Authors: François Suro, Jacques Ferber, Tiberiu Stratulat, Fabien Michel

Published in: Autonomous Robots | Issue 2/2021

Abstract

One of the challenging aspects of open-ended or lifelong agent development is that the final behaviour for which an agent is trained at a given moment can become an element in the future creation of one, or even several, behaviours of greater complexity, whose purpose cannot be anticipated. In this paper, we present modular influence network design (MIND), an artificial agent control architecture suited to open-ended and cumulative learning. The MIND architecture encapsulates sub-behaviours into modules and combines them into a hierarchy reflecting the modular and hierarchical nature of complex tasks. Compared to similar research, the main original aspect of MIND is the multi-layered hierarchy using a generic control signal, the influence, to obtain an efficient global behaviour. This article shows the ability of MIND to learn a curriculum of independent didactic tasks of increasing complexity covering different aspects of a desired behaviour. In so doing, we demonstrate the contributions of MIND to open-ended development: encapsulation into modules allows for the preservation and re-usability of all the skills acquired during the curriculum and for their focused retraining, and the modular structure serves the evolving topology by easing the coordination of new sensors, actuators and heterogeneous learning structures.
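To make the idea of a module hierarchy driven by an influence signal more concrete, here is a minimal sketch. It is not the authors' implementation: the abstract does not say how influences are combined, so the influence-weighted average used below is an assumption, and every name (`BaseSkill`, `ComplexSkill`, `run`, the sensor and actuator keys) is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Sensors = Dict[str, float]
Commands = Dict[str, float]
Votes = Dict[str, List[Tuple[float, float]]]  # actuator -> (influence, value) pairs


@dataclass
class BaseSkill:
    """Leaf module (hypothetical): maps sensor readings to actuator commands."""
    name: str
    policy: Callable[[Sensors], Commands]

    def act(self, sensors: Sensors, influence: float, votes: Votes) -> None:
        # Each command is recorded together with the influence this skill received.
        for actuator, value in self.policy(sensors).items():
            votes.setdefault(actuator, []).append((influence, value))


@dataclass
class ComplexSkill:
    """Inner module (hypothetical): outputs one influence per child module."""
    name: str
    children: list
    policy: Callable[[Sensors], List[float]]

    def act(self, sensors: Sensors, influence: float, votes: Votes) -> None:
        # Influence is passed down the hierarchy, scaled by what this module received.
        for child, weight in zip(self.children, self.policy(sensors)):
            child.act(sensors, influence * weight, votes)


def run(root, sensors: Sensors) -> Commands:
    """Evaluate the hierarchy once; combine commands by influence-weighted average
    (an assumed combination rule, chosen only for illustration)."""
    votes: Votes = {}
    root.act(sensors, 1.0, votes)
    commands: Commands = {}
    for actuator, pairs in votes.items():
        total = sum(weight for weight, _ in pairs)
        weighted = sum(weight * value for weight, value in pairs)
        commands[actuator] = weighted / total if total > 0 else 0.0
    return commands


# Three leaf skills driving two wheel motors.
forward = BaseSkill("forward", lambda s: {"left": 1.0, "right": 1.0})
turn_left = BaseSkill("turn_left", lambda s: {"left": -1.0, "right": 1.0})
stop = BaseSkill("stop", lambda s: {"left": 0.0, "right": 0.0})

# Second layer: blends driving forward and turning according to obstacle proximity.
avoid = ComplexSkill("avoid", [forward, turn_left],
                     lambda s: [1.0 - s["obstacle"], s["obstacle"]])

# Third layer: reuses "avoid" unchanged and adds stopping at the goal.
navigate = ComplexSkill("navigate", [avoid, stop],
                        lambda s: [1.0 - s["at_goal"], s["at_goal"]])

if __name__ == "__main__":
    print(run(navigate, {"obstacle": 0.25, "at_goal": 0.0}))
```

In this sketch the `navigate` module is added on top of `avoid` without modifying it, which loosely mirrors the reuse of previously acquired skills that the abstract describes. In a learning setting, each `policy` would be a trained structure rather than a hand-written lambda.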

Objects can be naturally split into parts and sub-parts, complex features and simple features (Kruger et al. 2013).

JBox2d: www.jbox2d.org.

Videos of the results are available at the following addresses:

Raspberry Pi 3: www.raspberrypi.org/products/raspberry-pi-3-model-b-plus/.

Grove Pi: www.dexterindustries.com/grovepi/.

OpenCV computer vision library: https://opencv.org/.

Videos of the results are available at the following address: www.lirmm.fr/~suro/videos/clawDemo.mp4; https://hal.archives-ouvertes.fr/hal-02594407.

Arkin, R. C., & Balch, T. (1997). Aura: Principles and practice in review. Journal of Experimental & Theoretical Artificial Intelligence, 9(2–3), 175–189.

Barto, A. G., Singh, S., & Chentanez, N. (2004). Intrinsically motivated learning of hierarchical collections of skills. In Proceedings of the 3rd International Conference on Development and Learning (pp. 112–119).

Bengio, Y., Louradour, J., Collobert, R., & Weston, J. (2009). Curriculum learning. In Proceedings of the 26th Annual International Conference on Machine Learning (pp. 41–48). ACM.

Blaes, S., Pogančić, M. V., Zhu, J., & Martius, G. (2019). Control what you can: Intrinsically motivated task-planning agent. In Advances in Neural Information Processing Systems (pp. 12520–12531).

Braitenberg, V. (1986). Vehicles: Experiments in synthetic psychology. Cambridge: MIT Press.

Brooks, R. A. (1986). A robust layered control system for a mobile robot. IEEE Journal of Robotics and Automation, 2(1), 14–23.

Connor, J. T., Martin, R. D., & Atlas, L. E. (1994). Recurrent neural networks and robust time series prediction. IEEE Transactions on Neural Networks, 5(2), 240–254.

De Jong, K. A. (1992). Are genetic algorithms function optimizers? PPSN, 2, 3–14.

Devin, C., Gupta, A., Darrell, T., Abbeel, P., & Levine, S. (2017). Learning modular neural network policies for multi-task and multi-robot transfer. In 2017 IEEE international conference on robotics and automation (ICRA) (pp. 2169–2176). IEEE.

Dorigo, M., & Colombetti, M. (1994). Robot shaping: Developing autonomous agents through learning. Artificial Intelligence, 71(2), 321–370.

Dorigo, M., & Colombetti, M. (1998). Robot shaping: An experiment in behavior engineering. Cambridge: MIT Press.

Felleman, D. J., & Van Essen, D. C. (1991). Distributed hierarchical processing in the primate cerebral cortex. Cerebral Cortex, 1(1), 1–47.

Foglino, F., Christakou, C. C., & Leonetti, M. (2019). An optimization framework for task sequencing in curriculum learning. In Joint IEEE 9th International Conference ICDL-EpiRob (pp. 207–214). IEEE.

Forestier, S., Mollard, Y., & Oudeyer, P.-Y. (2017). Intrinsically motivated goal exploration processes with automatic curriculum learning. arXiv preprint arXiv:1708.02190.

Gen, M., & Lin, L. (2007). Genetic algorithms. In Wiley Encyclopedia of Computer Science and Engineering (pp. 1–15).

Gülçehre, Ç., Moczulski, M., Visin, F., & Bengio, Y. (2016). Mollifying networks. CoRR, abs/1608.04980.

Heess, N., Wayne, G., Tassa, Y., Lillicrap, T. P., Riedmiller, M. A., & Silver, D. (2016). Learning and transfer of modulated locomotor controllers. CoRR, abs/1610.05182.

Hester, T., & Stone, P. (2017). Intrinsically motivated model learning for developing curious robots. Artificial Intelligence, 247, 170–186.

Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems (pp. 1097–1105).

Kruger, N., Janssen, P., Kalkan, S., Lappe, M., Leonardis, A., Piater, J., et al. (2013). Deep hierarchies in the primate visual cortex: What can we learn for computer vision? IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(8), 1847–1871.

Larsen, T., & Hansen, S. T. (2005). Evolving composite robot behaviour: A modular architecture. In Proceedings of the Fifth International Workshop on Robot Motion and Control, 2005. RoMoCo’05 (pp. 271–276). IEEE.

Lessin, D., Fussell, D., & Miikkulainen, R. (2013). Open-ended behavioral complexity for evolved virtual creatures. In Proceedings of the 15th annual conference on Genetic and evolutionary computation (pp. 335–342).

Lessin, D., Fussell, D., Miikkulainen, R., & Risi, S. (2015). Increasing behavioral complexity for evolved virtual creatures with the esp method. arXiv preprint arXiv:1510.07957.

Lopes, M., & Oudeyer, P.-Y. (2012). The strategic student approach for life-long exploration and learning. In 2012 IEEE international conference on development and learning and epigenetic robotics (ICDL) (pp. 1–8). IEEE.

Lukoševičius, M., & Jaeger, H. (2009). Reservoir computing approaches to recurrent neural network training. Computer Science Review, 3(3), 127–149.

Lungarella, M., Metta, G., Pfeifer, R., & Sandini, G. (2003). Developmental robotics: A survey. Connection Science, 15(4), 151–190.

Narvekar, S., Sinapov, J., Leonetti, M., & Stone, P. (2016). Source task creation for curriculum learning. In Proceedings of the 2016 international conference on autonomous agents & multiagent systems (pp. 566–574). International Foundation for Autonomous Agents and Multiagent Systems.

Niël, R., & Wiering, M. A. (2018). Hierarchical reinforcement learning for playing a dynamic dungeon crawler game. In 2018 IEEE Symposium Series on Computational Intelligence (SSCI) (pp. 1159–1166). IEEE.

Oudeyer, P.-Y. (2012). Developmental robotics. In Encyclopedia of the sciences of learning (pp. 969–972). Springer.

Oudeyer, P.-Y., & Kaplan, F. (2007). What is intrinsic motivation? A typology of computational approaches. Frontiers in Neurorobotics, 1, 6.

Piaget, J. (1954). The construction of reality in the child. New York: Basic Books.

Piaget, J., & Duckworth, E. (1970). Genetic epistemology. American Behavioral Scientist, 13(3), 459–480.

Reynolds, C. W. (1987). Flocks, herds and schools: A distributed behavioral model. In ACM SIGGRAPH computer graphics (Vol. 21, pp. 25–34). ACM.

Rudolph, G. (1994). Convergence analysis of canonical genetic algorithms. IEEE Transactions on Neural Networks, 5(1), 96–101.

Russell, S., & Norvig, P. (2009). Artificial Intelligence: A Modern Approach (3rd ed.). Upper Saddle River: Prentice Hall Press.

Santucci, V. G., Baldassarre, G., & Cartoni, E. (2019). Autonomous reinforcement learning of multiple interrelated tasks. In 2019 Joint IEEE 9th international conference on development and learning and epigenetic robotics (ICDL-EpiRob) (pp. 221–227). IEEE.

Santucci, V. G., Baldassarre, G., & Mirolli, M. (2016). Grail: A goal-discovering robotic architecture for intrinsically-motivated learning. IEEE Transactions on Cognitive and Developmental Systems, 8(3), 214–231.

Schrum, J., & Miikkulainen, R. (2015). Discovering multimodal behavior in MS PAC-man through evolution of modular neural networks. IEEE Transactions on Computational Intelligence and AI in Games, 8(1), 67–81.

Simonin, O., & Ferber, J. (2000). Modeling self satisfaction and altruism to handle action selection and reactive cooperation. In Proceedings of the 6th international conference on the simulation of adaptive behavior (Vol. 2, pp. 314–323).

Stanley, K. O., & Miikkulainen, R. (2002). Evolving neural networks through augmenting topologies. Evolutionary Computation, 10(2), 99–127.

Stone, P., & Veloso, M. (2000). Layered learning. In European conference on machine learning (pp. 369–381). Springer.

Whiteson, S., Kohl, N., Miikkulainen, R., & Stone, P. (2003). Evolving keepaway soccer players through task decomposition. In Genetic and Evolutionary Computation Conference (pp. 356–368). Springer.

Title: A hierarchical representation of behaviour supporting open ended development and progressive learning for artificial agents
Authors: François Suro
Jacques Ferber
Tiberiu Stratulat
Fabien Michel
Publication date: 05 January 2021
Publisher: Springer US
Published in: Autonomous Robots / Issue 2/2021
Print ISSN: 0929-5593
Electronic ISSN: 1573-7527
DOI: https://doi.org/10.1007/s10514-020-09960-7
