Socially guided intrinsic motivation for robot learning of motor skills

Authors: Sao Mai Nguyen, Pierre-Yves Oudeyer

Published in: Autonomous Robots | Issue 3/2014

Published: 1 March 2014

Abstract

This paper presents a technical approach to robot learning of motor skills which combines active intrinsically motivated learning with imitation learning. Our algorithmic architecture, called SGIM-D, allows efficient learning of high-dimensional continuous sensorimotor inverse models in robots, and in particular learns distributions of parameterised motor policies that solve a corresponding distribution of parameterised goals/tasks. This is made possible by the technical integration of imitation learning techniques within an algorithm for learning inverse models that relies on active goal babbling. After reviewing social learning and intrinsic motivation approaches to action learning, we describe the general framework of our algorithm, before detailing its architecture. In an experiment where a robot arm has to learn to use a flexible fishing line, we illustrate that SGIM-D efficiently combines the advantages of social learning and intrinsic motivation and benefits from human demonstration properties to learn how to produce varied outcomes in the environment, while developing more precise control policies in large spaces.
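The interplay the abstract describes, self-directed goal babbling interleaved with imitation of demonstrations, can be pictured with a toy sketch. This is not the authors' SGIM-D implementation. The environment (`forward`, a two-joint planar arm), the nearest-neighbour inverse model, the fixed `demo_period`, the noise levels and the uniformly random goal sampling are all assumptions made for illustration; in particular, uniform sampling stands in for the active goal selection the abstract mentions.

```python
import math
import random


def forward(action):
    """Hypothetical environment: hand position of a two-joint planar arm."""
    a1, a2 = action
    return (math.cos(a1) + math.cos(a1 + a2),
            math.sin(a1) + math.sin(a1 + a2))


def dist(p, q):
    """Euclidean distance between two 2-D outcomes."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


class ToyLearner:
    """Illustrative learner; every name here is an assumption, not from the paper."""

    def __init__(self, rng):
        self.rng = rng
        self.memory = []  # list of (action, outcome) pairs observed so far

    def execute(self, action):
        """Run an action in the toy environment and remember the result."""
        outcome = forward(action)
        self.memory.append((action, outcome))
        return outcome

    def inverse(self, goal):
        """Nearest-neighbour inverse model: stored action whose outcome is closest to goal."""
        best = min(self.memory, key=lambda pair: dist(pair[1], goal))
        return best[0]

    def _perturb(self, action, noise):
        return tuple(a + self.rng.gauss(0.0, noise) for a in action)

    def explore_goal(self, goal, noise=0.2):
        """Goal babbling step: try a noisy variant of the best known action for this goal."""
        return self.execute(self._perturb(self.inverse(goal), noise))

    def imitate(self, demo_action, noise=0.05):
        """Imitation step: reproduce a demonstrated action with small noise."""
        return self.execute(self._perturb(demo_action, noise))


def run(episodes=300, demo_period=30, seed=0):
    """Alternate goal babbling with a demonstration every `demo_period` episodes."""
    rng = random.Random(seed)
    learner = ToyLearner(rng)
    learner.execute((0.0, 0.0))  # one initial trial so the inverse model is defined
    for t in range(1, episodes + 1):
        if t % demo_period == 0:
            # Stand-in teacher: a random action plays the role of a demonstration.
            demo = (rng.uniform(-math.pi, math.pi), rng.uniform(-math.pi, math.pi))
            learner.imitate(demo)
        else:
            # Uniformly random goal in outcome space (simplification, see lead-in).
            goal = (rng.uniform(-2.0, 2.0), rng.uniform(-2.0, 2.0))
            learner.explore_goal(goal)
    return learner


def mean_error(learner, n_goals=50, seed=1):
    """Average distance between reachable test goals and the inverse model's outcomes."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_goals):
        goal = forward((rng.uniform(-math.pi, math.pi), rng.uniform(-math.pi, math.pi)))
        total += dist(forward(learner.inverse(goal)), goal)
    return total / n_goals
```

Calling `run()` and then `mean_error(learner)` gives a rough measure of how well the stored experience covers the outcome space; varying `demo_period` shows how the share of demonstrations changes that coverage in this toy setting only.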

Title: Socially guided intrinsic motivation for robot learning of motor skills
Authors: Sao Mai Nguyen, Pierre-Yves Oudeyer
Publication date: 1 March 2014
Publisher: Springer US
Published in: Autonomous Robots, Issue 3/2014
Print ISSN: 0929-5593
Electronic ISSN: 1573-7527
DOI: https://doi.org/10.1007/s10514-013-9339-y
