2020 | OriginalPaper | Chapter

State Representation Learning from Demonstration

Authors: Astrid Merckling, Alexandre Coninx, Loic Cressot, Stephane Doncieux, Nicolas Perrin

Published in: Machine Learning, Optimization, and Data Science

Publisher: Springer International Publishing


Abstract

Robots could learn their own state and world representation from perception, experience, and observation, without supervision. This goal is the main focus of our field of interest, State Representation Learning (SRL). A compact state representation helps robots make sense of their environment in order to interact with it, and its properties have a strong impact on the adaptive capability of the agent. Our approach uses imitation learning from demonstration to build a representation shared across multiple tasks in the same environment. The imitation learning strategy relies on a multi-head neural network in which a shared state representation feeds a task-specific agent for each task. As expected, generalization demands task diversity during training for better transfer learning effects. Our experimental setup shows favorable comparison with other SRL strategies, and more efficient end-to-end Reinforcement Learning (RL) in our case than with independently learned tasks.
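The multi-head architecture described above can be sketched roughly as follows: a shared encoder maps raw observations to a compact state, and each task owns a separate output head reading from that state. This is only an illustrative sketch in NumPy, not the paper's implementation; the observation size (128), action size (4), task names, and tanh activations are assumptions, while the 24-dimensional state follows footnote 2.

```python
import numpy as np

rng = np.random.default_rng(0)

OBS_DIM = 128    # hypothetical observation size
STATE_DIM = 24   # state dimension chosen empirically in the paper (footnote 2)
ACT_DIM = 4      # hypothetical action size

def linear(in_dim, out_dim):
    """Random weight matrix standing in for a trained linear layer."""
    return rng.standard_normal((in_dim, out_dim)) * 0.01

# Shared encoder: observation -> compact state representation,
# trained jointly across all tasks.
W_enc = linear(OBS_DIM, STATE_DIM)

# One head per task: state -> action (task names are illustrative).
task_heads = {task: linear(STATE_DIM, ACT_DIM)
              for task in ("reach_left", "reach_right")}

def policy(obs, task):
    state = np.tanh(obs @ W_enc)              # shared representation
    return np.tanh(state @ task_heads[task])  # task-specific action

obs = rng.standard_normal(OBS_DIM)
action = policy(obs, "reach_left")
```

The key design point is that gradients from every task's imitation loss would flow through the same encoder `W_enc`, which is what pushes the learned state toward a representation useful across tasks rather than specialized to one.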


Footnotes
1
Roughly, different tasks refer to objectives of different natures, while different instances of a task refer to a difference of parameters within the task. For example, reaching various locations with a robotic arm is considered a set of different instances of the same reaching task.
 
2
The dimension of 24 was selected empirically: it is not very large, yet it leads to good RL results.
 
Metadata
Copyright Year
2020
DOI
https://doi.org/10.1007/978-3-030-64580-9_26
