2018 | OriginalPaper | Chapter

Generalizing Over Uncertain Dynamics for Online Trajectory Generation

Authors: Beomjoon Kim, Albert Kim, Hongkai Dai, Leslie Kaelbling, Tomas Lozano-Perez

Published in: Robotics Research

Publisher: Springer International Publishing

Abstract

We present an algorithm that learns an online trajectory generator that can generalize over varying and uncertain dynamics. When the dynamics is certain, the algorithm generalizes across model parameters. When the dynamics is partially observable, the algorithm generalizes across different observations. To do this, we employ recent advances in supervised imitation learning to learn a trajectory generator from a set of example trajectories computed by a trajectory optimizer. In experiments in two simulated domains, it finds solutions that are nearly as good as, and sometimes better than, those obtained by calling the trajectory optimizer online. The online execution time is dramatically decreased, and the offline training time is reasonable.
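
The idea of learning a generator from optimizer-produced examples can be illustrated with a minimal sketch. It is not the chapter's algorithm: it assumes a dataset-aggregation style of imitation learning, a made-up one-dimensional system with a single model parameter `m`, and a closed-form `expert_control` standing in for the trajectory optimizer. All function names, the dynamics, and the one-feature linear learner are hypothetical choices made for illustration, and the expert is deliberately inside the learner's model class so the loop converges at once.

```python
import random

def expert_control(x, m, gain=1.0):
    # Hypothetical stand-in for the offline trajectory optimizer (assumption).
    return -gain * m * x

def step(x, u, m, dt=0.1):
    # Toy dynamics with model parameter m: x' = x + dt * u / m (assumption).
    return x + dt * u / m

def fit(dataset):
    # One-feature least squares: u is approximated by w * (m * x).
    num = sum((m * x) * u for x, m, u in dataset)
    den = sum((m * x) ** 2 for x, m, u in dataset)
    w = num / den if den > 0.0 else 0.0
    return lambda x, m: w * m * x

def rollout(policy, x0, m, horizon=20):
    # Return the states visited when `policy` controls the system.
    xs = [x0]
    for _ in range(horizon):
        xs.append(step(xs[-1], policy(xs[-1], m), m))
    return xs

def train(iterations=3, episodes=10, seed=0):
    rng = random.Random(seed)
    dataset = []
    policy = expert_control  # the first iteration rolls out the expert itself
    for _ in range(iterations):
        for _ in range(episodes):
            # Sample a model parameter and a start state for this episode.
            m = rng.uniform(0.5, 2.0)
            x0 = rng.uniform(-1.0, 1.0)
            # Label every state the current policy visits with the expert's
            # control and add it to the aggregated dataset.
            for x in rollout(policy, x0, m):
                dataset.append((x, m, expert_control(x, m)))
        # Refit the learner on everything collected so far.
        policy = fit(dataset)
    return policy

learned = train()
# Query the learned generator online for a parameter value not sampled above.
final_x = rollout(learned, x0=1.0, m=1.3)[-1]
```

At run time only `learned` is evaluated, which is the source of the online speed-up the abstract describes; the expensive expert calls happen during training. In a realistic setting the expert would be a numerical optimizer and the learner a richer regressor, so the fit would not be exact as it is here.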

Metadata
Title
Generalizing Over Uncertain Dynamics for Online Trajectory Generation
Authors
Beomjoon Kim
Albert Kim
Hongkai Dai
Leslie Kaelbling
Tomas Lozano-Perez
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-319-60916-4_3