2016 | Original Paper | Book Chapter

Beyond Geometric Path Planning: Learning Context-Driven Trajectory Preferences via Sub-optimal Feedback

Written by: Ashesh Jain, Shikhar Sharma, Ashutosh Saxena

Published in: Robotics Research

Publisher: Springer International Publishing

Abstract

We consider the problem of learning preferences over trajectories for mobile manipulators such as personal robots and assembly line robots. The preferences we learn are more intricate than those arising from simple geometric constraints on the robot's trajectory, such as the distance of the robot from a human. Instead, our preferences are governed by the surrounding context of various objects and human interactions in the environment. Such preferences make the problem challenging because the criterion for a good trajectory now varies with the task, with the environment, and across users. Furthermore, demonstrating optimal trajectories (e.g., learning from an expert's demonstrations) is often challenging and non-intuitive on high degree-of-freedom manipulators. In this work, we propose an approach that requires a non-expert user only to incrementally improve the trajectory currently proposed by the robot. We implement our algorithm on two high degree-of-freedom robots, PR2 and Baxter, and present three intuitive mechanisms for providing such incremental feedback. In our experimental evaluation we consider two context-rich settings, household chores and grocery store checkout, and show that users are able to train the robot with just a few feedback interactions (taking only a few minutes). Despite receiving sub-optimal feedback from non-expert users, our algorithm enjoys theoretical bounds on regret that match the asymptotic rates of optimal trajectory algorithms.
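The incremental-feedback idea in the abstract can be illustrated with a perceptron-style update in the spirit of coactive learning [41]. The following is a minimal sketch, not the authors' implementation: the feature map `features`, the two toy trajectories, and the step size `lr` are all hypothetical, chosen only to show how a single user improvement shifts a linear scoring function.

```python
import numpy as np

def features(trajectory):
    """Hypothetical feature map: [path length, mean waypoint height]."""
    traj = np.asarray(trajectory, dtype=float)
    # Sum of Euclidean distances between consecutive waypoints.
    length = np.linalg.norm(np.diff(traj, axis=0), axis=1).sum()
    # Last coordinate of each waypoint is treated as height (an assumption).
    mean_height = traj[:, -1].mean()
    return np.array([length, mean_height])

def score(w, trajectory):
    """Linear score of a trajectory under weights w (higher is preferred)."""
    return float(w @ features(trajectory))

def propose(w, candidates):
    """Return the candidate trajectory with the highest current score."""
    return max(candidates, key=lambda t: score(w, t))

def coactive_update(w, proposed, improved, lr=1.0):
    """Move w toward the features of the user's improved trajectory."""
    return w + lr * (features(improved) - features(proposed))

# Two toy trajectories of (x, y, z) waypoints: one carried high, one low.
high = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
low = [(0.0, 0.0, 0.2), (1.0, 0.0, 0.2)]

w = np.zeros(2)
first = propose(w, [high, low])       # tie under zero weights, first candidate wins
w = coactive_update(w, first, low)    # user indicates the low trajectory is better
second = propose(w, [high, low])      # the low trajectory now scores higher
```

The user never supplies an optimal trajectory here, only one that is better than the current proposal, which is the kind of feedback the abstract describes.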

A kitchen knife originating in Japan.

When RRT becomes too slow, we switch to a more efficient bidirectional-RRT.

The cost function (or its approximation) we learn can be fed to trajectory optimizers like CHOMP [39] or optimal planners like RRT* [23] to produce reasonably good trajectories.

Consider the following analogy. In search engine results, it is much harder for the user to provide the best web-pages for each query, but it is easier to provide relative ranking on the search results by clicking.

Similar results were obtained with the nDCG@1 metric; they are not included here due to space constraints.
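For readers unfamiliar with the metric, nDCG@k (see [30]) compares the discounted gain of a ranked list against the best possible ordering of the same items. The sketch below assumes a linear gain and a log2 position discount, which is one common formulation; the relevance scores in the example are hypothetical and not taken from the chapter's experiments.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain of the first k items (linear gain, log2 discount)."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k):
    """DCG@k normalized by the DCG@k of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

# Hypothetical user scores for three ranked trajectories, in ranked order.
ranked_scores = [3, 5, 1]
top1 = ndcg_at_k(ranked_scores, 1)  # 3 / 5, since the best item scores 5
```

With k = 1 the metric reduces to the score of the top-ranked item divided by the best available score, so it measures only whether the first suggestion is a good one.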

The smaller number of users on PR2 is because it requires users with experience in Rviz-ROS. Further, we observed that users found it harder to correct trajectory waypoints in a simulator than to provide zero-G feedback on the robot. For the same reason we report training time only on Baxter for the grocery store setting.

1. Abbeel, P., Coates, A., Ng, A.Y.: Autonomous helicopter aerobatics through apprenticeship learning. IJRR 29(13) (2010)

2. Akgun, B., Cakmak, M., Jiang, K., Thomaz, A.L.: Keyframe-based learning from demonstration. IJSR 4(4), 343–355 (2012)

3. Alterovitz, R., Siméon, T., Goldberg, K.: The stochastic motion roadmap: A sampling framework for planning with Markov motion uncertainty. In: RSS (2007)

4. Argall, B.D., Chernova, S., Veloso, M., Browning, B.: A survey of robot learning from demonstration. Robot. Autonom. Syst. 57(5), 469–483 (2009)

5. Berenson, D., Abbeel, P., Goldberg, K.: A robot path planning framework that learns from experience. In: ICRA (2012)

6. Berg, J.V.D., Abbeel, P., Goldberg, K.: LQG-MP: Optimized path planning for robots with motion uncertainty and imperfect state information. In: RSS (2010)

7. Bhattacharya, S., Likhachev, M., Kumar, V.: Identification and representation of homotopy classes of trajectories for search-based path planning in 3D. In: RSS (2011)

8. Bischoff, R., Kazi, A., Seyfarth, M.: The morpha style guide for icon-based programming. In: Proceedings of the 11th IEEE International Workshop on RHIC (2002)

9. Calinon, S., Guenter, F., Billard, A.: On learning, representing, and generalizing a task in a humanoid robot. In: IEEE Transactions on Systems Man and Cybernetics (2007)

10. Cohen, B.J., Chitta, S., Likhachev, M.: Search-based planning for manipulation with motion primitives. In: ICRA (2010)

11. Dey, D., Liu, T.Y., Hebert, M., Bagnell, J.A.: Contextual sequence prediction with application to control library optimization. In: RSS (2012)

12. Diankov, R.: Automated Construction of Robotic Manipulation Programs. Ph.D. thesis, CMU, RI (2010)

13. Dragan, A., Srinivasa, S.: Generating legible motion. In: RSS (2013)

14. Dragan, A., Lee, K., Srinivasa, S.: Legibility and predictability of robot motion. In: HRI (2013)

15. Erickson, L.H., LaValle, S.M.: Survivability: Measuring and ensuring path diversity. In: ICRA (2009)

16. Gossow, D., Leeper, A., Hershberger, D., Ciocarlie, M.: Interactive markers: 3-D user interfaces for ROS applications [ROS topics]. IEEE Robot. Autom. Mag. 18(4), 14–15 (2011)

17. Green, C.J., Kelly, A.: Toward optimal sampling in the space of paths. In: ISRR (2007)

18. Hovland, G.E., Sikka, P., McCarragher, B.J.: Skill acquisition from human demonstration using a hidden Markov model. In: ICRA (1996)

19. Jain, A., Wojcik, B., Joachims, T., Saxena, A.: Learning trajectory preferences for manipulators via iterative improvement. In: NIPS (2013)

20. Jiang, Y., Lim, M., Zheng, C., Saxena, A.: Learning to place new objects in a scene. IJRR 31(9) (2012)

21. Joachims, T.: Training linear SVMs in linear time. In: KDD (2006)

22. Joachims, T., Finley, T., Yu, C.: Cutting-plane training of structural SVMs. Mach. Learn. 77(1) (2009)

23. Karaman, S., Frazzoli, E.: Incremental sampling-based algorithms for optimal motion planning. In: RSS (2010)

24. Klingbeil, E., Rao, D., Carpenter, B., Ganapathi, V., Ng, A.Y., Khatib, O.: Grasping with application to an autonomous checkout robot. In: ICRA (2011)

25. Kober, J., Peters, J.: Policy search for motor primitives in robotics. Mach. Learn. 84(1) (2011)

26. Koppula, H.S., Saxena, A.: Anticipating human activities using object affordances for reactive robotic response. In: RSS (2013)

27. LaValle, S.M., Kuffner, J.J.: Randomized kinodynamic planning. IJRR 20(5), 378–400 (2001)

28. Lenz, I., Lee, H., Saxena, A.: Deep learning for detecting robotic grasps. In: RSS (2013)

29. Levine, S., Koltun, V.: Continuous inverse optimal control with locally optimal examples. In: ICML (2012)

30. Manning, C.D., Raghavan, P., Schütze, H.: Introduction to Information Retrieval, vol. 1. Cambridge University Press, Cambridge (2008)

31. Nicolescu, M.N., Mataric, M.J.: Natural methods for robot task learning: Instructive demonstrations, generalization and practice. In: Proceedings of the Second International Joint Conference on Autonomous Agents and Multiagent Systems (2003)

32. Nikolaidis, S., Shah, J.: Human-robot teaming using shared mental models. In: HRI, Workshop on Human-Agent-Robot Teamwork (2012)

33. Nikolaidis, S., Shah, J.: Human-robot cross-training: Computational formulation, modeling and evaluation of a human team training strategy. In: IEEE/ACM ICHRI (2013)

34. Phillips, M., Cohen, B., Chitta, S., Likhachev, M.: E-graphs: Bootstrapping planning with experience graphs. In: RSS (2012)

35. Raman, K., Joachims, T.: Learning socially optimal information systems from egoistic users. In: Proceedings of the ECML (2013)

36. Ratliff, N.: Learning to search: structured prediction techniques for imitation learning. Ph.D. thesis, CMU, RI (2009)

37. Ratliff, N., Bagnell, J.A., Zinkevich, M.: Maximum margin planning. In: ICML (2006)

38. Ratliff, N., Silver, D., Bagnell, J.A.: Learning to search: Functional gradient techniques for imitation learning. Autonom. Robot. 27(1), 25–53 (2009a)

39. Ratliff, N., Zucker, M., Bagnell, J.A., Srinivasa, S.: CHOMP: Gradient optimization techniques for efficient motion planning. In: ICRA (2009b)

40. Saxena, A., Driemeyer, J., Ng, A.Y.: Robotic grasping of novel objects using vision. IJRR 27(2) (2008)

41. Shivaswamy, P., Joachims, T.: Online structured prediction via coactive learning. In: ICML (2012)

42. Shneiderman, B., Plaisant, C.: Designing The User Interface: Strategies for Effective Human-Computer Interaction. Addison-Wesley Publication (2010)

43. Stopp, A., Horstmann, S., Kristensen, S., Lohnert, F.: Towards interactive learning for manufacturing assistants. In: Proceedings of the 10th IEEE International Workshop on RHIC (2001)

44. Sucan, I.A., Moll, M., Kavraki, L.E.: The Open Motion Planning Library. IEEE Robot. Autom. Mag. 19(4), 72–82 (2012). http://ompl.kavrakilab.org

45. Tamane, K., Revfi, M., Asfour, T.: Synthesizing object receiving motions of humanoid robots with human motion database. In: ICRA (2013)

46. Vernaza, P., Bagnell, J.A.: Efficient high dimensional maximum entropy modeling via symmetric partition functions. In: NIPS (2012)

47. Wilson, A., Fern, A., Tadepalli, P.: A Bayesian approach for policy learning from trajectory preference queries. In: NIPS (2012)

48. Ziebart, B.D., Maas, A., Bagnell, J.A., Dey, A.K.: Maximum entropy inverse reinforcement learning. In: AAAI (2008)

Title: Beyond Geometric Path Planning: Learning Context-Driven Trajectory Preferences via Sub-optimal Feedback
Written by: Ashesh Jain, Shikhar Sharma, Ashutosh Saxena
Publisher: Springer International Publishing
Book: Robotics Research
Print ISBN: 978-3-319-28870-3
Electronic ISBN: 978-3-319-28872-7
Copyright year: 2016
DOI: https://doi.org/10.1007/978-3-319-28872-7_19
