Autonomous Robots

Published: 4 June 2022

Hierarchical planning with state abstractions for temporal task specifications

Authors: Yoonseon Oh, Roma Patel, Thao Nguyen, Baichuan Huang, Matthew Berg, Ellie Pavlick, Stefanie Tellex

Published in: Autonomous Robots | Issue 6/2022

Abstract

We often specify tasks for a robot using temporal language that can include different levels of abstraction. For example, the command “go to the kitchen before going to the second floor” contains spatial abstraction, given that “floor” consists of individual rooms that can also be referred to in isolation (“kitchen”, for example). There is also a temporal ordering of events, defined by the word “before”. Previous works have used syntactically co-safe Linear Temporal Logic (sc-LTL) to interpret temporal language (such as “before”), and Abstract Markov Decision Processes (AMDPs) to interpret hierarchical abstractions (such as “kitchen” and “second floor”), but have treated the two separately. To handle both types of commands at once, we introduce the Abstract Product Markov Decision Process (AP-MDP), a novel approach capable of representing non-Markovian reward functions at different levels of abstraction. The AP-MDP framework translates an LTL formula into its corresponding automaton, creates a product Markov Decision Process (MDP) of the LTL specification and the environment MDP, and decomposes the problem into subproblems to enable efficient planning with abstractions. AP-MDP performs faster than a non-hierarchical method of solving LTL problems in over \(95\%\) of path planning tasks, and this proportion grows as the size of the environment domain increases. In a cleanup world domain, AP-MDP performs faster in over \(98\%\) of tasks. We also present a neural sequence-to-sequence model trained to translate language commands into LTL expressions, and a new corpus of non-Markovian language commands spanning different levels of abstraction. We test our framework with the collected language commands on two drones, demonstrating that our approach enables robots to efficiently solve temporal commands at different levels of abstraction in both indoor and outdoor environments.
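To make the product construction mentioned in the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical four-room map with deterministic moves (a special case of an MDP), a hand-written automaton for one reading of “kitchen before second floor” (reaching the second floor before the kitchen counts as failure), and plain breadth-first search in place of an MDP solver. The hierarchical decomposition that AP-MDP adds on top of the product is not shown. All names (`ADJACENT`, `LABELS`, `dfa_step`, `plan`) are illustrative.

```python
from collections import deque

# Hypothetical toy map: room -> neighbouring rooms (deterministic moves).
ADJACENT = {
    "hall": ["kitchen", "stairs"],
    "kitchen": ["hall"],
    "stairs": ["hall", "lab"],
    "lab": ["stairs"],
}

# Atomic propositions that hold in each room (assumed labelling).
LABELS = {
    "hall": set(),
    "kitchen": {"kitchen"},
    "stairs": set(),
    "lab": {"second_floor"},
}


def dfa_step(q, props):
    """Hand-written automaton for 'kitchen before second floor'."""
    if q == "q0":  # kitchen not visited yet
        if "second_floor" in props:
            return "fail"
        return "q1" if "kitchen" in props else "q0"
    if q == "q1":  # kitchen visited, second floor still pending
        return "acc" if "second_floor" in props else "q1"
    return q  # 'acc' and 'fail' are absorbing


def plan(start):
    """Breadth-first search over product states (room, automaton state)."""
    init = (start, dfa_step("q0", LABELS[start]))
    parent = {init: None}
    queue = deque([init])
    while queue:
        node = queue.popleft()
        room, q = node
        if q == "acc":
            # Walk the parent links back to the start to recover the rooms.
            path = []
            while node is not None:
                path.append(node[0])
                node = parent[node]
            return path[::-1]
        if q == "fail":
            continue  # the task can no longer be satisfied from here
        for nxt in ADJACENT[room]:
            succ = (nxt, dfa_step(q, LABELS[nxt]))
            if succ not in parent:
                parent[succ] = node
                queue.append(succ)
    return None


print(plan("hall"))  # ['hall', 'kitchen', 'hall', 'stairs', 'lab']
```

Because the automaton state is part of the search state, the planner can pass through the hall twice: once before and once after the kitchen has been visited. This is how a product turns a history-dependent (non-Markovian) task into a Markovian one.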

The corpus can be found at https://github.com/h2r/ltl-amdp

Skydio R1: https://robots.ieee.org/robots/skydior1/

Title: Hierarchical planning with state abstractions for temporal task specifications
Authors: Yoonseon Oh, Roma Patel, Thao Nguyen, Baichuan Huang, Matthew Berg, Ellie Pavlick, Stefanie Tellex
Publication date: 4 June 2022
Publisher: Springer US
Published in: Autonomous Robots / Issue 6/2022
Print ISSN: 0929-5593
Electronic ISSN: 1573-7527
DOI: https://doi.org/10.1007/s10514-022-10043-y
