Published in: Autonomous Robots 6/2022

4 June 2022

Hierarchical planning with state abstractions for temporal task specifications

Authors: Yoonseon Oh, Roma Patel, Thao Nguyen, Baichuan Huang, Matthew Berg, Ellie Pavlick, Stefanie Tellex


Abstract

We often specify tasks for a robot using temporal language that can include different levels of abstraction. For example, the command “go to the kitchen before going to the second floor” contains spatial abstraction: a “floor” consists of individual rooms that can also be referred to in isolation (such as “kitchen”). It also contains a temporal ordering of events, defined by the word “before”. Previous works have handled these two aspects separately, using syntactically co-safe Linear Temporal Logic (sc-LTL) to interpret temporal language (such as “before”) and Abstract Markov Decision Processes (AMDPs) to interpret hierarchical abstractions (such as “kitchen” and “second floor”). To handle both types of commands at once, we introduce the Abstract Product Markov Decision Process (AP-MDP), a novel approach capable of representing non-Markovian reward functions at different levels of abstraction. The AP-MDP framework translates LTL into its corresponding automata, creates a product Markov Decision Process (MDP) of the LTL specification and the environment MDP, and decomposes the problem into subproblems to enable efficient planning with abstractions. AP-MDP performs faster than a non-hierarchical method of solving LTL problems in over \(95\%\) of path planning tasks, and this percentage grows with the size of the environment domain. In a cleanup world domain, AP-MDP performs faster in over \(98\%\) of tasks. We also present a neural sequence-to-sequence model trained to translate language commands into LTL expressions, and a new corpus of non-Markovian language commands spanning different levels of abstraction. We test our framework with the collected language commands on two drones, demonstrating that our approach enables robots to efficiently solve temporal commands at different levels of abstraction in both indoor and outdoor environments.
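
The product construction mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a hypothetical four-room deterministic map (`EDGES`, `LABELS`), a hand-written automaton (`dfa_step`) for “go to the kitchen before going to the second floor” rather than one generated by an LTL tool, and breadth-first search in place of MDP planning. The hierarchical decomposition into subproblems is omitted. It only shows how pairing each room with an automaton state turns a temporally ordered task into an ordinary reachability problem.

```python
from collections import deque

# Hypothetical toy map: each room lists the rooms reachable in one step.
EDGES = {
    "lobby": ["kitchen", "stairs"],
    "kitchen": ["lobby"],
    "stairs": ["lobby", "second_floor_hall"],
    "second_floor_hall": ["stairs"],
}
# Atomic proposition that holds in a room (rooms not listed have none).
LABELS = {"kitchen": "kitchen", "second_floor_hall": "second_floor"}


def dfa_step(q, label):
    """Hand-written automaton for 'kitchen before second floor'."""
    if q == "q0":  # kitchen not visited yet
        if label == "kitchen":
            return "q1"
        if label == "second_floor":
            return "fail"  # reached the second floor too early
        return "q0"
    if q == "q1":  # kitchen visited, second floor still pending
        return "acc" if label == "second_floor" else "q1"
    return q  # "acc" and "fail" are absorbing


def plan(start):
    """Breadth-first search over product states (room, automaton state)."""
    init = (start, dfa_step("q0", LABELS.get(start)))
    parent = {init: None}
    queue = deque([init])
    while queue:
        node = queue.popleft()
        room, q = node
        if q == "acc":
            # Walk parent pointers back to the start to recover the route.
            path = []
            while node is not None:
                path.append(node[0])
                node = parent[node]
            return path[::-1]
        if q == "fail":
            continue  # the task can no longer be satisfied from here
        for nxt in EDGES[room]:
            succ = (nxt, dfa_step(q, LABELS.get(nxt)))
            if succ not in parent:
                parent[succ] = node
                queue.append(succ)
    return None  # no route satisfies the task


print(plan("lobby"))
```

Because the automaton state is part of the search state, the lobby appears twice on the returned route, once before and once after the kitchen visit. A planner over rooms alone could not make that distinction.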


Metadata
Title
Hierarchical planning with state abstractions for temporal task specifications
Authors
Yoonseon Oh
Roma Patel
Thao Nguyen
Baichuan Huang
Matthew Berg
Ellie Pavlick
Stefanie Tellex
Publication date
4 June 2022
Publisher
Springer US
Published in
Autonomous Robots / Issue 6/2022
Print ISSN: 0929-5593
Electronic ISSN: 1573-7527
DOI
https://doi.org/10.1007/s10514-022-10043-y
