Skip to main content
Top

2018 | OriginalPaper | Chapter

Model Checking for Safe Navigation Among Humans

Authors : Sebastian Junges, Nils Jansen, Joost-Pieter Katoen, Ufuk Topcu, Ruohan Zhang, Mary Hayhoe

Published in: Quantitative Evaluation of Systems

Publisher: Springer International Publishing

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

We investigate the use of probabilistic model checking to synthesise optimal strategies for autonomous systems that operate among uncontrollable agents such as humans. To formally assess such uncontrollable behaviour, we use models obtained from reinforcement learning. These behaviour models are, e.g., based on data collected in experiments in which humans execute dynamic tasks in a virtual environment. We first describe a method to translate such behaviour models into Markov decision processes (MDPs). The composition of these MDPs with models for (controllable) autonomous systems gives rise to stochastic games (SGs). MDPs and SGs are amenable to probabilistic model checking which enables the synthesis of strategies that provably adhere to formal specifications such as probabilistic temporal logic constraints. Experiments with a prototype provide (1) systematic insights on the credibility and the characteristics of behavioural models and (2) methods for automated synthesis of strategies satisfying guarantees on their required characteristics in the presence of humans.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Footnotes
1
We use the term movements to clarify the distinction from actions in MDPs.
 
Literature
1.
go back to reference Brafman, R.I., Tennenholtz, M.: On partially controlled multi-agent systems. J. Artif. Intell. Res. 4, 477–507 (1996)MathSciNetCrossRef Brafman, R.I., Tennenholtz, M.: On partially controlled multi-agent systems. J. Artif. Intell. Res. 4, 477–507 (1996)MathSciNetCrossRef
2.
go back to reference Dresner, K., Stone, P.: A multiagent approach to autonomous intersection management. J. Artif. Intell. Res. 31, 591–656 (2008)CrossRef Dresner, K., Stone, P.: A multiagent approach to autonomous intersection management. J. Artif. Intell. Res. 31, 591–656 (2008)CrossRef
3.
go back to reference Wellman, M.P., Wurman, P.R., O’Malley, K., Bangera, R., Reeves, D., Walsh, W.E.: Designing the market game for a trading agent competition. IEEE Internet Comput. 5(2), 43–51 (2001)CrossRef Wellman, M.P., Wurman, P.R., O’Malley, K., Bangera, R., Reeves, D., Walsh, W.E.: Designing the market game for a trading agent competition. IEEE Internet Comput. 5(2), 43–51 (2001)CrossRef
4.
go back to reference Khandelwal, P., et al.: Bwibots: a platform for bridging the gap between AI and human-robot interaction research. Int. J. Robot. Res. 36, 635–659 (2017)CrossRef Khandelwal, P., et al.: Bwibots: a platform for bridging the gap between AI and human-robot interaction research. Int. J. Robot. Res. 36, 635–659 (2017)CrossRef
5.
go back to reference Puterman, M.L.: Markov Decision Processes: Discrete Stochastic Dynamic Programming. Wiley, New York (1994)CrossRef Puterman, M.L.: Markov Decision Processes: Discrete Stochastic Dynamic Programming. Wiley, New York (1994)CrossRef
6.
go back to reference Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. MIT Press, Cambridge (1998) Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. MIT Press, Cambridge (1998)
7.
go back to reference Kwiatkowska, M.Z.: Model checking for probability and time: from theory to practice. In: LICS, p. 351. IEEE Computer Society (2003) Kwiatkowska, M.Z.: Model checking for probability and time: from theory to practice. In: LICS, p. 351. IEEE Computer Society (2003)
10.
go back to reference Hansson, H., Jonsson, B.: A logic for reasoning about time and reliability. Formal Aspects Comput. 6(5), 512–535 (1994)CrossRef Hansson, H., Jonsson, B.: A logic for reasoning about time and reliability. Formal Aspects Comput. 6(5), 512–535 (1994)CrossRef
13.
go back to reference Dean, T.L., Givan, R.: Model minimization in Markov decision processes. In: AAAI/IAAI, pp. 106–111. AAAI Press/The MIT Press (1997) Dean, T.L., Givan, R.: Model minimization in Markov decision processes. In: AAAI/IAAI, pp. 106–111. AAAI Press/The MIT Press (1997)
14.
go back to reference Tong, M.H., Zohar, O., Hayhoe, M.M.: Control of gaze while walking: task structure, reward, and uncertainty. J. Vis. 17(1), 28 (2017)CrossRef Tong, M.H., Zohar, O., Hayhoe, M.M.: Control of gaze while walking: task structure, reward, and uncertainty. J. Vis. 17(1), 28 (2017)CrossRef
15.
go back to reference Rothkopf, C.A., Ballard, D.H.: Modular inverse reinforcement learning for visuomotor behaviour. Biol. Cybern. 107(4), 477–490 (2013)CrossRef Rothkopf, C.A., Ballard, D.H.: Modular inverse reinforcement learning for visuomotor behaviour. Biol. Cybern. 107(4), 477–490 (2013)CrossRef
16.
go back to reference Sprague, N., Ballard, D.: Multiple-goal reinforcement learning with modular sarsa (0). IJCA I, 1445–1447 (2003) Sprague, N., Ballard, D.: Multiple-goal reinforcement learning with modular sarsa (0). IJCA I, 1445–1447 (2003)
17.
go back to reference Ballard, D.H., Kit, D., Rothkopf, C.A., Sullivan, B.: A hierarchical modular architecture for embodied cognition. Multisens. Res. 26(1–2), 177–204 (2013)CrossRef Ballard, D.H., Kit, D., Rothkopf, C.A., Sullivan, B.: A hierarchical modular architecture for embodied cognition. Multisens. Res. 26(1–2), 177–204 (2013)CrossRef
18.
go back to reference Leong, Y.C., Radulescu, A., Daniel, R., DeWoskin, V., Niv, Y.: Dynamic interaction between reinforcement learning and attention in multidimensional environments. Neuron 93(2), 451–463 (2017)CrossRef Leong, Y.C., Radulescu, A., Daniel, R., DeWoskin, V., Niv, Y.: Dynamic interaction between reinforcement learning and attention in multidimensional environments. Neuron 93(2), 451–463 (2017)CrossRef
19.
go back to reference Konur, S., Dixon, C., Fisher, M.: Analysing robot swarm behaviour via probabilistic model checking. Robot. Auton. Syst. 60(2), 199–213 (2012)CrossRef Konur, S., Dixon, C., Fisher, M.: Analysing robot swarm behaviour via probabilistic model checking. Robot. Auton. Syst. 60(2), 199–213 (2012)CrossRef
20.
go back to reference Johnson, B., Kress-Gazit, H.: Analyzing and revising synthesized controllers for robots with sensing and actuation errors. Int. J. Robot. Res. 34(6), 816–832 (2015)CrossRef Johnson, B., Kress-Gazit, H.: Analyzing and revising synthesized controllers for robots with sensing and actuation errors. Int. J. Robot. Res. 34(6), 816–832 (2015)CrossRef
22.
23.
go back to reference Feng, L., Wiltsche, C., Humphrey, L., Topcu, U.: Synthesis of human-in-the-loop control protocols for autonomous systems. IEEE Trans. Autom. Sci. Eng. 13(2), 450–462 (2016)CrossRef Feng, L., Wiltsche, C., Humphrey, L., Topcu, U.: Synthesis of human-in-the-loop control protocols for autonomous systems. IEEE Trans. Autom. Sci. Eng. 13(2), 450–462 (2016)CrossRef
24.
go back to reference Lacerda, B., Parker, D., Hawes, N.: Optimal policy generation for partially satisfiable co-safe LTL specifications. In: IJCAI, pp. 1587–1593. AAAI Press (2015) Lacerda, B., Parker, D., Hawes, N.: Optimal policy generation for partially satisfiable co-safe LTL specifications. In: IJCAI, pp. 1587–1593. AAAI Press (2015)
25.
go back to reference Littman, M.L.: Markov games as a framework for multi-agent reinforcement learning. ICML 157, 157–163 (1994) Littman, M.L.: Markov games as a framework for multi-agent reinforcement learning. ICML 157, 157–163 (1994)
26.
go back to reference Bowling, M., Veloso, M.: Multiagent learning using a variable learning rate. Artif. Intell. 136(2), 215–250 (2002)MathSciNetCrossRef Bowling, M., Veloso, M.: Multiagent learning using a variable learning rate. Artif. Intell. 136(2), 215–250 (2002)MathSciNetCrossRef
28.
go back to reference Katoen, J.P.: The probabilistic model checking landscape. In: LICS, pp. 31–45. ACM (2016) Katoen, J.P.: The probabilistic model checking landscape. In: LICS, pp. 31–45. ACM (2016)
29.
go back to reference Garcıa, J., Fernández, F.: A comprehensive survey on safe reinforcement learning. J. Mach. Learn. Res. 16(1), 1437–1480 (2015)MathSciNetMATH Garcıa, J., Fernández, F.: A comprehensive survey on safe reinforcement learning. J. Mach. Learn. Res. 16(1), 1437–1480 (2015)MathSciNetMATH
30.
go back to reference Sculley, D., Phillips, T., Ebner, D., Chaudhary, V., Young, M.: Machine learning: the high-interest credit card of technical debt (2014) Sculley, D., Phillips, T., Ebner, D., Chaudhary, V., Young, M.: Machine learning: the high-interest credit card of technical debt (2014)
31.
go back to reference Winterer, L., et al.: Motion planning under partial observability using game-based abstraction. In: CDC, pp. 2201–2208. IEEE (2017) Winterer, L., et al.: Motion planning under partial observability using game-based abstraction. In: CDC, pp. 2201–2208. IEEE (2017)
32.
go back to reference Etessami, K., Kwiatkowska, M.Z., Vardi, M.Y., Yannakakis, M.: Multi-objective model checking of Markov decision processes. Log. Methods Comput. Sci. 4(4), 1–21 (2008)MathSciNetMATH Etessami, K., Kwiatkowska, M.Z., Vardi, M.Y., Yannakakis, M.: Multi-objective model checking of Markov decision processes. Log. Methods Comput. Sci. 4(4), 1–21 (2008)MathSciNetMATH
33.
go back to reference Agha, G., Palmskog, K.: A survey of statistical model checking. ACM Trans. Model. Comput. Simul. 28(1), 6:1–6:39 (2018)MathSciNetCrossRef Agha, G., Palmskog, K.: A survey of statistical model checking. ACM Trans. Model. Comput. Simul. 28(1), 6:1–6:39 (2018)MathSciNetCrossRef
34.
go back to reference Wachter, B., Zhang, L., Hermanns, H.: Probabilistic model checking modulo theories. In: QEST, pp. 129–140. IEEE CS (2007) Wachter, B., Zhang, L., Hermanns, H.: Probabilistic model checking modulo theories. In: QEST, pp. 129–140. IEEE CS (2007)
Metadata
Title
Model Checking for Safe Navigation Among Humans
Authors
Sebastian Junges
Nils Jansen
Joost-Pieter Katoen
Ufuk Topcu
Ruohan Zhang
Mary Hayhoe
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-319-99154-2_13

Premium Partner