Skip to main content

2018 | OriginalPaper | Buchkapitel

Model Checking for Safe Navigation Among Humans

verfasst von : Sebastian Junges, Nils Jansen, Joost-Pieter Katoen, Ufuk Topcu, Ruohan Zhang, Mary Hayhoe

Erschienen in: Quantitative Evaluation of Systems

Verlag: Springer International Publishing

Aktivieren Sie unsere intelligente Suche um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

We investigate the use of probabilistic model checking to synthesise optimal strategies for autonomous systems that operate among uncontrollable agents such as humans. To formally assess such uncontrollable behaviour, we use models obtained from reinforcement learning. These behaviour models are, e.g., based on data collected in experiments in which humans execute dynamic tasks in a virtual environment. We first describe a method to translate such behaviour models into Markov decision processes (MDPs). The composition of these MDPs with models for (controllable) autonomous systems gives rise to stochastic games (SGs). MDPs and SGs are amenable to probabilistic model checking which enables the synthesis of strategies that provably adhere to formal specifications such as probabilistic temporal logic constraints. Experiments with a prototype provide (1) systematic insights on the credibility and the characteristics of behavioural models and (2) methods for automated synthesis of strategies satisfying guarantees on their required characteristics in the presence of humans.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Fußnoten
1
We use the term movements to clarify the distinction from actions in MDPs.
 
Literatur
1.
Zurück zum Zitat Brafman, R.I., Tennenholtz, M.: On partially controlled multi-agent systems. J. Artif. Intell. Res. 4, 477–507 (1996)MathSciNetCrossRef Brafman, R.I., Tennenholtz, M.: On partially controlled multi-agent systems. J. Artif. Intell. Res. 4, 477–507 (1996)MathSciNetCrossRef
2.
Zurück zum Zitat Dresner, K., Stone, P.: A multiagent approach to autonomous intersection management. J. Artif. Intell. Res. 31, 591–656 (2008)CrossRef Dresner, K., Stone, P.: A multiagent approach to autonomous intersection management. J. Artif. Intell. Res. 31, 591–656 (2008)CrossRef
3.
Zurück zum Zitat Wellman, M.P., Wurman, P.R., O’Malley, K., Bangera, R., Reeves, D., Walsh, W.E.: Designing the market game for a trading agent competition. IEEE Internet Comput. 5(2), 43–51 (2001)CrossRef Wellman, M.P., Wurman, P.R., O’Malley, K., Bangera, R., Reeves, D., Walsh, W.E.: Designing the market game for a trading agent competition. IEEE Internet Comput. 5(2), 43–51 (2001)CrossRef
4.
Zurück zum Zitat Khandelwal, P., et al.: Bwibots: a platform for bridging the gap between AI and human-robot interaction research. Int. J. Robot. Res. 36, 635–659 (2017)CrossRef Khandelwal, P., et al.: Bwibots: a platform for bridging the gap between AI and human-robot interaction research. Int. J. Robot. Res. 36, 635–659 (2017)CrossRef
5.
Zurück zum Zitat Puterman, M.L.: Markov Decision Processes: Discrete Stochastic Dynamic Programming. Wiley, New York (1994)CrossRef Puterman, M.L.: Markov Decision Processes: Discrete Stochastic Dynamic Programming. Wiley, New York (1994)CrossRef
6.
Zurück zum Zitat Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. MIT Press, Cambridge (1998) Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. MIT Press, Cambridge (1998)
7.
Zurück zum Zitat Kwiatkowska, M.Z.: Model checking for probability and time: from theory to practice. In: LICS, p. 351. IEEE Computer Society (2003) Kwiatkowska, M.Z.: Model checking for probability and time: from theory to practice. In: LICS, p. 351. IEEE Computer Society (2003)
10.
Zurück zum Zitat Hansson, H., Jonsson, B.: A logic for reasoning about time and reliability. Formal Aspects Comput. 6(5), 512–535 (1994)CrossRef Hansson, H., Jonsson, B.: A logic for reasoning about time and reliability. Formal Aspects Comput. 6(5), 512–535 (1994)CrossRef
13.
Zurück zum Zitat Dean, T.L., Givan, R.: Model minimization in Markov decision processes. In: AAAI/IAAI, pp. 106–111. AAAI Press/The MIT Press (1997) Dean, T.L., Givan, R.: Model minimization in Markov decision processes. In: AAAI/IAAI, pp. 106–111. AAAI Press/The MIT Press (1997)
14.
Zurück zum Zitat Tong, M.H., Zohar, O., Hayhoe, M.M.: Control of gaze while walking: task structure, reward, and uncertainty. J. Vis. 17(1), 28 (2017)CrossRef Tong, M.H., Zohar, O., Hayhoe, M.M.: Control of gaze while walking: task structure, reward, and uncertainty. J. Vis. 17(1), 28 (2017)CrossRef
15.
Zurück zum Zitat Rothkopf, C.A., Ballard, D.H.: Modular inverse reinforcement learning for visuomotor behaviour. Biol. Cybern. 107(4), 477–490 (2013)CrossRef Rothkopf, C.A., Ballard, D.H.: Modular inverse reinforcement learning for visuomotor behaviour. Biol. Cybern. 107(4), 477–490 (2013)CrossRef
16.
Zurück zum Zitat Sprague, N., Ballard, D.: Multiple-goal reinforcement learning with modular sarsa (0). IJCA I, 1445–1447 (2003) Sprague, N., Ballard, D.: Multiple-goal reinforcement learning with modular sarsa (0). IJCA I, 1445–1447 (2003)
17.
Zurück zum Zitat Ballard, D.H., Kit, D., Rothkopf, C.A., Sullivan, B.: A hierarchical modular architecture for embodied cognition. Multisens. Res. 26(1–2), 177–204 (2013)CrossRef Ballard, D.H., Kit, D., Rothkopf, C.A., Sullivan, B.: A hierarchical modular architecture for embodied cognition. Multisens. Res. 26(1–2), 177–204 (2013)CrossRef
18.
Zurück zum Zitat Leong, Y.C., Radulescu, A., Daniel, R., DeWoskin, V., Niv, Y.: Dynamic interaction between reinforcement learning and attention in multidimensional environments. Neuron 93(2), 451–463 (2017)CrossRef Leong, Y.C., Radulescu, A., Daniel, R., DeWoskin, V., Niv, Y.: Dynamic interaction between reinforcement learning and attention in multidimensional environments. Neuron 93(2), 451–463 (2017)CrossRef
19.
Zurück zum Zitat Konur, S., Dixon, C., Fisher, M.: Analysing robot swarm behaviour via probabilistic model checking. Robot. Auton. Syst. 60(2), 199–213 (2012)CrossRef Konur, S., Dixon, C., Fisher, M.: Analysing robot swarm behaviour via probabilistic model checking. Robot. Auton. Syst. 60(2), 199–213 (2012)CrossRef
20.
Zurück zum Zitat Johnson, B., Kress-Gazit, H.: Analyzing and revising synthesized controllers for robots with sensing and actuation errors. Int. J. Robot. Res. 34(6), 816–832 (2015)CrossRef Johnson, B., Kress-Gazit, H.: Analyzing and revising synthesized controllers for robots with sensing and actuation errors. Int. J. Robot. Res. 34(6), 816–832 (2015)CrossRef
22.
Zurück zum Zitat Chen, T., Kwiatkowska, M., Simaitis, A., Wiltsche, C.: Synthesis for multi-objective stochastic games: an application to autonomous urban driving. In: Joshi, K., Siegle, M., Stoelinga, M., D’Argenio, P.R. (eds.) QEST 2013. LNCS, vol. 8054, pp. 322–337. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-40196-1_28CrossRef Chen, T., Kwiatkowska, M., Simaitis, A., Wiltsche, C.: Synthesis for multi-objective stochastic games: an application to autonomous urban driving. In: Joshi, K., Siegle, M., Stoelinga, M., D’Argenio, P.R. (eds.) QEST 2013. LNCS, vol. 8054, pp. 322–337. Springer, Heidelberg (2013). https://​doi.​org/​10.​1007/​978-3-642-40196-1_​28CrossRef
23.
Zurück zum Zitat Feng, L., Wiltsche, C., Humphrey, L., Topcu, U.: Synthesis of human-in-the-loop control protocols for autonomous systems. IEEE Trans. Autom. Sci. Eng. 13(2), 450–462 (2016)CrossRef Feng, L., Wiltsche, C., Humphrey, L., Topcu, U.: Synthesis of human-in-the-loop control protocols for autonomous systems. IEEE Trans. Autom. Sci. Eng. 13(2), 450–462 (2016)CrossRef
24.
Zurück zum Zitat Lacerda, B., Parker, D., Hawes, N.: Optimal policy generation for partially satisfiable co-safe LTL specifications. In: IJCAI, pp. 1587–1593. AAAI Press (2015) Lacerda, B., Parker, D., Hawes, N.: Optimal policy generation for partially satisfiable co-safe LTL specifications. In: IJCAI, pp. 1587–1593. AAAI Press (2015)
25.
Zurück zum Zitat Littman, M.L.: Markov games as a framework for multi-agent reinforcement learning. ICML 157, 157–163 (1994) Littman, M.L.: Markov games as a framework for multi-agent reinforcement learning. ICML 157, 157–163 (1994)
26.
Zurück zum Zitat Bowling, M., Veloso, M.: Multiagent learning using a variable learning rate. Artif. Intell. 136(2), 215–250 (2002)MathSciNetCrossRef Bowling, M., Veloso, M.: Multiagent learning using a variable learning rate. Artif. Intell. 136(2), 215–250 (2002)MathSciNetCrossRef
28.
Zurück zum Zitat Katoen, J.P.: The probabilistic model checking landscape. In: LICS, pp. 31–45. ACM (2016) Katoen, J.P.: The probabilistic model checking landscape. In: LICS, pp. 31–45. ACM (2016)
29.
Zurück zum Zitat Garcıa, J., Fernández, F.: A comprehensive survey on safe reinforcement learning. J. Mach. Learn. Res. 16(1), 1437–1480 (2015)MathSciNetMATH Garcıa, J., Fernández, F.: A comprehensive survey on safe reinforcement learning. J. Mach. Learn. Res. 16(1), 1437–1480 (2015)MathSciNetMATH
30.
Zurück zum Zitat Sculley, D., Phillips, T., Ebner, D., Chaudhary, V., Young, M.: Machine learning: the high-interest credit card of technical debt (2014) Sculley, D., Phillips, T., Ebner, D., Chaudhary, V., Young, M.: Machine learning: the high-interest credit card of technical debt (2014)
31.
Zurück zum Zitat Winterer, L., et al.: Motion planning under partial observability using game-based abstraction. In: CDC, pp. 2201–2208. IEEE (2017) Winterer, L., et al.: Motion planning under partial observability using game-based abstraction. In: CDC, pp. 2201–2208. IEEE (2017)
32.
Zurück zum Zitat Etessami, K., Kwiatkowska, M.Z., Vardi, M.Y., Yannakakis, M.: Multi-objective model checking of Markov decision processes. Log. Methods Comput. Sci. 4(4), 1–21 (2008)MathSciNetMATH Etessami, K., Kwiatkowska, M.Z., Vardi, M.Y., Yannakakis, M.: Multi-objective model checking of Markov decision processes. Log. Methods Comput. Sci. 4(4), 1–21 (2008)MathSciNetMATH
33.
Zurück zum Zitat Agha, G., Palmskog, K.: A survey of statistical model checking. ACM Trans. Model. Comput. Simul. 28(1), 6:1–6:39 (2018)MathSciNetCrossRef Agha, G., Palmskog, K.: A survey of statistical model checking. ACM Trans. Model. Comput. Simul. 28(1), 6:1–6:39 (2018)MathSciNetCrossRef
34.
Zurück zum Zitat Wachter, B., Zhang, L., Hermanns, H.: Probabilistic model checking modulo theories. In: QEST, pp. 129–140. IEEE CS (2007) Wachter, B., Zhang, L., Hermanns, H.: Probabilistic model checking modulo theories. In: QEST, pp. 129–140. IEEE CS (2007)
Metadaten
Titel
Model Checking for Safe Navigation Among Humans
verfasst von
Sebastian Junges
Nils Jansen
Joost-Pieter Katoen
Ufuk Topcu
Ruohan Zhang
Mary Hayhoe
Copyright-Jahr
2018
DOI
https://doi.org/10.1007/978-3-319-99154-2_13