Skip to main content

2017 | OriginalPaper | Buchkapitel

ARES: Adaptive Receding-Horizon Synthesis of Optimal Plans

verfasst von : Anna Lukina, Lukas Esterle, Christian Hirsch, Ezio Bartocci, Junxing Yang, Ashish Tiwari, Scott A. Smolka, Radu Grosu

Erschienen in: Tools and Algorithms for the Construction and Analysis of Systems

Verlag: Springer Berlin Heidelberg

Aktivieren Sie unsere intelligente Suche um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

We introduce ARES, an efficient approximation algorithm for generating optimal plans (action sequences) that take an initial state of a Markov Decision Process (MDP) to a state whose cost is below a specified (convergence) threshold. ARES uses Particle Swarm Optimization, with adaptive sizing for both the receding horizon and the particle swarm. Inspired by Importance Splitting, the length of the horizon and the number of particles are chosen such that at least one particle reaches a next-level state, that is, a state where the cost decreases by a required delta from the previous-level state. The level relation on states and the plans constructed by ARES implicitly define a Lyapunov function and an optimal policy, respectively, both of which could be explicitly generated by applying ARES to all states of the MDP, up to some topological equivalence relation. We also assess the effectiveness of ARES by statistically evaluating its rate of success in generating optimal plans. The ARES algorithm resulted from our desire to clarify if flying in V-formation is a flocking policy that optimizes energy conservation, clear view, and velocity alignment. That is, we were interested to see if one could find optimal plans that bring a flock from an arbitrary initial state to a state exhibiting a single connected V-formation. For flocks with 7 birds, ARES is able to generate a plan that leads to a V-formation in 95% of the 8,000 random initial configurations within 63 s, on average. ARES can also be easily customized into a model-predictive controller (MPC) with an adaptive receding horizon and statistical guarantees of convergence. To the best of our knowledge, our adaptive-sizing approach is the first to provide convergence guarantees in receding-horizon techniques.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Fußnoten
1
A classic MDP [28] is obtained by adding sensor/actuator or wind-gust noise, which are the case we are addressing in the follow-up work.
 
Literatur
1.
Zurück zum Zitat Bajec, I.L., Heppner, F.H.: Organized flight in birds. Anim. Behav. 78(4), 777–789 (2009)CrossRef Bajec, I.L., Heppner, F.H.: Organized flight in birds. Anim. Behav. 78(4), 777–789 (2009)CrossRef
2.
Zurück zum Zitat Bartocci, E., Bortolussi, L., Brázdil, T., Milios, D., Sanguinetti, G.: Policy learning for time-bounded reachability in continuous-time Markov decision processes via doubly-stochastic gradient ascent. In: Agha, G., Houdt, B. (eds.) QEST 2016. LNCS, vol. 9826, pp. 244–259. Springer, Heidelberg (2016). doi:10.1007/978-3-319-43425-4_17 CrossRef Bartocci, E., Bortolussi, L., Brázdil, T., Milios, D., Sanguinetti, G.: Policy learning for time-bounded reachability in continuous-time Markov decision processes via doubly-stochastic gradient ascent. In: Agha, G., Houdt, B. (eds.) QEST 2016. LNCS, vol. 9826, pp. 244–259. Springer, Heidelberg (2016). doi:10.​1007/​978-3-319-43425-4_​17 CrossRef
3.
Zurück zum Zitat Baxter, J., Bartlett, P.L., Weaver, L.: Experiments with infinite-horizon, policy-gradient estimation. J. Artif. Int. Res. 15(1), 351–381 (2011)MathSciNetMATH Baxter, J., Bartlett, P.L., Weaver, L.: Experiments with infinite-horizon, policy-gradient estimation. J. Artif. Int. Res. 15(1), 351–381 (2011)MathSciNetMATH
4.
Zurück zum Zitat Bellman, R.: Dynamic Programming. Princeton University Press, Princeton (1957)MATH Bellman, R.: Dynamic Programming. Princeton University Press, Princeton (1957)MATH
5.
Zurück zum Zitat Camacho, E.F., Alba, C.B.: Model Predictive Control. Advanced Textbooks in Control and Signal Processing. Springer, Heidelberg (2007)CrossRef Camacho, E.F., Alba, C.B.: Model Predictive Control. Advanced Textbooks in Control and Signal Processing. Springer, Heidelberg (2007)CrossRef
6.
Zurück zum Zitat Cattivelli, F.S., Sayed, A.H.: Modeling bird flight formations using diffusion adaptation. IEEE Trans. Signal Process. 59(5), 2038–2051 (2011)MathSciNetCrossRef Cattivelli, F.S., Sayed, A.H.: Modeling bird flight formations using diffusion adaptation. IEEE Trans. Signal Process. 59(5), 2038–2051 (2011)MathSciNetCrossRef
8.
Zurück zum Zitat Chen, Y., Wu, B., Lai, T.L.: Fast Particle Filters and Their Applications to Adaptive Control in Change-Point ARX Models and Robotics. INTECH Open Access Publisher (2009) Chen, Y., Wu, B., Lai, T.L.: Fast Particle Filters and Their Applications to Adaptive Control in Change-Point ARX Models and Robotics. INTECH Open Access Publisher (2009)
9.
Zurück zum Zitat Cutts, C., Speakman, J.: Energy savings in formation flight of pink-footed geese. J. Exp. Biol. 189(1), 251–261 (1994) Cutts, C., Speakman, J.: Energy savings in formation flight of pink-footed geese. J. Exp. Biol. 189(1), 251–261 (1994)
10.
Zurück zum Zitat Dang, A.D., Horn, J.: Formation control of autonomous robots following desired formation during tracking a moving target. In: Proceedings of the International Conference on Cybernetics, pp. 160–165. IEEE (2015) Dang, A.D., Horn, J.: Formation control of autonomous robots following desired formation during tracking a moving target. In: Proceedings of the International Conference on Cybernetics, pp. 160–165. IEEE (2015)
11.
Zurück zum Zitat Dimock, G., Selig, M.: The aerodynamic benefits of self-organization in bird flocks. Urbana 51, 1–9 (2003) Dimock, G., Selig, M.: The aerodynamic benefits of self-organization in bird flocks. Urbana 51, 1–9 (2003)
12.
Zurück zum Zitat Flake, G.W.: The Computational Beauty of Nature: Computer Explorations of Fractals, Chaos, Complex Systems, and Adaptation. MIT Press, Cambridge (1998)MATH Flake, G.W.: The Computational Beauty of Nature: Computer Explorations of Fractals, Chaos, Complex Systems, and Adaptation. MIT Press, Cambridge (1998)MATH
13.
Zurück zum Zitat García, C.E., Prett, D.M., Morari, M.: Model predictive control: theory and practice – a survey. Automatica 25(3), 335–348 (1989)CrossRefMATH García, C.E., Prett, D.M., Morari, M.: Model predictive control: theory and practice – a survey. Automatica 25(3), 335–348 (1989)CrossRefMATH
14.
Zurück zum Zitat Gennaro, M.C.D., Iannelli, L., Vasca, F.: Formation control and collision avoidance in mobile agent systems. In: Proceedings of the International Symposium on Control and Automation Intelligent Control, pp. 796–801. IEEE (2005) Gennaro, M.C.D., Iannelli, L., Vasca, F.: Formation control and collision avoidance in mobile agent systems. In: Proceedings of the International Symposium on Control and Automation Intelligent Control, pp. 796–801. IEEE (2005)
15.
Zurück zum Zitat Glasserman, P., Heidelberger, P., Shahabuddin, P., Zajic, T.: Multilevel splitting for estimating rare event probabilities. Oper. Res. 47(4), 585–600 (1999)MathSciNetCrossRefMATH Glasserman, P., Heidelberger, P., Shahabuddin, P., Zajic, T.: Multilevel splitting for estimating rare event probabilities. Oper. Res. 47(4), 585–600 (1999)MathSciNetCrossRefMATH
16.
Zurück zum Zitat Grosu, R., Peled, D., Ramakrishnan, C.R., Smolka, S.A., Stoller, S.D., Yang, J.: Using statistical model checking for measuring systems. In: Margaria, T., Steffen, B. (eds.) ISoLA 2014. LNCS, vol. 8803, pp. 223–238. Springer, Heidelberg (2014). doi:10.1007/978-3-662-45231-8_16 Grosu, R., Peled, D., Ramakrishnan, C.R., Smolka, S.A., Stoller, S.D., Yang, J.: Using statistical model checking for measuring systems. In: Margaria, T., Steffen, B. (eds.) ISoLA 2014. LNCS, vol. 8803, pp. 223–238. Springer, Heidelberg (2014). doi:10.​1007/​978-3-662-45231-8_​16
17.
Zurück zum Zitat Henriques, D., Martins, J.G., Zuliani, P., Platzer, A., Clarke, E.M.: Statistical model checking for Markov decision processes. In: Proceedings of QEST 2012: The Ninth International Conference on Quantitative Evaluation of Systems, QEST 2012, pp. 84–93. IEEE Computer Society (2012) Henriques, D., Martins, J.G., Zuliani, P., Platzer, A., Clarke, E.M.: Statistical model checking for Markov decision processes. In: Proceedings of QEST 2012: The Ninth International Conference on Quantitative Evaluation of Systems, QEST 2012, pp. 84–93. IEEE Computer Society (2012)
18.
Zurück zum Zitat Heppner, F.H.: Avian flight formations. Bird-Banding 45(2), 160–169 (1974)CrossRef Heppner, F.H.: Avian flight formations. Bird-Banding 45(2), 160–169 (1974)CrossRef
19.
Zurück zum Zitat Hérault, T., Lassaigne, R., Magniette, F., Peyronnet, S.: Approximate probabilistic model checking. In: Steffen, B., Levi, G. (eds.) VMCAI 2004. LNCS, vol. 2937, pp. 73–84. Springer, Heidelberg (2004). doi:10.1007/978-3-540-24622-0_8 CrossRef Hérault, T., Lassaigne, R., Magniette, F., Peyronnet, S.: Approximate probabilistic model checking. In: Steffen, B., Levi, G. (eds.) VMCAI 2004. LNCS, vol. 2937, pp. 73–84. Springer, Heidelberg (2004). doi:10.​1007/​978-3-540-24622-0_​8 CrossRef
20.
Zurück zum Zitat Hung, Y., Wang, W.: Accelerating parallel particle swarm optimization via GPU. Optim. Methods Softw. 27(1), 33–51 (2012)CrossRefMATH Hung, Y., Wang, W.: Accelerating parallel particle swarm optimization via GPU. Optim. Methods Softw. 27(1), 33–51 (2012)CrossRefMATH
21.
Zurück zum Zitat Kahn, H., Harris, T.E.: Estimation of particle transmission by random sampling. Natl. Bur. Stand. Appl. Math. Ser. 12, 27–30 (1951) Kahn, H., Harris, T.E.: Estimation of particle transmission by random sampling. Natl. Bur. Stand. Appl. Math. Ser. 12, 27–30 (1951)
22.
Zurück zum Zitat Kalajdzic, K., Jegourel, C., Lukina, A., Bartocci, E., Legay, A., Smolka, S.A., Grosu, R.: Feedback control for statistical model checking of cyber-physical systems. In: Margaria, T., Steffen, B. (eds.) ISoLA 2016. LNCS, vol. 9952, pp. 46–61. Springer, Heidelberg (2016). doi:10.1007/978-3-319-47166-2_4 CrossRef Kalajdzic, K., Jegourel, C., Lukina, A., Bartocci, E., Legay, A., Smolka, S.A., Grosu, R.: Feedback control for statistical model checking of cyber-physical systems. In: Margaria, T., Steffen, B. (eds.) ISoLA 2016. LNCS, vol. 9952, pp. 46–61. Springer, Heidelberg (2016). doi:10.​1007/​978-3-319-47166-2_​4 CrossRef
23.
Zurück zum Zitat Kennedy, J., Eberhart, R.: Particle swarm optimization. In: Proceedings of 1995 IEEE International Conference on Neural Networks, pp. 1942–1948 (1995) Kennedy, J., Eberhart, R.: Particle swarm optimization. In: Proceedings of 1995 IEEE International Conference on Neural Networks, pp. 1942–1948 (1995)
24.
Zurück zum Zitat Lissaman, P., Shollenberger, C.A.: Formation flight of birds. Science 168(3934), 1003–1005 (1970)CrossRef Lissaman, P., Shollenberger, C.A.: Formation flight of birds. Science 168(3934), 1003–1005 (1970)CrossRef
25.
Zurück zum Zitat Mannor, S., Rubinstein, R.Y., Gat, Y.: The cross entropy method for fast policy search. In: ICML, pp. 512–519 (2003) Mannor, S., Rubinstein, R.Y., Gat, Y.: The cross entropy method for fast policy search. In: ICML, pp. 512–519 (2003)
26.
Zurück zum Zitat Nathan, A., Barbosa, V.C.: V-like formations in flocks of artificial birds. Artif. Life 14(2), 179–188 (2008)CrossRef Nathan, A., Barbosa, V.C.: V-like formations in flocks of artificial birds. Artif. Life 14(2), 179–188 (2008)CrossRef
27.
Zurück zum Zitat Reynolds, C.W.: Flocks, herds and schools: a distributed behavioral model. SIGGRAPH Comput. Graph. 21(4), 25–34 (1987)CrossRef Reynolds, C.W.: Flocks, herds and schools: a distributed behavioral model. SIGGRAPH Comput. Graph. 21(4), 25–34 (1987)CrossRef
28.
Zurück zum Zitat Russell, S., Norvig, P.: Artificial Intelligence: A Modern Approach, 3rd edn. Prentice-Hall, Upper Saddle River (2010)MATH Russell, S., Norvig, P.: Artificial Intelligence: A Modern Approach, 3rd edn. Prentice-Hall, Upper Saddle River (2010)MATH
29.
Zurück zum Zitat Rymut, B., Kwolek, B., Krzeszowski, T.: GPU-accelerated human motion tracking using particle filter combined with PSO. In: Blanc-Talon, J., Kasinski, A., Philips, W., Popescu, D., Scheunders, P. (eds.) ACIVS 2013. LNCS, vol. 8192, pp. 426–437. Springer, Heidelberg (2013). doi:10.1007/978-3-319-02895-8_38 CrossRef Rymut, B., Kwolek, B., Krzeszowski, T.: GPU-accelerated human motion tracking using particle filter combined with PSO. In: Blanc-Talon, J., Kasinski, A., Philips, W., Popescu, D., Scheunders, P. (eds.) ACIVS 2013. LNCS, vol. 8192, pp. 426–437. Springer, Heidelberg (2013). doi:10.​1007/​978-3-319-02895-8_​38 CrossRef
30.
Zurück zum Zitat Seiler, P., Pant, A., Hedrick, K.: Analysis of bird formations. In: Proceedings of the Conference on Decision and Control, vol. 1, pp. 118–123. IEEE (2002) Seiler, P., Pant, A., Hedrick, K.: Analysis of bird formations. In: Proceedings of the Conference on Decision and Control, vol. 1, pp. 118–123. IEEE (2002)
31.
Zurück zum Zitat Stulp, F., Sigaud, O.: Path integral policy improvement with covariance matrix adaptation. arXiv preprint arXiv:1206.4621 (2012) Stulp, F., Sigaud, O.: Path integral policy improvement with covariance matrix adaptation. arXiv preprint arXiv:​1206.​4621 (2012)
33.
Zurück zum Zitat Verfaillie, G., Pralet, C., Vidal, V., Teichteil, F., Infantes, G., Lesire, C.: Synthesis of plans or policies for controlling dynamic systems. AerospaceLab (4), 1–12 (2012) Verfaillie, G., Pralet, C., Vidal, V., Teichteil, F., Infantes, G., Lesire, C.: Synthesis of plans or policies for controlling dynamic systems. AerospaceLab (4), 1–12 (2012)
34.
Zurück zum Zitat Weimerskirch, H., Martin, J., Clerquin, Y., Alexandre, P., Jiraskova, S.: Energy saving in flight formation. Nature 413(6857), 697–698 (2001)CrossRef Weimerskirch, H., Martin, J., Clerquin, Y., Alexandre, P., Jiraskova, S.: Energy saving in flight formation. Nature 413(6857), 697–698 (2001)CrossRef
35.
Zurück zum Zitat Yang, J., Grosu, R., Smolka, S.A., Tiwari, A.: Love thy neighbor: V-formation as a problem of model predictive control. In: LIPIcs-Leibniz International Proceedings in Informatics, vol. 59. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik (2016) Yang, J., Grosu, R., Smolka, S.A., Tiwari, A.: Love thy neighbor: V-formation as a problem of model predictive control. In: LIPIcs-Leibniz International Proceedings in Informatics, vol. 59. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik (2016)
36.
Zurück zum Zitat Yang, J., Grosu, R., Smolka, S.A., Tiwari, A.: V-formation as optimal control. In: Proceedings of the Biological Distributed Algorithms Workshop 2016 (2016) Yang, J., Grosu, R., Smolka, S.A., Tiwari, A.: V-formation as optimal control. In: Proceedings of the Biological Distributed Algorithms Workshop 2016 (2016)
37.
Zurück zum Zitat Zhou, Y., Tan, Y.: GPU-based parallel particle swarm optimization. In: Proceedings of the Congress on Evolutionary Computation, pp. 1493–1500. IEEE (2009) Zhou, Y., Tan, Y.: GPU-based parallel particle swarm optimization. In: Proceedings of the Congress on Evolutionary Computation, pp. 1493–1500. IEEE (2009)
Metadaten
Titel
ARES: Adaptive Receding-Horizon Synthesis of Optimal Plans
verfasst von
Anna Lukina
Lukas Esterle
Christian Hirsch
Ezio Bartocci
Junxing Yang
Ashish Tiwari
Scott A. Smolka
Radu Grosu
Copyright-Jahr
2017
Verlag
Springer Berlin Heidelberg
DOI
https://doi.org/10.1007/978-3-662-54580-5_17