Skip to main content
Top

2017 | OriginalPaper | Chapter

ARES: Adaptive Receding-Horizon Synthesis of Optimal Plans

Authors : Anna Lukina, Lukas Esterle, Christian Hirsch, Ezio Bartocci, Junxing Yang, Ashish Tiwari, Scott A. Smolka, Radu Grosu

Published in: Tools and Algorithms for the Construction and Analysis of Systems

Publisher: Springer Berlin Heidelberg

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

We introduce ARES, an efficient approximation algorithm for generating optimal plans (action sequences) that take an initial state of a Markov Decision Process (MDP) to a state whose cost is below a specified (convergence) threshold. ARES uses Particle Swarm Optimization, with adaptive sizing for both the receding horizon and the particle swarm. Inspired by Importance Splitting, the length of the horizon and the number of particles are chosen such that at least one particle reaches a next-level state, that is, a state where the cost decreases by a required delta from the previous-level state. The level relation on states and the plans constructed by ARES implicitly define a Lyapunov function and an optimal policy, respectively, both of which could be explicitly generated by applying ARES to all states of the MDP, up to some topological equivalence relation. We also assess the effectiveness of ARES by statistically evaluating its rate of success in generating optimal plans. The ARES algorithm resulted from our desire to clarify if flying in V-formation is a flocking policy that optimizes energy conservation, clear view, and velocity alignment. That is, we were interested to see if one could find optimal plans that bring a flock from an arbitrary initial state to a state exhibiting a single connected V-formation. For flocks with 7 birds, ARES is able to generate a plan that leads to a V-formation in 95% of the 8,000 random initial configurations within 63 s, on average. ARES can also be easily customized into a model-predictive controller (MPC) with an adaptive receding horizon and statistical guarantees of convergence. To the best of our knowledge, our adaptive-sizing approach is the first to provide convergence guarantees in receding-horizon techniques.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Footnotes
1
A classic MDP [28] is obtained by adding sensor/actuator or wind-gust noise, which are the case we are addressing in the follow-up work.
 
Literature
1.
go back to reference Bajec, I.L., Heppner, F.H.: Organized flight in birds. Anim. Behav. 78(4), 777–789 (2009)CrossRef Bajec, I.L., Heppner, F.H.: Organized flight in birds. Anim. Behav. 78(4), 777–789 (2009)CrossRef
2.
go back to reference Bartocci, E., Bortolussi, L., Brázdil, T., Milios, D., Sanguinetti, G.: Policy learning for time-bounded reachability in continuous-time Markov decision processes via doubly-stochastic gradient ascent. In: Agha, G., Houdt, B. (eds.) QEST 2016. LNCS, vol. 9826, pp. 244–259. Springer, Heidelberg (2016). doi:10.1007/978-3-319-43425-4_17 CrossRef Bartocci, E., Bortolussi, L., Brázdil, T., Milios, D., Sanguinetti, G.: Policy learning for time-bounded reachability in continuous-time Markov decision processes via doubly-stochastic gradient ascent. In: Agha, G., Houdt, B. (eds.) QEST 2016. LNCS, vol. 9826, pp. 244–259. Springer, Heidelberg (2016). doi:10.​1007/​978-3-319-43425-4_​17 CrossRef
3.
go back to reference Baxter, J., Bartlett, P.L., Weaver, L.: Experiments with infinite-horizon, policy-gradient estimation. J. Artif. Int. Res. 15(1), 351–381 (2011)MathSciNetMATH Baxter, J., Bartlett, P.L., Weaver, L.: Experiments with infinite-horizon, policy-gradient estimation. J. Artif. Int. Res. 15(1), 351–381 (2011)MathSciNetMATH
4.
go back to reference Bellman, R.: Dynamic Programming. Princeton University Press, Princeton (1957)MATH Bellman, R.: Dynamic Programming. Princeton University Press, Princeton (1957)MATH
5.
go back to reference Camacho, E.F., Alba, C.B.: Model Predictive Control. Advanced Textbooks in Control and Signal Processing. Springer, Heidelberg (2007)CrossRef Camacho, E.F., Alba, C.B.: Model Predictive Control. Advanced Textbooks in Control and Signal Processing. Springer, Heidelberg (2007)CrossRef
6.
go back to reference Cattivelli, F.S., Sayed, A.H.: Modeling bird flight formations using diffusion adaptation. IEEE Trans. Signal Process. 59(5), 2038–2051 (2011)MathSciNetCrossRef Cattivelli, F.S., Sayed, A.H.: Modeling bird flight formations using diffusion adaptation. IEEE Trans. Signal Process. 59(5), 2038–2051 (2011)MathSciNetCrossRef
8.
go back to reference Chen, Y., Wu, B., Lai, T.L.: Fast Particle Filters and Their Applications to Adaptive Control in Change-Point ARX Models and Robotics. INTECH Open Access Publisher (2009) Chen, Y., Wu, B., Lai, T.L.: Fast Particle Filters and Their Applications to Adaptive Control in Change-Point ARX Models and Robotics. INTECH Open Access Publisher (2009)
9.
go back to reference Cutts, C., Speakman, J.: Energy savings in formation flight of pink-footed geese. J. Exp. Biol. 189(1), 251–261 (1994) Cutts, C., Speakman, J.: Energy savings in formation flight of pink-footed geese. J. Exp. Biol. 189(1), 251–261 (1994)
10.
go back to reference Dang, A.D., Horn, J.: Formation control of autonomous robots following desired formation during tracking a moving target. In: Proceedings of the International Conference on Cybernetics, pp. 160–165. IEEE (2015) Dang, A.D., Horn, J.: Formation control of autonomous robots following desired formation during tracking a moving target. In: Proceedings of the International Conference on Cybernetics, pp. 160–165. IEEE (2015)
11.
go back to reference Dimock, G., Selig, M.: The aerodynamic benefits of self-organization in bird flocks. Urbana 51, 1–9 (2003) Dimock, G., Selig, M.: The aerodynamic benefits of self-organization in bird flocks. Urbana 51, 1–9 (2003)
12.
go back to reference Flake, G.W.: The Computational Beauty of Nature: Computer Explorations of Fractals, Chaos, Complex Systems, and Adaptation. MIT Press, Cambridge (1998)MATH Flake, G.W.: The Computational Beauty of Nature: Computer Explorations of Fractals, Chaos, Complex Systems, and Adaptation. MIT Press, Cambridge (1998)MATH
13.
go back to reference García, C.E., Prett, D.M., Morari, M.: Model predictive control: theory and practice – a survey. Automatica 25(3), 335–348 (1989)CrossRefMATH García, C.E., Prett, D.M., Morari, M.: Model predictive control: theory and practice – a survey. Automatica 25(3), 335–348 (1989)CrossRefMATH
14.
go back to reference Gennaro, M.C.D., Iannelli, L., Vasca, F.: Formation control and collision avoidance in mobile agent systems. In: Proceedings of the International Symposium on Control and Automation Intelligent Control, pp. 796–801. IEEE (2005) Gennaro, M.C.D., Iannelli, L., Vasca, F.: Formation control and collision avoidance in mobile agent systems. In: Proceedings of the International Symposium on Control and Automation Intelligent Control, pp. 796–801. IEEE (2005)
15.
go back to reference Glasserman, P., Heidelberger, P., Shahabuddin, P., Zajic, T.: Multilevel splitting for estimating rare event probabilities. Oper. Res. 47(4), 585–600 (1999)MathSciNetCrossRefMATH Glasserman, P., Heidelberger, P., Shahabuddin, P., Zajic, T.: Multilevel splitting for estimating rare event probabilities. Oper. Res. 47(4), 585–600 (1999)MathSciNetCrossRefMATH
16.
go back to reference Grosu, R., Peled, D., Ramakrishnan, C.R., Smolka, S.A., Stoller, S.D., Yang, J.: Using statistical model checking for measuring systems. In: Margaria, T., Steffen, B. (eds.) ISoLA 2014. LNCS, vol. 8803, pp. 223–238. Springer, Heidelberg (2014). doi:10.1007/978-3-662-45231-8_16 Grosu, R., Peled, D., Ramakrishnan, C.R., Smolka, S.A., Stoller, S.D., Yang, J.: Using statistical model checking for measuring systems. In: Margaria, T., Steffen, B. (eds.) ISoLA 2014. LNCS, vol. 8803, pp. 223–238. Springer, Heidelberg (2014). doi:10.​1007/​978-3-662-45231-8_​16
17.
go back to reference Henriques, D., Martins, J.G., Zuliani, P., Platzer, A., Clarke, E.M.: Statistical model checking for Markov decision processes. In: Proceedings of QEST 2012: The Ninth International Conference on Quantitative Evaluation of Systems, QEST 2012, pp. 84–93. IEEE Computer Society (2012) Henriques, D., Martins, J.G., Zuliani, P., Platzer, A., Clarke, E.M.: Statistical model checking for Markov decision processes. In: Proceedings of QEST 2012: The Ninth International Conference on Quantitative Evaluation of Systems, QEST 2012, pp. 84–93. IEEE Computer Society (2012)
18.
go back to reference Heppner, F.H.: Avian flight formations. Bird-Banding 45(2), 160–169 (1974)CrossRef Heppner, F.H.: Avian flight formations. Bird-Banding 45(2), 160–169 (1974)CrossRef
19.
go back to reference Hérault, T., Lassaigne, R., Magniette, F., Peyronnet, S.: Approximate probabilistic model checking. In: Steffen, B., Levi, G. (eds.) VMCAI 2004. LNCS, vol. 2937, pp. 73–84. Springer, Heidelberg (2004). doi:10.1007/978-3-540-24622-0_8 CrossRef Hérault, T., Lassaigne, R., Magniette, F., Peyronnet, S.: Approximate probabilistic model checking. In: Steffen, B., Levi, G. (eds.) VMCAI 2004. LNCS, vol. 2937, pp. 73–84. Springer, Heidelberg (2004). doi:10.​1007/​978-3-540-24622-0_​8 CrossRef
20.
go back to reference Hung, Y., Wang, W.: Accelerating parallel particle swarm optimization via GPU. Optim. Methods Softw. 27(1), 33–51 (2012)CrossRefMATH Hung, Y., Wang, W.: Accelerating parallel particle swarm optimization via GPU. Optim. Methods Softw. 27(1), 33–51 (2012)CrossRefMATH
21.
go back to reference Kahn, H., Harris, T.E.: Estimation of particle transmission by random sampling. Natl. Bur. Stand. Appl. Math. Ser. 12, 27–30 (1951) Kahn, H., Harris, T.E.: Estimation of particle transmission by random sampling. Natl. Bur. Stand. Appl. Math. Ser. 12, 27–30 (1951)
22.
go back to reference Kalajdzic, K., Jegourel, C., Lukina, A., Bartocci, E., Legay, A., Smolka, S.A., Grosu, R.: Feedback control for statistical model checking of cyber-physical systems. In: Margaria, T., Steffen, B. (eds.) ISoLA 2016. LNCS, vol. 9952, pp. 46–61. Springer, Heidelberg (2016). doi:10.1007/978-3-319-47166-2_4 CrossRef Kalajdzic, K., Jegourel, C., Lukina, A., Bartocci, E., Legay, A., Smolka, S.A., Grosu, R.: Feedback control for statistical model checking of cyber-physical systems. In: Margaria, T., Steffen, B. (eds.) ISoLA 2016. LNCS, vol. 9952, pp. 46–61. Springer, Heidelberg (2016). doi:10.​1007/​978-3-319-47166-2_​4 CrossRef
23.
go back to reference Kennedy, J., Eberhart, R.: Particle swarm optimization. In: Proceedings of 1995 IEEE International Conference on Neural Networks, pp. 1942–1948 (1995) Kennedy, J., Eberhart, R.: Particle swarm optimization. In: Proceedings of 1995 IEEE International Conference on Neural Networks, pp. 1942–1948 (1995)
24.
go back to reference Lissaman, P., Shollenberger, C.A.: Formation flight of birds. Science 168(3934), 1003–1005 (1970)CrossRef Lissaman, P., Shollenberger, C.A.: Formation flight of birds. Science 168(3934), 1003–1005 (1970)CrossRef
25.
go back to reference Mannor, S., Rubinstein, R.Y., Gat, Y.: The cross entropy method for fast policy search. In: ICML, pp. 512–519 (2003) Mannor, S., Rubinstein, R.Y., Gat, Y.: The cross entropy method for fast policy search. In: ICML, pp. 512–519 (2003)
26.
go back to reference Nathan, A., Barbosa, V.C.: V-like formations in flocks of artificial birds. Artif. Life 14(2), 179–188 (2008)CrossRef Nathan, A., Barbosa, V.C.: V-like formations in flocks of artificial birds. Artif. Life 14(2), 179–188 (2008)CrossRef
27.
go back to reference Reynolds, C.W.: Flocks, herds and schools: a distributed behavioral model. SIGGRAPH Comput. Graph. 21(4), 25–34 (1987)CrossRef Reynolds, C.W.: Flocks, herds and schools: a distributed behavioral model. SIGGRAPH Comput. Graph. 21(4), 25–34 (1987)CrossRef
28.
go back to reference Russell, S., Norvig, P.: Artificial Intelligence: A Modern Approach, 3rd edn. Prentice-Hall, Upper Saddle River (2010)MATH Russell, S., Norvig, P.: Artificial Intelligence: A Modern Approach, 3rd edn. Prentice-Hall, Upper Saddle River (2010)MATH
29.
go back to reference Rymut, B., Kwolek, B., Krzeszowski, T.: GPU-accelerated human motion tracking using particle filter combined with PSO. In: Blanc-Talon, J., Kasinski, A., Philips, W., Popescu, D., Scheunders, P. (eds.) ACIVS 2013. LNCS, vol. 8192, pp. 426–437. Springer, Heidelberg (2013). doi:10.1007/978-3-319-02895-8_38 CrossRef Rymut, B., Kwolek, B., Krzeszowski, T.: GPU-accelerated human motion tracking using particle filter combined with PSO. In: Blanc-Talon, J., Kasinski, A., Philips, W., Popescu, D., Scheunders, P. (eds.) ACIVS 2013. LNCS, vol. 8192, pp. 426–437. Springer, Heidelberg (2013). doi:10.​1007/​978-3-319-02895-8_​38 CrossRef
30.
go back to reference Seiler, P., Pant, A., Hedrick, K.: Analysis of bird formations. In: Proceedings of the Conference on Decision and Control, vol. 1, pp. 118–123. IEEE (2002) Seiler, P., Pant, A., Hedrick, K.: Analysis of bird formations. In: Proceedings of the Conference on Decision and Control, vol. 1, pp. 118–123. IEEE (2002)
31.
33.
go back to reference Verfaillie, G., Pralet, C., Vidal, V., Teichteil, F., Infantes, G., Lesire, C.: Synthesis of plans or policies for controlling dynamic systems. AerospaceLab (4), 1–12 (2012) Verfaillie, G., Pralet, C., Vidal, V., Teichteil, F., Infantes, G., Lesire, C.: Synthesis of plans or policies for controlling dynamic systems. AerospaceLab (4), 1–12 (2012)
34.
go back to reference Weimerskirch, H., Martin, J., Clerquin, Y., Alexandre, P., Jiraskova, S.: Energy saving in flight formation. Nature 413(6857), 697–698 (2001)CrossRef Weimerskirch, H., Martin, J., Clerquin, Y., Alexandre, P., Jiraskova, S.: Energy saving in flight formation. Nature 413(6857), 697–698 (2001)CrossRef
35.
go back to reference Yang, J., Grosu, R., Smolka, S.A., Tiwari, A.: Love thy neighbor: V-formation as a problem of model predictive control. In: LIPIcs-Leibniz International Proceedings in Informatics, vol. 59. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik (2016) Yang, J., Grosu, R., Smolka, S.A., Tiwari, A.: Love thy neighbor: V-formation as a problem of model predictive control. In: LIPIcs-Leibniz International Proceedings in Informatics, vol. 59. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik (2016)
36.
go back to reference Yang, J., Grosu, R., Smolka, S.A., Tiwari, A.: V-formation as optimal control. In: Proceedings of the Biological Distributed Algorithms Workshop 2016 (2016) Yang, J., Grosu, R., Smolka, S.A., Tiwari, A.: V-formation as optimal control. In: Proceedings of the Biological Distributed Algorithms Workshop 2016 (2016)
37.
go back to reference Zhou, Y., Tan, Y.: GPU-based parallel particle swarm optimization. In: Proceedings of the Congress on Evolutionary Computation, pp. 1493–1500. IEEE (2009) Zhou, Y., Tan, Y.: GPU-based parallel particle swarm optimization. In: Proceedings of the Congress on Evolutionary Computation, pp. 1493–1500. IEEE (2009)
Metadata
Title
ARES: Adaptive Receding-Horizon Synthesis of Optimal Plans
Authors
Anna Lukina
Lukas Esterle
Christian Hirsch
Ezio Bartocci
Junxing Yang
Ashish Tiwari
Scott A. Smolka
Radu Grosu
Copyright Year
2017
Publisher
Springer Berlin Heidelberg
DOI
https://doi.org/10.1007/978-3-662-54580-5_17

Premium Partner