Skip to main content
Top
Published in: Soft Computing 1/2010

01-01-2010 | Original Paper

Using evolution strategies to solve DEC-POMDP problems

Authors: Barış Eker, H. Levent Akın

Published in: Soft Computing | Issue 1/2010

Log in

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

Decentralized partially observable Markov decision process (DEC-POMDP) is an approach to model multi-robot decision making problems under uncertainty. Since it is NEXP-complete there is no efficient exact algorithm to solve these problems and in spite of the attention it has taken recently, so far only a few approximate solutions that can solve small problems have been proposed. In this study, we offer a novel approximate solution algorithm for DEC-POMDP problems using evolution strategies, and a novel approach to approximately calculate the fitness of the chromosomes which correspond to the expected reward. We also propose a new problem which is a more complex, modified version of the grid meeting problem and solve it. Our results show that our algorithm is scalable and we can solve problems that have more states than the problems attempted in previous studies.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Literature
go back to reference Akin HL (1994) Evolutionary computation: a natural answer to artificial questions. In: Proceedings of ANNAL: hints from life to artificial intelligence, METU, pp 41-52 Akin HL (1994) Evolutionary computation: a natural answer to artificial questions. In: Proceedings of ANNAL: hints from life to artificial intelligence, METU, pp 41-52
go back to reference Amato C, Bernstein DS, Zilberstein S (2006) Optimal fixed-size controllers for decentralized POMDPs, AAMAS 2006 workshop on multi-agent sequential decision making in uncertain domains, Hakodate, Japan Amato C, Bernstein DS, Zilberstein S (2006) Optimal fixed-size controllers for decentralized POMDPs, AAMAS 2006 workshop on multi-agent sequential decision making in uncertain domains, Hakodate, Japan
go back to reference Back T, Schwefel HP (1993) An overview of evolutionary algorithms for parameter optimization. Evol Comput 1:1–24CrossRef Back T, Schwefel HP (1993) An overview of evolutionary algorithms for parameter optimization. Evol Comput 1:1–24CrossRef
go back to reference Becker E, Zilberstein S, Lesser V, Goldman CV (2004) Solving transition independent decentralized Markov decision processes. J Artif Intell Res 22:423–255MATHMathSciNet Becker E, Zilberstein S, Lesser V, Goldman CV (2004) Solving transition independent decentralized Markov decision processes. J Artif Intell Res 22:423–255MATHMathSciNet
go back to reference Bernstein D, Zilberstein S, Immerman N (2000) The complexity of decentralized control of markov decision processes. In: Proceedings of the 16th conference on uncertainty in artificial intelligence Bernstein D, Zilberstein S, Immerman N (2000) The complexity of decentralized control of markov decision processes. In: Proceedings of the 16th conference on uncertainty in artificial intelligence
go back to reference Bernstein D, Hansen EA, Zilberstein S (2005) Bounded policy iteration for decentralized POMDPs. In: Proceedings of the nineteenth international joint conference on artificial intelligence (IJCAI), Edinburgh, Scotland Bernstein D, Hansen EA, Zilberstein S (2005) Bounded policy iteration for decentralized POMDPs. In: Proceedings of the nineteenth international joint conference on artificial intelligence (IJCAI), Edinburgh, Scotland
go back to reference Cassandra AR (1998) A survey of POMDP applications, AAAI fall symposium Cassandra AR (1998) A survey of POMDP applications, AAAI fall symposium
go back to reference Cogill R, Rotkowitz M, Van Roy B, Lall S (2004) An approximate dynamic programming approach to decentralized control of stochastic systems. In: Proceedings of the allerton conference on communication, control, and computing Cogill R, Rotkowitz M, Van Roy B, Lall S (2004) An approximate dynamic programming approach to decentralized control of stochastic systems. In: Proceedings of the allerton conference on communication, control, and computing
go back to reference Cohen PR (1995) Empirical methods for artificial intelligence. MIT Press, Cambridge Cohen PR (1995) Empirical methods for artificial intelligence. MIT Press, Cambridge
go back to reference Fogel IJ, Owens AJ, Walsh MJ (1966) Artificial intelligence through simulated evolution. Wiley, New YorkMATH Fogel IJ, Owens AJ, Walsh MJ (1966) Artificial intelligence through simulated evolution. Wiley, New YorkMATH
go back to reference Goldman CV, Zilberstein S (2004) Decentralized control of cooperative systems: categorization and complexity analysis. J Art Intell Res 22:143–174MATHMathSciNet Goldman CV, Zilberstein S (2004) Decentralized control of cooperative systems: categorization and complexity analysis. J Art Intell Res 22:143–174MATHMathSciNet
go back to reference Hansen EA, Bernstein DS, Zilberstein S (2004) Dynamic programming for partially observable stochastic games. In: Proceedings of the nineteenth national conference on artificial intelligence (AAAI), San Jose, CA, pp 709-715 Hansen EA, Bernstein DS, Zilberstein S (2004) Dynamic programming for partially observable stochastic games. In: Proceedings of the nineteenth national conference on artificial intelligence (AAAI), San Jose, CA, pp 709-715
go back to reference Holland J (1975) Adaptation in natural and artificial systems. University of Michigan Press Holland J (1975) Adaptation in natural and artificial systems. University of Michigan Press
go back to reference Ignat DB (1998) Genetic algorithm with punctuated equilibria: analysis of the traveling salesperson problem instance. A thesis in TCC402. School of Engineering and Applied Science University of Virginia Ignat DB (1998) Genetic algorithm with punctuated equilibria: analysis of the traveling salesperson problem instance. A thesis in TCC402. School of Engineering and Applied Science University of Virginia
go back to reference Jimenez F, Sanchez G, Vasant P, Verdegay J (2006) A multi-objective evolutionary approach for fuzzy optimization in production planning. 2006 IEEE international conference on systems, man, and cybernetics, Taipei, Taiwan, pp 3120–3125 Jimenez F, Sanchez G, Vasant P, Verdegay J (2006) A multi-objective evolutionary approach for fuzzy optimization in production planning. 2006 IEEE international conference on systems, man, and cybernetics, Taipei, Taiwan, pp 3120–3125
go back to reference Jin Y (2005) A comprehensive survey of fitness approximation in evolutionary computation. Soft Comput 9(1):3–12CrossRef Jin Y (2005) A comprehensive survey of fitness approximation in evolutionary computation. Soft Comput 9(1):3–12CrossRef
go back to reference Koza JR (1992) Genetic programming: on the programming of computers by means of natural selection. MIT Press, Cambridge. ISBN 0-262-11170-5 Koza JR (1992) Genetic programming: on the programming of computers by means of natural selection. MIT Press, Cambridge. ISBN 0-262-11170-5
go back to reference LaValle SM (2006) Planning algorithms. Cambridge University Press, Cambridge LaValle SM (2006) Planning algorithms. Cambridge University Press, Cambridge
go back to reference Lin AZ, Bean JC, White CC (1999) A hybrid genetic/optimization algorithm for finite horizon partially observed Markov decision processes. In: Proceedings of the congress on evolutionary computation Lin AZ, Bean JC, White CC (1999) A hybrid genetic/optimization algorithm for finite horizon partially observed Markov decision processes. In: Proceedings of the congress on evolutionary computation
go back to reference Lopez E, Barea R, Bergasa LM, Escudero M (2003) Visually augmented POMDP for indoor robot navigation. In: 21th IASTED international multi-conference on applied informatics Lopez E, Barea R, Bergasa LM, Escudero M (2003) Visually augmented POMDP for indoor robot navigation. In: 21th IASTED international multi-conference on applied informatics
go back to reference Nair R, Pynadath D, Yokoo M, Tambe M, Marsella S (2003) Taming decentralized POMDPs: towards efficient policy computation for multiagent settings. In: Proceedings of the eighteenth international joint conference on artificial intelligence (IJCAI-03) Nair R, Pynadath D, Yokoo M, Tambe M, Marsella S (2003) Taming decentralized POMDPs: towards efficient policy computation for multiagent settings. In: Proceedings of the eighteenth international joint conference on artificial intelligence (IJCAI-03)
go back to reference Pynadath DV, Tambe M (2002) The communicative multiagent team decision problem: analyzing teamwork theories and models. J Art Intell Res Pynadath DV, Tambe M (2002) The communicative multiagent team decision problem: analyzing teamwork theories and models. J Art Intell Res
go back to reference Rabinovich Z, Goldman CV, Rosenschein JS (2003) The complexity of multiagent systems: the price of silence, In: Proceedings of the second international joint conference on autonomous agents and multiagent systems (AAMAS), Melbourne, Australia, pp 1102–1103 Rabinovich Z, Goldman CV, Rosenschein JS (2003) The complexity of multiagent systems: the price of silence, In: Proceedings of the second international joint conference on autonomous agents and multiagent systems (AAMAS), Melbourne, Australia, pp 1102–1103
go back to reference Rechenberg I (1973) Evolutionsstrategie: Optimierung technischer Systeme nach Prinzipien der biologischen Evolution. Fromman-Holzboog Verlag, Stuttgart Rechenberg I (1973) Evolutionsstrategie: Optimierung technischer Systeme nach Prinzipien der biologischen Evolution. Fromman-Holzboog Verlag, Stuttgart
go back to reference Sanchez G, Jimenez F, Vasant P (2007) Fuzzy optimization with multi-objective evolutionary algorithms: a case study. In: Proceedings of the 2007 IEEE symposium on computational intelligence in multicriteria decision making (MCDM). Honolulu, Hawaii, USA, pp 58–64 Sanchez G, Jimenez F, Vasant P (2007) Fuzzy optimization with multi-objective evolutionary algorithms: a case study. In: Proceedings of the 2007 IEEE symposium on computational intelligence in multicriteria decision making (MCDM). Honolulu, Hawaii, USA, pp 58–64
go back to reference Sendhoff B, Kreutz M, von Seelen W (1997) Causality and the analysis of local search in evolutionary algorithms. Internal report IRINI 97-16, Institut für Neuroinformatik, Ruhr-Universität Bochum, Germany Sendhoff B, Kreutz M, von Seelen W (1997) Causality and the analysis of local search in evolutionary algorithms. Internal report IRINI 97-16, Institut für Neuroinformatik, Ruhr-Universität Bochum, Germany
go back to reference Seuken S, Zilberstein S (2005) Formal models and algorithms for decentralized control of multiple agents. Technical report UM-CS-2005-068, Computer Science Department, University of Massachusetts Seuken S, Zilberstein S (2005) Formal models and algorithms for decentralized control of multiple agents. Technical report UM-CS-2005-068, Computer Science Department, University of Massachusetts
go back to reference Seuken S, Zilberstein S (2007) Memory-bounded dynamic programming for DEC-POMDPs, IJCAI Seuken S, Zilberstein S (2007) Memory-bounded dynamic programming for DEC-POMDPs, IJCAI
go back to reference Spaan M, Vlassis N (2004) A point-based POMDP algorithm for robot planning. In: Proceedings of the IEEE international conference on robotics and automation. New Orleans, Louisiana, pp 2399–2404 Spaan M, Vlassis N (2004) A point-based POMDP algorithm for robot planning. In: Proceedings of the IEEE international conference on robotics and automation. New Orleans, Louisiana, pp 2399–2404
go back to reference Szer D, Charpillet F, Zilberstein S (2005) MAA*: a heuristic search algorithm for solving decentralized POMDPs. In: Proceedings of the twenty-first conference on uncertainty in artificial intelligence (UAI), Edinburgh, Scotland Szer D, Charpillet F, Zilberstein S (2005) MAA*: a heuristic search algorithm for solving decentralized POMDPs. In: Proceedings of the twenty-first conference on uncertainty in artificial intelligence (UAI), Edinburgh, Scotland
go back to reference Wang FK (2001) Confidence interval for the mean of non-normal data. Qual Reliab Eng Int 17:257–267CrossRef Wang FK (2001) Confidence interval for the mean of non-normal data. Qual Reliab Eng Int 17:257–267CrossRef
Metadata
Title
Using evolution strategies to solve DEC-POMDP problems
Authors
Barış Eker
H. Levent Akın
Publication date
01-01-2010
Publisher
Springer-Verlag
Published in
Soft Computing / Issue 1/2010
Print ISSN: 1432-7643
Electronic ISSN: 1433-7479
DOI
https://doi.org/10.1007/s00500-008-0388-7

Other articles of this Issue 1/2010

Soft Computing 1/2010 Go to the issue

Premium Partner