Skip to main content
Top
Published in: Autonomous Robots 8/2016

28-11-2015

A robust approach to robot team learning

Authors: Justin Girard, M. Reza Emami

Published in: Autonomous Robots | Issue 8/2016

Log in

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

The paper achieves two outcomes. First, it summarizes previous work on concurrent Markov decision processes (CMDPs) currently demonstrated for use with multi-agent foraging problems. When using CMDPs, each agent models the environment using two Markov decision process (MDP). The two MDPs characterize a multi-agent foraging problem by modeling both a single-agent foraging problem, and multi-agent task allocation problem, for each agent. Second, the paper studies the effects of state uncertainty on a heterogeneous robot team that utilizes the aforementioned CMDP modelling approach. Furthermore, the paper presents a method to maintain performance despite state uncertainty. The resulting robust concurrent individual and social learning (RCISL) mechanism leads to an enhanced team learning behaviour despite state uncertainty. The paper analyzes the performance of the concurrent individual and social learning mechanism with and without a particle filter for a heterogeneous foraging scenario. The RCISL mechanism confers statistically significant performance improvements over the CISL mechanism.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Literature
go back to reference Airiau, S., & Endriss, U. (2014). Multiagent resource allocation with sharable items. Autonomous Agents and Multi-Agent Systems, 28, 956–985.CrossRef Airiau, S., & Endriss, U. (2014). Multiagent resource allocation with sharable items. Autonomous Agents and Multi-Agent Systems, 28, 956–985.CrossRef
go back to reference Araghin, S., Khosravi, A., Johnstone, M., & Creighton, D. (2013). A novel modular Q-learning architecture to improve performance under incomplete learning in a grid soccer game. Engineering Applications of Artificial Intelligence, 26, 2164–2171. doi:10.1016/j.engappai.2013.05.003.CrossRef Araghin, S., Khosravi, A., Johnstone, M., & Creighton, D. (2013). A novel modular Q-learning architecture to improve performance under incomplete learning in a grid soccer game. Engineering Applications of Artificial Intelligence, 26, 2164–2171. doi:10.​1016/​j.​engappai.​2013.​05.​003.CrossRef
go back to reference Arulampalam, M. S., Maskell, S., Gordon, N., & Clapp, T. (2002). A tutorial on particle filters for online nonlinear/non-Gaussian bayesian tracking. IEEE Transactions on Signal Processing, 50, 174–188.CrossRef Arulampalam, M. S., Maskell, S., Gordon, N., & Clapp, T. (2002). A tutorial on particle filters for online nonlinear/non-Gaussian bayesian tracking. IEEE Transactions on Signal Processing, 50, 174–188.CrossRef
go back to reference Becker, R., Zilberstein, S., Lesser, V., & Goldman, C. V. (2004). Solving transition independent decentralized Markov decision processes. Journal of Artificial Intelligence Research archive, 22(1), 423–455.MathSciNetMATH Becker, R., Zilberstein, S., Lesser, V., & Goldman, C. V. (2004). Solving transition independent decentralized Markov decision processes. Journal of Artificial Intelligence Research archive, 22(1), 423–455.MathSciNetMATH
go back to reference Bernstein, D. S., Givan, R., Immerman, N., & Zilberstein, S. (2002). The complexity of decentralized control of markov decision processes. Mathematics of Operations Research, 27, 819–840.MathSciNetCrossRefMATH Bernstein, D. S., Givan, R., Immerman, N., & Zilberstein, S. (2002). The complexity of decentralized control of markov decision processes. Mathematics of Operations Research, 27, 819–840.MathSciNetCrossRefMATH
go back to reference Billon, R., Nédélec, A., & Tisseau, J. (2008). Gesture recognition in flow based on PCA analysis using multiagent system. In ACE ’08 proceedings of the 2008 international conference on advances in computer entertainment technology (pp. 139–146). Billon, R., Nédélec, A., & Tisseau, J. (2008). Gesture recognition in flow based on PCA analysis using multiagent system. In ACE ’08 proceedings of the 2008 international conference on advances in computer entertainment technology (pp. 139–146).
go back to reference Biswas, J., & Veloso, M. (2012). Depth camera based localization and navigation for indoor mobile robots. In Proceedings of IEEE International Conference on Robotics and Automation (pp. 1697–1702). Biswas, J., & Veloso, M. (2012). Depth camera based localization and navigation for indoor mobile robots. In Proceedings of IEEE International Conference on Robotics and Automation (pp. 1697–1702).
go back to reference Boutilier, C., Dean, T., & Hanks, S. (1999). Decision-theoretic planning: Structural assumptions and computational leverage. Journal of Artificial Intelligence Research, 11, 1–94.MathSciNetMATH Boutilier, C., Dean, T., & Hanks, S. (1999). Decision-theoretic planning: Structural assumptions and computational leverage. Journal of Artificial Intelligence Research, 11, 1–94.MathSciNetMATH
go back to reference Boutilier, C. (1996). Planning, learning and coordination in multiagent decision processes. In TARK ’96 proceedings of the 6th conference on theoretical aspects of rationality and knowledge (pp. 195–210). Boutilier, C. (1996). Planning, learning and coordination in multiagent decision processes. In TARK ’96 proceedings of the 6th conference on theoretical aspects of rationality and knowledge (pp. 195–210).
go back to reference Di Paola, D., Gasparri, A., Naso, D., & Lewis, F. L. (2015). Decentralized dynamic task planning for heterogeneous robotic networks. Autonomous Agents and Multi-Agent Systems, 38, 31–48. Di Paola, D., Gasparri, A., Naso, D., & Lewis, F. L. (2015). Decentralized dynamic task planning for heterogeneous robotic networks. Autonomous Agents and Multi-Agent Systems, 38, 31–48.
go back to reference Grady, D. K., Moll, M., & Kavraki, L. E. (2015). Extending the applicability of POMDP solutions to robotic tasks. IEEE Transaction on Robotics, 31(4), 948–961.CrossRef Grady, D. K., Moll, M., & Kavraki, L. E. (2015). Extending the applicability of POMDP solutions to robotic tasks. IEEE Transaction on Robotics, 31(4), 948–961.CrossRef
go back to reference Hayat, S. A., & Niazi, M. (2005). Multi agent foraging—taking a step further Q-learning with search. International Conference on Emerging Technologies, 2005, 215–220. Hayat, S. A., & Niazi, M. (2005). Multi agent foraging—taking a step further Q-learning with search. International Conference on Emerging Technologies, 2005, 215–220.
go back to reference Jin, Z., Kunming, Y. U., Liu, W., & Jin, J. (2009). State-clusters shared cooperative multi-agent reinforcement learning. In ASCC 2009. 7th, Asian Control Conference (pp. 129–135). Jin, Z., Kunming, Y. U., Liu, W., & Jin, J. (2009). State-clusters shared cooperative multi-agent reinforcement learning. In ASCC 2009. 7th, Asian Control Conference (pp. 129–135).
go back to reference Littman, M. L., & Cassandra, A. R. (1998). Planning and acting in partially observable stochastic domains. Artificial Intelligence, 101(1–2), 99–134.MathSciNetMATH Littman, M. L., & Cassandra, A. R. (1998). Planning and acting in partially observable stochastic domains. Artificial Intelligence, 101(1–2), 99–134.MathSciNetMATH
go back to reference Liu, L., Michael, N., & Shell, D. A. (2015). Communication constrained task allocation with optimized local task swaps. Autonomous Agents and Multi-Agent Systems, 39, 429–444. Liu, L., Michael, N., & Shell, D. A. (2015). Communication constrained task allocation with optimized local task swaps. Autonomous Agents and Multi-Agent Systems, 39, 429–444.
go back to reference Montemerlo, M., & Thrun, S. (2003). FastSLAM 2.0: An improved particle filtering algorithm for simultaneous localization and mapping that provably converges. In Proceedings of international joint conference on artificial intelligence (pp. 1151–1156). Montemerlo, M., & Thrun, S. (2003). FastSLAM 2.0: An improved particle filtering algorithm for simultaneous localization and mapping that provably converges. In Proceedings of international joint conference on artificial intelligence (pp. 1151–1156).
go back to reference Ng, L., & Emami, M. R. (2014). Concurrent individual and social learning in robot teams. Computational Intelligence. Letter of notification October 13, 2014. Ng, L., & Emami, M. R. (2014). Concurrent individual and social learning in robot teams. Computational Intelligence. Letter of notification October 13, 2014.
go back to reference Pajarinen, J., & Peltonen, J. (2011). Efficient planning for factored infinite-horizon DEC-POMDPs. In IJCAI’11 Proceedings of the twenty-second international joint conference on artificial intelligence (Vol. 1, pp. 325–331). Pajarinen, J., & Peltonen, J. (2011). Efficient planning for factored infinite-horizon DEC-POMDPs. In IJCAI’11 Proceedings of the twenty-second international joint conference on artificial intelligence (Vol. 1, pp. 325–331).
go back to reference Parker, L. E. (2012). Decision making as optimization in multi-robot teams. In Proceedings of 8th international conference on distributed computing and internet technology (Vol. 7154, pp. 35-49). Parker, L. E. (2012). Decision making as optimization in multi-robot teams. In Proceedings of 8th international conference on distributed computing and internet technology (Vol. 7154, pp. 35-49).
go back to reference Pineau, J., Gordon, G., & Thrun, S. (2003). Point-based value iteration: An anytime algorithm for POMDPs. In International Joint Conference on Artificial Intelligence (pp. 1025–1032). Pineau, J., Gordon, G., & Thrun, S. (2003). Point-based value iteration: An anytime algorithm for POMDPs. In International Joint Conference on Artificial Intelligence (pp. 1025–1032).
go back to reference Rekleitis, I. M., Dudek, G., & Milios, E. (2003). Probabilistic cooperative localization and mapping in practice. In In Proceedings of IEEE international conference in robotics and automation (Vol. 2, pp. 1907–1912). Rekleitis, I. M., Dudek, G., & Milios, E. (2003). Probabilistic cooperative localization and mapping in practice. In In Proceedings of IEEE international conference in robotics and automation (Vol. 2, pp. 1907–1912).
go back to reference Roy, N., Gordon, G., & Thrun, S. (2005). Finding approximate POMDP solutions through belief compression. Journal of Artificial Intelligence Research, 23, 1–40.CrossRefMATH Roy, N., Gordon, G., & Thrun, S. (2005). Finding approximate POMDP solutions through belief compression. Journal of Artificial Intelligence Research, 23, 1–40.CrossRefMATH
go back to reference Sonu, E., & Doshi, P. (2015). Scalable solutions of interactive POMDPs using generalized and bounded policy iteration. Autonomous Agents and Multi-Agent Systems, 29, 455–494.CrossRef Sonu, E., & Doshi, P. (2015). Scalable solutions of interactive POMDPs using generalized and bounded policy iteration. Autonomous Agents and Multi-Agent Systems, 29, 455–494.CrossRef
go back to reference Tamimi, H., & Zell, A. (2004). Global visual localization of mobile robots using kernel principal component analysis. In Proceedings, 2004 IEEE/RSJ international conference on intelligent robots and systems (Vol. 2, pp. 1896–1901). Tamimi, H., & Zell, A. (2004). Global visual localization of mobile robots using kernel principal component analysis. In Proceedings, 2004 IEEE/RSJ international conference on intelligent robots and systems (Vol. 2, pp. 1896–1901).
go back to reference Watkins, C., & Dyan, P. (1992). Q-learning. Machine Learning, 8(3/4), 279–292.CrossRef Watkins, C., & Dyan, P. (1992). Q-learning. Machine Learning, 8(3/4), 279–292.CrossRef
go back to reference Welch, B. L. (1947). The generalization of “Student’s” problem when several different population variances are involved. Biometrika, 34(1–2), 28–35.MathSciNetMATH Welch, B. L. (1947). The generalization of “Student’s” problem when several different population variances are involved. Biometrika, 34(1–2), 28–35.MathSciNetMATH
Metadata
Title
A robust approach to robot team learning
Authors
Justin Girard
M. Reza Emami
Publication date
28-11-2015
Publisher
Springer US
Published in
Autonomous Robots / Issue 8/2016
Print ISSN: 0929-5593
Electronic ISSN: 1573-7527
DOI
https://doi.org/10.1007/s10514-015-9524-2

Other articles of this Issue 8/2016

Autonomous Robots 8/2016 Go to the issue