Published in: Autonomous Robots 4/2017

04.08.2016

A confidence-based roadmap using Gaussian process regression

Authors: Yuya Okadome, Yutaka Nakamura, Hiroshi Ishiguro

Published in: Autonomous Robots | Issue 4/2017


Abstract

Recent advances in high-performance computing have allowed sampling-based motion planning methods to be applied successfully to practical robot control problems. In such methods, a graph representing the local connectivity among states is constructed from a mathematical model of the controlled target, and the motion is planned over this graph. However, it is difficult to obtain an appropriate mathematical model in advance when the behavior of the robot is affected by unanticipated factors. It is therefore crucial to be able to build the model from motion data gathered by monitoring the robot in operation. When these data are sparse, however, uncertainty is introduced into the model. To deal with this uncertainty, we propose a motion planning method that uses Gaussian process regression as the mathematical model. Experimental results show that satisfactory robot motion can be achieved with limited data.
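To make the abstract's central mechanism concrete, the sketch below shows Gaussian process regression used as a learned state-transition model whose posterior variance serves as a confidence signal when weighting roadmap edges. This is a minimal illustration, not the authors' implementation: the kernel choice, hyperparameters, toy data, and the edge-cost rule at the end are all assumptions made here for demonstration.

import numpy as np

def rbf_kernel(A, B, length_scale=1.0, signal_var=1.0):
    # Squared-exponential kernel between the rows of A and the rows of B.
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return signal_var * np.exp(-0.5 * sq / length_scale**2)

class GPTransitionModel:
    # GP regression from (state, action) inputs X to next-state outputs Y.
    def __init__(self, X, Y, noise_var=1e-2):
        self.X = X
        K = rbf_kernel(X, X) + noise_var * np.eye(len(X))
        self.L = np.linalg.cholesky(K)                        # K = L L^T
        self.alpha = np.linalg.solve(self.L.T, np.linalg.solve(self.L, Y))

    def predict(self, Xq):
        # Posterior mean and per-query variance; high variance = low confidence.
        Ks = rbf_kernel(Xq, self.X)
        mean = Ks @ self.alpha
        v = np.linalg.solve(self.L, Ks.T)
        var = rbf_kernel(Xq, Xq).diagonal() - np.sum(v**2, axis=0)
        return mean, var

# Toy usage: with sparse motion data, queries far from the data get high
# variance, which can be folded into a roadmap edge cost so that planning
# avoids transitions the model is unsure about.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(30, 3))                    # (state, action) samples
Y = np.sin(X[:, :1]) + 0.1 * rng.standard_normal((30, 1))   # observed next states
gp = GPTransitionModel(X, Y)
mean, var = gp.predict(np.array([[0.2, 0.0, 0.5]]))
edge_cost = 1.0 + 5.0 * var[0]   # hypothetical confidence-weighted edge cost

Penalizing edges by predictive variance is one simple way to bias the planner toward regions where the learned model is trustworthy.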


Footnotes
1
Under our assumption, the accumulated cost becomes \(-1\) for any control policy if \(\gamma = 1\), since the system eventually reaches the goal with probability 1. When the purpose is to obtain the optimal control policy that drives the system to the goal in the fewest time steps, the discount factor \(\gamma\) is therefore usually set to less than 1.
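For reference, the discounted accumulated cost referred to here has the standard form \(J^{\pi} = \mathbb{E}\left[\sum_{t=0}^{\infty} \gamma^{t} c(s_{t}) \mid \pi \right]\) with \(0 < \gamma \le 1\), where the per-step cost \(c(s_{t})\) and the policy \(\pi\) are generic placeholders rather than notation from this paper. With a one-time cost of \(-1\) received upon reaching the goal, a policy arriving at time step \(T\) accumulates \(-\gamma^{T}\); for \(\gamma < 1\), smaller \(T\) yields a strictly lower cost, so minimization favors the fastest path, whereas at \(\gamma = 1\) every policy accumulates exactly \(-1\).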
 
2
This RMSE is calculated as \(\frac{1}{|\mathcal{M}^{p}|} \sum_{t=1}^{|\mathcal{M}^{p}|} \sqrt{({\boldsymbol{s}}^{p}(t)-{\boldsymbol{s}}(t))^{\top}({\boldsymbol{s}}^{p}(t)-{\boldsymbol{s}}(t))}\).
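As a minimal NumPy sketch of this measure (the mean, over a planned trajectory, of the per-step Euclidean distance between planned and observed states; the function and argument names are hypothetical):

import numpy as np

def trajectory_rmse(s_planned, s_observed):
    # Both arrays have shape (T, state_dim); row t holds the state at step t.
    per_step_error = np.linalg.norm(s_planned - s_observed, axis=1)
    return per_step_error.mean()   # average Euclidean error over the trajectory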
 
3
The control uncertainty is discussed in van den Berg et al. (2011); it corresponds to the uncertainty in the state transition model or inverse dynamics model.
 
References
Asmuth, J., & Littman, M. L. (2011). Learning is planning: Near Bayes-optimal reinforcement learning via Monte-Carlo tree search. In Uncertainty in Artificial Intelligence.
van den Berg, J., et al. (2011). LQG-MP: Optimized path planning for robots with motion uncertainty and imperfect state information. The International Journal of Robotics Research, 30(7), 895–913.
Bishop, C. M. (2006). Pattern recognition and machine learning. New York: Springer.
Bry, A., & Roy, N. (2011). Rapidly-exploring Random Belief Trees for motion planning under uncertainty. In 2011 IEEE International Conference on Robotics and Automation (ICRA) (pp. 723–730).
Choset, H. M. (2005). Principles of robot motion: Theory, algorithms, and implementations. Cambridge: MIT Press.
Chrisman, L. (1992). Reinforcement learning with perceptual aliasing: The perceptual distinctions approach. In AAAI (pp. 183–188).
Dalibard, S., et al. (2013). Dynamic walking and whole-body motion planning for humanoid robots: An integrated approach. The International Journal of Robotics Research, 32(9–10), 1089–1103.
Dechter, R., & Pearl, J. (1985). Generalized best-first search strategies and the optimality of A*. Journal of the ACM, 32(3), 505–536.
Deisenroth, M. P., & Rasmussen, C. E. (2011). PILCO: A model-based and data-efficient approach to policy search. In Proceedings of the International Conference on Machine Learning.
Doya, K. (2000). Reinforcement learning in continuous time and space. Neural Computation, 12(1), 219–245.
Foster, L., et al. (2009). Stable and efficient Gaussian process calculations. Journal of Machine Learning Research, 10, 857–882.
Grondman, I., et al. (2012). Efficient model learning methods for actor-critic control. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 42(3), 591–602.
Hinton, G. E. (2002). Training products of experts by minimizing contrastive divergence. Neural Computation, 14, 1771–1800.
Ijspeert, A. J. (2008). Central pattern generators for locomotion control in animals and robots: A review. Neural Networks, 21(4), 642–653.
Indyk, P., & Motwani, R. (1998). Approximate nearest neighbors: Towards removing the curse of dimensionality. In Proceedings of the Thirtieth Annual ACM Symposium on Theory of Computing, STOC '98 (pp. 604–613). New York: ACM.
Kavraki, L. E., et al. (1996). Probabilistic roadmaps for path planning in high-dimensional configuration spaces. IEEE Transactions on Robotics and Automation, 12(4), 566–580.
Ko, J., & Fox, D. (2009). GP-BayesFilters: Bayesian filtering using Gaussian process prediction and observation models. Autonomous Robots, 27, 75–90.
Kuwata, Y., et al. (2009). Real-time motion planning with applications to autonomous urban driving. IEEE Transactions on Control Systems Technology, 17(5), 1105–1118.
LaValle, S. M. (2006). Planning algorithms. Cambridge: Cambridge University Press.
LaValle, S. M., & Kuffner, J. J. (2001). Randomized kinodynamic planning. The International Journal of Robotics Research, 20(5), 378–400.
Lawrence, N. D. (2004). Gaussian process latent variable models for visualisation of high dimensional data. In Advances in Neural Information Processing Systems, 16, 329–336.
Marco, A., et al. (2016). Automatic LQR tuning based on Gaussian process global optimization. In 2016 IEEE International Conference on Robotics and Automation (ICRA) (pp. 270–277).
Mukadam, M., et al. (2016). Gaussian process motion planning. In 2016 IEEE International Conference on Robotics and Automation (ICRA) (pp. 9–15).
Okadome, Y., et al. (2013). Fast approximation method for Gaussian process regression using hash function for non-uniformly distributed data. In Artificial Neural Networks and Machine Learning (pp. 17–25).
Okadome, Y., et al. (2014). Confidence-based roadmap using Gaussian process regression for a robot control. In 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2014) (pp. 661–666).
Peters, J., & Schaal, S. (2008). Reinforcement learning of motor skills with policy gradients. Neural Networks, 21(4), 682–697.
Prentice, S., & Roy, N. (2009). The belief roadmap: Efficient planning in belief space by factoring the covariance. The International Journal of Robotics Research, 28(11–12), 1448–1465.
Rasmussen, C. E., & Williams, C. K. I. (2005). Gaussian processes for machine learning. Cambridge: MIT Press.
Ross, S., et al. (2011). A Bayesian approach for learning and planning in partially observable Markov decision processes. The Journal of Machine Learning Research, 12, 1729–1770.
Spaan, M. T. J., & Vlassis, N. (2004). A point-based POMDP algorithm for robot planning. In IEEE International Conference on Robotics and Automation (Vol. 3, pp. 2399–2404). IEEE.
Sutton, R. S., et al. (2000). Policy gradient methods for reinforcement learning with function approximation. In Advances in Neural Information Processing Systems (Vol. 12, pp. 1057–1063).
Theodorou, E., et al. (2010). Reinforcement learning of motor skills in high dimensions: A path integral approach. In 2010 IEEE International Conference on Robotics and Automation (ICRA) (pp. 2397–2403).
Thrun, S. B. (1992). Efficient exploration in reinforcement learning. Technical report, Carnegie Mellon University, Pittsburgh, PA.
Todorov, E., & Li, W. (2005). A generalized iterative LQG method for locally-optimal feedback control of constrained nonlinear stochastic systems. In Proceedings of the 2005 American Control Conference (Vol. 1, pp. 300–306).
Wang, J. M., et al. (2008). Gaussian process dynamical models for human motion. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30(2), 283–298.
Metadata
Title
A confidence-based roadmap using Gaussian process regression
Authors
Yuya Okadome
Yutaka Nakamura
Hiroshi Ishiguro
Publication date
04.08.2016
Publisher
Springer US
Published in
Autonomous Robots / Issue 4/2017
Print ISSN: 0929-5593
Electronic ISSN: 1573-7527
DOI
https://doi.org/10.1007/s10514-016-9604-y
