2017 | OriginalPaper | Book chapter

Experiments with Hierarchical Reinforcement Learning of Multiple Grasping Policies

Written by: Takayuki Osa, Jan Peters, Gerhard Neumann

Published in: 2016 International Symposium on Experimental Robotics

Publisher: Springer International Publishing

Abstract

Robotic grasping has attracted considerable interest, but it remains a challenging task. The data-driven approach is a promising solution to the robotic grasping problem: it leverages a grasp dataset and generalizes grasps to various objects. However, data-driven methods often depend on the quality of the given datasets, and datasets of sufficient quality are not trivial to obtain. Although reinforcement learning approaches have recently been used to collect grasp datasets autonomously, the existing algorithms are often limited to specific grasp types. In this paper, we present a framework for hierarchical reinforcement learning of grasping policies. In our framework, the lower level of the hierarchy learns multiple grasp types, and the upper level learns a policy that selects among the learned grasp types according to a point cloud of a new object. Through experiments, we validate that our approach learns grasping by constructing the grasp dataset autonomously. The experimental results show that our approach learns multiple grasping policies and generalizes the learned grasps by using local point cloud information.
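
The two-level structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the feature function `cloud_feature`, the grasp type names `power` and `pinch`, the kernel-weighted reward estimate, and all numeric values are assumptions chosen only to show how an upper level might pick among grasp types whose lower-level datasets were collected by trial and error.

```python
import numpy as np

def cloud_feature(points):
    """Hypothetical feature: per-axis standard deviation of an (N, 3) point cloud."""
    return np.asarray(points, dtype=float).std(axis=0)

def box_cloud(sx, sy, sz):
    """Toy point cloud: the 8 corners of an axis-aligned box centered at the origin."""
    half = np.array([sx, sy, sz]) / 2.0
    signs = np.array([[i, j, k] for i in (-1, 1) for j in (-1, 1) for k in (-1, 1)])
    return signs * half

class GraspType:
    """Lower level (assumed form): one grasp type with its own trial dataset."""
    def __init__(self, name):
        self.name = name
        self.features = []  # features of objects on which this grasp was tried
        self.rewards = []   # reward observed for each trial

    def add_trial(self, feature, reward):
        self.features.append(np.asarray(feature, dtype=float))
        self.rewards.append(float(reward))

    def predict_reward(self, feature, bandwidth=0.05):
        """Kernel-weighted average of past rewards (Nadaraya-Watson estimate)."""
        if not self.rewards:
            return 0.0
        diffs = np.stack(self.features) - feature
        weights = np.exp(-np.sum(diffs ** 2, axis=1) / (2.0 * bandwidth ** 2))
        total = weights.sum()
        if total < 1e-12:
            return 0.0
        return float(weights @ np.array(self.rewards) / total)

def select_grasp_type(grasp_types, points):
    """Upper level (assumed form): pick the type with the highest predicted reward."""
    feature = cloud_feature(points)
    scores = [g.predict_reward(feature) for g in grasp_types]
    return grasp_types[int(np.argmax(scores))], scores

# Illustrative data only: rewards are made up, not experimental results.
flat, tall = box_cloud(0.2, 0.2, 0.02), box_cloud(0.04, 0.04, 0.2)
power, pinch = GraspType("power"), GraspType("pinch")
power.add_trial(cloud_feature(tall), 1.0)
power.add_trial(cloud_feature(flat), 0.1)
pinch.add_trial(cloud_feature(flat), 0.9)
pinch.add_trial(cloud_feature(tall), 0.2)

chosen_flat, _ = select_grasp_type([power, pinch], box_cloud(0.18, 0.2, 0.02))
chosen_tall, _ = select_grasp_type([power, pinch], box_cloud(0.05, 0.04, 0.2))
print(chosen_flat.name, chosen_tall.name)
```

In this sketch each grasp type generalizes only through the similarity of point cloud features, so a new object is assigned the grasp type that worked on similar shapes before. A reinforcement learning method would additionally need an exploration strategy, which is omitted here.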

Metadata
Title
Experiments with Hierarchical Reinforcement Learning of Multiple Grasping Policies
Written by
Takayuki Osa
Jan Peters
Gerhard Neumann
Copyright year
2017
DOI
https://doi.org/10.1007/978-3-319-50115-4_15
