2019 | Original Paper | Book Chapter

Design of Transfer Reinforcement Learning Mechanisms for Autonomous Collision Avoidance

Authors: Xiongqing Liu, Yan Jin

Published in: Design Computing and Cognition '18

Publisher: Springer International Publishing

Abstract

It is often hard for a reinforcement learning (RL) agent to use previous experience to solve new tasks that are similar but more complex. In this research, we combine transfer learning with reinforcement learning and investigate how the hyperparameters of both affect learning effectiveness and task performance in the context of autonomous robotic collision avoidance. A deep reinforcement learning algorithm was first implemented for a robot to learn, from its experience, how to avoid randomly generated single obstacles. The effect of transferring previously learned experience was then studied by introducing two important concepts: transfer belief (how much a robot should believe in its previous experience) and transfer period (how long the previous experience should be applied in the new context). The proposed approach has been tested on collision avoidance problems by altering the transfer period. The results show that transfer learning on average had a ~50% speed increase at ~30% competence levels, and that there exists an optimal transfer period at which the variance is lowest and the learning speed is fastest.
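
The abstract does not state how transfer belief and transfer period enter the learning loop. As a minimal sketch of one possible reading, the snippet below treats transfer period as a number of episodes and transfer belief as the probability of following the previously learned (source) Q-values during that period. The function name, its parameters, and the mixing rule itself are illustrative assumptions, not the authors' method.

```python
import random


def select_action(q_target, q_source, episode, transfer_period,
                  transfer_belief, epsilon, rng=random):
    """Pick an action index from two lists of Q-value estimates.

    Hypothetical rule (an assumption, not taken from the chapter):
    during the first `transfer_period` episodes, follow the source
    task's greedy action with probability `transfer_belief`;
    otherwise act epsilon-greedily on the new task's estimates.
    """
    n_actions = len(q_target)

    # Inside the transfer period, trust previous experience
    # with probability `transfer_belief`.
    if episode < transfer_period and rng.random() < transfer_belief:
        return max(range(n_actions), key=lambda a: q_source[a])

    # Ordinary epsilon-greedy exploration on the new task.
    if rng.random() < epsilon:
        return rng.randrange(n_actions)
    return max(range(n_actions), key=lambda a: q_target[a])
```

Under this reading, a transfer belief of 1.0 makes the agent copy the old policy for the whole transfer period, while a transfer period of 0 reduces the rule to plain epsilon-greedy learning on the new task.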

Metadata
Title
Design of Transfer Reinforcement Learning Mechanisms for Autonomous Collision Avoidance
Authors
Xiongqing Liu
Yan Jin
Copyright year
2019
DOI
https://doi.org/10.1007/978-3-030-05363-5_17
