2016 | OriginalPaper | Chapter

Deep Active Learning for Autonomous Navigation

Authors: Ahmed Hussein, Mohamed Medhat Gaber, Eyad Elyan

Published in: Engineering Applications of Neural Networks

Publisher: Springer International Publishing

Abstract

Imitation learning refers to an agent's ability to mimic a desired behavior by learning from observations. A major challenge in learning from demonstrations is representing the demonstrations in a manner that is adequate for learning and efficient for real-time decisions. Creating feature representations is especially challenging when they must be extracted from high-dimensional visual data. In this paper, we present a method for imitation learning from raw visual data. The proposed method is applied to a popular imitation-learning domain that is relevant to a variety of real-life applications, namely navigation. To create a training set, a teacher uses an optimal policy to perform a navigation task, and the actions taken are recorded along with visual footage from a first-person perspective. Features are automatically extracted and used to learn a policy that mimics the teacher via a deep convolutional neural network. A trained agent can then predict an action to perform based on the scene it finds itself in. The method is generic, and the network is trained without knowledge of the task, the targets, or the environment in which it acts. Another common challenge in imitation learning is generalizing a policy to situations unseen in the training data. To address this challenge, the learned policy is subsequently improved through active learning: while executing a task, the agent can query the teacher for the correct action in situations where it has low confidence. The active samples are added to the training set and used to update the initial policy. The proposed approach is demonstrated on four different tasks in a 3D simulated environment. The experiments show that an agent can effectively perform imitation learning from raw visual data for navigation tasks, and that active learning can significantly improve the initial policy using a small number of samples. The simulated testbed facilitates reproduction of these results and comparison with other approaches.
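The abstract describes a two-stage pipeline: a convolutional network is first trained to map raw first-person frames to the teacher's actions, and the resulting policy is then refined by querying the teacher whenever the network's prediction confidence falls below a threshold. The following is a minimal sketch of that confidence-gated query loop, written in PyTorch for concreteness; the network shape, the 0.8 threshold, and the names `PolicyNet`, `act_or_query`, `update_policy`, and `teacher` are all illustrative assumptions, not the authors' published implementation.

```python
# Sketch of the confidence-gated active-learning loop from the abstract.
# All names and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PolicyNet(nn.Module):
    """Small CNN mapping one 84x84 grayscale first-person frame
    (shape (1, 1, 84, 84)) to logits over discrete actions."""
    def __init__(self, n_actions: int):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=4, stride=2), nn.ReLU(),
            nn.Flatten(),
        )
        self.head = nn.Linear(32 * 9 * 9, n_actions)  # 84x84 -> 20x20 -> 9x9

    def forward(self, frame):
        return self.head(self.features(frame))  # action logits

def act_or_query(policy, frame, teacher, buffer, threshold=0.8):
    """Act autonomously when confident; otherwise query the teacher
    and keep the labelled frame as an active sample."""
    with torch.no_grad():
        probs = F.softmax(policy(frame), dim=1)
        conf, action = probs.max(dim=1)
    if conf.item() < threshold:          # low confidence: ask the teacher
        action = teacher(frame)          # hypothetical oracle, returns an int
        buffer.append((frame, action))   # active sample for retraining
        return action
    return action.item()

def update_policy(policy, buffer, epochs=5, lr=1e-4):
    """Fine-tune the initial policy on the buffered active samples."""
    opt = torch.optim.Adam(policy.parameters(), lr=lr)
    for _ in range(epochs):
        for frame, action in buffer:
            loss = F.cross_entropy(policy(frame), torch.tensor([action]))
            opt.zero_grad()
            loss.backward()
            opt.step()
```

Per the abstract, the queried samples are accumulated and used to update the initial policy on the enlarged training set; the sketch mirrors that by buffering (frame, action) pairs and fine-tuning with a standard cross-entropy loss.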

Metadata
Title
Deep Active Learning for Autonomous Navigation
Authors
Ahmed Hussein
Mohamed Medhat Gaber
Eyad Elyan
Copyright Year
2016
DOI
https://doi.org/10.1007/978-3-319-44188-7_1
