Improving Target-driven Visual Navigation with Attention on 3D Spatial Relationships

Authors: Yunlian Lyu, Yimin Shi, Xianggang Zhang

Published in: Neural Processing Letters | Issue 5/2022

Published: 23-03-2022


Abstract

Embodied Artificial Intelligence has become popular in recent years, shifting the field's focus from static internet images to active settings in which an embodied agent perceives and acts within 3D environments. In this paper, we study Target-driven Visual Navigation (TDVN) in 3D indoor scenes using deep reinforcement learning. Generalization is a long-standing challenge in TDVN: the agent is expected to transfer knowledge learned in training domains to unseen domains. To address this issue, we propose a model that combines visual features with relational graph features to learn the navigation policy. Graph convolutional networks are used to obtain the graph features, which encode spatial relations between objects. We also adopt a Target Skill Extension module that generates sub-targets, allowing the agent to learn from its failures. For evaluation, we perform experiments in the AI2-THOR environment. Experimental results show that our proposed model outperforms the baselines under various metrics.
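A minimal sketch can make the graph branch concrete. The Python code below applies a single graph-convolution layer in the style of Kipf and Welling's GCN, pools the resulting node features into one graph feature, and concatenates it with a visual embedding to form the policy input. All sizes here (20 objects, 32-d node features, a 512-d visual vector) and the random inputs are illustrative assumptions, not the authors' implementation.

import numpy as np

def gcn_layer(adj, feats, weight):
    """One GCN layer: H' = ReLU(D^{-1/2} (A + I) D^{-1/2} H W)."""
    a_hat = adj + np.eye(adj.shape[0])               # add self-loops
    deg_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))  # diagonal of D^{-1/2}
    norm_adj = a_hat * deg_inv_sqrt[:, None] * deg_inv_sqrt[None, :]
    return np.maximum(norm_adj @ feats @ weight, 0.0)  # ReLU activation

rng = np.random.default_rng(0)
num_objects, node_dim = 20, 32                       # hypothetical sizes
adj = (rng.random((num_objects, num_objects)) > 0.7).astype(float)
adj = np.maximum(adj, adj.T)                         # symmetric spatial-relation graph
node_feats = rng.standard_normal((num_objects, node_dim))
w = 0.1 * rng.standard_normal((node_dim, node_dim))

graph_feat = gcn_layer(adj, node_feats, w).mean(axis=0)   # pool nodes into one vector
visual_feat = rng.standard_normal(512)                    # stand-in for a CNN image embedding
policy_input = np.concatenate([visual_feat, graph_feat])  # joint input to the navigation policy
print(policy_input.shape)                                 # (544,)

In practice, the adjacency matrix would encode the observed 3D spatial relationships between detected objects and the visual embedding would come from a pretrained CNN; the concatenated vector then feeds the reinforcement-learning policy.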


Metadata
Title: Improving Target-driven Visual Navigation with Attention on 3D Spatial Relationships
Authors: Yunlian Lyu, Yimin Shi, Xianggang Zhang
Publication date: 23-03-2022
Publisher: Springer US
Published in: Neural Processing Letters, Issue 5/2022
Print ISSN: 1370-4621
Electronic ISSN: 1573-773X
DOI: https://doi.org/10.1007/s11063-022-10796-8
