2018 | Original Paper | Book Chapter

R2P2: A ReparameteRized Pushforward Policy for Diverse, Precise Generative Path Forecasting

Authors: Nicholas Rhinehart, Kris M. Kitani, Paul Vernaza

Published in: Computer Vision – ECCV 2018

Publisher: Springer International Publishing

Abstract

We propose a method to forecast a vehicle’s ego-motion as a distribution over spatiotemporal paths, conditioned on features (e.g., from LIDAR and images) embedded in an overhead map. The method learns a policy inducing a distribution over simulated trajectories that is both “diverse” (produces most of the likely paths) and “precise” (mostly produces likely paths). This balance is achieved through minimization of a symmetrized cross-entropy between the distribution and demonstration data. By viewing the simulated-outcome distribution as the pushforward of a simple distribution under a simulation operator, we obtain expressions for the cross-entropy metrics that can be efficiently evaluated and differentiated, enabling stochastic-gradient optimization. We propose concrete policy architectures for this model, discuss our evaluation metrics relative to previously used degenerate metrics, and demonstrate the superiority of our method relative to state-of-the-art methods on both the KITTI dataset and a similar but novel and larger real-world dataset explicitly designed for the vehicle forecasting domain.
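To make the pushforward idea in the abstract concrete, here is a minimal one-dimensional sketch. It is not the authors' implementation: the affine map `push_forward` is a hypothetical stand-in for the simulation operator, the standard normal is an assumed "simple" base distribution, and `log_p_tilde`, `beta`, and all function names are illustrative assumptions. The sketch shows how the change-of-variables formula yields an exact log-density for simulated outcomes, so that both directions of the cross-entropy can be estimated from samples.

```python
import math
import random


def base_log_prob(z):
    """Log-density of the assumed base distribution q0 (standard normal)."""
    return -0.5 * (z * z + math.log(2.0 * math.pi))


def push_forward(z, mu, sigma):
    """Hypothetical 1-D 'simulator' g: maps base noise z to an outcome x."""
    return mu + sigma * z


def pushforward_log_prob(x, mu, sigma):
    """log q(x) by change of variables: log q0(g^{-1}(x)) - log|dg/dz|."""
    z = (x - mu) / sigma  # invert the affine map
    return base_log_prob(z) - math.log(abs(sigma))


def forward_cross_entropy(demos, mu, sigma):
    """Estimate H(p, q) = -E_{x~p}[log q(x)] from demonstration samples."""
    return -sum(pushforward_log_prob(x, mu, sigma) for x in demos) / len(demos)


def reverse_cross_entropy(log_p_tilde, mu, sigma, n_samples, rng):
    """Estimate H(q, p~) = -E_{z~q0}[log p~(g(z))] by sampling base noise.

    log_p_tilde is an assumed, user-supplied approximation of the data
    log-density, since the true data density is not available.
    """
    total = 0.0
    for _ in range(n_samples):
        z = rng.gauss(0.0, 1.0)
        total += log_p_tilde(push_forward(z, mu, sigma))
    return -total / n_samples


def symmetrized_cross_entropy(demos, log_p_tilde, mu, sigma, beta, n_samples, rng):
    """Forward term (rewards coverage) plus beta times reverse term (rewards precision)."""
    forward = forward_cross_entropy(demos, mu, sigma)
    reverse = reverse_cross_entropy(log_p_tilde, mu, sigma, n_samples, rng)
    return forward + beta * reverse


if __name__ == "__main__":
    rng = random.Random(0)
    demos = [rng.gauss(1.0, 2.0) for _ in range(1000)]  # synthetic "demonstrations"
    loss = symmetrized_cross_entropy(
        demos, lambda x: -abs(x), mu=1.0, sigma=2.0, beta=0.5, n_samples=1000, rng=rng
    )
    print("symmetrized cross-entropy estimate:", loss)
```

Because the reverse term draws its samples from the base distribution and passes them through the map, the estimate stays differentiable with respect to `mu` and `sigma` when written in an automatic-differentiation framework, which is the property that makes stochastic-gradient optimization possible.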

Metadata
Title
R2P2: A ReparameteRized Pushforward Policy for Diverse, Precise Generative Path Forecasting
Authors
Nicholas Rhinehart
Kris M. Kitani
Paul Vernaza
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-030-01261-8_47