Multi-level Motion Attention for Human Motion Prediction

Authors: Wei Mao, Miaomiao Liu, Mathieu Salzmann, Hongdong Li

Published in: International Journal of Computer Vision | Issue 9/2021

Publication date: 16-06-2021

Abstract

Human motion prediction aims to forecast future human poses given a historical motion. Whether based on recurrent or feed-forward neural networks, existing learning-based methods fail to model the observation that human motion tends to repeat itself, even for complex sports actions and cooking activities. Here, we introduce an attention-based feed-forward network that explicitly leverages this observation. In particular, instead of modeling frame-wise attention via pose similarity, we propose to extract motion attention to capture the similarity between the current motion context and the historical motion sub-sequences. In this context, we study the use of different types of attention, computed at the joint, body part, and full pose levels. Aggregating the relevant past motions and processing the result with a graph convolutional network allows us to effectively exploit motion patterns from the long-term history to predict the future poses. Our experiments on Human3.6M, AMASS and 3DPW validate the benefits of our approach for both periodic and non-periodic actions. Thanks to our attention model, our approach yields state-of-the-art results on all three datasets. Our code is available at https://github.com/wei-mao-2019/HisRepItself.
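
The idea of attending over motion sub-sequences rather than single frames can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository linked above for that). It assumes raw pose coordinates as keys and queries, a softmax over negative squared distances as the similarity, and no learned mappings or graph convolutional network. The function name `motion_attention` and the parameters `context_len`, `future_len`, and `dims` are hypothetical, introduced only for illustration. Restricting `dims` to a subset of coordinates loosely mimics attention at the joint or body part level, while the default uses the full pose.

```python
import math


def flatten(frames, dims):
    """Concatenate the selected coordinates of consecutive frames into one vector."""
    return [frame[d] for frame in frames for d in dims]


def motion_attention(history, context_len, future_len, dims=None):
    """Aggregate past sub-sequences weighted by similarity to the latest motion.

    history: list of poses, each a list of floats (one entry per coordinate).
    dims: coordinate indices to attend over (hypothetical parameter); None means
          the full pose, a smaller set mimics a body-part or joint level.
    Returns (weights, aggregated); aggregated has context_len + future_len frames.
    """
    if dims is None:
        dims = list(range(len(history[0])))
    span = context_len + future_len
    n_sub = len(history) - span + 1
    if n_sub < 1:
        raise ValueError("history is too short for the requested lengths")

    # Query: the most recent context_len frames (the current motion context).
    query = flatten(history[-context_len:], dims)

    # Keys: the first context_len frames of every past sub-sequence.
    # Score: negative squared distance (an assumption, not a learned similarity).
    scores = []
    for start in range(n_sub):
        key = flatten(history[start:start + context_len], dims)
        scores.append(-sum((q - k) ** 2 for q, k in zip(query, key)))

    # Softmax over sub-sequences, shifted by the maximum for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Values: each full sub-sequence (context plus the frames that followed it).
    aggregated = [[0.0] * len(dims) for _ in range(span)]
    for start, w in enumerate(weights):
        for t in range(span):
            for j, d in enumerate(dims):
                aggregated[t][j] += w * history[start + t][d]
    return weights, aggregated
```

For a periodic history, the weights concentrate on the sub-sequences that are in phase with the latest frames, so the last `future_len` aggregated frames approximate how the motion continued in the past. In the approach described above, such an aggregate is further processed by a graph convolutional network to produce the prediction, a step this sketch leaves out.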

Described at https://github.com/nghorbani/amass

Available at https://amass.is.tue.mpg.de/dataset.

Title: Multi-level Motion Attention for Human Motion Prediction
Authors: Wei Mao, Miaomiao Liu, Mathieu Salzmann, Hongdong Li
Publication date: 16-06-2021
Publisher: Springer US
Published in: International Journal of Computer Vision / Issue 9/2021
Print ISSN: 0920-5691
Electronic ISSN: 1573-1405
DOI: https://doi.org/10.1007/s11263-021-01483-7
