Published in: Neural Computing and Applications 14/2020

23-10-2019 | Original Article

Efficient human motion recovery using bidirectional attention network

Authors: Qiongjie Cui, Huaijiang Sun, Yupeng Li, Yue Kong

Abstract

Human motion capture (mocap) data, which record the movement of markers attached to specific joints, have gradually become the most popular solution for animation production. However, raw motion data are often corrupted by joint occlusion, marker shedding, and limited equipment precision, which severely limits performance in real-world applications. Since human motion is essentially sequential data, recent methods resort to variants of the long short-term memory (LSTM) network to solve related problems, but most of them tend to produce visually unreasonable results. This is mainly because these methods struggle to capture long-term dependencies and cannot explicitly utilize relevant context. To address these issues, we propose a deep bidirectional attention network that not only captures long-term dependencies but also adaptively extracts relevant information at each time step. Moreover, the proposed model, which embeds an attention mechanism in the bidirectional LSTM structure at both the encoding and decoding stages, can decide where to borrow information and use it to recover the corrupted frame effectively. Extensive experiments on the CMU database demonstrate that the proposed model consistently outperforms other state-of-the-art methods in terms of recovery accuracy and visualization.
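
The idea of "deciding where to borrow information" can be illustrated with a small sketch. This is not the authors' architecture: it assumes scaled dot-product attention (a swapped-in, simpler scoring function) and uses random arrays as stand-ins for the forward and backward LSTM outputs. The names `attend`, `forward_states`, and `backward_states` are hypothetical.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attend(query, states):
    """Scaled dot-product attention over time steps (illustrative assumption).

    query:  (d,)   representation of the frame to recover
    states: (T, d) per-frame representations to borrow from
    Returns the context vector (d,) and the attention weights (T,).
    """
    scores = states @ query / np.sqrt(query.shape[0])
    weights = softmax(scores)      # non-negative, sums to 1 over time
    context = weights @ states     # weighted mix of all time steps
    return context, weights

rng = np.random.default_rng(0)
T, d = 6, 4                                  # frames, hidden size per direction
forward_states = rng.normal(size=(T, d))     # stand-in for forward LSTM outputs
backward_states = rng.normal(size=(T, d))    # stand-in for backward LSTM outputs

# A bidirectional representation concatenates both directions per frame.
states = np.concatenate([forward_states, backward_states], axis=1)  # (T, 2d)

t = 2                                        # index of a hypothetical corrupted frame
context, weights = attend(states[t], states)
```

Here `weights` indicates how much each frame contributes to the context used for frame `t`; in a trained model, such a context vector would feed a decoder that outputs the recovered joint positions.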

Metadata
Title
Efficient human motion recovery using bidirectional attention network
Authors
Qiongjie Cui
Huaijiang Sun
Yupeng Li
Yue Kong
Publication date
23-10-2019
Publisher
Springer London
Published in
Neural Computing and Applications / Issue 14/2020
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI
https://doi.org/10.1007/s00521-019-04543-9
