Published in: International Journal of Social Robotics 2/2019

30.10.2018

Skeleton-Based Human Action Recognition by Pose Specificity and Weighted Voting

Authors: Tingting Liu, Jiaole Wang, Seth Hutchinson, Max Q.-H. Meng


Abstract

This paper introduces a human action recognition method based on skeletal data captured by Kinect or other depth sensors. After a series of pre-processing steps, action features such as position, velocity, and acceleration are extracted from each frame to capture both the dynamic and the static information of human motion, making full use of the skeletal data. The most challenging problem in skeleton-based human action recognition is the large variability within and across subjects. To handle this problem, we propose dividing human poses into two major categories: discriminating poses and common poses. A pose specificity metric is proposed to quantify how discriminative a pose is. Finally, action recognition is performed by a weighted voting method, which finds the k nearest neighbors in the training dataset and uses pose specificity as the weight of each ballot. Experiments on two benchmark datasets show that the proposed method outperforms state-of-the-art methods.
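The per-frame features the abstract describes (position, plus velocity and acceleration obtained from it) can be sketched with finite differences. This is a minimal illustration of that general recipe, not the paper's exact pre-processing; the array shapes and the use of `np.gradient` are assumptions.

```python
import numpy as np

def frame_features(joints):
    """joints: (T, J, 3) array of 3D joint positions over T frames.

    Returns a (T, 9J) matrix stacking, for each frame, the flattened
    joint positions (static pose) with their first and second temporal
    derivatives (velocity and acceleration), approximated by central
    finite differences along the time axis.
    """
    pos = joints.reshape(joints.shape[0], -1)       # (T, 3J) static pose
    vel = np.gradient(pos, axis=0)                  # (T, 3J) velocity
    acc = np.gradient(vel, axis=0)                  # (T, 3J) acceleration
    return np.concatenate([pos, vel, acc], axis=1)  # (T, 9J)

# Example: 30 frames of a 20-joint skeleton (e.g. a Kinect skeleton)
T, J = 30, 20
feats = frame_features(np.random.rand(T, J, 3))
assert feats.shape == (T, 9 * J)
```

Concatenating all three quantities is what lets a single frame-level descriptor carry both where the body is and how it is moving.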
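The specificity-weighted voting step can also be sketched. The paper's pose specificity metric quantifies how discriminative a pose is; here it is approximated by an assumed IDF-style score (poses occurring in few action classes get high weight, poses common to many classes get low weight), by analogy with term specificity in retrieval. The function names and data layout are hypothetical.

```python
import numpy as np
from collections import defaultdict

def pose_specificity(pose_id, pose_class_sets, n_classes):
    """IDF-style proxy for the paper's metric: a pose seen in few action
    classes is discriminating; a pose shared by many classes is common
    and gets a weight near zero. Assumed form, not the paper's exact one."""
    df = len(pose_class_sets[pose_id])  # number of classes containing this pose
    return np.log(n_classes / df)

def weighted_vote(query, train_feats, train_labels, train_pose_ids,
                  pose_class_sets, n_classes, k=5):
    """Classify `query` by k-NN voting, weighting each neighbour's ballot
    by the specificity of the pose it belongs to."""
    dists = np.linalg.norm(train_feats - query, axis=1)
    neighbours = np.argsort(dists)[:k]
    scores = defaultdict(float)
    for i in neighbours:
        w = pose_specificity(train_pose_ids[i], pose_class_sets, n_classes)
        scores[train_labels[i]] += w  # specificity-weighted ballot
    return max(scores, key=scores.get)

# Toy example: two pose clusters, each specific to one action class
train_feats = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
train_labels = ["wave", "wave", "kick", "kick"]
train_pose_ids = [0, 0, 1, 1]
pose_class_sets = {0: {"wave"}, 1: {"kick"}}
print(weighted_vote(np.array([0.05, 0.0]), train_feats, train_labels,
                    train_pose_ids, pose_class_sets, n_classes=2, k=2))
# → wave
```

The effect of the weighting is that neighbours drawn from common poses, which recur across many actions, contribute little to the vote, so the decision is dominated by the discriminating poses.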


Metadata
Title
Skeleton-Based Human Action Recognition by Pose Specificity and Weighted Voting
Authors
Tingting Liu
Jiaole Wang
Seth Hutchinson
Max Q.-H. Meng
Publication date
30.10.2018
Publisher
Springer Netherlands
Published in
International Journal of Social Robotics / Issue 2/2019
Print ISSN: 1875-4791
Electronic ISSN: 1875-4805
DOI
https://doi.org/10.1007/s12369-018-0498-z

Weitere Artikel der Ausgabe 2/2019

International Journal of Social Robotics 2/2019 Zur Ausgabe

    Marktübersichten

    Die im Laufe eines Jahres in der „adhäsion“ veröffentlichten Marktübersichten helfen Anwendern verschiedenster Branchen, sich einen gezielten Überblick über Lieferantenangebote zu verschaffen.