
Machine Vision and Applications


01-08-2014 | Original Paper

3D human pose estimation from image using couple sparse coding

Authors: Mohammadreza Zolfaghari, Amin Jourabloo, Samira Ghareh Gozlou, Bahman Pedrood, Mohammad T. Manzuri-Shalmani

Published in: Machine Vision and Applications | Issue 6/2014


Abstract

Recent studies have demonstrated that high-level semantics in data can be captured using sparse representation. In this paper, we propose an approach to human body pose estimation in static images based on sparse representation. Given a visual input, the objective is to estimate 3D human body pose using feature space information and geometrical information of the pose space. On the assumption that each data point and its neighbors are likely to reside on a locally linear patch of the underlying manifold, our method learns the sparse representation of the new input using both feature and pose space information and then estimates the corresponding 3D pose by a linear combination of the bases of the pose dictionary. Two strategies for dictionary construction are presented: (i) constructing the dictionary by randomly selecting the frames of a sequence and (ii) selecting specific frames of a sequence as dictionary atoms. We analyzed the effect of each strategy on the accuracy of pose estimation. Extensive experiments on datasets of various human activities show that our proposed method outperforms state-of-the-art methods.
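The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the objective function, so the sketch assumes the simplest coupled setup, in which a feature dictionary and a pose dictionary share atom indices, the sparse code is computed from the image feature alone with a hand-written orthogonal matching pursuit, and the same code is applied to the pose dictionary. The pose-space geometric term mentioned in the abstract is not reproduced. All function names, the `n_nonzero` parameter, and the use of evenly spaced frames as a stand-in for "specific frames" are assumptions made for illustration.

```python
import numpy as np


def build_dictionaries(features, poses, n_atoms, strategy="random", seed=0):
    """Select paired frames of one sequence as dictionary atoms.

    features: (n_frames, d_feat) array, poses: (n_frames, d_pose) array.
    Returns (D_feat, D_pose) with one atom per column and shared column order.
    """
    n_frames = features.shape[0]
    if strategy == "random":
        # Strategy (i) from the abstract: randomly selected frames.
        rng = np.random.default_rng(seed)
        idx = np.sort(rng.choice(n_frames, size=n_atoms, replace=False))
    elif strategy == "uniform":
        # Stand-in for strategy (ii): evenly spaced frames (an assumption,
        # the abstract does not say how the specific frames are chosen).
        idx = np.linspace(0, n_frames - 1, n_atoms).round().astype(int)
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")
    return features[idx].T, poses[idx].T


def omp(D, x, n_nonzero, tol=1e-10):
    """Greedy orthogonal matching pursuit: sparse a such that x ~= D @ a."""
    norms = np.linalg.norm(D, axis=0)
    norms[norms == 0] = 1.0  # avoid division by zero for empty atoms
    residual = np.asarray(x, dtype=float).copy()
    support, coef = [], np.zeros(0)
    for _ in range(n_nonzero):
        # Pick the unused atom most correlated with the current residual.
        corr = np.abs(D.T @ residual) / norms
        if support:
            corr[support] = -np.inf
        support.append(int(np.argmax(corr)))
        # Refit all selected atoms jointly by least squares.
        coef, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ coef
        if np.linalg.norm(residual) < tol:
            break
    a = np.zeros(D.shape[1])
    a[support] = coef
    return a


def estimate_pose(D_feat, D_pose, x, n_nonzero=5):
    """Code the feature x over D_feat, then combine the pose atoms."""
    a = omp(D_feat, x, n_nonzero)
    return D_pose @ a
```

A typical call would be `D_feat, D_pose = build_dictionaries(features, poses, n_atoms=20)` followed by `estimate_pose(D_feat, D_pose, x)` for a new feature vector `x`. Keeping the two dictionaries column-aligned is what lets a code computed in feature space be reused directly in pose space.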



Title: 3D human pose estimation from image using couple sparse coding
Authors: Mohammadreza Zolfaghari, Amin Jourabloo, Samira Ghareh Gozlou, Bahman Pedrood, Mohammad T. Manzuri-Shalmani
Publication date: 01-08-2014
Publisher: Springer Berlin Heidelberg
Published in: Machine Vision and Applications / Issue 6/2014
Print ISSN: 0932-8092
Electronic ISSN: 1432-1769
DOI: https://doi.org/10.1007/s00138-014-0613-6
