Neural Computing and Applications

09-10-2020 | Original Article

Three-dimensional CNN-inspired deep learning architecture for Yoga pose recognition in the real-world environment

Authors: Shrajal Jain, Aditya Rustagi, Sumeet Saurav, Ravi Saini, Sanjay Singh

Published in: Neural Computing and Applications | Issue 12/2021

Abstract

Existing techniques for Yoga pose recognition build classifiers on sophisticated handcrafted features computed from raw inputs captured in a controlled environment. These techniques often fail in complex real-world situations, which limits the practical applicability of existing Yoga pose recognition systems. This paper presents an alternative, computationally efficient approach for Yoga pose recognition in complex real-world environments using deep learning. To this end, a Yoga pose dataset was created with the participation of 27 individuals (8 males and 19 females). It consists of ten Yoga poses, namely Malasana, Ananda Balasana, Janu Sirsasana, Anjaneyasana, Tadasana, Kumbhakasana, Hasta Uttanasana, Paschimottanasana, Uttanasana, and Dandasana. The videos were captured with smartphone cameras at 4K resolution and a frame rate of 30 fps. For the recognition of Yoga poses in real time, a three-dimensional convolutional neural network (3D CNN) architecture was designed and implemented. The designed architecture is a modified version of the C3D architecture, which was initially introduced for the recognition of human actions. In the proposed modified C3D architecture, the computationally intensive fully connected layers are pruned, and supplementary layers such as batch normalization and average pooling are introduced for computational efficiency. To the best of our knowledge, this is among the first studies to utilize the inherent spatial–temporal relationship among Yoga poses for their recognition. The designed 3D CNN architecture achieved a test recognition accuracy of 91.15% on the in-house Yoga pose dataset consisting of ten Yoga poses. Furthermore, on the publicly available dataset, the designed architecture achieved a competitive test recognition accuracy of 99.39%, along with a multifold improvement in execution speed compared to the existing state-of-the-art technique. To promote further study, we will make the in-house Yoga pose dataset publicly available to the research community.
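To give a rough sense of why pruning the fully connected layers matters, the following sketch compares the parameter count of a C3D-style fully connected head with that of a global-average-pooling head. It is illustrative only: the abstract does not specify the layer dimensions of the proposed network, so the sizes below (a 512-channel final convolutional block, an 8192-feature flattened output, and 4096-unit hidden layers) are assumptions taken from the commonly cited C3D configuration, and the function names are hypothetical.

```python
# Illustrative parameter arithmetic only. The layer sizes are assumptions based on
# the commonly cited C3D configuration, not values reported in the abstract.

NUM_CLASSES = 10                  # ten Yoga poses, as stated in the abstract
LAST_CONV_CHANNELS = 512          # assumed channel count of the final conv block
FLATTENED_FEATURES = 512 * 1 * 4 * 4  # assumed flattened pooled output (8192)
FC_WIDTH = 4096                   # assumed hidden width of the two large FC layers


def linear_params(n_in: int, n_out: int) -> int:
    """Weights plus biases of a single fully connected layer."""
    return n_in * n_out + n_out


def fc_head_params() -> int:
    """C3D-style head: two wide hidden FC layers followed by a classifier."""
    return (
        linear_params(FLATTENED_FEATURES, FC_WIDTH)
        + linear_params(FC_WIDTH, FC_WIDTH)
        + linear_params(FC_WIDTH, NUM_CLASSES)
    )


def gap_head_params() -> int:
    """Pruned head: global average pooling (no parameters) plus one classifier."""
    return linear_params(LAST_CONV_CHANNELS, NUM_CLASSES)


if __name__ == "__main__":
    fc, gap = fc_head_params(), gap_head_params()
    print(f"FC head parameters:  {fc:,}")
    print(f"GAP head parameters: {gap:,}")
    print(f"Reduction factor:    {fc / gap:,.0f}x")
```

Under these assumed sizes, nearly all of the head's parameters sit in the first two fully connected layers, which is consistent with the abstract's description of them as computationally intensive. The actual savings in the proposed architecture depend on its real layer dimensions.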

Title: Three-dimensional CNN-inspired deep learning architecture for Yoga pose recognition in the real-world environment
Authors: Shrajal Jain, Aditya Rustagi, Sumeet Saurav, Ravi Saini, Sanjay Singh
Publication date: 09-10-2020
Publisher: Springer London
Published in: Neural Computing and Applications / Issue 12/2021
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI: https://doi.org/10.1007/s00521-020-05405-5
