Human activity analysis: A review

Published: 29 April 2011

Abstract

Human activity recognition is an important area of computer vision research. Its applications include surveillance systems, patient monitoring systems, and a variety of systems that involve interactions between persons and electronic devices, such as human-computer interfaces. Most of these applications require automated recognition of high-level activities composed of multiple simple (or atomic) actions of persons. This article provides a detailed overview of state-of-the-art research papers on human activity recognition. We discuss both the methodologies developed for simple human actions and those for high-level activities. An approach-based taxonomy is chosen that compares the advantages and limitations of each approach.

Recognition methodologies for the analysis of simple actions of a single person are presented first. Space-time volume approaches and sequential approaches that represent and recognize activities directly from input images are discussed. Next, hierarchical recognition methodologies for high-level activities are presented and compared. Statistical, syntactic, and description-based approaches for hierarchical recognition are discussed. In addition, we discuss papers on the recognition of human-object interactions and group activities. Public datasets designed for the evaluation of recognition methodologies are also described, along with a comparison of the methodologies' performances. This review will provide the impetus for future research in more productive areas.

            • Published in

              ACM Computing Surveys, Volume 43, Issue 3
              April 2011
              466 pages
              ISSN:0360-0300
              EISSN:1557-7341
              DOI:10.1145/1922649

              Copyright © 2011 ACM

              Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

              Publisher

              Association for Computing Machinery

              New York, NY, United States

              Publication History

              • Published: 29 April 2011
              • Accepted: 1 September 2009
              • Revised: 1 March 2009
              • Received: 1 May 2008

              Qualifiers

              • research-article
              • Research
              • Refereed
