
2014 | Original Paper | Book chapter

Fine-Grained Activity Recognition with Holistic and Pose Based Features

Written by: Leonid Pishchulin, Mykhaylo Andriluka, Bernt Schiele

Published in: Pattern Recognition

Publisher: Springer International Publishing

Abstract

Holistic methods based on dense trajectories [29, 30] are currently the de facto standard for recognizing human activities in video. Whether holistic representations will endure or be superseded by higher-level video encodings in terms of body pose and motion is the subject of an ongoing debate [12]. In this paper we aim to clarify the underlying factors responsible for the good performance of holistic and pose-based representations. To that end we build on our recent dataset [2], which leverages the existing taxonomy of human activities. This dataset includes 24,920 video snippets covering 410 human activities in total. Our analysis reveals that holistic and pose-based methods are highly complementary, and that their performance varies significantly depending on the activity. We find that holistic methods are mostly affected by the number and speed of trajectories, whereas pose-based methods are mostly influenced by the viewpoint of the person. We observe striking performance differences across activities: for certain activities, results with pose-based features are more than twice as accurate as those with holistic features, and vice versa. The best-performing approach in our comparison combines holistic and pose-based features, which again underlines their complementarity.

References
1.
Ainsworth, B., Haskell, W., Herrmann, S., Meckes, N., Bassett, D., Tudor-Locke, C., Greer, J., Vezina, J., Whitt-Glover, M., Leon, A.: 2011 compendium of physical activities: a second update of codes and MET values. MSSE 43(8), 1575–1581 (2011)
2.
Andriluka, M., Pishchulin, L., Gehler, P., Schiele, B.: 2D human pose estimation: new benchmark and state of the art analysis. In: CVPR’14
3.
Brendel, W., Todorovic, S.: Learning spatiotemporal graphs of human activities. In: ICCV’11
4.
Cardinaux, F., Bhowmik, D., Abhayaratne, C., Hawley, M.S.: Video based technology for ambient assisted living: a review of the literature. J. Ambient Intell. Smart Environ. 3(3), 253–269 (2011)
5.
Chakraborty, B., Holte, M.B., Moeslund, T.B., Gonzalez, J., Xavier Roca, F.: A selective spatio-temporal interest point detector for human action recognition in complex scenes. In: ICCV’11
6.
Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: CVPR’05
7.
Dalal, N., Triggs, B., Schmid, C.: Human detection using oriented histograms of flow and appearance. In: ECCV’06
8.
Dantone, M., Gall, J., Leistner, C., Gool, L.V.: Human pose estimation using body parts dependent joint regressors. In: CVPR’13
9.
Duchenne, O., Laptev, I., Sivic, J., Bach, F., Ponce, J.: Automatic annotation of human actions in video. In: ICCV’09
11.
Ferrari, V., Marin, M., Zisserman, A.: Progressive search space reduction for human pose estimation. In: CVPR’08
12.
Jhuang, H., Gall, J., Zuffi, S., Schmid, C., Black, M.J.: Towards understanding action recognition. In: ICCV’13
13.
Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: HMDB: a large video database for human motion recognition. In: ICCV’11
14.
Laptev, I.: On space-time interest points. IJCV 64(2/3), 107–123 (2005)
15.
Laptev, I., Marszałek, M., Schmid, C., Rozenfeld, B.: Learning realistic human actions from movies. In: CVPR’08
16.
Liu, J., Luo, J., Shah, M.: Recognizing realistic actions from videos in the wild. In: CVPR’09
17.
Marszałek, M., Laptev, I., Schmid, C.: Actions in context. In: CVPR’09
18.
Pishchulin, L., Andriluka, M., Gehler, P., Schiele, B.: Poselet conditioned pictorial structures. In: CVPR’13
19.
Pishchulin, L., Andriluka, M., Gehler, P., Schiele, B.: Strong appearance and expressive spatial models for human pose estimation. In: ICCV’13
20.
Rodriguez, M.D., Ahmed, J., Shah, M.: Action MACH: a spatio-temporal maximum average correlation height filter for action recognition. In: CVPR’08
21.
Rohrbach, M., Amin, S., Andriluka, M., Schiele, B.: A database for fine grained activity detection of cooking activities. In: CVPR’12
22.
Rohrbach, M., Stark, M., Schiele, B.: Evaluating knowledge transfer and zero-shot learning in a large-scale setting. In: CVPR’11
23.
Sadanand, S., Corso, J.J.: Action bank: a high-level representation of activity in video. In: CVPR’12
24.
Sapp, B., Taskar, B.: Multimodal decomposable models for human pose estimation. In: CVPR’13
25.
Singh, V.K., Nevatia, R.: Action recognition in cluttered dynamic scenes using pose-specific part models. In: ICCV’11
26.
Soomro, K., Zamir, A.R., Shah, M.: UCF101: a dataset of 101 human action classes from videos in the wild. Technical report CRCV-TR-12-01, UCF (2012)
27.
Vedaldi, A., Zisserman, A.: Efficient additive kernels via explicit feature maps. In: CVPR’10
28.
Vishwakarma, S., Agrawal, A.: A survey on activity recognition and behavior understanding in video surveillance. VC 29(10), 983–1009 (2013)
29.
Wang, H., Kläser, A., Schmid, C., Liu, C.L.: Dense trajectories and motion boundary descriptors for action recognition. IJCV 103(1), 60–79 (2013)
30.
Wang, H., Schmid, C.: Action recognition with improved trajectories. In: ICCV’13
31.
Wang, H., Ullah, M.M., Kläser, A., Laptev, I., Schmid, C.: Evaluation of local spatio-temporal features for action recognition. In: BMVC’09
32.
Yang, Y., Ramanan, D.: Articulated human detection with flexible mixtures of parts. PAMI 61(1), 55–79 (2013)
Metadata
Title
Fine-Grained Activity Recognition with Holistic and Pose Based Features
Written by
Leonid Pishchulin
Mykhaylo Andriluka
Bernt Schiele
Copyright year
2014
DOI
https://doi.org/10.1007/978-3-319-11752-2_56