2015 | OriginalPaper | Book chapter

uulmMAD – A Human Action Recognition Dataset for Ground-Truth Evaluation and Investigation of View Invariances

Written by: Michael Glodek, Georg Layher, Felix Heilemann, Florian Gawrilowicz, Günther Palm, Friedhelm Schwenker, Heiko Neumann

Published in: Multimodal Pattern Recognition of Social Signals in Human-Computer-Interaction

Publisher: Springer International Publishing

Abstract

In recent years, human action recognition has gained increasing attention in pattern recognition. However, many datasets in the literature focus on a limited number of target-oriented properties. In this work, we present a novel dataset, named uulmMAD, which was created to benchmark state-of-the-art action recognition architectures with respect to multiple properties, e.g. high-resolution cameras, perspective changes, realistic cluttered background and noise, overlap of action classes, different execution speeds, variability in subjects and their clothing, and the availability of a pose ground truth. uulmMAD was recorded using three synchronized high-resolution cameras and an inertial motion capturing system. Each subject performed fourteen actions at least three times in front of a green screen. Selected actions were recorded in four variants: normal, pausing, fast and deceleration. The data has been post-processed to separate the subject from the background. Furthermore, the camera and motion capturing data have been mapped onto each other, and 3D avatars have been generated to further extend the dataset. The avatars have also been used to emulate the self-occlusion that occurs in pose recognition when using a time-of-flight camera. We analyze uulmMAD using a state-of-the-art action recognition architecture to provide first baseline results. The results emphasize the unique characteristics of the dataset. The dataset will be made publicly available upon publication of the paper.

Footnotes
1. Pike F-145 from Allied Vision with a Tevidon 1,8/16 lens.
2. Poser™ is a 3D modeling software for human avatars by Smith Micro Software.
Metadata
Title
uulmMAD – A Human Action Recognition Dataset for Ground-Truth Evaluation and Investigation of View Invariances
Written by
Michael Glodek
Georg Layher
Felix Heilemann
Florian Gawrilowicz
Günther Palm
Friedhelm Schwenker
Heiko Neumann
Copyright year
2015
DOI
https://doi.org/10.1007/978-3-319-14899-1_8