nach oben

Machine Vision and Applications

Erschienen in:

01.10.2013 | Original Paper

Classifying web videos using a global video descriptor

verfasst von: Berkan Solmaz, Shayan Modiri Assari, Mubarak Shah

Erschienen in: Machine Vision and Applications | Ausgabe 7/2013

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config

KI-gestützte Suche

Aus

Abstract

Computing descriptors for videos is a crucial task in computer vision. In this paper, we propose a global video descriptor for classification of videos. Our method, bypasses the detection of interest points, the extraction of local video descriptors and the quantization of descriptors into a code book; it represents each video sequence as a single feature vector. Our global descriptor is computed by applying a bank of 3-D spatio-temporal filters on the frequency spectrum of a video sequence; hence, it integrates the information about the motion and scene structure. We tested our approach on three datasets, KTH (Schuldt et al., Proceedings of the 17th international conference on, pattern recognition (ICPR’04), vol. 3, pp. 32–36, 2004), UCF50 (http://vision.eecs.ucf.edu/datasetsActions.html) and HMDB51 (Kuehne et al., HMDB: a large video database for human motion recognition, 2011), and obtained promising results which demonstrate the robustness and the discriminative power of our global video descriptor for classifying videos of various actions. In addition, the combination of our global descriptor and a local descriptor resulted in the highest classification accuracies on UCF50 and HMDB51 datasets.

Vorheriger Artikel The unified extreme learning machines and discriminative random fields for automatic knee cartilage and meniscus segmentation from multi-contrast MR images

Nächster Artikel A novel particle filter with implicit dynamic model for irregular motion tracking

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

über 102.000 Bücher
über 537 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Finance + Banking
Management + Führung
Marketing + Vertrieb
Maschinenbau + Werkstoffe
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 340 Zeitschriften

aus folgenden Fachgebieten:

Bauwesen + Immobilien
Business IT + Informatik
Finance + Banking
Management + Führung
Marketing + Vertrieb
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 390 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Maschinenbau + Werkstoffe

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Nur mit Berechtigung zugänglich

Schuldt, C., Laptev, I., Caputo, B.: Recognizing human actions: a local svm approach. In: Proceedings of the 17th international conference on, pattern recognition (ICPR’04), vol. 3, pp. 32–36 (2004)

http://vision.eecs.ucf.edu/datasetsActions.html

Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: HMDB: a large video database for human motion recognition. In: ICCV (2011)

Poppe, R.: A survey on vision-based human action recognition. Image Vision Comput. 28, 976–990 (2010)CrossRef

Weinland, D., Ronfard, R., Boyer, E.: A survey of vision-based methods for action representation, segmentation and recognition. Comput. Vis. Image Underst. 115, 224–241 (2011)CrossRef

Laptev, I., Marszalek, M., Schmid, C., Rozenfeld, B.: Learning realistic human actions from movies. In: IEEE Computer Society conference on computer vision and, pattern recognition (CVPR ’08) (2008)

Liu, J., Luo, J., Shah, M.: Recognizing realistic actions from videos in the wild. In: IEEE conference on computer vision and, pattern recognition (CVPR ’09), pp. 1996–2003 (2009)

Bobick, A., Davis, J.: The recognition of human movement using temporal templates. IEEE Trans. Pattern Anal. Mach. Intell. 23, 257–267 (2001)CrossRef

Yilmaz, A., Shah, M.: A differential geometric approach to representing the human actions. Comput. Vis. Image Underst. 109, 335–351 (2008)CrossRef

10.

Black, M.: Explaining optical flow events with parameterized spatio-temporal models. In: IEEE Computer Society conference on computer vision and, pattern recognition (CVPR ’99), vol. 1, pp. 326–332 (1999)

11.

Polana, R., Nelson, R.C.: Detection and recognition of periodic, non-rigid motion. Int. J. Comput. Vision 23, 261–282 (1997)CrossRef

12.

Wu, S., Oreifej, O., Shah, M.: Action recognition in videos acquired by a moving camera using motion decomposition of lagrangian particle trajectories. In: IEEE international conference on computer vision (ICCV ’11), pp. 1419–1426 (2011)

13.

Wang, H., Klaser, A., Schmid, C., Liu, C.L.: Action recognition by dense trajectories. In: IEEE conference on computer vision and, pattern recognition (CVPR ’11), pp. 3169–3176 (2011)

14.

Laptev, I.: On space-time interest points. Int. J. Comput. Vision 64, 107–123 (2005)CrossRef

15.

Harris, C., Stephens, M.: A combined corner and edge detector. In: Proceedings of fourth Alvey vision conference, pp. 147–151 (1988)

16.

Dollar, P., Rabaud, V., Cottrell, G., Belongie, S.: Behavior recognition via sparse spatio-temporal features. In: VS-PETS, pp. 65–72 (2005)

17.

Klaser, A., Marszalek, M., Schmid, C.: A spatio-temporal descriptor based on 3d-gradients. BMVC, In (2008)

18.

Scovanner, P., Ali, S., Shah, M.: A 3-dimensional SIFT descriptor and its application to action recognition. In: ACM multimedia, pp. 357–360 (2007)

19.

Ikizler-Cinbis, N., Sclaroff, S.: Object, scene and actions: combining multiple features for human action recognition. In: Proceedings of the 11th European conference on computer vision (ECCV ’10), pp. 494–507 (2010)

20.

Oliva, A., Torralba, A.B., Guerin-Dugue, A., Herault, J.: Global semantic classification of scenes using power spectrum templates. Challenge of image retrieval, pp. 1–12 (1999)

21.

Oliva, A., Torralba, A.: Modeling the shape of the scene: a holistic representation of the spatial envelope. Int. J. Comput. Vision 42, 145–175 (2001)CrossRefMATH

22.

Heeger, D.J.: Notes on motion estimation. http://white.stanford.edu/~heeger (1998)

23.

Maaten, L.V.D., Postma, E.O., Herik, H.J.V.D.: Dimensionality reduction: a comparative review (2008)

24.

Wang, H., Ullah, M.M., Klaser, A., Laptev, I., Schmid, C.: Evaluation of local spatio-temporal features for action recognition. In: BMVC, p. 127 (2009)

25.

Chang, C.C., Lin, C.J.: LIBSVM: A library for support vector machines. ACM Trans. Intell. Syst. Technol. 2, 27:1–27:27 (2011)

26.

Jhuang, H., Serre, T., Wolf, L., Poggio, T.: A biologically inspired system for action recognition. In: IEEE 11th international conference on computer vision (ICCV’07), pp. 1–8 (2007)

27.

Gilbert, A., Illingworth, J., Bowden, R.: Action recognition using mined hierarchical compound features. IEEE Trans. Pattern Anal. Mach. Intell. 33, 883–897 (2011)CrossRef

Titel: Classifying web videos using a global video descriptor
verfasst von: Berkan Solmaz
Shayan Modiri Assari
Mubarak Shah
Publikationsdatum: 01.10.2013
Verlag: Springer Berlin Heidelberg
Erschienen in: Machine Vision and Applications / Ausgabe 7/2013
Print ISSN: 0932-8092
Elektronische ISSN: 1432-1769
DOI: https://doi.org/10.1007/s00138-012-0449-x

Springer Professional

Abstract

Bitte loggen Sie sich ein, um Zugang zu Ihrer Lizenz zu erhalten.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Springer Professional "Wirtschaft"

Springer Professional "Technik"

Weitere Artikel der Ausgabe 7/2013

Nearest neighbor weighted average customization for modeling faces

Breast cancer diagnosis from biopsy images with highly reliable random subspace classifier ensembles

A linear system form solution to compute the local space average color

Machine learning in medical imaging

Accurate prediction of AD patients using cortical thickness networks

SLAC: Statistical total lesion metabolic activity computation by fuzzy unsupervised learning of PET images