Max-Margin Early Event Detectors

Authors: Minh Hoai, Fernando De la Torre

Published in: International Journal of Computer Vision | Issue 2/2014

Publication date: 1 April 2014

Abstract

The need for early detection of temporal events from sequential data arises in a wide spectrum of applications ranging from human-robot interaction to video security. While temporal event detection has been extensively studied, early detection is a relatively unexplored problem. This paper proposes a maximum-margin framework for training temporal event detectors to recognize partial events, enabling early detection. Our method is based on Structured Output SVM, but extends it to accommodate sequential data. Experiments on datasets of varying complexity, for detecting facial expressions, hand gestures, and human activities, demonstrate the benefits of our approach.
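
As a rough illustration of what early detection means in practice, the sketch below reads a sequence frame by frame and fires as soon as some segment ending at the current frame scores above a threshold, without waiting for the event to finish. It is not the authors' implementation: the mean-frame segment feature, the linear weights `w`, the bias `b`, the `max_len` cap, and the function names are all assumptions made for illustration, and the max-margin training of the detector is not shown.

```python
from typing import Iterable, List, Optional, Sequence, Tuple


def segment_score(frames: Sequence[Sequence[float]],
                  w: Sequence[float], b: float) -> float:
    """Linear score of a segment; the feature is the mean frame vector (an assumption)."""
    dim = len(w)
    mean = [sum(f[d] for f in frames) / len(frames) for d in range(dim)]
    return sum(wi * mi for wi, mi in zip(w, mean)) + b


def detect_early(stream: Iterable[Sequence[float]],
                 w: Sequence[float], b: float,
                 max_len: int, threshold: float = 0.0
                 ) -> Optional[Tuple[int, int, float]]:
    """Consume frames one at a time and return (start, end, score) at the first firing.

    At each time t, every segment that ends at t and has at most max_len frames
    is scored; the detector fires if the best of these exceeds the threshold.
    Returns None if the stream ends without a firing.
    """
    seen: List[Sequence[float]] = []
    for t, frame in enumerate(stream):
        seen.append(frame)
        best: Optional[Tuple[int, int, float]] = None
        for length in range(1, min(max_len, t + 1) + 1):
            start = t - length + 1
            score = segment_score(seen[start:t + 1], w, b)
            if best is None or score > best[2]:
                best = (start, t, score)
        if best is not None and best[2] > threshold:
            return best
    return None


# Toy one-dimensional stream: the "event" occupies frames 2..4.
toy_stream = [[0.0], [0.0], [1.0], [1.0], [1.0]]
result = detect_early(toy_stream, w=[1.0], b=-0.5, max_len=3)
```

On this toy stream the detector returns at frame 2, the first frame of the event, rather than after frame 4. A detector trained only on complete events would not necessarily score such partial segments highly, which is the gap the abstract's training framework addresses.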

www-01.ibm.com/software/integration/optimization/cplex-optimizer/.

http://www.robots.ox.ac.uk/~minhhoai/projects/mmed.html.

Title: Max-Margin Early Event Detectors
Authors: Minh Hoai, Fernando De la Torre
Publication date: 1 April 2014
Publisher: Springer US
Published in: International Journal of Computer Vision / Issue 2/2014
Print ISSN: 0920-5691
Electronic ISSN: 1573-1405
DOI: https://doi.org/10.1007/s11263-013-0683-3
