
2016 | Original Paper | Book Chapter

DAPs: Deep Action Proposals for Action Understanding

Written by: Victor Escorcia, Fabian Caba Heilbron, Juan Carlos Niebles, Bernard Ghanem

Published in: Computer Vision – ECCV 2016

Publisher: Springer International Publishing


Abstract

Object proposals have contributed significantly to recent advances in object understanding in images. Inspired by the success of this approach, we introduce Deep Action Proposals (DAPs), an effective and efficient algorithm for generating temporal action proposals from long videos. We show how to exploit the vast capacity of deep learning models and memory cells to retrieve temporal segments from untrimmed videos that are likely to contain actions. A comprehensive evaluation indicates that our approach outperforms previous work on a large-scale action benchmark, runs at 134 FPS, making it practical for large-scale scenarios, and exhibits an appealing ability to generalize, i.e., to retrieve good-quality temporal proposals for actions unseen during training.
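The abstract describes scoring temporal segments of an untrimmed video by how likely they are to contain an action. A minimal sketch of the final post-processing step common to such proposal methods — greedy non-maximum suppression over scored (start, end) segments by temporal IoU — is shown below. This is an illustration, not the paper's actual pipeline; the segment list, scores, and threshold here are hypothetical stand-ins for the output of a learned sequence encoder.

```python
import numpy as np

def temporal_iou(a, b):
    """Intersection-over-union of two 1-D segments a=(start, end), b=(start, end)."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0

def nms_proposals(segments, scores, iou_thresh=0.7):
    """Greedy temporal NMS: keep high-scoring segments, drop heavy overlaps."""
    order = np.argsort(scores)[::-1]  # highest score first
    keep = []
    for i in order:
        if all(temporal_iou(segments[i], segments[j]) < iou_thresh for j in keep):
            keep.append(i)
    return [segments[i] for i in keep]

# Hypothetical scored candidates (seconds) from a proposal network
segs = [(0, 10), (1, 11), (20, 30)]
scores = np.array([0.9, 0.8, 0.7])
print(nms_proposals(segs, scores))  # [(0, 10), (20, 30)] — (1, 11) is suppressed
```

The overlapping segment (1, 11) has IoU ≈ 0.82 with the higher-scoring (0, 10) and is suppressed, while the disjoint segment (20, 30) survives; this kind of filtering keeps the retrieved proposal set compact without sacrificing recall.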


Metadata
Title
DAPs: Deep Action Proposals for Action Understanding
Written by
Victor Escorcia
Fabian Caba Heilbron
Juan Carlos Niebles
Bernard Ghanem
Copyright year
2016
DOI
https://doi.org/10.1007/978-3-319-46487-9_47