
2017 | Original Paper | Book Chapter

Object Triggered Egocentric Video Summarization

Authors: Samriddhi Jain, Renu M. Rameshan, Aditya Nigam

Published in: Computer Analysis of Images and Patterns

Publisher: Springer International Publishing

Abstract

Egocentric videos are usually of long duration and contain a lot of redundancy, which makes summarization an essential task for such videos. In this work we target object triggered egocentric video summarization, which aims at extracting all occurrences of a given object in a video in near real time. We propose a modular pipeline that first limits the redundant information and then uses a Convolutional Neural Network (CNN) and LSTM based approach for object detection. Following this, we represent the video as a dictionary that captures its semantic information. Matching a query object then reduces to an And-Or tree traversal followed by the deepmatching algorithm for fine-grained matching. Frames containing the object that would otherwise be missed at the pruning stage are recovered by running a tracker on the frames selected by the pipeline. The modular design allows any module to be replaced by a more efficient version. Performance tests run on the overall pipeline for egocentric datasets, namely the EDUB dataset and personally recorded videos, give an average recall of 0.76.
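
As an illustration only (not the authors' code), the following minimal Python sketch shows how such a dictionary representation, mapping detected object labels to frame indices, can support coarse retrieval of candidate frames before fine-grained matching. All function names and the toy per-frame labels below are hypothetical placeholders.

    from collections import defaultdict

    def build_dictionary(frame_labels):
        # Map each detected semantic label to the frames in which it appears.
        index = defaultdict(list)
        for frame_id, labels in frame_labels.items():
            for label in labels:
                index[label].append(frame_id)
        return index

    def coarse_query(index, query_label):
        # Coarse retrieval: candidate frames for the query object; a fine-grained
        # matcher (e.g. deepmatching) would then verify each candidate.
        return sorted(index.get(query_label, []))

    # Toy usage with per-frame labels as a CNN/LSTM detector might produce them.
    frame_labels = {0: ["cup", "laptop"], 5: ["cup"], 9: ["book"]}
    index = build_dictionary(frame_labels)
    print(coarse_query(index, "cup"))  # -> [0, 5]

In the pipeline described above, the simple lookup would correspond to the And-Or tree traversal, and the verified candidates would seed the tracker that recovers occurrences discarded at the pruning stage.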

Metadata
Title
Object Triggered Egocentric Video Summarization
Authors
Samriddhi Jain
Renu M. Rameshan
Aditya Nigam
Copyright Year
2017
DOI
https://doi.org/10.1007/978-3-319-64698-5_36