Published: 27-10-2017

Exploiting detected visual objects for frame-level video filtering

Authors: Xingzhong Du, Hongzhi Yin, Zi Huang, Yi Yang, Xiaofang Zhou

Published in: World Wide Web | Issue 5/2018

Abstract

Videos are generated on the web at an unprecedented speed. To make access more efficient, developing new ways to filter videos has become a popular research topic. One ongoing direction uses visual objects to perform frame-level video filtering. In this direction, existing works create a unique object table and an occurrence table to maintain the connections between videos and objects. However, this creation process is neither scalable nor dynamic because it depends heavily on human labeling. To address this, we propose using detected visual objects to create the two tables for frame-level video filtering. Our study begins by investigating existing object detection techniques. We find that object detection alone lacks the identification and connection abilities needed to complete the creation process. To supply these abilities, we further investigate three candidates, namely recognizing-based, matching-based and tracking-based methods, to work with object detection. By analyzing their mechanisms and evaluating their accuracy, we find that they are imperfect at identifying or connecting visual objects. Accordingly, we propose a novel hybrid method that combines the matching-based and tracking-based methods to overcome these limitations. Our experiments show that the proposed method achieves higher accuracy and efficiency than the candidate methods. The subsequent analysis shows that the proposed method can efficiently support frame-level video filtering using visual objects.
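To make the two tables named in the abstract more concrete, here is a minimal Python sketch of how a unique object table and an occurrence table could support a frame-level filter. The abstract does not give the schema, so everything here is an assumption for illustration: the class name `VideoObjectIndex`, the fields, the use of a text label per object, and the in-memory dictionaries standing in for database tables. It is not the authors' implementation.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class VideoObjectIndex:
    """Hypothetical in-memory stand-in for the two tables (schema assumed)."""

    # Unique object table: object_id -> label, one entry per distinct object.
    unique_objects: dict = field(default_factory=dict)
    # Occurrence table: object_id -> set of (video_id, frame_no) pairs.
    occurrences: dict = field(default_factory=lambda: defaultdict(set))

    def add_occurrence(self, object_id, label, video_id, frame_no):
        """Record that an object appears in a given frame of a given video."""
        self.unique_objects.setdefault(object_id, label)
        self.occurrences[object_id].add((video_id, frame_no))

    def filter_frames(self, label):
        """Return sorted (video_id, frame_no) pairs containing an object with this label."""
        hits = set()
        for object_id, obj_label in self.unique_objects.items():
            if obj_label == label:
                hits |= self.occurrences[object_id]
        return sorted(hits)


# Usage: in this sketch, the detections would come from an object detector
# plus an identification step; here they are entered by hand.
index = VideoObjectIndex()
index.add_occurrence(1, "car", "v1", 10)
index.add_occurrence(1, "car", "v1", 11)
index.add_occurrence(2, "person", "v1", 11)
index.add_occurrence(3, "car", "v2", 5)

frames = index.filter_frames("car")
print(frames)
```

In this sketch, the step that decides whether a new detection belongs to an existing `object_id` is left out entirely. That step corresponds to the identification and connection abilities the abstract says object detection lacks on its own.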

Title: Exploiting detected visual objects for frame-level video filtering
Authors: Xingzhong Du, Hongzhi Yin, Zi Huang, Yi Yang, Xiaofang Zhou
Publication date: 27-10-2017
Publisher: Springer US
Published in: World Wide Web / Issue 5/2018
Print ISSN: 1386-145X
Electronic ISSN: 1573-1413
DOI: https://doi.org/10.1007/s11280-017-0505-6
