Published in: World Wide Web 5/2018

27-10-2017

Exploiting detected visual objects for frame-level video filtering

Authors: Xingzhong Du, Hongzhi Yin, Zi Huang, Yi Yang, Xiaofang Zhou


Abstract

Videos are being generated on the web at an unprecedented rate. To make access more efficient, developing new ways to filter videos has become a popular research topic. One ongoing direction is to use visual objects to perform frame-level video filtering. In this direction, existing works create a unique object table and an occurrence table to maintain the connections between videos and objects. However, this creation process is neither scalable nor dynamic, because it depends heavily on human labeling. To address this, we propose using detected visual objects to create the two tables for frame-level video filtering. Our study begins by investigating existing object detection techniques. We find that object detection alone lacks the identification and connection abilities needed to complete the creation process. To supply these abilities, we investigate three candidates that can work alongside object detection: recognizing-based, matching-based, and tracking-based methods. By analyzing their mechanisms and evaluating their accuracy, we find that they are imperfect at identifying or connecting visual objects. Accordingly, we propose a novel hybrid method that combines the matching-based and tracking-based methods to overcome these limitations. Our experiments show that the proposed method achieves higher accuracy and efficiency than the candidate methods. Subsequent analysis shows that the proposed method can efficiently support frame-level video filtering using visual objects.
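
To make the roles of the two tables concrete, here is a minimal Python sketch of how a unique object table and an occurrence table could support frame-level filtering. The abstract does not specify the table layouts, so everything in it is an assumption made for illustration: the class name `ObjectIndex`, its methods, the in-memory dictionaries, and the sample object ids, labels, and frame numbers are all hypothetical and are not the authors' implementation.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class ObjectIndex:
    """Toy in-memory stand-in for the two tables (layout is an assumption)."""

    # Unique object table: object_id -> human-readable label.
    unique_objects: dict = field(default_factory=dict)
    # Occurrence table: object_id -> set of (video_id, frame_no) pairs.
    occurrences: dict = field(default_factory=lambda: defaultdict(set))

    def add_object(self, object_id, label):
        """Register one distinct visual object."""
        self.unique_objects[object_id] = label

    def add_occurrence(self, object_id, video_id, frame_no):
        """Record that a known object appears in a given frame of a video."""
        if object_id not in self.unique_objects:
            raise KeyError(f"unknown object id: {object_id}")
        self.occurrences[object_id].add((video_id, frame_no))

    def filter_frames(self, object_ids):
        """Return sorted (video_id, frame_no) pairs that contain ALL given objects."""
        ids = list(object_ids)
        if not ids:
            return []
        frame_sets = [self.occurrences.get(i, set()) for i in ids]
        return sorted(set.intersection(*frame_sets))


# Hypothetical sample data: two objects spread over two videos.
index = ObjectIndex()
index.add_object(1, "dog")
index.add_object(2, "ball")
index.add_occurrence(1, "v1", 10)
index.add_occurrence(1, "v1", 11)
index.add_occurrence(2, "v1", 11)
index.add_occurrence(2, "v2", 3)

# Frames in which both the dog and the ball appear.
print(index.filter_frames([1, 2]))
```

In this sketch, filtering is a set intersection over the occurrence table, which is why the quality of the table entries matters: if detection splits one real object into several ids, or merges distinct objects into one, the intersection returns the wrong frames. That is the identification and connection problem the abstract describes.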


Metadata
Title
Exploiting detected visual objects for frame-level video filtering
Authors
Xingzhong Du
Hongzhi Yin
Zi Huang
Yi Yang
Xiaofang Zhou
Publication date
27-10-2017
Publisher
Springer US
Published in
World Wide Web / Issue 5/2018
Print ISSN: 1386-145X
Electronic ISSN: 1573-1413
DOI
https://doi.org/10.1007/s11280-017-0505-6
