2018 | OriginalPaper | Chapter

DeepVS: A Deep Learning Based Video Saliency Prediction Approach

Authors : Lai Jiang, Mai Xu, Tie Liu, Minglang Qiao, Zulin Wang

Published in: Computer Vision – ECCV 2018

Publisher: Springer International Publishing

Abstract

In this paper, we propose DeepVS, a novel deep-learning-based video saliency prediction method. Specifically, we establish a large-scale eye-tracking database of videos (LEDOV), which includes 32 subjects' fixations on 538 videos. We find from LEDOV that human attention is more likely to be attracted by objects, particularly moving objects or the moving parts of objects. Hence, an object-to-motion convolutional neural network (OM-CNN), composed of objectness and motion subnets, is developed to predict intra-frame saliency in DeepVS. In OM-CNN, a cross-net mask and hierarchical feature normalization are proposed to combine the spatial features of the objectness subnet with the temporal features of the motion subnet. We further find from our database that human attention is temporally correlated, with smooth saliency transitions across video frames. We thus propose a saliency-structured convolutional long short-term memory (SS-ConvLSTM) network, which takes the features extracted by OM-CNN as input. Consequently, the inter-frame saliency maps of a video can be generated, accounting for both the center-biased structured output and the cross-frame transitions of human attention maps. Finally, experimental results show that DeepVS advances the state of the art in video saliency prediction.
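The two-stage pipeline described above (per-frame OM-CNN features fused through a cross-net mask, then a convolutional LSTM across frames) can be illustrated with a minimal PyTorch sketch. This is an assumption-laden toy, not the authors' implementation: the two single-conv stand-in subnets, the one-convolution ConvLSTMCell, the sigmoid mask head, and all layer sizes are invented for illustration, and plain channel-wise L2 normalization stands in for the paper's hierarchical feature normalization.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvLSTMCell(nn.Module):
    """Minimal convolutional LSTM cell: all four gates from one convolution."""
    def __init__(self, in_ch, hid_ch, k=3):
        super().__init__()
        self.hid_ch = hid_ch
        self.gates = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, k, padding=k // 2)

    def forward(self, x, state):
        h, c = state
        i, f, o, g = torch.chunk(self.gates(torch.cat([x, h], dim=1)), 4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, c

class ToyDeepVS(nn.Module):
    """Toy OM-CNN -> ConvLSTM pipeline mirroring the abstract's description."""
    def __init__(self, feat_ch=32):
        super().__init__()
        # Stand-ins for the objectness (spatial) and motion (temporal) subnets.
        self.objectness = nn.Conv2d(3, feat_ch, 3, padding=1)  # one RGB frame
        self.motion = nn.Conv2d(6, feat_ch, 3, padding=1)      # frame pair
        # Cross-net mask: a coarse objectness map gating the motion features.
        self.mask_head = nn.Conv2d(feat_ch, 1, 1)
        self.lstm = ConvLSTMCell(2 * feat_ch, feat_ch)
        self.readout = nn.Conv2d(feat_ch, 1, 1)                # saliency map

    def forward(self, frames):  # frames: (T, 3, H, W)
        T, _, H, W = frames.shape
        h = frames.new_zeros(1, self.lstm.hid_ch, H, W)
        c = torch.zeros_like(h)
        maps = []
        for t in range(1, T):
            # Channel-wise L2 normalization before fusion, loosely standing in
            # for the paper's hierarchical feature normalization.
            spat = F.normalize(self.objectness(frames[t:t + 1]), dim=1)
            pair = torch.cat([frames[t - 1:t], frames[t:t + 1]], dim=1)
            temp = F.normalize(self.motion(pair), dim=1)
            mask = torch.sigmoid(self.mask_head(spat))         # cross-net mask
            h, c = self.lstm(torch.cat([spat, temp * mask], dim=1), (h, c))
            maps.append(torch.sigmoid(self.readout(h)))
        return torch.cat(maps)  # (T-1, 1, H, W) per-frame saliency maps

# Usage: predict saliency maps for a short clip of five RGB frames.
clip = torch.rand(5, 3, 64, 64)
print(ToyDeepVS()(clip).shape)  # torch.Size([4, 1, 64, 64])
```

The recurrent state carried by the ConvLSTM is what gives the smooth cross-frame saliency transitions the abstract emphasizes; the gating of motion features by the objectness mask is a simple reading of the cross-net fusion idea.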

Footnotes
1. \(\mathbf {FS}_5\) is generated from the output of the last FC layer in the objectness subnet, encoding high-level information about the sizes, classes, and confidence probabilities of candidate objects in each grid cell.
 
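For concreteness, here is a hedged sketch of how such a fully connected output can be mapped back onto the detection grid so it can serve as a spatial feature map. The grid size S, box count B, and class count C below follow common YOLO-style defaults and are assumptions for illustration, not values taken from the paper.

```python
import torch

# Assumed YOLO-style layout: an S x S grid, B boxes per cell
# (x, y, w, h, confidence), and C class probabilities per cell.
S, B, C = 7, 2, 20
fc_out = torch.rand(S * S * (B * 5 + C))  # flat output of the last FC layer

# Reshape into a grid-aligned, channels-first map so it can be normalized
# and concatenated with the convolutional features of the subnet.
fs5 = fc_out.view(S, S, B * 5 + C).permute(2, 0, 1)
print(fs5.shape)  # torch.Size([30, 7, 7])
```

In this form, each channel of fs5 carries one box or class statistic over the whole grid, which is what makes the FC output usable as the high-level spatial feature the footnote describes.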
Metadata
Title
DeepVS: A Deep Learning Based Video Saliency Prediction Approach
Authors
Lai Jiang
Mai Xu
Tie Liu
Minglang Qiao
Zulin Wang
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-030-01264-9_37
