DOI: 10.1145/3240508.3240669
Research article

Your Attention is Unique: Detecting 360-Degree Video Saliency in Head-Mounted Display for Head Movement Prediction

Published: 15 October 2018

ABSTRACT

Head movement prediction is a key enabler for emerging 360-degree videos, since it can improve both streaming and rendering efficiency. Accurate head movement prediction requires understanding users' visual attention on 360-degree videos viewed through a head-mounted display (HMD). Despite the rich history of saliency detection research, we observe that traditional models are designed for regular images/videos fixed at a single viewport; when applied to multi-viewport 360-degree videos navigated by user interaction, they introduce problems such as central bias and multi-object confusion. To fill this gap, this paper shifts from the single-viewport saliency models that have been studied extensively for decades to panoramic saliency detection tailored specifically for 360-degree videos, thereby maximally enhancing head movement prediction performance. The proposed head movement prediction framework is built on three components: a newly created dataset for 360-degree video saliency, a panoramic saliency detection model, and an integration of saliency with head tracking history for the final head movement prediction. Experimental results demonstrate measurable gains of both the proposed panoramic saliency detection and head movement prediction over traditional models designed for regular images/videos.
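The fusion of panoramic saliency with head tracking history that the abstract describes can be illustrated with a minimal sketch. This is not the paper's actual model (which the abstract does not specify in detail); it is a hypothetical baseline that blends a linear extrapolation of the recent head trajectory with the strongest peak of an equirectangular saliency map. The function name, the weighting parameter `alpha`, and the grid-to-angle mapping are all illustrative assumptions.

```python
import numpy as np

def predict_head_orientation(saliency, history, alpha=0.7):
    """Hypothetical sketch: blend trajectory extrapolation with saliency.

    saliency: 2D array over an equirectangular grid (rows = pitch, cols = yaw).
    history:  sequence of (yaw, pitch) samples in degrees, most recent last.
    alpha:    illustrative weight on motion extrapolation vs. saliency pull.
    """
    # Linear extrapolation from the last two head-tracking samples.
    (y0, p0), (y1, p1) = history[-2], history[-1]
    extrap = np.array([y1 + (y1 - y0), p1 + (p1 - p0)])

    # Map the strongest saliency cell back to yaw/pitch angles.
    rows, cols = saliency.shape
    r, c = np.unravel_index(np.argmax(saliency), saliency.shape)
    sal_yaw = c / cols * 360.0 - 180.0
    sal_pitch = 90.0 - r / rows * 180.0

    # Weighted blend of the two cues, then wrap yaw into [-180, 180).
    pred = alpha * extrap + (1 - alpha) * np.array([sal_yaw, sal_pitch])
    pred[0] = (pred[0] + 180.0) % 360.0 - 180.0
    return float(pred[0]), float(pred[1])
```

A learned model (e.g., an LSTM over the tracking history, as is common in this line of work) would replace the fixed blend, but the sketch shows the basic idea of combining the two signals.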



Published in:
MM '18: Proceedings of the 26th ACM International Conference on Multimedia
October 2018, 2167 pages
ISBN: 9781450356657
DOI: 10.1145/3240508

      Copyright © 2018 ACM


Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

MM '18 paper acceptance rate: 209 of 757 submissions (28%). Overall acceptance rate: 995 of 4,171 submissions (24%).
