Fixation Prediction for 360° Video Streaming in Head-Mounted Virtual Reality

Authors:
Ching-Ling Fan
Department of Computer Science, National Tsing Hua University

Jean Lee
Department of Computer Science, National Tsing Hua University

Wen-Chih Lo
Department of Computer Science, National Tsing Hua University

Chun-Ying Huang
Department of Computer Science, National Chiao Tung University

Kuan-Ta Chen
Institute of Information Science, Academia Sinica

Cheng-Hsin Hsu
Department of Computer Science, National Tsing Hua University

NOSSDAV'17: Proceedings of the 27th Workshop on Network and Operating Systems Support for Digital Audio and Video, June 2017, Pages 67–72. https://doi.org/10.1145/3083165.3083180

Published: 20 June 2017

ABSTRACT

We study the problem of predicting the Fields-of-View (FoVs) of viewers watching 360° videos on commodity Head-Mounted Displays (HMDs). Existing solutions either use the viewer's current orientation to approximate future FoVs, or extrapolate future FoVs from historical orientations using dead-reckoning algorithms. In this paper, we develop fixation prediction networks that concurrently leverage sensor- and content-related features to predict future viewer fixation, which sets them apart from the solutions in the literature. The sensor-related features include HMD orientations, while the content-related features include image saliency maps and motion maps. We build a testbed for streaming 360° videos to HMDs, and recruit twenty-five viewers to watch ten 360° videos. We then train and validate two design alternatives of our proposed networks, which allows us to identify the better-performing design and its best parameter settings. Trace-driven simulation results show the merits of our proposed fixation prediction networks compared to the existing solutions, including: (i) lower consumed bandwidth, (ii) shorter initial buffering time, and (iii) short running time.
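
The two existing approaches named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code. It assumes orientations are logged as (time, yaw, pitch) samples in seconds and degrees, and it reads "dead reckoning" as a constant-angular-velocity extrapolation from the last two samples. The names `Sample`, `wrap_yaw`, `predict_current`, and `predict_dead_reckoning` are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# Assumed trace format: (time in seconds, yaw in degrees, pitch in degrees).
Sample = Tuple[float, float, float]


def wrap_yaw(deg: float) -> float:
    """Map an angle in degrees to the range [-180, 180)."""
    return (deg + 180.0) % 360.0 - 180.0


def predict_current(history: List[Sample], horizon: float) -> Tuple[float, float]:
    """Baseline 1: assume the viewer keeps the most recent orientation."""
    _, yaw, pitch = history[-1]
    return yaw, pitch


def predict_dead_reckoning(history: List[Sample], horizon: float) -> Tuple[float, float]:
    """Baseline 2: extrapolate with the angular velocity of the last two samples."""
    if len(history) < 2:
        return predict_current(history, horizon)
    (t0, yaw0, pitch0), (t1, yaw1, pitch1) = history[-2], history[-1]
    dt = t1 - t0
    if dt <= 0:
        # No usable time difference, so fall back to the latest orientation.
        return yaw1, pitch1
    # Take the shortest angular difference so a step across +/-180 is not read as a full turn.
    yaw_rate = wrap_yaw(yaw1 - yaw0) / dt
    pitch_rate = (pitch1 - pitch0) / dt
    yaw = wrap_yaw(yaw1 + yaw_rate * horizon)
    # Pitch cannot go past straight up or straight down.
    pitch = max(-90.0, min(90.0, pitch1 + pitch_rate * horizon))
    return yaw, pitch


if __name__ == "__main__":
    trace = [(0.0, 170.0, 0.0), (0.5, 175.0, 1.0)]
    print(predict_current(trace, horizon=1.0))
    print(predict_dead_reckoning(trace, horizon=1.0))
```

Both baselines use only sensor readings. The fixation prediction networks described above additionally take content-related features (saliency and motion maps) as input, which this sketch does not attempt to model.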

Index Terms

1. Information systems
  1. Information systems applications
    1. Multimedia information systems
      1. Multimedia streaming
2. Mathematics of computing
  1. Mathematical analysis
    1. Functional analysis
      1. Approximation

Published in

NOSSDAV'17: Proceedings of the 27th Workshop on Network and Operating Systems Support for Digital Audio and Video
June 2017
105 pages
ISBN:9781450350037
DOI:10.1145/3083165

Copyright © 2017 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
360° video
HMD
prediction
virtual reality
Qualifiers
- research-article
- Research
- Refereed limited
Acceptance Rates
NOSSDAV'17 Paper Acceptance Rate: 15 of 40 submissions, 38%
Overall Acceptance Rate: 118 of 363 submissions, 33%