
Multimedia Systems


30-06-2023 | Regular Paper

DRL-based transmission control for QoE guaranteed transmission efficiency optimization in tile-based panoramic video streaming

Authors: Jiachang Li, Haitao Zhang, Huadong Ma

Published in: Multimedia Systems | Issue 5/2023


Abstract

In tile-based panoramic video streaming, the Field of View (FOV) is composed of multiple visible video tiles that are synchronized in real time. Common panoramic video transmission control methods use FOV prediction and redundant tile transmission to cope with network delay and fast viewport switching. However, these methods rely heavily on FOV prediction accuracy and do not fully consider transmission efficiency, which is measured as the ratio of the data used for the FOV to the total transmitted data. Moreover, existing learning-based methods feed ever-changing factors such as network bandwidth and viewport position directly into the learning process, which leads to poor stability of the transmission control. In this paper, we propose a Deep Reinforcement Learning (DRL)-based transmission control method for tile-based panoramic video streaming whose objective is to optimize transmission efficiency while guaranteeing Quality of Experience (QoE). First, we formulate panoramic video transmission control as maximizing transmission efficiency subject to multiple QoE metrics staying within preset acceptable ranges. Second, to improve the stability of the transmission process, we design a two-stage transmission control decision-making mechanism that consists of an intermediate decision-making stage and a final decision-making stage. In the intermediate decision-making stage, newly defined aggregated transmission decision variables are learned with the Rainbow Deep Q Network. This online learning process considers only the QoE and the transmission efficiency and avoids directly involving the ever-changing environmental factors. In the final decision-making stage, the bitrate and buffer size of each video tile are determined from the network bandwidth and the viewport under the guidance of the intermediate decisions. Finally, experiments conducted with real network bandwidth and viewport traces show that our method achieves better long-term transmission efficiency than other methods.
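To make the efficiency metric and the constrained objective concrete, the following is a minimal Python sketch, not the authors' implementation. Only the efficiency ratio (data used for the FOV divided by total transmitted data) and the idea of keeping QoE metrics within preset ranges come from the abstract. The `Tile` type, the metric names, the ranges, and the fixed-penalty reward are hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Tile:
    size_bytes: int  # bytes transmitted for this tile (hypothetical field)
    in_fov: bool     # True if the tile was actually shown in the viewer's FOV


def transmission_efficiency(tiles):
    """Ratio of the data used for the FOV to the total transmitted data."""
    total = sum(t.size_bytes for t in tiles)
    if total == 0:
        return 0.0  # nothing was sent, so define the efficiency as zero
    used = sum(t.size_bytes for t in tiles if t.in_fov)
    return used / total


def qoe_within_ranges(metrics, ranges):
    """True if every QoE metric lies in its preset [low, high] range."""
    return all(ranges[name][0] <= value <= ranges[name][1]
               for name, value in metrics.items())


def reward(tiles, metrics, ranges, penalty=-1.0):
    """Assumed reward shape: efficiency if the QoE constraints hold,
    otherwise a fixed penalty. The paper's actual reward is not given
    in the abstract."""
    if not qoe_within_ranges(metrics, ranges):
        return penalty
    return transmission_efficiency(tiles)


# Example: two FOV tiles and one redundant tile outside the FOV.
tiles = [Tile(400, True), Tile(400, True), Tile(200, False)]
ranges = {"rebuffer_s": (0.0, 0.5), "fov_quality": (0.7, 1.0)}  # hypothetical
good = {"rebuffer_s": 0.1, "fov_quality": 0.9}
bad = {"rebuffer_s": 0.8, "fov_quality": 0.9}

print(transmission_efficiency(tiles))
print(reward(tiles, good, ranges))
print(reward(tiles, bad, ranges))
```

In this sketch, sending redundant tiles outside the FOV lowers the efficiency, while violating any QoE range replaces the efficiency with a penalty, which mirrors the trade-off the abstract describes.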




Title: DRL-based transmission control for QoE guaranteed transmission efficiency optimization in tile-based panoramic video streaming
Authors: Jiachang Li, Haitao Zhang, Huadong Ma
Publication date: 30-06-2023
Publisher: Springer Berlin Heidelberg
Published in: Multimedia Systems / Issue 5/2023
Print ISSN: 0942-4962
Electronic ISSN: 1432-1882
DOI: https://doi.org/10.1007/s00530-023-01129-3

