ABSTRACT
In this work, we propose an online adaptation logic for Dynamic Adaptive Streaming over HTTP (DASH) clients, in which each client selects the representation that maximizes the long-term expected reward. The reward is defined as a combination of the decoded quality, the quality fluctuations, and the rebuffering events experienced by the user during playback. To solve this problem, we cast the selection of the optimal representations as a Markov Decision Process (MDP) optimization. The system dynamics required by the MDP model are unknown a priori and are therefore learned through a Reinforcement Learning (RL) technique. The learning process exploits a parallel learning technique that improves the learning rate and limits sub-optimal choices, yielding fast yet accurate learning that quickly converges to high and stable rewards. The efficiency of our controller is therefore not sacrificed for fast convergence. Simulation results show that our algorithm achieves a higher Quality of Experience (QoE) than both existing RL algorithms in the literature and heuristic solutions, as it increases the average QoE and reduces quality fluctuations.
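As a rough illustration of the ingredients named in the abstract, the sketch below combines quality, quality fluctuation, and rebuffering into a single reward and applies a standard tabular Q-learning update with epsilon-greedy selection. It is not the parallel learning technique proposed in the paper: the reward weights (`w_switch`, `w_rebuf`), the state encoding, and all function names are hypothetical choices made only for this example.

```python
import random


def reward(quality, prev_quality, rebuffer_s, w_switch=0.5, w_rebuf=2.0):
    """Hypothetical reward: decoded quality minus penalties for quality
    fluctuation and rebuffering time. The weights are assumptions."""
    return quality - w_switch * abs(quality - prev_quality) - w_rebuf * rebuffer_s


def q_update(q_table, state, action, r, next_state, n_actions,
             alpha=0.1, gamma=0.9):
    """One standard tabular Q-learning update (not the paper's parallel scheme).
    Missing table entries are treated as 0.0."""
    best_next = max(q_table.get((next_state, a), 0.0) for a in range(n_actions))
    old = q_table.get((state, action), 0.0)
    q_table[(state, action)] = old + alpha * (r + gamma * best_next - old)
    return q_table[(state, action)]


def select_action(q_table, state, n_actions, epsilon, rng):
    """Epsilon-greedy choice of a representation index."""
    if rng.random() < epsilon:
        return rng.randrange(n_actions)  # explore
    # Exploit: pick the representation with the highest learned value.
    return max(range(n_actions), key=lambda a: q_table.get((state, a), 0.0))


if __name__ == "__main__":
    rng = random.Random(0)
    q = {}
    # A state could encode, for example, (buffer level, last representation);
    # here it is just an opaque integer.
    r = reward(quality=3.0, prev_quality=1.0, rebuffer_s=0.5)
    q_update(q, state=0, action=1, r=r, next_state=0, n_actions=2)
    print(select_action(q, state=0, n_actions=2, epsilon=0.0, rng=rng))
```

In this toy setting the client learns only from the action it actually took at each step. The abstract's parallel learning is described as improving on exactly this kind of slow, one-sample-at-a-time update.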
Index Terms
- Online learning adaptation strategy for DASH clients