ABSTRACT
Client-side video players employ adaptive bitrate (ABR) algorithms to optimize user quality of experience (QoE). Despite the abundance of recently proposed schemes, state-of-the-art ABR algorithms suffer from a key limitation: they use fixed control rules based on simplified or inaccurate models of the deployment environment. As a result, existing schemes inevitably fail to achieve optimal performance across a broad set of network conditions and QoE objectives.
We propose Pensieve, a system that generates ABR algorithms using reinforcement learning (RL). Pensieve trains a neural network model that selects bitrates for future video chunks based on observations collected by client video players. Pensieve does not rely on pre-programmed models or assumptions about the environment. Instead, it learns to make ABR decisions solely by observing the performance that resulted from its past decisions. As a result, Pensieve automatically learns ABR algorithms that adapt to a wide range of environments and QoE metrics. We compare Pensieve to state-of-the-art ABR algorithms using trace-driven and real-world experiments spanning a wide variety of network conditions, QoE metrics, and video properties. In all considered scenarios, Pensieve outperforms the best state-of-the-art scheme, with improvements in average QoE of 12% to 25%. Pensieve also generalizes well, outperforming existing schemes even on networks for which it was not explicitly trained.
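The learning loop described above can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: a linear softmax policy stands in for the neural network, and the bitrate ladder, the observation layout (throughput, buffer level, bias term), the QoE reward form, and all penalty and learning-rate values are hypothetical placeholders chosen for the example.

```python
import math
import random

# Hypothetical bitrate ladder (kbps); not taken from the paper.
BITRATES_KBPS = [300, 750, 1200, 1850, 2850, 4300]

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def policy(weights, obs):
    """Linear softmax policy: one weight row per bitrate (stand-in for a neural network)."""
    scores = [sum(w * x for w, x in zip(row, obs)) for row in weights]
    return softmax(scores)

def qoe_reward(bitrate_kbps, rebuffer_s, prev_bitrate_kbps,
               rebuf_penalty=4.0, smooth_penalty=1.0):
    """Assumed QoE form: quality minus rebuffering and bitrate-switch penalties."""
    return (bitrate_kbps / 1000.0
            - rebuf_penalty * rebuffer_s
            - smooth_penalty * abs(bitrate_kbps - prev_bitrate_kbps) / 1000.0)

def reinforce_step(weights, obs, action, advantage, lr=0.01):
    """One policy-gradient update: shift probability toward `action` when `advantage` > 0."""
    probs = policy(weights, obs)
    for a, row in enumerate(weights):
        # Gradient of log pi(action | obs) with respect to the scores is (1[a == action] - pi_a).
        grad_coeff = (1.0 if a == action else 0.0) - probs[a]
        for i, x in enumerate(obs):
            row[i] += lr * advantage * grad_coeff * x
    return weights

if __name__ == "__main__":
    # Observation: [recent throughput (Mbps), buffer level (tens of seconds), bias term].
    obs = [1.5, 0.8, 1.0]
    weights = [[0.0] * len(obs) for _ in BITRATES_KBPS]
    probs = policy(weights, obs)
    action = random.choices(range(len(BITRATES_KBPS)), weights=probs)[0]
    # Pretend the chunk downloaded with no rebuffering; the previous bitrate was 750 kbps.
    reward = qoe_reward(BITRATES_KBPS[action], rebuffer_s=0.0, prev_bitrate_kbps=750)
    reinforce_step(weights, obs, action, advantage=reward)
    print(BITRATES_KBPS[action], reward)
```

The point of the sketch is that no throughput or buffer model is hand-coded: the policy only sees observations and the reward that followed each decision, and the update raises the probability of choices that led to higher QoE. Changing the reward function is enough to retarget the same loop at a different QoE objective.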