skip to main content
10.1145/3132747.3132775acmconferencesArticle/Chapter ViewAbstractPublication PagessospConference Proceedingsconference-collections
research-article
Open Access

SVE: Distributed Video Processing at Facebook Scale

Published:14 October 2017Publication History

ABSTRACT

Videos are an increasingly utilized part of the experience of the billions of people that use Facebook. These videos must be uploaded and processed before they can be shared and downloaded. Uploading and processing videos at our scale, and across our many applications, brings three key requirements: low latency to support interactive applications; a flexible programming model for application developers that is simple to program, enables efficient processing, and improves reliability; and robustness to faults and overload. This paper describes the evolution from our initial monolithic encoding script (MES) system to our current Streaming Video Engine (SVE) that overcomes each of the challenges. SVE has been in production since the fall of 2015, provides lower latency than MES, supports many diverse video applications, and has proven to be reliable despite faults and overload.

Skip Supplemental Material Section

Supplemental Material

sve.mp4

mp4

2.4 GB

References

  1. Anne Aaron and David Ronca. 2015. High Quality Video Encoding at Scale. http://techblog.netflix.com/2015/12/high-quality-video-encoding-at-scale.html. (2015).Google ScholarGoogle Scholar
  2. Daniel J Abadi, Yanif Ahmad, Magdalena Balazinska, Ugur Cetintemel, Mitch Cherniack, Jeong-Hyon Hwang, Wolfgang Lindner, Anurag Maskey, Alex Rasin, Esther Ryvkina, et al. 2005. The Design of the Borealis Stream Processing Engine. In Proceedings of the 2005 Conference on Innovative Data Systems Research.Google ScholarGoogle Scholar
  3. Daniel J Abadi, Don Carney, Ugur Çetintemel, Mitch Cherniack, Christian Convey, Sangdon Lee, Michael Stonebraker, Nesime Tatbul, and Stan Zdonik. 2003. Aurora: a new model and architecture for data stream management. The VLDB Journal--The International Journal on Very Large Data Bases 12, 2 (2003), 120--139. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. Tyler Akidau, Alex Balikov, Kaya Bekiroğlu, Slava Chernyak, Josh Haberman, Reuven Lax, Sam McVeety, Daniel Mills, Paul Nordstrom, and Sam Whittle. 2013. MillWheel: Fault-Tolerant Stream Processing at Internet Scale. Proceedings of the VLDB Endowment 6, 11 (2013), 1033--1044. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. Apache Storm 2017. http://storm.apache.org/. (2017).Google ScholarGoogle Scholar
  6. Doug Beaver, Sanjeev Kumar, Harry C. Li, Jason Sobel, and Peter Vajgel. 2010. Finding a Needle in Haystack: Facebook's Photo Storage. In Proceedings of the 9th USENIX Symposium on Operating Systems Design and Implementation. USENIX. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. Nathan Bronson, Zach Amsden, George Cabrera, Prasad Chakka, Peter Dimov, Hui Ding, Jack Ferris, Anthony Giardullo, Sachin Kulkarni, Harry Li, Mark Marchukov, Dmitri Petrov, Lovro Puzar, Yee Jiun Song, and Venkat Venkataramani. 2013. TAO: Facebook's Distributed Data Store for the Social Graph. In Proceedings of the 2013 USENIX Annual Technical Conference. USENIX. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. Sirish Chandrasekaran, Owen Cooper, Amol Deshpande, Michael J Franklin, Joseph M Hellerstein, Wei Hong, Sailesh Krishnamurthy, Samuel R Madden, Fred Reiss, and Mehul A Shah. 2003. TelegraphCQ: Continuous Dataflow Processing for an Uncertain World. In Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Tyson Condie, Neil Conway, Peter Alvaro, Joseph M Hellerstein, Khaled Elmeleegy, and Russell Sears. 2010. MapReduce Online. In Proceedings of the 7th USENIX Symposium on Networked Systems Design and Implementation. USENIX. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Jeffrey Dean and Sanjay Ghemawat. 2004. MapReduce: Simplified Data Processing on Large Clusters. In Proceedings of the 6th Symposium on Operating Systems Design and Implementation. USENIX. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Sadjad Fouladi, Riad S. Wahby, Brennan Shacklett, Karthikeyan Vasuki Balasubramaniam, William Zeng, Rahul Bhalerao, Anirudh Sivaraman, George Porter, and Keith Winstein. 2017. Encoding, Fast and Slow: Low-Latency Video Processing Using Thousands of Tiny Threads. In Proceedings of the 14th USENIX Symposium on Networked Systems Design and Implementation. USENIX. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. HipHop Virtual Machine 2017. http://hhvm.com/. (2017).Google ScholarGoogle Scholar
  13. Qi Huang, Ken Birman, Robbert van Renesse, Wyatt Lloyd, Sanjeev Kumar, and Harry C. Li. 2013. An Analysis of Facebook Photo Caching. In Proceedings of the 24th ACM Symposium on Operating System Principles. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. Michael Isard, Mihai Budiu, Yuan Yu, Andrew Birrell, and Dennis Fetterly. 2007. Dryad: Distributed Data-Parallel Programs from Sequential Building Blocks. In Proceedings of the 2nd ACM SIGOPS European Conference on Computer Systems. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. Sanjeev Kulkarni, Nikunj Bhagat, Maosong Fu, Vikas Kedigehalli, Christopher Kellogg, Sailesh Mittal, Jignesh M. Patel, Karthikeyan Ramasamy, and Siddarth Taneja. 2015. Twitter Heron: Stream Processing at Scale. In Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. Wei Lin, Zhengping Qian, Junwei Xu, Sen Yang, Jingren Zhou, and Lidong Zhou. 2016. StreamScope: Continuous Reliable Distributed Processing of Big Data Streams. In Proceedings of the 13th USENIX Symposium on Networked Systems Design and Implementation. USENIX. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. Yao-Chung Lin, Hugh Denman, and Anil Kokaram. 2015. Multipass Encoding for Reducing Pulsing Artifacts in Cloud Based Video Transcoding. In Proceedings of the 2015 IEEE International Conference on Image Processing. IEEE.Google ScholarGoogle ScholarCross RefCross Ref
  18. Rajeev Motwani, Jennifer Widom, Arvind Arasu, Brian Babcock, Shivnath Babu, Mayur Datar, Gurmeet Manku, Chris Olston, Justin Rosenstein, and Rohit Varma. 2003. Query Processing, Resource Management, and Approximation in a Data Stream Management System. In Proceedings of the 2003 Conference on Innovative Data Systems Research.Google ScholarGoogle Scholar
  19. Subramanian Muralidhar, Wyatt Lloyd, Sabyasachi Roy, Cory Hill, Ernest Lin, Weiwen Liu, Satadru Pan, Shiva Shankar, Viswanath Sivakumar, Linpeng Tang, et al. 2014. f4: Facebook's Warm BLOB Storage System. In Proceedings of the 11th USENIX Symposium on Operating Systems Design and Implementation. USENIX. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Derek Gordon Murray, Frank McSherry, Rebecca Isaacs, Michael Isard, Paul Barham, and Martín Abadi. 2013. Naiad: A Timely Dataflow System. In Proceedings of the 24th ACM Symposium on Operating System Principles. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. Derek G Murray, Malte Schwarzkopf, Christopher Smowton, Steven Smith, Anil Madhavapeddy, and Steven Hand. 2011. CIEL: a universal execution engine for distributed data-flow computing. In Proceedings of the 8th USENIX Symposium on Networked Systems Design and Implementation. USENIX. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Leonardo Neumeyer, Bruce Robbins, Anish Nair, and Anand Kesari. 2010. S4: Distributed Stream Computing Platform. In Proceedings of the 2010 IEEE International Conference on Data Mining Workshops. IEEE. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. Edmund B. Nightingale, Kaushik Veeraraghavan, Peter M. Chen, and Jason Flinn. 2006. Rethink the Sync. In Proceedings of the 7th USENIX Symposium on Operating Systems Design and Implementation. USENIX. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. Edmund B Nightingale, Kaushik Veeraraghavan, Peter M Chen, and Jason Flinn. 2008. Rethink the Sync. ACM Transactions on Computer Systems (TOCS) 26, 3 (2008), 6. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. Russell Power and Jinyang Li. 2010. Piccolo: Building Fast, Distributed Programs with Partitioned Tables. In Proceedings of the 9th USENIX Symposium on Operating Systems Design and Implementation. USENIX. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. Ariel Rabkin, Matvey Arye, Siddhartha Sen, Vivek S Pai, and Michael J Freedman. 2014. Aggregation and Degradation in JetStream: Streaming Analytics in the Wide Area. In Proceedings of the 11th USENIX Symposium on Networked Systems Design and Implementation. USENIX. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. Vijay Rao and Edwin Smith. 2016. Facebook's new front-end server design delivers on performance without sucking up power. https://code.facebook.com/posts/1711485769063510. (2016).Google ScholarGoogle Scholar
  28. Alyson Shontell. 2015. Facebook is now generating 8 billion video views per day from just 500 million people âĂŤ here's how that's possible. https://tinyurl.com/yc3jhxuu. (2015).Google ScholarGoogle Scholar
  29. Linpeng Tang, Qi Huang, Wyatt Lloyd, Sanjeev Kumar, and Kai Li. 2015. RIPQ: Advanced Photo Caching on Flash for Facebook. In Proceedings of the 13th USENIX Conference on File and Storage Technologies. USENIX. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. Linpeng Tang, Qi Huang, Amit Puntambekar, Ymir Vigfusson, Wyatt Lloyd, and Kai Li. 2017. Popularity Prediction of Facebook Videos for Higher Quality Streaming. In Proceedings of the 2017 USENIX Annual Technical Conference. USENIX. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. The Hack Programming Language 2017. http://hacklang.org/. (2017).Google ScholarGoogle Scholar
  32. Kaushik Veeraraghavan, Justin Meza, David Chou, Wonho Kim, Sonia Margulis, Scott Michelson, Rajesh Nishtalaand Daniel Obenshain, Dmitri Perelman, and Yee Jiun Song. 2016. Kraken: Leveraging Live Traffic Tests to Identify and Resolve Resource Utilization Bottlenecks in Large Scale Web Services. In Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation. USENIX. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. Rick Wong, Zhan Chen, Anne Aaron, Megha Manohara, and Darrell Denlinger. 2016. Chelsea: Encoding in the Fast Lane. http://techblog.netflix.com/2016/07/chelsea-encoding-in-fast-lane.html. (2016).Google ScholarGoogle Scholar
  34. Yuan Yu, Michael Isard, Dennis Fetterly, Mihai Budiu, Úlfar Erlingsson, Pradeep Kumar Gunda, and Jon Currey. 2008. DryadLINQ: A System for General-Purpose Distributed Data-Parallel Computing Using a High-Level Language. In Proceedings of the 8th USENIX Symposium on Operating Systems Design and Implementation. USENIX. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. Matei Zaharia, Mosharaf Chowdhury, Tathagata Das, Ankur Dave, Justin Ma, Murphy McCauly, Michael J. Franklin, Scott Shenker, and Ion Stoica. 2012. Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing. In Proceedings of the 9th USENIX Symposium on Networked Systems Design and Implementation. USENIX. Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. Matei Zaharia, Tathagata Das, Haoyuan Li, Timothy Hunter, Scott Shenker, and Ion Stoica. 2013. Discretized Streams: Fault-Tolerant Streaming Computation at Scale. In Proceedings of the 24th ACM Symposium on Operating System Principles. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. Haoyu Zhang, Ganesh Ananthanarayanan, Peter Bodik, Matthai Philipose, Paramvir Bahl, and Michael J. Freedman. 2017. Live Video Analytics at Scale with Approximation and Delay-Tolerance.. In Proceedings of the 14th USENIX Symposium on Networked Systems Design and Implementation. USENIX. Google ScholarGoogle ScholarDigital LibraryDigital Library

Recommendations

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Sign in
  • Published in

    cover image ACM Conferences
    SOSP '17: Proceedings of the 26th Symposium on Operating Systems Principles
    October 2017
    677 pages
    ISBN:9781450350853
    DOI:10.1145/3132747

    Copyright © 2017 Owner/Author

    This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs International 4.0 License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    • Published: 14 October 2017

    Permissions

    Request permissions about this article.

    Request Permissions

    Check for updates

    Qualifiers

    • research-article
    • Research
    • Refereed limited

    Acceptance Rates

    Overall Acceptance Rate131of716submissions,18%

    Upcoming Conference

    SOSP '24

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader