Abstract
Video sharing services that allow ordinary Web users to upload video clips of their choice and watch video clips uploaded by others have recently become very popular. This article identifies invariants in video sharing workloads through a comparison of the workload characteristics of four popular video sharing services. Our traces contain metadata on approximately 1.8 million videos, which together have been viewed approximately 6 billion times. Using these traces, we study the similarities and differences in the use of several Web 2.0 features, such as ratings, comments, favorites, and the propensity to upload content. In general, we find that active contribution, such as uploading and rating videos, is much less prevalent than passive use. While the distribution of the number of videos per uploader is skewed in general, the fraction of multi-time uploaders differs by a factor of two between two of the sites. The distributions of lifetime measures of video popularity have heavy-tailed forms that are similar across the four sites. Finally, we consider the implications of the identified invariants for system design. To gain further insight into caching in video sharing systems, and into the relevance of lifetime popularity measures to caching, we gathered an additional dataset tracking views to a set of approximately 1.3 million videos from one of the services over a twelve-week period. We find that lifetime popularity measures have some relevance for large cache (hot set) sizes (i.e., a hot set defined according to one of these measures is indeed relatively “hot”), but that this relevance decreases substantially as cache size decreases, owing to churn in video popularity.
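The hot-set comparison described in the abstract can be illustrated with a minimal sketch. It is not the article's methodology: the function names (`top_k`, `hit_ratio`), the view counts, and the use of a single week as the measurement period are all assumptions made for illustration. The sketch defines a hot set of size k by lifetime views, measures the fraction of one period's views that it captures, and compares that with the best possible hot set for the same period.

```python
from typing import Dict, List


def top_k(views: Dict[str, int], k: int) -> List[str]:
    """Return the k video ids with the highest view counts (ties broken by id)."""
    return sorted(views, key=lambda v: (-views[v], v))[:k]


def hit_ratio(hot_set: List[str], period_views: Dict[str, int]) -> float:
    """Fraction of the period's views that go to videos in the hot set."""
    total = sum(period_views.values())
    if total == 0:
        return 0.0
    return sum(period_views.get(v, 0) for v in hot_set) / total


# Hypothetical toy data, NOT taken from the article's traces.
lifetime_views = {"a": 1000, "b": 800, "c": 500, "d": 50, "e": 10}
week_views = {"a": 5, "b": 40, "c": 10, "d": 30, "e": 15}

for k in (1, 3, 4):
    by_lifetime = hit_ratio(top_k(lifetime_views, k), week_views)
    by_week = hit_ratio(top_k(week_views, k), week_views)
    print(f"k={k}: lifetime hot set {by_lifetime:.2f}, weekly hot set {by_week:.2f}")
```

In this toy data the lifetime-based hot set captures a much smaller share of the week's views than the ideal weekly hot set when k is small, and the gap narrows as k grows. The data were chosen to mimic the qualitative trend stated in the abstract and say nothing about its magnitude in the real traces.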
Index Terms
- Characterizing Web-Based Video Sharing Workloads