DOI: 10.1145/1873951.1874198

Video genetics: a case study from YouTube

Published: 25 October 2010

ABSTRACT

We explore, in a single but large case study, how videos within YouTube, competing for view counts, are like organisms within an ecology, competing for survival. We develop this analogy, whose core idea is that short video clips, best detected across videos as near-duplicate keyframes, behave similarly to genes. We report work in progress on a dataset of 5.4K videos with 210K keyframes on a single topic, which traces sequences, not bags, of "near-dups" over time, both within videos and across them. We demonstrate their utility to: cleanse responses to queries contaminated by over-eager YouTube query expansion; separate videos temporally according to their responses to external events; track the evolution and lifespan of continuing video "stories"; automatically locate video summaries already present within a video ecology; quickly verify video copying via a direct application of the Smith-Waterman algorithm used in genetics, which also provides useful feedback for tuning the near-dup detection and clustering process; and quickly classify videos via a kind of Lempel-Ziv encoding into the categories of news, monologue, dialogue, and slideshow. We demonstrate a number of novel visualizations of this large dataset, including a direct use of the Matlab black-body "hot" false-color map, together with the GraphViz package, to display the gene-like inheritance of viral properties of keyframes. We further speculate that, as with genes, there are "functional roles" for semantic categories of clips, and, as with species, there are differing rates of "genetic drift" for each video genre.
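To make the two sequence techniques named in the abstract concrete, here is a minimal sketch, not the authors' implementation. It assumes each video has already been reduced to a sequence of integer near-duplicate cluster IDs, one per keyframe. The function names, the scoring values (match=2, mismatch=-1, gap=-1), and the choice of an LZ78-style phrase count as the "kind of Lempel-Ziv encoding" are all illustrative assumptions; the abstract does not specify any of them.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment of two sequences of near-dup cluster IDs.

    Returns (score, pairs), where pairs lists the aligned (index_in_a,
    index_in_b) positions. Scoring values are illustrative assumptions.
    """
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]  # H[i][j]: best score ending at a[i-1], b[j-1]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0,                      # start a fresh local alignment
                          H[i - 1][j - 1] + s,    # align a[i-1] with b[j-1]
                          H[i - 1][j] + gap,      # skip a keyframe in a
                          H[i][j - 1] + gap)      # skip a keyframe in b
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until the score drops to zero.
    i, j = best_pos
    pairs = []
    while i > 0 and j > 0 and H[i][j] > 0:
        s = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + s:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return best, pairs


def lz78_phrase_count(seq):
    """Number of phrases in an LZ78-style parse of seq.

    Repetitive sequences parse into fewer phrases than varied ones, so the
    count is a rough complexity measure (an assumed stand-in for the
    paper's encoding, which the abstract does not detail).
    """
    seen = set()
    phrase = ()
    count = 0
    for sym in seq:
        phrase += (sym,)
        if phrase not in seen:   # first time this phrase occurs: close it
            seen.add(phrase)
            count += 1
            phrase = ()
    if phrase:                   # trailing phrase that was already seen
        count += 1
    return count


# Hypothetical cluster-ID sequences for two videos that share a clip.
video_a = [7, 3, 3, 9, 4, 1]
video_b = [5, 3, 9, 4, 8]
score, pairs = smith_waterman(video_a, video_b)
```

A high alignment score relative to the sequence lengths suggests a shared run of keyframes, which is the sense in which local alignment can support a copy check. A phrase count that is low relative to sequence length indicates heavy repetition of the same keyframe clusters; how such a measure maps onto the genres listed in the abstract is left to the paper itself.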

References

  1. D. Arijon. Grammar of the Film Language. Silman-James Press, 1976.
  2. R. Colbaugh, K. Glass, and P. Ormerod. Predictability and Prediction for an Experimental Cultural Market. In Advances in Social Computing, volume 6007, pages 79--86. Springer Berlin, 2010.
  3. R. Crane and D. Sornette. Viral, Quality, and Junk Videos on YouTube: Separating Content From Noise in an Information-Rich Environment. In Proceedings of the AAAI Symposium on Social Information Processing, pages 18--20, March 2008.
  4. C. Gini. Measurement of Inequality of Incomes. Economic Journal, 31:124--126, 1921.
  5. J. L. Iribarren and E. Moro. Impact of Human Activity Patterns on the Dynamics of Information Diffusion. Physical Review Letters, 103(3):038702-1--038702-4, July 2009.
  6. J. Leskovec, L. Backstrom, and J. Kleinberg. Meme-Tracking and the Dynamics of the News Cycle. In Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD'09), pages 497--506. ACM, 2009.
  7. J. Lin, E. Keogh, S. Lonardi, and B. Chiu. A symbolic representation of time series, with implications for streaming algorithms. In DMKD '03: Proceedings of the 8th ACM SIGMOD workshop on Research issues in data mining and knowledge discovery, pages 2--11, New York, NY, USA, 2003. ACM.
  8. M. Salganik, P. Dodds, and D. Watts. Experimental Study of Inequality and Unpredictability in an Artificial Cultural Market. Science, 311(5762):854--856, 2006.
  9. T. Smith and M. Waterman. Identification of Common Molecular Subsequences. Journal of Molecular Biology, 147(1):195--197, March 1981.
  10. E. Sun, I. Rosenn, C. Marlow, and T. Lento. Gesundheit! Modeling Contagion through Facebook News Feeds. In Proceedings of the Third International Conference on Weblogs and Social Media. AAAI Press, May 2009.
  11. J. Ziv and A. Lempel. A Universal Algorithm for Sequential Data Compression. IEEE Transactions on Information Theory, 23(3):337--343, 1977.

Published in

MM '10: Proceedings of the 18th ACM international conference on Multimedia
October 2010
1836 pages
ISBN: 9781605589336
DOI: 10.1145/1873951

Copyright © 2010 ACM

Publisher: Association for Computing Machinery, New York, NY, United States
