research-article

Beyond search: Event-driven summarization for web videos

Published: 02 December 2011

Abstract

The explosive growth of Web videos raises the challenge of how to browse hundreds or even thousands of videos efficiently. Given an event-driven query, social media Web sites usually return a ranked list containing a large number of diverse and noisy videos. Exploring such results is time-consuming and thus degrades the user experience. This article presents a novel scheme that summarizes the content of video search results by mining and threading “key” shots, so that users can get an overview of the main content of these videos at a glance. The proposed framework comprises four main stages. First, given an event query, a set of Web videos is collected together with their ranking order and tags. Second, key-shots are established and ranked based on near-duplicate keyframe detection, and they are threaded in chronological order. Third, we analyze the tags associated with key-shots: irrelevant tags are filtered out via a representativeness and descriptiveness analysis, whereas the remaining tags are propagated among key-shots by random walk. Finally, summarization is formulated as an optimization framework that trades off the relevance of key-shots against a user-defined skimming ratio. We provide two types of summarization: video skimming and visual-textual storyboard. We conduct user studies on twenty event queries covering over one hundred hours of videos crawled from YouTube. The evaluation demonstrates the feasibility and effectiveness of the proposed solution.
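The abstract does not give the exact formulation of the tag propagation step, so the following is only a minimal sketch of one common way to propagate scores over a graph: a random walk with restart. It assumes key-shots are graph nodes linked by a non-negative similarity matrix and that each key-shot starts with a relevance score per tag. The function name `propagate_tags`, the parameters `alpha` and `iterations`, and the toy data are all hypothetical and are not taken from the article.

```python
def propagate_tags(similarity, initial, alpha=0.85, iterations=50):
    """Random walk with restart over a key-shot graph (illustrative only).

    similarity: n x n list of lists with non-negative edge weights
                between key-shots (assumed input, not from the article).
    initial:    n x t list of lists with the starting relevance of each
                of t tags for each of n key-shots.
    alpha:      probability of following an edge instead of restarting
                from the initial scores.
    Returns an n x t list of propagated tag scores.
    """
    n = len(similarity)
    n_tags = len(initial[0])

    # Row-normalize the similarities into transition probabilities.
    # A key-shot with no neighbours gets an all-zero row.
    transition = []
    for row in similarity:
        total = sum(row)
        transition.append([w / total if total > 0 else 0.0 for w in row])

    scores = [list(row) for row in initial]
    for _ in range(iterations):
        updated = []
        for i in range(n):
            new_row = []
            for t in range(n_tags):
                # Score arriving from neighbouring key-shots.
                walked = sum(transition[i][j] * scores[j][t] for j in range(n))
                # Blend the walked score with the original score.
                new_row.append(alpha * walked + (1 - alpha) * initial[i][t])
            updated.append(new_row)
        scores = updated
    return scores


if __name__ == "__main__":
    # Toy graph: key-shots 0 and 1 are near-duplicates, key-shot 2 is isolated.
    toy_similarity = [[0.0, 1.0, 0.0],
                      [1.0, 0.0, 0.0],
                      [0.0, 0.0, 0.0]]
    # Only key-shot 0 carries the single tag at the start.
    toy_initial = [[1.0], [0.0], [0.0]]
    for shot, row in enumerate(propagate_tags(toy_similarity, toy_initial, alpha=0.5)):
        print(shot, row)
```

In this toy setup the tag spreads from key-shot 0 to its neighbour, key-shot 1, while the isolated key-shot 2 receives nothing. The restart term keeps each key-shot anchored to its own original tags, which is one plausible way to limit the spread of noisy tags.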

        • Published in

          ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 7, Issue 4
          November 2011
          108 pages
          ISSN:1551-6857
          EISSN:1551-6865
          DOI:10.1145/2043612

          Copyright © 2011 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 2 December 2011
          • Accepted: 1 January 2010
          • Revised: 1 December 2009
          • Received: 1 September 2009
          Published in TOMM, Volume 7, Issue 4

          Qualifiers

          • research-article
          • Research
          • Refereed
