Abstract
The explosive growth of Web videos raises the challenge of how to browse hundreds or even thousands of videos efficiently. Given an event-driven query, social media Web sites usually return a ranked list containing a large number of diverse and noisy videos. Exploring such results is time-consuming and degrades the user experience. This article presents a novel scheme that summarizes the content of video search results by mining and threading “key” shots, so that users can get an overview of the main content of these videos at a glance. The proposed framework comprises four stages. First, given an event query, a set of Web videos is collected together with their ranking order and tags. Second, key-shots are established and ranked based on near-duplicate keyframe detection, and they are threaded in chronological order. Third, the tags associated with key-shots are analyzed: irrelevant tags are filtered out via a representativeness and descriptiveness analysis, and the remaining tags are propagated among key-shots by random walk. Finally, summarization is formulated as an optimization problem that balances the relevance of key-shots against a user-defined skimming ratio. We provide two types of summarization: video skimming and a visual-textual storyboard. We conduct user studies on twenty event queries covering over one hundred hours of videos crawled from YouTube. The evaluation demonstrates the feasibility and effectiveness of the proposed solution.
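The tag propagation in the third stage can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so the code below assumes a standard random walk with restart over a key-shot similarity graph; the function name `propagate_tags`, the damping factor `alpha`, the iteration count, and the toy similarity values are all hypothetical choices made for illustration, not the authors' settings.

```python
def propagate_tags(similarity, initial_scores, alpha=0.85, iterations=50):
    """Random walk with restart for ONE tag over a key-shot graph.

    similarity:     square list of lists; similarity[i][j] >= 0 is an
                    assumed visual similarity between key-shots i and j.
    initial_scores: how strongly the tag is attached to each key-shot
                    before propagation.
    alpha:          probability of following an edge rather than
                    restarting from the initial scores (assumed value).
    """
    n = len(similarity)

    # Row-normalise similarities into transition probabilities.
    transition = []
    for row in similarity:
        total = sum(row)
        transition.append([v / total if total > 0 else 0.0 for v in row])

    scores = list(initial_scores)
    for _ in range(iterations):
        # Score of shot i = mass walking in from every shot j,
        # plus a restart share of the tag's initial score at i.
        scores = [
            alpha * sum(transition[j][i] * scores[j] for j in range(n))
            + (1 - alpha) * initial_scores[i]
            for i in range(n)
        ]
    return scores


# Toy example: shots 0 and 1 are near-duplicates; shot 2 is only
# weakly related to both. The tag starts on shot 0 alone.
similarity = [
    [0.0, 1.0, 0.1],
    [1.0, 0.0, 0.1],
    [0.1, 0.1, 0.0],
]
scores = propagate_tags(similarity, [1.0, 0.0, 0.0])
print(scores)
```

In this toy graph the tag spreads mostly to the strongly connected key-shot 1 and only slightly to key-shot 2, which is the qualitative behaviour one would expect from propagating tags along visual similarity.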
Index Terms
- Beyond search: Event-driven summarization for web videos