Beyond search: Event-driven summarization for web videos

Authors:
Richang Hong, National University of Singapore, Singapore
Jinhui Tang, National University of Singapore, Singapore
Hung-Khoon Tan, City University of Hong Kong, Hong Kong
Chong-Wah Ngo, City University of Hong Kong, Hong Kong
Shuicheng Yan, National University of Singapore, Singapore
Tat-Seng Chua, National University of Singapore, Singapore

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 7, Issue 4, Article No.: 35, pp 1–18. https://doi.org/10.1145/2043612.2043613

Published: 02 December 2011

Abstract

The explosive growth of Web videos raises the challenge of how to efficiently browse hundreds or even thousands of videos at a glance. Given an event-driven query, social media Web sites usually return a ranked list containing a large number of diverse and noisy videos. Exploring such results is time-consuming and thus degrades the user experience. This article presents a novel scheme that summarizes the content of video search results by mining and threading “key” shots, so that users can get an overview of the main content of these videos at a glance. The proposed framework comprises four main stages. First, given an event query, a set of Web videos is collected together with their ranking order and tags. Second, key-shots are established and ranked based on near-duplicate keyframe detection, and they are threaded in chronological order. Third, we analyze the tags associated with key-shots: irrelevant tags are filtered out via a representativeness and descriptiveness analysis, whereas the remaining tags are propagated among key-shots by random walk. Finally, summarization is formulated as an optimization framework that trades off the relevance of key-shots against a user-defined skimming ratio. We provide two types of summarization: video skimming and visual-textual storyboard. We conduct user studies on twenty event queries covering over a hundred hours of videos crawled from YouTube. The evaluation demonstrates the feasibility and effectiveness of the proposed solution.
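
As a rough illustration of the third and fourth stages, the sketch below propagates a tag score over a key-shot similarity graph and then picks key-shots under a skimming ratio. The abstract does not give the exact formulation, so everything here is an assumption: a generic random walk with restart stands in for the tag propagation, a greedy budgeted pick stands in for the optimization, and the names `propagate_tag_scores`, `select_key_shots`, `alpha`, and the toy inputs are hypothetical.

```python
def propagate_tag_scores(similarity, initial, alpha=0.85, iterations=100):
    """Generic random walk with restart over a key-shot similarity graph.

    similarity: square list of lists of non-negative edge weights (assumed input).
    initial:    starting score of one tag on each key-shot.
    alpha:      probability of following an edge instead of restarting.
    """
    n = len(similarity)
    # Row-normalize the weights into transition probabilities.
    transition = []
    for row in similarity:
        total = sum(row)
        transition.append([w / total if total > 0 else 0.0 for w in row])
    scores = list(initial)
    for _ in range(iterations):
        # Each shot receives score from its neighbors plus a restart share.
        scores = [
            alpha * sum(transition[j][i] * scores[j] for j in range(n))
            + (1 - alpha) * initial[i]
            for i in range(n)
        ]
    return scores


def select_key_shots(relevance, durations, skim_ratio):
    """Greedy stand-in for the summarization step (not the paper's optimizer).

    Takes the most relevant key-shots first while the summary stays within
    skim_ratio times the total duration.
    """
    budget = skim_ratio * sum(durations)
    chosen, used = [], 0.0
    order = sorted(range(len(relevance)), key=lambda i: relevance[i], reverse=True)
    for i in order:
        if used + durations[i] <= budget:
            chosen.append(i)
            used += durations[i]
    # Index order stands in for the chronological threading of key-shots.
    return sorted(chosen)


# Toy chain of three key-shots: 0 -- 1 -- 2, with the tag seen only on shot 0.
similarity = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
scores = propagate_tag_scores(similarity, [1.0, 0.0, 0.0])
summary = select_key_shots(scores, durations=[10, 20, 10], skim_ratio=0.5)
print(scores, summary)
```

In this toy graph the tag spreads from shot 0 to shots that are visually linked to it, and the skimming ratio caps the total duration of the selected key-shots. A real system would derive the similarity weights from near-duplicate keyframe detection rather than set them by hand.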

Index Terms

Beyond search: Event-driven summarization for web videos
1. Information systems
  1. Information systems applications
    1. Multimedia information systems
  2. World Wide Web
    1. Web applications
    2. Web services

Published in

ACM Transactions on Multimedia Computing, Communications, and Applications Volume 7, Issue 4
November 2011
108 pages
ISSN: 1551-6857
EISSN: 1551-6865
DOI: 10.1145/2043612

Copyright © 2011 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 2 December 2011
- Accepted: 1 January 2010
- Revised: 1 December 2009
- Received: 1 September 2009
Author Tags
Event evolution
Web video summarization
key-shot tagging
key-shot threading
Qualifiers
- research-article
- Research
- Refereed