DOI: 10.1145/1878151.1878155
research-article

Tag suggestion and localization in user-generated videos based on social knowledge

Published: 25 October 2010

ABSTRACT

Nowadays, almost every web site that lets users share user-generated multimedia content, such as Flickr, Facebook, YouTube and Vimeo, offers tagging functionality so that users can annotate the material they share. The tags are then used to retrieve the uploaded content and to ease browsing and exploration of these collections, e.g. using tag clouds. However, while tagging a single image is straightforward, and sites like Flickr and Facebook also make it easy to tag portions of uploaded photos, tagging a video sequence is more cumbersome, so users tend to tag only the overall content of a video. Moreover, the tagging process is completely manual, and users often spend as little time as possible annotating the material, resulting in sparse annotation of the visual content. A semi-automatic process that helps users tag a video sequence would improve the quality of annotations and thus the overall user experience. While research on image tagging has received considerable attention in recent years, there are still very few works that address the problem of automatically assigning tags to videos and locating them temporally within the video sequence. In this paper we present a system for video tag suggestion and temporal localization based on collective knowledge and visual similarity of frames. The algorithm suggests new tags that can be associated with a given keyframe by exploiting visual features and the tags associated with videos and images uploaded to social sites like YouTube and Flickr.
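
As a rough illustration of the idea in the abstract, the sketch below suggests tags for one keyframe by letting its visually nearest tagged images vote for candidate tags. The abstract does not specify the features, the distance measure, or the voting rule, so everything here is an assumption chosen for illustration and not the authors' algorithm: the function name `suggest_tags`, Euclidean distance on plain feature vectors, the parameters `k` and `min_votes`, and the toy data.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def suggest_tags(keyframe_feat, video_tags, tagged_images, k=3, min_votes=2):
    """Suggest new tags for a keyframe by nearest-neighbour voting.

    keyframe_feat: feature vector of the keyframe (hypothetical descriptor).
    video_tags:    tags already attached to the whole video; these are not
                   suggested again, since only *new* tags are wanted.
    tagged_images: list of (feature_vector, set_of_tags) pairs, standing in
                   for images collected from a social site.
    """
    # Keep the k images that are visually closest to the keyframe.
    neighbours = sorted(
        tagged_images, key=lambda item: euclidean(keyframe_feat, item[0])
    )[:k]

    # Each neighbour casts one vote for every tag it carries.
    votes = Counter(tag for _, tags in neighbours for tag in tags)

    # Rank by vote count (ties broken alphabetically), drop weakly supported
    # tags and tags the video already has.
    existing = set(video_tags)
    ranked = sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tag for tag, n in ranked if n >= min_votes and tag not in existing]


# Toy data: two-dimensional "features" and hand-made tags (illustrative only).
images = [
    ([0.10, 0.90], {"beach", "sea"}),
    ([0.20, 0.80], {"beach", "sunset"}),
    ([0.15, 0.85], {"sea", "beach"}),
    ([0.90, 0.10], {"city", "night"}),
]

print(suggest_tags([0.12, 0.88], ["holiday"], images))
```

Running the function once per keyframe would attach suggestions to specific positions in the video, which is one simple reading of temporal localization; the paper itself may proceed differently.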

Published in

WSM '10: Proceedings of second ACM SIGMM workshop on Social media
October 2010
74 pages
ISBN: 9781450301732
DOI: 10.1145/1878151

Copyright © 2010 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery
New York, NY, United States
