DOI: 10.1145/957013.957065

Discriminative model fusion for semantic concept detection and annotation in video

Published: 02 November 2003

ABSTRACT

In this paper we describe a general information fusion algorithm that can be used to incorporate multimodal cues in building user-defined semantic concept models. We compare this technique with a Bayesian Network-based approach on a semantic concept detection task. Results indicate that this technique yields superior performance. We demonstrate this approach further by building classifiers of arbitrary concepts in a score space defined by a pre-deployed set of multimodal concepts. Results show annotation for user-defined concepts both in and outside the pre-deployed set is competitive with our best video-only models on the TREC Video 2002 corpus.
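
The idea of classifying in a score space can be illustrated with a minimal sketch. It assumes each video shot has already been scored by a few pre-deployed concept detectors, and it uses plain logistic regression as a stand-in, since the abstract does not say which discriminative classifier the paper uses. The detector names, the toy scores, and the function names `train_fusion_classifier` and `fused_score` are all hypothetical and chosen only for illustration.

```python
import math

def train_fusion_classifier(score_vectors, labels, epochs=200, lr=0.5):
    """Fit a logistic-regression fusion model on concept-score vectors.

    Assumption: logistic regression trained by stochastic gradient ascent
    stands in for the paper's discriminative classifier.
    """
    dim = len(score_vectors[0])
    weights = [0.0] * dim
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(score_vectors, labels):
            z = bias + sum(w * xi for w, xi in zip(weights, x))
            p = 1.0 / (1.0 + math.exp(-z))
            err = y - p  # gradient of the log-likelihood w.r.t. z
            weights = [w + lr * err * xi for w, xi in zip(weights, x)]
            bias += lr * err
    return weights, bias

def fused_score(weights, bias, x):
    """Return the fused confidence in [0, 1] for one score vector."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical toy data: each row holds the scores of three pre-deployed
# detectors for one shot, e.g. [visual "outdoors", audio "speech", text cue].
train_x = [
    [0.9, 0.1, 0.8],
    [0.8, 0.2, 0.7],
    [0.7, 0.3, 0.9],
    [0.2, 0.9, 0.1],
    [0.1, 0.8, 0.3],
    [0.3, 0.7, 0.2],
]
# Hypothetical labels for a new user-defined concept (1 = present).
train_y = [1, 1, 1, 0, 0, 0]

weights, bias = train_fusion_classifier(train_x, train_y)
```

Calling `fused_score(weights, bias, [0.85, 0.15, 0.75])` then scores a new shot for the user-defined concept from its detector scores alone, without revisiting the raw audio, visual, or text features.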

Published in

MULTIMEDIA '03: Proceedings of the eleventh ACM international conference on Multimedia
November 2003
670 pages
ISBN: 1581137222
DOI: 10.1145/957013

Copyright © 2003 ACM

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

Overall acceptance rate: 995 of 4,171 submissions, 24%
