DOI: 10.1145/1509212.1509215
research-article

Mining the web for visual concepts

Published: 24 August 2008

ABSTRACT

The web has the potential to serve as an excellent source of example imagery for visual concepts. Image search engines based on text keywords can fetch thousands of images for a given query; however, their results tend to be visually noisy. We present a technique that allows a user to refine noisy search results and characterize a more precise visual object class. With a small amount of user intervention we are able to re-rank search engine results to obtain many more examples of the desired concept. Our approach is based on semi-supervised machine learning in a novel probabilistic graphical model composed of both generative and discriminative elements. Learning is achieved via a hybrid expectation maximization / expected gradient procedure initialized with a small example set defined by the user. We demonstrate our approach on images of musical instruments collected from Google image search. The rankings given by our model show significant improvement with respect to the user-refined query. The results are suitable for improving user experience in image search applications and for collecting large labeled datasets for computer vision research.
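
The re-ranking idea in the abstract can be illustrated with a much simpler stand-in. The sketch below is not the paper's hybrid generative/discriminative model and uses no expected gradient step; it is a purely generative two-class multinomial naive Bayes trained with plain EM, seeded by a few user-labeled images. The function name `rerank`, the bag-of-visual-words input format, and the smoothing parameter `alpha` are all assumptions made for illustration.

```python
import math

def rerank(counts, seed_pos, seed_neg, n_iter=10, alpha=1.0):
    """Semi-supervised naive Bayes re-ranking (illustrative stand-in only).

    counts   -- one visual-word histogram (list of ints) per retrieved image
    seed_pos -- indices the user marked as showing the desired concept
    seed_neg -- indices the user marked as not showing it
    Returns (ranking, post): image indices sorted by descending
    P(relevant | image), and the posteriors themselves.
    """
    n, v = len(counts), len(counts[0])
    # Soft labels: user seeds are clamped, unlabeled images start undecided.
    post = [0.5] * n
    for i in seed_pos:
        post[i] = 1.0
    for i in seed_neg:
        post[i] = 0.0
    clamped = set(seed_pos) | set(seed_neg)

    for _ in range(n_iter):
        # M-step: smoothed class prior and per-class word distributions.
        prior = (sum(post) + alpha) / (n + 2 * alpha)
        word_pos = [alpha] * v
        word_neg = [alpha] * v
        for x, p in zip(counts, post):
            for j, c in enumerate(x):
                word_pos[j] += p * c
                word_neg[j] += (1.0 - p) * c
        z_pos, z_neg = sum(word_pos), sum(word_neg)
        log_pos = [math.log(w / z_pos) for w in word_pos]
        log_neg = [math.log(w / z_neg) for w in word_neg]

        # E-step: update posteriors for unlabeled images only.
        for i, x in enumerate(counts):
            if i in clamped:
                continue
            lp = math.log(prior) + sum(c * l for c, l in zip(x, log_pos))
            ln = math.log(1.0 - prior) + sum(c * l for c, l in zip(x, log_neg))
            m = max(lp, ln)  # subtract the max for numerical stability
            ep, en = math.exp(lp - m), math.exp(ln - m)
            post[i] = ep / (ep + en)

    ranking = sorted(range(n), key=lambda i: -post[i])
    return ranking, post

# Toy data (made up): 6 images, 4 visual words, one seed per class.
toy = [[9, 8, 1, 0], [1, 0, 9, 9], [8, 9, 0, 1],
       [0, 1, 8, 9], [7, 9, 1, 1], [1, 1, 9, 7]]
ranking, post = rerank(toy, seed_pos=[0], seed_neg=[1])
print(ranking)
```

In this toy setup the unlabeled images whose histograms resemble the positive seed move to the top of the ranking. A real system would first have to extract the visual-word histograms from the images, which the sketch takes as given.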

Published in

MDM '08: Proceedings of the 9th International Workshop on Multimedia Data Mining: held in conjunction with the ACM SIGKDD 2008
August 2008
74 pages
ISBN: 9781605582610
DOI: 10.1145/1509212

Copyright © 2008 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery
New York, NY, United States
