skip to main content
10.1145/1871437.1871675acmconferencesArticle/Chapter ViewAbstractPublication PagescikmConference Proceedingsconference-collections
poster

Fast query expansion using approximations of relevance models

Published:26 October 2010Publication History

ABSTRACT

Pseudo-relevance feedback (PRF) improves search quality by expanding the query using terms from high-ranking documents from an initial retrieval. Although PRF can often result in large gains in effectiveness, running two queries is time consuming, limiting its applicability. We describe a PRF method that uses corpus pre-processing to achieve query-time speeds that are near those of the original queries. Specifically, Relevance Modeling, a language modeling based PRF method, can be recast to benefit substantially from finding pairwise document relationships in advance. Using the resulting Fast Relevance Model (fastRM), we substantially reduce the online retrieval time and still benefit from expansion. We further explore methods for reducing the preprocessing time and storage requirements of the approach, allowing us to achieve up to a 10% increase in MAP over unexpanded retrieval,vwhile only requiring 1% of the time of standard expansion.

References

  1. N. Abdul-Jaleel, J. Allan, W. B. Croft, F. Diaz, L. Larkey, X. Li, M. D. Smucker, and C. Wade. UMASS at TREC 2004 --- novelty and HARD. In Proceedings of TREC, Gaithersburg, MD, USA, 2004. NIST.Google ScholarGoogle ScholarCross RefCross Ref
  2. V. N. Anh and A. Moffat. Pruned query evaluation using pre-computed impacts. In Proceedings of SIGIR, pages 372--379, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. T. Elsayed, J. Lin, and D. W. Oard. Pairwise document similarity in large collections with mapreduce. In Proceedings of ACL/HLT, pages 265--268, Morristown, NJ, USA, 2008. Association for Computational Linguistics. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. K. Järvelin and J. Kekäläinen. Cumulated gain-based evaluation of IR techniques. ACM Trans. Inf. Syst., 20(4):422--446, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. V. Lavrenko and J. Allan. Real-time query expansion in relevance models. IR 473, University of Massachusetts Amherst, 2006.Google ScholarGoogle Scholar
  6. V. Lavrenko and W. B. Croft. Relevance based language models. In Proceedings of SIGIR, pages 120--127, New York, NY, USA, 2001. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. J. Lin. Brute force and indexed approaches to pairwise document similarity comparisons with MapReduce. In Proceedings of SIGIR, pages 155--162, New York, NY, USA, 2009. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. M. D. Smucker, J. Allan, and B. Carterette. A comparison of statistical significance tests for information retrieval evaluation. In Proceedings of CIKM, pages 623--632, New York, NY, USA, 2007. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. T. Strohman. Efficient Processing of Complex Features for Information Retrieval. PhD thesis, University of Massachusetts Amherst, December 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Fast query expansion using approximations of relevance models

      Recommendations

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in
      • Published in

        cover image ACM Conferences
        CIKM '10: Proceedings of the 19th ACM international conference on Information and knowledge management
        October 2010
        2036 pages
        ISBN:9781450300995
        DOI:10.1145/1871437

        Copyright © 2010 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 26 October 2010

        Permissions

        Request permissions about this article.

        Request Permissions

        Check for updates

        Qualifiers

        • poster

        Acceptance Rates

        Overall Acceptance Rate1,861of8,427submissions,22%

        Upcoming Conference

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader