Abstract
The relative ineffectiveness of information retrieval systems is largely caused by the inaccuracy with which a query formed of a few keywords models the user's actual information need. One well-known method to overcome this limitation is automatic query expansion (AQE), whereby the user's original query is augmented by new features with a similar meaning. AQE has a long history in the information retrieval community, but only in recent years has it reached a level of scientific and experimental maturity, especially in laboratory settings such as TREC. This survey presents a unified view of a large number of recent approaches to AQE that leverage various data sources and employ very different principles and techniques. It addresses the following questions. Why is query expansion so important for improving search effectiveness? What are the main steps involved in the design and implementation of an AQE component? What approaches to AQE are available, and how do they compare? Which issues must still be resolved before AQE becomes a standard component of large operational information retrieval systems (e.g., search engines)?
- Riezler, S., Vasserman, A., Tsochantaridis, I., Mittal, V., and Liu, Y. 2007. Statistical machine translation for query expansion in answer retrieval. In Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics (ACL-07). Association for Computational Linguistics, 464--471.Google Scholar
- Robertson, S. E. 1986. On relevance weight estimation and query expansion. J. Documentation 42, 3, 182--188.Google ScholarCross Ref
- Robertson, S. E. 1990. On term selection for query expansion. J. Documentation 46, 4, 359--364. Google ScholarDigital Library
- Robertson, S. E. and Sparck Jones, K. 1976. Relevance weighting of search terms. J. Amer. Soc. Info. Science 27, 129--146.Google ScholarCross Ref
- Robertson, S. E. and Walker, S. 2000. Microsoft cambridge at trec-9: Filtering track. In Proceedings of the 9th Text REtrieval Conference (TREC-9). NIST Special Publication 500-249. National Institute of Standards and Technology (NIST), Gaithersburg, MD, 361--368.Google Scholar
- Robertson, S. E., Walker, S., and Beaulieu, M. M. 1998. Okapi at TREC-7: Automatic ad hoc, filtering, VLC, and interactive track. In Proceedings of the 7th Text REtrieval Conference (TREC-7), NIST Special Publication 500-242. National Institute of Standards and Technology (NIST), Gaithersburg, MD, 253--264.Google Scholar
- Rocchio, J. J. 1971. Relevance feedback in information retrieval. In The SMART Retrieval System, G. Salton Ed., Prentice-Hall, Englewood Cliffs, NJ, 313--323.Google Scholar
- Ruthven, I. 2003. Re-examining the potential effectiveness of interactive query expansion. In Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM Press, 213--220. Google ScholarDigital Library
- Ruthven, I. and Lalmas, M. 2003. A survey on the use of relevance feedback for information access systems. Knowl. Engin. Rev. 18, 2, 95--145. Google ScholarDigital Library
- Sahlgren, M. 2005. An introduction to random indexing. In Proceedings of the Methods and Applications of Semantic Indexing Workshop at the 7th International Conference on Terminology and Knowledge Engineering.Google Scholar
- Sakai, T., Manabe, M., and Koyama, M. 2005. Flexible pseudo-relevance feedback via selective sampling. ACM Trans. Info. Syst. 4, 2, 111--35. Google ScholarDigital Library
- Salton, G. and Buckley, C. 1990. Improving retrieval performance by relevance feedback. J. Amer. Soc. Info. Science 41, 4, 288--297.Google ScholarCross Ref
- Salton, G. and McGill, M. 1983. Introduction to Modern Information Retrieval. McGraw Hill, New York, NY. Google ScholarDigital Library
- Sanderson, M. 1994. Word sense disambiguation and information retrieval. In Proceedings of the 17th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM Press, 142--151. Google ScholarDigital Library
- Sanderson, M. 2000. Retrieving with good sense. Info. Retrieval 2, 1, 49--69. Google ScholarDigital Library
- Savoy, J. 2005. Comparative study of monolingual and multilingual search models for use with asian languages. ACM Trans. Asian Lang. Info. Process. 4, 2, 163--189. Google ScholarDigital Library
- Schlaefer, N., Ko, J., Betteridge, J., Sautter, G., and amd E. Nyberg, M. P. 2007. Semantic extensions of the Ephyra QA system for TREC 2007. In Proceedings of the 16th Text REtrieval Conference (TREC’07). NIST Special Publication 500-274. National Institute of Standards and Technology (NIST), Gaithersburg, MD, 332--341.Google Scholar
- Schütze, H. and Pedersen, J. O. 1995. Information retrieval based on word senses. In Proceedings of the 4th Annual Symposium on Document Analysis and Information Retrieval. 161--175.Google Scholar
- Schütze, H. and Pedersen, O. 1997. A co-occurrence based thesaurus and two applications to information retrieval. Info. Process. Manage. 33, 3, 307--318. Google ScholarDigital Library
- Semeraro, G., Lops, P., Basile, P., and de Gemmis, M. 2009. On the tip of my thought: Playing the guillotine game. In Proceedings of the 21st International Joint Conference on Artificial Intelligence. AAAI Press, 1543--1548. Google ScholarDigital Library
- Shen, X. and Zhai, C. 2005. Active feedback in ad hoc information retrieval. In Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM Press, 59--66. Google ScholarDigital Library
- Shokouhi, M., Azzopardi, L., and Thomas, P. 2009. Effective query expansion for federated search. In Proceedings of the 32nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM Press, 427--434. Google ScholarDigital Library
- Singhal, A. and Pereira, F. 1999. Document expansion for speech retrieval. In Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM Press, 34--41. Google ScholarDigital Library
- Song, M., Song, I.-Y., Allen, R. B., and Obradovic, Z. 2006. Keyphrase extraction-based query expansion in digital libraries. In Proceedings of the 6th ACM/IEEE-CS joint International Conference on Digital Libraries (JCDL’06). ACM Press, 202--209. Google ScholarDigital Library
- Song, M., Song, I.-Y., Hu, X., and Allen, R. B. 2007. Integration of association rules and ontologies for semantic query expansion. Data Knowl. Engin. 63, 1, 63--75. Google ScholarDigital Library
- Sun, R., Ong, C.-H., and Chua, T.-S. 2006. Mining dependency relations for query expansion in passage retrieval. In Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM Press, 382--389. Google ScholarDigital Library
- Suryanto, M. A., Lim, E.-P., Sun, A., and Chiang, R. H. 2007. Document expansion versus query expansion for ad-hoc retrieval. In Proceedings of the ACM 1st Workshop on CyberInfrastructure: Information Management in eScience. ACM Press, 47--54.Google Scholar
- Theobald, M., Shenkel, R., and Weikum, G. 2004. Top-k query evaluation with probabilistic guarantees. In Proceedings of the 13th International Conference on Very Large Data Bases. ACM Press, 648--659. Google ScholarDigital Library
- Theobald, M., Shenkel, R., and Weikum, G. 2005. Efficient and selftuning incremental query expansion for top-k query processing. In Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM Press, 242--249. Google ScholarDigital Library
- van Rijsbergen, C. J. 1979. Information Retrieval. Butterworths. Google ScholarDigital Library
- Vechtomova, O. 2009. Query expansion for information retrieval. In Encyclopedia of Database Systems, L. Liu and M. T. Özsu Eds., Springer, 2254--2257.Google Scholar
- Vechtomova, O. and Karamuftuoglu, M. 2004. Elicitation and use of relevance feedback information. Info. Process. Manage. 42, 1, 191--206. Google ScholarDigital Library
- Véronis, J. 2004. HyperLex: lexical cartography for information retrieval. Computer Speech Lang. 18, 3, 223--252.Google ScholarCross Ref
- Voorhees, E. 1993. Using wordnet to disambiguate word senses for text retrieval. In Proceedings of the 16th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM Press, 171--180. Google ScholarDigital Library
- Voorhees, E. 1994. Query expansion using lexical-semantic relations. In Proceedings of the 17th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM Press, 61--69. Google ScholarDigital Library
- Voorhees, E. 2004. Overview of the trec 2004 robust track. In Proceedings of the 13th Text REtrieval Conference (TREC-7). NIST Special Publication 500-261. National Institute of Standards and Technology (NIST), Gaithersburg, MD.Google Scholar
- Voorhees, E. and Harman, D. 1998. Overview of the seventh text retrieval conference (TREC-7). In Proceedings of the 7th Text REtrieval Conference (TREC-7). NIST Special Publication 500-242. National Institute of Standards and Technology (NIST), Gaithersburg, MD, 1--24.Google Scholar
- Wang, H., Liang, Y., Fu, L., Xue, G.-R., and Yu, Y. 2009. Efficient query expansion for advertisement search. In Proceedings of the 32nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM Press, 51--58. Google ScholarDigital Library
- Wang, X., Fang, H., and Zhai, C. 2008. A study of methods for negative relevance feedback. In Proceedings of the 31th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM Press, 219--226. Google ScholarDigital Library
- Wei, X. and Croft, W. B. 2007. Modeling term associations for ad-hoc retrieval performance within language modeling framework. In Proceedings of the 29th European Conference on IR Research (ECIR’07). Springer, 52--63. Google ScholarDigital Library
- White, R. W., Ruthven, I., and Jose, J. M. 2005. A study of factors affecting the utility of implicit relevance feedback. In Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM Press, 35--42. Google ScholarDigital Library
- Winaver, M., Kurland, O., and Domshlak, C. 2007. Towards robust query expansion: Model selection in the language modeling framework. In Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM Press, 729--730. Google ScholarDigital Library
- Witten, I. H., Moffat, A., and Bell, T. C. 1999. Managing Gigabytes: Compressing and Indexing Documents and Images 2nd Ed. Morgan Kaufman. Google ScholarDigital Library
- Wong, S. K. M., Ziarko, W., Raghavan, V. V., and Wong, P. C. N. 1987. On modeling of information retrieval concepts in vector spaces. ACM Trans. Datab. Syst. 12, 2, 299--321. Google ScholarDigital Library
- Wong, W. S., Luk, R. W. P., Leong, H. V., Ho, K. S., and Lee, D. L. 2008. Re-examining the effects of adding relevance information in a relevance feedback environment. Info. Process. Manage. 44, 3, 1086--1116. Google ScholarDigital Library
- Xu, J. and Croft, W. B. 1996. Query expansion using local and global document analysis. In Proceedings of the 19th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM Press, 4--11. Google ScholarDigital Library
- Xu, J. and Croft, W. B. 2000. Improving the effectiveness of information retrieval with local context analysis. ACM Trans. Info. Syst. 18, 1, 79--112. Google ScholarDigital Library
- Xu, Y., Jones, G. J. F., and Wang, B. 2009. Query dependent pseudo-relevance feedback based on wikipedia. In Proceedings of the 32nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM Press, 59--66. Google ScholarDigital Library
- Xu, Z. and Akella, R. 2007. Incorporating diversity and density in active learning for relevance feedback. In Proceedings of the 29th European Conference on IR Research (ECIR’07). Springer, 246--257. Google ScholarDigital Library
- Xue, G.-R., Zeng, H.-J., Chen, Z., Yu, Y., Ma, W.-Y., Xi, W., and Fan, W. 2004. Optimizing web search using web click-through data. In Proceedings of the 13th ACM International Conference on Information and Knowledge Management. ACM Press, 118--126. Google ScholarDigital Library
- Yin, Z., Shokouhi, M., and Craswell, N. 2009. Query expansion using external evidence. In Proceedings of the 31th European Conference on Information Retrieval (ECIR’09). Springer, 362--374. Google ScholarDigital Library
- Yu, S., Cai, D., Wen, J. R., and Ma, W. Y. 2003. Improving pseudo-relevance feedback in web information retrieval using web page segmentation. In Proceedings of the 12th International Conference on World Wide Web. ACM, 11--18. Google ScholarDigital Library
- Zelikovitz, S. and Hirsh, H. 2000. Improving short-text classification using unlabeled background knowledge to assess document similarity. In Proceedings of the 17th International Conference on Machine Learning (ICML’00). National Institute of Standards and Technology (NIST), 1183--1190. Google ScholarDigital Library
- Zha, Z.-J., Yang, L., Mei, T., Wang, M., and Wang, Z. 2009. Visual query suggestion. In Proceedings of the 17th ACM International Conference on Multimedia. ACM Press, 15--24. Google ScholarDigital Library
- Zhai, C. and Lafferty, J. 2001a. Model-based feedback in the language modeling approach to information retrieval. In Proceedings of the 10th International Conference on Information and Knowledge Management. ACM Press, 403--410. Google ScholarDigital Library
- Zhai, C. and Lafferty, J. 2001b. A study of smoothing methods for language models applied to ad hoc information retrieval. In Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM Press, 334--342. Google ScholarDigital Library
- Zimmer, C., Tryfonopoulos, C., and Weikum, G. 2008. Exploiting correlated keywords to improve approximate information filtering. In Proceedings of the 31th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM Press, 323--330. Google ScholarDigital Library
Index Terms
- A Survey of Automatic Query Expansion in Information Retrieval