ABSTRACT
We consider the task of suggesting related queries to users after they issue an initial query to a web search engine. We propose a machine learning approach that learns the probability that a user will find a follow-up query both useful and relevant, given the initial query. Because the approach is model-based, it also generalizes to queries that have never occurred in the logs. The model is trained on co-occurrences mined from the search logs, using novel utility and relevance models, and training requires no data labeled by human judges. The learning step lets us generalize from past observations and generate query suggestions beyond the queries that co-occurred in the past. This brings significant gains in coverage and modest gains in relevance. Both offline evaluation (based on human judges) and online evaluation (based on millions of user interactions) demonstrate that our approach significantly outperforms strong baselines.
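To make the starting point concrete, the following is a minimal sketch of mining follow-up co-occurrences from session logs and turning them into a naive estimate of the probability of a follow-up query given an initial query. It is not the model described in the abstract: the session format (a list of query strings per session), the function names `mine_cooccurrences` and `followup_probabilities`, and the plain relative-frequency estimate are all assumptions made for illustration, and the utility and relevance models are not represented.

```python
from collections import Counter, defaultdict


def mine_cooccurrences(sessions):
    """Count how often query b directly follows query a within a session.

    `sessions` is assumed to be an iterable of lists of query strings,
    ordered by time within each session (a hypothetical log format).
    """
    pair_counts = defaultdict(Counter)
    for session in sessions:
        for a, b in zip(session, session[1:]):
            if a != b:  # ignore immediate repeats of the same query
                pair_counts[a][b] += 1
    return pair_counts


def followup_probabilities(pair_counts, query):
    """Relative-frequency estimate of P(follow-up | query).

    Returns an empty dict for a query never seen as an initial query,
    which is the coverage gap a learned model aims to close.
    """
    counts = pair_counts.get(query, Counter())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {b: c / total for b, c in counts.items()}


# Toy sessions, invented purely for illustration.
sessions = [
    ["cheap flights", "cheap flights to paris", "paris hotels"],
    ["cheap flights", "cheap flights to paris"],
    ["cheap flights", "flight tracker"],
]

pair_counts = mine_cooccurrences(sessions)
probs = followup_probabilities(pair_counts, "cheap flights")
unseen = followup_probabilities(pair_counts, "budget airfare europe")
```

In this toy log, `probs` ranks "cheap flights to paris" above "flight tracker", while `unseen` is empty because the query never appeared in the sessions. A purely count-based suggester has nothing to offer in that case, which is the limitation the learning step in the abstract is meant to address.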
Index Terms
- Learning to suggest: a machine learning framework for ranking query suggestions