DOI: 10.1145/2348283.2348290
research-article

Learning to suggest: a machine learning framework for ranking query suggestions

Published: 12 August 2012

ABSTRACT

We consider the task of suggesting related queries to users after they issue an initial query to a web search engine. We propose a machine learning approach that learns the probability that a user will find a follow-up query both useful and relevant, given the initial query. Because the approach rests on a learned model, it generalizes to queries that have never occurred in the logs. The model is trained on co-occurrences mined from the search logs, using novel utility and relevance models, and training requires no data labeled by human judges. The learning step generalizes from past observations and generates query suggestions beyond the queries that co-occurred in the past, which brings significant gains in coverage and modest gains in relevance. Both offline evaluation (based on human judges) and online evaluation (based on millions of user interactions) demonstrate that our approach significantly outperforms strong baselines.
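To make the idea of co-occurrences mined from search logs concrete, the following is a minimal sketch of a plain co-occurrence baseline. It is not the paper's model: the session format, the function names `mine_cooccurrences` and `followup_probabilities`, and the toy queries are all assumptions made for illustration. It counts how often one query directly follows another within a session and normalizes the counts into follow-up probabilities.

```python
from collections import Counter, defaultdict


def mine_cooccurrences(sessions):
    """Count how often query b directly follows query a within a session.

    `sessions` is assumed to be a list of query lists, one per user session.
    """
    pair_counts = defaultdict(Counter)
    for queries in sessions:
        for a, b in zip(queries, queries[1:]):
            if a != b:  # ignore immediate repeats of the same query
                pair_counts[a][b] += 1
    return pair_counts


def followup_probabilities(pair_counts, query):
    """Estimate P(follow-up | query) from raw co-occurrence counts."""
    counts = pair_counts.get(query, {})
    total = sum(counts.values())
    if total == 0:
        return {}  # query never seen with a follow-up in the logs
    return {b: c / total for b, c in counts.items()}


# Hypothetical toy log, for illustration only.
sessions = [
    ["jaguar", "jaguar car", "jaguar price"],
    ["jaguar", "jaguar animal"],
    ["jaguar", "jaguar car"],
]
pairs = mine_cooccurrences(sessions)
probs = followup_probabilities(pairs, "jaguar")
```

A baseline of this kind returns nothing for a query that never appeared in the logs, which is the coverage gap that the abstract says a learned model is meant to close.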


Published in

SIGIR '12: Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval
August 2012
1236 pages
ISBN: 9781450314725
DOI: 10.1145/2348283

Copyright © 2012 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States


Acceptance Rates

Overall acceptance rate: 792 of 3,983 submissions, 20%
