DOI: 10.1145/1963405.1963416
research-article

Improving recommendation for long-tail queries via templates

Published: 28 March 2011

ABSTRACT

The ability to aggregate huge volumes of queries over a large population of users allows search engines to build precise models for a variety of query-assistance features such as query recommendation and query correction. Yet no matter how much data is aggregated, the long-tail distribution implies that a large fraction of queries are rare. As a result, most query-assistance services perform poorly on long-tail queries or are not triggered at all. We propose a method to extend the reach of query-assistance techniques (and in particular query recommendation) to long-tail queries by reasoning about rules between query templates rather than about individual query transitions, as is currently done in query-flow graph models. As a simple example, if we recognize that 'Montezuma' is a city in the rare query "Montezuma surf" and if the rule '<city> surf → <city> beach' has been observed, we are able to offer "Montezuma beach" as a recommendation, even if the two queries were never observed in the same session. We conducted experiments to validate our hypothesis, first via traditional small-scale editorial assessments and, more interestingly, via a novel automated large-scale evaluation methodology. Our experiments show that templates yield a relative increase of 24% in general coverage without penalizing quality. Furthermore, 36% of the 95M queries in our query-flow graph have no outgoing edges and thus could not previously be served recommendations; for these queries, we can now offer at least one recommendation in 98% of the cases.
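The 'Montezuma' example above can be illustrated with a minimal sketch. This is not the authors' implementation: the entity dictionary, the rule table, the rule score, and the function names (ENTITY_TYPES, TEMPLATE_RULES, to_templates, recommend) are all hypothetical and chosen only to show how a rule between templates might be instantiated for a query that was never seen in a session.

```python
# Hypothetical entity dictionary: known entity string -> entity type.
ENTITY_TYPES = {"montezuma": "city", "santa cruz": "city"}

# Hypothetical rules between templates: source -> [(target, score)].
# The score 0.8 is a made-up value for illustration only.
TEMPLATE_RULES = {"<city> surf": [("<city> beach", 0.8)]}


def to_templates(query):
    """Yield (template, entity, placeholder) for each known entity span."""
    tokens = query.lower().split()
    for i in range(len(tokens)):
        for j in range(i + 1, len(tokens) + 1):
            span = " ".join(tokens[i:j])
            etype = ENTITY_TYPES.get(span)
            if etype is not None:
                placeholder = f"<{etype}>"
                # Replace the entity span with its type placeholder.
                template = " ".join(tokens[:i] + [placeholder] + tokens[j:])
                yield template, span, placeholder


def recommend(query):
    """Instantiate matching template rules with the query's own entity."""
    results = []
    for template, entity, placeholder in to_templates(query):
        for target, score in TEMPLATE_RULES.get(template, []):
            results.append((target.replace(placeholder, entity), score))
    # Highest-scoring recommendations first.
    return sorted(results, key=lambda r: -r[1])


print(recommend("Montezuma surf"))
```

Under these assumptions, the rare query "Montezuma surf" is generalized to the template '<city> surf', the stored rule is looked up, and the target template is filled back in with the original entity. A query with no recognized entity, or no matching rule, simply yields an empty list.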


Published in

    WWW '11: Proceedings of the 20th international conference on World wide web
    March 2011
    840 pages
    ISBN:9781450306324
    DOI:10.1145/1963405

    Copyright © 2011 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Acceptance Rates

    Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%
