Published in: Discover Computing 4/2013

01-08-2013 | Search Intents and Diversification

Learning to rank query suggestions for adhoc and diversity search

Authors: Rodrygo L. T. Santos, Craig Macdonald, Iadh Ounis

Abstract

Query suggestions have become pervasive in modern web search, as a mechanism to guide users towards a better representation of their information need. In this article, we propose a ranking approach for producing effective query suggestions. In particular, we devise a structured representation of candidate suggestions mined from a query log that leverages evidence from other queries with a common session or a common click. This enriched representation not only helps overcome data sparsity for long-tail queries, but also leads to multiple ranking criteria, which we integrate as features for learning to rank query suggestions. To validate our approach, we build upon existing efforts for web search evaluation and propose a novel framework for the quantitative assessment of query suggestion effectiveness. Thorough experiments using publicly available data from the TREC Web track show that our approach provides effective suggestions for adhoc and diversity search.
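The structured representation described above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy log, the field names (`query`, `session`, `click`), and the term-overlap feature are hypothetical and are not taken from the article, which does not specify its features in the abstract. The sketch only shows how a candidate suggestion might borrow evidence from queries that share a session or a click with it, and how each field could yield one ranking feature.

```python
from collections import defaultdict

# Toy query log: (session_id, query, clicked_url or None). Entirely made up.
LOG = [
    ("s1", "jaguar", None),
    ("s1", "jaguar car price", "cars.example/jaguar"),
    ("s2", "jaguar xf review", "cars.example/jaguar"),
    ("s3", "jaguar animal", "zoo.example/jaguar"),
    ("s3", "big cats habitat", "zoo.example/jaguar"),
]


def build_representation(log):
    """Map each logged query to three fields: its own terms, the terms of
    queries from the same session, and the terms of queries sharing a click."""
    by_session, by_click = defaultdict(set), defaultdict(set)
    for session, query, url in log:
        by_session[session].add(query)
        if url is not None:
            by_click[url].add(query)

    neighbours = defaultdict(lambda: {"session": set(), "click": set()})
    for session, query, url in log:
        neighbours[query]["session"] |= by_session[session] - {query}
        if url is not None:
            neighbours[query]["click"] |= by_click[url] - {query}

    return {
        q: {
            "query": q.split(),
            "session": sorted(t for n in nb["session"] for t in n.split()),
            "click": sorted(t for n in nb["click"] for t in n.split()),
        }
        for q, nb in neighbours.items()
    }


def field_features(user_query, rep):
    """One hypothetical feature per field: the fraction of user-query terms
    that occur in that field of the candidate suggestion."""
    terms = user_query.split()
    denom = max(len(terms), 1)  # avoid division by zero on an empty query
    return {f: sum(t in toks for t in terms) / denom for f, toks in rep.items()}


reps = build_representation(LOG)
features = field_features("jaguar", reps["big cats habitat"])
```

In this toy log, the candidate "big cats habitat" shares no term with the query "jaguar", so its own text yields a feature of 0, yet its session and click fields both contain "jaguar". This is the kind of extra evidence that could help with sparse, long-tail queries; in a learning-to-rank setting, such per-field scores would be passed to the learner as features.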

Footnotes
1. An analogy to the document ranking problem can be made in which field-based models, such as BM25F (Zaragoza et al. 2004), leverage evidence from fields such as the title, body, URL, or the anchor text of incoming hyperlinks in order to score a document.

6. All rankings were obtained in February 2012 using Bing API v2.0.

7. All query suggestions were obtained in February 2012 using Bing API v2.0.

8. Note that suggestions with a relevance label 1 (i.e., with a positive yet lower retrieval effectiveness than that attained by the initial query) are also considered, as they may bring useful evidence for the diversification scenario addressed in Sec. 6.2.

Literature
Alonso, O., Rose, D. E., & Stewart, B. (2008). Crowdsourcing for relevance evaluation. SIGIR Forum, 42(2), 9–15.
Amati, G. (2003). Probabilistic models for information retrieval based on divergence from randomness. PhD thesis, University of Glasgow.
Amati, G., Ambrosi, E., Bianchi, M., Gaibisso, C., & Gambosi, G. (2007). FUB, IASI-CNR and University of Tor Vergata at TREC 2007 Blog track. In Proceedings of TREC.
Baeza-Yates, R. A., Hurtado, C. A., & Mendoza, M. (2004). Query recommendation using query logs in search engines. In Proceedings of ClustWeb at EDBT (pp. 588–596).
Boldi, P., Bonchi, F., Castillo, C., Donato, D., Gionis, A., & Vigna, S. (2008). The query-flow graph: Model and applications. In Proceedings of CIKM (pp. 609–618).
Boldi, P., Bonchi, F., Castillo, C., Donato, D., & Vigna, S. (2009). Query suggestions using query-flow graphs. In Proceedings of WSCD at WSDM (pp. 56–63).
Broccolo, D., Marcon, L., Nardini, F. M., Perego, R., & Silvestri, F. (2012). Generating suggestions for queries in the long tail with an inverted index. Information Processing and Management, 48(2), 326–339.
Burges, C. J. C. (2010). From RankNet to LambdaRank to LambdaMART: An overview. Technical report MSR-TR-2010-82, Microsoft Research.
Carterette, B., Allan, J., & Sitaraman, R. (2006). Minimal test collections for retrieval evaluation. In Proceedings of SIGIR (pp. 268–275).
Carterette, B., Pavlu, V., Kanoulas, E., Aslam, J. A., & Allan, J. (2009). If I had a million queries. In Proceedings of ECIR (pp. 288–300). New York: Springer.
Chapelle, O., & Chang, Y. (2011). Yahoo! learning to rank challenge overview. Journal of Machine Learning Research, 14, 1–24.
Chapelle, O., Metzler, D., Zhang, Y., & Grinspan, P. (2009). Expected reciprocal rank for graded relevance. In Proceedings of CIKM (pp. 621–630).
Clarke, C. L. A., Craswell, N., & Soboroff, I. (2009). Overview of the TREC 2009 Web track. In Proceedings of TREC.
Clarke, C. L. A., Craswell, N., Soboroff, I., & Ashkan, A. (2011). A comparative analysis of cascade measures for novelty and diversity. In Proceedings of WSDM (pp. 75–84).
Clarke, C. L. A., Craswell, N., Soboroff, I., & Cormack, G. V. (2010). Overview of the TREC 2010 Web track. In Proceedings of TREC.
Clarke, C. L. A., Craswell, N., Soboroff, I., & Voorhees, E. M. (2011). Overview of the TREC 2011 Web track. In Proceedings of TREC.
Clarke, C. L. A., Kolla, M., Cormack, G. V., Vechtomova, O., Ashkan, A., Büttcher, S., & MacKinnon, I. (2008). Novelty and diversity in information retrieval evaluation. In Proceedings of SIGIR (pp. 659–666).
Clarke, C. L. A., Kolla, M., & Vechtomova, O. (2009). An effectiveness measure for ambiguous and underspecified queries. In Proceedings of ICTIR (pp. 188–199).
Cucerzan, S., & White, R. W. (2007). Query suggestion based on user landing pages. In Proceedings of SIGIR (pp. 875–876). New York: ACM.
Dang, V., Bendersky, M., & Croft, W. B. (2010). Learning to rank query reformulations. In Proceedings of SIGIR (pp. 807–808). ACM.
Dean, J. (2009). Challenges in building large-scale information retrieval systems: invited talk. In Proceedings of WSDM (p. 1). New York: ACM.
Downey, D., Dumais, S., & Horvitz, E. (2007). Heads and tails: studies of web search with common and rare queries. In Proceedings of SIGIR (pp. 847–848).
Fonseca, B. M., Golgher, P. B., De Moura, E. S., Pôssas, B., & Ziviani, N. (2003). Discovering search engine related queries using association rules. Journal of Web Engineering, 2, 215–227.
Ganjisaffar, Y., Caruana, R., & Lopes, C. (2011). Bagging gradient-boosted trees for high precision, low variance ranking models. In Proceedings of SIGIR (pp. 85–94), Beijing, China.
Hauff, C., Kelly, D., & Azzopardi, L. (2010). A comparison of user and system query performance predictions. In Proceedings of CIKM (pp. 979–988).
Jansen, B. J., Spink, A., Bateman, J., & Saracevic, T. (1998). Real life information retrieval: A study of user queries on the web. SIGIR Forum, 32(1), 5–17.
Järvelin, K., & Kekäläinen, J. (2002). Cumulated gain-based evaluation of IR techniques. ACM Transactions on Information Systems, 20(4), 422–446.
Jones, R., Rey, B., Madani, O., & Greiner, W. (2006). Generating query substitutions. In Proceedings of WWW (pp. 387–396).
Liu, T.-Y. (2009). Learning to rank for information retrieval. Foundations and Trends in Information Retrieval, 3(3), 225–331.
Mei, Q., Zhou, D., & Church, K. (2008). Query suggestion using hitting time. In Proceedings of CIKM (pp. 469–478).
Metzler, D. (2007). Automatic feature selection in the Markov random field model for information retrieval. In Proceedings of CIKM (pp. 253–262).
Metzler, D., & Croft, W. B. (2005). A Markov random field model for term dependencies. In Proceedings of SIGIR (pp. 472–479).
Ounis, I., Amati, G., Plachouras, V., He, B., Macdonald, C., & Lioma, C. (2006). Terrier: A high performance and scalable information retrieval platform. In Proceedings of OSIR at SIGIR.
Peng, J., Macdonald, C., He, B., Plachouras, V., & Ounis, I. (2007). Incorporating term dependency in the DFR framework. In Proceedings of SIGIR. New York: ACM Press.
Qin, T., Liu, T.-Y., Xu, J., & Li, H. (2009). LETOR: A benchmark collection for research on learning to rank for information retrieval. Information Retrieval, 13(4), 347–374.
Robertson, S. (2008). On the optimisation of evaluation metrics. In Proceedings of LR4IR at SIGIR.
Robertson, S. E., Walker, S., Jones, S., Hancock-Beaulieu, M., & Gatford, M. (1994). Okapi at TREC-3. In Proceedings of TREC.
Santos, R. L. T., Macdonald, C., & Ounis, I. (2010). Exploiting query reformulations for web search result diversification. In Proceedings of WWW (pp. 881–890).
Santos, R. L. T., Macdonald, C., & Ounis, I. (2011). How diverse are web search results? In Proceedings of SIGIR (pp. 1187–1188).
Santos, R. L. T., Macdonald, C., & Ounis, I. (2011). Intent-aware search result diversification. In Proceedings of SIGIR (pp. 595–604).
Sheldon, D., Shokouhi, M., Szummer, M., & Craswell, N. (2011). LambdaMerge: merging the results of query reformulations. In Proceedings of WSDM (pp. 795–804).
Silvestri, F. (2010). Mining query logs: turning search usage data into knowledge. Foundations and Trends® in Information Retrieval, 4(1–2), 1–174.
Song, R., Luo, Z., Nie, J.-Y., Yu, Y., & Hon, H.-W. (2009). Identification of ambiguous queries in web search. Information Processing and Management, 45(2), 216–229.
Song, Y., Zhou, D., & Wei He, L. (2011). Post-ranking query suggestion by diversifying search results. In Proceedings of SIGIR (pp. 815–824). Beijing, China.
Spärck-Jones, K., Robertson, S. E., & Sanderson, M. (2007). Ambiguous requests: Implications for retrieval tests, systems and theories. SIGIR Forum, 41(2), 8–17.
Szpektor, I., Gionis, A., & Maarek, Y. (2011). Improving recommendation for long-tail queries via templates. In Proceedings of WWW (pp. 47–56).
Wang, X., & Zhai, C. (2008). Mining term association patterns from search logs for effective query reformulation. In Proceedings of CIKM (pp. 479–488).
Zaragoza, H., Craswell, N., Taylor, M. J., Saria, S., & Robertson, S. E. (2004). Microsoft Cambridge at TREC 13: Web and hard tracks. In Proceedings of TREC.
Zhai, C., & Lafferty, J. (2001). A study of smoothing methods for language models applied to ad hoc information retrieval. In Proceedings of SIGIR (pp. 334–342).
Zhang, Z., & Nasraoui, O. (2006). Mining search engine query logs for query recommendation. In Proceedings of WWW (pp. 1039–1040).
Metadata
Title: Learning to rank query suggestions for adhoc and diversity search
Authors: Rodrygo L. T. Santos, Craig Macdonald, Iadh Ounis
Publication date: 01-08-2013
Publisher: Springer Netherlands
Published in: Discover Computing, Issue 4/2013
Print ISSN: 2948-2984
Electronic ISSN: 2948-2992
DOI: https://doi.org/10.1007/s10791-012-9211-2
