research-article

User modeling in search logs via a nonparametric Bayesian approach

Authors:
Hongning Wang, University of Illinois at Urbana-Champaign, Urbana, IL, USA
ChengXiang Zhai, University of Illinois at Urbana-Champaign, Urbana, IL, USA
Feng Liang, University of Illinois at Urbana-Champaign, Urbana, IL, USA
Anlei Dong, Yahoo! Labs, Sunnyvale, CA, USA
Yi Chang, Yahoo! Labs, Sunnyvale, CA, USA

WSDM '14: Proceedings of the 7th ACM international conference on Web search and data mining, February 2014, Pages 203–212. https://doi.org/10.1145/2556195.2556262

Published: 24 February 2014

ABSTRACT

Searchers' information needs are diverse and cover a broad range of topics; hence, it is important for search engines to accurately understand each individual user's search intents in order to provide optimal search results. Search log data, which records users' search behaviors when interacting with search engines, provides a valuable source of information about users' search intents. Therefore, properly characterizing the heterogeneity among the users' observed search behaviors is the key to accurately understanding their search intents and to further predicting their behaviors.

In this work, we study the problem of user modeling in search log data and propose a generative model, dpRank, within a nonparametric Bayesian framework. By postulating generative assumptions about a user's search behaviors, dpRank jointly identifies each user's latent search interests and distinct result preferences. Experimental results on a large-scale news search log data set validate the effectiveness of the proposed approach, which not only provides an in-depth understanding of a user's search intents but also benefits a variety of personalized applications.

Index Terms

1. Information systems
  1. Information systems applications

Published in
WSDM '14: Proceedings of the 7th ACM international conference on Web search and data mining
February 2014
712 pages
ISBN: 9781450323512
DOI: 10.1145/2556195
General Chairs: Ben Carterette (University of Delaware, USA), Fernando Diaz (Microsoft Research, USA)
Program Chairs: Carlos Castillo (Qatar Computing Research Institute, Qatar), Donald Metzler (Google, USA)
Copyright © 2014 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
non-parametric Bayesian
search log mining
user modeling
Acceptance Rates
WSDM '14 Paper Acceptance Rate: 64 of 355 submissions, 18%. Overall Acceptance Rate: 498 of 2,863 submissions, 17%.