Learn from web search logs to organize search results

Authors:
Xuanhui Wang, University of Illinois at Urbana-Champaign
ChengXiang Zhai, University of Illinois at Urbana-Champaign

SIGIR '07: Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, July 2007, Pages 87–94. https://doi.org/10.1145/1277741.1277759

Published: 23 July 2007

ABSTRACT

Effective organization of search results is critical for improving the utility of any search engine. Clustering search results is an effective way to organize them, allowing a user to navigate to relevant documents quickly. However, two deficiencies keep this approach from always working well: (1) the clusters discovered do not necessarily correspond to the interesting aspects of a topic from the user's perspective; and (2) the cluster labels generated are not informative enough to allow a user to identify the right cluster. In this paper, we propose to address these two deficiencies by (1) learning "interesting aspects" of a topic from Web search logs and organizing search results accordingly; and (2) generating more meaningful cluster labels using past query words entered by users. We evaluate our proposed method on log data from a commercial search engine. Compared with traditional methods of clustering search results, our method gives better result organization and more meaningful labels.
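
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea under stated assumptions: past queries related to the current query are grouped into "aspects", each aspect is labeled with a word users actually typed, and search results are assigned to the best-matching aspect. The function names (`learn_aspects`, `label_aspect`, `organize`), the Jaccard similarity, the greedy grouping, and the threshold of 0.5 are all hypothetical choices made for illustration, not the method evaluated in the paper.

```python
from collections import Counter


def tokens(text):
    """Lowercase whitespace tokenization; a stand-in for real text processing."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard similarity between two token sets."""
    return len(a & b) / len(a | b) if a | b else 0.0


def learn_aspects(query, past_queries, threshold=0.5):
    """Greedily group past queries related to `query` into aspects.

    Assumption: a past query is "related" if it shares a word with the
    query, and it joins the first aspect whose seed query is similar enough.
    """
    q = tokens(query)
    aspects = []  # each aspect is a list of token sets
    for past in past_queries:
        p = tokens(past)
        if not (p & q) or p == q:
            continue  # unrelated, or identical to the query itself
        for aspect in aspects:
            if jaccard(p, aspect[0]) >= threshold:
                aspect.append(p)
                break
        else:
            aspects.append([p])  # no aspect was similar enough: start a new one
    return aspects


def label_aspect(query, aspect):
    """Label an aspect with its most frequent past-query word not in the query."""
    q = tokens(query)
    counts = Counter(w for p in aspect for w in sorted(p - q))
    return counts.most_common(1)[0][0] if counts else query


def organize(query, results, past_queries):
    """Assign each result to the aspect whose vocabulary it overlaps most."""
    q = tokens(query)
    aspects = learn_aspects(query, past_queries)
    labels = [label_aspect(query, a) for a in aspects]
    groups = {label: [] for label in labels}
    groups["other"] = []  # results that match no learned aspect
    for result in results:
        r = tokens(result)
        best, best_score = "other", 0
        for label, aspect in zip(labels, aspects):
            vocab = set().union(*aspect) - q
            score = len(r & vocab)
            if score > best_score:
                best, best_score = label, score
        groups[best].append(result)
    return groups
```

For example, calling `organize("jaguar", results, past_queries)` with past queries such as "jaguar car price" and "jaguar animal habitat" groups result snippets under labels drawn from those queries, with unmatched snippets falling into an "other" bucket.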

Index Terms

Learn from web search logs to organize search results
1. Information systems
  1. Information retrieval
    1. Information retrieval query processing
    2. Retrieval tasks and goals
      1. Clustering and classification
  2. Information systems applications
    1. Data mining
      1. Clustering

Published in
SIGIR '07: Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
July 2007
946 pages
ISBN: 9781595935977
DOI: 10.1145/1277741
General Chairs: Wessel Kraaij (TNO, The Netherlands), Arjen P. de Vries (CWI, The Netherlands)
Program Chairs: Charles L. A. Clarke (University of Waterloo, Canada), Norbert Fuhr (University of Duisburg-Essen, Germany), Noriko Kando (National Institute of Informatics, Japan)
Copyright © 2007 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
interesting aspects
search engine logs
search result organization

Acceptance Rates
Overall Acceptance Rate: 792 of 3,983 submissions, 20%