DOI: 10.1145/1277741.1277759
Article

Learn from web search logs to organize search results

Published: 23 July 2007

ABSTRACT

Effective organization of search results is critical for improving the utility of any search engine. Clustering search results is an effective way to organize them, allowing a user to navigate to relevant documents quickly. However, two deficiencies keep this approach from always working well: (1) the clusters discovered do not necessarily correspond to the interesting aspects of a topic from the user's perspective; and (2) the cluster labels generated are not informative enough to allow a user to identify the right cluster. In this paper, we propose to address these two deficiencies by (1) learning "interesting aspects" of a topic from Web search logs and organizing search results accordingly; and (2) generating more meaningful cluster labels using past query words entered by users. We evaluate our proposed method on log data from a commercial search engine. Compared with traditional methods of clustering search results, our method can give better result organization and more meaningful labels.
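
To make the two ideas in the abstract concrete, here is a minimal toy sketch in Python. It is not the paper's algorithm: the function names `learn_aspects` and `organize`, the word-overlap matching, and the sample log and results are all assumptions made for illustration. It only shows the general shape of (1) deriving candidate aspects of a query from past queries and (2) reusing past query words as group labels.

```python
from collections import Counter


def learn_aspects(query, past_queries, top_k=2):
    """Return the top_k words that most often accompany `query` in past queries.

    Hypothetical helper: a past query counts as related only if it contains
    every word of the current query.
    """
    q_words = set(query.lower().split())
    counts = Counter()
    for past in past_queries:
        words = past.lower().split()
        if q_words.issubset(words):
            # Count each extra word once per past query.
            counts.update(w for w in set(words) if w not in q_words)
    # Sort by frequency (descending), then alphabetically, for determinism.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:top_k]]


def organize(results, aspects):
    """Group result snippets under the first aspect label they mention.

    Snippets that mention no aspect label go to an "other" group.
    """
    groups = {aspect: [] for aspect in aspects}
    groups["other"] = []
    for snippet in results:
        tokens = set(snippet.lower().split())
        label = next((a for a in aspects if a in tokens), "other")
        groups[label].append(snippet)
    return groups


# Made-up log and result snippets, for illustration only.
past_queries = [
    "jaguar car",
    "jaguar car price",
    "jaguar animal",
    "jaguar car dealer",
    "jaguar animal habitat",
    "python tutorial",
]
results = [
    "Jaguar car reviews and pricing",
    "The jaguar is a large animal of the Americas",
    "Jaguar (disambiguation)",
]

aspects = learn_aspects("jaguar", past_queries)
organized = organize(results, aspects)
for label, snippets in organized.items():
    print(label, snippets)
```

In this sketch the labels come directly from words users typed in earlier queries, which is the intuition behind the abstract's second point; a real system would need a far more robust notion of query and document similarity than exact word overlap.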


        • Published in

          SIGIR '07: Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
          July 2007
          946 pages
          ISBN:9781595935977
          DOI:10.1145/1277741

          Copyright © 2007 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States


          Acceptance Rates

          Overall Acceptance Rate: 792 of 3,983 submissions, 20%
