ABSTRACT
Aggregated search refers to the integration of content from specialized corpora, or verticals, into web search results. Aggregation improves search when the user has vertical intent but may not be aware of, or wish to use, vertical search. In this paper, we address the issue of integrating search results from a news vertical into web search results. News is particularly challenging because, given a query, the appropriate decision (to integrate news content or not) changes with time. Our system adapts to news intent in two ways. First, by inspecting the dynamics of the news collection and query volume, we can track the development of, and interest in, topics. Second, by using click feedback, we can quickly recover from system errors. We define several click-based metrics that allow a system to be monitored and tuned without annotator effort.
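The click-feedback idea can be illustrated with a minimal sketch. This is not the model described in the paper; it is a generic Beta-Bernoulli estimate, assumed here for illustration, in which a prior belief about news intent for a query is corrected by observed clicks and skips on the news display. The class name, the `prior_mean` and `prior_strength` parameters, and the 0.5 threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class NewsIntentEstimate:
    """Illustrative Beta-Bernoulli estimate of news click probability for one query.

    Assumption: prior_mean stands in for a prediction derived from signals such as
    news-collection dynamics and query volume; the paper's features are not shown here.
    """
    prior_mean: float             # prior probability that a news display gets clicked
    prior_strength: float = 10.0  # pseudo-observations backing the prior (hypothetical)
    clicks: int = 0               # observed clicks on the news display
    skips: int = 0                # observed impressions without a news click

    def update(self, clicked: bool) -> None:
        """Record one impression of the news display."""
        if clicked:
            self.clicks += 1
        else:
            self.skips += 1

    def click_probability(self) -> float:
        """Posterior mean of the click probability."""
        alpha = self.prior_mean * self.prior_strength + self.clicks
        beta = (1.0 - self.prior_mean) * self.prior_strength + self.skips
        return alpha / (alpha + beta)

    def should_show_news(self, threshold: float = 0.5) -> bool:
        """Decide whether to integrate news results (threshold is an assumption)."""
        return self.click_probability() >= threshold


# A query wrongly predicted to be newsworthy: users keep skipping the news display.
estimate = NewsIntentEstimate(prior_mean=0.8)
shown_before = estimate.should_show_news()
for _ in range(20):
    estimate.update(clicked=False)
shown_after = estimate.should_show_news()
```

In this sketch an over-confident prior is overridden after a few dozen impressions, which is the sense in which click feedback lets a system recover from an initial misprediction. A larger `prior_strength` would make the estimate slower to change.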
Index Terms
- Integration of news content into web results